Logo ROOT  
Reference Guide
 
Loading...
Searching...
No Matches
TTreeCache.cxx
Go to the documentation of this file.
1// @(#)root/tree:$Id$
2// Author: Rene Brun 04/06/2006
3
4/*************************************************************************
5 * Copyright (C) 1995-2018, Rene Brun and Fons Rademakers. *
6 * All rights reserved. *
7 * *
8 * For the licensing terms see $ROOTSYS/LICENSE. *
9 * For the list of contributors see $ROOTSYS/README/CREDITS. *
10 *************************************************************************/
11
12/** \class TTreeCache
13\ingroup tree
14\brief A cache to speed-up the reading of ROOT datasets
15
16# A cache to speed-up the reading of ROOT datasets
17
18## Table of Contents
19- [Motivation](\ref motivation)
20- [General Description](\ref description)
21- [Changes in behaviour](\ref changesbehaviour)
22- [Self-optimization](\ref cachemisses)
23- [Examples of usage](\ref examples)
24- [Check performance and stats](\ref checkPerf)
25
26\anchor motivation
27## Motivation: why having a cache is needed?
28
29When writing a TTree, the branch buffers are kept in memory.
30A typical branch buffersize (before compression) is typically 32 KBytes.
31After compression, the zipped buffer may be just a few Kbytes.
32The branch buffers cannot be much larger in case of TTrees with several
33hundred or thousand branches.
34
35When writing, this does not generate a performance problem because branch
36buffers are always written sequentially and, thanks to OS optimisations,
37content is flushed to the output file when a few MBytes of data are available.
38On the other hand, when reading, one may hit performance problems because of
39latencies e.g imposed by network.
40For example in a WAN with 10ms latency, reading 1000 buffers of 10 KBytes each
41with no cache will imply 10s penalty where a local read of the 10 MBytes would
42take about 1 second.
43
44The TreeCache tries to prefetch all the buffers for the selected branches
45in order to transfer a few multi-Megabytes large buffers instead of many
46multi-kilobytes small buffers. In addition, TTreeCache can sort the blocks to
47be read in increasing order such that the file is read sequentially.
48
49Systems like xrootd, dCache or httpd take advantage of the TTreeCache in
50reading ahead as much data as they can and return to the application
51the maximum data specified in the cache and have the next chunk of data ready
52when the next request comes.
53
54### Are there cases for which the usage of TTreeCache is detrimental for performance?
55Yes, some corner cases. For example, when reading only a small fraction of all
56entries such that not all branch buffers are read.
57
58\anchor description
59## General Description
60This class acts as a file cache, registering automatically the baskets from
61the branches being processed via direct manipulation of TTrees or with tools
62such as TTree::Draw, TTree::Process, TSelector, TTreeReader and RDataFrame
63when in the learning phase. The learning phase is by default 100 entries.
64It can be changed via TTreeCache::SetLearnEntries.
65
66The usage of a TTreeCache can considerably improve the runtime performance at
67the price of a modest investment in memory, in particular when the TTree is
68accessed remotely, e.g. via a high latency network.
69
70For each TTree being processed a TTreeCache object is created.
71This object is automatically deleted when the Tree is deleted or
72when the file is deleted.
73The user can change the size of the cache with the TTree::SetCacheSize method
74(by default the size is 30 Megabytes). This feature can be controlled with the
75environment variable `ROOT_TTREECACHE_SIZE` or the TTreeCache.Size option.
76The entry range for which the cache is active can also be set with the
77SetEntryRange method.
78
79\anchor changesbehaviour
80## Changes of behavior when using TChain and TEventList
81
82The usage of TChain or TEventList have influence on the behaviour of the cache:
83
84- Special case of a TChain
85 Once the training is done on the first Tree, the list of branches
86 in the cache is kept for the following files.
87
88- Special case of a TEventlist
89 if the Tree or TChain has a TEventlist, only the buffers
90 referenced by the list are put in the cache.
91
92The learning phase is started or restarted when:
93 - TTree automatically creates a cache.
94 - TTree::SetCacheSize is called with a non-zero size and a cache
95 did not previously exist
96 - TTreeCache::StartLearningPhase is called.
97 - TTreeCache::SetEntryRange is called
98 * and the learning is not yet finished
99 * and has not been set to manual
100 * and the new minimun entry is different.
101
102The learning period is stopped (and prefetching is started) when:
103 - TTreeCache::StopLearningPhase is called.
104 - An entry outside the 'learning' range is requested
105 The 'learning range is from fEntryMin (default to 0) to
106 fEntryMin + fgLearnEntries.
107 - A 'cached' TChain switches over to a new file.
108
109
110\anchor cachemisses
111## Self-optimization in presence of cache misses
112
113The TTreeCache can optimize its behavior on a cache miss. When
114miss optimization is enabled (see the SetOptimizeMisses method),
115it tracks all branches utilized after the learning phase which caused a cache
116miss.
117When one cache miss occurs, all the utilized branches are be prefetched
118for that event. This optimization utilizes the observation that infrequently
119accessed branches are often accessed together.
120An example scenario where such behavior is desirable, is an analysis where
121a set of collections are read only for a few events in which a certain
122condition is respected, e.g. a trigger fired.
123
124### Additional memory and CPU usage when optimizing for cache misses
125When this mode is enabled, the memory dedicated to the cache can increase
126by at most a factor two in the case of cache miss.
127Additionally, on the first miss of an event, we must iterate through all the
128"active branches" for the miss cache and find the correct basket.
129This can be potentially a CPU-expensive operation compared to, e.g., the
130latency of a SSD. This is why the miss cache is currently disabled by default.
131
132\anchor examples
133## Example usages of TTreeCache
134
135A few use cases are discussed below. A cache may be created with automatic
136sizing when a TTree is used:
137
138In some applications, e.g. central processing workflows of experiments, the list
139of branches to read is known a priori. For these cases, the TTreeCache can be
140instructed about the branches which will be read via explicit calls to the TTree
141or TTreeCache interfaces.
142In less streamlined applications such as analysis, predicting the branches which
143will be read can be difficult. In such cases, ROOT I/O flags used branches
144automatically when a branch buffer is read during the learning phase.
145
146In the examples below, portions of analysis code are shown.
147The few statements involving the TreeCache are marked with `//<<<`
148
149### ROOT::RDataFrame and TTreeReader Examples
150
151If you use RDataFrame or TTreeReader, the system will automatically cache the
152best set of branches: no action is required by the user.
153
154### TTree::Draw Example
155
156The TreeCache is automatically used by TTree::Draw. The method knows
157which branches are used in the query and it puts automatically these branches
158in the cache. The entry range is also inferred automatically.
159
160### TTree::Process and TSelectors Examples
161
162The user must enable the cache and tell the system which branches to cache
163and also specify the entry range. It is important to specify the entry range
164in case only a subset of the events is processed to avoid wasteful caching.
165
166#### Reading all branches
167
168~~~ {.cpp}
169 TTree *T;
170 f->GetObject(T, "mytree");
171 auto nentries = T->GetEntries();
172 auto cachesize = 10000000U; // 10 MBytes
173 T->SetCacheSize(cachesize); //<<<
174 T->AddBranchToCache("*", true); //<<< add all branches to the cache
175 T->Process("myselector.C+");
176 // In the TSelector::Process function we read all branches
177 T->GetEntry(i);
178 // ... Here the entry is processed
179~~~
180
181#### Reading a subset of all branches
182
183In the Process function we read a subset of the branches.
184Only the branches used in the first entry will be put in the cache
185~~~ {.cpp}
186 TTree *T;
187 f->GetObject(T, "mytree");
188 // We want to process only the 200 first entries
189 auto nentries=200UL;
190 auto efirst = 0;
191 auto elast = efirst+nentries;
192 auto cachesize = 10000000U; // 10 MBytes
193 TTreeCache::SetLearnEntries(1); //<<< we can take the decision after 1 entry
194 T->SetCacheSize(cachesize); //<<<
195 T->SetCacheEntryRange(efirst,elast); //<<<
196 T->Process("myselector.C+","",nentries,efirst);
197 // In the TSelector::Process we read only 2 branches
198 auto b1 = T->GetBranch("branch1");
199 b1->GetEntry(i);
200 if (somecondition) return;
201 auto b2 = T->GetBranch("branch2");
202 b2->GetEntry(i);
203 ... Here the entry is processed
204~~~
205### Custom event loop
206
207#### Always using the same two branches
208
209In this example, exactly two branches are always used: those need to be
210prefetched.
211~~~ {.cpp}
212 TTree *T;
213 f->GetObject(T, "mytree");
214 auto b1 = T->GetBranch("branch1");
215 auto b2 = T->GetBranch("branch2");
216 auto nentries = T->GetEntries();
217 auto cachesize = 10000000U; //10 MBytes
218 T->SetCacheSize(cachesize); //<<<
219 T->AddBranchToCache(b1, true); //<<< add branch1 and branch2 to the cache
220 T->AddBranchToCache(b2, true); //<<<
221 T->StopCacheLearningPhase(); //<<< we do not need the system to guess anything
222 for (auto i : TSeqL(nentries)) {
223 T->LoadTree(i); //<<< important call when calling TBranch::GetEntry after
224 b1->GetEntry(i);
225 if (some condition not met) continue;
226 b2->GetEntry(i);
227 if (some condition not met) continue;
228 // Here we read the full event only in some rare cases.
229 // There is no point in caching the other branches as it might be
230 // more economical to read only the branch buffers really used.
231 T->GetEntry(i);
232 ... Here the entry is processed
233 }
234~~~
235#### Always using at least the same two branches
236
237In this example, two branches are always used: in addition, some analysis
238functions are invoked and those may trigger the reading of other branches which
239are a priori not known.
240There is no point in prefetching branches that will be used very rarely: we can
241rely on the system to cache the right branches.
242~~~ {.cpp}
243 TTree *T;
244 f->GetObject(T, "mytree");
245 auto nentries = T->GetEntries();
246 auto cachesize = 10000000; //10 MBytes
247 T->SetCacheSize(cachesize); //<<<
248 T->SetCacheLearnEntries(5); //<<< we can take the decision after 5 entries
249 auto b1 = T->GetBranch("branch1");
250 auto b2 = T->GetBranch("branch2");
251 for (auto i : TSeqL(nentries)) {
252 T->LoadTree(i);
253 b1->GetEntry(i);
254 if (some condition not met) continue;
255 b2->GetEntry(i);
256 // At this point we may call a user function where a few more branches
257 // will be read conditionally. These branches will be put in the cache
258 // if they have been used in the first 10 entries
259 if (some condition not met) continue;
260 // Here we read the full event only in some rare cases.
261 // There is no point in caching the other branches as it might be
262 // more economical to read only the branch buffers really used.
263 T->GetEntry(i);
264 .. process the rare but interesting cases.
265 ... Here the entry is processed
266 }
267~~~
268
269\anchor checkPerf
270## How can the usage and performance of TTreeCache be verified?
271
272Once the event loop terminated, the number of effective system reads for a
273given file can be checked with a code like the following:
274~~~ {.cpp}
275 printf("Reading %lld bytes in %d transactions\n",myTFilePtr->GetBytesRead(), f->GetReadCalls());
276~~~
277
278Another handy command is:
279~~~ {.cpp}
280myTreeOrChain.GetTree()->PrintCacheStats();
281~~~
282
283*/
284
285#include "TSystem.h"
286#include "TEnv.h"
287#include "TTreeCache.h"
288#include "TChain.h"
289#include "TList.h"
290#include "TBranch.h"
291#include "TBranchElement.h"
292#include "TEventList.h"
293#include "TObjArray.h"
294#include "TObjString.h"
295#include "TRegexp.h"
296#include "TLeaf.h"
297#include "TFriendElement.h"
298#include "TFile.h"
299#include "TMath.h"
300#include "TBranchCacheInfo.h"
301#include "TVirtualPerfStats.h"
302#include <climits>
303
304#include <memory>
305
307
309
310////////////////////////////////////////////////////////////////////////////////
311/// Default Constructor.
312
313TTreeCache::TTreeCache() : TFileCacheRead(), fPrefillType(GetConfiguredPrefillType())
314{
315}
316
317////////////////////////////////////////////////////////////////////////////////
318/// Constructor.
319
321 : TFileCacheRead(tree->GetCurrentFile(), buffersize, tree), fEntryMax(tree->GetEntriesFast()), fEntryNext(0),
322 fBrNames(new TList), fTree(tree), fPrefillType(GetConfiguredPrefillType())
323{
325 Int_t nleaves = tree->GetListOfLeaves()->GetEntriesFast();
326 fBranches = new TObjArray(nleaves);
327}
328
329////////////////////////////////////////////////////////////////////////////////
330/// Destructor. (in general called by the TFile destructor)
331
333{
334 // Informe the TFile that we have been deleted (in case
335 // we are deleted explicitly by legacy user code).
336 if (fFile) fFile->SetCacheRead(nullptr, fTree);
337
338 delete fBranches;
339 if (fBrNames) {fBrNames->Delete(); delete fBrNames; fBrNames=nullptr;}
340}
341
342////////////////////////////////////////////////////////////////////////////////
343/// Add a branch discovered by actual usage to the list of branches to be stored
344/// in the cache this function is called by TBranch::GetBasket
345/// If we are not longer in the training phase this is an error.
346/// Returns:
347/// - 0 branch added or already included
348/// - -1 on error
349
350Int_t TTreeCache::LearnBranch(TBranch *b, bool subbranches /*= false*/)
351{
352 if (!fIsLearning) {
353 return -1;
354 }
355
356 // Reject branch that are not from the cached tree.
357 if (!b || fTree->GetTree() != b->GetTree()) return -1;
358
359 // Is this the first addition of a branch (and we are learning and we are in
360 // the expected TTree), then prefill the cache. (We expect that in future
361 // release the Prefill-ing will be the default so we test for that inside the
362 // LearnPrefill call).
364
365 return AddBranch(b, subbranches);
366}
367
368////////////////////////////////////////////////////////////////////////////////
369/// Add a branch to the list of branches to be stored in the cache
370/// this function is called by the user via TTree::AddBranchToCache.
371/// The branch is added even if we are outside of the training phase.
372/// Returns:
373/// - 0 branch added or already included
374/// - -1 on error
375
376Int_t TTreeCache::AddBranch(TBranch *b, bool subbranches /*= false*/)
377{
378 // Reject branch that are not from the cached tree.
379 if (!b || fTree->GetTree() != b->GetTree()) return -1;
380
381 //Is branch already in the cache?
382 bool isNew = true;
383 for (int i=0;i<fNbranches;i++) {
384 if (fBranches->UncheckedAt(i) == b) {isNew = false; break;}
385 }
386 if (isNew) {
387 fTree = b->GetTree();
389 const char *bname = b->GetName();
390 if (fTree->IsA() == TChain::Class()) {
391 // If we have a TChain, we will need to use the branch name
392 // and we better disambiguate them (see atlasFlushed.root for example)
393 // in order to cache all the requested branches.
394 // We do not do this all the time as GetMother is slow (it contains
395 // a linear search from list of top level branch).
396 TString build;
397 const char *mothername = b->GetMother()->GetName();
398 if (b != b->GetMother() && mothername[strlen(mothername)-1] != '.') {
399 // Maybe we ought to prefix the name to avoid ambiguity.
400 auto bem = dynamic_cast<TBranchElement*>(b->GetMother());
401 if (bem->GetType() < 3) {
402 // Not a collection.
403 build = mothername;
404 build.Append(".");
405 if (strncmp(bname,build.Data(),build.Length()) != 0) {
406 build.Append(bname);
407 bname = build.Data();
408 }
409 }
410 }
411 }
412 fBrNames->Add(new TObjString(bname));
413 fNbranches++;
414 if (gDebug > 0) printf("Entry: %lld, registering branch: %s\n",b->GetTree()->GetReadEntry(),b->GetName());
415 }
416
417 // process subbranches
418 Int_t res = 0;
419 if (subbranches) {
420 TObjArray *lb = b->GetListOfBranches();
421 Int_t nb = lb->GetEntriesFast();
422 for (Int_t j = 0; j < nb; j++) {
423 TBranch* branch = (TBranch*) lb->UncheckedAt(j);
424 if (!branch) continue;
425 if (AddBranch(branch, subbranches)<0) {
426 res = -1;
427 }
428 }
429 }
430 return res;
431}
432
433////////////////////////////////////////////////////////////////////////////////
434/// Add a branch to the list of branches to be stored in the cache
435/// this is to be used by user (thats why we pass the name of the branch).
436/// It works in exactly the same way as TTree::SetBranchStatus so you
437/// probably want to look over there for details about the use of bname
438/// with regular expressions.
439/// The branches are taken with respect to the Owner of this TTreeCache
440/// (i.e. the original Tree)
441/// NB: if bname="*" all branches are put in the cache and the learning phase stopped
442/// Returns:
443/// - 0 branch added or already included
444/// - -1 on error
445
446Int_t TTreeCache::AddBranch(const char *bname, bool subbranches /*= false*/)
447{
448 TBranch *branch, *bcount;
449 TLeaf *leaf, *leafcount;
450
451 Int_t i;
452 Int_t nleaves = (fTree->GetListOfLeaves())->GetEntriesFast();
453 TRegexp re(bname,true);
454 Int_t nb = 0;
455 Int_t res = 0;
456
457 // first pass, loop on all branches
458 // for leafcount branches activate/deactivate in function of status
459 bool all = false;
460 if (!strcmp(bname,"*")) all = true;
461 for (i=0;i<nleaves;i++) {
462 leaf = (TLeaf*)(fTree->GetListOfLeaves())->UncheckedAt(i);
463 branch = (TBranch*)leaf->GetBranch();
464 TString s = branch->GetName();
465 if (!all) { //Regexp gives wrong result for [] in name
466 TString longname;
467 longname.Form("%s.%s",fTree->GetName(),branch->GetName());
468 if (strcmp(bname,branch->GetName())
469 && longname != bname
470 && s.Index(re) == kNPOS) continue;
471 }
472 nb++;
473 if (AddBranch(branch, subbranches)<0) {
474 res = -1;
475 }
476 leafcount = leaf->GetLeafCount();
477 if (leafcount && !all) {
478 bcount = leafcount->GetBranch();
479 if (AddBranch(bcount, subbranches)<0) {
480 res = -1;
481 }
482 }
483 }
484 if (nb==0 && strchr(bname,'*')==nullptr) {
485 branch = fTree->GetBranch(bname);
486 if (branch) {
487 if (AddBranch(branch, subbranches)<0) {
488 res = -1;
489 }
490 ++nb;
491 }
492 }
493
494 //search in list of friends
495 UInt_t foundInFriend = 0;
496 if (fTree->GetListOfFriends()) {
497 TIter nextf(fTree->GetListOfFriends());
498 TFriendElement *fe;
500 while ((fe = (TFriendElement*)nextf())) {
501 TTree *t = fe->GetTree();
502 if (t==nullptr) continue;
503
504 // If the alias is present replace it with the real name.
505 char *subbranch = (char*)strstr(bname,fe->GetName());
506 if (subbranch!=bname) subbranch = nullptr;
507 if (subbranch) {
508 subbranch += strlen(fe->GetName());
509 if ( *subbranch != '.' ) subbranch = nullptr;
510 else subbranch ++;
511 }
512 if (subbranch) {
513 name.Form("%s.%s",t->GetName(),subbranch);
514 if (name != bname && AddBranch(name, subbranches)<0) {
515 res = -1;
516 }
517 ++foundInFriend;
518 }
519 }
520 }
521 if (!nb && !foundInFriend) {
522 if (gDebug > 0) printf("AddBranch: unknown branch -> %s \n", bname);
523 Error("AddBranch", "unknown branch -> %s", bname);
524 return -1;
525 }
526 //if all branches are selected stop the learning phase
527 if (*bname == '*') {
528 fEntryNext = -1; // We are likely to have change the set of branches, so for the [re-]reading of the cluster.
530 }
531 return res;
532}
533
534////////////////////////////////////////////////////////////////////////////////
535/// Remove a branch to the list of branches to be stored in the cache
536/// this function is called by TBranch::GetBasket.
537/// Returns:
538/// - 0 branch dropped or not in cache
539/// - -1 on error
540
541Int_t TTreeCache::DropBranch(TBranch *b, bool subbranches /*= false*/)
542{
543 if (!fIsLearning) {
544 return -1;
545 }
546
547 // Reject branch that are not from the cached tree.
548 if (!b || fTree->GetTree() != b->GetTree()) return -1;
549
550 //Is branch already in the cache?
551 if (fBranches->Remove(b)) {
552 --fNbranches;
553 if (gDebug > 0) printf("Entry: %lld, un-registering branch: %s\n",b->GetTree()->GetReadEntry(),b->GetName());
554 }
555 delete fBrNames->Remove(fBrNames->FindObject(b->GetName()));
556
557 // process subbranches
558 Int_t res = 0;
559 if (subbranches) {
560 TObjArray *lb = b->GetListOfBranches();
561 Int_t nb = lb->GetEntriesFast();
562 for (Int_t j = 0; j < nb; j++) {
563 TBranch* branch = (TBranch*) lb->UncheckedAt(j);
564 if (!branch) continue;
565 if (DropBranch(branch, subbranches)<0) {
566 res = -1;
567 }
568 }
569 }
570 return res;
571}
572
573////////////////////////////////////////////////////////////////////////////////
574/// Remove a branch to the list of branches to be stored in the cache
575/// this is to be used by user (thats why we pass the name of the branch).
576/// It works in exactly the same way as TTree::SetBranchStatus so you
577/// probably want to look over there for details about the use of bname
578/// with regular expressions.
579/// The branches are taken with respect to the Owner of this TTreeCache
580/// (i.e. the original Tree)
581/// NB: if bname="*" all branches are put in the cache and the learning phase stopped
582/// Returns:
583/// - 0 branch dropped or not in cache
584/// - -1 on error
585
586Int_t TTreeCache::DropBranch(const char *bname, bool subbranches /*= false*/)
587{
588 TBranch *branch, *bcount;
589 TLeaf *leaf, *leafcount;
590
591 Int_t i;
592 Int_t nleaves = (fTree->GetListOfLeaves())->GetEntriesFast();
593 TRegexp re(bname,true);
594 Int_t nb = 0;
595 Int_t res = 0;
596
597 // first pass, loop on all branches
598 // for leafcount branches activate/deactivate in function of status
599 bool all = false;
600 if (!strcmp(bname,"*")) all = true;
601 for (i=0;i<nleaves;i++) {
602 leaf = (TLeaf*)(fTree->GetListOfLeaves())->UncheckedAt(i);
603 branch = (TBranch*)leaf->GetBranch();
604 TString s = branch->GetName();
605 if (!all) { //Regexp gives wrong result for [] in name
606 TString longname;
607 longname.Form("%s.%s",fTree->GetName(),branch->GetName());
608 if (strcmp(bname,branch->GetName())
609 && longname != bname
610 && s.Index(re) == kNPOS) continue;
611 }
612 nb++;
613 if (DropBranch(branch, subbranches)<0) {
614 res = -1;
615 }
616 leafcount = leaf->GetLeafCount();
617 if (leafcount && !all) {
618 bcount = leafcount->GetBranch();
619 if (DropBranch(bcount, subbranches)<0) {
620 res = -1;
621 }
622 }
623 }
624 if (nb==0 && strchr(bname,'*')==nullptr) {
625 branch = fTree->GetBranch(bname);
626 if (branch) {
627 if (DropBranch(branch, subbranches)<0) {
628 res = -1;
629 }
630 ++nb;
631 }
632 }
633
634 //search in list of friends
635 UInt_t foundInFriend = 0;
636 if (fTree->GetListOfFriends()) {
637 TIter nextf(fTree->GetListOfFriends());
638 TFriendElement *fe;
640 while ((fe = (TFriendElement*)nextf())) {
641 TTree *t = fe->GetTree();
642 if (t==nullptr) continue;
643
644 // If the alias is present replace it with the real name.
645 char *subbranch = (char*)strstr(bname,fe->GetName());
646 if (subbranch!=bname) subbranch = nullptr;
647 if (subbranch) {
648 subbranch += strlen(fe->GetName());
649 if ( *subbranch != '.' ) subbranch = nullptr;
650 else subbranch ++;
651 }
652 if (subbranch) {
653 name.Form("%s.%s",t->GetName(),subbranch);
654 if (DropBranch(name, subbranches)<0) {
655 res = -1;
656 }
657 ++foundInFriend;
658 }
659 }
660 }
661 if (!nb && !foundInFriend) {
662 if (gDebug > 0) printf("DropBranch: unknown branch -> %s \n", bname);
663 Error("DropBranch", "unknown branch -> %s", bname);
664 return -1;
665 }
666 //if all branches are selected stop the learning phase
667 if (*bname == '*') {
668 fEntryNext = -1; // We are likely to have change the set of branches, so for the [re-]reading of the cluster.
669 }
670 return res;
671}
672
673////////////////////////////////////////////////////////////////////////////////
674/// Start of methods for the miss cache.
675////////////////////////////////////////////////////////////////////////////////
676
677////////////////////////////////////////////////////////////////////////////////
678/// Enable / disable the miss cache.
679///
680/// The first time this is called on a TTreeCache object, the corresponding
681/// data structures will be allocated. Subsequent enable / disables will
682/// simply turn the functionality on/off.
684{
685
686 if (opt && !fMissCache) {
688 }
689 fOptimizeMisses = opt;
690}
691
692////////////////////////////////////////////////////////////////////////////////
693/// Reset all the miss cache training.
694///
695/// The contents of the miss cache will be emptied as well as the list of
696/// branches used.
698{
699
700 fLastMiss = -1;
701 fFirstMiss = -1;
702
703 if (!fMissCache) {
704 fMissCache = std::make_unique<MissCache>();
705 }
706 fMissCache->clear();
707}
708
709////////////////////////////////////////////////////////////////////////////////
710/// For the event currently being fetched into the miss cache, find the IO
711/// (offset / length tuple) to pull in the current basket for a given branch.
712///
713/// Returns:
714/// - IOPos describing the IO operation necessary for the basket on this branch
715/// - On failure, IOPos.length will be set to 0.
717{
718 if (R__unlikely(b.GetDirectory() == nullptr)) {
719 // printf("Branch at %p has no valid directory.\n", &b);
720 return IOPos{0, 0};
721 }
722 if (R__unlikely(b.GetDirectory()->GetFile() != fFile)) {
723 // printf("Branch at %p is in wrong file (branch file %p, my file %p).\n", &b, b.GetDirectory()->GetFile(),
724 // fFile);
725 return IOPos{0, 0};
726 }
727
728 // printf("Trying to find a basket for branch %p\n", &b);
729 // Pull in metadata about branch; make sure it is valid
730 Int_t *lbaskets = b.GetBasketBytes();
731 Long64_t *entries = b.GetBasketEntry();
732 if (R__unlikely(!lbaskets || !entries)) {
733 // printf("No baskets or entries.\n");
734 return IOPos{0, 0};
735 }
736 // Int_t blistsize = b.GetListOfBaskets()->GetSize();
737 Int_t blistsize = b.GetWriteBasket();
738 if (R__unlikely(blistsize <= 0)) {
739 // printf("Basket list is size 0.\n");
740 return IOPos{0, 0};
741 }
742
743 // Search for the basket that contains the event of interest. Unlike the primary cache, we
744 // are only interested in a single basket per branch - we don't try to fill the cache.
745 Long64_t basketOffset = TMath::BinarySearch(blistsize, entries, entry);
746 if (basketOffset < 0) { // No entry found.
747 // printf("No entry offset found for entry %ld\n", fTree->GetReadEntry());
748 return IOPos{0, 0};
749 }
750
751 // Check to see if there's already a copy of this basket in memory. If so, don't fetch it
752 if ((basketOffset < blistsize) && b.GetListOfBaskets()->UncheckedAt(basketOffset)) {
753
754 // printf("Basket is already in memory.\n");
755 return IOPos{0, 0};
756 }
757
758 Long64_t pos = b.GetBasketSeek(basketOffset);
759 Int_t len = lbaskets[basketOffset];
760 if (R__unlikely(pos <= 0 || len <= 0)) {
761 /*printf("Basket returned was invalid (basketOffset=%ld, pos=%ld, len=%d).\n", basketOffset, pos, len);
762 for (int idx=0; idx<blistsize; idx++) {
763 printf("Basket entry %d, first event %d, pos %ld\n", idx, entries[idx], b.GetBasketSeek(idx));
764 }*/
765 return IOPos{0, 0};
766 } // Sanity check
767 // Do not cache a basket if it is bigger than the cache size!
769 // printf("Basket size is greater than the cache size.\n");
770 return IOPos{0, 0};
771 }
772
773 return {pos, len};
774}
775
776////////////////////////////////////////////////////////////////////////////////
777/// Given a particular IO description (offset / length) representing a 'miss' of
778/// the TTreeCache's primary cache, calculate all the corresponding IO that
779/// should be performed.
780///
781/// `all` indicates that this function should search the set of _all_ branches
782/// in this TTree. When set to false, we only search through branches that
783/// have previously incurred a miss.
784///
785/// Returns:
786/// - TBranch pointer corresponding to the basket that will be retrieved by
787/// this IO operation.
788/// - If no corresponding branch could be found (or an error occurs), this
789/// returns nullptr.
791{
792 if (R__unlikely((pos < 0) || (len < 0))) {
793 return nullptr;
794 }
795
796 int count = all ? (fTree->GetListOfLeaves())->GetEntriesFast() : fMissCache->fBranches.size();
797 fMissCache->fEntries.reserve(count);
798 fMissCache->fEntries.clear();
799 bool found_request = false;
800 TBranch *resultBranch = nullptr;
801 Long64_t entry = fTree->GetReadEntry();
802
803 std::vector<std::pair<size_t, Int_t>> basketsInfo;
804 auto perfStats = GetTree()->GetPerfStats();
805
806 // printf("Will search %d branches for basket at %ld.\n", count, pos);
807 for (int i = 0; i < count; i++) {
808 TBranch *b =
809 all ? static_cast<TBranch *>(static_cast<TLeaf *>((fTree->GetListOfLeaves())->UncheckedAt(i))->GetBranch())
810 : fMissCache->fBranches[i];
811 IOPos iopos = FindBranchBasketPos(*b, entry);
812 if (iopos.fLen == 0) { // Error indicator
813 continue;
814 }
815 if (iopos.fPos == pos && iopos.fLen == len) {
816 found_request = true;
817 resultBranch = b;
818 // Note that we continue to iterate; fills up the rest of the entries in the cache.
819 }
820 // At this point, we are ready to push back a new offset
821 fMissCache->fEntries.emplace_back(iopos);
822
823 if (R__unlikely(perfStats)) {
824 Int_t blistsize = b->GetWriteBasket();
825 Int_t basketNumber = -1;
826 for (Int_t bn = 0; bn < blistsize; ++bn) {
827 if (iopos.fPos == b->GetBasketSeek(bn)) {
828 basketNumber = bn;
829 break;
830 }
831 }
832 if (basketNumber >= 0)
833 basketsInfo.emplace_back((size_t)i, basketNumber);
834 }
835 }
836 if (R__unlikely(!found_request)) {
837 // We have gone through all the branches in this file and the requested basket
838 // doesn't appear to be in any of them. Likely a logic error / bug.
839 fMissCache->fEntries.clear();
840 }
841 if (R__unlikely(perfStats)) {
842 for (auto &info : basketsInfo) {
843 perfStats->SetLoadedMiss(info.first, info.second);
844 }
845 }
846 return resultBranch;
847}
848
849////////////////////////////////////////////////////////////////////////////////
850///
851/// Process a cache miss; (pos, len) isn't in the buffer.
852///
853/// The first time we have a miss, we buffer as many baskets we can (up to the
854/// maximum size of the TTreeCache) in memory from all branches that are not in
855/// the prefetch list.
856///
857/// Subsequent times, we fetch all the buffers corresponding to branches that
858/// had previously seen misses. If it turns out the (pos, len) isn't in the
859/// list of branches, we treat this as if it was the first miss.
860///
861/// Returns true if we were able to pull the data into the miss cache.
862///
864{
865
866 bool firstMiss = false;
867 if (fFirstMiss == -1) {
869 firstMiss = true;
870 }
872 // The first time this is executed, we try to pull in as much data as we can.
873 TBranch *b = CalculateMissEntries(pos, len, firstMiss);
874 if (!b) {
875 if (!firstMiss) {
876 // TODO: this recalculates for *all* branches, throwing away the above work.
877 b = CalculateMissEntries(pos, len, true);
878 }
879 if (!b) {
880 // printf("ProcessMiss: pos %ld does not appear to correspond to a buffer in this file.\n", pos);
881 // We have gone through all the branches in this file and the requested basket
882 // doesn't appear to be in any of them. Likely a logic error / bug.
883 fMissCache->fEntries.clear();
884 return false;
885 }
886 }
887 // TODO: this should be a set.
888 fMissCache->fBranches.push_back(b);
889
890 // OK, sort the entries
891 std::sort(fMissCache->fEntries.begin(), fMissCache->fEntries.end());
892
893 // Now, fetch the buffer.
894 std::vector<Long64_t> positions;
895 positions.reserve(fMissCache->fEntries.size());
896 std::vector<Int_t> lengths;
897 lengths.reserve(fMissCache->fEntries.size());
898 ULong64_t cumulative = 0;
899 for (auto &mcentry : fMissCache->fEntries) {
900 positions.push_back(mcentry.fIO.fPos);
901 lengths.push_back(mcentry.fIO.fLen);
902 mcentry.fIndex = cumulative;
903 cumulative += mcentry.fIO.fLen;
904 }
905 fMissCache->fData.reserve(cumulative);
906 // printf("Reading %lu bytes into miss cache for %lu entries.\n", cumulative, fEntries->size());
907 fNMissReadPref += fMissCache->fEntries.size();
908 fFile->ReadBuffers(&(fMissCache->fData[0]), &(positions[0]), &(lengths[0]), fMissCache->fEntries.size());
910
911 return true;
912}
913
914////////////////////////////////////////////////////////////////////////////////
915/// Given an IO operation (pos, len) that was a cache miss in the primary TTC,
916/// try the operation again with the miss cache.
917///
918/// Returns true if the IO operation was successful and the contents of buf
919/// were populated with the requested data.
920///
921bool TTreeCache::CheckMissCache(char *buf, Long64_t pos, int len)
922{
923
924 if (!fOptimizeMisses) {
925 return false;
926 }
927 if (R__unlikely((pos < 0) || (len < 0))) {
928 return false;
929 }
930
931 // printf("Checking the miss cache for offset=%ld, length=%d\n", pos, len);
932
933 // First, binary search to see if the desired basket is already cached.
934 MissCache::Entry mcentry{IOPos{pos, len}};
935 auto iter = std::lower_bound(fMissCache->fEntries.begin(), fMissCache->fEntries.end(), mcentry);
936
937 if (iter != fMissCache->fEntries.end()) {
938 if (len > iter->fIO.fLen) {
940 return false;
941 }
942 auto offset = iter->fIndex;
943 memcpy(buf, &(fMissCache->fData[offset]), len);
944 // printf("Returning data from pos=%ld in miss cache.\n", offset);
945 ++fNMissReadOk;
946 return true;
947 }
948
949 // printf("Data not in miss cache.\n");
950
951 // Update the cache, looking for this (pos, len).
952 if (!ProcessMiss(pos, len)) {
953 // printf("Unable to pull data into miss cache.\n");
955 return false;
956 }
957
958 // OK, we updated the cache with as much information as possible. Search again for
959 // the entry we want.
960 iter = std::lower_bound(fMissCache->fEntries.begin(), fMissCache->fEntries.end(), mcentry);
961
962 if (iter != fMissCache->fEntries.end()) {
963 auto offset = iter->fIndex;
964 // printf("Expecting data at offset %ld in miss cache.\n", offset);
965 memcpy(buf, &(fMissCache->fData[offset]), len);
966 ++fNMissReadOk;
967 return true;
968 }
969
970 // This must be a logic bug. ProcessMiss should return false if (pos, len)
971 // wasn't put into fEntries.
973 return false;
974}
975
976////////////////////////////////////////////////////////////////////////////////
977/// End of methods for miss cache.
978////////////////////////////////////////////////////////////////////////////////
979
980namespace {
981struct BasketRanges {
982 struct Range {
983 Long64_t fMin; ///< Inclusive minimum
984 Long64_t fMax; ///< Inclusive maximum
985
986 Range() : fMin(-1), fMax(-1) {}
987
988 void UpdateMin(Long64_t min)
989 {
990 if (fMin == -1 || min < fMin)
991 fMin = min;
992 }
993
994 void UpdateMax(Long64_t max)
995 {
996 if (fMax == -1 || fMax < max)
997 fMax = max;
998 }
999
1000 bool Contains(Long64_t entry) { return (fMin <= entry && entry <= fMax); }
1001 };
1002
1003 std::vector<Range> fRanges;
1004 std::map<Long64_t,size_t> fMinimums;
1005 std::map<Long64_t,size_t> fMaximums;
1006
1007 BasketRanges(size_t nBranches) { fRanges.resize(nBranches); }
1008
1009 void Update(size_t branchNumber, Long64_t min, Long64_t max)
1010 {
1011 Range &range = fRanges.at(branchNumber);
1012 auto old(range);
1013
1014 range.UpdateMin(min);
1015 range.UpdateMax(max);
1016
1017 if (old.fMax != range.fMax) {
1018 if (old.fMax != -1) {
1019 auto maxIter = fMaximums.find(old.fMax);
1020 if (maxIter != fMaximums.end()) {
1021 if (maxIter->second == 1) {
1022 fMaximums.erase(maxIter);
1023 } else {
1024 --(maxIter->second);
1025 }
1026 }
1027 }
1028 ++(fMaximums[max]);
1029 }
1030 }
1031
1032 void Update(size_t branchNumber, size_t basketNumber, Long64_t *entries, size_t nb, size_t max)
1033 {
1034 Update(branchNumber, entries[basketNumber],
1035 (basketNumber < (nb - 1)) ? (entries[basketNumber + 1] - 1) : max - 1);
1036 }
1037
1038 // Check that fMaximums and fMinimums are properly set
1039 bool CheckAllIncludeRange()
1040 {
1041 Range result;
1042 for (const auto &r : fRanges) {
1043 if (result.fMin == -1 || result.fMin < r.fMin) {
1044 if (r.fMin != -1)
1045 result.fMin = r.fMin;
1046 }
1047 if (result.fMax == -1 || r.fMax < result.fMax) {
1048 if (r.fMax != -1)
1049 result.fMax = r.fMax;
1050 }
1051 }
1052 // if (result.fMax < result.fMin) {
1053 // // No overlapping range.
1054 // }
1055
1056 Range allIncludedRange(AllIncludedRange());
1057
1058 return (result.fMin == allIncludedRange.fMin && result.fMax == allIncludedRange.fMax);
1059 }
1060
1061 // This returns a Range object where fMin is the maximum of all the minimun entry
1062 // number loaded for each branch and fMax is the minimum of all the maximum entry
1063 // number loaded for each branch.
1064 // As such it is valid to have fMin > fMax, this is the case where there
1065 // are no overlap between the branch's range. For example for 2 branches
1066 // where we have for one the entry [50,99] and for the other [0,49] then
1067 // we will have fMin = max(50,0) = 50 and fMax = min(99,49) = 49
1068 Range AllIncludedRange()
1069 {
1070 Range result;
1071 if (!fMinimums.empty())
1072 result.fMin = fMinimums.rbegin()->first;
1073 if (!fMaximums.empty())
1074 result.fMax = fMaximums.begin()->first;
1075 return result;
1076 }
1077
1078 // Returns the number of branches with at least one baskets registered.
1079 UInt_t BranchesRegistered()
1080 {
1081 UInt_t result = 0;
1082 for (const auto &r : fRanges) {
1083 if (r.fMin != -1 && r.fMax != -1)
1084 ++result;
1085 }
1086 return result;
1087 }
1088
1089 // Returns true if at least one of the branch's range contains
1090 // the entry.
1091 bool Contains(Long64_t entry)
1092 {
1093 for (const auto &r : fRanges) {
1094 if (r.fMin != -1 && r.fMax != -1)
1095 if (r.fMin <= entry && entry <= r.fMax)
1096 return true;
1097 }
1098 return false;
1099 }
1100
1101 void Print()
1102 {
1103 for (size_t i = 0; i < fRanges.size(); ++i) {
1104 if (fRanges[i].fMin != -1 || fRanges[i].fMax != -1)
1105 Printf("Range #%zu : %lld to %lld", i, fRanges[i].fMin, fRanges[i].fMax);
1106 }
1107 }
1108};
1109} // Anonymous namespace.
1110
1111////////////////////////////////////////////////////////////////////////////////
1112/// Fill the cache buffer with the branches in the cache.
1113
1115{
1116
1117 if (fNbranches <= 0) return false;
1118 TTree *tree = ((TBranch*)fBranches->UncheckedAt(0))->GetTree();
1119 Long64_t entry = tree->GetReadEntry();
1120 Long64_t fEntryCurrentMax = 0;
1121
1122 if (entry != -1 && (entry < fEntryMin || fEntryMax < entry))
1123 return false;
1124
1125 if (fEnablePrefetching) { // Prefetching mode
1126 if (fIsLearning) { // Learning mode
1127 if (fEntryNext >= 0 && entry >= fEntryNext) {
1128 // entry is outside the learn range, need to stop the learning
1129 // phase. Doing so may trigger a recursive call to FillBuffer in
1130 // the process of filling both prefetching buffers
1132 fIsManual = false;
1133 }
1134 }
1135 if (fIsLearning) { // Learning mode
1136 // The learning phase should start from the minimum entry in the cache
1137 entry = fEntryMin;
1138 }
1139 if (fFirstTime) {
1140 //try to detect if it is normal or reverse read
1141 fFirstEntry = entry;
1142 }
1143 else {
1144 if (fFirstEntry == entry) return false;
1145 // Set the read direction
1146 if (!fReadDirectionSet) {
1147 if (entry < fFirstEntry) {
1148 fReverseRead = true;
1149 fReadDirectionSet = true;
1150 }
1151 else if (entry > fFirstEntry) {
1152 fReverseRead =false;
1153 fReadDirectionSet = true;
1154 }
1155 }
1156
1157 if (fReverseRead) {
1158 // Reverse reading with prefetching
1159 if (fEntryCurrent >0 && entry < fEntryNext) {
1160 // We can prefetch the next buffer
1161 if (entry >= fEntryCurrent) {
1162 entry = fEntryCurrent - tree->GetAutoFlush() * fFillTimes;
1163 }
1164 if (entry < 0) entry = 0;
1165 }
1166 else if (fEntryCurrent >= 0) {
1167 // We are still reading from the oldest buffer, no need to prefetch a new one
1168 return false;
1169 }
1170 if (entry < 0) return false;
1172 }
1173 else {
1174 // Normal reading with prefetching
1175 if (fEnablePrefetching) {
1176 if (entry < 0 && fEntryNext > 0) {
1177 entry = fEntryCurrent;
1178 } else if (entry >= fEntryCurrent) {
1179 if (entry < fEntryNext) {
1180 entry = fEntryNext;
1181 }
1182 }
1183 else {
1184 // We are still reading from the oldest buffer,
1185 // no need to prefetch a new one
1186 return false;
1187 }
1189 }
1190 }
1191 }
1192 }
1193
1194 // Set to true to enable all debug output without having to set gDebug
1195 // Replace this once we have a per module and/or per class debugging level/setting.
1196 static constexpr bool showMore = false;
1197
1198 static const auto PrintAllCacheInfo = [](TObjArray *branches) {
1199 for (Int_t i = 0; i < branches->GetEntries(); i++) {
1200 TBranch *b = (TBranch *)branches->UncheckedAt(i);
1201 b->PrintCacheInfo();
1202 }
1203 };
1204
1205 if (showMore || gDebug > 6)
1206 Info("FillBuffer", "***** Called for entry %lld", entry);
1207
1208 if (!fIsLearning && fEntryCurrent <= entry && entry < fEntryNext) {
1209 // Check if all the basket in the cache have already be used and
1210 // thus we can reuse the cache.
1211 bool allUsed = true;
1212 for (Int_t i = 0; i < fNbranches; ++i) {
1214 if (!b->fCacheInfo.AllUsed()) {
1215 allUsed = false;
1216 break;
1217 }
1218 }
1219 if (allUsed) {
1220 fEntryNext = entry;
1221 if (showMore || gDebug > 5)
1222 Info("FillBuffer", "All baskets used already, so refresh the cache early at entry %lld", entry);
1223 }
1224 if (gDebug > 8)
1225 PrintAllCacheInfo(fBranches);
1226 }
1227
1228 // If the entry is in the range we previously prefetched, there is
1229 // no point in retrying. Note that this will also return false
1230 // during the training phase (fEntryNext is then set intentional to
1231 // the end of the training phase).
1232 if (fEntryCurrent <= entry && entry < fEntryNext) return false;
1233
1234 // Triggered by the user, not the learning phase
1235 if (entry == -1)
1236 entry = 0;
1237
1238 bool resetBranchInfo = false;
1239 if (entry < fCurrentClusterStart || fNextClusterStart <= entry) {
1240 // We are moving on to another set of clusters.
1241 resetBranchInfo = true;
1242 if (showMore || gDebug > 6)
1243 Info("FillBuffer", "*** Will reset the branch information about baskets");
1244 } else if (showMore || gDebug > 6) {
1245 Info("FillBuffer", "*** Info we have on the set of baskets");
1246 PrintAllCacheInfo(fBranches);
1247 }
1248
1249 fEntryCurrentMax = fEntryCurrent;
1250 TTree::TClusterIterator clusterIter = tree->GetClusterIterator(entry);
1251
1252 auto entryCurrent = clusterIter();
1253 auto entryNext = clusterIter.GetNextEntry();
1254
1255 if (entryNext < fEntryMin || fEntryMax < entryCurrent) {
1256 // There is no overlap between the cluster we found [entryCurrent, entryNext[
1257 // and the authorized range [fEntryMin, fEntryMax]
1258 // so we have nothing to do
1259 return false;
1260 }
1261
1262 // If there is overlap between the found cluster and the authorized range
1263 // update the cache data members with the information about the current cluster.
1264 fEntryCurrent = entryCurrent;
1265 fEntryNext = entryNext;
1266
1267 auto firstClusterEnd = fEntryNext;
1268 if (showMore || gDebug > 6)
1269 Info("FillBuffer", "Looking at cluster spanning from %lld to %lld", fEntryCurrent, fEntryNext);
1270
1272 if (fEntryMax <= 0) fEntryMax = tree->GetEntries();
1274
1275 if ( fEnablePrefetching ) {
1276 if ( entry == fEntryMax ) {
1277 // We are at the end, no need to do anything else
1278 return false;
1279 }
1280 }
1281
1282 if (resetBranchInfo) {
1283 // We earlier thought we were onto the next set of clusters.
1284 if (fCurrentClusterStart != -1 || fNextClusterStart != -1) {
1285 if (!(fEntryCurrent < fCurrentClusterStart || fEntryCurrent >= fNextClusterStart)) {
1286 Error("FillBuffer", "Inconsistency: fCurrentClusterStart=%lld fEntryCurrent=%lld fNextClusterStart=%lld "
1287 "but fEntryCurrent should not be in between the two",
1289 }
1290 }
1291
1292 // Start the next cluster set.
1294 fNextClusterStart = firstClusterEnd;
1295 }
1296
1297 // Check if owner has a TEventList set. If yes we optimize for this
1298 // Special case reading only the baskets containing entries in the
1299 // list.
1300 TEventList *elist = fTree->GetEventList();
1301 Long64_t chainOffset = 0;
1302 if (elist) {
1303 if (fTree->IsA() ==TChain::Class()) {
1304 TChain *chain = (TChain*)fTree;
1305 Int_t t = chain->GetTreeNumber();
1306 chainOffset = chain->GetTreeOffset()[t];
1307 }
1308 }
1309
1310 //clear cache buffer
1311 Int_t ntotCurrentBuf = 0;
1312 if (fEnablePrefetching){ //prefetching mode
1313 if (fFirstBuffer) {
1315 ntotCurrentBuf = fNtot;
1316 }
1317 else {
1319 ntotCurrentBuf = fBNtot;
1320 }
1321 }
1322 else {
1324 ntotCurrentBuf = fNtot;
1325 }
1326
1327 //store baskets
1328 BasketRanges ranges((showMore || gDebug > 6) ? fNbranches : 0);
1329 BasketRanges reqRanges(fNbranches);
1330 BasketRanges memRanges((showMore || gDebug > 6) ? fNbranches : 0);
1331 Int_t clusterIterations = 0;
1332 Long64_t minEntry = fEntryCurrent;
1333 Int_t prevNtot;
1334 Long64_t maxReadEntry = minEntry; // If we are stopped before the end of the 2nd pass, this marker will where we need to start next time.
1335 Int_t nReadPrefRequest = 0;
1336 auto perfStats = GetTree()->GetPerfStats();
1337
1338 struct collectionInfo {
1339 Int_t fClusterStart{-1}; // First basket belonging to the current cluster
1340 Int_t fCurrent{-1}; // Currently visited basket
1341 bool fLoadedOnce{false};
1342
1343 void Rewind() { fCurrent = (fClusterStart >= 0) ? fClusterStart : 0; }
1344 };
1345 std::vector<collectionInfo> cursor(fNbranches);
1346
1347 // Main loop to fill the cache, inside each loop we will loop over
1348 // all the cached branch and collect the baskets within the 'current'
1349 // range/cluster. If there is still space in the cache after that, we
1350 // will do another iteration to add one more cluster to the cache.
1351 // i.e. essentially loop over the clusters.
1352 do {
1353 prevNtot = ntotCurrentBuf;
1354 Long64_t lowestMaxEntry = fEntryMax; // The lowest maximum entry in the TTreeCache for each branch for each pass.
1355
1356 bool reachedEnd = false;
1357 bool skippedFirst = false;
1358 bool oncePerBranch = false;
1359 Int_t nDistinctLoad = 0;
1360 bool progress = true;
1361 enum ENarrow {
1362 kFull = 0,
1363 kNarrow = 1
1364 };
1365 enum EPass {
1366 kStart = 1,
1367 kRegular = 2,
1368 kRewind = 3
1369 };
1370
1371 auto CollectBaskets = [this, elist, chainOffset, entry, clusterIterations, resetBranchInfo, perfStats,
1372 &cursor, &lowestMaxEntry, &maxReadEntry, &minEntry,
1373 &reachedEnd, &skippedFirst, &oncePerBranch, &nDistinctLoad, &progress,
1374 &ranges, &memRanges, &reqRanges,
1375 &ntotCurrentBuf, &nReadPrefRequest](EPass pass, ENarrow narrow, Long64_t maxCollectEntry) {
1376 // The first pass we add one basket per branches around the requested entry
1377 // then in the second pass we add the other baskets of the cluster.
1378 // This is to support the case where the cache is too small to hold a full cluster.
1379 Int_t nReachedEnd = 0;
1380 Int_t nSkipped = 0;
1381 auto oldnReadPrefRequest = nReadPrefRequest;
1382 std::vector<Int_t> potentialVetoes;
1383
1384 if (showMore || gDebug > 7)
1385 Info("CollectBaskets", "Called with pass=%d narrow=%d maxCollectEntry=%lld", pass, narrow, maxCollectEntry);
1386
1387 bool filled = false;
1388 for (Int_t i = 0; i < fNbranches; ++i) {
1390 if (b->GetDirectory()==nullptr || b->TestBit(TBranch::kDoNotProcess))
1391 continue;
1392 if (b->GetDirectory()->GetFile() != fFile)
1393 continue;
1394 potentialVetoes.clear();
1395 if (pass == kStart && !cursor[i].fLoadedOnce && resetBranchInfo) {
1396 // First check if we have any cluster that is currently in the
1397 // cache but was not used and would be reloaded in the next
1398 // cluster.
1399 b->fCacheInfo.GetUnused(potentialVetoes);
1400 if (showMore || gDebug > 7) {
1401 TString vetolist;
1402 for(auto v : potentialVetoes) {
1403 vetolist += v;
1404 vetolist.Append(' ');
1405 }
1406 if (!potentialVetoes.empty())
1407 Info("FillBuffer", "*** Potential Vetos for branch #%d: %s", i, vetolist.Data());
1408 }
1409 b->fCacheInfo.Reset();
1410 }
1411 Int_t nb = b->GetMaxBaskets();
1412 Int_t *lbaskets = b->GetBasketBytes();
1413 Long64_t *entries = b->GetBasketEntry();
1414 if (!lbaskets || !entries)
1415 continue;
1416 //we have found the branch. We now register all its baskets
1417 // from the requested offset to the basket below fEntryMax
1418 Int_t blistsize = b->GetListOfBaskets()->GetSize();
1419
1420 auto maxOfBasket = [this, nb, entries](int j) {
1421 return ((j < (nb - 1)) ? (entries[j + 1] - 1) : fEntryMax - 1);
1422 };
1423
1424 if (pass == kRewind)
1425 cursor[i].Rewind();
1426 else if (cursor[i].fCurrent == -1) {
1427 auto start = TMath::BinarySearch(b->GetWriteBasket() + 1, entries, minEntry);
1428 cursor[i].fCurrent = (start < 0) ? 0 : start;
1429 }
1430 for (auto &j = cursor[i].fCurrent; j < nb; j++) {
1431 // This basket has already been read, skip it
1432
1433 if (j < blistsize && b->GetListOfBaskets()->UncheckedAt(j)) {
1434
1435 if (showMore || gDebug > 6) {
1436 ranges.Update(i, entries[j], maxOfBasket(j));
1437 memRanges.Update(i, entries[j], maxOfBasket(j));
1438 }
1439 if (entries[j] <= entry && entry <= maxOfBasket(j)) {
1440 b->fCacheInfo.SetIsInCache(j);
1441 b->fCacheInfo.SetUsed(j);
1442 if (narrow) {
1443 // In narrow mode, we would select 'only' this basket,
1444 // so we are done for this round, let's 'consume' this
1445 // basket and go.
1446 ++nReachedEnd;
1447 ++j;
1448 break;
1449 }
1450 }
1451 continue;
1452 }
1453
1454 // Important: do not try to read maxCollectEntry, otherwise we might jump to the next autoflush
1455 if (entries[j] >= maxCollectEntry) {
1456 ++nReachedEnd;
1457 break; // break out of the for each branch loop.
1458 }
1459
1460 Long64_t pos = b->GetBasketSeek(j);
1461 Int_t len = lbaskets[j];
1462 if (pos <= 0 || len <= 0)
1463 continue;
1464 if (len > fBufferSizeMin) {
1465 // Do not cache a basket if it is bigger than the cache size!
1466 if ((showMore || gDebug > 7) &&
1467 (!(entries[j] < minEntry && (j < nb - 1 && entries[j + 1] <= minEntry))))
1468 Info("FillBuffer", "Skipping branch %s basket %d is too large for the cache: %d > %d",
1469 b->GetName(), j, len, fBufferSizeMin);
1470 continue;
1471 }
1472
1473 if (nReadPrefRequest && entries[j] > (reqRanges.AllIncludedRange().fMax + 1)) {
1474 // There is a gap between this basket and the max of the 'lowest' already loaded basket
1475 // If we are tight in memory, reading this basket may prevent reading the basket (for the other branches)
1476 // that covers this gap, forcing those baskets to be read uncached (because the cache wont be reloaded
1477 // until we use this basket).
1478 // eg. We could end up with the cache contain
1479 // b1: [428, 514[ // 'this' basket and we can assume [321 to 428[ is already in memory
1480 // b2: [400, 424[
1481 // and when reading entry 425 we will read b2's basket uncached.
1482
1483 if (showMore || gDebug > 8)
1484 Info("FillBuffer", "Skipping for now due to gap %d/%d with %lld > %lld", i, j, entries[j],
1485 (reqRanges.AllIncludedRange().fMax + 1));
1486 break; // Without consuming the basket.
1487 }
1488
1489 if (entries[j] < minEntry && (j<nb-1 && entries[j+1] <= minEntry))
1490 continue;
1491
1492 // We are within the range
1493 if (cursor[i].fClusterStart == -1)
1494 cursor[i].fClusterStart = j;
1495
1496 if (elist) {
1497 Long64_t emax = fEntryMax;
1498 if (j<nb-1)
1499 emax = entries[j + 1] - 1;
1500 if (!elist->ContainsRange(entries[j]+chainOffset,emax+chainOffset))
1501 continue;
1502 }
1503
1504 if (b->fCacheInfo.HasBeenUsed(j) || b->fCacheInfo.IsInCache(j) || b->fCacheInfo.IsVetoed(j)) {
1505 // We already cached and used this basket during this cluster range,
1506 // let's not redo it
1507 if (showMore || gDebug > 7)
1508 Info("FillBuffer", "Skipping basket to avoid redo: %d/%d veto: %d", i, j, b->fCacheInfo.IsVetoed(j));
1509 continue;
1510 }
1511
1512 if (std::find(std::begin(potentialVetoes), std::end(potentialVetoes), j) != std::end(potentialVetoes)) {
1513 // This basket was in the previous cache/cluster and was not used,
1514 // let's not read it again. I.e. we bet that it will continue to not
1515 // be used. At worst it will be used and thus read by itself.
1516 // Usually in this situation the basket is large so the penalty for
1517 // (re)reading it uselessly is high and the penalty to read it by
1518 // itself is 'small' (i.e. size bigger than latency).
1519 b->fCacheInfo.Veto(j);
1520 if (showMore || gDebug > 7)
1521 Info("FillBuffer", "Veto-ing cluster %d [%lld,%lld[ in branch %s #%d", j, entries[j],
1522 maxOfBasket(j) + 1, b->GetName(), i);
1523 continue;
1524 }
1525
1526 if (narrow) {
1527 if ((((entries[j] > entry)) || (j < nb - 1 && entries[j + 1] <= entry))) {
1528 // Keep only the basket that contains the entry
1529 if (j == cursor[i].fClusterStart && entry > entries[j])
1530 ++nSkipped;
1531 if (entries[j] > entry)
1532 break;
1533 else
1534 continue;
1535 }
1536 }
1537
1538 if ((ntotCurrentBuf + len) > fBufferSizeMin) {
1539 // Humm ... we are going to go over the requested size.
1540 if (clusterIterations > 0 && cursor[i].fLoadedOnce) {
1541 // We already have a full cluster and now we would go over the requested
1542 // size, let's stop caching (and make sure we start next time from the
1543 // end of the previous cluster).
1544 if (showMore || gDebug > 5) {
1545 Info(
1546 "FillBuffer",
1547 "Breaking early because %d is greater than %d at cluster iteration %d will restart at %lld",
1548 (ntotCurrentBuf + len), fBufferSizeMin, clusterIterations, minEntry);
1549 }
1550 fEntryNext = minEntry;
1551 filled = true;
1552 break;
1553 } else {
1554 if (pass == kStart || !cursor[i].fLoadedOnce) {
1555 if ((ntotCurrentBuf + len) > 4 * fBufferSizeMin) {
1556 // Okay, so we have not even made one pass and we already have
1557 // accumulated request for more than twice the memory size ...
1558 // So stop for now, and will restart at the same point, hoping
1559 // that the basket will still be in memory and not asked again ..
1560 fEntryNext = maxReadEntry;
1561
1562 if (showMore || gDebug > 5) {
1563 Info("FillBuffer", "Breaking early because %d is greater than 4*%d at cluster iteration "
1564 "%d pass %d will restart at %lld",
1565 (ntotCurrentBuf + len), fBufferSizeMin, clusterIterations, pass, fEntryNext);
1566 }
1567 filled = true;
1568 break;
1569 }
1570 } else {
1571 // We have made one pass through the branches and thus already
1572 // requested one basket per branch, let's stop prefetching
1573 // now.
1574 if ((ntotCurrentBuf + len) > 2 * fBufferSizeMin) {
1575 fEntryNext = maxReadEntry;
1576 if (showMore || gDebug > 5) {
1577 Info("FillBuffer", "Breaking early because %d is greater than 2*%d at cluster iteration "
1578 "%d pass %d will restart at %lld",
1579 (ntotCurrentBuf + len), fBufferSizeMin, clusterIterations, pass, fEntryNext);
1580 }
1581 filled = true;
1582 break;
1583 }
1584 }
1585 }
1586 }
1587
1588 ++nReadPrefRequest;
1589
1590 reqRanges.Update(i, j, entries, nb, fEntryMax);
1591 if (showMore || gDebug > 6)
1592 ranges.Update(i, j, entries, nb, fEntryMax);
1593
1594 b->fCacheInfo.SetIsInCache(j);
1595
1596 if (showMore || gDebug > 6)
1597 Info("FillBuffer", "*** Registering branch %d basket %d %s", i, j, b->GetName());
1598
1599 if (!cursor[i].fLoadedOnce) {
1600 cursor[i].fLoadedOnce = true;
1601 ++nDistinctLoad;
1602 }
1603 if (R__unlikely(perfStats)) {
1604 perfStats->SetLoaded(i, j);
1605 }
1606
1607 // Actual registering the basket for loading from the file.
1608 if (fEnablePrefetching){
1609 if (fFirstBuffer) {
1611 ntotCurrentBuf = fNtot;
1612 }
1613 else {
1615 ntotCurrentBuf = fBNtot;
1616 }
1617 }
1618 else {
1620 ntotCurrentBuf = fNtot;
1621 }
1622
1623 if ( ( j < (nb-1) ) && entries[j+1] > maxReadEntry ) {
1624 // Info("FillBuffer","maxCollectEntry incremented from %lld to %lld", maxReadEntry, entries[j+1]);
1625 maxReadEntry = entries[j+1];
1626 }
1627 if (ntotCurrentBuf > 4 * fBufferSizeMin) {
1628 // Humm something wrong happened.
1629 Warning("FillBuffer", "There is more data in this cluster (starting at entry %lld to %lld, "
1630 "current=%lld) than usual ... with %d %.3f%% of the branches we already have "
1631 "%d bytes (instead of %d)",
1632 fEntryCurrent, fEntryNext, entries[j], i, (100.0 * i) / ((float)fNbranches), ntotCurrentBuf,
1634 }
1635 if (pass == kStart) {
1636 // In the first pass, we record one basket per branch and move on to the next branch.
1637 auto high = maxOfBasket(j);
1638 if (high < lowestMaxEntry)
1639 lowestMaxEntry = high;
1640 // 'Consume' the baskets (i.e. avoid looking at it during a subsequent pass)
1641 ++j;
1642 break;
1643 } else if ((j + 1) == nb || entries[j + 1] >= maxReadEntry || entries[j + 1] >= lowestMaxEntry) {
1644 // In the other pass, load the baskets until we get to the maximum loaded so far.
1645 auto high = maxOfBasket(j);
1646 if (high < lowestMaxEntry)
1647 lowestMaxEntry = high;
1648 // 'Consume' the baskets (i.e. avoid looking at it during a subsequent pass)
1649 ++j;
1650 break;
1651 }
1652 }
1653
1654 if (cursor[i].fCurrent == nb) {
1655 ++nReachedEnd;
1656 }
1657
1658 if (gDebug > 0)
1659 Info("CollectBaskets",
1660 "Entry: %lld, registering baskets branch %s, fEntryNext=%lld, fNseek=%d, ntotCurrentBuf=%d",
1661 minEntry, ((TBranch *)fBranches->UncheckedAt(i))->GetName(), fEntryNext, fNseek, ntotCurrentBuf);
1662 }
1663 reachedEnd = (nReachedEnd == fNbranches);
1664 skippedFirst = (nSkipped > 0);
1665 oncePerBranch = (nDistinctLoad == fNbranches);
1666 progress = nReadPrefRequest - oldnReadPrefRequest;
1667 return filled;
1668 };
1669
1670 // First collect all the basket containing the request entry.
1671 bool full = false;
1672
1673 full = CollectBaskets(kStart, kNarrow, fEntryNext);
1674
1675 // Then fill out from all but the 'largest' branch to even out
1676 // the range across branches;
1677 while (!full && !reachedEnd && progress) { // used to be restricted to !oncePerBranch
1678 full = CollectBaskets(kStart, kFull, std::min(maxReadEntry, fEntryNext));
1679 }
1680
1681 resetBranchInfo = false; // Make sure the 2nd cluster iteration does not erase the info.
1682
1683 // Then fill out to the end of the cluster.
1684 if (!full && !fReverseRead) {
1685 do {
1686 full = CollectBaskets(kRegular, kFull, fEntryNext);
1687 } while (!full && !reachedEnd && progress);
1688 }
1689
1690 // The restart from the start of the cluster.
1691 if (!full && skippedFirst) {
1692 full = CollectBaskets(kRewind, kFull, fEntryNext);
1693 while (!full && !reachedEnd && progress) {
1694 full = CollectBaskets(kRegular, kFull, fEntryNext);
1695 }
1696 }
1697
1698 clusterIterations++;
1699
1700 minEntry = clusterIter.Next();
1701 if (fIsLearning) {
1702 fFillTimes++;
1703 }
1704
1705 // Continue as long as we still make progress (prevNtot < ntotCurrentBuf), that the next entry range to be looked
1706 // at,
1707 // which start at 'minEntry', is not past the end of the requested range (minEntry < fEntryMax)
1708 // and we guess that we not going to go over the requested amount of memory by asking for another set
1709 // of entries (fBufferSizeMin > ((Long64_t)ntotCurrentBuf*(clusterIterations+1))/clusterIterations).
1710 // ntotCurrentBuf / clusterIterations is the average size we are accumulated so far at each loop.
1711 // and thus (ntotCurrentBuf / clusterIterations) * (clusterIterations+1) is a good guess at what the next total
1712 // size
1713 // would be if we run the loop one more time. ntotCurrentBuf and clusterIterations are Int_t but can sometimes
1714 // be 'large' (i.e. 30Mb * 300 intervals) and can overflow the numerical limit of Int_t (i.e. become
1715 // artificially negative). To avoid this issue we promote ntotCurrentBuf to a long long (64 bits rather than 32
1716 // bits)
1717 if (!((fBufferSizeMin > ((Long64_t)ntotCurrentBuf * (clusterIterations + 1)) / clusterIterations) &&
1718 (prevNtot < ntotCurrentBuf) && (minEntry < fEntryMax))) {
1719 if (showMore || gDebug > 6)
1720 Info("FillBuffer", "Breaking because %d <= %lld || (%d >= %d) || %lld >= %lld", fBufferSizeMin,
1721 ((Long64_t)ntotCurrentBuf * (clusterIterations + 1)) / clusterIterations, prevNtot, ntotCurrentBuf,
1722 minEntry, fEntryMax);
1723 break;
1724 }
1725
1726 //for the reverse reading case
1727 if (!fIsLearning && fReverseRead) {
1728 if (clusterIterations >= fFillTimes)
1729 break;
1730 if (minEntry >= fEntryCurrentMax && fEntryCurrentMax > 0)
1731 break;
1732 }
1733 fEntryNext = clusterIter.GetNextEntry();
1736 } while (true);
1737
1738 if (showMore || gDebug > 6) {
1739 Info("FillBuffer", "Mem ranges");
1740 memRanges.Print();
1741 Info("FillBuffer", "Combined ranges");
1742 ranges.Print();
1743 Info("FillBuffer", "Requested ranges");
1744 reqRanges.Print();
1745 PrintAllCacheInfo(fBranches);
1746 }
1747
1748 if (nReadPrefRequest == 0) {
1749 // Nothing was added in the cache. This usually indicates that the baskets
1750 // contains the requested entry are either already in memory or are too large
1751 // on their own to fit in the cache.
1752 if (showMore || gDebug > 5) {
1753 Info("FillBuffer", "For entry %lld, nothing was added to the cache.", entry);
1754 }
1755 } else if (fEntryNext < firstClusterEnd && !reqRanges.Contains(entry)) {
1756 // Something went very wrong and even-though we searched for the baskets
1757 // holding 'entry' we somehow ended up with a range of entries that does
1758 // validate. So we must have been unable to find or fit the needed basket.
1759 // And thus even-though, we know the corresponding baskets wont be in the cache,
1760 // Let's make it official that 'entry' is within the range of this TTreeCache ('s search.)
1761
1762 // Without this, the next read will be flagged as 'out-of-range' and then we start at
1763 // the exact same point as this FillBuffer execution resulting in both the requested
1764 // entry still not being part of the cache **and** the beginning of the cluster being
1765 // read **again**.
1766
1767 if (showMore || gDebug > 5) {
1768 Error("FillBuffer", "Reset the next entry because the currently loaded range does not contains the request "
1769 "entry: %lld. fEntryNext updated from %lld to %lld. %d",
1770 entry, fEntryNext, firstClusterEnd, nReadPrefRequest);
1771 reqRanges.Print();
1772 }
1773
1774 fEntryNext = firstClusterEnd;
1775 } else {
1776 if (showMore || gDebug > 5) {
1777 Info("FillBuffer", "Complete adding %d baskets from %d branches taking in memory %d out of %d",
1778 nReadPrefRequest, reqRanges.BranchesRegistered(), ntotCurrentBuf, fBufferSizeMin);
1779 }
1780 }
1781
1782 fNReadPref += nReadPrefRequest;
1783 if (fEnablePrefetching) {
1784 if (fIsLearning) {
1786 }
1787 if (!fIsLearning && fFirstTime){
1788 // First time we add autoFlush entries , after fFillTimes * autoFlush
1789 // only in reverse prefetching mode
1790 fFirstTime = false;
1791 }
1792 }
1793 fIsLearning = false;
1794 return true;
1795}
1796
1797////////////////////////////////////////////////////////////////////////////////
1798/// Return the desired prefill type from the environment or resource variable
1799/// - 0 - No prefill
1800/// - 1 - All branches
1801
1803{
1804 const char *stcp;
1805 Int_t s = 0;
1806
1807 if (!(stcp = gSystem->Getenv("ROOT_TTREECACHE_PREFILL")) || !*stcp) {
1808 s = gEnv->GetValue("TTreeCache.Prefill", 1);
1809 } else {
1810 s = TString(stcp).Atoi();
1811 }
1812
1813 return static_cast<TTreeCache::EPrefillType>(s);
1814}
1815
1816////////////////////////////////////////////////////////////////////////////////
1817/// Give the total efficiency of the primary cache... defined as the ratio
1818/// of blocks found in the cache vs. the number of blocks prefetched
1819/// ( it could be more than 1 if we read the same block from the cache more
1820/// than once )
1821///
1822/// Note: This should eb used at the end of the processing or we will
1823/// get incomplete stats
1824
1826{
1827 if ( !fNReadPref )
1828 return 0;
1829
1830 return ((Double_t)fNReadOk / (Double_t)fNReadPref);
1831}
1832
1833////////////////////////////////////////////////////////////////////////////////
1834/// The total efficiency of the 'miss cache' - defined as the ratio
1835/// of blocks found in the cache versus the number of blocks prefetched
1836
1838{
1839 if (!fNMissReadPref) {
1840 return 0;
1841 }
1842 return static_cast<double>(fNMissReadOk) / static_cast<double>(fNMissReadPref);
1843}
1844
1845////////////////////////////////////////////////////////////////////////////////
1846/// This will indicate a sort of relative efficiency... a ratio of the
1847/// reads found in the cache to the number of reads so far
1848
1850{
1851 if ( !fNReadOk && !fNReadMiss )
1852 return 0;
1853
1854 return ((Double_t)fNReadOk / (Double_t)(fNReadOk + fNReadMiss));
1855}
1856
1857////////////////////////////////////////////////////////////////////////////////
1858/// Relative efficiency of the 'miss cache' - ratio of the reads found in cache
1859/// to the number of reads so far.
1860
1862{
1863 if (!fNMissReadOk && !fNMissReadMiss) {
1864 return 0;
1865 }
1866
1867 return static_cast<double>(fNMissReadOk) / static_cast<double>(fNMissReadOk + fNMissReadMiss);
1868}
1869
1870////////////////////////////////////////////////////////////////////////////////
1871/// Static function returning the number of entries used to train the cache
1872/// see SetLearnEntries
1873
1875{
1876 return fgLearnEntries;
1877}
1878
1879////////////////////////////////////////////////////////////////////////////////
1880/// Print cache statistics. Like:
1881///
1882/// ~~~ {.cpp}
1883/// ******TreeCache statistics for file: cms2.root ******
1884/// Number of branches in the cache ...: 1093
1885/// Cache Efficiency ..................: 0.997372
1886/// Cache Efficiency Rel...............: 1.000000
1887/// Learn entries......................: 100
1888/// Reading............................: 72761843 bytes in 7 transactions
1889/// Readahead..........................: 256000 bytes with overhead = 0 bytes
1890/// Average transaction................: 10394.549000 Kbytes
1891/// Number of blocks in current cache..: 210, total size: 6280352
1892/// ~~~
1893///
1894/// - if option = "a" the list of blocks in the cache is printed
1895/// see also class TTreePerfStats.
1896/// - if option contains 'cachedbranches', the list of branches being
1897/// cached is printed.
1898
1900{
1901 TString opt = option;
1902 opt.ToLower();
1903 printf("******TreeCache statistics for tree: %s in file: %s ******\n",fTree ? fTree->GetName() : "no tree set",fFile ? fFile->GetName() : "no file set");
1904 if (fNbranches <= 0) return;
1905 printf("Number of branches in the cache ...: %d\n",fNbranches);
1906 printf("Cache Efficiency ..................: %f\n",GetEfficiency());
1907 printf("Cache Efficiency Rel...............: %f\n",GetEfficiencyRel());
1908 printf("Secondary Efficiency ..............: %f\n", GetMissEfficiency());
1909 printf("Secondary Efficiency Rel ..........: %f\n", GetMissEfficiencyRel());
1910 printf("Learn entries......................: %d\n",TTreeCache::GetLearnEntries());
1911 if ( opt.Contains("cachedbranches") ) {
1912 opt.ReplaceAll("cachedbranches","");
1913 printf("Cached branches....................:\n");
1914 const TObjArray *cachedBranches = this->GetCachedBranches();
1915 Int_t nbranches = cachedBranches->GetEntriesFast();
1916 for (Int_t i = 0; i < nbranches; ++i) {
1917 TBranch* branch = (TBranch*) cachedBranches->UncheckedAt(i);
1918 printf("Branch name........................: %s\n",branch->GetName());
1919 }
1920 }
1922}
1923
1924////////////////////////////////////////////////////////////////////////////////
1925/// Old method ReadBuffer before the addition of the prefetch mechanism.
1926
1928 //Is request already in the cache?
1929 if (TFileCacheRead::ReadBuffer(buf,pos,len) == 1){
1930 fNReadOk++;
1931 return 1;
1932 }
1933
1934 static const auto recordMiss = [](TVirtualPerfStats *perfStats, TObjArray *branches, bool bufferFilled,
1935 Long64_t basketpos) {
1936 if (gDebug > 6)
1937 ::Info("TTreeCache::ReadBufferNormal", "Cache miss after an %s FillBuffer: pos=%lld",
1938 bufferFilled ? "active" : "inactive", basketpos);
1939 for (Int_t i = 0; i < branches->GetEntries(); ++i) {
1940 TBranch *b = (TBranch *)branches->UncheckedAt(i);
1941 Int_t blistsize = b->GetListOfBaskets()->GetSize();
1942 for (Int_t j = 0; j < blistsize; ++j) {
1943 if (basketpos == b->GetBasketSeek(j)) {
1944 if (gDebug > 6)
1945 ::Info("TTreeCache::ReadBufferNormal", " Missing basket: %d for %s", j, b->GetName());
1946 perfStats->SetMissed(i, j);
1947 }
1948 }
1949 }
1950 };
1951
1952 //not found in cache. Do we need to fill the cache?
1953 bool bufferFilled = FillBuffer();
1954 if (bufferFilled) {
1955 Int_t res = TFileCacheRead::ReadBuffer(buf,pos,len);
1956
1957 if (res == 1)
1958 fNReadOk++;
1959 else if (res == 0) {
1960 fNReadMiss++;
1961 auto perfStats = GetTree()->GetPerfStats();
1962 if (perfStats)
1963 recordMiss(perfStats, fBranches, bufferFilled, pos);
1964 }
1965
1966 return res;
1967 }
1968
1969 if (CheckMissCache(buf, pos, len)) {
1970 return 1;
1971 }
1972
1973 fNReadMiss++;
1974 auto perfStats = GetTree()->GetPerfStats();
1975 if (perfStats)
1976 recordMiss(perfStats, fBranches, bufferFilled, pos);
1977
1978 return 0;
1979}
1980
1981////////////////////////////////////////////////////////////////////////////////
1982/// Used to read a chunk from a block previously fetched. It will call FillBuffer
1983/// even if the cache lookup succeeds, because it will try to prefetch the next block
1984/// as soon as we start reading from the current block.
1985
1987{
1988 if (TFileCacheRead::ReadBuffer(buf, pos, len) == 1){
1989 //call FillBuffer to prefetch next block if necessary
1990 //(if we are currently reading from the last block available)
1991 FillBuffer();
1992 fNReadOk++;
1993 return 1;
1994 }
1995
1996 //keep on prefetching until request is satisfied
1997 // try to prefetch a couple of times and if request is still not satisfied then
1998 // fall back to normal reading without prefetching for the current request
1999 Int_t counter = 0;
2000 while (true) {
2001 if(TFileCacheRead::ReadBuffer(buf, pos, len)) {
2002 break;
2003 }
2004 FillBuffer();
2005 fNReadMiss++;
2006 counter++;
2007 if (counter>1) {
2008 return 0;
2009 }
2010 }
2011
2012 fNReadOk++;
2013 return 1;
2014}
2015
2016////////////////////////////////////////////////////////////////////////////////
2017/// Read buffer at position pos if the request is in the list of
2018/// prefetched blocks read from fBuffer.
2019/// Otherwise try to fill the cache from the list of selected branches,
2020/// and recheck if pos is now in the list.
2021/// Returns:
2022/// - -1 in case of read failure,
2023/// - 0 in case not in cache,
2024/// - 1 in case read from cache.
2025/// This function overloads TFileCacheRead::ReadBuffer.
2026
2028{
2029 if (!fEnabled) return 0;
2030
2032 return TTreeCache::ReadBufferPrefetch(buf, pos, len);
2033 else
2034 return TTreeCache::ReadBufferNormal(buf, pos, len);
2035}
2036
2037////////////////////////////////////////////////////////////////////////////////
2038/// This will simply clear the cache
2039
2041{
2042 for (Int_t i = 0; i < fNbranches; ++i) {
2044 if (b->GetDirectory()==nullptr || b->TestBit(TBranch::kDoNotProcess))
2045 continue;
2046 if (b->GetDirectory()->GetFile() != fFile)
2047 continue;
2048 b->fCacheInfo.Reset();
2049 }
2050 fEntryCurrent = -1;
2051 fEntryNext = -1;
2053 fNextClusterStart = -1;
2054
2056
2057 if (fEnablePrefetching) {
2058 fFirstTime = true;
2060 }
2061}
2062
2063////////////////////////////////////////////////////////////////////////////////
2064/// Change the underlying buffer size of the cache.
2065/// If the change of size means some cache content is lost, or if the buffer
2066/// is now larger, setup for a cache refill the next time there is a read
2067/// Returns:
2068/// - 0 if the buffer content is still available
2069/// - 1 if some or all of the buffer content has been made unavailable
2070/// - -1 on error
2071
2073{
2074 Int_t prevsize = GetBufferSize();
2075 Int_t res = TFileCacheRead::SetBufferSize(buffersize);
2076 if (res < 0) {
2077 return res;
2078 }
2079
2080 if (res == 0 && buffersize <= prevsize) {
2081 return res;
2082 }
2083
2084 // if content was removed from the buffer, or the buffer was enlarged then
2085 // empty the prefetch lists and prime to fill the cache again
2086
2088 if (fEnablePrefetching) {
2090 }
2091
2092 fEntryCurrent = -1;
2093 if (!fIsLearning) {
2094 fEntryNext = -1;
2095 }
2096
2097 return 1;
2098}
2099
2100////////////////////////////////////////////////////////////////////////////////
2101/// Set the minimum and maximum entry number to be processed
2102/// this information helps to optimize the number of baskets to read
2103/// when prefetching the branch buffers.
2104
2106{
2107 // This is called by TTreePlayer::Process in an automatic way...
2108 // don't restart it if the user has specified the branches.
2109 bool needLearningStart = (fEntryMin != emin) && fIsLearning && !fIsManual;
2110
2111 fEntryMin = emin;
2112 fEntryMax = emax;
2114 if (gDebug > 0)
2115 Info("SetEntryRange", "fEntryMin=%lld, fEntryMax=%lld, fEntryNext=%lld",
2117
2118 if (needLearningStart) {
2119 // Restart learning
2121 }
2122}
2123
2124////////////////////////////////////////////////////////////////////////////////
2125/// Change the file that is being cached.
2126
2128{
2129 // The infinite recursion is 'broken' by the fact that
2130 // TFile::SetCacheRead remove the entry from fCacheReadMap _before_
2131 // calling SetFile (and also by setting fFile to zero before the calling).
2132 if (fFile) {
2133 TFile *prevFile = fFile;
2134 fFile = nullptr;
2135 prevFile->SetCacheRead(nullptr, fTree, action);
2136 }
2137 TFileCacheRead::SetFile(file, action);
2138}
2139
2140////////////////////////////////////////////////////////////////////////////////
2141/// Static function to set the number of entries to be used in learning mode
2142/// The default value for n is 10. n must be >= 1
2143
2145{
2146 if (n < 1) n = 1;
2147 fgLearnEntries = n;
2148}
2149
2150////////////////////////////////////////////////////////////////////////////////
2151/// Set whether the learning period is started with a prefilling of the
2152/// cache and which type of prefilling is used.
2153/// The two value currently supported are:
2154/// - TTreeCache::kNoPrefill disable the prefilling
2155/// - TTreeCache::kAllBranches fill the cache with baskets from all branches.
2156/// The default prefilling behavior can be controlled by setting
2157/// TTreeCache.Prefill or the environment variable ROOT_TTREECACHE_PREFILL.
2158
2160{
2162}
2163
2164////////////////////////////////////////////////////////////////////////////////
2165/// The name should be enough to explain the method.
2166/// The only additional comments is that the cache is cleaned before
2167/// the new learning phase.
2168
2170{
2171 fIsLearning = true;
2172 fIsManual = false;
2173 fNbranches = 0;
2174 if (fBrNames) fBrNames->Delete();
2175 fIsTransferred = false;
2176 fEntryCurrent = -1;
2177}
2178
2179////////////////////////////////////////////////////////////////////////////////
2180/// This is the counterpart of StartLearningPhase() and can be used to stop
2181/// the learning phase. It's useful when the user knows exactly what branches
2182/// they are going to use.
2183/// For the moment it's just a call to FillBuffer() since that method
2184/// will create the buffer lists from the specified branches.
2185
2187{
2188 if (fIsLearning) {
2189 // This will force FillBuffer to read the buffers.
2190 fEntryNext = -1;
2191 fIsLearning = false;
2192 }
2193 fIsManual = true;
2194
2195 auto perfStats = GetTree()->GetPerfStats();
2196 if (perfStats)
2197 perfStats->UpdateBranchIndices(fBranches);
2198
2199 //fill the buffers only once during learning
2200 if (fEnablePrefetching && !fOneTime) {
2201 fIsLearning = true;
2202 FillBuffer();
2203 fOneTime = true;
2204 }
2205}
2206
2207////////////////////////////////////////////////////////////////////////////////
2208/// Update pointer to current Tree and recompute pointers to the branches in the cache.
2209
2211{
2212
2213 fTree = tree;
2214
2215 fEntryMin = 0;
2217
2218 fEntryCurrent = -1;
2219
2220 if (fBrNames->GetEntries() == 0 && fIsLearning) {
2221 // We still need to learn.
2223 } else {
2224 // We learnt from a previous file.
2225 fIsLearning = false;
2226 fEntryNext = -1;
2227 }
2228 fNbranches = 0;
2229
2230 TIter next(fBrNames);
2231 TObjString *os;
2232 while ((os = (TObjString*)next())) {
2233 TBranch *b = fTree->GetBranch(os->GetName());
2234 if (!b) {
2235 continue;
2236 }
2238 fNbranches++;
2239 }
2240
2241 auto perfStats = GetTree()->GetPerfStats();
2242 if (perfStats)
2243 perfStats->UpdateBranchIndices(fBranches);
2244}
2245
2246////////////////////////////////////////////////////////////////////////////////
2247/// Perform an initial prefetch, attempting to read as much of the learning
2248/// phase baskets for all branches at once
2249
2251{
2252 // This is meant for the learning phase
2253 if (!fIsLearning) return;
2254
2255 // This should be called before reading entries, otherwise we'll
2256 // always exit here, since TBranch adds itself before reading
2257 if (fNbranches > 0) return;
2258
2259 // Is the LearnPrefill enabled (using an Int_t here to allow for future
2260 // extension to alternative Prefilling).
2261 if (fPrefillType == kNoPrefill) return;
2262
2263 Long64_t entry = fTree ? fTree->GetReadEntry() : 0;
2264
2265 // Return early if we are out of the requested range.
2266 if (entry < fEntryMin || entry > fEntryMax) return;
2267
2268 fLearnPrefilling = true;
2269
2270
2271 // Force only the learn entries to be cached by temporarily setting min/max
2272 // to the learning phase entry range
2273 // But save all the old values, so we can restore everything to how it was
2274 Long64_t eminOld = fEntryMin;
2275 Long64_t emaxOld = fEntryMax;
2276 Long64_t ecurrentOld = fEntryCurrent;
2277 Long64_t enextOld = fEntryNext;
2278 auto currentClusterStartOld = fCurrentClusterStart;
2279 auto nextClusterStartOld = fNextClusterStart;
2280
2281 fEntryMin = std::max(fEntryMin, fEntryCurrent);
2282 fEntryMax = std::min(fEntryMax, fEntryNext);
2283
2284 // We check earlier that we are within the authorized range, but
2285 // we might still be out of the (default) learning range and since
2286 // this is called before any branch is added to the cache, this means
2287 // that the user's first GetEntry is this one which is outside of the
2288 // learning range ... so let's do something sensible-ish.
2289 // Note: we probably should also fix the learning range but we may
2290 // or may not have enough information to know if we can move it
2291 // (for example fEntryMin (eminOld right now) might be the default or user provided)
2292 if (entry < fEntryMin) fEntryMin = entry;
2293 if (entry > fEntryMax) fEntryMax = entry;
2294
2295 // Add all branches to be cached. This also sets fIsManual, stops learning,
2296 // and makes fEntryNext = -1 (which forces a cache fill, which is good)
2297 AddBranch("*");
2298 fIsManual = false; // AddBranch sets fIsManual, so we reset it
2299
2300 // Now, fill the buffer with the learning phase entry range
2301 FillBuffer();
2302
2303 // Leave everything the way we found it
2304 fIsLearning = true;
2305 DropBranch("*"); // This doesn't work unless we're already learning
2306
2307 // Restore entry values
2308 fEntryMin = eminOld;
2309 fEntryMax = emaxOld;
2310 fEntryCurrent = ecurrentOld;
2311 fEntryNext = enextOld;
2312 fCurrentClusterStart = currentClusterStartOld;
2313 fNextClusterStart = nextClusterStartOld;
2314
2315 fLearnPrefilling = false;
2316}
#define R__unlikely(expr)
Definition RConfig.hxx:586
#define b(i)
Definition RSha256.hxx:100
int Int_t
Definition RtypesCore.h:45
unsigned int UInt_t
Definition RtypesCore.h:46
constexpr Ssiz_t kNPOS
Definition RtypesCore.h:124
long long Long64_t
Definition RtypesCore.h:80
unsigned long long ULong64_t
Definition RtypesCore.h:81
const char Option_t
Definition RtypesCore.h:66
#define ClassImp(name)
Definition Rtypes.h:377
R__EXTERN TEnv * gEnv
Definition TEnv.h:170
Option_t Option_t option
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t cursor
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t Float_t Float_t Float_t Int_t Int_t UInt_t UInt_t Rectangle_t Int_t Int_t Window_t TString Int_t GCValues_t GetPrimarySelectionOwner GetDisplay GetScreen GetColormap GetNativeEvent const char const char dpyName wid window const char font_name cursor keysym reg const char only_if_exist regb h Point_t winding char text const char depth char const char Int_t count const char ColorStruct_t color const char Pixmap_t Pixmap_t PictureAttributes_t attr const char char ret_data h unsigned char height h offset
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t Float_t r
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t Float_t Float_t Float_t Int_t Int_t UInt_t UInt_t Rectangle_t result
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t Float_t Float_t Float_t Int_t Int_t UInt_t UInt_t Rectangle_t Int_t Int_t Window_t TString Int_t GCValues_t GetPrimarySelectionOwner GetDisplay GetScreen GetColormap GetNativeEvent const char const char dpyName wid window const char font_name cursor keysym reg const char only_if_exist regb h Point_t winding char text const char depth char const char Int_t count const char ColorStruct_t color const char Pixmap_t Pixmap_t PictureAttributes_t attr const char char ret_data h unsigned char height h Atom_t Int_t ULong_t ULong_t unsigned char prop_list Atom_t Atom_t Atom_t Time_t UChar_t len
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t Float_t Float_t Float_t Int_t Int_t UInt_t UInt_t Rectangle_t Int_t Int_t Window_t TString Int_t GCValues_t GetPrimarySelectionOwner GetDisplay GetScreen GetColormap GetNativeEvent const char const char dpyName wid window const char font_name cursor keysym reg const char only_if_exist regb h Point_t winding char text const char depth char const char Int_t count const char ColorStruct_t color const char Pixmap_t Pixmap_t PictureAttributes_t attr const char char ret_data h unsigned char height h Atom_t Int_t ULong_t ULong_t unsigned char prop_list Atom_t Atom_t Atom_t Time_t type
char name[80]
Definition TGX11.cxx:110
void Print(GNN_Data &d, std::string txt="")
Int_t gDebug
Definition TROOT.cxx:597
void Printf(const char *fmt,...)
Formats a string in a circular formatting buffer and prints the string.
Definition TString.cxx:2503
R__EXTERN TSystem * gSystem
Definition TSystem.h:555
A Branch for the case of an object.
A TTree is a list of TBranches.
Definition TBranch.h:93
@ kDoNotProcess
Active bit for branches.
Definition TBranch.h:105
void PrintCacheInfo() const
Print the information we have about which basket is currently cached and whether they have been 'used...
Definition TBranch.cxx:2445
Long64_t fEntries
Number of entries.
Definition TBranch.h:134
A chain is a collection of files containing TTree objects.
Definition TChain.h:33
static TClass * Class()
Long64_t * GetTreeOffset() const
Definition TChain.h:121
Int_t GetTreeNumber() const override
Definition TChain.h:120
virtual Int_t GetEntries() const
virtual Int_t GetValue(const char *name, Int_t dflt) const
Returns the integer value for a resource.
Definition TEnv.cxx:491
<div class="legacybox"><h2>Legacy Code</h2> TEventList is a legacy interface: there will be no bug fi...
Definition TEventList.h:31
virtual bool ContainsRange(Long64_t entrymin, Long64_t entrymax)
Return TRUE if list contains entries from entrymin to entrymax included.
A cache when reading files over the network.
virtual Int_t SetBufferSize(Int_t buffersize)
Sets the buffer size.
virtual Int_t ReadBuffer(char *buf, Long64_t pos, Int_t len)
Read buffer at position pos.
virtual void SecondPrefetch(Long64_t, Int_t)
Bool_t fEnablePrefetching
reading by prefetching asynchronously
Int_t fNtot
Total size of prefetched blocks.
virtual void Prefetch(Long64_t pos, Int_t len)
Add block of length len at position pos in the list of blocks to be prefetched.
Int_t fBufferSizeMin
Original size of fBuffer.
Bool_t fIsTransferred
True when fBuffer contains something valid.
TFile * fFile
Pointer to file.
Int_t fNseek
Number of blocks to be prefetched.
void Print(Option_t *option="") const override
Print cache statistics.
virtual Int_t GetBufferSize() const
virtual void SetFile(TFile *file, TFile::ECacheAction action=TFile::kDisconnect)
Set the file using this cache and reset the current blocks (if any).
A ROOT file is an on-disk file, usually with extension .root, that stores objects in a file-system-li...
Definition TFile.h:53
virtual void SetCacheRead(TFileCacheRead *cache, TObject *tree=nullptr, ECacheAction action=kDisconnect)
Set a pointer to the read cache.
Definition TFile.cxx:2365
virtual Bool_t ReadBuffers(char *buf, Long64_t *pos, Int_t *len, Int_t nbuf)
Read the nbuf blocks described in arrays pos and len.
Definition TFile.cxx:1821
ECacheAction
TTreeCache flushing semantics.
Definition TFile.h:70
A TFriendElement TF describes a TTree object TF in a file.
virtual TTree * GetTree()
Return pointer to friend TTree.
A TLeaf describes individual elements of a TBranch See TBranch structure in TTree.
Definition TLeaf.h:57
virtual TLeaf * GetLeafCount() const
If this leaf stores a variable-sized array or a multi-dimensional array whose last dimension has vari...
Definition TLeaf.h:121
TBranch * GetBranch() const
Definition TLeaf.h:116
A doubly linked list.
Definition TList.h:38
TObject * FindObject(const char *name) const override
Find an object in this list using its name.
Definition TList.cxx:576
void Add(TObject *obj) override
Definition TList.h:83
TObject * Remove(TObject *obj) override
Remove object from the list.
Definition TList.cxx:820
void Delete(Option_t *option="") override
Remove all objects from the list AND delete all heap based objects.
Definition TList.cxx:468
const char * GetName() const override
Returns name of object.
Definition TNamed.h:47
An array of TObjects.
Definition TObjArray.h:31
Int_t GetEntriesFast() const
Definition TObjArray.h:58
void AddAt(TObject *obj, Int_t idx) override
Add object at position ids.
virtual void AddAtAndExpand(TObject *obj, Int_t idx)
Add object at position idx.
TObject * UncheckedAt(Int_t i) const
Definition TObjArray.h:84
TObject * Remove(TObject *obj) override
Remove object from array.
Collectable string class.
Definition TObjString.h:28
const char * GetName() const override
Returns name of object.
Definition TObjString.h:38
virtual void Warning(const char *method, const char *msgfmt,...) const
Issue warning message.
Definition TObject.cxx:973
virtual void Error(const char *method, const char *msgfmt,...) const
Issue error message.
Definition TObject.cxx:987
virtual void Info(const char *method, const char *msgfmt,...) const
Issue info message.
Definition TObject.cxx:961
Regular expression class.
Definition TRegexp.h:31
Basic string class.
Definition TString.h:139
Ssiz_t Length() const
Definition TString.h:417
void ToLower()
Change string to lower-case.
Definition TString.cxx:1182
Int_t Atoi() const
Return integer value of string.
Definition TString.cxx:1988
const char * Data() const
Definition TString.h:376
TString & ReplaceAll(const TString &s1, const TString &s2)
Definition TString.h:704
TString & Append(const char *cs)
Definition TString.h:572
void Form(const char *fmt,...)
Formats a string using a printf style format descriptor.
Definition TString.cxx:2356
Bool_t Contains(const char *pat, ECaseCompare cmp=kExact) const
Definition TString.h:632
Ssiz_t Index(const char *pat, Ssiz_t i=0, ECaseCompare cmp=kExact) const
Definition TString.h:651
virtual const char * Getenv(const char *env)
Get environment variable.
Definition TSystem.cxx:1665
A cache to speed-up the reading of ROOT datasets.
Definition TTreeCache.h:32
virtual void SetLearnPrefill(EPrefillType type=kNoPrefill)
Set whether the learning period is started with a prefilling of the cache and which type of prefillin...
virtual void UpdateBranches(TTree *tree)
Update pointer to current Tree and recompute pointers to the branches in the cache.
Double_t GetEfficiencyRel() const
This will indicate a sort of relative efficiency... a ratio of the reads found in the cache to the nu...
bool fEnabled
! cache enabled for cached reading
Definition TTreeCache.h:63
Long64_t fLastMiss
! set to the event # of the last miss.
Definition TTreeCache.h:74
Int_t fNMissReadPref
Number of blocks read into the secondary ("miss") cache.
Definition TTreeCache.h:50
~TTreeCache() override
Destructor. (in general called by the TFile destructor)
Int_t fNMissReadOk
Number of blocks read, not found in the primary cache, and found in the secondary cache.
Definition TTreeCache.h:46
Long64_t fEntryMin
! first entry in the cache
Definition TTreeCache.h:38
void SetOptimizeMisses(bool opt)
Start of methods for the miss cache.
Int_t LearnBranch(TBranch *b, bool subgbranches=false) override
Add a branch discovered by actual usage to the list of branches to be stored in the cache this functi...
static void SetLearnEntries(Int_t n=10)
Static function to set the number of entries to be used in learning mode The default value for n is 1...
bool CheckMissCache(char *buf, Long64_t pos, int len)
Check the miss cache for a particular buffer, fetching if deemed necessary.
Long64_t fEntryNext
! next entry number where cache must be filled
Definition TTreeCache.h:41
Int_t fNReadMiss
Number of blocks read and not found in the cache.
Definition TTreeCache.h:47
bool fReverseRead
! reading in reverse mode
Definition TTreeCache.h:58
Double_t GetEfficiency() const
Give the total efficiency of the primary cache... defined as the ratio of blocks found in the cache v...
virtual bool FillBuffer()
Fill the cache buffer with the branches in the cache.
TTree * GetTree() const
Definition TTreeCache.h:149
Long64_t fNextClusterStart
! End+1 of the cluster(s) where the current content was picked out
Definition TTreeCache.h:43
bool ProcessMiss(Long64_t pos, int len)
! Given a file read not in the miss cache, handle (possibly) loading the data.
const TObjArray * GetCachedBranches() const
Definition TTreeCache.h:139
bool fIsLearning
! true if cache is in learning mode
Definition TTreeCache.h:54
Int_t fNMissReadMiss
Number of blocks read and not found in either cache.
Definition TTreeCache.h:48
IOPos FindBranchBasketPos(TBranch &, Long64_t entry)
Given a branch and an entry, determine the file location (offset / size) of the corresponding basket.
virtual void SetEntryRange(Long64_t emin, Long64_t emax)
Set the minimum and maximum entry number to be processed this information helps to optimize the numbe...
TTreeCache()
Default Constructor.
std::unique_ptr< MissCache > fMissCache
! Cache contents for misses
Definition TTreeCache.h:105
bool fFirstBuffer
! true if first buffer is used for prefetching
Definition TTreeCache.h:56
bool fOptimizeMisses
! true if we should optimize cache misses.
Definition TTreeCache.h:72
void StartLearningPhase()
The name should be enough to explain the method.
TBranch * CalculateMissEntries(Long64_t, int, bool)
Given an file read, try to determine the corresponding branch.
Long64_t fCurrentClusterStart
! Start of the cluster(s) where the current content was picked out
Definition TTreeCache.h:42
Double_t GetMissEfficiency() const
The total efficiency of the 'miss cache' - defined as the ratio of blocks found in the cache versus t...
virtual Int_t ReadBufferNormal(char *buf, Long64_t pos, Int_t len)
Old method ReadBuffer before the addition of the prefetch mechanism.
virtual void ResetCache()
This will simply clear the cache.
bool fOneTime
! used in the learning phase
Definition TTreeCache.h:57
EPrefillType fPrefillType
Whether a pre-filling is enabled (and if applicable which type)
Definition TTreeCache.h:64
bool fLearnPrefilling
! true if we are in the process of executing LearnPrefill
Definition TTreeCache.h:68
virtual void LearnPrefill()
Perform an initial prefetch, attempting to read as much of the learning phase baskets for all branche...
virtual Int_t DropBranch(TBranch *b, bool subbranches=false)
Remove a branch to the list of branches to be stored in the cache this function is called by TBranch:...
Long64_t fEntryMax
! last entry in the cache
Definition TTreeCache.h:39
static Int_t fgLearnEntries
number of entries used for learning mode
Definition TTreeCache.h:65
virtual Int_t ReadBufferPrefetch(char *buf, Long64_t pos, Int_t len)
Used to read a chunk from a block previously fetched.
Long64_t fEntryCurrent
! current lowest entry number in the cache
Definition TTreeCache.h:40
void ResetMissCache()
Reset all the miss cache training.
Long64_t fFirstMiss
! set to the event # of the first miss.
Definition TTreeCache.h:73
Int_t ReadBuffer(char *buf, Long64_t pos, Int_t len) override
Read buffer at position pos if the request is in the list of prefetched blocks read from fBuffer.
bool fReadDirectionSet
! read direction established
Definition TTreeCache.h:62
bool fIsManual
! true if cache is StopLearningPhase was used
Definition TTreeCache.h:55
Double_t GetMissEfficiencyRel() const
Relative efficiency of the 'miss cache' - ratio of the reads found in cache to the number of reads so...
Int_t SetBufferSize(Int_t buffersize) override
Change the underlying buffer size of the cache.
Long64_t fFirstEntry
! save the value of the first entry
Definition TTreeCache.h:61
Int_t fFillTimes
! how many times we can fill the current buffer
Definition TTreeCache.h:59
Int_t fNReadPref
Number of blocks that were prefetched.
Definition TTreeCache.h:49
TTree * fTree
! pointer to the current Tree
Definition TTreeCache.h:53
EPrefillType GetConfiguredPrefillType() const
Return the desired prefill type from the environment or resource variable.
static Int_t GetLearnEntries()
Static function returning the number of entries used to train the cache see SetLearnEntries.
virtual void StopLearningPhase()
This is the counterpart of StartLearningPhase() and can be used to stop the learning phase.
void SetFile(TFile *file, TFile::ECacheAction action=TFile::kDisconnect) override
Change the file that is being cached.
Int_t fNbranches
! Number of branches in the cache
Definition TTreeCache.h:44
Int_t fNReadOk
Number of blocks read and found in the cache.
Definition TTreeCache.h:45
void Print(Option_t *option="") const override
Print cache statistics.
Int_t AddBranch(TBranch *b, bool subgbranches=false) override
Add a branch to the list of branches to be stored in the cache this function is called by the user vi...
TObjArray * fBranches
! List of branches to be stored in the cache
Definition TTreeCache.h:51
TList * fBrNames
! list of branch names in the cache
Definition TTreeCache.h:52
bool fFirstTime
! save the fact that we processes the first entry
Definition TTreeCache.h:60
Helper class to iterate over cluster of baskets.
Definition TTree.h:270
Long64_t Next()
Move on to the next cluster and return the starting entry of this next cluster.
Definition TTree.cxx:650
Long64_t GetNextEntry()
Definition TTree.h:307
A TTree represents a columnar dataset.
Definition TTree.h:79
virtual TVirtualPerfStats * GetPerfStats() const
Definition TTree.h:506
virtual TBranch * GetBranch(const char *name)
Return pointer to the branch with the given name in this tree or its friends.
Definition TTree.cxx:5294
virtual TObjArray * GetListOfLeaves()
Definition TTree.h:489
virtual Long64_t GetEntries() const
Definition TTree.h:463
virtual Long64_t GetReadEntry() const
Definition TTree.h:509
virtual TTree * GetTree() const
Definition TTree.h:517
TEventList * GetEventList() const
Definition TTree.h:473
TClass * IsA() const override
Definition TTree.h:659
virtual TList * GetListOfFriends() const
Definition TTree.h:490
Provides the interface for the PROOF internal performance measurement and event tracing.
virtual void SetMissed(TBranch *b, size_t basketNumber)=0
virtual void UpdateBranchIndices(TObjArray *branches)=0
const Int_t n
Definition legend1.C:16
__device__ AFloat max(AFloat x, AFloat y)
Definition Kernels.cuh:207
Long64_t BinarySearch(Long64_t n, const T *array, T value)
Binary search in an array of n values to locate value.
Definition TMathBase.h:347
Int_t fLen
Position in file of cache entry.
Definition TTreeCache.h:82
Ta Range(0, 0, 1, 1)