
Log of /trunk/io/io/inc/TFPBlock.h




Revision 46419
Modified Tue Oct 9 20:22:10 2012 UTC (2 years, 3 months ago) by pcanal
File length: 4287 byte(s)
Diff to previous 46361
From Elvin:

Make a cleaner distinction between the actual capacity of a recycled block and the useful data it contains. The previous approach fixed the bug, but recycling was still unclear because the capacity was confused with the amount of useful data in the buffer.
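
The distinction between capacity and useful data can be illustrated with a minimal sketch. This is not the actual TFPBlock code: the class RecyclableBlock, its members and its Recycle method are hypothetical names chosen for illustration, and it assumes that a recycled block only ever grows its allocation.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for a recyclable block; not the real TFPBlock.
class RecyclableBlock {
public:
   RecyclableBlock() : fBuffer(nullptr), fCapacity(0), fDataSize(0) {}
   ~RecyclableBlock() { std::free(fBuffer); }
   RecyclableBlock(const RecyclableBlock &) = delete;
   RecyclableBlock &operator=(const RecyclableBlock &) = delete;

   // Prepare the block for a new request that needs `needed` bytes.
   void Recycle(std::size_t needed)
   {
      if (needed > fCapacity) {
         // Grow the allocation only when the request does not fit.
         void *grown = std::realloc(fBuffer, needed);
         if (!grown) throw std::bad_alloc();
         fBuffer = static_cast<char *>(grown);
         fCapacity = needed;
      }
      // The useful size always follows the current request,
      // even when the allocation itself is left untouched.
      fDataSize = needed;
   }

   std::size_t Capacity() const { return fCapacity; } // bytes allocated
   std::size_t DataSize() const { return fDataSize; } // bytes in use

private:
   char *fBuffer;         // owned allocation
   std::size_t fCapacity; // size of the allocation
   std::size_t fDataSize; // portion holding valid data
};
```

In this sketch, recycling a 100-byte block for a 40-byte request leaves the capacity at 100 while the useful size drops to 40, so code that re-reads the block can rely on DataSize() rather than on the allocation size.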

Revision 46361
Modified Fri Oct 5 18:47:05 2012 UTC (2 years, 3 months ago) by pcanal
File length: 4016 byte(s)
Diff to previous 44761
From Elvin:

I tracked down a small bug in TFPBlock.cxx::ReallocBlock, which was not correctly updating the total size of the recycled block. This could only manifest itself when the block was to be re-read from the local cache.

I fixed this with a one-liner, but I also took the opportunity to change the type of fullBlockSize from Int_t to Long64_t, just to be on the safe side, and to fix a possible memory leak in TFPBlock. I added the patch to this email. I reran the tests and they all pass.

Revision 44761
Modified Wed Jun 27 13:41:06 2012 UTC (2 years, 6 months ago) by pcanal
File length: 4009 byte(s)
Diff to previous 43276
From Elvin (and Brian):

I looked over the optimisation suggestions that you sent me and I implemented 4/5 of them. Below is the summary.

1) 60% of the time spent in TFileCacheRead::ReadBuffer is from TStorage::ReAllocChar.  ReAllocChar spends 86% of its time in memcpy, 8% in alloc, and 6% in memset.  It appears that, when a buffer is recycled, the contents of the old buffer (which are then overwritten) are copied over.

I modified the call to ReAllocChar so that it does not copy the old contents.
Unfortunately, in my testing, this wasn't enough: later on, ReAllocChar zeros out the contents of the array, which has basically the same overhead as copying.
No version of TStorage::ReAlloc satisfies the current requirements, so I'm using the classic realloc for the TFPBlock buffer.

2) There are a few function calls that could be inlined which aren't inlined by the compiler (GCC 4.6.2).  Particularly, TFPBlock::GetLen, TFPBlock::GetBuffer, TFPBlock::GetNoElem, and TFPBlock::GetPos.

Done - I inlined them explicitly; this should do the trick.

3) TTreeCache and TFilePrefetch both keep a sorted list of buffers that TFilePrefetch maintains.  When TFileCacheRead::ReadBuffer is called, a binary search is called on both.  We can eliminate one of the binary searches and save 3%.

This would require some major changes and it would also affect the normal reading pattern (i.e. when reading without prefetching enabled). I suggest keeping it as it is for the time being so that we maintain compatibility with normal reading without prefetching.

4) TFilePrefetch::ReadBuffer calculates the offset into the block's buffer (ptrInt) on-demand.  You could probably win a few more percent here by pre-calculating these offsets for the TFPBlock.

Done - added a new vector of relative offsets of the pieces in the buffer (in TFPBlock).

5) The deadlock issue.

Done - I moved to a cleaner and simpler way to kill the thread by using cancellation. The deadlock situation was introduced in the last patch that I sent you when I was dealing with the TChain issue. The mutex locking was not related to the condition variable, but to the synchronisation with TChain.

Brian: Thread cancellation scares the heck out of me - it's much harder to get correct than condition variables, and goes against most best practices.  I'd much rather fix the usage of conditions and have an explicit synchronization for killing the helpers.

Elvin also reverted to classic condition variables and semaphores when killing the worker thread.
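
The explicit synchronization that Brian asks for can be sketched with a stop flag guarded by a mutex and signalled through a condition variable. The sketch below is only an illustration under assumptions: it uses the C++11 standard library rather than ROOT's own threading classes, and PrefetchWorker, Submit, Stop and Processed are hypothetical names, not part of TFilePrefetch.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical worker; illustrates shutdown without thread cancellation.
class PrefetchWorker {
public:
   PrefetchWorker()
      : fStop(false), fPending(0), fProcessed(0),
        fThread(&PrefetchWorker::Run, this) {}

   ~PrefetchWorker() { Stop(); }

   // Queue one unit of work and wake the worker.
   void Submit()
   {
      {
         std::lock_guard<std::mutex> lock(fMutex);
         ++fPending;
      }
      fCond.notify_one();
   }

   // Ask the worker to finish pending work, then wait for it to exit.
   void Stop()
   {
      {
         std::lock_guard<std::mutex> lock(fMutex);
         fStop = true;
      }
      fCond.notify_one();
      if (fThread.joinable()) fThread.join();
   }

   int Processed()
   {
      std::lock_guard<std::mutex> lock(fMutex);
      return fProcessed;
   }

private:
   void Run()
   {
      std::unique_lock<std::mutex> lock(fMutex);
      for (;;) {
         // Sleep until there is work or a stop request.
         fCond.wait(lock, [this] { return fStop || fPending > 0; });
         if (fPending > 0) {
            --fPending;   // a real worker would read a block here
            ++fProcessed;
            continue;
         }
         return;          // stop requested and nothing left to do
      }
   }

   std::mutex fMutex;
   std::condition_variable fCond;
   bool fStop;
   int fPending;
   int fProcessed;
   std::thread fThread;   // declared last so it starts after the other members
};
```

Because the worker checks the flag only at a well-defined point and is then joined, it never exits while holding a lock or in the middle of a read, which is the main hazard of cancellation. In this sketch the worker drains the queued work before exiting; a real implementation might prefer to drop it.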

Revision 43276
Modified Wed Mar 7 17:13:42 2012 UTC (2 years, 10 months ago) by pcanal
File length: 2465 byte(s)
Diff to previous 41698
Coverity number 35355,35805,35666,35708,35511,35782,35782,35642,35787,35796,35653,35806,35667,
35670,35809,35810,35671,35673,35812,35688,35283,35824,35689,35825,35690,35691,35826,35827,35692,
35635,35636,35275
about missing operator= and/or copy constructors.
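
Reports of this kind typically concern classes that own a raw pointer but rely on the compiler-generated copy operations, which would lead to a double delete. One common way to address them, shown here as a sketch and not necessarily what this revision did, is to declare the copy constructor and assignment operator private and leave them undefined. OwningBlock is a hypothetical class used only for illustration.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical class owning a raw buffer; not taken from ROOT.
class OwningBlock {
public:
   explicit OwningBlock(std::size_t size)
      : fBuffer(new char[size]), fSize(size) {}
   ~OwningBlock() { delete[] fBuffer; }

   std::size_t Size() const { return fSize; }

private:
   // Declared but never defined: any attempt to copy fails to compile
   // (or to link, if attempted from inside the class).
   OwningBlock(const OwningBlock &);
   OwningBlock &operator=(const OwningBlock &);

   char *fBuffer;     // owned; copying it blindly would double-delete
   std::size_t fSize;
};

// Compile-time checks that copying is indeed unavailable to users.
static_assert(!std::is_copy_constructible<OwningBlock>::value,
              "OwningBlock must not be copy-constructible");
static_assert(!std::is_copy_assignable<OwningBlock>::value,
              "OwningBlock must not be copy-assignable");
```

With C++11 or later, marking both operations as deleted expresses the same intent more directly.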

Revision 41698
Modified Tue Nov 1 21:22:54 2011 UTC (3 years, 2 months ago) by pcanal
File length: 2342 byte(s)
Diff to previous 39296
From Elvin:
Last week Martin Vala from ALICE came to me with a problem that he had
while using the asynchronous prefetching. There were basically two
main problems:

1. Trying to read a root file from an archive. Here the problem was
that when reading from an archive there is an offset of the file which
was not taken into consideration when the file was saved in the cache.
This led to a miss when reading the file from cache. I fixed it,
but I had to expose the value of fArchiveOffset from TFile.

2. The second problem was when reading using a TChain. There were some
synchronization issues related to the asynchronous thread that
actually does the reading. All this was happening because in the case
of TChain there is only one file cache, which is reused as we move
from one file to another. This was a pretty tricky issue.

I attached a patch made against the current trunk which fixes both
of these problems. I gave the patch first to Martin to test, and he was
satisfied with it. There is a small delay when the TChain moves from
one file to another because I have to wait for the async thread to
finish its work, but overall Martin said that the performance is
way better than before. When I initially did the async prefetching I
had no idea about these two use cases, which is why they popped up
now.



Revision 39296
Modified Fri May 20 12:42:16 2011 UTC (3 years, 8 months ago) by rdm
File length: 2309 byte(s)
Diff to previous 39275
some code cleanup, added class descriptions and svn ident lines.

Revision 39275
Added Thu May 19 18:17:37 2011 UTC (3 years, 8 months ago) by pcanal
File length: 1725 byte(s)
From Elvin Alin Sindrilaru:

The prefetching mechanism uses two new classes (TFilePrefetch and 
TFPBlock) to prefetch a block of entries. There is a second 
thread which takes care of actually transferring the blocks and making 
them available to the main requesting thread. Therefore, the time spent 
by the main thread waiting for the data before processing decreases 
considerably. Besides the prefetching mechanism there is also a local 
caching option which can be enabled by the user. Both capabilities are 
disabled by default and must be explicitly enabled by the user. 

In order to enable the prefetching the user must set the ROOT 
environment (gEnv) value "TFile.AsyncPrefetching" as follows:
   gEnv->SetValue("TFile.AsyncPrefetching", 1);
Only when the prefetching is enabled can the user set the local cache 
directory in which the transferred files can be saved. For subsequent 
reads of the same file the system will use the local copy of the file 
from the cache. To set up a local cache directory, a client can use the 
following commands:

   TString cachedir="file:/tmp/xcache/";
   // or using xrootd on port 2000 
   // TString cachedir="root://localhost:2000//tmp/xrdcache1/";
   gEnv->SetValue("Cache.Directory", cachedir.Data());  

The "TFilePrefetch" class is responsible for actually reading and storing 
the requests received from the main thread. It also creates the working 
thread which will transfer all the information. Apart from managing the 
block requests, it also deals with caching the blocks on the local machine 
and retrieving them when necessary. 

The "TFPBlock" class represents the encapsulation of a block request. It 
contains the chunks to be prefetched and also serves as a container for 
the information read.
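
A minimal sketch of such a block is given below, under stated assumptions: ChunkBlock and its members are hypothetical and do not mirror the real TFPBlock interface, the two input vectors are assumed to have the same length, and all chunks are stored back to back in a single buffer. The relative offset of each chunk is computed once at construction instead of on every read.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical sketch; names do not mirror the real TFPBlock interface.
class ChunkBlock {
public:
   // positions[i] and lengths[i] describe chunk i in the file.
   // Assumes both vectors have the same number of elements.
   ChunkBlock(const std::vector<long long> &positions,
              const std::vector<std::size_t> &lengths)
      : fPos(positions), fLen(lengths)
   {
      // Precompute where each chunk starts inside the single buffer.
      std::size_t running = 0;
      fRelOffset.reserve(fLen.size());
      for (std::size_t len : fLen) {
         fRelOffset.push_back(running);
         running += len;
      }
      fBuffer.resize(running);
   }

   std::size_t NumChunks() const { return fLen.size(); }
   std::size_t TotalSize() const { return fBuffer.size(); }
   long long FilePos(std::size_t i) const { return fPos[i]; }
   std::size_t RelOffset(std::size_t i) const { return fRelOffset[i]; }

   // Store the bytes read from the file for chunk i.
   void Fill(std::size_t i, const char *data)
   {
      if (fLen[i] == 0) return;
      std::memcpy(fBuffer.data() + fRelOffset[i], data, fLen[i]);
   }

   // Pointer to the stored bytes of chunk i.
   const char *ChunkData(std::size_t i) const
   {
      return fBuffer.data() + fRelOffset[i];
   }

private:
   std::vector<long long> fPos;          // file position of each chunk
   std::vector<std::size_t> fLen;        // length of each chunk
   std::vector<std::size_t> fRelOffset;  // start of each chunk in fBuffer
   std::vector<char> fBuffer;            // container for the data read
};
```

For three chunks of lengths 4, 2 and 3, the relative offsets in this sketch are 0, 4 and 6, and the buffer holds 9 bytes in total.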

In order to accommodate the new prefetching mechanism the following files 
have undergone considerable modifications: TFileCacheRead.cxx/.h, 
TTreeCache.cxx/.h and, to a lesser extent, TXNetFile.cxx and TFile.h. 
Basically in TFileCacheRead we've added the logic for dealing with the 
second buffer that is prefetched. In TTreeCache during prefetching the 
method FillBuffer is called after each read so that once the main thread 
starts reading from the last available buffer, the second thread starts 
prefetching the next block.
