Log of /trunk/io/io/src/TFilePrefetch.cxx
Revision 48306 - Modified Wed Jan 16 16:01:47 2013 UTC (2 years ago) by pcanal
File length: 15721 byte(s)
Diff to previous 46616
From Elvin:
There was a race condition between the destructor of the TTree object and the logic that kills the asynchronous thread doing the prefetching.
In more detail: in TTree::~TTree (line 789) the TFileCacheRead object of the current file is set to 0. All the logic to kill the async thread lives in the destructor of TFilePrefetch, which in turn is called from the destructor of TFileCacheRead; two lines below in the same file, the destructor of TFileCacheRead is called. TFilePrefetch originally held a pointer to the file object through TFileCacheRead, which is now 0. Therefore, during the destruction of the TFilePrefetch object we no longer have a valid TFile pointer, so we cannot wait for the ongoing requests to be satisfied. This was the cause of the crash. To fix it, I removed the killing of the async thread from the destructor of TFilePrefetch and put it in a separate method called WaitFinishPrefetch. This way we avoid trying to wait on requests for a file while no longer holding a valid pointer to that file.
Revision 46616 - Modified Wed Oct 17 15:08:28 2012 UTC (2 years, 3 months ago) by pcanal
File length: 15430 byte(s)
Diff to previous 46419
From Elvin:
Basically, the problem was that the second time the downloading started, the main thread signalled the addition of a new block to the pending list far faster than the async thread could start up and block in the corresponding wait. The signal was therefore missed by the async thread, which remained blocked waiting for a signal that would never come.
This problem never showed up before because we never started a second download in the same executable, and that made all the difference. I attached the patch to this email.
Revision 46419 - Modified Tue Oct 9 20:22:10 2012 UTC (2 years, 3 months ago) by pcanal
File length: 14716 byte(s)
Diff to previous 46360
From Elvin:
Make a cleaner distinction between the actual capacity of a recycled block and the useful data it contains. The previous approach fixed the bug, but the block-recycling code remained unclear because the capacity was conflated with the amount of useful data in the buffer.
Revision 46360 - Modified Fri Oct 5 18:46:58 2012 UTC (2 years, 3 months ago) by pcanal
File length: 14718 byte(s)
Diff to previous 45493
From Elvin:
I tracked down a small bug in TFPBlock.cxx::ReallocBlock, which did not correctly update the total size of a recycled block. This could only manifest itself when the block was re-read from the local cache.
The fix was a one-liner, but I also took the opportunity to change the type of fullBlockSize from Int_t to Long64_t, just to be on the safe side, and to fix a possible memory leak in TFPBlock. I added the patch to this email. I reran the tests and they all pass.
Revision 45493 - Modified Wed Aug 8 19:02:44 2012 UTC (2 years, 5 months ago) by pcanal
File length: 14691 byte(s)
Diff to previous 45487
From Elvin:
I made a small modification in TFilePrefetch::ThreadProc and changed
    pClass->fSemMasterWorker->TryWait() == 1
to
    pClass->fSemMasterWorker->TryWait() != 0
since the documentation on the web site says TryWait can return 1 or an errno; the case I am interested in is when the return value is different from 0.
Revision 44761 - Modified Wed Jun 27 13:41:06 2012 UTC (2 years, 6 months ago) by pcanal
File length: 14559 byte(s)
Diff to previous 43301
From Elvin (and Brian):
I looked over the optimisation suggestions that you sent me and I implemented four of the five. Below is the summary.
1) 60% of the time spent in TFileCacheRead::ReadBuffer is from TStorage::ReAllocChar. ReAllocChar spends 86% of its time in memcpy, 8% in alloc, and 6% in memset. It appears that, when a buffer is recycled, the contents of the old buffer (which are then overwritten) are copied over.
I modified the call to ReAllocChar not to copy the old contents. Unfortunately, in my testing, this wasn't enough - later on, ReAllocChar zeros out the contents of the array, which has basically the same overhead as copying. There is no version of TStorage::ReAlloc that satisfies the current requirements, so I'm using the classic realloc for the TFPBlock buffer.
2) There are a few function calls that could be inlined but aren't inlined by the compiler (GCC 4.6.2): in particular TFPBlock::GetLen, TFPBlock::GetBuffer, TFPBlock::GetNoElem, and TFPBlock::GetPos.
Done - I inlined them explicitly; this should do the trick.
3) TTreeCache and TFilePrefetch both keep a sorted list of the buffers that TFilePrefetch maintains. When TFileCacheRead::ReadBuffer is called, a binary search is performed on both. We could eliminate one of the binary searches and save 3%.
This would require some major changes and would also affect the normal reading pattern (i.e. reading without prefetching enabled). I suggest keeping it as it is for the time being so that we maintain compatibility with normal reading without prefetching.
4) TFilePrefetch::ReadBuffer calculates the offset into the block's buffer (ptrInt) on demand. You could probably win a few more percent here by pre-calculating these offsets for the TFPBlock.
Done - added a new vector of the pieces' relative offsets within the buffer (in TFPBlock).
5) The deadlock issue.
Done - I moved to a cleaner and simpler way to kill the thread, using cancellation. The deadlock was introduced in the last patch I sent you, while I was dealing with the TChain issue; the mutex locking was not related to the condition variable but to the synchronisation with TChain.
Brian: Thread cancellation scares the heck out of me - it's much harder to get correct than condition variables, and goes against most best practices. I'd much rather fix the usage of conditions and have explicit synchronization for killing the helpers.
Elvin subsequently reverted to classic condition variables and semaphores when killing the worker thread.
Revision 41698 - Modified Tue Nov 1 21:22:54 2011 UTC (3 years, 2 months ago) by pcanal
File length: 14263 byte(s)
Diff to previous 39724
From Elvin:
Last week Martin Vala from ALICE came to me with a problem that he had while using the asynchronous prefetching. There were basically two main problems:
1. Trying to read a ROOT file from an archive. When reading from an archive there is an offset into the file which was not taken into consideration when the file was saved in the cache, and this led to a miss when reading the file from the cache. I fixed it, but I had to expose the value of fArchiveOffset from TFile.
2. The second problem occurred when reading using a TChain. There were some synchronization issues concerning the asynchronous thread that actually does the reading. All this happened because in the case of TChain there is only one file cache, which is re-used as we move from one file to another. This was a pretty tricky issue.
I attached a patch made against the current trunk which fixes both these problems. I gave the patch first to Martin to test, and he was satisfied with it. There is a small delay when the TChain moves from one file to another because I have to wait for the async thread to finish its work, but overall Martin said that the performance is way better than before. When I initially did the async prefetching I had no idea about these two use cases, which is why they popped up now.
Revision 39724 - Modified Tue Jun 14 19:12:39 2011 UTC (3 years, 7 months ago) by pcanal
File length: 13563 byte(s)
Diff to previous 39674
From Elvin:
I added additional checks before starting the prefetching thread and also modified the way memory is deallocated when the destructor of the TFilePrefetch class is called. According to the documentation of TThread::Delete, when the object is allocated on the heap one should call delete directly, which is what I added in the new patch.
Revision 39673 - Modified Fri Jun 10 16:11:43 2011 UTC (3 years, 7 months ago) by pcanal
File length: 13576 byte(s)
Diff to previous 39458
From Elvin:
- completely removed the recycle list; blocks are now recycled directly from the read list (the oldest block in the list is recycled first)
- improved the prefetching strategy: if the user reads sparsely (only one entry from a block), the prefetching thread won't prefetch the following block, as it would never be used, but will instead prefetch the block corresponding to the newly requested entry
- so now, for example, if one wants to read only entries 0, 1000, 2000 and 3000, the program will prefetch only 4 blocks (compared to 32 as it did before)
- this also leads to smaller run times when reading sparsely
- by removing the recycle list, during any type of execution (sequential or sparse) I only use two TFPBlocks, considerably reducing the memory footprint (you can see how blocks are created and recycled by adding two prints in TFilePrefetch::CreateObject), and valgrind --tool=massif shows a maximum of 60 MB allocated for TFPBlock.
Revision 39458 - Modified Fri May 27 15:07:18 2011 UTC (3 years, 7 months ago) by pcanal
File length: 14142 byte(s)
Diff to previous 39344
From Elvin:
Disable the normal reading mode as a fall-back method of reading and use only the prefetching mechanism. When a request is not satisfied on the first try, we now continue to prefetch until the request falls within the blocks already read.
The problem seemed to appear only in the TWebFile plug-in, as it was using the same connection for sending requests regardless of the thread. From what I understood, in xrd things are different and this problem did not appear while reading with the TXNetFile plug-in.
Also change the type of the prefetching thread from detached to joinable, as there were some synchronization issues if the main thread finished reading before the worker thread finished prefetching blocks.
Revision 39275 - Added Thu May 19 18:17:37 2011 UTC (3 years, 8 months ago) by pcanal
File length: 14054 byte(s)
From Elvin Alin Sindrilaru:
The prefetching mechanism uses two new classes (TFilePrefetch.h and TFPBlock.h) to prefetch blocks of entries in advance. A second thread takes care of actually transferring the blocks and making them available to the main requesting thread, so the time the main thread spends waiting for data before processing decreases considerably. Besides the prefetching mechanism there is also a local caching option which can be enabled by the user. Both capabilities are disabled by default and must be explicitly enabled by the user.
To enable prefetching, the user must set the environment variable "TFile.AsyncPrefetching" as follows:
    gEnv->SetValue("TFile.AsyncPrefetching", 1);
Only when prefetching is enabled can the user set the local cache directory in which the transferred files are saved; subsequent reads of the same file will use the local copy from the cache. To set up a local cache directory, a client can use the following commands:
    TString cachedir = "file:/tmp/xcache/";
    // or using xrootd on port 2000
    // TString cachedir = "root://localhost:2000//tmp/xrdcache1/";
    gEnv->SetValue("Cache.Directory", cachedir.Data());
The "TFilePrefetch" class is responsible for actually reading and storing the requests received from the main thread. It also creates the worker thread which transfers all the information. Apart from managing the block requests, it also deals with caching the blocks on the local machine and retrieving them when necessary.
The "TFPBlock" class encapsulates a block request. It contains the chunks to be prefetched and also serves as a container for the information read.
To accommodate the new prefetching mechanism, the following files underwent considerable modifications: TFileCacheRead.cxx/.h, TTreeCache.cxx/.h and, to a lesser extent, TXNetFile.cxx and TFile.h. Basically, in TFileCacheRead we added the logic for dealing with the second buffer that is prefetched. In TTreeCache, during prefetching, the method FillBuffer is called after each read, so that once the main thread starts reading from the last available buffer, the second thread starts prefetching the next block.