(retry) PROOF and I/O

From: Doug Schouten <dschoute_at_sfu.ca>
Date: Wed, 7 Jul 2010 16:55:31 -0700


Hi,

I am writing some fairly complicated selectors (TPySelectors, actually) and I notice that, particularly when accessing data over NFS, the PROOF slaves quickly become I/O bound: I see many proofserv.exe processes sitting nearly idle. This also happens with data only on local disk (RAID-5, 7200 rpm Seagate Barracudas ... so I can't improve things too much there).

I have tried increasing the TTree cache using t.SetCacheSize(), and I have also slimmed the ROOT files considerably and turned off, with SetBranchStatus(), all the branches that I don't need at run time.
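For reference, here is roughly what my setup code looks like (the file, tree, and branch names are placeholders for my real ones):

    import ROOT

    f = ROOT.TFile.Open("dataset_0001.root")  # placeholder file name
    t = f.Get("MyTree")                       # placeholder tree name

    # enable the TTree cache (30 MB here) so branch reads are grouped
    # into larger, sorted requests instead of one small read per basket
    t.SetCacheSize(30 * 1024 * 1024)

    # read only the branches the selector actually uses,
    # and register those with the cache explicitly
    t.SetBranchStatus("*", 0)
    for name in ("el_pt", "el_eta", "met_et"):  # placeholder branch names
        t.SetBranchStatus(name, 1)
        t.AddBranchToCache(name, ROOT.kTRUE)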

However, I still see relatively poor performance in terms of CPU usage. I have 16-core machines (albeit with hyper-threading) and I would like to utilize them better.

So my question is two-fold:

(1) Are there some methods/tips/tricks to improve performance? Are there caching parameters I can set somewhere to prefetch files/trees in larger chunks? Currently I am processing my datasets at ~2.5 MB/s, as reported by the PROOF GUI, which is pretty slow IMHO. However, I think this is actually the rate at which data is analyzed, not the rate at which I am reading through the files, and I guess those are two very different things for large trees with many branches that I am not using. Am I right about this?
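(For what it's worth, the only knobs I have found so far are the tree-cache parameters one can pass to the PROOF session. The parameter names below are what I dug out of the PROOF sources, so treat them as my assumption and correct me if they are wrong:)

    import ROOT

    proof = ROOT.TProof.Open("master.mydomain.ca")  # placeholder master URL

    # tell the workers to use a TTreeCache, and enlarge it (100 MB here);
    # I am not 100% sure which integer overload of SetParameter the
    # PROOF event iterator expects for the cache size
    proof.SetParameter("PROOF_UseTreeCache", 1)
    proof.SetParameter("PROOF_CacheSize", 100000000)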

(2) Anticipating that there are no easy solutions in (1): has anyone heard of memcached? It is a distributed memory cache that one can use to pool extra RAM from multiple machines. One can then use a FUSE filesystem, memcachefs, to store files in the pooled memory. I am wondering how I could interface this with the TDSet infrastructure in PROOF. In particular, I imagine a FIFO buffer manager, running in a separate thread/process somewhere on my cluster, that prefetches files in a TDSet and kicks out already-processed ones. Somehow I would have to trick PROOF into not verifying the files before running the workers (because they would only 'arrive' in the cache just before they are needed), and I would need some way of communicating to the cache manager where I am in the TDSet list of files, so that it can grab the next N files and place them in the cache. Then, if the memory cache is large enough, or if I can copy files into it roughly as fast as I process them, I can hopefully lessen the I/O constraints, since reading from this cache would be constrained only by network latency and some (apparently) very small CPU overhead in memcached.
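To make this more concrete, here is the kind of prefetcher I have in mind, as a very rough sketch (the memcachefs mount point, the window size, and the "finished file" notification mechanism are all invented for illustration; the missing piece is still how to get progress information out of the running PROOF query):

    import os, shutil, threading, Queue  # Python 2

    CACHE_MNT = "/mnt/memcachefs"  # hypothetical memcachefs mount point
    WINDOW = 4                     # keep at most N files staged ahead

    class Prefetcher(threading.Thread):
        """Stage upcoming TDSet files into pooled memory, FIFO,
        and evict files the workers have finished with."""

        def __init__(self, files):
            threading.Thread.__init__(self)
            self.todo = list(files)    # TDSet file list, in processing order
            self.done = Queue.Queue()  # fed by whatever tracks query progress
            self.staged = []

        def run(self):
            while self.todo or self.staged:
                # top up the cache window with the next files
                while self.todo and len(self.staged) < WINDOW:
                    src = self.todo.pop(0)
                    dst = os.path.join(CACHE_MNT, os.path.basename(src))
                    shutil.copy(src, dst)  # pull file into pooled RAM
                    self.staged.append(dst)
                # block until a file is reported processed, then evict it
                finished = self.done.get()
                dst = os.path.join(CACHE_MNT, os.path.basename(finished))
                if dst in self.staged:
                    os.remove(dst)
                    self.staged.remove(dst)

    # usage: p = Prefetcher(list_of_file_paths); p.start()
    # and, from whatever tracks query progress: p.done.put(finished_path)

Copying like this obviously doubles the raw I/O, so it only pays off if the copy into memcachefs can overlap with processing of the already-staged files.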

(Note: there is also a C++ API for memcached that can deal with arbitrary chunks of data, not just whole files, but I imagine that would be even more low-level and complicated.)

thanks,
Doug