Re: [ROOT] Reading ROOT trees: no read-ahead under linux-2.6.8?

From: Fons Rademakers (Fons.Rademakers@cern.ch)
Date: Wed Oct 06 2004 - 01:08:00 MEST


Hi Konstantin,

   I run 2.6.8.1 on my RH9 machine and have not seen any substantial 
drop in CPU load and performance when reading/writing trees (like in the 
ROOT stress test).

Are you sure your disk is in DMA mode (hdparm /dev/hdX)?

Any such drastic kernel change would surely have been discussed
extensively on the various kernel lists, and it would affect not only
ROOT but many other applications as well.

Cheers, Fons.




Victor Perevoztchikov wrote:
>> kernel think "oh, this process is just jumping around, no bother
>> with read-ahead".
> 
> I think that is the case. Read I/O is direct access. But it is
> probably possible to add a test: if the next read starts exactly
> where the previous one ended, which is the usual case, then do not
> call lseek.
> 
> Victor
> 
> Victor M. Perevoztchikov   perev@bnl.gov
> Brookhaven National Laboratory MS 510A PO Box 5000 Upton NY 11973-5000
> tel office: 631-344-7894; fax 631-344-4206
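
Victor's test can be sketched in a few lines. This is a minimal
illustration, not ROOT code: the class name OffsetTrackingReader, its
read_at() method and the seeks counter are hypothetical, and Python's
os module is used only because os.read() and os.lseek() map directly
onto the system calls seen in strace. Whether dropping the redundant
lseek() actually restores kernel read-ahead is the open question of
this thread; the sketch only shows how the call could be avoided.

```python
import os


class OffsetTrackingReader:
    """Hypothetical wrapper: remembers the current file offset so that a
    read continuing where the previous one ended issues no lseek()."""

    def __init__(self, fd):
        self.fd = fd
        # Ask the kernel once where the file position currently is.
        self.pos = os.lseek(fd, 0, os.SEEK_CUR)
        self.seeks = 0  # number of lseek() calls issued by read_at()

    def read_at(self, offset, size):
        if offset != self.pos:
            # A real jump: this is the only case that needs lseek().
            os.lseek(self.fd, offset, os.SEEK_SET)
            self.seeks += 1
        data = os.read(self.fd, size)
        # Track the position from what was actually read, so a short
        # read (for example at end of file) keeps the bookkeeping right.
        self.pos = offset + len(data)
        return data
```

With this wrapper, reading consecutive regions of a file produces a
plain stream of read() calls, and lseek() appears only when the
requested offset differs from the tracked position.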
> 
> ----- Original Message -----
> From: "Konstantin Olchanski" <olchansk@sam.triumf.ca>
> To: <roottalk@pcroot.cern.ch>
> Sent: Monday, October 04, 2004 4:29 PM
> Subject: [ROOT] Reading ROOT trees: no read-ahead under linux-2.6.8?
> 
> 
> 
>> Rooters, we observe suboptimal performance while reading ROOT
>> trees: the ROOT process is nominally I/O bound, but we see low CPU
>> utilization, about 30% (top, vmstat 1), low disk utilization, 40%
>> (iostat -x 1), and high wait times, 15% (vmstat 1). This is on
>> Fedora-2 with the Fedora stock 2.6.8 linux kernel.
>> 
>> The observed pattern is consistent with "wait for data from disk,
>> compute some, wait for more data from disk, compute some more,
>> etc...".
>> 
>> Disk-level and file-level read-ahead inside recent 2.6 linux
>> kernels is supposed to prevent the "wait for data" thing, but
>> apparently read-ahead is not happening at all. If we cause "manual"
>> read-ahead (say, by concurrently "dd"-ing the tree file to
>> /dev/null), disk utilization and CPU utilization go to 100%, as
>> they should, and the tree reading code runs about twice as fast.
>> 
>> I suspect that the pattern of system calls that ROOT uses to read
>> trees: read() followed by lseek() followed by read(), etc... (as
>> observed by strace) somehow defeats and disables the file-level 
>> read-ahead in the 2.6 linux kernels. Maybe the lseek() calls make
>> the kernel think "oh, this process is just jumping around, no
>> bother with read-ahead".
>> 
>> I can think of several ROOT-level solutions to this performance
>> problem: (apart from doctoring the Linux kernel)
>> 
>> 1) get rid of the lseek() calls (an "optimization": to skip 12
>> Kbytes, instead of lseek() do a read() into a dummy buffer; to skip
>> 12 Mbytes, do lseek()).
>>
>> 2) do more data buffering inside ROOT (to read 12 Kbytes, read 12
>> Mbytes into an internal buffer (ram is cheap) and return subsequent
>> data from this buffer).
>>
>> 3) maybe use xrootd? Does it do read-ahead data buffering?
>> 
>> Any thoughts?
>> 
>> -- 
>> Konstantin Olchanski
>> Data Acquisition Systems: The Bytes Must Flow!
>> Email: olchansk-at-triumf-dot-ca
>> Snail mail: 4004 Wesbrook Mall, TRIUMF, Vancouver, B.C., V6T 2A3, Canada
>> 
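
Options 1 and 2 from Konstantin's list can be sketched as follows.
This is an illustration under stated assumptions, not ROOT's actual
I/O layer: skip_forward(), WindowedReader, read_at(), disk_reads and
the 64 KiB threshold are all hypothetical names and values, and the
right threshold and window size would have to be measured. The window
is filled with os.pread(), a Unix-only call that reads at a given
offset without a separate lseek(); that choice is mine and is not
something proposed in the thread.

```python
import os

# Assumed cut-over point between "skip by reading" and "skip by lseek()".
SKIP_BY_READ_LIMIT = 64 * 1024


def skip_forward(fd, gap, limit=SKIP_BY_READ_LIMIT):
    """Option 1: advance the file position by `gap` bytes.

    Small gaps are consumed with read() so the kernel keeps seeing a
    sequential stream; large gaps fall back to lseek().
    Returns which method was used.
    """
    if gap > limit:
        os.lseek(fd, gap, os.SEEK_CUR)
        return "lseek"
    while gap > 0:
        chunk = os.read(fd, gap)  # the data is simply thrown away
        if not chunk:             # end of file reached
            break
        gap -= len(chunk)
    return "read"


class WindowedReader:
    """Option 2: serve small reads from one large in-memory window."""

    def __init__(self, fd, window=12 * 1024 * 1024):
        self.fd = fd
        self.window = window
        self.start = 0       # file offset of the first byte in buf
        self.buf = b""
        self.disk_reads = 0  # number of times the window was refilled

    def read_at(self, offset, size):
        end = offset + size
        if offset < self.start or end > self.start + len(self.buf):
            # Miss: refill the window starting at the requested offset.
            self.buf = os.pread(self.fd, max(size, self.window), offset)
            self.start = offset
            self.disk_reads += 1
        lo = offset - self.start
        return self.buf[lo:lo + size]
```

With a 12 Mbyte window, a run of 12 Kbyte requests that move forward
through the file costs one large read instead of many small
read()/lseek() pairs, at the price of memory and of reading bytes that
may never be used.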

-- 
Org:    CERN, European Laboratory for Particle Physics.
Mail:   1211 Geneve 23, Switzerland
E-Mail: Fons.Rademakers@cern.ch              Phone: +41 22 7679248
WWW:    http://www.rademakers.org/fons/      Fax:   +41 22 7679480


