Re: [ROOT] Comparison of speed of accessing data

From: Rene Brun (Rene.Brun@cern.ch)
Date: Thu Nov 21 2002 - 22:06:13 MET


Hi HP,

Could you tell me the total time needed to process only the following steps?
   for each stock
      ** tree = (TTree*) root_file.Get(stock),
      activate the Date branch,
      root_file.Delete(stock)
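
One way to time this reduced loop on its own is a small wall-clock helper.
This is a minimal sketch in plain C++: secondsToRun is a hypothetical name,
not part of ROOT, and the callable passed to it is assumed to hold the
Get / activate / Delete steps listed above. ROOT's own TStopwatch class
could be used for the same purpose.

```cpp
#include <chrono>

// Hypothetical helper (not a ROOT function): run `work` once and
// return the elapsed wall-clock time in seconds.
template <typename Work>
double secondsToRun(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();  // e.g. the loop over all stocks with only Get/activate/Delete
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Timing the reduced loop and the full loop with the same helper shows how
much of the total is spent just opening and deleting the trees.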

I suspect that you spend a lot of time in your first pass
selecting the entries from your Date branch.
Do you know about TTree::BuildIndex? It could speed up
your search considerably. You just have to build an index
on the date, or on date and time, once and save the Tree header again.
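
To illustrate why an index helps, here is a minimal stand-alone sketch in
plain C++. It is not ROOT code and does not claim to reproduce how
TTree::BuildIndex is implemented: DateIndex, buildDateIndex and
entriesForDate are hypothetical names, and dates are assumed to be stored
as integers such as 20021121. The point is only that, once the entry
numbers are sorted by date, finding the entries for one date is a binary
search instead of a scan over the whole branch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for an index on a Date column.
struct DateIndex {
    std::vector<int> sortedDates;           // dates in ascending order
    std::vector<std::size_t> entryNumbers;  // entryNumbers[k] holds sortedDates[k]
};

// Build the index once per tree; cost is O(n log n).
DateIndex buildDateIndex(const std::vector<int>& dates) {
    std::vector<std::size_t> order(dates.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    // stable_sort keeps entries with equal dates in their original order.
    std::stable_sort(order.begin(), order.end(),
                     [&dates](std::size_t a, std::size_t b) {
                         return dates[a] < dates[b];
                     });
    DateIndex index;
    index.sortedDates.reserve(order.size());
    for (std::size_t entry : order) {
        index.sortedDates.push_back(dates[entry]);
    }
    index.entryNumbers = std::move(order);
    return index;
}

// Return all entry numbers for `date`; cost is O(log n + matches).
std::vector<std::size_t> entriesForDate(const DateIndex& index, int date) {
    assert(index.sortedDates.size() == index.entryNumbers.size());
    const auto range = std::equal_range(index.sortedDates.begin(),
                                        index.sortedDates.end(), date);
    const auto first = index.entryNumbers.begin() +
                       (range.first - index.sortedDates.begin());
    const auto last = index.entryNumbers.begin() +
                      (range.second - index.sortedDates.begin());
    return std::vector<std::size_t>(first, last);
}
```

With this layout the expensive sort happens once, when the index is built,
and every later date lookup touches only the matching entries.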

Also try the above exercise after creating the file with no compression.

Let me know

Rene Brun

On Thu, 21 Nov 2002, HP Wei wrote:

> I am doing a comparison of speed of accessing data
> in a root file and in a custom database file in our company.
> 
> The data file contains intraday price data for 9700 stocks.
> The root file is organized as follows.
> Each stock has one tree named by the stock's ticker.
> The tree has a few branches for (Date, time, price, size ...)
> This root file is for one month only.
> i.e. each tree contains one month's worth of data for one stock.
> 
> The test is a loop which goes through all 9700 stocks
> and extracts (Date, time, price, size) for a given date in a month.
> Here is the pseudo-code:
> ------------------------------------------------------------
>   for each stock
>      ** tree = (TTree*) root_file.Get(stock),
>      activate the Date branch,
>      find the indexes for the desired date
>      set up containers for the requested data fields, (using STL's vector)
>      activate branches (date, time, price, size),
>      for all the targeted indexes
>         tree->GetEntry(i),
>         put the data into the container.
>      root_file.Delete(stock)
> -------------------------------------------------------------
>      
> The result:
>    It takes 76.6 seconds to finish the above loop.
>    In comparison,
>    with our internal database format and doing the same task,
>    the time is 23.2 seconds.
>    
> The mark ** in the above loop may be the bottleneck.
> So, I tried to use:
>    tree = (TTree *) treeMap[stock]->ReadObj();
> where treeMap is a STL map<string, TKey*> which maps the stock ticker
> to its corresponding TKey structure in memory.
> This improves the time a little: 68.7 seconds, which is still
> a factor of three slower than accessing our custom databases.
> 
> ----------------------------------------------------------
> So, the question is:
> Are Get() and ReadObj() very expensive?
> 
> Could anyone suggest some other ways to organize and to access
> the data in ROOT?
> 
> Thanks,
> HP
> 


