You are here

How to Write a Tree?

The concept of Tree has been introduced at Input/Output with Trees. An example of Tree is given in the documentation of the class TTree. A more complete example is given in Event.

We will now use variants of the program Event to illustrate the advantages of Trees compared to the object serialization method described in How to Write Objects.

It is very important to note that alternating between the methods described below implies NO changes in the object model.

Using the serialization method and creating a key

The serialization method is perfectly appropriate when it does not make sense to access sub-objects components. For example, you write an histogram object and you read it as a histogram. It is unlikely that you want to read back only one part of an histogram.

This case does not use a Tree at all. Once the event is filled, it is serialized (via event->Write in one buffer and written on the file using a TKey (in this case the string "event" followed by the event number).

On the other hand, benchmarks corroborated by about an exabyte of HEP (and not only) data acquired in the past decade, have shown that the afore mentioned strategy cannot scale for storing entities like particles collision events. We therefore encourage users to approach the saving of their data according to a column-wise storage technique using Trees.

The Serialization method

main(int argc, char **argv)
{
   TFile hfile("Event.root","RECREATE");
 
   // Fill event, header and tracks with some random numbers
   Event  *event;
   char keyname[16];
   for (Int_t ev = 0; ev < 1000; ev++) {
 
      Float_t sigmat, sigmas;
      gRandom->Rannor(sigmat,sigmas);
      Int_t ntrack   = Int_t(600 +5*sigmat);
      Float_t random = gRandom->Rndm(1);
 
      event = new Event();    // Create one event
 
      event->SetHeader(ev, 200, 960312, random);
      event->SetNseg(Int_t(10*ntrack+20*sigmas));
      event->SetNvertex(1);
      event->SetFlag(UInt_t(random+0.5));
      event->SetTemperature(random+20.);
 
      //  Create and Fill the Track objects
      for (Int_t t = 0; t < ntrack; t++) event->AddTrack(random);
 
      //  Encode key name (see TKey) using the event serial number
      sprintf(keyname,"event%d",ev);
 
      //  Serialize event into one single buffer
      event->Write(keyname);
 
      delete event;
   }
   hfile.Close();
   return 0;
}

Granularity_0.gif

Making a Tree with one single branch

The program below is slightly modified compared to the program above. It creates a Tree with one single branch (variable split is set to 0).

When the program will terminate, only one key will be written to the file, corresponding to the Tree header. The tree->Fill statement generates several buffers (baskets) dumped to the file when they are full. The variable bsize specifies the initial basket size. If this size is not enough to accommodate at least one event, it will be resized accordingly.

This method offers only slight advantages compared to the first method.

  • It generates files slightly smaller than the first case.
  • It minimizes the number of keys.
  • Looping on the data set is a bit simpler (see next chapter)
  • One can use the TTree::Draw command to histogram a member function of the Event class.

A Tree with one single branch

main(int argc, char **argv)
{
   TFile hfile("Event.root","RECREATE");
 
   // Create a ROOT Tree
 
   TTree *tree = new TTree("T","An example of a ROOT tree");
 
   Int_t split = 0;
   Int_t bsize = 64000;
   Event  *event = 0;
 
    // Create one branch. If splitlevel is set, event is a superbranch
    // creating a sub branch for each data member of the Event object.
 
   tree->Branch("event", "Event", &event, bsize,split);
 
   // Fill event, header and tracks with some random numbers
   char keyname[16];
   for (Int_t ev = 0; ev < 1000; ev++) {
 
      Float_t sigmat, sigmas;
      gRandom->Rannor(sigmat,sigmas);
      Int_t ntrack   = Int_t(600 +5*sigmat);
      Float_t random = gRandom->Rndm(1);
 
      event = new Event();    // Create one event
 
      event->SetHeader(ev, 200, 960312, random);
      event->SetNseg(Int_t(10*ntrack+20*sigmas));
      event->SetNvertex(1);
      event->SetFlag(UInt_t(random+0.5));
      event->SetTemperature(random+20.);
 
      //  Create and Fill the Track objects
      for (Int_t t = 0; t < ntrack; t++) event->AddTrack(random);
 
      //  Serialize single branch of event into one single basket
      tree->Fill();
 
      delete event;
   }
   hfile.Close();
   return 0;
}

Making a Tree with many branches

If you run the program above and set split = 1, then the statement: tree->Branch("event", &event, bsize, split);

will generate several branches, one for each data member of the Event class. The following rules are applied:

  • If a data member is a basic type, it becomes one basic branch of class TBranch.
  • A data member can be an array of basic types (eg fTable[12]). In this case, one single branch is created for the array.

TBranch

  • If a data member is an object , the data members of this object are also split into branches (up to the limit set by the value of split, i.e. one level in this example). This is for example the case for the data member fEvtHdr. However objects of the classes TArrayX are not supported.
  • A data member is pointer to an object will not be split. This is the case in our example for the data member fH, a pointer to an histogram. The fH branch will be filled by calling the class Streamer function to serialize this object into the branch buffer.
  • A TString or std::string data member will not be split.
  • A data member that is an array of objects will not be split.
  • If a data member is a pointer to a TClonesArray object, its content will be split. In our example, this is the case for the data member fTracks.
  • Note that the split method can quickly generate many branches. Each branch has its own buffer in memory. In case of many branches (say more than 100), you should adapt the buffer size accordingly. A recommended buffer size is 32000 bytes if you have less than 50 branches. Around 16000 bytes if you have less than 100 branches and 4000 bytes if you have more than 500 branches. These numbers should be OK for existing computers with memory size ranging from 32MB to 256MB. If you have more memory, you should specify larger buffer sizes. However, in this case, do not forget that your file might be used on another machine with a smaller memory configuration.

This method provides the highest possible level of granularity. You continue obviously to have direct access to any single event. In addition, you can now read selectively a subset of all the branches in your analysis program. This is the subject of the next chapter.

This method requires more memory, because one buffer must be created for each branch. In case, you have memory problems, you should decrease the buffer size (parameter bsize in the above program.

Remarks about compression when running in split mode

If the file compression is activated (default), ROOT will compress the branch buffers depending on the level of file compression (set by TFile:SetCompressionLevel).

  • If level = 1, only branches containing integers will be compressed.
  • If level >= 2, all branches will be compressed with compression level-1.

Creating branches by hand

If you are not happy with the way the automatic split algorithm works, you can create branches yourself by invoking directly the TTree::Branch function.

Do not forget, if you do so, to set the addresses for every created branch in the filling loop, in case the addresses change.