The ROOT Object I/O System


ROOT I/O basic principles

Version 3.0 of Root, currently under development, includes several enhancements to the I/O subsystem.

The new system is compatible with the current one. To use it, one must specify an option "+" in the pragma statement of the LinkDef.h file. To explain this new facility, we will use a simple example illustrating how I/O is done with the current version and what can be gained by selecting the new option.

To start, we will use the standard Root example in $ROOTSYS/test/Event.h. The same example will be used to also explain the new system.

class Event : public TObject {

private:
   char           fType[20];
   Int_t          fNtrack;
   Int_t          fNseg;
   Int_t          fNvertex;
   UInt_t         fFlag;
   Float_t        fTemperature;
   EventHeader    fEvtHdr;
   TClonesArray  *fTracks;
   TH1F          *fH; 
   Int_t          fMeasures[10];
   Float_t        fMatrix[4][4];
   Float_t       *fClosestDistance;   //[fNvertex] 

   ClassDef(Event,1)  //Event structure
};

To write an instance of the class Event to a file, we can do:

   TFile f("demo.root","new");
   Event *event = new Event();
   event->Write("event0");

The statement "event->Write("event0")" writes the object to the file creating a key with name "event0". The Write function creates a TBuffer object and serializes all the members of the Event object to this buffer by invoking the Event::Streamer function. The Streamer function is generated automatically by the preprocessor rootcint.

  rootcint -f EventDict.cxx -c Event.h EventLinkDef.h
where EventLinkDef.h contains the following statements:
  #pragma link off all globals;
  #pragma link off all classes;
  #pragma link off all functions;
  #pragma link C++ class Event;

The file EventDict.cxx contains among other things the function Event::Streamer.

//______________________________________________________________________________
void Event::Streamer(TBuffer &R__b)
{
   // Stream an object of class Event.

   UInt_t R__s, R__c;
   if (R__b.IsReading()) {
      Version_t R__v = R__b.ReadVersion(&R__s, &R__c); if (R__v) { }
      TObject::Streamer(R__b);
      R__b.ReadStaticArray(fType);
      R__b >> fNtrack;
      R__b >> fNseg;
      R__b >> fNvertex;
      R__b >> fFlag;
      R__b >> fTemperature;
      fEvtHdr.Streamer(R__b);
      fTracks->Streamer(R__b);
      R__b >> fH;
      R__b.ReadStaticArray(fMeasures);
      R__b.ReadStaticArray((float*)fMatrix);
      delete []fClosestDistance; 
      fClosestDistance = new Float_t[fNvertex]; 
      R__b.ReadFastArray(fClosestDistance,fNvertex); 
      R__b.CheckByteCount(R__s, R__c, Event::IsA());
   } else {
      R__c = R__b.WriteVersion(Event::IsA(), kTRUE);
      TObject::Streamer(R__b);
      R__b.WriteArray(fType, 20);
      R__b << fNtrack;
      R__b << fNseg;
      R__b << fNvertex;
      R__b << fFlag;
      R__b << fTemperature;
      fEvtHdr.Streamer(R__b);
      fTracks->Streamer(R__b);
      R__b << (TObject*)fH;
      R__b.WriteArray(fMeasures, 10);
      R__b.WriteArray((float*)fMatrix, 16);
      R__b.WriteFastArray(fClosestDistance,fNvertex); 
      R__b.SetByteCount(R__c, kTRUE);
   }
}

The Streamer function above illustrates most of the I/O facilities in Root version 2.25.

In a later session, the Event object can be read back from the file with:

   TFile f("demo.root");
   Event *event = (Event*)f.Get("event0");

The TFile::Get function searches for the TKey key named "event0". Once the TKey object is found, the TKey::ReadObj function is called to deserialize the Event object from the TKey buffer. TKey::ReadObj creates an instance of the Event class by calling its default constructor, then the Event::Streamer function is called. A short description of this operation is:

The Read part of the Streamer function simply performs the inverse operation of write by reading from the buffer R__b and restoring the class members. The statement "R__b.CheckByteCount(R__s, R__c, Event::IsA())" checks that the read position in the buffer is correct by comparing its actual value with the position at the start of the Streamer plus the byte count. In case of a mismatch, the pointer is set to the start position plus the byte count.
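The byte count check described above can be sketched in a few lines of standalone C++. This is a toy model under stated assumptions: ToyReadBuffer and checkByteCount are hypothetical names, not part of Root, and the real TBuffer also handles version words and error reporting that are left out here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy read buffer (hypothetical, not Root's TBuffer): bytes plus a read cursor.
struct ToyReadBuffer {
    std::vector<std::uint8_t> bytes;
    std::size_t pos = 0;
};

// Returns true when the cursor sits exactly at start + byteCount.
// On a mismatch the cursor is forced to that position, so the next
// object in the buffer can still be read from the right place.
inline bool checkByteCount(ToyReadBuffer &buf, std::size_t start, std::size_t byteCount) {
    const std::size_t expected = start + byteCount;
    assert(expected <= buf.bytes.size());  // the recorded object must fit in the buffer
    if (buf.pos == expected) return true;
    buf.pos = expected;                    // resynchronize after a faulty Streamer
    return false;
}
```

A Streamer that reads too few or too many bytes therefore damages only its own object; the objects that follow are still read from the correct offset.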

Manual Class Schema Evolution

When a data member is added to or removed from a class, the Streamer function must be modified by hand. Root assumes that the user has increased the class version number in the ClassDef statement and introduced the relevant test in the read part of the Streamer. The Streamer function must be moved from the file generated by rootcint to the class implementation file. Because the Streamer function generated by rootcint is no longer adequate, one must instruct rootcint not to generate it. This is done by adding the character "-" at the end of the class name in the LinkDef file, e.g.:

  #pragma link C++ class Event-;
For example, if a new version of the Event class above includes a new member, e.g. "Int_t fNew;", the ClassDef statement should be changed to ClassDef(Event,2) and the following lines added to the read part of the Streamer:
   if (R__v > 1) {
      R__b >> fNew;
   } else {
      fNew = 0;  // set to some default value
   }

If, in the same version 2, the member fH is also removed, the following code must be added to read the histogram object into a temporary object and delete it:

   if (R__v < 2) {
      TH1F *dummy = 0;
      R__b >> dummy;
      delete dummy;
   }
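
The two version tests above can be exercised without Root in a small standalone model. The sketch below is illustrative only: ToyStream, ToyEvent, writeV1, writeV2 and readEvent are hypothetical names, and a plain int stands in for the removed histogram pointer fH.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a serialized record: a flat list of ints with a read cursor.
struct ToyStream {
    std::vector<int> words;
    std::size_t pos = 0;
    void put(int v) { words.push_back(v); }
    int get() { assert(pos < words.size()); return words[pos++]; }
};

// Current in-memory class (version 2): fH was removed and fNew was added.
struct ToyEvent {
    int fNtrack = 0;
    int fNew = 0;
};

// Writers for the old and the new layout; the first word is the class version.
inline void writeV1(ToyStream &s, int ntrack, int histo) { s.put(1); s.put(ntrack); s.put(histo); }
inline void writeV2(ToyStream &s, const ToyEvent &e) { s.put(2); s.put(e.fNtrack); s.put(e.fNew); }

// Hand-written reader that accepts both versions.
inline ToyEvent readEvent(ToyStream &s) {
    ToyEvent e;
    const int version = s.get();
    e.fNtrack = s.get();
    if (version < 2) {
        (void)s.get();   // removed member: read it and throw it away
        e.fNew = 0;      // new member: not in the old record, use a default
    } else {
        e.fNew = s.get();
    }
    return e;
}
```

Every member that is ever added or removed adds another branch of this kind to the reader, which is why hand-maintained Streamers become error-prone as a class evolves.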

Experience so far with manual schema evolution shows that errors are easy to make: as the number of classes grows, mismatches between Streamer writers and readers become frequent.

Automatic Class Schema Evolution

To reduce the number of problems inherent in manual intervention in the Streamer functions, the new version of Root includes a facility to automate this process. To select it, one must specify the "+" option in the LinkDef file, e.g.:

  #pragma link C++ class Event+;

The version id in the ClassDef statement must still be incremented by the user. In case the user forgets to increase this number, Root includes a class checksum algorithm that issues a warning message when one reads an object having the same class version as the current class but a different definition. Instead of generating the style of Streamer function shown above, a shorter Streamer is generated. For example, in the case of the class Event above, the following code is generated:

//______________________________________________________________________________
void Event::Streamer(TBuffer &R__b)
{
   // Stream an object of class Event.

   if (R__b.IsReading()) {
      Event::Class()->ReadBuffer(R__b, this);
   } else {
      Event::Class()->WriteBuffer(R__b, this);
   }
}

All the I/O is now managed automatically by TClass::WriteBuffer and TClass::ReadBuffer. In turn these two functions invoke the services of a new class TStreamerInfo that is the real I/O manager of the new version of Root.
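
The checksum safeguard mentioned earlier in this section can be illustrated with a toy model. The sketch below is an assumption-laden stand-in, not the algorithm Root uses: MemberDesc, layoutChecksum and forgotVersionBump are hypothetical names, and an FNV-1a hash over the member types and names is chosen only because it is short.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one persistent data member.
struct MemberDesc {
    std::string type;
    std::string name;
};

// FNV-1a hash over "type name;" for every member, in declaration order.
// Stand-in only: this is not the checksum algorithm implemented in Root.
inline std::uint32_t layoutChecksum(const std::vector<MemberDesc> &members) {
    std::uint32_t h = 2166136261u;                 // FNV offset basis
    auto mix = [&h](const std::string &s) {
        for (unsigned char c : s) { h ^= c; h *= 16777619u; }
    };
    for (const MemberDesc &m : members) {
        mix(m.type); mix(" "); mix(m.name); mix(";");
    }
    return h;
}

// True when the stored and the current layout claim the same class version
// but their checksums differ, i.e. the version number was not incremented.
inline bool forgotVersionBump(int storedVersion, std::uint32_t storedSum,
                              int currentVersion, std::uint32_t currentSum) {
    return storedVersion == currentVersion && storedSum != currentSum;
}
```

A reader would compute the checksum of the class compiled into the program, compare it with the one recorded when the object was written, and warn when forgotVersionBump returns true.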

With this new scheme, one can change at will the class definition, for example:

From Root version 3 onwards, it is possible:

How to use the new system

We will distinguish the following cases:

Streamers with special additions

For some classes, it may be necessary to execute some code before or after the read or write blocks. For example after the execution of the read block, one can initialize some non persistent members.
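
As a rough illustration of such a special addition, the standalone sketch below recomputes a non-persistent member right after the read block. It does not use Root: ToyTrackBuffer and ToyTrack are hypothetical types, and the members fPx, fPy and fPt are invented for the example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Toy buffer (hypothetical): a list of doubles, a cursor and a direction flag.
struct ToyTrackBuffer {
    std::vector<double> values;
    std::size_t pos = 0;
    bool reading = false;
};

struct ToyTrack {
    double fPx = 0.0;   // persistent
    double fPy = 0.0;   // persistent
    double fPt = 0.0;   // non-persistent: derived from fPx and fPy, never written

    void Streamer(ToyTrackBuffer &b) {
        if (b.reading) {
            // read block: restore the persistent members
            fPx = b.values.at(b.pos++);
            fPy = b.values.at(b.pos++);
            // special addition: initialize the non-persistent member
            fPt = std::hypot(fPx, fPy);
        } else {
            // write block: only the persistent members go to the buffer
            b.values.push_back(fPx);
            b.values.push_back(fPy);
        }
    }
};
```

The same pattern applies when the read and write blocks are delegated to an automatic mechanism: the extra code is placed before or after the delegated call.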

Support for more C++ constructs in the class definition

The new version supports all the C++ cases that were supported by the old version. In addition, a data member can be:

  • any STL container or pointer to an STL container (vector, list, deque, map, set, multimap, multiset).
  • more cases involving pointers to objects or pointers to pointers as illustrated in the evolution of Event.h shown below.
    class Event : public TObject {
    
    private:
       enum {kSize=10};
       char                    fType[20];        //array of 20 chars
       Int_t                   fNtrack;          //number of tracks
       Int_t                   fNseg;            //number of segments
       Int_t                   fNvertex;         //number of vertices
       Int_t                   fMeasures[kSize]; //an array where dimension is an enum
       UInt_t                  fFlag;            //bit pattern event flag
       Float_t                 fMatrix[4][4];    //a two-dim array
       Float_t                *fClosestDistance; //[fNvertex] pointer to an array of floats of length fNvertex 
       Float_t                 fTemperature;     //event temperature
       vector<int>             fVectorint;       //STL vector of ints
       vector<short>           fVectorshort;     //STL vector of shorts
       vector<double>          fVectorD[4];      //array of STL vectors of doubles
       vector<TObject>        *fVectorTobject;   //pointer to an STL vector
       vector<TNamed>         *fVectorTnamed[6]; //array of pointers to STL vectors
       deque<TAttLine>         fDeque;           //STL deque
       list<const TObject*>    fVectorTobjectp;  //STL list of pointers to objects
       map<TNamed*,int>        fMapTNamedp;      //STL map
       map<TAxis*,int>        *fMapTAxisp;       //pointer to STL map
       set<TAxis*>             fSetTAxis;        //STL set
       set<TAxis*>            *fSetTAxisp;       //pointer to STL set
       multimap<TNamed*,int>   fMultiMapTNamedp; //STL multimap
       multiset<TAxis*>       *fMultiSetTAxisp;  //pointer to STL multiset
       string                  fString;          //C++ standard string
       string                 *fStringp;         //pointer to standard C++ string
       TString                *fTstringp;        //[fNvertex] array of TString
       TString                 fNames[12];       //array of TString
       TAxis                   fXaxis;           //example of class derived from TObject
       TAxis                   fYaxis[3];        //array of objects
       TAxis                  *fVaxis[3];        //array of 3 pointers to TAxis
       TAxis                  *fPaxis;           //[fNvertex] pointer to an array of TAxis of length fNvertex
       TAxis                 **fQaxis;           //[fNvertex] pointer to an array of pointers to TAxis objects
       EventHeader             fEvtHdr;          //example of class not derived from TObject
       TClonesArray           *fTracks;          //-> array of tracks
       TH1F                   *fH;               //-> pointer to a histogram
    };
    

    The StreamerInfo saved in the Root file

    The StreamerInfo is a subset of the CINT class dictionary for persistent data members and base classes only.

    All classes that have at least one object written to a file have their StreamerInfo saved to the file. Each file contains one record (a TKey object called StreamerInfo) describing the classes in the file. This StreamerInfo record is automatically written when a file is closed. Conversely, when a file is connected, its StreamerInfo record is read into memory. One can list the StreamerInfo of the classes in a file with:

       TFile f("demo.root");
       f.ShowStreamerInfo();
    

    The TFile::ShowStreamerInfo function prints a list of the classes in the file and, for each class, a list of the data members with their description. When the StreamerInfo record is read into memory, the corresponding information is stored for each class in a linked list of TStreamerInfo objects. One can print the StreamerInfo for a class with, e.g.:

        gROOT->GetClass("Event")->GetStreamerInfo(classVersion)->ls();
    

    If the argument classVersion is not specified, the StreamerInfo corresponding to the current class is shown. StreamerInfo corresponding to different class versions may exist in different files. When reading an object from a file, the system uses the StreamerInfo of the file to decode the object in the automatic Streamer. In this way, the system can loop over files containing instances of a class written with different versions of the class.
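
The idea of decoding with the layout stored in the file, rather than with the layout of the current class, can be sketched as follows. This is a simplified standalone model, not the TStreamerInfo implementation: ToyLayout, CurrentEvent and decode are hypothetical names, all members are ints, and members are matched purely by name.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Layout stored with the data: member names in the order they were written.
using ToyLayout = std::vector<std::string>;

// Current in-memory class (hypothetical).
struct CurrentEvent {
    int fNtrack = 0;
    int fNvertex = 0;
    int fNew = 0;       // absent from older files
};

// Decode one record using the layout found in the file.
inline CurrentEvent decode(const ToyLayout &fileLayout, const std::vector<int> &record) {
    // Members of the current class, addressable by name.
    const std::map<std::string, int CurrentEvent::*> current = {
        {"fNtrack", &CurrentEvent::fNtrack},
        {"fNvertex", &CurrentEvent::fNvertex},
        {"fNew", &CurrentEvent::fNew},
    };
    CurrentEvent e;   // members missing from the file keep their defaults
    for (std::size_t i = 0; i < fileLayout.size() && i < record.size(); ++i) {
        const auto it = current.find(fileLayout[i]);
        if (it != current.end()) e.*(it->second) = record[i];
        // names unknown to the current class (removed members) are skipped
    }
    return e;
}
```

Because each file carries its own layout, the same decode call handles records written by different class versions, including records whose members appear in a different order.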

    The StreamerInfo record adds a small overhead, typically a few kilobytes per file. In the current implementation, the StreamerInfo object is visible with functions such as TFile::ls. It is foreseen to hide this object from the list of keys in the final implementation.

    Automatic code generation from the StreamerInfo in a file

    By using the two following statements:

       TFile f("demo.root");
       f.MakeProject("src");
    
    one can generate a subdirectory (here called src) containing a header file for each class in the file. By default, TFile::MakeProject creates a header file for each class unknown to the current executable module. It is also possible to automatically generate a shared lib and link it with the running executable with:
       f.MakeProject("src","*","new++");
    

    When using the "new++" option (one can also specify "recreate++" or "update++"), TFile::MakeProject generates a small script called MAKE and executes it. The MAKE script calls rootcint to generate the class dictionary for all classes in the src directory, compiles this file, creates a shared lib and links it. For each generated header file, the data member part of the original class is rebuilt, including the comment fields of the members. This makes it possible to read any old Root file even when the code of the original classes is no longer available. This facility is particularly useful for browsing or analyzing data in Root trees.
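
To give a feel for how a header can be rebuilt from stored member descriptions, here is a minimal standalone sketch. It is not the code used by TFile::MakeProject: MemberInfo and makeHeader are hypothetical names, and the generated text omits base classes, includes and the ClassDef statement.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one data member: type, name and comment field.
struct MemberInfo {
    std::string type;
    std::string name;
    std::string comment;
};

// Rebuild the data member part of a class declaration as header text.
inline std::string makeHeader(const std::string &className,
                              const std::vector<MemberInfo> &members) {
    std::ostringstream out;
    out << "class " << className << " {\npublic:\n";
    for (const MemberInfo &m : members) {
        out << "   " << m.type << " " << m.name << ";";
        if (!m.comment.empty()) out << " //" << m.comment;  // keep the comment field
        out << "\n";
    }
    out << "};\n";
    return out.str();
}
```

Keeping the comment fields matters because annotations such as //[fNvertex] carry I/O information (here, the length of a variable-size array) that a reader of the regenerated class still needs.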

    Currently TFile::MakeProject generates C++ code only. However, it would be trivial to extend this function to generate Java code instead.