Re: [ROOT] New data member & trees

From: Rene Brun (Rene.Brun@cern.ch)
Date: Tue Aug 08 2000 - 15:15:43 MEST


Hi Ingo,
In the attachment, you will find a text version of a section on I/O of
the Root User's Guide currently in preparation. It explains the
principle of Streamers.

Rene Brun

Ingo Froehlich wrote:
> 
> Hello,
> 
> we added some data members in a class, but now we are not able to read the
> old root trees. The only hint I found in the digests is to add some lines
> in the streamer function like:
> if (R__v > 1) { R__b >> newdata;}
> 
> But I think the streamer function is created automatically, so I have to
> add it after the creation of EventDict.cpp, each? Or is it possible to do
> this version management also automatically?
> 
> Thanks for your help, Ingo
> 
> --
> Ingo Froehlich, II. Physikalisches Institut, Universitaet Giessen
> Heinrich-Buff-Ring 16, D-35392 Giessen           |  Tel.: 0641 - 99 33250
> EMail: Ingo.Froehlich@exp2.physik.uni-giessen.de |  Fax : 0641 - 99 33209
> Sekretariat II. Physik: 0641 - 99 33261          |                    :-)

Streamers 

To follow the discussion on streamers, you need to know what a simple data type 
is. A variable is of a simple data type if it cannot be decomposed into other 
types. Examples of simple data types are longs, shorts, floats, and chars. 
In contrast, a variable is of a composite data type if it can be decomposed. 
For example, classes, structs, and arrays are composite types. Simple types 
are also called primitive types or basic types, and CINT sometimes calls them 
fundamental types.

When we say, "writing an object to a file", we actually mean writing the 
current values of the data members. The most common way to do this is to 
decompose the object into its data members and write them to disk. 
The decomposition is the job of the streamer. Every class with ambitions to 
be stored in a file has a streamer that decomposes it and "streams" its 
members into a buffer. To decompose the parent classes, the streamer calls 
the streamers of the parent classes. Each of them moves up the inheritance 
tree until it reaches an ancestor without a parent.

To decompose the object data members, the streamer calls their streamers. 
They in turn move up their own inheritance trees, and so forth.

The simple data members are written to the buffer directly. Eventually the 
buffer contains all simple data members of all the classes that make up this 
particular object. 

Let's look at an example. 

An Example

The Event class is defined in $ROOTSYS/test/Event.h. If you have a chance, 
open the file now and follow along. Looking at the class definition, we find 
that it inherits from TObject. Event has four integer data members, one 
float, an EventHeader object, a pointer to an array of track objects, and a 
pointer to a histogram object. Note that fEvtHdr is an object, whereas 
fTracks and fH are pointers to objects. 

class Event : public TObject {
    private:     
 Int_t          fNtrack;     
 Int_t          fNseg;     
 Int_t          fNvertex;     
 UInt_t         fFlag;
 Float_t        fTemperature;     
 EventHeader    fEvtHdr;       
 TClonesArray  *fTracks;       
 TH1F          *fH;  ... 


The implementation of Event is in $ROOTSYS/test/Event.cxx. Open Event.cxx and 
locate the Event::Streamer method. The Streamer method takes a reference to a 
TBuffer as a parameter, and first checks whether the buffer is being read or 
written. Let's look at the case of writing the buffer. 

void Event::Streamer(TBuffer &R__b)
{
  UInt_t R__c;                 // offset reserved for the byte count
  if (R__b.IsReading()) {
    ... // reading part
  } else { // writing part
    R__c = R__b.WriteVersion(Event::IsA(), 1);
    TObject::Streamer(R__b);
    R__b << fNtrack;
    R__b << fNseg;
    R__b << fNvertex;
    R__b << fFlag;
    R__b << fTemperature;
    fEvtHdr.Streamer(R__b);
    fTracks->Streamer(R__b);
    fH->Streamer(R__b);
    R__b.SetByteCount(R__c, 1);
  }
}

First we see a call to TBuffer::WriteVersion. Class versioning is important 
and  deserves a detailed discussion. It is covered a little later in the 
Schema Evolution section. For now let's see how the object is decomposed.  

A call to TObject::Streamer is made because TObject is the parent of Event. 
If Event were to inherit from multiple parents, its streamer would call 
each parent's Streamer method here. 

 

TObject::Streamer(R__b); 

Then each data member is added to the buffer. Simple data members are added 
directly.  

R__b << fNtrack;  
R__b << fNseg;  
R__b << fNvertex;  
R__b << fFlag;  
R__b << fTemperature; 

Object data members are added by calling their Streamer. The Event class 
has three object data members: an event header (fEvtHdr), an array of 
tracks (fTracks), and a histogram (fH). 

  fEvtHdr.Streamer(R__b);  
  fTracks->Streamer(R__b);  
  fH->Streamer(R__b); 

  Note the difference in the syntax of the method call with an object versus 
making the call  with a pointer to an object. The EventHeader fEvtHdr is an 
object and its streamer is  called with the "." operator:  

fEvtHdr.Streamer(R__b); 

fTracks and fH are pointers to objects. Their streamers are called with 
the "->" operator, as in: 

  fTracks->Streamer(R__b);  
  fH->Streamer(R__b); 

  The recursive nature of the streamers builds a buffer with only simple 
variables. The buffer will contain: 

  * Data members of all inherited classes 
  * Data members of the class itself 
  * Data members of its object data members 
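
The recursion can be tried out without ROOT. The following is only a minimal 
sketch: ToyBuffer, ToyBase, ToyHeader, ToyEvent and StreamToyEvent are 
hypothetical stand-ins invented for illustration, not ROOT classes, and the 
buffer holds plain integers instead of bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for TBuffer: a flat list of simple values.
struct ToyBuffer {
    std::vector<std::int32_t> values;
    ToyBuffer &operator<<(std::int32_t v) { values.push_back(v); return *this; }
};

// Plays the role of a parent class such as TObject.
struct ToyBase {
    std::int32_t fBits = 7;
    void Streamer(ToyBuffer &b) const { b << fBits; }
};

// Plays the role of an object data member such as EventHeader.
struct ToyHeader {
    std::int32_t fEvtNum = 42;
    std::int32_t fRun = 3;
    void Streamer(ToyBuffer &b) const { b << fEvtNum << fRun; }
};

struct ToyEvent : ToyBase {
    std::int32_t fNtrack = 2;
    ToyHeader fEvtHdr;
    void Streamer(ToyBuffer &b) const {
        ToyBase::Streamer(b);   // 1. members of the inherited class
        b << fNtrack;           // 2. simple members of the class itself
        fEvtHdr.Streamer(b);    // 3. members of the object data member
    }
};

// Helper: stream a default ToyEvent and return the flat buffer contents.
std::vector<std::int32_t> StreamToyEvent() {
    ToyBuffer b;
    ToyEvent e;
    e.Streamer(b);
    assert(b.values.size() == 4);  // only simple values end up in the buffer
    return b.values;
}
```

StreamToyEvent returns the values in the order base class, own members, 
member object, which mirrors the three bullet points above.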

Byte Count

The last line in the streamer writes the byte count. 

R__b.SetByteCount(R__c, 1);       

When ROOT reads an object and cannot find its streamer, it has no way of 
interpreting the bytes and reassembling the object. In this case, ROOT skips 
the object and reads the next one. ROOT can do this because it reads the 
byte count at the beginning of each object and skips ahead by that number of 
bytes. This allows for graceful recovery when reading undefined objects. 
It guarantees that even if an object is not readable, subsequent objects 
will still be read. 

The byte count is also used to check that the number of bytes read matches 
the number of bytes expected.

 

Let's look at how the byte count is managed in the streamer.

void Event::Streamer(TBuffer &R__b)
{    // Stream an object of class Event.
  UInt_t R__s, R__c;
  if (R__b.IsReading()) {
     Version_t R__v = R__b.ReadVersion(&R__s, &R__c);
     if (R__v) {}
     ... < stream in all data members >
     R__b.CheckByteCount(R__s, R__c, Event::IsA());
  } else {
     R__c = R__b.WriteVersion(Event::IsA(), kTRUE);
     ... < stream out all data members >
     R__b.SetByteCount(R__c, kTRUE);
  }
}

In the writing part, WriteVersion returns the offset where the byte count 
should be placed. The variable R__c now contains the location just before 
the version number. 

R__c = R__b.WriteVersion(Event::IsA(), kTRUE); 

SetByteCount writes the byte count at the reserved location.  

R__b.SetByteCount(R__c, kTRUE); 

Now, let's look at the reading part. ReadVersion returns the version number 
and fills in its two arguments. R__s receives the current position in the 
input buffer, which is the beginning of the object description. R__c receives 
the number of bytes we expect the object to have. 

Version_t R__v = R__b.ReadVersion(&R__s, &R__c); 

After reading all data members, CheckByteCount is called to check whether the 
current position in the buffer matches the expected position, which is the 
starting position plus the byte count. 

R__b.CheckByteCount(R__s, R__c, Event::IsA());   

If there is no match, an error is printed and the input buffer is positioned 
according to the byte count. This allows the system to correctly read the next 
object in the stream.
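
The skipping mechanism can be illustrated with a toy record format. This is 
a sketch under stated assumptions, not ROOT's actual on-disk layout: the 
one-byte count, the one-byte type id, and the names WriteRecord, 
SumKnownRecords and DemoSkip are all invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record layout: [byte count][type id][payload ...].
using Bytes = std::vector<std::uint8_t>;

// Append one record; the byte count covers the type id and the payload.
void WriteRecord(Bytes &buf, std::uint8_t typeId, const Bytes &payload) {
    assert(payload.size() < 255);   // toy limit: the count fits in one byte
    buf.push_back(static_cast<std::uint8_t>(payload.size() + 1));
    buf.push_back(typeId);
    buf.insert(buf.end(), payload.begin(), payload.end());
}

// Sum the payload bytes of all records with the known type id 1 and
// skip every other record by means of its byte count.
int SumKnownRecords(const Bytes &buf, int *skipped) {
    int sum = 0;
    *skipped = 0;
    std::size_t pos = 0;
    while (pos < buf.size()) {
        std::size_t count = buf[pos];        // bytes following the count
        std::size_t next = pos + 1 + count;  // start of the next record
        assert(count >= 1 && next <= buf.size());
        if (buf[pos + 1] == 1) {
            for (std::size_t i = pos + 2; i < next; ++i) sum += buf[i];
        } else {
            ++*skipped;                      // no "streamer" for this type
        }
        pos = next;                          // reposition via the byte count
    }
    return sum;
}

// Helper: build a buffer with an unknown record in the middle and read it.
int DemoSkip(int *skipped) {
    Bytes buf;
    WriteRecord(buf, 1, {10, 20});
    WriteRecord(buf, 9, {1, 2, 3, 4});       // unknown type, gets skipped
    WriteRecord(buf, 1, {5});
    return SumKnownRecords(buf, skipped);
}
```

The record after the unknown one is still read correctly, because the reader 
jumps by the stored count instead of trying to interpret the unknown bytes.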

 

The byte count version of the streamer can read files generated by the 
streamer without a byte count. In addition, a standard streamer can read 
files produced with a byte count streamer. 

Writing Objects

The Streamer decomposes the objects into data members and writes them to a 
buffer. It does not write the buffer to a file; it simply populates a buffer 
with bytes representing the object. This allows us to write the buffer to a 
file or do anything else we could do with the buffer. For example, we can 
write it to a socket to send it over the network. This is beyond the scope 
of this chapter, but it is worthwhile to emphasize the need for and 
advantage of separating the creation of the buffer from its use. 

Let's look at how a buffer is written to a file. A class needs to inherit 
from TObject to be saved to disk because it needs the TObject::Write method 
to write itself to the file. The TObject::Write method does the following:

1.  Creates a TKey object in the current directory  
2.  Creates a TBuffer object which is part of the newly created TKey  
3.  Fills the TBuffer with a call to the class::Streamer method  
4.  Creates a second buffer for compression, if needed  
5.  Reserves space by scanning the TFree list.  At this point, the size of the 
    buffer is known.  
6.  Writes the buffer to the file  
7.  Releases the TBuffer part of the key and returns a 60-byte key as a 
   reference to what was written to disk. 

In other words, TObject::Write calls the Streamer method of the class to 
build the buffer. The buffer is in the key and the key is written to disk. 
Once it is written to disk, the memory consumed by the buffer part is 
released. The key part of the TKey is kept and returned. The key consumes 
only 60 bytes, whereas the buffer, since it contains the object data, can be 
very large.
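
The separation between filling a buffer and using it can be sketched in a 
few lines. The names ToyKey, ToyPoint, WriteToy and DemoWrite are 
hypothetical, the "key" holds only an offset and a size, and the compression 
and free-list steps are left out, so this mirrors only the order of the 
steps above and not the real TKey.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical, much simplified stand-in for a TKey: where the data went.
struct ToyKey {
    std::size_t offset;  // position of the object data in the output
    std::size_t nbytes;  // size of the object data
};

struct ToyPoint {
    int x, y;
    // Fill the buffer only; the streamer knows nothing about files.
    void Streamer(std::string &buffer) const {
        buffer += std::to_string(x) + "," + std::to_string(y) + ";";
    }
};

// Build the buffer, write it out, release it, and return only the key.
ToyKey WriteToy(const ToyPoint &p, std::ostream &out, std::size_t &end) {
    std::string buffer;               // buffer filled by the streamer
    p.Streamer(buffer);
    ToyKey key{end, buffer.size()};   // size is known now, claim the space
    out << buffer;                    // write the buffer
    assert(out.good());
    end += buffer.size();
    return key;                       // buffer goes out of scope here
}

// Helper: write two objects to an in-memory stream; a file stream or a
// stream wrapping a socket could be passed to WriteToy in the same way.
std::string DemoWrite(ToyKey &k1, ToyKey &k2) {
    std::ostringstream out;
    std::size_t end = 0;
    k1 = WriteToy({1, 2}, out, end);
    k2 = WriteToy({30, 40}, out, end);
    return out.str();
}
```

After each call only the small key survives, while the possibly large buffer 
has already been released.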

 

Generated Streamers by rootcint

A streamer usually calls other streamers: the streamers of its parents and 
data members. This architecture depends on all objects having streamers, 
because eventually they will be called. To ensure that a class has a 
streamer, the ClassDef macro, which is defined in 
$ROOTSYS/include/Rtypes.h, declares one, and rootcint automatically 
generates its implementation. Rootcint generates several methods for any 
class, and one of them is the streamer. The automatically generated streamer 
is complete and can be used as long as no customization is needed. 

 

In our example, the Event class has a custom streamer that we just looked at. 
The EventHeader, Track, and HistogramManager classes (also defined in 
Event.h) have rootcint-generated streamers. They are in the file 
$ROOTSYS/test/EventDict.cxx. Below is the automatically generated 
EventHeader::Streamer from EventDict.cxx: 

void EventHeader::Streamer(TBuffer &R__b)  
{
     // Stream an object of class EventHeader.       
  if (R__b.IsReading()) { 
      Version_t R__v = R__b.ReadVersion(); 
      if (R__v) { }       
      R__b >> fEvtNum;        
      R__b >> fRun;        
      R__b >> fDate;    
  } else {        
      R__b.WriteVersion(EventHeader::IsA());        
      R__b << fEvtNum;        
      R__b << fRun;        
      R__b << fDate;    
  }  
} 

 The EventHeader class has only simple data members, but if we add a histogram, 
an  object data member, it still works. The resulting streamer looks like this.

void EventHeader::Streamer(TBuffer &R__b)  
{     // Stream an object of class EventHeader.     
   if (R__b.IsReading()) { 
       ...        
      R__b >> fH;     
   } else {      
      ...        
      R__b << (TObject*)fH;     
   }  
} 

At first it looks like the pointer is streamed out, but that is not so. 
The ">>" and "<<" operators are overloaded to call ReadObject and 
WriteObject respectively. 

OK, now we know we have a choice: let rootcint make a streamer for us or 
write our own. How do we let rootcint know when to generate one and when 
not? 

The input to the rootcint command (in the makefile) is a list of classes in a 
LinkDef.h file. For example, the classes for the Event example are listed in 
$ROOTSYS/test/EventLinkDef.h. A "-" at the end of the class name tells 
rootcint not to generate a streamer. In the example, you can see that the 
Event class is the only one for which rootcint is instructed not to generate 
a streamer. 

#ifdef __CINT__    
#pragma link off all globals;  
#pragma link off all classes;  
#pragma link off all functions;    
#pragma link C++ class EventHeader;  
#pragma link C++ class Event-;  
#pragma link C++ class HistogramManager;  
#pragma link C++ class Track;    
#endif 

To tell rootcint to add the byte count check when generating a streamer, 
you need to add a "+" after the name of the class in the LinkDef.h file. 
For example to add the byte count  check to the EventHeader streamer, 
add a "+" to the EventHeader entry.  

#pragma link C++ class EventHeader+; 

Streamers and Arrays

When the streamer comes across a data member that is a pointer to a simple 
type, it assumes it is an array. Somehow, rootcint has to find how many 
elements are in the array to reserve enough space in the buffer, and write 
out the appropriate number of elements. This is done in the definition of 
the class. For example, if we wanted to add an array of floats to the Event 
class we would add the following lines in the Event class definition.

class Event : public TObject {  
   private:     
 Int_t   fNtrack;    ... 
 Int_t   fN;    
 Float_t   *fNArray;      //[fN]    ... 

The array fNArray is defined as a pointer to floats (Float_t is ROOT's type 
for a float). It is followed by a comment mark (//) and the length of the 
array in square brackets. In general the syntax is: 

  <simple type>   *<name>  //[<length>] 

The length needs to be an integer, and it needs to be a data member defined 
in the class ahead of its use. It can also be defined in a base class. The 
length can be an expression, as long as the result is an integer. 

Now we know how to write our own streamers, but why would you have to do 
that? The answer to that question is the subject of the next section on 
Schema Evolution. 
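
Before moving on, the reason the array length must be known ahead of the 
array can be seen in a small sketch. ToyArrayHolder and RoundTripArray are 
hypothetical names, and a text stream stands in for the buffer; the code is 
not what rootcint generates, it only shows the same ordering constraint.

```cpp
#include <cassert>
#include <memory>
#include <sstream>

// Hypothetical class with a variable length array, mirroring
//   Int_t    fN;
//   Float_t *fNArray;  //[fN]
struct ToyArrayHolder {
    int fN = 0;
    std::unique_ptr<float[]> fNArray;

    void Write(std::ostream &out) const {
        out << fN << ' ';                      // the length goes first
        for (int i = 0; i < fN; ++i) out << fNArray[i] << ' ';
    }

    void Read(std::istream &in) {
        in >> fN;                              // the length is read first,
        assert(in && fN >= 0);
        fNArray.reset(new float[fN]);          // so space can be reserved
        for (int i = 0; i < fN; ++i) in >> fNArray[i];
    }
};

// Helper: round trip three floats through a text buffer.
ToyArrayHolder RoundTripArray() {
    ToyArrayHolder a;
    a.fN = 3;
    a.fNArray.reset(new float[3]{1.5f, 2.5f, 4.0f});
    std::stringstream buf;
    a.Write(buf);
    ToyArrayHolder b;
    b.Read(buf);
    return b;
}
```

If the length were stored after the elements, the reader would not know how 
much space to allocate or how many values to consume.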

Schema Evolution

Schema evolution means we can use multiple versions of the same class. 
Somewhere in the software is a black box that takes care of mapping one 
version to another. 

In the lifetime of a collaboration, the definition of a class is likely to 
change frequently. Not only can the class itself change, but any of its 
parent classes can also change. This makes support for schema evolution 
necessary. To provide it, ROOT uses class versions. 

When a class is defined in ROOT, it must include the ClassDef macro. 
The ClassDef macro needs to be the last line of the class definition in the 
header file. The syntax of that call is: 

ClassDef(<ClassName>,<VersionNumber>)   

The version number is what identifies this particular version of the class. 
The version  number is written to the file in the streamer by the call 
TBuffer::WriteVersion. You,  as the designer of the class, need to customize 
the streamer to write the appropriate data members for each version.

 

As an example, let's say our Event class has changed. It needs a new integer. 
We add the data member and bump the version number in the ClassDef to two: 

  class Event : public TObject {  
   private:  
 Int_t   fNewInt;  
 Int_t   fNtrack;  
 ...   
 ClassDef(Event,2) 

To correctly read and write this second version of Event, we need to change 
the streamer to read fNewInt for all versions greater than version 1, but 
not expect it for version 1. The streamer now looks like this: 

void Event::Streamer(TBuffer &R__b)
{
  if (R__b.IsReading()) {
     Version_t R__v = R__b.ReadVersion();
     TObject::Streamer(R__b);
     // read the new data member for all versions beyond the first one
     if (R__v > 1) {
        R__b >> fNewInt;
     }
     R__b >> fNtrack;
     ...
  } else {
     R__b.WriteVersion(Event::IsA());
     TObject::Streamer(R__b);
     R__b << fNewInt;
     ...
  }
}

In the writing part of the streamer, you will always want to write all 
members. This elegant and simple versioning mechanism allows your objects to 
evolve and be backward compatible. 

Note that if you are declaring your own classes but are not interested in 
writing them to a file, you can set the version number to zero and rootcint 
will generate an empty streamer. 
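
The version check can be exercised on its own with a toy buffer. This is a 
sketch only: ToyBuffer, ToyEventV2 and ReadOldData are hypothetical names, 
the version is stored as a plain integer at the front of the data, and the 
default value -1 for the missing member is an arbitrary choice made for the 
example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flat buffer with a read cursor, standing in for TBuffer.
struct ToyBuffer {
    std::vector<std::int32_t> data;
    std::size_t pos = 0;
    ToyBuffer &operator<<(std::int32_t v) { data.push_back(v); return *this; }
    ToyBuffer &operator>>(std::int32_t &v) {
        assert(pos < data.size());
        v = data[pos++];
        return *this;
    }
};

// Version 2 of a toy event: fNewInt was added in front of fNtrack.
struct ToyEventV2 {
    static constexpr std::int32_t kVersion = 2;
    std::int32_t fNewInt = -1;   // value kept when reading version 1 data
    std::int32_t fNtrack = 0;

    void Write(ToyBuffer &b) const {
        b << kVersion << fNewInt << fNtrack;   // always write all members
    }
    void Read(ToyBuffer &b) {
        std::int32_t v = 0;
        b >> v;                      // version stored by the writer
        if (v > 1) b >> fNewInt;     // only present from version 2 on
        b >> fNtrack;
    }
};

// Helper: read a buffer laid out the way version 1 would have written it.
ToyEventV2 ReadOldData() {
    ToyBuffer old;
    old << 1 << 12;                  // version 1 layout: [version][fNtrack]
    ToyEventV2 e;
    e.Read(old);
    return e;
}
```

Old data leaves fNewInt at its default and still fills fNtrack correctly, 
while data written by Write is read back in full.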



This archive was generated by hypermail 2b29 : Tue Jan 02 2001 - 11:50:31 MET