How to Read a ROOT File via Google Storage?

Google Storage is a commercial service provided by Google that allows users, typically developers, to store data in the Google cloud. The service provides a highly scalable, reliable, low-latency storage infrastructure, priced by usage and by the region in which the data is stored. Data, ranging in size from bytes to terabytes, can be uploaded as an unlimited number of files, which are stored in buckets. Each file belongs to a bucket, and each bucket is associated with a region. If you are not yet a Google Storage user, you can follow the signup service tutorial. Because Google provides several ways to access your project to create a bucket, upload files, or perform other operations, inexperienced users can follow the GSUtil tool tutorial or check the Google Storage Manager page.

To access a ROOT file via Google Storage, simply create a TGSFile object instead of a TFile object, using a URL of the form gs://server/bucket/file as the file name. Passing such a URL to TFile::Open, as in the following session, creates the TGSFile for you:

 
root [0] TFile *f = TFile::Open("gs://commondatastorage.googleapis.com/roots3/hsimple.root")
root [1] f->ls()
TGSFile**		gs://commondatastorage.googleapis.com/roots3/hsimple.root	Demo ROOT file with histograms
 TGSFile*		gs://commondatastorage.googleapis.com/roots3/hsimple.root	Demo ROOT file with histograms
  KEY: TH1F	hpx;1	This is the px distribution
  KEY: TH2F	hpxpy;1	py vs px
  KEY: TProfile	hprof;1	Profile of pz versus px
  KEY: TNtuple	ntuple;1	Demo ntuple
root [2] hpx->Draw()
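
To make the gs://server/bucket/file layout concrete, here is a minimal sketch in plain C++ that splits such a URL into its three parts. The names GsUrlParts and splitGsUrl are hypothetical and are not part of ROOT; ROOT does its own URL handling internally, so this only illustrates the expected shape of the file name.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type (not part of ROOT): the three parts of a
// gs://server/bucket/file URL.
struct GsUrlParts {
    std::string server;
    std::string bucket;
    std::string file;
};

// Split "gs://server/bucket/file" into server, bucket and file.
// Returns false, leaving 'out' untouched, if the URL does not have that shape.
bool splitGsUrl(const std::string& url, GsUrlParts& out) {
    const std::string scheme = "gs://";
    if (url.compare(0, scheme.size(), scheme) != 0) return false;

    // The server name runs from the end of the scheme to the next '/'.
    const std::size_t serverBegin = scheme.size();
    const std::size_t serverEnd = url.find('/', serverBegin);
    if (serverEnd == std::string::npos || serverEnd == serverBegin) return false;

    // The bucket name runs up to the following '/'.
    const std::size_t bucketBegin = serverEnd + 1;
    const std::size_t bucketEnd = url.find('/', bucketBegin);
    if (bucketEnd == std::string::npos || bucketEnd == bucketBegin) return false;

    // Everything after the bucket is the file name within the bucket.
    const std::string file = url.substr(bucketEnd + 1);
    if (file.empty()) return false;

    out.server = url.substr(serverBegin, serverEnd - serverBegin);
    out.bucket = url.substr(bucketBegin, bucketEnd - bucketBegin);
    out.file = file;
    return true;
}
```

For the URL used in the session above, this sketch separates the server commondatastorage.googleapis.com, the bucket roots3, and the file hsimple.root.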

Since TGSFile inherits from TWebFile, which in turn inherits from TFile, all TFile operations work as expected. However, because TWebFile is read-only, TGSFile is read-only as well. Like TWebFile, TGSFile is ideally suited to reading relatively small objects (such as histograms or other data analysis results). Although it is possible, you don't want to analyse large TTrees via a TGSFile.

Before you start using Google Storage with ROOT, a configuration step is required, because ROOT and Google Storage communicate through HTTP requests that follow a RESTful API provided by Google.

The following step-by-step recipe configures authentication between ROOT and Google Storage:

  1. Enable Legacy Access for Google Storage.
  2. Once you have generated both Legacy Storage Access Keys, retrieve both Access Key and Secret.
  3. Open your bash profile file (usually ~/.bashrc or ~/.profile) and add the following lines:
    export GS_ACCESS_ID="your access id"
    export GS_ACCESS_KEY="your access secret key"
  4. Restart your terminal for the changes to take effect, and confirm that both variables are set by typing printenv.
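
The check in the last step can also be done programmatically. The following is a minimal sketch in plain C++, assuming only the two variable names from the recipe above; the function name missingGsCredentials is hypothetical and is not part of ROOT.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper (not part of ROOT): return the names of the
// required Google Storage variables that are unset or empty.
std::vector<std::string> missingGsCredentials() {
    const char* required[] = {"GS_ACCESS_ID", "GS_ACCESS_KEY"};
    std::vector<std::string> missing;
    for (const char* name : required) {
        // std::getenv returns a null pointer when the variable is not set.
        const char* value = std::getenv(name);
        if (value == nullptr || *value == '\0') missing.push_back(name);
    }
    return missing;
}
```

An empty result means both variables are visible to the process. A macro could call such a helper before opening a gs:// URL and report the missing names, rather than discovering the problem only when the request is sent.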

ROOT reads these environment variables when you create a TGSFile and uses them to calculate a signature that is appended to each request sent to the Google Storage server. Although the communication is safe, since the signature itself is an encrypted string, the Secret Access Key used to calculate it can be retrieved by anyone who has access to your terminal. Make sure that the system you configure is secure, and renew your Secret Access Key regularly.