How to Read a ROOT File via Amazon S3?
Amazon Simple Storage Service (Amazon S3) is a commercial service provided by Amazon that allows users, typically developers, to store data in the Amazon cloud. The service offers a highly scalable, reliable, and low-latency data storage infrastructure, priced according to usage and to the region in which the data is stored. Data ranging in size from bytes to terabytes can be uploaded as an unlimited number of files, which are stored in buckets. Each file belongs to a bucket, and each bucket is associated with a region. If you are an inexperienced S3 user, follow the official Get Started guide to learn basic Amazon S3 tasks such as Create a Bucket and Upload Files.
To access a ROOT file via Amazon S3, simply open it with a URL of the form s3://server/bucket/file as the file name; ROOT then creates a TS3WebFile object instead of a TFile object. For example:
root [0] TFile *f = TFile::Open("s3://s3-eu-west-1.amazonaws.com/roots3/hsimple.root")
root [1] f->ls()
TS3WebFile** s3://s3-eu-west-1.amazonaws.com/roots3/hsimple.root Demo ROOT file with histograms
TS3WebFile* s3://s3-eu-west-1.amazonaws.com/roots3/hsimple.root Demo ROOT file with histograms
KEY: TH1F hpx;1 This is the px distribution
KEY: TH2F hpxpy;1 py vs px
KEY: TProfile hprof;1 Profile of pz versus px
KEY: TNtuple ntuple;1 Demo ntuple
root [2] hpx->Draw()
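To make the s3://server/bucket/file layout concrete, here is a minimal sketch of how such a URL splits into its three parts. It is plain standard C++ and purely illustrative: S3Location and ParseS3Url are hypothetical names introduced here, not part of ROOT, and ROOT's own URL handling may differ.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type (not a ROOT class): the three parts of an s3:// URL.
struct S3Location {
    std::string server;  // e.g. "s3-eu-west-1.amazonaws.com"
    std::string bucket;  // e.g. "roots3"
    std::string file;    // e.g. "hsimple.root" (may itself contain '/')
};

// Split "s3://server/bucket/file" into server, bucket and file.
// Returns false, leaving 'out' untouched, if the URL does not have that shape.
bool ParseS3Url(const std::string &url, S3Location &out)
{
    const std::string scheme = "s3://";
    if (url.compare(0, scheme.size(), scheme) != 0)
        return false;  // wrong or missing scheme

    // The server runs from the end of the scheme to the next '/'.
    const std::size_t serverBegin = scheme.size();
    const std::size_t serverEnd = url.find('/', serverBegin);
    if (serverEnd == std::string::npos || serverEnd == serverBegin)
        return false;  // no server part

    // The bucket is the first path component after the server.
    const std::size_t bucketBegin = serverEnd + 1;
    const std::size_t bucketEnd = url.find('/', bucketBegin);
    if (bucketEnd == std::string::npos || bucketEnd == bucketBegin)
        return false;  // no bucket part

    // Everything after the bucket is the file name.
    const std::size_t fileBegin = bucketEnd + 1;
    if (fileBegin >= url.size())
        return false;  // no file part

    out.server = url.substr(serverBegin, serverEnd - serverBegin);
    out.bucket = url.substr(bucketBegin, bucketEnd - bucketBegin);
    out.file = url.substr(fileBegin);
    return true;
}
```

Applied to the URL used in the session above, this sketch yields the server s3-eu-west-1.amazonaws.com, the bucket roots3, and the file hsimple.root.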
Since TS3WebFile inherits from TWebFile, which in turn inherits from TFile, all TFile operations work as expected. However, since TWebFile is read-only, TS3WebFile is read-only as well. Like TWebFile, TS3WebFile is ideally suited to reading relatively small objects (such as histograms or other data analysis results). Although it is possible, analysing large TTrees via a TS3WebFile is not recommended.
Before you start using Amazon S3 with ROOT, a configuration step is required, because ROOT communicates with Amazon S3 through HTTP requests that follow the RESTful API provided by Amazon S3, and each request must be signed with your credentials.
The following step-by-step recipe configures authentication between ROOT and Amazon S3:
- Go to the main Amazon S3 website and log in to your account.
- Once you are logged in, click on Security Credentials in the left panel.
- In the Access Credentials section, under the Access Keys tab, retrieve both the Access Key ID and the Secret Access Key.
- Open your bash profile file (usually ~/.bashrc or ~/.profile) and add the following lines:
export S3_ACCESS_KEY="your access key id"
export S3_SECRET_KEY="your secret access key"
- Restart your terminal for the changes to take effect, and confirm that both variables are set by typing printenv.
ROOT reads these environment variables when you create a TS3WebFile and uses them to calculate a signature that is appended to each request sent to the Amazon S3 server. Although the signature itself does not expose the Secret Access Key, the key used to calculate it can be retrieved by anyone who has access to your terminal. Make sure that the system on which you configure these variables is secure, and renew your Secret Access Key regularly.
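The printenv check from the recipe can also be done programmatically before opening any s3:// URL. The following is a minimal sketch in standard C++, not ROOT's own logic: MissingS3Variables is a hypothetical helper name, and treating an empty value as missing is an assumption of this sketch.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper (not part of ROOT): return the names of the required
// S3 environment variables that are currently unset or empty.
std::vector<std::string> MissingS3Variables()
{
    // The two variable names set in the recipe above.
    const char *required[] = {"S3_ACCESS_KEY", "S3_SECRET_KEY"};

    std::vector<std::string> missing;
    for (const char *name : required) {
        const char *value = std::getenv(name);
        // Assumption of this sketch: an empty value counts as missing.
        if (value == nullptr || value[0] == '\0')
            missing.push_back(name);
    }
    return missing;
}
```

An empty result means both variables are visible to the current process. The helper reports only variable names and never their values, so the Secret Access Key does not end up in terminal output or log files.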