Abstract interface to write data into an ntuple.
The page sink takes the list of columns and afterwards a series of page commits and cluster commits. The user is responsible for committing clusters at a consistent point, i.e. when all pages corresponding to data up to the given entry number have been committed.
Definition at line 169 of file RPageStorage.hxx.
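The commit order described above can be illustrated with a small stand-in. This is a minimal sketch and not the ROOT class: `ToySink` and all of its members are hypothetical, and it assumes for simplicity that every column stores exactly one element per entry and that the entry number passed to the cluster commit is cumulative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a page sink; it only tracks element counts.
class ToySink {
public:
   // Register a column and return its handle (here simply its index).
   std::size_t AddColumn()
   {
      fOpenElements.push_back(0);
      return fOpenElements.size() - 1;
   }

   // Commit one page with nElements values; the column must have been added before.
   void CommitPage(std::size_t column, std::uint64_t nElements)
   {
      if (column >= fOpenElements.size())
         throw std::logic_error("column must be added before committing pages");
      fOpenElements[column] += nElements;
   }

   // Close the open cluster. nEntries is assumed to be the total number of
   // entries written so far, so the cluster size is the difference to the
   // previous commit.
   void CommitCluster(std::uint64_t nEntries)
   {
      if (nEntries < fPrevClusterNEntries)
         throw std::logic_error("entry number must not decrease");
      const std::uint64_t inCluster = nEntries - fPrevClusterNEntries;
      // Consistent point: all pages up to nEntries are committed for every column.
      for (std::uint64_t n : fOpenElements) {
         if (n != inCluster)
            throw std::logic_error("cluster committed at an inconsistent point");
      }
      for (auto &n : fOpenElements)
         n = 0;
      fEntriesInLastCluster = inCluster;
      fPrevClusterNEntries = nEntries;
      ++fNClusters;
   }

   std::uint64_t EntriesInLastCluster() const { return fEntriesInLastCluster; }
   std::size_t NClusters() const { return fNClusters; }

private:
   std::vector<std::uint64_t> fOpenElements; // elements per column in the open cluster
   std::uint64_t fPrevClusterNEntries = 0;
   std::uint64_t fEntriesInLastCluster = 0;
   std::size_t fNClusters = 0;
};
```

In this toy model, committing a cluster while one column is still missing a page raises an error, which mirrors the responsibility the class description places on the caller.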
Classes | |
struct | RCounters |
Default I/O performance counters that get registered in fMetrics. More... | |
Public Member Functions | |
RPageSink (const RPageSink &)=delete | |
RPageSink (RPageSink &&)=default | |
RPageSink (std::string_view ntupleName, const RNTupleWriteOptions &options) | |
~RPageSink () override | |
ColumnHandle_t | AddColumn (DescriptorId_t fieldId, const RColumn &column) final |
Register a new column. | |
std::uint64_t | CommitCluster (NTupleSize_t nEntries) |
Finalize the current cluster and create a new one for the following data. | |
void | CommitClusterGroup () |
Write out the page locations (page list envelope) for all the committed clusters since the last call of CommitClusterGroup (or the beginning of writing). | |
void | CommitDataset () |
Finalize the current cluster and the entire data set. | |
void | CommitPage (ColumnHandle_t columnHandle, const RPage &page) |
Write a page to the storage. The column must have been added before. | |
void | CommitSealedPage (DescriptorId_t columnId, const RPageStorage::RSealedPage &sealedPage) |
Write a preprocessed page to storage. The column must have been added before. | |
void | CommitSealedPageV (std::span< RPageStorage::RSealedPageGroup > ranges) |
Write a vector of preprocessed pages to storage. The corresponding columns must have been added before. | |
void | Create (RNTupleModel &model) |
Physically creates the storage container to hold the ntuple (e.g., a key in a TFile or an S3 bucket). To do so, Create() calls CreateImpl() after updating the descriptor. | |
void | DropColumn (ColumnHandle_t) final |
Unregisters a column. | |
RNTupleMetrics & | GetMetrics () override |
Returns the default metrics object. Subclasses might alternatively provide their own metrics object by overriding this. | |
EPageStorageType | GetType () final |
Whether the concrete implementation is a sink or a source. | |
const RNTupleWriteOptions & | GetWriteOptions () const |
Returns the sink's write options. | |
RPageSink & | operator= (const RPageSink &)=delete |
RPageSink & | operator= (RPageSink &&)=default |
virtual RPage | ReservePage (ColumnHandle_t columnHandle, std::size_t nElements)=0 |
Get a new, empty page for the given column that can be filled with up to nElements. | |
Public Member Functions inherited from ROOT::Experimental::Detail::RPageStorage | |
RPageStorage (const RPageStorage &other)=delete | |
RPageStorage (RPageStorage &&other)=default | |
RPageStorage (std::string_view name) | |
virtual | ~RPageStorage () |
const std::string & | GetNTupleName () const |
Returns the NTuple name. | |
RPageStorage & | operator= (const RPageStorage &other)=delete |
RPageStorage & | operator= (RPageStorage &&other)=default |
virtual void | ReleasePage (RPage &page)=0 |
Every page store needs to be able to free pages it handed out. | |
void | SetTaskScheduler (RTaskScheduler *taskScheduler) |
Static Public Member Functions | |
static std::unique_ptr< RPageSink > | Create (std::string_view ntupleName, std::string_view location, const RNTupleWriteOptions &options=RNTupleWriteOptions()) |
Guess the concrete derived page sink from the file name (location). | |
Protected Member Functions | |
virtual RNTupleLocator | CommitClusterGroupImpl (unsigned char *serializedPageList, std::uint32_t length)=0 |
Returns the locator of the page list envelope of the given buffer that contains the serialized page list. | |
virtual std::uint64_t | CommitClusterImpl (NTupleSize_t nEntries)=0 |
Returns the number of bytes written to storage (excluding metadata) | |
virtual void | CommitDatasetImpl (unsigned char *serializedFooter, std::uint32_t length)=0 |
virtual RNTupleLocator | CommitPageImpl (ColumnHandle_t columnHandle, const RPage &page)=0 |
virtual RNTupleLocator | CommitSealedPageImpl (DescriptorId_t columnId, const RPageStorage::RSealedPage &sealedPage)=0 |
virtual std::vector< RNTupleLocator > | CommitSealedPageVImpl (std::span< RPageStorage::RSealedPageGroup > ranges) |
Vector commit of preprocessed pages. | |
virtual void | CreateImpl (const RNTupleModel &model, unsigned char *serializedHeader, std::uint32_t length)=0 |
void | EnableDefaultMetrics (const std::string &prefix) |
Enables the default set of metrics provided by RPageSink. | |
RSealedPage | SealPage (const RPage &page, const RColumnElementBase &element, int compressionSetting) |
Helper for streaming a page. | |
Static Protected Member Functions | |
static RSealedPage | SealPage (const RPage &page, const RColumnElementBase &element, int compressionSetting, void *buf) |
Seal a page using the provided buffer. | |
Protected Attributes | |
std::unique_ptr< RNTupleCompressor > | fCompressor |
Helper to zip pages and header/footer; includes a 16MB (kMAXZIPBUF) zip buffer. | |
std::unique_ptr< RCounters > | fCounters |
RNTupleDescriptorBuilder | fDescriptorBuilder |
RNTupleMetrics | fMetrics |
std::uint64_t | fNextClusterInGroup = 0 |
Remembers the starting cluster id for the next cluster group. | |
std::vector< RClusterDescriptor::RColumnRange > | fOpenColumnRanges |
Keeps track of the number of elements in the currently open cluster. Indexed by column id. | |
std::vector< RClusterDescriptor::RPageRange > | fOpenPageRanges |
Keeps track of the written pages in the currently open cluster. Indexed by column id. | |
std::unique_ptr< RNTupleWriteOptions > | fOptions |
NTupleSize_t | fPrevClusterNEntries = 0 |
Used to calculate the number of entries in the current cluster. | |
Protected Attributes inherited from ROOT::Experimental::Detail::RPageStorage | |
std::string | fNTupleName |
RTaskScheduler * | fTaskScheduler = nullptr |
Private Attributes | |
Internal::RNTupleSerializer::RContext | fSerializationContext |
Used to map the IDs of the descriptor to the physical IDs issued during header/footer serialization. | |
Additional Inherited Members | |
Public Types inherited from ROOT::Experimental::Detail::RPageStorage | |
using | ColumnHandle_t = RColumnHandle |
The column handle identifies a column with the current open page storage. | |
using | SealedPageSequence_t = std::deque< RSealedPage > |
#include <ROOT/RPageStorage.hxx>
ROOT::Experimental::Detail::RPageSink::RPageSink | ( | std::string_view | ntupleName, |
const RNTupleWriteOptions & | options | ||
) |
Definition at line 237 of file RPageStorage.cxx.
|
delete |
|
default |
|
override |
Definition at line 242 of file RPageStorage.cxx.
|
final, virtual |
Register a new column.
When reading, the column must exist in the ntuple on disk corresponding to the meta-data. When writing, every column can only be attached once.
Implements ROOT::Experimental::Detail::RPageStorage.
Definition at line 272 of file RPageStorage.cxx.
std::uint64_t ROOT::Experimental::Detail::RPageSink::CommitCluster | ( | NTupleSize_t | nEntries | ) |
Finalize the current cluster and create a new one for the following data.
Returns the number of bytes written to storage (excluding meta-data).
Definition at line 368 of file RPageStorage.cxx.
void ROOT::Experimental::Detail::RPageSink::CommitClusterGroup | ( | ) |
Write out the page locations (page list envelope) for all the committed clusters since the last call of CommitClusterGroup (or the beginning of writing).
Definition at line 390 of file RPageStorage.cxx.
|
protected, pure virtual |
Returns the locator of the page list envelope of the given buffer that contains the serialized page list.
Typically, the implementation takes care of compressing and writing the provided buffer.
Implemented in ROOT::Experimental::Detail::RPageSinkBuf, ROOT::Experimental::Detail::RPageSinkDaos, and ROOT::Experimental::Detail::RPageSinkFile.
|
protected, pure virtual |
Returns the number of bytes written to storage (excluding metadata)
Implemented in ROOT::Experimental::Detail::RPageSinkBuf, ROOT::Experimental::Detail::RPageSinkDaos, and ROOT::Experimental::Detail::RPageSinkFile.
void ROOT::Experimental::Detail::RPageSink::CommitDataset | ( | ) |
Finalize the current cluster and the entire data set.
Definition at line 419 of file RPageStorage.cxx.
|
protected, pure virtual |
void ROOT::Experimental::Detail::RPageSink::CommitPage | ( | ColumnHandle_t | columnHandle, |
const RPage & | page | ||
) |
Write a page to the storage. The column must have been added before.
Definition at line 317 of file RPageStorage.cxx.
|
protected, pure virtual |
void ROOT::Experimental::Detail::RPageSink::CommitSealedPage | ( | DescriptorId_t | columnId, |
const RPageStorage::RSealedPage & | sealedPage | ||
) |
Write a preprocessed page to storage. The column must have been added before.
Definition at line 328 of file RPageStorage.cxx.
|
protected, pure virtual |
void ROOT::Experimental::Detail::RPageSink::CommitSealedPageV | ( | std::span< RPageStorage::RSealedPageGroup > | ranges | ) |
Write a vector of preprocessed pages to storage. The corresponding columns must have been added before.
Definition at line 351 of file RPageStorage.cxx.
|
protected, virtual |
Vector commit of preprocessed pages.
The ranges array specifies a range of sealed pages to be committed for each column. The returned vector contains, in order, the RNTupleLocator for each page on each range in ranges, i.e. the first N entries refer to the N pages in ranges[0], followed by M entries that refer to the M pages in ranges[1], etc. The default is to call CommitSealedPageImpl for each page; derived classes may provide an optimized implementation though.
Reimplemented in ROOT::Experimental::Detail::RPageSinkDaos.
Definition at line 341 of file RPageStorage.cxx.
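The ordering of the returned locators can be sketched with simplified stand-ins. The types `ToySealedPage`, `ToyPageGroup` and `ToyLocator` below are hypothetical and carry far less state than RSealedPage, RSealedPageGroup and RNTupleLocator; the offset bookkeeping is likewise an assumption made only so that the locators are distinguishable.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for the real storage types.
struct ToySealedPage {
   std::uint32_t size;
};
struct ToyPageGroup {
   std::uint64_t columnId;
   std::vector<ToySealedPage> pages;
};
struct ToyLocator {
   std::uint64_t columnId;
   std::uint64_t offset;
   std::uint32_t size;
};

class ToySink {
public:
   // Single-page commit: append the page and report where it was placed.
   ToyLocator CommitSealedPageImpl(std::uint64_t columnId, const ToySealedPage &page)
   {
      ToyLocator locator{columnId, fNextOffset, page.size};
      fNextOffset += page.size;
      return locator;
   }

   // Vector commit in the spirit of the documented default: one single-page
   // commit per page, visiting the ranges in order, so the result is the
   // concatenation of the locators of ranges[0], ranges[1], and so on.
   std::vector<ToyLocator> CommitSealedPageVImpl(const std::vector<ToyPageGroup> &ranges)
   {
      std::vector<ToyLocator> locators;
      for (const auto &range : ranges) {
         for (const auto &page : range.pages)
            locators.push_back(CommitSealedPageImpl(range.columnId, page));
      }
      return locators;
   }

private:
   std::uint64_t fNextOffset = 0; // next free position in the toy storage
};
```

A backend that can batch writes would override the vector variant but would still have to return the locators in this flattened order.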
void ROOT::Experimental::Detail::RPageSink::Create | ( | RNTupleModel & | model | ) |
Physically creates the storage container to hold the ntuple (e.g., a key in a TFile or an S3 bucket). To do so, Create() calls CreateImpl() after updating the descriptor.
Create() associates column handles to the columns referenced by the model.
Definition at line 280 of file RPageStorage.cxx.
|
static |
Guess the concrete derived page sink from the file name (location).
Definition at line 246 of file RPageStorage.cxx.
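How such a guess might work can be sketched as a dispatch on the location string. The sketch rests on assumptions: the `daos://` prefix, the `ToyBackend` enum and the `GuessBackend` function are illustrative only and are not taken from this page, which merely lists file-based and DAOS-based sinks among the implementations.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical tag for the concrete sink that would be constructed.
enum class ToyBackend { kFile, kDaos };

// Pick a backend from the location string. The "daos://" scheme is an
// assumption of this sketch; everything else is treated as a file path.
inline ToyBackend GuessBackend(std::string_view location)
{
   constexpr std::string_view kDaosPrefix = "daos://";
   if (location.substr(0, kDaosPrefix.size()) == kDaosPrefix)
      return ToyBackend::kDaos;
   return ToyBackend::kFile;
}
```

A factory built on this would then construct the matching sink and hand it back as a `std::unique_ptr` to the base class, as the static Create() signature suggests.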
|
protected, pure virtual |
|
inline, final, virtual |
Unregisters a column.
A page source decreases the reference counter for the corresponding active column. For a page sink, dropping columns is currently a no-op.
Implements ROOT::Experimental::Detail::RPageStorage.
Definition at line 262 of file RPageStorage.hxx.
|
protected |
Enables the default set of metrics provided by RPageSink.
prefix will be used as the prefix for the counters registered in the internal RNTupleMetrics object. This set of counters can be extended by a subclass by calling fMetrics.MakeCounter<...>().
A subclass using the default set of metrics is always responsible for updating the counters appropriately, e.g. fCounters->fNPageCommited.Inc().
Alternatively, a subclass might provide its own RNTupleMetrics object by overriding the GetMetrics() member function.
Definition at line 467 of file RPageStorage.cxx.
|
inline, override, virtual |
Returns the default metrics object. Subclasses might alternatively provide their own metrics object by overriding this.
Implements ROOT::Experimental::Detail::RPageStorage.
Reimplemented in ROOT::Experimental::Detail::RPageSinkBuf.
Definition at line 288 of file RPageStorage.hxx.
|
inline, final, virtual |
Whether the concrete implementation is a sink or a source.
Implements ROOT::Experimental::Detail::RPageStorage.
Definition at line 257 of file RPageStorage.hxx.
|
inline |
Returns the sink's write options.
Definition at line 259 of file RPageStorage.hxx.
|
pure virtual |
Get a new, empty page for the given column that can be filled with up to nElements.
If nElements is zero, the page sink picks an appropriate size.
Implemented in ROOT::Experimental::Detail::RPageSinkBuf, ROOT::Experimental::Detail::RPageSinkDaos, and ROOT::Experimental::Detail::RPageSinkFile.
|
protected |
Helper for streaming a page.
This is commonly used in derived, concrete page sinks. Note that if compressionSetting is 0 (uncompressed) and the page is mappable, the returned sealed page will point directly to the input page buffer. Otherwise, the sealed page references an internal buffer of fCompressor. Thus, the buffer pointed to by the RSealedPage should never be freed. Usage of this method requires construction of fCompressor.
Definition at line 460 of file RPageStorage.cxx.
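The buffer ownership rule above can be made concrete with a small model. This is a sketch under stated assumptions, not the ROOT implementation: `ToySealedPage` and `ToySealer` are hypothetical, and the "compression" is a plain copy that stands in for whatever the real compressor writes into its internal buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sealed page: a non-owning view of some buffer.
struct ToySealedPage {
   const void *buffer;
   std::size_t size;
};

// Hypothetical sealer with an internal scratch buffer, playing the role
// that the page describes for fCompressor.
class ToySealer {
public:
   ToySealedPage Seal(const std::vector<unsigned char> &page, bool compress)
   {
      if (!compress) {
         // Uncompressed case: the sealed page aliases the input buffer.
         return {page.data(), page.size()};
      }
      // Placeholder for real compression: copy into the internal buffer.
      fScratch.assign(page.begin(), page.end());
      return {fScratch.data(), fScratch.size()};
   }

private:
   std::vector<unsigned char> fScratch; // owned by the sealer, never by the caller
};
```

In either branch the caller receives a pointer it does not own, so freeing it would be an error. In this model a second call to `Seal` with compression also overwrites the scratch buffer, which suggests consuming a sealed page before sealing the next one; whether the real class behaves the same way is an assumption.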
|
static, protected |
Seal a page using the provided buffer.
Definition at line 431 of file RPageStorage.cxx.
|
protected |
Helper to zip pages and header/footer; includes a 16MB (kMAXZIPBUF) zip buffer.
There could be concrete page sinks that don't need a compressor. Therefore, and in order to stay consistent with the page source, we leave it up to the derived class whether or not the compressor gets constructed.
Definition at line 193 of file RPageStorage.hxx.
|
protected |
Definition at line 185 of file RPageStorage.hxx.
|
protected |
Definition at line 203 of file RPageStorage.hxx.
|
protected |
Definition at line 186 of file RPageStorage.hxx.
|
protected |
Remembers the starting cluster id for the next cluster group.
Definition at line 196 of file RPageStorage.hxx.
|
protected |
Keeps track of the number of elements in the currently open cluster. Indexed by column id.
Definition at line 200 of file RPageStorage.hxx.
|
protected |
Keeps track of the written pages in the currently open cluster. Indexed by column id.
Definition at line 202 of file RPageStorage.hxx.
|
protected |
Definition at line 188 of file RPageStorage.hxx.
|
protected |
Used to calculate the number of entries in the current cluster.
Definition at line 198 of file RPageStorage.hxx.
|
private |
Used to map the IDs of the descriptor to the physical IDs issued during header/footer serialization.
Definition at line 172 of file RPageStorage.hxx.