Wrapper sink that coalesces cluster column page writes.
Definition at line 41 of file RPageSinkBuf.hxx.
Classes | |
class | RColumnBuf |
A buffered column. More... | |
struct | RCounters |
I/O performance counters that get registered in fMetrics. More... | |
Public Member Functions | |
RPageSinkBuf (const RPageSinkBuf &)=delete | |
RPageSinkBuf (RPageSinkBuf &&)=default | |
RPageSinkBuf (std::unique_ptr< RPageSink > inner) | |
~RPageSinkBuf () override | |
ColumnHandle_t | AddColumn (DescriptorId_t fieldId, RColumn &column) final |
Register a new column. | |
std::uint64_t | CommitCluster (NTupleSize_t nNewEntries) final |
Finalize the current cluster and create a new one for the following data. | |
void | CommitClusterGroup () final |
Write out the page locations (page list envelope) for all the committed clusters since the last call of CommitClusterGroup (or the beginning of writing). | |
void | CommitDatasetImpl () final |
void | CommitPage (ColumnHandle_t columnHandle, const RPage &page) final |
Write a page to the storage. The column must have been added before. | |
void | CommitSealedPage (DescriptorId_t physicalColumnId, const RSealedPage &sealedPage) final |
Write a preprocessed page to storage. The column must have been added before. | |
void | CommitSealedPageV (std::span< RPageStorage::RSealedPageGroup > ranges) final |
Write a vector of preprocessed pages to storage. The corresponding columns must have been added before. | |
void | CommitStagedClusters (std::span< RStagedCluster > clusters) final |
Commit staged clusters, logically appending them to the ntuple descriptor. | |
void | CommitSuppressedColumn (ColumnHandle_t columnHandle) final |
Commits a suppressed column for the current cluster. | |
const RNTupleDescriptor & | GetDescriptor () const final |
Return the RNTupleDescriptor being constructed. | |
void | InitImpl (RNTupleModel &model) final |
RPageSinkBuf & | operator= (const RPageSinkBuf &)=delete |
RPageSinkBuf & | operator= (RPageSinkBuf &&)=default |
RPage | ReservePage (ColumnHandle_t columnHandle, std::size_t nElements) final |
Get a new, empty page for the given column that can be filled with up to nElements; nElements must be larger than zero. | |
RStagedCluster | StageCluster (NTupleSize_t nNewEntries) final |
Stage the current cluster and create a new one for the following data. | |
void | UpdateExtraTypeInfo (const RExtraTypeInfoDescriptor &extraTypeInfo) final |
Adds an extra type information record to schema. | |
void | UpdateSchema (const RNTupleModelChangeset &changeset, NTupleSize_t firstEntry) final |
Incorporate incremental changes to the model into the ntuple descriptor. | |
Public Member Functions inherited from ROOT::Experimental::Internal::RPageSink | |
RPageSink (const RPageSink &)=delete | |
RPageSink (RPageSink &&)=default | |
RPageSink (std::string_view ntupleName, const RNTupleWriteOptions &options) | |
~RPageSink () override | |
void | CommitDataset () |
Run the registered callbacks and finalize the current cluster and the entrire data set. | |
void | DropColumn (ColumnHandle_t) final |
Unregisters a column. | |
virtual RSinkGuard | GetSinkGuard () |
EPageStorageType | GetType () final |
Whether the concrete implementation is a sink or a source. | |
const RNTupleWriteOptions & | GetWriteOptions () const |
Returns the sink's write options. | |
void | Init (RNTupleModel &model) |
Physically creates the storage container to hold the ntuple (e.g., a keys a TFile or an S3 bucket) Init() associates column handles to the columns referenced by the model. | |
bool | IsInitialized () const |
RPageSink & | operator= (const RPageSink &)=delete |
RPageSink & | operator= (RPageSink &&)=default |
void | RegisterOnCommitDatasetCallback (Callback_t callback) |
The registered callback is executed at the beginning of CommitDataset();. | |
Public Member Functions inherited from ROOT::Experimental::Internal::RPageStorage | |
RPageStorage (const RPageStorage &other)=delete | |
RPageStorage (RPageStorage &&other)=default | |
RPageStorage (std::string_view name) | |
virtual | ~RPageStorage () |
ColumnId_t | GetColumnId (ColumnHandle_t columnHandle) const |
virtual Detail::RNTupleMetrics & | GetMetrics () |
Returns the default metrics object. | |
const std::string & | GetNTupleName () const |
Returns the NTuple name. | |
RPageStorage & | operator= (const RPageStorage &other)=delete |
RPageStorage & | operator= (RPageStorage &&other)=default |
void | SetTaskScheduler (RTaskScheduler *taskScheduler) |
Private Member Functions | |
void | ConnectFields (const std::vector< RFieldBase * > &fields, NTupleSize_t firstEntry) |
void | FlushClusterImpl (std::function< void(void)> FlushClusterFn) |
Private Attributes | |
std::vector< RColumnBuf > | fBufferedColumns |
Vector of buffered column pages. Indexed by column id. | |
std::unique_ptr< RCounters > | fCounters |
std::unique_ptr< RNTupleModel > | fInnerModel |
The buffered page sink maintains a copy of the RNTupleModel for the inner sink. | |
std::unique_ptr< RPageSink > | fInnerSink |
The inner sink, responsible for actually performing I/O. | |
DescriptorId_t | fNColumns = 0 |
DescriptorId_t | fNFields = 0 |
std::vector< ColumnHandle_t > | fSuppressedColumns |
Columns committed as suppressed are stored and passed to the inner sink at cluster commit. | |
Additional Inherited Members | |
Public Types inherited from ROOT::Experimental::Internal::RPageSink | |
using | Callback_t = std::function< void(RPageSink &)> |
Public Types inherited from ROOT::Experimental::Internal::RPageStorage | |
using | ColumnHandle_t = RColumnHandle |
The column handle identifies a column with the current open page storage. | |
using | SealedPageSequence_t = std::deque< RSealedPage > |
Static Public Member Functions inherited from ROOT::Experimental::Internal::RPageSink | |
static RSealedPage | SealPage (const RSealPageConfig &config) |
Seal a page using the provided info. | |
Static Public Attributes inherited from ROOT::Experimental::Internal::RPageStorage | |
static constexpr std::size_t | kNBytesPageChecksum = sizeof(std::uint64_t) |
The page checksum is a 64bit xxhash3. | |
Protected Member Functions inherited from ROOT::Experimental::Internal::RPageSink | |
RSealedPage | SealPage (const RPage &page, const RColumnElementBase &element) |
Helper for streaming a page. | |
Protected Member Functions inherited from ROOT::Experimental::Internal::RPageStorage | |
void | WaitForAllTasks () |
Protected Attributes inherited from ROOT::Experimental::Internal::RPageSink | |
std::unique_ptr< RNTupleCompressor > | fCompressor |
Helper to zip pages and header/footer; includes a 16MB (kMAXZIPBUF) zip buffer. | |
std::unique_ptr< RNTupleWriteOptions > | fOptions |
Protected Attributes inherited from ROOT::Experimental::Internal::RPageStorage | |
Detail::RNTupleMetrics | fMetrics |
std::string | fNTupleName |
std::unique_ptr< RPageAllocator > | fPageAllocator |
For the time being, we will use the heap allocator for all sources and sinks. This may change in the future. | |
RTaskScheduler * | fTaskScheduler = nullptr |
#include <ROOT/RPageSinkBuf.hxx>
|
explicit |
Definition at line 34 of file RPageSinkBuf.cxx.
|
delete |
|
default |
|
override |
Definition at line 50 of file RPageSinkBuf.cxx.
|
finalvirtual |
Register a new column.
When reading, the column must exist in the ntuple on disk corresponding to the meta-data. When writing, every column can only be attached once.
Implements ROOT::Experimental::Internal::RPageStorage.
Definition at line 59 of file RPageSinkBuf.cxx.
|
finalvirtual |
Finalize the current cluster and create a new one for the following data.
Returns the number of bytes written to storage (excluding meta-data).
Reimplemented from ROOT::Experimental::Internal::RPageSink.
Definition at line 261 of file RPageSinkBuf.cxx.
|
finalvirtual |
Write out the page locations (page list envelope) for all the committed clusters since the last call of CommitClusterGroup (or the beginning of writing).
Implements ROOT::Experimental::Internal::RPageSink.
Definition at line 283 of file RPageSinkBuf.cxx.
|
finalvirtual |
Implements ROOT::Experimental::Internal::RPageSink.
Definition at line 290 of file RPageSinkBuf.cxx.
|
finalvirtual |
Write a page to the storage. The column must have been added before.
Implements ROOT::Experimental::Internal::RPageSink.
Definition at line 143 of file RPageSinkBuf.cxx.
|
finalvirtual |
Write a preprocessed page to storage. The column must have been added before.
Implements ROOT::Experimental::Internal::RPageSink.
Definition at line 219 of file RPageSinkBuf.cxx.
|
finalvirtual |
Write a vector of preprocessed pages to storage. The corresponding columns must have been added before.
Implements ROOT::Experimental::Internal::RPageSink.
Definition at line 225 of file RPageSinkBuf.cxx.
|
finalvirtual |
Commit staged clusters, logically appending them to the ntuple descriptor.
Implements ROOT::Experimental::Internal::RPageSink.
Definition at line 276 of file RPageSinkBuf.cxx.
|
finalvirtual |
Commits a suppressed column for the current cluster.
Can be called anytime before CommitCluster(). For any given column and cluster, there must be no calls to both CommitSuppressedColumn() and page commits.
Implements ROOT::Experimental::Internal::RPageSink.
Definition at line 138 of file RPageSinkBuf.cxx.
|
private |
Definition at line 64 of file RPageSinkBuf.cxx.
|
private |
Definition at line 233 of file RPageSinkBuf.cxx.
|
finalvirtual |
Return the RNTupleDescriptor being constructed.
Implements ROOT::Experimental::Internal::RPageSink.
Definition at line 82 of file RPageSinkBuf.cxx.
|
finalvirtual |
Implements ROOT::Experimental::Internal::RPageSink.
Definition at line 87 of file RPageSinkBuf.cxx.
|
delete |
|
default |
|
finalvirtual |
Get a new, empty page for the given column that can be filled with up to nElements; nElements must be larger than zero.
Reimplemented from ROOT::Experimental::Internal::RPageSink.
Definition at line 298 of file RPageSinkBuf.cxx.
|
finalvirtual |
Stage the current cluster and create a new one for the following data.
Returns the object that must be passed to CommitStagedClusters to logically append the staged cluster to the ntuple descriptor.
Implements ROOT::Experimental::Internal::RPageSink.
Definition at line 269 of file RPageSinkBuf.cxx.
|
finalvirtual |
Adds an extra type information record to schema.
The extra type information will be written to the extension header. The information in the record will be merged with the existing information, e.g. duplicate streamer info records will be removed. This method is called by the "on commit dataset" callback registered by specific fields (e.g., streamer field) and during merging.
Implements ROOT::Experimental::Internal::RPageSink.
Definition at line 131 of file RPageSinkBuf.cxx.
|
finalvirtual |
Incorporate incremental changes to the model into the ntuple descriptor.
This happens, e.g. if new fields were added after the initial call to RPageSink::Init(RNTupleModel &)
. firstEntry
specifies the global index for the first stored element in the added columns.
Implements ROOT::Experimental::Internal::RPageSink.
Definition at line 95 of file RPageSinkBuf.cxx.
|
private |
Vector of buffered column pages. Indexed by column id.
Definition at line 115 of file RPageSinkBuf.hxx.
|
private |
Definition at line 108 of file RPageSinkBuf.hxx.
|
private |
The buffered page sink maintains a copy of the RNTupleModel for the inner sink.
For the unbuffered case, the RNTupleModel is instead managed by a RNTupleWriter.
Definition at line 113 of file RPageSinkBuf.hxx.
|
private |
The inner sink, responsible for actually performing I/O.
Definition at line 110 of file RPageSinkBuf.hxx.
|
private |
Definition at line 119 of file RPageSinkBuf.hxx.
|
private |
Definition at line 118 of file RPageSinkBuf.hxx.
|
private |
Columns committed as suppressed are stored and passed to the inner sink at cluster commit.
Definition at line 117 of file RPageSinkBuf.hxx.