Inspect on-disk and storage-related information of an RNTuple.
The RNTupleInspector can be used for studying an RNTuple in terms of its storage efficiency. It provides information on the level of the RNTuple itself, on the (sub)field level and on the column level.
Example usage:
Definition at line 75 of file RNTupleInspector.hxx.
Classes | |
class | RColumnInspector |
Provides column-level storage information. | |
class | RFieldTreeInspector |
Provides field-level storage information. | |
Public Member Functions | |
RNTupleInspector (const RNTupleInspector &other)=delete | |
RNTupleInspector (RNTupleInspector &&other)=delete | |
~RNTupleInspector () | |
size_t | GetColumnCountByType (EColumnType colType) const |
Get the number of columns of a given type present in the RNTuple. | |
const RColumnInspector & | GetColumnInspector (DescriptorId_t physicalColumnId) const |
Get storage information for a given column. | |
const std::vector< DescriptorId_t > | GetColumnsByType (EColumnType colType) |
Get the IDs of all columns with the given type. | |
std::unique_ptr< TH1D > | GetColumnTypeInfoAsHist (ENTupleInspectorHist histKind, std::string_view histName="", std::string_view histTitle="") |
Get a histogram showing information for each column type present. | |
const std::vector< EColumnType > | GetColumnTypes () |
Get all column types present in the RNTuple being inspected. | |
std::uint64_t | GetCompressedSize () const |
Get the compressed, on-disk size of the RNTuple being inspected. | |
float | GetCompressionFactor () const |
Get the compression factor of the RNTuple being inspected. | |
int | GetCompressionSettings () const |
Get the compression settings of the RNTuple being inspected. | |
std::string | GetCompressionSettingsAsString () const |
Get a string describing compression settings of the RNTuple being inspected. | |
RNTupleDescriptor * | GetDescriptor () const |
Get the descriptor for the RNTuple being inspected. | |
size_t | GetFieldCountByType (const std::regex &typeNamePattern, bool searchInSubFields=true) const |
Get the number of fields of a given type or class present in the RNTuple. | |
size_t | GetFieldCountByType (std::string_view typeNamePattern, bool searchInSubFields=true) const |
Get the number of fields of a given type or class present in the RNTuple. | |
const std::vector< DescriptorId_t > | GetFieldsByName (const std::regex &fieldNamePattern, bool searchInSubFields=true) const |
Get the IDs of (sub-)fields whose name matches the given string. | |
const std::vector< DescriptorId_t > | GetFieldsByName (std::string_view fieldNamePattern, bool searchInSubFields=true) |
Get the IDs of (sub-)fields whose name matches the given string. | |
const RFieldTreeInspector & | GetFieldTreeInspector (DescriptorId_t fieldId) const |
Get storage information for a given (sub)field by ID. | |
const RFieldTreeInspector & | GetFieldTreeInspector (std::string_view fieldName) const |
Get a storage information inspector for a given (sub)field by name, including its subfields. | |
std::unique_ptr< TH1D > | GetPageSizeDistribution (DescriptorId_t physicalColumnId, std::string histName="", std::string histTitle="", size_t nBins=64) |
Get a histogram containing the size distribution of the compressed pages for an individual column. | |
std::unique_ptr< TH1D > | GetPageSizeDistribution (EColumnType colType, std::string histName="", std::string histTitle="", size_t nBins=64) |
Get a histogram containing the size distribution of the compressed pages for all columns of a given type. | |
std::unique_ptr< TH1D > | GetPageSizeDistribution (std::initializer_list< DescriptorId_t > colIds, std::string histName="", std::string histTitle="", size_t nBins=64) |
Get a histogram containing the size distribution of the compressed pages for a collection of columns. | |
std::unique_ptr< THStack > | GetPageSizeDistribution (std::initializer_list< EColumnType > colTypes={}, std::string histName="", std::string histTitle="", size_t nBins=64) |
Get a histogram containing the size distribution of the compressed pages for all columns of a given list of types. | |
std::uint64_t | GetUncompressedSize () const |
Get the uncompressed total size of the RNTuple being inspected. | |
RNTupleInspector & | operator= (const RNTupleInspector &other)=delete |
RNTupleInspector & | operator= (RNTupleInspector &&other)=delete |
void | PrintColumnTypeInfo (ENTupleInspectorPrintFormat format=ENTupleInspectorPrintFormat::kTable, std::ostream &output=std::cout) |
Print storage information per column type. | |
Static Public Member Functions | |
static std::unique_ptr< RNTupleInspector > | Create (const RNTuple &sourceNTuple) |
Create a new RNTupleInspector. | |
static std::unique_ptr< RNTupleInspector > | Create (std::string_view ntupleName, std::string_view storage) |
Create a new RNTupleInspector. | |
Private Member Functions | |
RNTupleInspector (std::unique_ptr< Internal::RPageSource > pageSource) | |
void | CollectColumnInfo () |
Gather column-level and RNTuple-level information. | |
RFieldTreeInspector | CollectFieldTreeInfo (DescriptorId_t fieldId) |
Recursively gather field-level information. | |
std::vector< DescriptorId_t > | GetColumnsByFieldId (DescriptorId_t fieldId) const |
Get the columns that make up the given field, including its subfields. | |
Private Attributes | |
std::unordered_map< int, RColumnInspector > | fColumnInfo |
std::uint64_t | fCompressedSize = 0 |
int | fCompressionSettings = -1 |
std::unique_ptr< RNTupleDescriptor > | fDescriptor |
std::unordered_map< int, RFieldTreeInspector > | fFieldTreeInfo |
std::unique_ptr< Internal::RPageSource > | fPageSource |
std::uint64_t | fUncompressedSize = 0 |
#include <ROOT/RNTupleInspector.hxx>
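ROOT::Experimental::RNTupleInspector::RNTupleInspector | ( | std::unique_ptr< Internal::RPageSource > | pageSource | ) |
private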
Definition at line 32 of file RNTupleInspector.cxx.
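ROOT::Experimental::RNTupleInspector::RNTupleInspector | ( | const RNTupleInspector & | other | ) |
delete
ROOT::Experimental::RNTupleInspector::RNTupleInspector | ( | RNTupleInspector && | other | ) |
delete
ROOT::Experimental::RNTupleInspector::~RNTupleInspector | ( | ) |
default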
void ROOT::Experimental::RNTupleInspector::CollectColumnInfo | ( | ) |
private
Gather column-level and RNTuple-level information.
Definition at line 47 of file RNTupleInspector.cxx.
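ROOT::Experimental::RNTupleInspector::RFieldTreeInspector ROOT::Experimental::RNTupleInspector::CollectFieldTreeInfo | ( | DescriptorId_t | fieldId | ) |
private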
Recursively gather field-level information.
[in] | fieldId | The ID of the field from which to start the recursive traversal. Typically this is the "zero ID", i.e. the logical parent of all top-level fields. |
This method is called when the RNTupleInspector is initially created.
Definition at line 102 of file RNTupleInspector.cxx.
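The recursive gathering described above can be pictured as a depth-first sum over the field tree. The following is a minimal sketch in plain C++, not the actual implementation: `FieldNode`, `FieldTreeInfo` and `CollectFieldTree` are hypothetical stand-ins, and the real method works on the RNTuple descriptor rather than on a hand-built map.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a field: the size of its own columns and its child field IDs.
struct FieldNode {
   std::uint64_t ownCompressedSize;
   std::vector<int> childIds;
};

// Hypothetical per-field summary that also covers all subfields.
struct FieldTreeInfo {
   std::uint64_t compressedSize;
};

// Depth-first traversal: a field's total is its own size plus the totals of its subfields.
// Every visited field is recorded in `collected`, keyed by its ID.
FieldTreeInfo CollectFieldTree(int fieldId, const std::unordered_map<int, FieldNode> &fields,
                               std::unordered_map<int, FieldTreeInfo> &collected)
{
   const auto it = fields.find(fieldId);
   assert(it != fields.end() && "unknown field ID");
   FieldTreeInfo info{it->second.ownCompressedSize};
   for (int childId : it->second.childIds)
      info.compressedSize += CollectFieldTree(childId, fields, collected).compressedSize;
   collected[fieldId] = info;
   return info;
}
```

Starting the traversal at the ID that acts as the parent of all top-level fields fills `collected` for every (sub)field in a single pass.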
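std::unique_ptr< ROOT::Experimental::RNTupleInspector > ROOT::Experimental::RNTupleInspector::Create | ( | const RNTuple & | sourceNTuple | ) |
static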
Create a new RNTupleInspector.
[in] | sourceNTuple | A reference to the RNTuple to be inspected. |
Definition at line 154 of file RNTupleInspector.cxx.
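std::unique_ptr< ROOT::Experimental::RNTupleInspector > ROOT::Experimental::RNTupleInspector::Create | ( | std::string_view | ntupleName, |
std::string_view | storage |
||
) |
static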
Create a new RNTupleInspector.
[in] | ntupleName | The name of the RNTuple to be inspected. |
[in] | storage | The path or URI to the RNTuple to be inspected. |
Definition at line 161 of file RNTupleInspector.cxx.
size_t ROOT::Experimental::RNTupleInspector::GetColumnCountByType | ( | EColumnType | colType | ) | const |
Get the number of columns of a given type present in the RNTuple.
[in] | colType | The column type to count, as defined by ROOT::Experimental::EColumnType. |
Definition at line 188 of file RNTupleInspector.cxx.
const ROOT::Experimental::RNTupleInspector::RColumnInspector & ROOT::Experimental::RNTupleInspector::GetColumnInspector | ( | DescriptorId_t | physicalColumnId | ) | const |
Get storage information for a given column.
[in] | physicalColumnId | The physical ID of the column for which to get the information. |
Definition at line 179 of file RNTupleInspector.cxx.
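std::vector< ROOT::Experimental::DescriptorId_t > ROOT::Experimental::RNTupleInspector::GetColumnsByFieldId | ( | DescriptorId_t | fieldId | ) | const |
private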
Get the columns that make up the given field, including its subfields.
[in] | fieldId | The ID of the field for which to collect the columns. |
Definition at line 128 of file RNTupleInspector.cxx.
const std::vector< ROOT::Experimental::DescriptorId_t > ROOT::Experimental::RNTupleInspector::GetColumnsByType | ( | EColumnType | colType | ) |
Get the IDs of all columns with the given type.
[in] | colType | The column type to collect, as defined by ROOT::Experimental::EColumnType. |
Definition at line 202 of file RNTupleInspector.cxx.
std::unique_ptr< TH1D > ROOT::Experimental::RNTupleInspector::GetColumnTypeInfoAsHist | ( | ENTupleInspectorHist | histKind, |
std::string_view | histName = "" , |
||
std::string_view | histTitle = "" |
||
) |
Get a histogram showing information for each column type present.
[in] | histKind | Which type of information should be returned. |
[in] | histName | The name of the histogram. An empty string means a default name will be used. |
[in] | histTitle | The title of the histogram. An empty string means a default title will be used. |
Returns a std::unique_ptr to a TH1D containing the specified kind of information.
Get a histogram showing the count, number of elements, size on disk, or size in memory for each column type present in the inspected RNTuple.
Definition at line 268 of file RNTupleInspector.cxx.
const std::vector< ROOT::Experimental::EColumnType > ROOT::Experimental::RNTupleInspector::GetColumnTypes | ( | ) |
Get all column types present in the RNTuple being inspected.
Definition at line 214 of file RNTupleInspector.cxx.
std::uint64_t ROOT::Experimental::RNTupleInspector::GetCompressedSize | ( | ) | const |
inline
Get the compressed, on-disk size of the RNTuple being inspected.
Definition at line 231 of file RNTupleInspector.hxx.
float ROOT::Experimental::RNTupleInspector::GetCompressionFactor | ( | ) | const |
inline
Get the compression factor of the RNTuple being inspected.
The compression factor shows how well the data present in the RNTuple is compressed by the compression settings that were used. The compression factor is calculated as \(size_{uncompressed} / size_{compressed}\).
Definition at line 246 of file RNTupleInspector.hxx.
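As a worked instance of the formula above, the sketch below computes the factor from two byte counts. `CompressionFactor` is a hypothetical free function written for illustration; it only mirrors the stated ratio and says nothing about how the inspector handles an empty RNTuple.

```cpp
#include <cassert>
#include <cstdint>

// Ratio of uncompressed to compressed size; a value above 1 means the data shrank on disk.
float CompressionFactor(std::uint64_t uncompressedSize, std::uint64_t compressedSize)
{
   assert(compressedSize > 0 && "the ratio is undefined for a compressed size of zero");
   return static_cast<float>(uncompressedSize) / static_cast<float>(compressedSize);
}
```

For example, 1000 uncompressed bytes stored in 250 bytes on disk give a factor of 4.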
int ROOT::Experimental::RNTupleInspector::GetCompressionSettings | ( | ) | const |
inline
Get the compression settings of the RNTuple being inspected.
Definition at line 215 of file RNTupleInspector.hxx.
std::string ROOT::Experimental::RNTupleInspector::GetCompressionSettingsAsString | ( | ) | const |
Get a string describing compression settings of the RNTuple being inspected.
Returns a string of the form "A (level L)", where A is the name of the compression algorithm and L the compression level.
Definition at line 167 of file RNTupleInspector.cxx.
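To illustrate the "A (level L)" format, here is a minimal sketch that decodes an integer compression setting. It assumes the usual ROOT convention that the setting is `algorithm * 100 + level`. The function names and the algorithm name strings are assumptions made for this example and may differ from what the inspector actually prints.

```cpp
#include <cassert>
#include <string>

// Assumed mapping from algorithm code to a display name (illustrative only).
std::string AlgorithmName(int algorithm)
{
   switch (algorithm) {
   case 1: return "zlib";
   case 2: return "lzma";
   case 4: return "lz4";
   case 5: return "zstd";
   default: return "unknown";
   }
}

// Split a setting such as 505 into algorithm 5 and level 5, then format it as "A (level L)".
std::string DescribeCompression(int settings)
{
   assert(settings >= 0 && "negative values mean the setting is not known");
   const int algorithm = settings / 100;
   const int level = settings % 100;
   return AlgorithmName(algorithm) + " (level " + std::to_string(level) + ")";
}
```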
RNTupleDescriptor * ROOT::Experimental::RNTupleInspector::GetDescriptor | ( | ) | const |
inline
Get the descriptor for the RNTuple being inspected.
Definition at line 205 of file RNTupleInspector.hxx.
size_t ROOT::Experimental::RNTupleInspector::GetFieldCountByType | ( | const std::regex & | typeNamePattern, |
bool | searchInSubFields = true |
||
) | const |
Get the number of fields of a given type or class present in the RNTuple.
[in] | typeNamePattern | The type or class name to count. May contain regular expression patterns for grouping multiple kinds of types or classes. |
[in] | searchInSubFields | If set to false, only top-level fields will be considered. |
Definition at line 456 of file RNTupleInspector.cxx.
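size_t ROOT::Experimental::RNTupleInspector::GetFieldCountByType | ( | std::string_view | typeNamePattern, |
bool | searchInSubFields = true |
||
) | const |
inline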
Get the number of fields of a given type or class present in the RNTuple.
Definition at line 447 of file RNTupleInspector.hxx.
const std::vector< ROOT::Experimental::DescriptorId_t > ROOT::Experimental::RNTupleInspector::GetFieldsByName | ( | const std::regex & | fieldNamePattern, |
bool | searchInSubFields = true |
||
) | const |
Get the IDs of (sub-)fields whose name matches the given regular expression.
[in] | fieldNamePattern | The field name to look up. Because field names are unique by design, providing a single field name will return a vector containing just the ID of that field. However, regular expression patterns are supported in order to get the IDs of all fields whose names follow a certain structure. |
[in] | searchInSubFields | If set to false, only top-level fields will be considered. |
Definition at line 475 of file RNTupleInspector.cxx.
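The lookup semantics described above (a plain name selects one field, a pattern selects several, and `searchInSubFields` restricts the search) can be sketched with `std::regex`. This is an illustration under assumptions: `NamedField` and `FieldsByName` are hypothetical, the field names are made up, and whole-string matching via `std::regex_match` is assumed rather than confirmed.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical flattened view of a field: its ID, its name, and whether it is top-level.
struct NamedField {
   int id;
   std::string name;
   bool isTopLevel;
};

// Collect the IDs of all fields whose full name matches the pattern.
std::vector<int> FieldsByName(const std::vector<NamedField> &fields, const std::regex &pattern,
                              bool searchInSubFields = true)
{
   std::vector<int> ids;
   for (const auto &field : fields) {
      if (!searchInSubFields && !field.isTopLevel)
         continue; // skip subfields when only top-level fields are requested
      if (std::regex_match(field.name, pattern))
         ids.push_back(field.id);
   }
   return ids;
}
```

With whole-string matching, the pattern `jet_pt` selects only the field named exactly `jet_pt`, while `jet_.*` selects every field whose name starts with `jet_`.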
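const std::vector< ROOT::Experimental::DescriptorId_t > ROOT::Experimental::RNTupleInspector::GetFieldsByName | ( | std::string_view | fieldNamePattern, |
bool | searchInSubFields = true |
||
) |
inline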
Get the IDs of (sub-)fields whose name matches the given string.
Definition at line 468 of file RNTupleInspector.hxx.
const ROOT::Experimental::RNTupleInspector::RFieldTreeInspector & ROOT::Experimental::RNTupleInspector::GetFieldTreeInspector | ( | DescriptorId_t | fieldId | ) | const |
Get storage information for a given (sub)field by ID.
[in] | fieldId | The ID of the (sub)field for which to get the information. |
Definition at line 435 of file RNTupleInspector.cxx.
const ROOT::Experimental::RNTupleInspector::RFieldTreeInspector & ROOT::Experimental::RNTupleInspector::GetFieldTreeInspector | ( | std::string_view | fieldName | ) | const |
Get a storage information inspector for a given (sub)field by name, including its subfields.
[in] | fieldName | The name of the (sub)field for which to get the information. |
Definition at line 445 of file RNTupleInspector.cxx.
std::unique_ptr< TH1D > ROOT::Experimental::RNTupleInspector::GetPageSizeDistribution | ( | DescriptorId_t | physicalColumnId, |
std::string | histName = "" , |
||
std::string | histTitle = "" , |
||
size_t | nBins = 64 |
||
) |
Get a histogram containing the size distribution of the compressed pages for an individual column.
[in] | physicalColumnId | The physical ID of the column for which to get the page size distribution. |
[in] | histName | The name of the histogram. An empty string means a default name will be used. |
[in] | histTitle | The title of the histogram. An empty string means a default title will be used. |
[in] | nBins | The desired number of histogram bins. |
Returns a std::unique_ptr to a TH1D containing the page size distribution.
The x-axis will range from the smallest page size to the largest (inclusive).
Definition at line 310 of file RNTupleInspector.cxx.
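The binning rule stated above (nBins bins spanning the smallest to the largest page size, with the largest included) can be sketched without any histogram class. `BinPageSizes` is a hypothetical helper that returns plain counts. Widening the range by one byte to make the upper edge inclusive is an assumption of this sketch, not a statement about the actual implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Count page sizes into nBins equal-width bins covering [smallest, largest].
std::vector<std::size_t> BinPageSizes(const std::vector<std::uint64_t> &pageSizes, std::size_t nBins = 64)
{
   assert(nBins > 0 && !pageSizes.empty());
   const auto minmax = std::minmax_element(pageSizes.begin(), pageSizes.end());
   const std::uint64_t lo = *minmax.first;
   const std::uint64_t hi = *minmax.second;
   // Use the half-open range [lo, hi + 1) so that the largest page falls inside the last bin.
   const double width = static_cast<double>(hi - lo + 1) / static_cast<double>(nBins);

   std::vector<std::size_t> counts(nBins, 0);
   for (std::uint64_t size : pageSizes) {
      const auto bin = static_cast<std::size_t>(static_cast<double>(size - lo) / width);
      counts[std::min(bin, nBins - 1)]++; // guard against rounding at the upper edge
   }
   return counts;
}
```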
std::unique_ptr< TH1D > ROOT::Experimental::RNTupleInspector::GetPageSizeDistribution | ( | EColumnType | colType, |
std::string | histName = "" , |
||
std::string | histTitle = "" , |
||
size_t | nBins = 64 |
||
) |
Get a histogram containing the size distribution of the compressed pages for all columns of a given type.
[in] | colType | The column type for which to get the size distribution, as defined by ROOT::Experimental::EColumnType. |
[in] | histName | The name of the histogram. An empty string means a default name will be used. |
[in] | histTitle | The title of the histogram. An empty string means a default title will be used. |
[in] | nBins | The desired number of histogram bins. |
Returns a std::unique_ptr to a TH1D containing the page size distribution.
The x-axis will range from the smallest page size to the largest (inclusive).
std::unique_ptr< TH1D > ROOT::Experimental::RNTupleInspector::GetPageSizeDistribution | ( | std::initializer_list< DescriptorId_t > | colIds, |
std::string | histName = "" , |
||
std::string | histTitle = "" , |
||
size_t | nBins = 64 |
||
) |
Get a histogram containing the size distribution of the compressed pages for a collection of columns.
[in] | colIds | The physical IDs of the columns for which to get the page size distribution. |
[in] | histName | The name of the histogram. An empty string means a default name will be used. |
[in] | histTitle | The title of the histogram. An empty string means a default title will be used. |
[in] | nBins | The desired number of histogram bins. |
Returns a std::unique_ptr to a TH1D containing the (cumulative) page size distribution.
The x-axis will range from the smallest page size to the largest (inclusive).
Definition at line 345 of file RNTupleInspector.cxx.
std::unique_ptr< THStack > ROOT::Experimental::RNTupleInspector::GetPageSizeDistribution | ( | std::initializer_list< EColumnType > | colTypes = {} , |
std::string | histName = "" , |
||
std::string | histTitle = "" , |
||
size_t | nBins = 64 |
||
) |
Get a histogram containing the size distribution of the compressed pages for all columns of a given list of types.
[in] | colTypes | The column types for which to get the size distribution, as defined by ROOT::Experimental::EColumnType. The default is an empty list, which indicates that the distribution for all physical columns will be returned. |
[in] | histName | The name of the histogram. An empty string means a default name will be used. The name of each histogram inside the THStack will be histName + colType. |
[in] | histTitle | The title of the histogram. An empty string means a default title will be used. |
[in] | nBins | The desired number of histogram bins. |
Returns a std::unique_ptr to a THStack with one histogram for each column type.
The x-axis will range from the smallest page size to the largest (inclusive).
Example: Drawing a non-stacked page size distribution with a legend
std::uint64_t ROOT::Experimental::RNTupleInspector::GetUncompressedSize | ( | ) | const |
inline
Get the uncompressed total size of the RNTuple being inspected.
Definition at line 237 of file RNTupleInspector.hxx.
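RNTupleInspector & ROOT::Experimental::RNTupleInspector::operator= | ( | const RNTupleInspector & | other | ) |
delete
RNTupleInspector & ROOT::Experimental::RNTupleInspector::operator= | ( | RNTupleInspector && | other | ) |
delete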
void ROOT::Experimental::RNTupleInspector::PrintColumnTypeInfo | ( | ENTupleInspectorPrintFormat | format = ENTupleInspectorPrintFormat::kTable , |
std::ostream & | output = std::cout |
||
) |
Print storage information per column type.
[in] | format | Whether to print the information as a (markdown-parseable) table or in CSV format. |
[in] | output | Where to write the output to. Default is stdout. |
For each column type, the output includes its count, the total number of elements, the compressed size, and the uncompressed size.
Example: printing the column type information of an RNTuple as a table
Output:
Example: printing the column type information of an RNTuple in CSV format
Output:
Definition at line 225 of file RNTupleInspector.cxx.
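std::unordered_map< int, RColumnInspector > ROOT::Experimental::RNTupleInspector::fColumnInfo
private
Definition at line 141 of file RNTupleInspector.hxx.
std::uint64_t ROOT::Experimental::RNTupleInspector::fCompressedSize = 0
private
Definition at line 138 of file RNTupleInspector.hxx.
int ROOT::Experimental::RNTupleInspector::fCompressionSettings = -1
private
Definition at line 137 of file RNTupleInspector.hxx.
std::unique_ptr< RNTupleDescriptor > ROOT::Experimental::RNTupleInspector::fDescriptor
private
Definition at line 136 of file RNTupleInspector.hxx.
std::unordered_map< int, RFieldTreeInspector > ROOT::Experimental::RNTupleInspector::fFieldTreeInfo
private
Definition at line 142 of file RNTupleInspector.hxx.
std::unique_ptr< Internal::RPageSource > ROOT::Experimental::RNTupleInspector::fPageSource
private
Definition at line 135 of file RNTupleInspector.hxx.
std::uint64_t ROOT::Experimental::RNTupleInspector::fUncompressedSize = 0
private
Definition at line 139 of file RNTupleInspector.hxx.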