ROOT
Reference Guide
 
CudaKernels.cuh File Reference

Namespaces

namespace  RooFit
 The namespace RooFit contains mostly switches that change the behaviour of functions of PDFs (or other types of arguments).
 
namespace  RooFit::Detail
 
namespace  RooFit::Detail::CudaKernels
 
namespace  RooFit::Detail::CudaKernels::Reducers
 Dedicated namespace for reduction kernels.
 

Macros

#define RooFit_Detail_CudaKernels_cuh
 

Typedefs

using RooFit::Detail::CudaKernels::Size_t = int
 The type for array size parameters.
 

Functions

template<int BlockSize_n>
__global__ void RooFit::Detail::CudaKernels::Reducers::Covariance2D (Size_t arraySize, float const *__restrict__ x, const float *__restrict__ y, double xMean, double yMean, double *__restrict__ output)
 Computes the covariance, variance of 'x', and variance of 'y' for a 2D data set.
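
As a reading aid, the quantities named in this description can be sketched on the CPU. This is a minimal host-side sketch, not the kernel itself: `MomentSums2D` and `referenceMomentSums2D` are hypothetical names, and because this page states neither the layout of `output` nor the normalization (division by N or N - 1), the sketch returns the raw sums of products of deviations and leaves normalization to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type (assumption): the real kernel writes into a
// double* output whose layout is not described on this page.
struct MomentSums2D {
   double sumXY; // sum of (x[i] - xMean) * (y[i] - yMean)
   double sumXX; // sum of (x[i] - xMean)^2
   double sumYY; // sum of (y[i] - yMean)^2
};

// Sequential CPU reference for the three accumulated quantities.
// The means are passed in, mirroring the xMean / yMean kernel parameters.
MomentSums2D referenceMomentSums2D(std::vector<float> const &x, std::vector<float> const &y,
                                   double xMean, double yMean)
{
   assert(x.size() == y.size());
   MomentSums2D sums{0.0, 0.0, 0.0};
   for (std::size_t i = 0; i < x.size(); ++i) {
      const double dx = x[i] - xMean;
      const double dy = y[i] - yMean;
      sums.sumXY += dx * dy;
      sums.sumXX += dx * dx;
      sums.sumYY += dy * dy;
   }
   return sums;
}
```

Dividing each sum by the chosen normalization yields the covariance and the two variances.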
 
template<class Num_t , class Den_t >
__global__ void RooFit::Detail::CudaKernels::DivideBy (Size_t arraySize, Den_t *__restrict__ num, Num_t const *__restrict__ den)
 Divides elements of the num array in-place by corresponding elements of the den array.
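
The in-place semantics can be illustrated with a short host-side sketch. `referenceDivideBy` is a hypothetical name, and the sketch names its template parameters after the arrays they describe, which differs from the signature listed above (there `num` is typed `Den_t *` and `den` is typed `Num_t const *`). How the kernel treats zero denominators is not stated on this page, so the sketch performs a plain division.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential CPU reference: num[i] is overwritten with num[i] / den[i].
// One kernel thread per element would perform the same update on the GPU.
template <class Num_t, class Den_t>
void referenceDivideBy(std::vector<Num_t> &num, std::vector<Den_t> const &den)
{
   assert(num.size() == den.size());
   for (std::size_t i = 0; i < num.size(); ++i) {
      num[i] = num[i] / den[i];
   }
}
```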
 
template<class Idx_t , class Elem_t >
__global__ void RooFit::Detail::CudaKernels::Lookup2D (Size_t arraySize, Idx_t n2, Idx_t const *__restrict__ x1, Idx_t const *__restrict__ x2, Elem_t const *__restrict__ lut, Elem_t *__restrict__ output)
 Fills an output array 'output' with values from a row-wise flattened two-dimensional lookup table 'lut' based on input arrays 'x1' and 'x2'.
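
A host-side sketch of this lookup follows. It assumes the usual row-major convention for a "row-wise flattened" table, namely that entry (x1, x2) lives at index x1 * n2 + x2 where n2 is the number of columns. The indexing formula and the name `referenceLookup2D` are assumptions, since this page does not spell them out, and any bounds handling done by the kernel is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential CPU reference, assuming row-major flattening of the table:
//    output[i] = lut[x1[i] * n2 + x2[i]]
template <class Idx_t, class Elem_t>
std::vector<Elem_t> referenceLookup2D(Idx_t n2, std::vector<Idx_t> const &x1, std::vector<Idx_t> const &x2,
                                      std::vector<Elem_t> const &lut)
{
   assert(n2 > 0);
   assert(x1.size() == x2.size());
   std::vector<Elem_t> output(x1.size());
   for (std::size_t i = 0; i < x1.size(); ++i) {
      const std::size_t flat = static_cast<std::size_t>(x1[i]) * static_cast<std::size_t>(n2) +
                               static_cast<std::size_t>(x2[i]);
      assert(flat < lut.size()); // the sketch requires in-range indices
      output[i] = lut[flat];
   }
   return output;
}
```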
 
template<int BlockSize_n, int Bins1_n, int Bins2_n, class Idx_t , class Elem_t , class Sum_t , class Counts_t >
__global__ void RooFit::Detail::CudaKernels::Reducers::SumBinwise2D (int arraySize, Idx_t const *__restrict__ x1, Idx_t const *__restrict__ x2, const Elem_t *__restrict__ arr, Sum_t *__restrict__ outputSum, Counts_t *__restrict__ outputCounts)
 Computes bin-wise sum and count of elements from the 'arr' array into separate output arrays based on indices provided in 'x1' and 'x2' arrays, using a 2D grid-stride loop approach.
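
The bin-wise accumulation can be sketched sequentially on the CPU. The sketch assumes that the bin for element i is the row-major index x1[i] * Bins2_n + x2[i] and that both output arrays hold Bins1_n * Bins2_n entries. Neither detail is stated on this page, and `referenceSumBinwise2D` is a hypothetical name. The grid-stride loop and any per-block accumulation used by the kernel are not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential CPU reference: adds arr[i] to the sum of its bin and
// increments that bin's count. Bin index is assumed to be row-major.
template <int Bins1_n, int Bins2_n, class Idx_t, class Elem_t, class Sum_t, class Counts_t>
void referenceSumBinwise2D(std::vector<Idx_t> const &x1, std::vector<Idx_t> const &x2,
                           std::vector<Elem_t> const &arr, std::vector<Sum_t> &outputSum,
                           std::vector<Counts_t> &outputCounts)
{
   constexpr std::size_t nBins = static_cast<std::size_t>(Bins1_n) * Bins2_n;
   assert(x1.size() == arr.size() && x2.size() == arr.size());
   assert(outputSum.size() == nBins && outputCounts.size() == nBins);
   for (std::size_t i = 0; i < arr.size(); ++i) {
      const std::size_t bin = static_cast<std::size_t>(x1[i]) * Bins2_n + static_cast<std::size_t>(x2[i]);
      assert(bin < nBins); // the sketch requires in-range bin indices
      outputSum[bin] += arr[i];
      outputCounts[bin] += 1;
   }
}
```

The outputs are accumulated into rather than overwritten, so the caller zero-initializes them before the first call.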
 
template<int BlockSize_n, int ElemDim_n, class Elem_t , class Sum_t >
__global__ void RooFit::Detail::CudaKernels::Reducers::SumVectors (Size_t nElems, const Elem_t *__restrict__ arr, Sum_t *__restrict__ output)
 Performs a multi-block sum reduction on the input array arr.
 

Macro Definition Documentation

RooFit_Detail_CudaKernels_cuh

#define RooFit_Detail_CudaKernels_cuh

Definition at line 14 of file CudaKernels.cuh.