Classes | |
| class | Array |
| A templated class for managing an array of data using a specified memory type. | |
| class | CudaEvent |
| class | CudaStream |
Typedefs | |
| template<class T > | |
| using | DeviceArray = Array<T, DeviceMemory> |
An array of specific type that is allocated on the device with cudaMalloc and freed with cudaFree. | |
| template<class T > | |
| using | PinnedHostArray = Array<T, PinnedHostMemory> |
A pinned array of specific type that is allocated on the host with cudaMallocHost and freed with cudaFreeHost. | |
Functions | |
| template<class T > | |
| void | copyDeviceToDevice (const T *src, T *dest, std::size_t n, CudaStream *=nullptr) |
| Copies data between two memory locations on the CUDA device. | |
| template<class T > | |
| void | copyDeviceToHost (const T *src, T *dest, std::size_t n, CudaStream *=nullptr) |
| Copies data from the CUDA device to the host. | |
| template<class T > | |
| void | copyHostToDevice (const T *src, T *dest, std::size_t n, CudaStream *=nullptr) |
| Copies data from the host to the CUDA device. | |
| float | cudaEventElapsedTime (CudaEvent &begin, CudaEvent &end) |
| Calculates the elapsed time between two CUDA events. | |
| void | cudaEventRecord (CudaEvent &event, CudaStream &stream) |
| Records a CUDA event. | |
| using RooBatchCompute::CudaInterface::DeviceArray = Array<T, DeviceMemory> |
An array of specific type that is allocated on the device with cudaMalloc and freed with cudaFree.
Definition at line 209 of file CudaInterface.h.
| using RooBatchCompute::CudaInterface::PinnedHostArray = Array<T, PinnedHostMemory> |
A pinned array of specific type that is allocated on the host with cudaMallocHost and freed with cudaFreeHost.
The memory is "pinned", i.e. page-locked and accessible to the device for fast copying.
See also: cudaMallocHost on developer.download.nvidia.com.
Definition at line 218 of file CudaInterface.h.
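The two aliases above differ only in the memory type passed to Array. The following is a minimal sketch of that pattern, not the actual implementation in CudaInterface.h: `HostMemorySketch`, `ArraySketch`, `HostArraySketch`, and `sumOfFilledArray` are hypothetical names, and plain `std::malloc`/`std::free` stand in for cudaMalloc/cudaFree or cudaMallocHost/cudaFreeHost so that the example runs without a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical memory type. A device or pinned-host variant would call
// cudaMalloc/cudaFree or cudaMallocHost/cudaFreeHost here instead.
struct HostMemorySketch {
   static void *allocate(std::size_t nBytes) { return std::malloc(nBytes); }
   static void deallocate(void *ptr) { std::free(ptr); }
};

// Sketch of an array that obtains and releases its storage through Memory.
// The storage is left uninitialized, so T should be trivially constructible.
template <class T, class Memory>
class ArraySketch {
public:
   explicit ArraySketch(std::size_t n)
      : _size{n}, _data{static_cast<T *>(Memory::allocate(n * sizeof(T)))}
   {
   }
   std::size_t size() const { return _size; }
   T *data() { return _data.get(); }

private:
   // Routes the release of the buffer back to the same memory type.
   struct Deleter {
      void operator()(void *ptr) const { Memory::deallocate(ptr); }
   };
   std::size_t _size;
   std::unique_ptr<T, Deleter> _data;
};

// Alias in the same style as DeviceArray and PinnedHostArray.
template <class T>
using HostArraySketch = ArraySketch<T, HostMemorySketch>;

// Fills an array with 0, 1, ..., n-1 and returns the sum of its elements.
inline double sumOfFilledArray(std::size_t n)
{
   HostArraySketch<double> arr(n);
   for (std::size_t i = 0; i < arr.size(); ++i)
      arr.data()[i] = static_cast<double>(i);
   double sum = 0.0;
   for (std::size_t i = 0; i < arr.size(); ++i)
      sum += arr.data()[i];
   return sum; // the buffer is freed when arr goes out of scope
}
```

Keeping allocation and deallocation inside the memory type means the array code itself does not change when the storage moves between host, pinned host, and device memory.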
| void RooBatchCompute::CudaInterface::copyDeviceToDevice | ( | const T * | src, |
| T * | dest, | ||
| std::size_t | n, | ||
| CudaStream * | = nullptr ) |
Copies data between two memory locations on the CUDA device.
| [in] | src | Pointer to the source memory on the device. |
| [in] | dest | Pointer to the destination memory on the device. |
| [in] | n | Number of elements of type T to copy. |
| [in] | stream | CudaStream for asynchronous memory transfer (optional). |
Definition at line 120 of file CudaInterface.h.
| void RooBatchCompute::CudaInterface::copyDeviceToHost | ( | const T * | src, |
| T * | dest, | ||
| std::size_t | n, | ||
| CudaStream * | = nullptr ) |
Copies data from the CUDA device to the host.
| [in] | src | Pointer to the source memory on the device. |
| [in] | dest | Pointer to the destination memory on the host. |
| [in] | n | Number of elements of type T to copy. |
| [in] | stream | CudaStream for asynchronous memory transfer (optional). |
Definition at line 106 of file CudaInterface.h.
| void RooBatchCompute::CudaInterface::copyHostToDevice | ( | const T * | src, |
| T * | dest, | ||
| std::size_t | n, | ||
| CudaStream * | = nullptr ) |
Copies data from the host to the CUDA device.
| [in] | src | Pointer to the source memory on the host. |
| [in] | dest | Pointer to the destination memory on the device. |
| [in] | n | Number of elements of type T to copy. |
| [in] | stream | CudaStream for asynchronous memory transfer (optional). |
Definition at line 92 of file CudaInterface.h.
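The three copy functions share one shape: typed pointers plus a count `n`. The sketch below illustrates how such a typed front end can forward to a byte-based backend. It assumes that `n` counts elements of type T, which the typed pointers in the signatures suggest but which should be confirmed against CudaInterface.h. `copyImplSketch`, `copySketch`, and `roundTripSketch` are hypothetical names, and `std::memcpy` stands in for the CUDA copy call so the example runs on the host alone.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical byte-based backend. A CUDA build would issue a cudaMemcpy or
// cudaMemcpyAsync call here; std::memcpy is a host-only stand-in.
inline void copyImplSketch(const void *src, void *dest, std::size_t nBytes)
{
   if (nBytes == 0)
      return; // nothing to copy, and avoids passing null pointers to memcpy
   std::memcpy(dest, src, nBytes);
}

// Typed front end: n is an element count and is converted to bytes here.
template <class T>
void copySketch(const T *src, T *dest, std::size_t n)
{
   copyImplSketch(src, dest, sizeof(T) * n);
}

// Copies input into a staging buffer and back again, mimicking a
// host -> device -> host round trip with host memory only.
inline std::vector<double> roundTripSketch(const std::vector<double> &input)
{
   std::vector<double> staging(input.size());
   std::vector<double> output(input.size());
   copySketch(input.data(), staging.data(), input.size());    // stands in for host to device
   copySketch(staging.data(), output.data(), staging.size()); // stands in for device to host
   return output;
}
```

Converting the count to bytes in one place keeps callers from mixing up element counts and byte counts at each call site.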
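| float RooBatchCompute::CudaInterface::cudaEventElapsedTime | ( | CudaEvent & | begin, |
| CudaEvent & | end ) |
Calculates the elapsed time between two CUDA events.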
| [in] | begin | CudaEvent representing the start event. |
| [in] | end | CudaEvent representing the end event. |
Definition at line 146 of file CudaInterface.cu.
| void RooBatchCompute::CudaInterface::cudaEventRecord | ( | CudaEvent & | event, |
| CudaStream & | stream ) |
Records a CUDA event.
| [in] | event | CudaEvent object representing the event to be recorded. |
| [in] | stream | CudaStream in which to record the event. |
Definition at line 96 of file CudaInterface.cu.
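cudaEventRecord and cudaEventElapsedTime are meant to be used together: record one event before the work, record another after it, then ask for the time between the two. The sketch below is a host-only analogue of that pattern built on `std::chrono`, not the CUDA-backed implementation. `HostEventSketch`, `recordSketch`, `elapsedMsSketch`, and `timeWorkSketch` are hypothetical names, and the millisecond unit is an assumption taken from the CUDA runtime function of the same name.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical host-side stand-in for CudaEvent: it only stores a timestamp.
struct HostEventSketch {
   std::chrono::steady_clock::time_point stamp{};
};

// Stand-in for cudaEventRecord: marks the current point in time.
inline void recordSketch(HostEventSketch &event)
{
   event.stamp = std::chrono::steady_clock::now();
}

// Stand-in for cudaEventElapsedTime: time from begin to end in milliseconds.
inline float elapsedMsSketch(const HostEventSketch &begin, const HostEventSketch &end)
{
   return std::chrono::duration<float, std::milli>(end.stamp - begin.stamp).count();
}

// Times a small loop using the record / record / elapsed pattern.
inline float timeWorkSketch(std::size_t n)
{
   HostEventSketch begin;
   HostEventSketch end;
   recordSketch(begin);
   volatile double sink = 0.0; // volatile keeps the loop from being optimized away
   for (std::size_t i = 0; i < n; ++i)
      sink = sink + 0.5 * static_cast<double>(i);
   recordSketch(end);
   return elapsedMsSketch(begin, end);
}
```

On a real device the events are recorded into a stream and complete asynchronously, so the end event has to be reached by the device before the elapsed time is meaningful; the host analogue has no such step because each timestamp is taken synchronously.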