Class DriverAPINativeMethods.MemoryManagement
Combines all API calls for memory management
Namespace: ManagedCuda
Assembly: ManagedCuda.dll
Syntax
public static class MemoryManagement
Methods
cuMemAdvise(CUdeviceptr, SizeT, CUmemAdvise, CUdevice)
Advise about the usage of a given memory range
Advise the Unified Memory subsystem about the usage pattern for the memory range starting at devPtr with a size of count bytes.
The advice parameter can take the following values:
- CU_MEM_ADVISE_SET_READ_MOSTLY: This implies that the data is mostly going to be read from and only occasionally written to. This allows the driver to create read-only copies of the data in a processor's memory when that processor accesses it. Similarly, if cuMemPrefetchAsync is called on this region, it will create a read-only copy of the data on the destination processor. When a processor writes to this data, all copies of the corresponding page are invalidated except for the one where the write occurred. The device argument is ignored for this advice.
- CU_MEM_ADVISE_UNSET_READ_MOSTLY: Undoes the effect of CU_MEM_ADVISE_SET_READ_MOSTLY. Any read-duplicated copies of the data will be freed no later than the next write access to that data.
- CU_MEM_ADVISE_SET_PREFERRED_LOCATION: This advice sets the preferred location for the data to be the memory belonging to device. Passing in CU_DEVICE_CPU for device sets the preferred location as CPU memory. Setting the preferred location does not cause data to migrate to that location immediately. Instead, it guides the migration policy when a fault occurs on that memory region. If the data is already in its preferred location and the faulting processor can establish a mapping without requiring the data to be migrated, then the migration will be avoided. On the other hand, if the data is not in its preferred location or if a direct mapping cannot be established, then it will be migrated to the processor accessing it. It is important to note that setting the preferred location does not prevent data prefetching done using cuMemPrefetchAsync.
Having a preferred location can override the thrash detection and resolution logic in the Unified Memory driver. Normally, if a page is detected to be constantly thrashing between, say, CPU and GPU memory, the page will eventually be pinned to CPU memory by the Unified Memory driver. But if the preferred location is set as GPU memory, then the page will continue to thrash indefinitely. When the Unified Memory driver has to evict pages from a certain location on account of that memory being oversubscribed, the preferred location will be used to decide the destination to which a page should be evicted.
If CU_MEM_ADVISE_SET_READ_MOSTLY is also set on this memory region or any subset of it, the preferred location will be ignored for that subset.
- CU_MEM_ADVISE_UNSET_PREFERRED_LOCATION: Undoes the effect of CU_MEM_ADVISE_SET_PREFERRED_LOCATION and changes the preferred location to none.
- CU_MEM_ADVISE_SET_ACCESSED_BY: This advice implies that the data will be accessed by device. This does not cause data migration and has no impact on the location of the data per se. Instead, it causes the data to always be mapped in the specified processor's page tables, as long as the location of the data permits a mapping to be established. If the data gets migrated for any reason, the mappings are updated accordingly.
This advice is useful in scenarios where data locality is not important, but avoiding faults is. Consider for example a system containing multiple GPUs with peer-to-peer access enabled, where the data located on one GPU is occasionally accessed by other GPUs. In such scenarios, migrating data over to the other GPUs is not as important because the accesses are infrequent and the overhead of migration may be too high. But preventing faults can still help improve performance, and so having a mapping set up in advance is useful. Note that on CPU access of this data, the data may be migrated to CPU memory because the CPU typically cannot access GPU memory directly. Any GPU that had the CU_MEM_ADVISE_SET_ACCESSED_BY flag set for this data will now have its mapping updated to point to the page in CPU memory.
- CU_MEM_ADVISE_UNSET_ACCESSED_BY: Undoes the effect of CU_MEM_ADVISE_SET_ACCESSED_BY. The current set of mappings may be removed at any time causing accesses to result in page faults.
Passing in CU_DEVICE_CPU for device will set the advice for the CPU.
Note that this function is asynchronous with respect to the host and all work on other devices.
Declaration
public static CUResult cuMemAdvise(CUdeviceptr devPtr, SizeT count, CUmemAdvise advice, CUdevice device)
Parameters
Type | Name | Description |
---|---|---|
CUdeviceptr | devPtr | Pointer to memory to set the advice for |
SizeT | count | Size in bytes of the memory range |
CUmemAdvise | advice | Advice to be applied for the specified memory range |
CUdevice | device | Device to apply the advice for |
Returns
Type | Description |
---|---|
CUResult |
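Example
A minimal usage sketch (illustrative only, not generated from the assembly): it allocates a managed buffer and marks it read-mostly. The CudaContext helper, the ManagedCuda.BasicTypes namespace, and the enum member names CUmemAttach_flags.Global and CUmemAdvise.SetReadMostly are assumptions that may differ in the actual library.

```csharp
using System;
using ManagedCuda;
using ManagedCuda.BasicTypes;

// Create and bind a CUDA context on device 0 (assumed ManagedCuda helper).
using CudaContext ctx = new CudaContext(0);

CUdeviceptr devPtr = new CUdeviceptr();
SizeT count = 1 << 20; // 1 MiB; implicit int -> SizeT conversion assumed

// cuMemAdvise operates on managed memory, so allocate with cuMemAllocManaged first.
CUResult res = DriverAPINativeMethods.MemoryManagement.cuMemAllocManaged(
    ref devPtr, count, CUmemAttach_flags.Global);          // member name assumed
if (res != CUResult.Success) throw new Exception("cuMemAllocManaged failed: " + res);

// A default-initialized CUdevice is taken here to mean device ordinal 0 (assumption).
CUdevice device = new CUdevice();

// Hint that the range is mostly read; the device argument is ignored for this advice.
res = DriverAPINativeMethods.MemoryManagement.cuMemAdvise(
    devPtr, count, CUmemAdvise.SetReadMostly, device);     // member name assumed
if (res != CUResult.Success) throw new Exception("cuMemAdvise failed: " + res);

DriverAPINativeMethods.MemoryManagement.cuMemFree_v2(devPtr);
```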
cuMemAlloc_v2(ref CUdeviceptr, SizeT)
Allocates bytesize bytes of linear memory on the device and returns in dptr a pointer to the allocated memory. The allocated memory is suitably aligned for any kind of variable. The memory is not cleared. If bytesize is 0, cuMemAlloc_v2(ref CUdeviceptr, SizeT) returns ErrorInvalidValue.
Declaration
public static CUResult cuMemAlloc_v2(ref CUdeviceptr dptr, SizeT bytesize)
Parameters
Type | Name | Description |
---|---|---|
CUdeviceptr | dptr | Returned device pointer |
SizeT | bytesize | Requested allocation size in bytes |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue, ErrorOutOfMemory. |
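Example
A minimal sketch (illustrative only) of allocating and freeing linear device memory through this class. It assumes the CudaContext helper binds a current context and that the value types live in ManagedCuda.BasicTypes.

```csharp
using System;
using ManagedCuda;
using ManagedCuda.BasicTypes;

// A current context is required before any memory call (assumed helper).
using CudaContext ctx = new CudaContext(0);

CUdeviceptr dptr = new CUdeviceptr();
SizeT bytesize = 1024 * sizeof(float); // space for 1024 floats

CUResult res = DriverAPINativeMethods.MemoryManagement.cuMemAlloc_v2(ref dptr, bytesize);
if (res != CUResult.Success) throw new Exception("cuMemAlloc_v2 failed: " + res);

// ... copy data to dptr and launch kernels here ...

res = DriverAPINativeMethods.MemoryManagement.cuMemFree_v2(dptr);
if (res != CUResult.Success) throw new Exception("cuMemFree_v2 failed: " + res);
```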
cuMemAllocHost_v2(ref IntPtr, SizeT)
Allocates bytesize bytes of host memory that is page-locked and accessible to the device. The driver tracks the virtual memory ranges allocated with this function and automatically accelerates calls to functions such as cuMemcpyHtoD_v2(CUdeviceptr, IntPtr, SizeT). Since the memory can be accessed directly by the device, it can be read or written with much higher bandwidth than pageable memory obtained with functions such as malloc(). Allocating excessive amounts of memory with cuMemAllocHost_v2(ref IntPtr, SizeT) may degrade system performance, since it reduces the amount of memory available to the system for paging. As a result, this function is best used sparingly to allocate staging areas for data exchange between host and device.
Declaration
public static CUResult cuMemAllocHost_v2(ref IntPtr pp, SizeT bytesize)
Parameters
Type | Name | Description |
---|---|---|
System.IntPtr | pp | Returned host pointer to page-locked memory |
SizeT | bytesize | Requested allocation size in bytes |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue, ErrorOutOfMemory. |
cuMemAllocManaged(ref CUdeviceptr, SizeT, CUmemAttach_flags)
Allocates memory that will be automatically managed by the Unified Memory system
Allocates bytesize bytes of managed memory on the device and returns in dptr a pointer to the allocated memory. If the device doesn't support allocating managed memory, ErrorNotSupported is returned. Support for managed memory can be queried using the device attribute ManagedMemory. The allocated memory is suitably aligned for any kind of variable. The memory is not cleared. If bytesize is 0, cuMemAllocManaged returns ErrorInvalidValue. The pointer is valid on the CPU and on all GPUs in the system that support managed memory. All accesses to this pointer must obey the Unified Memory programming model.
flags specifies the default stream association for this allocation. flags must be one of CU_MEM_ATTACH_GLOBAL or CU_MEM_ATTACH_HOST. If CU_MEM_ATTACH_GLOBAL is specified, then this memory is accessible from any stream on any device. If CU_MEM_ATTACH_HOST is specified, then the allocation is created with initial visibility restricted to host access only; an explicit call to cuStreamAttachMemAsync will be required to enable access on the device.
If the association is later changed via cuStreamAttachMemAsync to a single stream, the default association as specified during cuMemAllocManaged is restored when that stream is destroyed. For __managed__ variables, the default association is always CU_MEM_ATTACH_GLOBAL. Note that destroying a stream is an asynchronous operation, and as a result, the change to default association won't happen until all work in the stream has completed.
Memory allocated with cuMemAllocManaged should be released with cuMemFree.
On a multi-GPU system with peer-to-peer support, where multiple GPUs support managed memory, the physical storage is created on the GPU which is active at the time cuMemAllocManaged is called. All other GPUs will reference the data at reduced bandwidth via peer mappings over the PCIe bus. The Unified Memory management system does not migrate memory between GPUs.
On a multi-GPU system where multiple GPUs support managed memory, but not all pairs of such GPUs have peer-to-peer support between them, the physical storage is created in 'zero-copy' or system memory. All GPUs will reference the data at reduced bandwidth over the PCIe bus. In these circumstances, use of the environment variable, CUDA_VISIBLE_DEVICES, is recommended to restrict CUDA to only use those GPUs that have peer-to-peer support. This environment variable is described in the CUDA programming guide under the "CUDA environment variables" section.
Declaration
public static CUResult cuMemAllocManaged(ref CUdeviceptr dptr, SizeT bytesize, CUmemAttach_flags flags)
Parameters
Type | Name | Description |
---|---|---|
CUdeviceptr | dptr | Returned device pointer |
SizeT | bytesize | Requested allocation size in bytes |
CUmemAttach_flags | flags | Default stream association for the allocation; must be one of CU_MEM_ATTACH_GLOBAL or CU_MEM_ATTACH_HOST |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorNotSupported, ErrorInvalidValue, ErrorOutOfMemory. |
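Example
A minimal sketch (illustrative only) of a managed allocation with global visibility, released with cuMemFree_v2. CudaContext, the ManagedCuda.BasicTypes namespace and the CUmemAttach_flags.Global member name are assumptions.

```csharp
using System;
using ManagedCuda;
using ManagedCuda.BasicTypes;

using CudaContext ctx = new CudaContext(0); // bind a context (assumed helper)

CUdeviceptr dptr = new CUdeviceptr();
SizeT bytesize = 4096;

// CU_MEM_ATTACH_GLOBAL: the pointer is usable from any stream on any device.
CUResult res = DriverAPINativeMethods.MemoryManagement.cuMemAllocManaged(
    ref dptr, bytesize, CUmemAttach_flags.Global);         // member name assumed
if (res != CUResult.Success) throw new Exception("cuMemAllocManaged failed: " + res);

// The same pointer is valid on the CPU and on all managed-memory-capable GPUs.

res = DriverAPINativeMethods.MemoryManagement.cuMemFree_v2(dptr);
if (res != CUResult.Success) throw new Exception("cuMemFree_v2 failed: " + res);
```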
cuMemAllocPitch_v2(ref CUdeviceptr, ref SizeT, SizeT, SizeT, UInt32)
Allocates at least WidthInBytes * Height bytes of linear memory on the device and returns in dptr a pointer to the allocated memory. The function may pad the allocation to ensure that corresponding pointers in any given row will continue to meet the alignment requirements for coalescing as the address is updated from row to row. ElementSizeBytes specifies the size of the largest reads and writes that will be performed on the memory range. ElementSizeBytes may be 4, 8 or 16 (since coalesced memory transactions are not possible on other data sizes). If ElementSizeBytes is smaller than the actual read/write size of a kernel, the kernel will run correctly, but possibly at reduced speed. The pitch returned in pPitch by cuMemAllocPitch_v2(ref CUdeviceptr, ref SizeT, SizeT, SizeT, UInt32) is the width in bytes of the allocation. The intended usage of pitch is as a separate parameter of the allocation, used to compute addresses within the 2D array. Given the row and column of an array element of type T, the address is computed as:
T * pElement = (T*)((char*)BaseAddress + Row * Pitch) + Column;
The pitch returned by cuMemAllocPitch_v2(ref CUdeviceptr, ref SizeT, SizeT, SizeT, UInt32) is guaranteed to work with cuMemcpy2D_v2(ref CUDAMemCpy2D) under all circumstances. For allocations of 2D arrays, it is recommended that programmers consider performing pitch allocations using cuMemAllocPitch_v2(ref CUdeviceptr, ref SizeT, SizeT, SizeT, UInt32). Due to alignment restrictions in the hardware, this is especially true if the application will be performing 2D memory copies between different regions of device memory (whether linear memory or CUDA arrays).
The byte alignment of the pitch returned by cuMemAllocPitch_v2(ref CUdeviceptr, ref SizeT, SizeT, SizeT, UInt32) is guaranteed to match or exceed the alignment requirement for texture binding with cuTexRefSetAddress2D_v2(CUtexref, ref CUDAArrayDescriptor, CUdeviceptr, SizeT).
Declaration
public static CUResult cuMemAllocPitch_v2(ref CUdeviceptr dptr, ref SizeT pPitch, SizeT WidthInBytes, SizeT Height, uint ElementSizeBytes)
Parameters
Type | Name | Description |
---|---|---|
CUdeviceptr | dptr | Returned device pointer |
SizeT | pPitch | Returned pitch of allocation in bytes |
SizeT | WidthInBytes | Requested allocation width in bytes |
SizeT | Height | Requested allocation height in rows |
System.UInt32 | ElementSizeBytes | Size of largest reads/writes for range |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue, ErrorOutOfMemory. |
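Example
A minimal sketch (illustrative only) of a pitched 2D allocation of 640 x 480 floats. The returned pitch, not WidthInBytes, is the distance in bytes between consecutive rows, so element (Row, Column) lives at BaseAddress + Row * Pitch + Column * sizeof(float). CudaContext and the ManagedCuda.BasicTypes namespace are assumptions.

```csharp
using System;
using ManagedCuda;
using ManagedCuda.BasicTypes;

using CudaContext ctx = new CudaContext(0); // bind a context (assumed helper)

CUdeviceptr dptr = new CUdeviceptr();
SizeT pitch = 0;
SizeT widthInBytes = 640 * sizeof(float);   // 640 floats per row
SizeT height = 480;                          // 480 rows
uint elementSizeBytes = 4;                   // largest read/write is a 4-byte float

CUResult res = DriverAPINativeMethods.MemoryManagement.cuMemAllocPitch_v2(
    ref dptr, ref pitch, widthInBytes, height, elementSizeBytes);
if (res != CUResult.Success) throw new Exception("cuMemAllocPitch_v2 failed: " + res);

// Kernels and cuMemcpy2D_v2 must step between rows by 'pitch' bytes, not by widthInBytes.
Console.WriteLine("Requested row width: " + widthInBytes + " bytes, pitch: " + pitch + " bytes");

DriverAPINativeMethods.MemoryManagement.cuMemFree_v2(dptr);
```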
cuMemFree_v2(CUdeviceptr)
Frees the memory space pointed to by dptr, which must have been returned by a previous call to cuMemAlloc_v2(ref CUdeviceptr, SizeT) or cuMemAllocPitch_v2(ref CUdeviceptr, ref SizeT, SizeT, SizeT, UInt32).
Declaration
public static CUResult cuMemFree_v2(CUdeviceptr dptr)
Parameters
Type | Name | Description |
---|---|---|
CUdeviceptr | dptr | Pointer to memory to free |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue. |
cuMemFreeHost(IntPtr)
Frees the memory space pointed to by p, which must have been returned by a previous call to cuMemAllocHost_v2(ref IntPtr, SizeT).
Declaration
public static CUResult cuMemFreeHost(IntPtr p)
Parameters
Type | Name | Description |
---|---|---|
System.IntPtr | p | Pointer to memory to free |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue. |
cuMemGetAddressRange_v2(ref CUdeviceptr, ref SizeT, CUdeviceptr)
Returns the base address in pbase and size in psize of the allocation by cuMemAlloc_v2(ref CUdeviceptr, SizeT) or cuMemAllocPitch_v2(ref CUdeviceptr, ref SizeT, SizeT, SizeT, UInt32) that contains the input pointer dptr. Both parameters pbase and psize are optional. If one of them is null, it is ignored.
Declaration
public static CUResult cuMemGetAddressRange_v2(ref CUdeviceptr pbase, ref SizeT psize, CUdeviceptr dptr)
Parameters
Type | Name | Description |
---|---|---|
CUdeviceptr | pbase | Returned base address |
SizeT | psize | Returned size of device memory allocation |
CUdeviceptr | dptr | Device pointer to query |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue. |
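Example
A minimal sketch (illustrative only) querying the base address and size of an existing allocation. CudaContext and the ManagedCuda.BasicTypes namespace are assumptions.

```csharp
using System;
using ManagedCuda;
using ManagedCuda.BasicTypes;

using CudaContext ctx = new CudaContext(0); // bind a context (assumed helper)

CUdeviceptr dptr = new CUdeviceptr();
SizeT bytesize = 8192;
CUResult res = DriverAPINativeMethods.MemoryManagement.cuMemAlloc_v2(ref dptr, bytesize);
if (res != CUResult.Success) throw new Exception("cuMemAlloc_v2 failed: " + res);

CUdeviceptr baseAddr = new CUdeviceptr();
SizeT size = 0;

// For a pointer returned by cuMemAlloc_v2, the base equals the pointer itself
// and the size equals the requested allocation size.
res = DriverAPINativeMethods.MemoryManagement.cuMemGetAddressRange_v2(ref baseAddr, ref size, dptr);
if (res != CUResult.Success) throw new Exception("cuMemGetAddressRange_v2 failed: " + res);

Console.WriteLine("Allocation size: " + size + " bytes");
DriverAPINativeMethods.MemoryManagement.cuMemFree_v2(dptr);
```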
cuMemGetInfo_v2(ref SizeT, ref SizeT)
Returns in free and total, respectively, the free and total amount of memory available for allocation by the CUDA context, in bytes.
Declaration
public static CUResult cuMemGetInfo_v2(ref SizeT free, ref SizeT total)
Parameters
Type | Name | Description |
---|---|---|
SizeT | free | Returned free memory in bytes |
SizeT | total | Returned total memory in bytes |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue. |
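Example
A minimal sketch (illustrative only) reporting free and total device memory for the current context. CudaContext and the ManagedCuda.BasicTypes namespace are assumptions.

```csharp
using System;
using ManagedCuda;
using ManagedCuda.BasicTypes;

using CudaContext ctx = new CudaContext(0); // bind a context (assumed helper)

SizeT free = 0, total = 0;
CUResult res = DriverAPINativeMethods.MemoryManagement.cuMemGetInfo_v2(ref free, ref total);
if (res != CUResult.Success) throw new Exception("cuMemGetInfo_v2 failed: " + res);

Console.WriteLine("Free: " + free + " bytes of " + total + " bytes total");
```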
cuMemHostAlloc(ref IntPtr, SizeT, CUMemHostAllocFlags)
Allocates bytesize bytes of host memory that is page-locked and accessible to the device. The driver tracks the virtual memory ranges allocated with this function and automatically accelerates calls to functions such as cuMemcpyHtoD_v2(CUdeviceptr, IntPtr, SizeT). Since the memory can be accessed directly by the device, it can be read or written with much higher bandwidth than pageable memory obtained with functions such as malloc(). Allocating excessive amounts of pinned memory may degrade system performance, since it reduces the amount of memory available to the system for paging. As a result, this function is best used sparingly to allocate staging areas for data exchange between host and device.
For the Flags parameter see CUMemHostAllocFlags.
The CUDA context must have been created with the MapHost flag in order for the DeviceMap flag to have any effect.
The DeviceMap flag may be specified on CUDA contexts for devices that do not support mapped pinned memory. The failure is deferred to cuMemHostGetDevicePointer_v2(ref CUdeviceptr, IntPtr, Int32) because the memory may be mapped into other CUDA contexts via the Portable flag.
The memory allocated by this function must be freed with cuMemFreeHost(IntPtr).
Note all host memory allocated using cuMemHostAlloc(ref IntPtr, SizeT, CUMemHostAllocFlags) will automatically be immediately accessible to all contexts on all devices which support unified addressing (as may be queried using CU_DEVICE_ATTRIBUTE_UNIFIED_ADDRESSING). Unless the flag CU_MEMHOSTALLOC_WRITECOMBINED is specified, the device pointer that may be used to access this host memory from those contexts is always equal to the returned host pointer pp. If the flag CU_MEMHOSTALLOC_WRITECOMBINED is specified, then the function cuMemHostGetDevicePointer_v2(ref CUdeviceptr, IntPtr, Int32) must be used to query the device pointer, even if the context supports unified addressing. See the CUDA Unified Addressing documentation for additional details.
Declaration
public static CUResult cuMemHostAlloc(ref IntPtr pp, SizeT bytesize, CUMemHostAllocFlags Flags)
Parameters
Type | Name | Description |
---|---|---|
System.IntPtr | pp | Returned host pointer to page-locked memory |
SizeT | bytesize | Requested allocation size in bytes |
CUMemHostAllocFlags | Flags | Flags for allocation request |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue, ErrorOutOfMemory. |
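Example
A minimal sketch (illustrative only) of allocating page-locked, device-mapped host memory and retrieving the matching device pointer. CudaContext, the ManagedCuda.BasicTypes namespace and the CUMemHostAllocFlags.DeviceMap member name are assumptions; per the remarks above, the context must have been created with the MapHost flag for DeviceMap to take effect.

```csharp
using System;
using ManagedCuda;
using ManagedCuda.BasicTypes;

using CudaContext ctx = new CudaContext(0); // bind a context (assumed helper)

IntPtr hostPtr = IntPtr.Zero;
SizeT bytesize = 1 << 16; // 64 KiB staging buffer

CUResult res = DriverAPINativeMethods.MemoryManagement.cuMemHostAlloc(
    ref hostPtr, bytesize, CUMemHostAllocFlags.DeviceMap);  // member name assumed
if (res != CUResult.Success) throw new Exception("cuMemHostAlloc failed: " + res);

// Device-side alias of the pinned buffer; Flags must be 0.
CUdeviceptr devPtr = new CUdeviceptr();
res = DriverAPINativeMethods.MemoryManagement.cuMemHostGetDevicePointer_v2(ref devPtr, hostPtr, 0);
if (res != CUResult.Success) throw new Exception("cuMemHostGetDevicePointer_v2 failed: " + res);

// ... fill hostPtr from the CPU, read devPtr from kernels ...

DriverAPINativeMethods.MemoryManagement.cuMemFreeHost(hostPtr);
```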
cuMemHostGetDevicePointer_v2(ref CUdeviceptr, IntPtr, Int32)
Passes back the device pointer pdptr corresponding to the mapped, pinned host buffer p allocated by cuMemHostAlloc(ref IntPtr, SizeT, CUMemHostAllocFlags). cuMemHostGetDevicePointer_v2(ref CUdeviceptr, IntPtr, Int32) will fail if the DeviceMap flag was not specified at the time the memory was allocated, or if the function is called on a GPU that does not support mapped pinned memory.
Flags is provided for future releases. For now, it must be set to 0.
Declaration
public static CUResult cuMemHostGetDevicePointer_v2(ref CUdeviceptr pdptr, IntPtr p, int Flags)
Parameters
Type | Name | Description |
---|---|---|
CUdeviceptr | pdptr | Returned device pointer |
System.IntPtr | p | Host pointer |
System.Int32 | Flags | Options (must be 0) |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue. |
cuMemHostGetFlags(ref CUMemHostAllocFlags, IntPtr)
Passes back the flags pFlags that were specified when allocating the pinned host buffer p allocated by cuMemHostAlloc(ref IntPtr, SizeT, CUMemHostAllocFlags).
cuMemHostGetFlags(ref CUMemHostAllocFlags, IntPtr) will fail if the pointer does not reside in an allocation performed by cuMemAllocHost_v2(ref IntPtr, SizeT) or cuMemHostAlloc(ref IntPtr, SizeT, CUMemHostAllocFlags).
Declaration
public static CUResult cuMemHostGetFlags(ref CUMemHostAllocFlags pFlags, IntPtr p)
Parameters
Type | Name | Description |
---|---|---|
CUMemHostAllocFlags | pFlags | Returned flags |
System.IntPtr | p | Host pointer |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue. |
cuMemHostRegister(IntPtr, SizeT, CUMemHostRegisterFlags)
Page-locks the memory range specified by p and bytesize and maps it for the device(s) as specified by Flags. This memory range also is added to the same tracking mechanism as cuMemHostAlloc to automatically accelerate calls to functions such as cuMemcpyHtoD_v2(CUdeviceptr, dim3[], SizeT). Since the memory can be accessed directly by the device, it can be read or written with much higher bandwidth than pageable memory that has not been registered. Page-locking excessive amounts of memory may degrade system performance, since it reduces the amount of memory available to the system for paging. As a result, this function is best used sparingly to register staging areas for data exchange between host and device.
The pointer p and size bytesize must be aligned to the host page size (4 KB).
The memory page-locked by this function must be unregistered with cuMemHostUnregister(IntPtr).
Declaration
public static CUResult cuMemHostRegister(IntPtr p, SizeT byteSize, CUMemHostRegisterFlags Flags)
Parameters
Type | Name | Description |
---|---|---|
System.IntPtr | p | Host pointer to memory to page-lock |
SizeT | byteSize | Size in bytes of the address range to page-lock |
CUMemHostRegisterFlags | Flags | Flags for allocation request |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue, ErrorOutOfMemory. |
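Example
A minimal sketch (illustrative only) of page-locking an existing native buffer and releasing it again. The 4 KB alignment required above is produced by over-allocating and rounding the pointer up; CudaContext, the ManagedCuda.BasicTypes namespace and the CUMemHostRegisterFlags.None member name are assumptions.

```csharp
using System;
using System.Runtime.InteropServices;
using ManagedCuda;
using ManagedCuda.BasicTypes;

using CudaContext ctx = new CudaContext(0); // bind a context (assumed helper)

const int pageSize = 4096;
SizeT byteSize = pageSize;                       // register exactly one page

// Marshal.AllocHGlobal gives no alignment guarantee, so over-allocate and align manually.
IntPtr raw = Marshal.AllocHGlobal(2 * pageSize);
IntPtr p = new IntPtr((raw.ToInt64() + pageSize - 1) / pageSize * pageSize);

CUResult res = DriverAPINativeMethods.MemoryManagement.cuMemHostRegister(
    p, byteSize, CUMemHostRegisterFlags.None);   // member name assumed
if (res != CUResult.Success) throw new Exception("cuMemHostRegister failed: " + res);

// ... use p as a fast staging buffer for host-device copies ...

res = DriverAPINativeMethods.MemoryManagement.cuMemHostUnregister(p);
if (res != CUResult.Success) throw new Exception("cuMemHostUnregister failed: " + res);

Marshal.FreeHGlobal(raw);
```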
cuMemHostUnregister(IntPtr)
Unmaps the memory range whose base address is specified by p, and makes it pageable again. The base address must be the same one specified to cuMemHostRegister(IntPtr, SizeT, CUMemHostRegisterFlags).
Declaration
public static CUResult cuMemHostUnregister(IntPtr p)
Parameters
Type | Name | Description |
---|---|---|
System.IntPtr | p | Host pointer to memory to unregister |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue, ErrorOutOfMemory. |
cuMemPrefetchAsync(CUdeviceptr, SizeT, CUdevice, CUstream)
Prefetches memory to the specified destination device
Prefetches memory to the specified destination device. devPtr is the base device pointer of the memory to be prefetched and dstDevice is the destination device. count specifies the number of bytes to copy. hStream is the stream in which the operation is enqueued.
Passing in CU_DEVICE_CPU for dstDevice will prefetch the data to CPU memory.
If no physical memory has been allocated for this region, then this memory region will be populated and mapped on the destination device. If there's insufficient memory to prefetch the desired region, the Unified Memory driver may evict pages belonging to other memory regions to make room. If there's no memory that can be evicted, then the Unified Memory driver will prefetch less than what was requested.
In the normal case, any mappings to the previous location of the migrated pages are removed and mappings for the new location are only set up on dstDevice. The application can exercise finer control on these mappings using cuMemAdvise.
Declaration
public static CUResult cuMemPrefetchAsync(CUdeviceptr devPtr, SizeT count, CUdevice dstDevice, CUstream hStream)
Parameters
Type | Name | Description |
---|---|---|
CUdeviceptr | devPtr | Pointer to be prefetched |
SizeT | count | Size in bytes |
CUdevice | dstDevice | Destination device to prefetch to |
CUstream | hStream | Stream to enqueue prefetch operation |
Returns
Type | Description |
---|---|
CUResult |
Remarks
Note that this function is asynchronous with respect to the host and all work on other devices.
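Example
A minimal sketch (illustrative only) of prefetching a managed allocation to the GPU on the default stream. CudaContext (including its Synchronize method), the ManagedCuda.BasicTypes namespace, and the CUmemAttach_flags.Global member name are assumptions; a default-initialized CUstream is taken to mean the default stream and a default-initialized CUdevice to mean device ordinal 0.

```csharp
using System;
using ManagedCuda;
using ManagedCuda.BasicTypes;

using CudaContext ctx = new CudaContext(0); // bind a context (assumed helper)

CUdeviceptr devPtr = new CUdeviceptr();
SizeT count = 1 << 20; // 1 MiB
CUResult res = DriverAPINativeMethods.MemoryManagement.cuMemAllocManaged(
    ref devPtr, count, CUmemAttach_flags.Global);          // member name assumed
if (res != CUResult.Success) throw new Exception("cuMemAllocManaged failed: " + res);

CUdevice dstDevice = new CUdevice();  // device ordinal 0 (assumption)
CUstream stream = new CUstream();     // default (null) stream (assumption)

// Enqueue the prefetch; it is asynchronous with respect to the host.
res = DriverAPINativeMethods.MemoryManagement.cuMemPrefetchAsync(devPtr, count, dstDevice, stream);
if (res != CUResult.Success) throw new Exception("cuMemPrefetchAsync failed: " + res);

ctx.Synchronize();                    // wait for the prefetch to finish (assumed helper)
DriverAPINativeMethods.MemoryManagement.cuMemFree_v2(devPtr);
```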
cuMemRangeGetAttribute(IntPtr, SizeT, CUmem_range_attribute, CUdeviceptr, SizeT)
Query an attribute of a given memory range
Declaration
public static CUResult cuMemRangeGetAttribute(IntPtr data, SizeT dataSize, CUmem_range_attribute attribute, CUdeviceptr devPtr, SizeT count)
Parameters
Type | Name | Description |
---|---|---|
System.IntPtr | data | A pointer to a memory location where the result of the attribute query will be written. |
SizeT | dataSize | Size in bytes of the memory pointed to by data |
CUmem_range_attribute | attribute | The attribute to query |
CUdeviceptr | devPtr | Start of the range to query |
SizeT | count | Size of the range to query |
Returns
Type | Description |
---|---|
CUResult |
cuMemRangeGetAttributes(IntPtr[], SizeT[], CUmem_range_attribute[], SizeT, CUdeviceptr, SizeT)
Query attributes of a given memory range.
Declaration
public static CUResult cuMemRangeGetAttributes(IntPtr[] data, SizeT[] dataSizes, CUmem_range_attribute[] attributes, SizeT numAttributes, CUdeviceptr devPtr, SizeT count)
Parameters
Type | Name | Description |
---|---|---|
System.IntPtr[] | data | A two-dimensional array containing pointers to memory locations where the result of each attribute query will be written to. |
SizeT[] | dataSizes | Array containing the sizes of each result |
CUmem_range_attribute[] | attributes | An array of attributes to query (numAttributes and the number of attributes in this array should match) |
SizeT | numAttributes | Number of attributes to query |
CUdeviceptr | devPtr | Start of the range to query |
SizeT | count | Size of the range to query |
Returns
Type | Description |
---|---|
CUResult |
cuPointerGetAttribute(ref CUcontext, CUPointerAttribute, CUdeviceptr)
Returns information about a pointer
Declaration
public static CUResult cuPointerGetAttribute(ref CUcontext data, CUPointerAttribute attribute, CUdeviceptr ptr)
Parameters
Type | Name | Description |
---|---|---|
CUcontext | data | Returned pointer attribute value |
CUPointerAttribute | attribute | Pointer attribute to query |
CUdeviceptr | ptr | Pointer |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue, ErrorInvalidDevice. |
cuPointerGetAttribute(ref CudaPointerAttributeP2PTokens, CUPointerAttribute, CUdeviceptr)
Returns information about a pointer
Declaration
public static CUResult cuPointerGetAttribute(ref CudaPointerAttributeP2PTokens data, CUPointerAttribute attribute, CUdeviceptr ptr)
Parameters
Type | Name | Description |
---|---|---|
CudaPointerAttributeP2PTokens | data | Returned pointer attribute value |
CUPointerAttribute | attribute | Pointer attribute to query |
CUdeviceptr | ptr | Pointer |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue, ErrorInvalidDevice. |
cuPointerGetAttribute(ref CUdeviceptr, CUPointerAttribute, CUdeviceptr)
Returns information about a pointer
Declaration
public static CUResult cuPointerGetAttribute(ref CUdeviceptr data, CUPointerAttribute attribute, CUdeviceptr ptr)
Parameters
Type | Name | Description |
---|---|---|
CUdeviceptr | data | Returned pointer attribute value |
CUPointerAttribute | attribute | Pointer attribute to query |
CUdeviceptr | ptr | Pointer |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue, ErrorInvalidDevice. |
cuPointerGetAttribute(ref CUMemoryType, CUPointerAttribute, CUdeviceptr)
Returns information about a pointer
Declaration
public static CUResult cuPointerGetAttribute(ref CUMemoryType data, CUPointerAttribute attribute, CUdeviceptr ptr)
Parameters
Type | Name | Description |
---|---|---|
CUMemoryType | data | Returned pointer attribute value |
CUPointerAttribute | attribute | Pointer attribute to query |
CUdeviceptr | ptr | Pointer |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue, ErrorInvalidDevice. |
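Example
A minimal sketch (illustrative only) querying the memory type of a device allocation via the CUMemoryType overload. CudaContext, the ManagedCuda.BasicTypes namespace and the CUPointerAttribute.MemoryType member name are assumptions.

```csharp
using System;
using ManagedCuda;
using ManagedCuda.BasicTypes;

using CudaContext ctx = new CudaContext(0); // bind a context (assumed helper)

CUdeviceptr dptr = new CUdeviceptr();
SizeT bytesize = 4096;
CUResult res = DriverAPINativeMethods.MemoryManagement.cuMemAlloc_v2(ref dptr, bytesize);
if (res != CUResult.Success) throw new Exception("cuMemAlloc_v2 failed: " + res);

CUMemoryType memType = default;
res = DriverAPINativeMethods.MemoryManagement.cuPointerGetAttribute(
    ref memType, CUPointerAttribute.MemoryType, dptr);     // member name assumed
if (res != CUResult.Success) throw new Exception("cuPointerGetAttribute failed: " + res);

Console.WriteLine("Memory type: " + memType); // expected to report device memory
DriverAPINativeMethods.MemoryManagement.cuMemFree_v2(dptr);
```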
cuPointerGetAttribute(ref Int32, CUPointerAttribute, CUdeviceptr)
Returns information about a pointer
Declaration
public static CUResult cuPointerGetAttribute(ref int data, CUPointerAttribute attribute, CUdeviceptr ptr)
Parameters
Type | Name | Description |
---|---|---|
System.Int32 | data | Returned pointer attribute value |
CUPointerAttribute | attribute | Pointer attribute to query |
CUdeviceptr | ptr | Pointer |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue, ErrorInvalidDevice. |
cuPointerGetAttribute(ref IntPtr, CUPointerAttribute, CUdeviceptr)
Returns information about a pointer
Declaration
public static CUResult cuPointerGetAttribute(ref IntPtr data, CUPointerAttribute attribute, CUdeviceptr ptr)
Parameters
Type | Name | Description |
---|---|---|
System.IntPtr | data | Returned pointer attribute value |
CUPointerAttribute | attribute | Pointer attribute to query |
CUdeviceptr | ptr | Pointer |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue, ErrorInvalidDevice. |
cuPointerGetAttribute(ref UInt64, CUPointerAttribute, CUdeviceptr)
Returns information about a pointer
Declaration
public static CUResult cuPointerGetAttribute(ref ulong data, CUPointerAttribute attribute, CUdeviceptr ptr)
Parameters
Type | Name | Description |
---|---|---|
System.UInt64 | data | Returned pointer attribute value |
CUPointerAttribute | attribute | Pointer attribute to query |
CUdeviceptr | ptr | Pointer |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue, ErrorInvalidDevice. |
cuPointerGetAttributes(UInt32, CUPointerAttribute[], IntPtr, CUdeviceptr)
Returns information about a pointer.
The supported attributes are (refer to cuPointerGetAttribute for attribute descriptions and restrictions):
- CU_POINTER_ATTRIBUTE_CONTEXT
- CU_POINTER_ATTRIBUTE_MEMORY_TYPE
- CU_POINTER_ATTRIBUTE_DEVICE_POINTER
- CU_POINTER_ATTRIBUTE_HOST_POINTER
- CU_POINTER_ATTRIBUTE_SYNC_MEMOPS
- CU_POINTER_ATTRIBUTE_BUFFER_ID
- CU_POINTER_ATTRIBUTE_IS_MANAGED
Declaration
public static CUResult cuPointerGetAttributes(uint numAttributes, CUPointerAttribute[] attributes, IntPtr data, CUdeviceptr ptr)
Parameters
Type | Name | Description |
---|---|---|
System.UInt32 | numAttributes | Number of attributes to query |
CUPointerAttribute[] | attributes | An array of attributes to query (numAttributes and the number of attributes in this array should match) |
System.IntPtr | data | A two-dimensional array containing pointers to memory locations where the result of each attribute query will be written to. |
CUdeviceptr | ptr | Pointer to query |
Returns
Type | Description |
---|---|
CUResult |
cuPointerSetAttribute(ref Int32, CUPointerAttribute, CUdeviceptr)
Set attributes on a previously allocated memory region
The supported attributes are:
SyncMemops: A boolean attribute that can either be set (1) or unset (0). When set, the memory region to which ptr points is guaranteed to always synchronize memory operations that are synchronous. If there are some previously initiated synchronous memory operations that are pending when this attribute is set, the function does not return until those memory operations are complete. See further documentation in the section titled "API synchronization behavior" to learn more about cases when synchronous memory operations can exhibit asynchronous behavior.
value will be considered as a pointer to an unsigned integer to which this attribute is to be set.
Declaration
public static CUResult cuPointerSetAttribute(ref int value, CUPointerAttribute attribute, CUdeviceptr ptr)
Parameters
Type | Name | Description |
---|---|---|
System.Int32 | value | Pointer to memory containing the value to be set |
CUPointerAttribute | attribute | Pointer attribute to set |
CUdeviceptr | ptr | Pointer to a memory region allocated using CUDA memory allocation APIs |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidValue, ErrorInvalidDevice |
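Example
A minimal sketch (illustrative only) enabling the SyncMemops attribute on a device allocation. CudaContext, the ManagedCuda.BasicTypes namespace and the CUPointerAttribute.SyncMemops member name are assumptions.

```csharp
using System;
using ManagedCuda;
using ManagedCuda.BasicTypes;

using CudaContext ctx = new CudaContext(0); // bind a context (assumed helper)

CUdeviceptr dptr = new CUdeviceptr();
SizeT bytesize = 4096;
CUResult res = DriverAPINativeMethods.MemoryManagement.cuMemAlloc_v2(ref dptr, bytesize);
if (res != CUResult.Success) throw new Exception("cuMemAlloc_v2 failed: " + res);

// 1 = set, 0 = unset; the value is treated as an unsigned integer.
int value = 1;
res = DriverAPINativeMethods.MemoryManagement.cuPointerSetAttribute(
    ref value, CUPointerAttribute.SyncMemops, dptr);       // member name assumed
if (res != CUResult.Success) throw new Exception("cuPointerSetAttribute failed: " + res);

DriverAPINativeMethods.MemoryManagement.cuMemFree_v2(dptr);
```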