Class CudaManagedMemory_ushort
A variable located in managed memory.
Type: ushort
Inheritance
Implements
Inherited Members
Namespace: ManagedCuda
Assembly: ManagedCuda.dll
Syntax
public class CudaManagedMemory_ushort : IDisposable, IEnumerable<ushort>, IEnumerable
Constructors
CudaManagedMemory_ushort(CUmodule, String)
Creates a new CudaManagedMemory_ushort from a variable defined in a .cu file.
Declaration
public CudaManagedMemory_ushort(CUmodule module, string name)
Parameters
Type | Name | Description |
---|---|---|
CUmodule | module | The module in which the variable is defined. |
System.String | name | The variable name as defined in the .cu file. |
CudaManagedMemory_ushort(SizeT, CUmemAttach_flags)
Creates a new CudaManagedMemory and allocates the memory on host/device.
Declaration
public CudaManagedMemory_ushort(SizeT size, CUmemAttach_flags attachFlags)
Parameters
Type | Name | Description |
---|---|---|
SizeT | size | Size in elements |
CUmemAttach_flags | attachFlags | One of the CUmemAttach_flags values (Global, Host or Single) |
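A minimal usage sketch of this constructor, assuming a CUDA-capable device and the ManagedCuda package (the buffer size is illustrative):

```csharp
using ManagedCuda;
using ManagedCuda.BasicTypes;

// Create a context on device 0, then allocate 1024 ushort elements of
// managed memory that is accessible from both host and device.
using (var ctx = new CudaContext(0))
using (var mem = new CudaManagedMemory_ushort(1024, CUmemAttach_flags.Global))
{
    // Managed memory can be written directly from the host.
    mem[0] = 42;
}
```

Disposing the wrapper (here via `using`) frees the underlying allocation when the instance is the owner.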
CudaManagedMemory_ushort(CudaKernel, String)
Creates a new CudaManagedMemory_ushort from a variable defined in a .cu file.
Declaration
public CudaManagedMemory_ushort(CudaKernel kernel, string name)
Parameters
Type | Name | Description |
---|---|---|
CudaKernel | kernel | The kernel whose module defines the variable. |
System.String | name | The variable name as defined in the .cu file. |
Properties
AttributeBufferID
A process-wide unique ID for an allocated memory region
Declaration
public ulong AttributeBufferID { get; }
Property Value
Type | Description |
---|---|
System.UInt64 |
AttributeContext
The CUcontext on which a pointer was allocated or registered
Declaration
public CUcontext AttributeContext { get; }
Property Value
Type | Description |
---|---|
CUcontext |
AttributeDevicePointer
The address at which a pointer's memory may be accessed on the device
Except in the exceptional disjoint addressing cases, the value returned will equal the input value.
Declaration
public CUdeviceptr AttributeDevicePointer { get; }
Property Value
Type | Description |
---|---|
CUdeviceptr |
AttributeHostPointer
The address at which a pointer's memory may be accessed on the host
Declaration
public IntPtr AttributeHostPointer { get; }
Property Value
Type | Description |
---|---|
System.IntPtr |
AttributeIsManaged
Indicates if the pointer points to managed memory
Declaration
public bool AttributeIsManaged { get; }
Property Value
Type | Description |
---|---|
System.Boolean |
AttributeMemoryType
The CUMemoryType describing the physical location of a pointer
Declaration
public CUMemoryType AttributeMemoryType { get; }
Property Value
Type | Description |
---|---|
CUMemoryType |
AttributeP2PTokens
A pair of tokens for use with the nv-p2p.h Linux kernel interface
Declaration
public CudaPointerAttributeP2PTokens AttributeP2PTokens { get; }
Property Value
Type | Description |
---|---|
CudaPointerAttributeP2PTokens |
AttributeSyncMemops
Synchronize every synchronous memory operation initiated on this region
Declaration
public bool AttributeSyncMemops { get; set; }
Property Value
Type | Description |
---|---|
System.Boolean |
DevicePointer
CUdeviceptr to managed memory.
Declaration
public CUdeviceptr DevicePointer { get; }
Property Value
Type | Description |
---|---|
CUdeviceptr |
HostPointer
UIntPtr to managed memory.
Declaration
public UIntPtr HostPointer { get; }
Property Value
Type | Description |
---|---|
System.UIntPtr |
IsOwner
If this wrapper class instance is the owner of a CUDA handle, the handle is destroyed when the instance is disposed.
Declaration
public bool IsOwner { get; }
Property Value
Type | Description |
---|---|
System.Boolean |
Item[SizeT]
Access array per element.
Declaration
public ushort this[SizeT index] { get; set; }
Parameters
Type | Name | Description |
---|---|---|
SizeT | index | index in elements |
Property Value
Type | Description |
---|---|
System.UInt16 |
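The indexer reads and writes managed memory directly from the host, one element at a time. A short sketch, assuming an allocated instance `mem` as in the constructor example above:

```csharp
// Fill the buffer on the host through the indexer.
for (int i = 0; i < (int)mem.Size; i++)
    mem[i] = (ushort)i;

// Read back an element host-side; no explicit copy is needed,
// since the memory is managed (unified).
ushort first = mem[0];
```

Note that per-element access goes through the host pointer, so it is convenient for setup and verification rather than bulk transfers.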
Size
Size in elements
Declaration
public SizeT Size { get; }
Property Value
Type | Description |
---|---|
SizeT |
SizeInBytes
Size in bytes
Declaration
public SizeT SizeInBytes { get; }
Property Value
Type | Description |
---|---|
SizeT |
Methods
Dispose()
Dispose
Declaration
public void Dispose()
Dispose(Boolean)
Protected implementation of the IDisposable pattern.
Declaration
protected virtual void Dispose(bool fDisposing)
Parameters
Type | Name | Description |
---|---|---|
System.Boolean | fDisposing |
Finalize()
Finalizer; ensures resources are released if Dispose() was not called.
Declaration
protected void Finalize()
MemAdvise(CUdeviceptr, SizeT, CUmemAdvise, CUdevice)
Advise about the usage of a given memory range
Advise the Unified Memory subsystem about the usage pattern for the memory range starting at devPtr with a size of count bytes.
The advice parameter can take the following values:
- CU_MEM_ADVISE_SET_READ_MOSTLY: Implies that the data is mostly going to be read from and only occasionally written to. This allows the driver to create read-only copies of the data in a processor's memory when that processor accesses it. Similarly, if cuMemPrefetchAsync is called on this region, it will create a read-only copy of the data on the destination processor. When a processor writes to this data, all copies of the corresponding page are invalidated except for the one where the write occurred. The device argument is ignored for this advice.
- CU_MEM_ADVISE_UNSET_READ_MOSTLY: Undoes the effect of CU_MEM_ADVISE_SET_READ_MOSTLY. Any read-duplicated copies of the data will be freed no later than the next write access to that data.
- CU_MEM_ADVISE_SET_PREFERRED_LOCATION: Sets the preferred location for the data to be the memory belonging to device. Passing in CU_DEVICE_CPU for device sets the preferred location to CPU memory. Setting the preferred location does not cause the data to migrate to that location immediately. Instead, it guides the migration policy when a fault occurs on that memory region. If the data is already in its preferred location and the faulting processor can establish a mapping without requiring the data to be migrated, the migration is avoided. On the other hand, if the data is not in its preferred location, or if a direct mapping cannot be established, it is migrated to the processor accessing it. Note that setting the preferred location does not prevent data prefetching done using cuMemPrefetchAsync.
Having a preferred location can override the thrash detection and resolution logic in the Unified Memory driver. Normally, if a page is detected to be constantly thrashing between, say, CPU and GPU memory, the page will eventually be pinned to CPU memory by the Unified Memory driver. But if the preferred location is set to GPU memory, the page will continue to thrash indefinitely. When the Unified Memory driver has to evict pages from a location because that memory is oversubscribed, the preferred location is used to decide the destination to which a page should be evicted.
If CU_MEM_ADVISE_SET_READ_MOSTLY is also set on this memory region or any subset of it, the preferred location will be ignored for that subset.
- CU_MEM_ADVISE_UNSET_PREFERRED_LOCATION: Undoes the effect of CU_MEM_ADVISE_SET_PREFERRED_LOCATION and changes the preferred location to none.
- CU_MEM_ADVISE_SET_ACCESSED_BY: Implies that the data will be accessed by device. This does not cause data migration and has no impact on the location of the data per se. Instead, it causes the data to always be mapped in the specified processor's page tables, as long as the location of the data permits a mapping to be established. If the data gets migrated for any reason, the mappings are updated accordingly.
This advice is useful in scenarios where data locality is not important but avoiding faults is. Consider, for example, a system containing multiple GPUs with peer-to-peer access enabled, where the data located on one GPU is occasionally accessed by other GPUs. In such scenarios, migrating data over to the other GPUs is not as important because the accesses are infrequent and the overhead of migration may be too high. But preventing faults can still help improve performance, so having a mapping set up in advance is useful. Note that on CPU access of this data, the data may be migrated to CPU memory because the CPU typically cannot access GPU memory directly. Any GPU that had the CU_MEM_ADVISE_SET_ACCESSED_BY flag set for this data will now have its mapping updated to point to the page in CPU memory.
- CU_MEM_ADVISE_UNSET_ACCESSED_BY: Undoes the effect of CU_MEM_ADVISE_SET_ACCESSED_BY. The current set of mappings may be removed at any time, causing accesses to result in page faults.
Passing in CU_DEVICE_CPU for device will set the advice for the CPU.
Note that this function is asynchronous with respect to the host and all work on other devices.
Declaration
public static void MemAdvise(CUdeviceptr devPtr, SizeT count, CUmemAdvise advice, CUdevice device)
Parameters
Type | Name | Description |
---|---|---|
CUdeviceptr | devPtr | Pointer to memory to set the advice for |
SizeT | count | Size in bytes of the memory range |
CUmemAdvise | advice | Advice to be applied for the specified memory range |
CUdevice | device | Device to apply the advice for |
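For example, a region can be marked read-mostly so the driver may keep read-only copies on each processor that accesses it. This is a sketch, assuming an allocated instance `mem` as above; the CUmemAdvise member name and the CUdevice construction are assumptions for illustration, not verified against the library:

```csharp
// Hint that this range will mostly be read; the driver may then
// duplicate its pages read-only on each accessing processor.
CUdevice device = new CUdevice(0);   // assumed: handle for GPU 0; in real
                                     // code obtain it from your context
CudaManagedMemory_ushort.MemAdvise(
    mem.DevicePointer,               // start of the advised range
    mem.SizeInBytes,                 // whole allocation, in bytes
    CUmemAdvise.SetReadMostly,       // assumed enum member name
    device);
```

The device argument is ignored for the read-mostly advice, as described above.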
MemAdvise(CudaManagedMemory_ushort, CUmemAdvise, CUdevice)
Advise about the usage of a given memory range
Advises the Unified Memory subsystem about the usage pattern for the memory range covered by ptr.
The advice parameter can take the following values:
- CU_MEM_ADVISE_SET_READ_MOSTLY: Implies that the data is mostly going to be read from and only occasionally written to. This allows the driver to create read-only copies of the data in a processor's memory when that processor accesses it. Similarly, if cuMemPrefetchAsync is called on this region, it will create a read-only copy of the data on the destination processor. When a processor writes to this data, all copies of the corresponding page are invalidated except for the one where the write occurred. The device argument is ignored for this advice.
- CU_MEM_ADVISE_UNSET_READ_MOSTLY: Undoes the effect of CU_MEM_ADVISE_SET_READ_MOSTLY. Any read-duplicated copies of the data will be freed no later than the next write access to that data.
- CU_MEM_ADVISE_SET_PREFERRED_LOCATION: Sets the preferred location for the data to be the memory belonging to device. Passing in CU_DEVICE_CPU for device sets the preferred location to CPU memory. Setting the preferred location does not cause the data to migrate to that location immediately. Instead, it guides the migration policy when a fault occurs on that memory region. If the data is already in its preferred location and the faulting processor can establish a mapping without requiring the data to be migrated, the migration is avoided. On the other hand, if the data is not in its preferred location, or if a direct mapping cannot be established, it is migrated to the processor accessing it. Note that setting the preferred location does not prevent data prefetching done using cuMemPrefetchAsync.
Having a preferred location can override the thrash detection and resolution logic in the Unified Memory driver. Normally, if a page is detected to be constantly thrashing between, say, CPU and GPU memory, the page will eventually be pinned to CPU memory by the Unified Memory driver. But if the preferred location is set to GPU memory, the page will continue to thrash indefinitely. When the Unified Memory driver has to evict pages from a location because that memory is oversubscribed, the preferred location is used to decide the destination to which a page should be evicted.
If CU_MEM_ADVISE_SET_READ_MOSTLY is also set on this memory region or any subset of it, the preferred location will be ignored for that subset.
- CU_MEM_ADVISE_UNSET_PREFERRED_LOCATION: Undoes the effect of CU_MEM_ADVISE_SET_PREFERRED_LOCATION and changes the preferred location to none.
- CU_MEM_ADVISE_SET_ACCESSED_BY: Implies that the data will be accessed by device. This does not cause data migration and has no impact on the location of the data per se. Instead, it causes the data to always be mapped in the specified processor's page tables, as long as the location of the data permits a mapping to be established. If the data gets migrated for any reason, the mappings are updated accordingly.
This advice is useful in scenarios where data locality is not important but avoiding faults is. Consider, for example, a system containing multiple GPUs with peer-to-peer access enabled, where the data located on one GPU is occasionally accessed by other GPUs. In such scenarios, migrating data over to the other GPUs is not as important because the accesses are infrequent and the overhead of migration may be too high. But preventing faults can still help improve performance, so having a mapping set up in advance is useful. Note that on CPU access of this data, the data may be migrated to CPU memory because the CPU typically cannot access GPU memory directly. Any GPU that had the CU_MEM_ADVISE_SET_ACCESSED_BY flag set for this data will now have its mapping updated to point to the page in CPU memory.
- CU_MEM_ADVISE_UNSET_ACCESSED_BY: Undoes the effect of CU_MEM_ADVISE_SET_ACCESSED_BY. The current set of mappings may be removed at any time, causing accesses to result in page faults.
Passing in CU_DEVICE_CPU for device will set the advice for the CPU.
Note that this function is asynchronous with respect to the host and all work on other devices.
Declaration
public static void MemAdvise(CudaManagedMemory_ushort ptr, CUmemAdvise advice, CUdevice device)
Parameters
Type | Name | Description |
---|---|---|
CudaManagedMemory_ushort | ptr | managed memory variable |
CUmemAdvise | advice | Advice to be applied for the specified memory range |
CUdevice | device | Device to apply the advice for |
PrefetchAsync(CUdevice, CUstream)
Prefetches memory to the specified destination device
Prefetches memory to the specified destination device. devPtr is the base device pointer of the memory to be prefetched and dstDevice is the destination device. count specifies the number of bytes to copy. hStream is the stream in which the operation is enqueued.
Passing in CU_DEVICE_CPU for dstDevice will prefetch the data to CPU memory.
If no physical memory has been allocated for this region, then this memory region will be populated and mapped on the destination device. If there's insufficient memory to prefetch the desired region, the Unified Memory driver may evict pages belonging to other memory regions to make room. If there's no memory that can be evicted, then the Unified Memory driver will prefetch less than what was requested.
In the normal case, any mappings to the previous location of the migrated pages are removed and mappings for the new location are only set up on dstDevice. The application can exercise finer control over these mappings using cudaMemAdvise.
Declaration
public void PrefetchAsync(CUdevice dstDevice, CUstream hStream)
Parameters
Type | Name | Description |
---|---|---|
CUdevice | dstDevice | Destination device to prefetch to |
CUstream | hStream | Stream to enqueue prefetch operation |
Remarks
Note that this function is asynchronous with respect to the host and all work on other devices.
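A sketch of prefetching the allocation to a device before first use, assuming an allocated instance `mem` and a context `ctx` as in the earlier example; the device and stream handles shown are placeholders:

```csharp
// Prefetch the managed buffer to GPU 0 on the default stream, so the
// first kernel access does not page-fault the data in on demand.
CUdevice device = new CUdevice(0);   // assumed: handle for GPU 0
CUstream stream = new CUstream();    // assumed: default (NULL) stream
mem.PrefetchAsync(device, stream);
ctx.Synchronize();                   // wait until the prefetch has completed
```

Because the call is asynchronous with respect to the host, synchronize the stream (or context) before relying on the data's new location.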
StreamAttachMemAsync(CUstream, SizeT, CUmemAttach_flags)
Attach memory to a stream asynchronously
Enqueues an operation in hStream to specify stream association of length bytes of memory starting from dptr. This function is a stream-ordered operation, meaning that it is dependent on, and will only take effect when, previous work in the stream has completed. Any previous association is automatically replaced.
dptr must point to an address within managed memory space declared using the __managed__ keyword or allocated with cuMemAllocManaged.
length must be zero, to indicate that the entire allocation's stream association is being changed. Currently, it is not possible to change the stream association for a portion of an allocation.
The stream association is specified using flags, which must be one of CUmemAttach_flags.
If the Global flag is specified, the memory can be accessed by any stream on any device.
If the Host flag is specified, the program makes a guarantee that it won't access the memory on the device from any stream.
If the Single flag is specified, the program makes a guarantee that it will only access the memory on the device from hStream. It is illegal to attach singly to the NULL stream, because the NULL stream is a virtual global stream and not a specific stream. An error will be returned in this case.
When memory is associated with a single stream, the Unified Memory system will allow CPU access to this memory region so long as all operations in hStream have completed, regardless of whether other streams are active. In effect, this constrains exclusive ownership of the managed memory region by an active GPU to per-stream activity instead of whole-GPU activity.
Accessing memory on the device from streams that are not associated with it will produce undefined results. No error checking is performed by the Unified Memory system to ensure that kernels launched into other streams do not access this region.
It is a program's responsibility to order calls to cuStreamAttachMemAsync(CUstream, CUdeviceptr, SizeT, CUmemAttach_flags) via events, synchronization or other means to ensure legal access to the memory at all times. Data visibility and coherency will be changed appropriately for all kernels that follow a stream-association change.
If hStream is destroyed while data is associated with it, the association is removed and reverts to the default visibility of the allocation as specified at cuMemAllocManaged. For __managed__ variables, the default association is always Global. Note that destroying a stream is an asynchronous operation, and as a result, the change to the default association won't happen until all work in the stream has completed.
Declaration
public void StreamAttachMemAsync(CUstream hStream, SizeT length, CUmemAttach_flags flags)
Parameters
Type | Name | Description |
---|---|---|
CUstream | hStream | Stream in which to enqueue the attach operation |
SizeT | length | Length of memory (must be zero) |
CUmemAttach_flags | flags | Must be one of CUmemAttach_flags |
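For example, the whole allocation can be attached to a single stream so the host may access it as soon as that stream's work has drained, regardless of other streams. A sketch, assuming an allocated instance `mem` and ManagedCuda's CudaStream wrapper:

```csharp
// Associate the entire allocation with one stream (length must be 0).
var cudaStream = new CudaStream();   // wraps a newly created CUstream
mem.StreamAttachMemAsync(cudaStream.Stream, 0, CUmemAttach_flags.Single);

// The attach is stream-ordered; once the stream's work has completed,
// CPU access to this region is legal even while other streams are busy.
cudaStream.Synchronize();
```

Remember that attaching singly to the NULL stream is illegal, as noted above.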
Operators
Implicit(CudaManagedMemory_ushort to UInt16)
Converts a managed variable to a host value. In case of multiple managed values (array), only the first value is converted.
Declaration
public static implicit operator ushort (CudaManagedMemory_ushort d)
Parameters
Type | Name | Description |
---|---|---|
CudaManagedMemory_ushort | d | managed variable |
Returns
Type | Description |
---|---|
System.UInt16 | newly allocated host variable with value from managed memory |
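Because of this operator, a single-element allocation can be read like a plain ushort. A minimal sketch, assuming a valid context:

```csharp
using (var scalar = new CudaManagedMemory_ushort(1, CUmemAttach_flags.Global))
{
    scalar[0] = 7;               // write through the indexer
    ushort hostValue = scalar;   // implicit conversion reads the first element
}
```

For arrays, only the first element is converted, so this is mainly useful for scalar results written back by a kernel.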
Explicit Interface Implementations
IEnumerable&lt;UInt16&gt;.GetEnumerator()
Declaration
IEnumerator<ushort> IEnumerable<ushort>.GetEnumerator()
Returns
Type | Description |
---|---|
System.Collections.Generic.IEnumerator<System.UInt16> |
IEnumerable.GetEnumerator()
Declaration
IEnumerator IEnumerable.GetEnumerator()
Returns
Type | Description |
---|---|
System.Collections.IEnumerator |
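Since the class implements IEnumerable&lt;ushort&gt;, the managed buffer can be traversed from the host with foreach or LINQ. A sketch, assuming an allocated and populated instance `mem`:

```csharp
using System;
using System.Linq;

// Iterate the managed memory element by element on the host.
foreach (ushort v in mem)
    Console.Write(v + " ");

// LINQ also works through the IEnumerable<ushort> implementation.
int total = mem.Sum(v => (int)v);
```

Enumeration reads through the host pointer, so make sure device work on the region has completed (e.g. via context or stream synchronization) before iterating.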