    Class CudaStream

    Wraps a CUstream handle. For the so-called NULL stream, use the native CUstream struct instead.

    Inheritance
    System.Object
    CudaStream
    Implements
    System.IDisposable
    Inherited Members
    System.Object.Equals(System.Object)
    System.Object.Equals(System.Object, System.Object)
    System.Object.GetHashCode()
    System.Object.GetType()
    System.Object.MemberwiseClone()
    System.Object.ReferenceEquals(System.Object, System.Object)
    System.Object.ToString()
    Namespace: ManagedCuda
    Assembly: ManagedCuda.dll
    Syntax
    public class CudaStream : IDisposable

    Constructors


    CudaStream()

    Creates a new stream using CUStreamFlags.None.

    Declaration
    public CudaStream()

    CudaStream(CUstream)

    Creates a new wrapper for an existing stream

    Declaration
    public CudaStream(CUstream _stream)
    Parameters
    Type Name Description
    CUstream _stream

    CudaStream(CUStreamFlags)

    Creates a new Stream

    Declaration
    public CudaStream(CUStreamFlags flags)
    Parameters
    Type Name Description
    CUStreamFlags flags

    Parameters for stream creation (must be None)


    CudaStream(Int32)

    Creates a new stream using CUStreamFlags.None and with the given priority.

    This API alters the scheduler priority of work in the stream. Work in a higher priority stream may preempt work already executing in a low priority stream.

    priority follows a convention where lower numbers represent higher priorities.

    '0' represents default priority.

    Declaration
    public CudaStream(int priority)
    Parameters
    Type Name Description
    System.Int32 priority

    Stream priority. Lower numbers represent higher priorities.


    CudaStream(Int32, CUStreamFlags)

    Creates a new stream with the given priority and the given flags.

    This API alters the scheduler priority of work in the stream. Work in a higher priority stream may preempt work already executing in a low priority stream.

    priority follows a convention where lower numbers represent higher priorities.

    '0' represents default priority.

    Declaration
    public CudaStream(int priority, CUStreamFlags flags)
    Parameters
    Type Name Description
    System.Int32 priority

    Stream priority. Lower numbers represent higher priorities.

    CUStreamFlags flags

    Parameters for stream creation (must be None)
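
    The priority convention described above (lower numbers mean higher priority, 0 is the default) can be sketched as follows. This is a minimal illustration, not a definitive usage pattern: it assumes that a CUDA device with ordinal 0 is present, that a CudaContext is created first, and that the CUDA driver clamps an out-of-range priority to the range the device supports. The class name PriorityStreamSketch and the variable names are hypothetical.

```csharp
using System;
using ManagedCuda;
using ManagedCuda.BasicTypes;

class PriorityStreamSketch
{
    static void Main()
    {
        // Named ints avoid any confusion with the CUStreamFlags overload.
        int urgentPriority = -1;  // lower number = higher priority
        int defaultPriority = 0;  // '0' represents default priority

        // Assumption: a context on device 0 must exist before streams are created.
        using (var ctx = new CudaContext(0))
        using (var urgent = new CudaStream(urgentPriority, CUStreamFlags.None))
        using (var normal = new CudaStream(defaultPriority))
        {
            // GetPriority reports the priority each stream actually received,
            // which may differ from the request if the device supports a narrower range.
            Console.WriteLine("urgent stream priority: " + urgent.GetPriority());
            Console.WriteLine("normal stream priority: " + normal.GetPriority());
        }
    }
}
```

    Querying the priority after creation, rather than trusting the requested value, is a cautious choice for devices that expose only a single priority level.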

    Properties


    Stream

    Gets or sets the wrapped CUstream handle.

    Declaration
    public CUstream Stream { get; set; }
    Property Value
    Type Description
    CUstream

    Methods


    AddCallback(CUstreamCallback, IntPtr, CUStreamAddCallbackFlags)

    Adds a callback to be called on the host after all currently enqueued items in the stream have completed. For each cuStreamAddCallback call, the callback will be executed exactly once. The callback will block later work in the stream until it is finished.

    The callback may be passed Success or an error code. In the event of a device error, all subsequently executed callbacks will receive an appropriate CUResult.

    Callbacks must not make any CUDA API calls. Attempting to use a CUDA API will result in ErrorNotPermitted. Callbacks must not perform any synchronization that may depend on outstanding device work or other callbacks that are not mandated to run earlier. Callbacks without a mandated order (in independent streams) execute in undefined order and may be serialized.

    This API requires compute capability 1.1 or greater. See cuDeviceGetAttribute or cuDeviceGetProperties to query compute capability. Attempting to use this API with earlier compute versions will return ErrorNotSupported.

    Declaration
    public void AddCallback(CUstreamCallback callback, IntPtr userData, CUStreamAddCallbackFlags flags)
    Parameters
    Type Name Description
    CUstreamCallback callback

    The function to call once preceding stream operations are complete

    System.IntPtr userData

    User-specified data to be passed to the callback function. Use GCHandle.Alloc to pin a managed object.

    CUStreamAddCallbackFlags flags

    Callback flags (must be CUStreamAddCallbackFlags.None)
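
    Passing managed state through userData can be sketched as below. This is a hedged illustration only: it assumes the CUstreamCallback delegate takes (CUstream, CUResult, IntPtr) and returns void, mirroring the native cuStreamAddCallback callback, and that Synchronize returns only after the callback has run. The Job class and the method names are hypothetical. The sketch uses a normal GCHandle round-tripped through GCHandle.ToIntPtr and GCHandle.FromIntPtr rather than a pinned handle, because the callback only needs to find the object again, not read its raw memory.

```csharp
using System;
using System.Runtime.InteropServices;
using ManagedCuda;
using ManagedCuda.BasicTypes;

class StreamCallbackSketch
{
    // Hypothetical state object handed to the callback.
    class Job
    {
        public string Name;
        public volatile bool Done;
    }

    // Assumed delegate shape: (CUstream, CUResult, IntPtr) -> void.
    static void OnStreamDone(CUstream hStream, CUResult status, IntPtr userData)
    {
        // Recover the managed object. No CUDA API calls are made in here.
        var job = (Job)GCHandle.FromIntPtr(userData).Target;
        job.Done = (status == CUResult.Success);
    }

    static void Main()
    {
        using (var ctx = new CudaContext(0))
        using (var stream = new CudaStream())
        {
            var job = new Job { Name = "demo" };

            // Hold a reference to the delegate so it is not collected
            // while native code may still invoke it.
            CUstreamCallback callback = OnStreamDone;
            GCHandle handle = GCHandle.Alloc(job);
            try
            {
                stream.AddCallback(callback, GCHandle.ToIntPtr(handle), CUStreamAddCallbackFlags.None);
                stream.Synchronize();
                Console.WriteLine(job.Name + " finished: " + job.Done);
            }
            finally
            {
                handle.Free();
                GC.KeepAlive(callback);
            }
        }
    }
}
```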


    AddCallbackToNullStream(CUstreamCallback, IntPtr, CUStreamAddCallbackFlags)

    Operates on the NULL stream instead of a CudaStream instance.

    Adds a callback to be called on the host after all currently enqueued items in the stream have completed. For each cuStreamAddCallback call, the callback will be executed exactly once. The callback will block later work in the stream until it is finished.

    The callback may be passed Success or an error code. In the event of a device error, all subsequently executed callbacks will receive an appropriate CUResult.

    Callbacks must not make any CUDA API calls. Attempting to use a CUDA API will result in ErrorNotPermitted. Callbacks must not perform any synchronization that may depend on outstanding device work or other callbacks that are not mandated to run earlier. Callbacks without a mandated order (in independent streams) execute in undefined order and may be serialized.

    This API requires compute capability 1.1 or greater. See cuDeviceGetAttribute or cuDeviceGetProperties to query compute capability. Attempting to use this API with earlier compute versions will return ErrorNotSupported.

    Declaration
    public static void AddCallbackToNullStream(CUstreamCallback callback, IntPtr userData, CUStreamAddCallbackFlags flags)
    Parameters
    Type Name Description
    CUstreamCallback callback

    The function to call once preceding stream operations are complete

    System.IntPtr userData

    User-specified data to be passed to the callback function. Use GCHandle.Alloc to pin a managed object.

    CUStreamAddCallbackFlags flags

    Callback flags (must be CUStreamAddCallbackFlags.None)


    cuStreamGetFlags()

    Query the flags of this stream.

    Declaration
    public CUStreamFlags cuStreamGetFlags()
    Returns
    Type Description
    CUStreamFlags

    the stream's flags

    The value returned in flags is a logical 'OR' of all flags that were used while creating this stream.


    Dispose()

    Dispose

    Declaration
    public void Dispose()

    Dispose(Boolean)

    For IDisposable

    Declaration
    protected virtual void Dispose(bool fDisposing)
    Parameters
    Type Name Description
    System.Boolean fDisposing

    Finalize()

    For dispose

    Declaration
    protected void Finalize()

    GetPriority()

    Query the priority of this stream

    Declaration
    public int GetPriority()
    Returns
    Type Description
    System.Int32

    the stream's priority


    Query()

    Returns true if all operations in the stream have completed, or false if not.

    Declaration
    public bool Query()
    Returns
    Type Description
    System.Boolean

    Synchronize()

    Waits until the device has completed all operations in the stream. If the context was created with the BlockingSync flag, the CPU thread will block until the stream is finished with all of its tasks.

    Declaration
    public void Synchronize()
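
    The difference between the non-blocking Query() and the blocking Synchronize() can be sketched as follows. This is a minimal illustration that assumes a CUDA device with ordinal 0 and a CudaContext created beforehand. The class name is hypothetical, and the sleep stands in for real host-side work.

```csharp
using System;
using System.Threading;
using ManagedCuda;

class StreamPollingSketch
{
    static void Main()
    {
        // Assumption: a context on device 0 must exist before streams are created.
        using (var ctx = new CudaContext(0))
        using (var stream = new CudaStream())
        {
            // ... enqueue asynchronous copies or kernel launches on `stream` here ...

            // Non-blocking check: the host stays free to do other work
            // while the stream is still busy.
            while (!stream.Query())
            {
                Thread.Sleep(1); // placeholder for useful host-side work
            }

            // Blocking wait: returns once everything enqueued so far has finished.
            stream.Synchronize();
            Console.WriteLine("stream idle: " + stream.Query());
        }
    }
}
```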

    WaitEvent(CUevent)

    Make a compute stream wait on an event

    Makes all future work submitted to the stream wait until cuevent reports completion before beginning execution. This synchronization is performed efficiently on the device.

    The stream waits only for the completion of the most recent host call to Record() on cuevent. Once this call has returned, any functions (including Record() and Dispose()) may be called on cuevent again, and the subsequent calls will not have any effect on this stream.

    If the wrapped stream is the NULL stream (handle 0), any future work submitted in any stream will wait for cuevent to complete before beginning execution. This effectively creates a barrier for all future work submitted to the context.

    If Record() has not been called on cuevent, this call acts as if the record has already completed, and so is a functional no-op.

    Declaration
    public void WaitEvent(CUevent cuevent)
    Parameters
    Type Name Description
    CUevent cuevent
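
    A producer/consumer dependency between two streams can be sketched as below. This is a hedged illustration: it assumes ManagedCuda's CudaEvent wrapper exposes a Record(CUstream) method and an Event property returning the native CUevent, neither of which is documented on this page. The class and variable names are hypothetical.

```csharp
using System;
using ManagedCuda;

class WaitEventSketch
{
    static void Main()
    {
        using (var ctx = new CudaContext(0))
        using (var producer = new CudaStream())
        using (var consumer = new CudaStream())
        using (var ready = new CudaEvent()) // assumed wrapper around CUevent
        {
            // ... enqueue producer work on `producer` here ...

            // Mark the point in the producer stream that the consumer depends on.
            ready.Record(producer.Stream);

            // Work submitted to `consumer` after this call waits on the device
            // until the recorded point has been reached.
            consumer.WaitEvent(ready.Event);

            // ... enqueue consumer work on `consumer` here ...

            consumer.Synchronize();
        }
    }
}
```

    Because the wait happens on the device, the host thread is not blocked between Record and WaitEvent. Only the final Synchronize blocks.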

    WaitValue(CUdeviceptr, UInt32, CUstreamWaitValue_flags)

    Wait on a memory location

    Enqueues a synchronization of the stream on the given memory location. Work ordered after the operation will block until the given condition on the memory is satisfied. By default, the condition is to wait for (int32_t)(*addr - value) >= 0, a cyclic greater-or-equal.

    Other condition types can be specified via flags.

    If the memory was registered via cuMemHostRegister(), the device pointer should be obtained with cuMemHostGetDevicePointer(). This function cannot be used with managed memory (cuMemAllocManaged).

    Support for this can be queried with cuDeviceGetAttribute() and CU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_MEM_OPS. The only requirement for basic support is that on Windows, a device must be in TCC mode.

    Declaration
    public void WaitValue(CUdeviceptr addr, uint value, CUstreamWaitValue_flags flags)
    Parameters
    Type Name Description
    CUdeviceptr addr

    The memory location to wait on.

    System.UInt32 value

    The value to compare with the memory location.

    CUstreamWaitValue_flags flags

    See CUstreamWaitValue_flags.
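
    The default condition, (int32_t)(*addr - value) >= 0, is a wraparound-aware comparison: it stays true after the counter in memory overflows past the target. The following host-side model shows the arithmetic only. It does not call the CUDA API, and the names CyclicCompare and IsSatisfied are hypothetical.

```csharp
using System;

static class CyclicCompare
{
    // Host-side model of the default wait condition: (int32_t)(*addr - value) >= 0.
    public static bool IsSatisfied(uint current, uint target)
    {
        // Subtract with wraparound, then reinterpret the difference as signed.
        return unchecked((int)(current - target)) >= 0;
    }

    static void Main()
    {
        Console.WriteLine(IsSatisfied(5u, 5u));          // equal to the target
        Console.WriteLine(IsSatisfied(6u, 5u));          // ahead of the target
        Console.WriteLine(IsSatisfied(4u, 5u));          // behind the target
        Console.WriteLine(IsSatisfied(2u, 0xFFFFFFFEu)); // wrapped past the target
    }
}
```

    The last call is the interesting case: the counter wrapped from 0xFFFFFFFE around to 2, a plain unsigned comparison would report it as behind, and the cyclic form treats it as four steps ahead.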


    WaitValue(CUdeviceptr, UInt64, CUstreamWaitValue_flags)

    Wait on a memory location

    Enqueues a synchronization of the stream on the given memory location. Work ordered after the operation will block until the given condition on the memory is satisfied. By default, the condition is to wait for (int64_t)(*addr - value) >= 0, a cyclic greater-or-equal.

    Other condition types can be specified via flags.

    If the memory was registered via cuMemHostRegister(), the device pointer should be obtained with cuMemHostGetDevicePointer(). This function cannot be used with managed memory (cuMemAllocManaged).

    Support for this can be queried with cuDeviceGetAttribute() and CU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_MEM_OPS. The requirements are compute capability 7.0 or greater, and on Windows, that the device be in TCC mode.

    Declaration
    public void WaitValue(CUdeviceptr addr, ulong value, CUstreamWaitValue_flags flags)
    Parameters
    Type Name Description
    CUdeviceptr addr

    The memory location to wait on.

    System.UInt64 value

    The value to compare with the memory location.

    CUstreamWaitValue_flags flags

    See CUstreamWaitValue_flags.


    WriteValue(CUdeviceptr, UInt32, CUstreamWriteValue_flags)

    Write a value to memory

    Write a value to memory. Unless the CU_STREAM_WRITE_VALUE_NO_MEMORY_BARRIER flag is passed, the write is preceded by a system-wide memory fence, equivalent to a __threadfence_system() but scoped to the stream rather than a CUDA thread.

    If the memory was registered via cuMemHostRegister(), the device pointer should be obtained with cuMemHostGetDevicePointer(). This function cannot be used with managed memory (cuMemAllocManaged).

    Support for this can be queried with cuDeviceGetAttribute() and CU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_MEM_OPS. The only requirement for basic support is that on Windows, a device must be in TCC mode.

    Declaration
    public void WriteValue(CUdeviceptr addr, uint value, CUstreamWriteValue_flags flags)
    Parameters
    Type Name Description
    CUdeviceptr addr

    The device address to write to.

    System.UInt32 value

    The value to write.

    CUstreamWriteValue_flags flags

    See CUstreamWriteValue_flags.


    WriteValue(CUdeviceptr, UInt64, CUstreamWriteValue_flags)

    Write a value to memory

    Write a value to memory. Unless the CU_STREAM_WRITE_VALUE_NO_MEMORY_BARRIER flag is passed, the write is preceded by a system-wide memory fence, equivalent to a __threadfence_system() but scoped to the stream rather than a CUDA thread.

    If the memory was registered via cuMemHostRegister(), the device pointer should be obtained with cuMemHostGetDevicePointer(). This function cannot be used with managed memory (cuMemAllocManaged).

    Support for this can be queried with cuDeviceGetAttribute() and CU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_MEM_OPS. The requirements are compute capability 7.0 or greater, and on Windows, that the device be in TCC mode.

    Declaration
    public void WriteValue(CUdeviceptr addr, ulong value, CUstreamWriteValue_flags flags)
    Parameters
    Type Name Description
    CUdeviceptr addr

    The device address to write to.

    System.UInt64 value

    The value to write.

    CUstreamWriteValue_flags flags

    See CUstreamWriteValue_flags.

    Implements

    System.IDisposable
    Generated by DocFX