Class CudaStream

Wraps a CUstream handle. For the so-called NULL stream, use the native CUstream struct instead.

Inheritance

System.Object

CudaStream

Implements

System.IDisposable

Inherited Members

System.Object.Equals(System.Object)

System.Object.Equals(System.Object, System.Object)

System.Object.GetHashCode()

System.Object.GetType()

System.Object.MemberwiseClone()

System.Object.ReferenceEquals(System.Object, System.Object)

System.Object.ToString()

Namespace: ManagedCuda

Assembly: ManagedCuda.dll

Syntax

public class CudaStream : IDisposable

Constructors

CudaStream()

Creates a new stream using CUStreamFlags.None.

Declaration

public CudaStream()

CudaStream(CUstream)

Creates a new wrapper for an existing stream

Declaration

public CudaStream(CUstream _stream)

Parameters

Type	Name	Description
CUstream	_stream

CudaStream(CUStreamFlags)

Creates a new Stream

Declaration

public CudaStream(CUStreamFlags flags)

Parameters

Type	Name	Description
CUStreamFlags	flags	Parameters for stream creation (must be None)

CudaStream(Int32)

Creates a new stream using CUStreamFlags.None and the given priority.

This API alters the scheduler priority of work in the stream. Work in a higher priority stream may preempt work already executing in a low priority stream.

priority follows a convention where lower numbers represent higher priorities.

'0' represents default priority.

Declaration

public CudaStream(int priority)

Parameters

Type	Name	Description
System.Int32	priority	Stream priority. Lower numbers represent higher priorities.

CudaStream(Int32, CUStreamFlags)

Creates a new stream with the given flags and priority.

This API alters the scheduler priority of work in the stream. Work in a higher priority stream may preempt work already executing in a low priority stream.

priority follows a convention where lower numbers represent higher priorities.

'0' represents default priority.

Declaration

public CudaStream(int priority, CUStreamFlags flags)

Parameters

Type	Name	Description
System.Int32	priority	Stream priority. Lower numbers represent higher priorities.
CUStreamFlags	flags	Parameters for stream creation (must be None)

Properties

Stream

Returns the wrapped CUstream handle.

Declaration

public CUstream Stream { get; set; }

Property Value

Type	Description
CUstream

Methods

AddCallback(CUstreamCallback, IntPtr, CUStreamAddCallbackFlags)

Adds a callback to be called on the host after all currently enqueued items in the stream have completed. For each cuStreamAddCallback call, the callback will be executed exactly once. The callback will block later work in the stream until it is finished.

The callback may be passed Success or an error code. In the event of a device error, all subsequently executed callbacks will receive an appropriate CUResult.

Callbacks must not make any CUDA API calls. Attempting to use a CUDA API will result in ErrorNotPermitted. Callbacks must not perform any synchronization that may depend on outstanding device work or other callbacks that are not mandated to run earlier. Callbacks without a mandated order (in independent streams) execute in undefined order and may be serialized.

This API requires compute capability 1.1 or greater. See cuDeviceGetAttribute or cuDeviceGetProperties to query compute capability. Attempting to use this API with earlier compute versions will return ErrorNotSupported.

Declaration

public void AddCallback(CUstreamCallback callback, IntPtr userData, CUStreamAddCallbackFlags flags)

Parameters

Type	Name	Description
CUstreamCallback	callback	The function to call once preceding stream operations are complete
System.IntPtr	userData	User-specified data to be passed to the callback function. Use GCHandle.Alloc to pin a managed object
CUStreamAddCallbackFlags	flags	Callback flags (must be CUStreamAddCallbackFlags.None)
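
Because userData is an IntPtr, a managed object has to travel through a GC handle. The following is a minimal sketch of that round trip, assuming the "GCAlloc" hint above refers to .NET's GCHandle.Alloc. CallbackState, UserDataHandle, ToUserData and FromUserData are hypothetical names introduced for illustration; they are not part of ManagedCuda.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical state object that a stream callback wants to reach.
public sealed class CallbackState
{
    public string Label = "";
    public bool Completed;
}

public static class UserDataHandle
{
    // Wrap a managed object in a GC handle and expose the handle as an IntPtr
    // that can be passed as the userData argument. A normal handle keeps the
    // object alive; GCHandleType.Pinned is only needed when native code reads
    // the object's memory directly.
    public static IntPtr ToUserData(object state, out GCHandle handle)
    {
        handle = GCHandle.Alloc(state);
        return GCHandle.ToIntPtr(handle);
    }

    // Inside the callback: recover the managed object from userData.
    public static T FromUserData<T>(IntPtr userData) where T : class
    {
        object target = GCHandle.FromIntPtr(userData).Target;
        return target as T
            ?? throw new InvalidOperationException("userData does not reference the expected type.");
    }
}
```

The IntPtr returned by ToUserData would be passed as userData to AddCallback, and the callback would call FromUserData to get the object back. The handle should be released with handle.Free() only after the callback has run, otherwise the callback may see an invalid handle.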

AddCallbackToNullStream(CUstreamCallback, IntPtr, CUStreamAddCallbackFlags)

Adds a callback to the NULL stream. See AddCallback for the callback semantics and restrictions.

The callback may be passed Success or an error code. In the event of a device error, all subsequently executed callbacks will receive an appropriate CUResult.

Declaration

public static void AddCallbackToNullStream(CUstreamCallback callback, IntPtr userData, CUStreamAddCallbackFlags flags)

Parameters

Type	Name	Description
CUstreamCallback	callback	The function to call once preceding stream operations are complete
System.IntPtr	userData	User-specified data to be passed to the callback function. Use GCHandle.Alloc to pin a managed object
CUStreamAddCallbackFlags	flags	Callback flags (must be CUStreamAddCallbackFlags.None)

cuStreamGetFlags()

Query the flags of this stream.

Declaration

public CUStreamFlags cuStreamGetFlags()

Returns

Type	Description
CUStreamFlags	The stream's flags: a bitwise OR of all flags that were used while creating this stream.

Dispose()

Dispose

Declaration

public void Dispose()

Dispose(Boolean)

For IDisposable

Declaration

protected virtual void Dispose(bool fDisposing)

Parameters

Type	Name	Description
System.Boolean	fDisposing

Finalize()

For dispose

Declaration

protected void Finalize()

GetPriority()

Query the priority of this stream

Declaration

public int GetPriority()

Returns

Type	Description
System.Int32	the stream's priority

Query()

Returns true if all operations in the stream have completed, or false if not.

Declaration

public bool Query()

Returns

Type	Description
System.Boolean

Synchronize()

Waits until the device has completed all operations in the stream. If the context was created with the BlockingSync flag, the CPU thread will block until the stream is finished with all of its tasks.

Declaration

public void Synchronize()
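
As an illustration of how Query() and Synchronize() fit together with Dispose(), here is a minimal sketch. It assumes a CUDA-capable device is present and that ManagedCuda's CudaContext can be created with its parameterless constructor; StreamLifecycle and RunOnce are hypothetical names used only for this example.

```csharp
using System;
using ManagedCuda;

public static class StreamLifecycle
{
    // Creates a stream, polls it once, waits for it, and reports the final state.
    public static bool RunOnce()
    {
        // Assumption: a context must exist before a stream can be created.
        using (var context = new CudaContext())
        using (var stream = new CudaStream())
        {
            // ... asynchronous work would be enqueued on stream.Stream here ...

            // Non-blocking check: true only if everything enqueued so far is done.
            bool finishedEarly = stream.Query();
            Console.WriteLine("Finished before Synchronize: " + finishedEarly);

            // Blocking wait for all operations in the stream.
            stream.Synchronize();

            // After Synchronize() returns, Query() is expected to report completion.
            return stream.Query();
        } // Dispose() is called on the stream, then on the context.
    }
}
```

The using statements ensure that Dispose() runs even if an exception is thrown while work is being enqueued.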

WaitEvent(CUevent)

Make a compute stream wait on an event

Makes all future work submitted to the stream wait until cuevent reports completion before beginning execution. This synchronization will be performed efficiently on the device.

The stream will wait only for the completion of the most recent host call to Record() on cuevent. Once this call has returned, any functions (including Record() and Dispose()) may be called on cuevent again, and the subsequent calls will not have any effect on this stream.

If this stream is the NULL stream, any future work submitted in any stream will wait for cuevent to complete before beginning execution. This effectively creates a barrier for all future work submitted to the context.

If Record() has not been called on cuevent, this call acts as if the record has already completed, and so is a functional no-op.

Declaration

public void WaitEvent(CUevent cuevent)

Parameters

Type	Name	Description
CUevent	cuevent

WaitValue(CUdeviceptr, UInt32, CUstreamWaitValue_flags)

Wait on a memory location

Enqueues a synchronization of the stream on the given memory location. Work ordered after the operation will block until the given condition on the memory is satisfied. By default, the condition is to wait for (int32_t)(*addr - value) >= 0, a cyclic greater-or-equal.

Other condition types can be specified via flags.

If the memory was registered via cuMemHostRegister(), the device pointer should be obtained with cuMemHostGetDevicePointer(). This function cannot be used with managed memory (cuMemAllocManaged).

Support for this can be queried with cuDeviceGetAttribute() and CU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_MEM_OPS. The only requirement for basic support is that on Windows, a device must be in TCC mode.

Declaration

public void WaitValue(CUdeviceptr addr, uint value, CUstreamWaitValue_flags flags)

Parameters

Type	Name	Description
CUdeviceptr	addr	The memory location to wait on.
System.UInt32	value	The value to compare with the memory location.
CUstreamWaitValue_flags	flags	See CUstreamWaitValue_flags.
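
The default condition, (int32_t)(*addr - value) >= 0, is a wraparound-aware comparison rather than a plain >=. The sketch below reproduces that arithmetic on the host so the behavior near the 32-bit boundary is easier to see. CyclicCompare and IsSatisfied are hypothetical names for illustration; the code does not call into CUDA.

```csharp
using System;

public static class CyclicCompare
{
    // Host-side mirror of the documented default wait condition:
    // (int32_t)(*addr - value) >= 0.
    // The subtraction wraps modulo 2^32, and the result is then
    // reinterpreted as a signed 32-bit integer.
    public static bool IsSatisfied(uint current, uint target)
    {
        return unchecked((int)(current - target)) >= 0;
    }
}
```

With this definition, IsSatisfied(5, 5) and IsSatisfied(6, 5) are true and IsSatisfied(4, 5) is false, as with an ordinary comparison. The difference shows up across the wrap: IsSatisfied(2, 0xFFFFFFFE) is true because a counter that has wrapped from 0xFFFFFFFE to 2 counts as being ahead, while IsSatisfied(0xFFFFFFFE, 2) is false.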

WaitValue(CUdeviceptr, UInt64, CUstreamWaitValue_flags)

Wait on a memory location

Other condition types can be specified via flags.

If the memory was registered via cuMemHostRegister(), the device pointer should be obtained with cuMemHostGetDevicePointer(). This function cannot be used with managed memory (cuMemAllocManaged).

Support for this can be queried with cuDeviceGetAttribute() and CU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_MEM_OPS. The requirements are compute capability 7.0 or greater, and on Windows, that the device be in TCC mode.

Declaration

public void WaitValue(CUdeviceptr addr, ulong value, CUstreamWaitValue_flags flags)

Parameters

Type	Name	Description
CUdeviceptr	addr	The memory location to wait on.
System.UInt64	value	The value to compare with the memory location.
CUstreamWaitValue_flags	flags	See CUstreamWaitValue_flags.

WriteValue(CUdeviceptr, UInt32, CUstreamWriteValue_flags)

Write a value to memory

Unless the CU_STREAM_WRITE_VALUE_NO_MEMORY_BARRIER flag is passed, the write is preceded by a system-wide memory fence, equivalent to a __threadfence_system() but scoped to the stream rather than a CUDA thread.

If the memory was registered via cuMemHostRegister(), the device pointer should be obtained with cuMemHostGetDevicePointer(). This function cannot be used with managed memory (cuMemAllocManaged).

Declaration

public void WriteValue(CUdeviceptr addr, uint value, CUstreamWriteValue_flags flags)

Parameters

Type	Name	Description
CUdeviceptr	addr	The device address to write to.
System.UInt32	value	The value to write.
CUstreamWriteValue_flags	flags	See CUstreamWriteValue_flags.

WriteValue(CUdeviceptr, UInt64, CUstreamWriteValue_flags)

Write a value to memory

If the memory was registered via cuMemHostRegister(), the device pointer should be obtained with cuMemHostGetDevicePointer(). This function cannot be used with managed memory (cuMemAllocManaged).

Declaration

public void WriteValue(CUdeviceptr addr, ulong value, CUstreamWriteValue_flags flags)

Parameters

Type	Name	Description
CUdeviceptr	addr	The device address to write to.
System.UInt64	value	The value to write.
CUstreamWriteValue_flags	flags	See CUstreamWriteValue_flags.
