Class DriverAPINativeMethods.FunctionManagement
Combines all function / kernel API calls
Namespace: ManagedCuda
Assembly: ManagedCuda.dll
Syntax
public static class FunctionManagement
Methods
cuFuncGetAttribute(ref Int32, CUFunctionAttribute, CUfunction)
Returns in pi the integer value of the attribute attrib on the kernel given by hfunc. See CUFunctionAttribute.
Declaration
public static CUResult cuFuncGetAttribute(ref int pi, CUFunctionAttribute attrib, CUfunction hfunc)
Parameters
Type | Name | Description |
---|---|---|
System.Int32 | pi | Returned attribute value |
CUFunctionAttribute | attrib | Attribute requested |
CUfunction | hfunc | Function to query attribute of |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidHandle, ErrorInvalidValue. |
cuFuncSetAttribute(CUfunction, CUFunctionAttribute, Int32)
Sets information about a function.
This call sets the value of the attribute attrib on the kernel given by hfunc to the integer value given by value.
This function returns CUDA_SUCCESS if the new value of the attribute was set successfully. If the set fails, the call returns an error.
Not all attributes can be set. Attempting to set a value on a read-only attribute results in an error (CUDA_ERROR_INVALID_VALUE).
Supported attributes for the cuFuncSetAttribute call are:
- CU_FUNC_ATTRIBUTE_MAX_DYNAMIC_SHARED_SIZE_BYTES: The maximum size in bytes of dynamically allocated shared memory. The value should contain the requested maximum size of dynamically allocated shared memory. The sum of this value and the function attribute CU_FUNC_ATTRIBUTE_SHARED_SIZE_BYTES cannot exceed the device attribute CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_BLOCK_OPTIN. The maximum size of requestable dynamic shared memory may differ by GPU architecture.
- CU_FUNC_ATTRIBUTE_PREFERRED_SHARED_MEMORY_CARVEOUT: On devices where the L1 cache and shared memory use the same hardware resources, this sets the shared memory carveout preference, in percent of the total resources. This is only a hint, and the driver can choose a different ratio if required to execute the function.
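The size constraint on the dynamic shared memory attribute can be checked on the host before calling cuFuncSetAttribute. The following is a minimal sketch only: SharedMemoryAttributeCheck and FitsOptInLimit are hypothetical names that are not part of ManagedCuda, and the three inputs are assumed to have been queried beforehand (the static size from the function attribute, the limit from the device attribute).

```csharp
using System;

// Hypothetical helper, not part of ManagedCuda. It only restates the rule
// "static + requested dynamic shared memory <= device opt-in limit".
public static class SharedMemoryAttributeCheck
{
    // Returns true when the requested dynamic size, added to the kernel's
    // static shared memory, stays within the opt-in per-block limit.
    public static bool FitsOptInLimit(int staticSharedBytes, int requestedDynamicBytes, int optInLimitBytes)
    {
        // Negative sizes are never meaningful here.
        if (staticSharedBytes < 0 || requestedDynamicBytes < 0 || optInLimitBytes < 0)
            return false;

        // Widen to long so the sum cannot overflow Int32.
        return (long)staticSharedBytes + requestedDynamicBytes <= optInLimitBytes;
    }
}
```

A caller might run this check first and pass the requested size to cuFuncSetAttribute only when it returns true. The driver still performs its own validation, so the returned CUResult should be checked either way.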
Declaration
public static CUResult cuFuncSetAttribute(CUfunction hfunc, CUFunctionAttribute attrib, int value)
Parameters
Type | Name | Description |
---|---|---|
CUfunction | hfunc | Function to set attribute of |
CUFunctionAttribute | attrib | Attribute to set |
System.Int32 | value | The value to set |
Returns
Type | Description |
---|---|
CUResult |
cuFuncSetBlockShape(CUfunction, Int32, Int32, Int32)
Specifies the x, y, and z dimensions of the thread blocks that are created when the kernel given by hfunc is launched.
Declaration
[Obsolete("Don't use this CUDA API call with CUDA version >= 4.0.")]
public static CUResult cuFuncSetBlockShape(CUfunction hfunc, int x, int y, int z)
Parameters
Type | Name | Description |
---|---|---|
CUfunction | hfunc | Kernel to specify dimensions of |
System.Int32 | x | X dimension |
System.Int32 | y | Y dimension |
System.Int32 | z | Z dimension |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidHandle, ErrorInvalidValue. |
cuFuncSetCacheConfig(CUfunction, CUFuncCache)
On devices where the L1 cache and shared memory use the same hardware resources, this sets through config the preferred cache configuration for the device function hfunc. This is only a preference. The driver will use the requested configuration if possible, but it is free to choose a different configuration if required to execute hfunc.
This setting does nothing on devices where the sizes of the L1 cache and shared memory are fixed.
Switching between configuration modes may insert a device-side synchronization point for streamed kernel launches.
The supported cache modes are defined in CUFuncCache.
Declaration
public static CUResult cuFuncSetCacheConfig(CUfunction hfunc, CUFuncCache config)
Parameters
Type | Name | Description |
---|---|---|
CUfunction | hfunc | Kernel to configure cache for |
CUFuncCache | config | Requested cache configuration |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext. |
cuFuncSetSharedMemConfig(CUfunction, CUsharedconfig)
Sets the shared memory configuration for a device function.
On devices with configurable shared memory banks, this function forces all subsequent launches of the specified device function to use the given shared memory bank size configuration. On any given launch of the function, the shared memory configuration of the device is temporarily changed if needed to suit the function's preferred configuration. Changes in shared memory configuration between subsequent launches of functions may introduce a device-side synchronization point.
Any per-function shared memory bank size set via cuFuncSetSharedMemConfig(CUfunction, CUsharedconfig) overrides the context-wide setting set with cuCtxSetSharedMemConfig(CUsharedconfig).
Changing the shared memory bank size will not increase shared memory usage or affect occupancy of kernels, but may have major effects on performance. Larger bank sizes will allow for greater potential bandwidth to shared memory, but will change what kinds of accesses to shared memory will result in bank conflicts.
This function will do nothing on devices with fixed shared memory bank size.
The supported bank configurations are:
- DefaultBankSize: set bank width to the default initial setting (currently, four bytes).
- FourByteBankSize: set shared memory bank width to be natively four bytes.
- EightByteBankSize: set shared memory bank width to be natively eight bytes.
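To illustrate why the bank width changes which accesses conflict, the sketch below maps a byte offset in shared memory to a bank index. It is a simplified model and not driver behavior: SharedMemoryBanks and BankOf are hypothetical names, and the model assumes 32 banks with successive bank-width words assigned to successive banks. Real hardware has further rules (for example broadcast of identical addresses) that are not modeled here.

```csharp
// Hypothetical illustration, not part of ManagedCuda.
public static class SharedMemoryBanks
{
    // Assumption for this sketch: 32 banks.
    public const int AssumedBankCount = 32;

    // Maps a byte offset to a bank index for the given bank width (4 or 8 bytes),
    // assuming consecutive bank-width words live in consecutive banks.
    public static int BankOf(long byteOffset, int bankWidthBytes)
    {
        return (int)((byteOffset / bankWidthBytes) % AssumedBankCount);
    }
}
```

Under this model, byte offsets 0 and 128 fall into the same bank with a four-byte width (both map to bank 0) but into different banks with an eight-byte width (banks 0 and 16). Offsets 0 and 256 show the opposite trade-off for the eight-byte width, where both map to bank 0.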
Declaration
public static CUResult cuFuncSetSharedMemConfig(CUfunction hfunc, CUsharedconfig config)
Parameters
Type | Name | Description |
---|---|---|
CUfunction | hfunc | Kernel to be given a shared memory config |
CUsharedconfig | config | Requested shared memory configuration |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorInvalidValue, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext. |
cuFuncSetSharedSize(CUfunction, UInt32)
Sets through bytes the amount of dynamic shared memory that will be available to each thread block when the kernel given by hfunc is launched.
Declaration
[Obsolete("Don't use this CUDA API call with CUDA version >= 4.0.")]
public static CUResult cuFuncSetSharedSize(CUfunction hfunc, uint bytes)
Parameters
Type | Name | Description |
---|---|---|
CUfunction | hfunc | Kernel to specify dynamic shared-memory size for |
System.UInt32 | bytes | Dynamic shared-memory size per thread block in bytes |
Returns
Type | Description |
---|---|
CUResult | CUDA Error Codes: Success, ErrorDeinitialized, ErrorNotInitialized, ErrorInvalidContext, ErrorInvalidHandle, ErrorInvalidValue. |