Class CudaDNNContext

An opaque structure holding the cuDNN library context.

The cuDNN library context must be created using cudnnCreate(), and the returned handle must be passed to all subsequent library function calls. The context should be destroyed at the end using cudnnDestroy(). The context is associated with only one GPU device: the current device at the time of the call to cudnnCreate(). However, multiple contexts can be created on the same GPU device.

Inheritance

System.Object

CudaDNNContext

Implements

System.IDisposable

Inherited Members

System.Object.Equals(System.Object)

System.Object.Equals(System.Object, System.Object)

System.Object.GetHashCode()

System.Object.GetType()

System.Object.MemberwiseClone()

System.Object.ReferenceEquals(System.Object, System.Object)

System.Object.ToString()

Namespace: ManagedCuda.CudaDNN

Assembly: CudaDNN.dll

Syntax

public class CudaDNNContext : IDisposable

Constructors

CudaDNNContext()

Declaration

public CudaDNNContext()

Properties

Handle

Returns the inner handle.

Declaration

public cudnnHandle Handle { get; }

Property Value

Type	Description
cudnnHandle

Methods

ActivationBackward(ActivationDescriptor, Double, TensorDescriptor, CudaDeviceVariable<Double>, TensorDescriptor, CudaDeviceVariable<Double>, TensorDescriptor, CudaDeviceVariable<Double>, Double, TensorDescriptor, CudaDeviceVariable<Double>)

This routine computes the gradient of a neuron activation function.

Declaration

public void ActivationBackward(ActivationDescriptor activationDesc, double alpha, TensorDescriptor yDesc, CudaDeviceVariable<double> y, TensorDescriptor dyDesc, CudaDeviceVariable<double> dy, TensorDescriptor xDesc, CudaDeviceVariable<double> x, double beta, TensorDescriptor dxDesc, CudaDeviceVariable<double> dx)

Parameters

Type	Name	Description
ActivationDescriptor	activationDesc	Handle to the previously created activation descriptor object.
System.Double	alpha	Host-side scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	yDesc	Handle to the previously initialized input tensor descriptor.
CudaDeviceVariable<System.Double>	y	Data pointer to GPU memory associated with the tensor descriptor yDesc.
TensorDescriptor	dyDesc	Handle to the previously initialized input differential tensor descriptor.
CudaDeviceVariable<System.Double>	dy	Data pointer to GPU memory associated with the tensor descriptor dyDesc.
TensorDescriptor	xDesc	Handle to the previously initialized output tensor descriptor.
CudaDeviceVariable<System.Double>	x	Data pointer to GPU memory associated with the output tensor descriptor xDesc.
System.Double	beta	Host-side scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	dxDesc	Handle to the previously initialized output differential tensor descriptor.
CudaDeviceVariable<System.Double>	dx	Data pointer to GPU memory associated with the output differential tensor descriptor dxDesc.

ActivationBackward(ActivationDescriptor, Single, TensorDescriptor, CudaDeviceVariable<Single>, TensorDescriptor, CudaDeviceVariable<Single>, TensorDescriptor, CudaDeviceVariable<Single>, Single, TensorDescriptor, CudaDeviceVariable<Single>)

This routine computes the gradient of a neuron activation function.

Declaration

public void ActivationBackward(ActivationDescriptor activationDesc, float alpha, TensorDescriptor yDesc, CudaDeviceVariable<float> y, TensorDescriptor dyDesc, CudaDeviceVariable<float> dy, TensorDescriptor xDesc, CudaDeviceVariable<float> x, float beta, TensorDescriptor dxDesc, CudaDeviceVariable<float> dx)

Parameters

Type	Name	Description
ActivationDescriptor	activationDesc	Handle to the previously created activation descriptor object.
System.Single	alpha	Host-side scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	yDesc	Handle to the previously initialized input tensor descriptor.
CudaDeviceVariable<System.Single>	y	Data pointer to GPU memory associated with the tensor descriptor yDesc.
TensorDescriptor	dyDesc	Handle to the previously initialized input differential tensor descriptor.
CudaDeviceVariable<System.Single>	dy	Data pointer to GPU memory associated with the tensor descriptor dyDesc.
TensorDescriptor	xDesc	Handle to the previously initialized output tensor descriptor.
CudaDeviceVariable<System.Single>	x	Data pointer to GPU memory associated with the output tensor descriptor xDesc.
System.Single	beta	Host-side scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	dxDesc	Handle to the previously initialized output differential tensor descriptor.
CudaDeviceVariable<System.Single>	dx	Data pointer to GPU memory associated with the output differential tensor descriptor dxDesc.

ActivationForward(ActivationDescriptor, Double, TensorDescriptor, CudaDeviceVariable<Double>, Double, TensorDescriptor, CudaDeviceVariable<Double>)

This routine applies a specified neuron activation function element-wise over each input value.

Declaration

public void ActivationForward(ActivationDescriptor activationDesc, double alpha, TensorDescriptor xDesc, CudaDeviceVariable<double> x, double beta, TensorDescriptor yDesc, CudaDeviceVariable<double> y)

Parameters

Type	Name	Description
ActivationDescriptor	activationDesc	Handle to the previously created activation descriptor object.
System.Double	alpha	Host-side scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	xDesc	Handle to the previously initialized input tensor descriptor.
CudaDeviceVariable<System.Double>	x	Data pointer to GPU memory associated with the tensor descriptor xDesc.
System.Double	beta	Host-side scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	yDesc	Handle to the previously initialized output tensor descriptor.
CudaDeviceVariable<System.Double>	y	Data pointer to GPU memory associated with the output tensor descriptor yDesc.
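
The alpha/beta blending described in the parameter table can be illustrated on the host. The following is a minimal sketch only: BlendSketch and ReluForward are hypothetical names, plain arrays stand in for device memory, and ReLU is assumed as the activation function. It does not call ManagedCuda or cuDNN.

```csharp
using System;

public static class BlendSketch
{
    // Host-side reference for: dstValue = alpha * result + beta * priorDstValue,
    // where result is ReLU applied element-wise to x.
    // Hypothetical helper for illustration; not part of ManagedCuda.
    public static void ReluForward(double alpha, double[] x, double beta, double[] y)
    {
        if (x.Length != y.Length)
            throw new ArgumentException("x and y must have the same length.");

        for (int i = 0; i < x.Length; i++)
        {
            double result = Math.Max(0.0, x[i]);   // activation output
            y[i] = alpha * result + beta * y[i];   // blend with the prior value of y
        }
    }
}
```

With alpha = 1 and beta = 0 the prior contents of y are simply overwritten; a non-zero beta accumulates into the values already stored in y.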

ActivationForward(ActivationDescriptor, Single, TensorDescriptor, CudaDeviceVariable<Single>, Single, TensorDescriptor, CudaDeviceVariable<Single>)

This routine applies a specified neuron activation function element-wise over each input value.

Declaration

public void ActivationForward(ActivationDescriptor activationDesc, float alpha, TensorDescriptor xDesc, CudaDeviceVariable<float> x, float beta, TensorDescriptor yDesc, CudaDeviceVariable<float> y)

Parameters

Type	Name	Description
ActivationDescriptor	activationDesc	Handle to the previously created activation descriptor object.
System.Single	alpha	Host-side scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	xDesc	Handle to the previously initialized input tensor descriptor.
CudaDeviceVariable<System.Single>	x	Data pointer to GPU memory associated with the tensor descriptor xDesc.
System.Single	beta	Host-side scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	yDesc	Handle to the previously initialized output tensor descriptor.
CudaDeviceVariable<System.Single>	y	Data pointer to GPU memory associated with the output tensor descriptor yDesc.

AddTensor(Double, TensorDescriptor, CudaDeviceVariable<Double>, Double, TensorDescriptor, CudaDeviceVariable<Double>)

This function adds the scaled values of a bias tensor (a) to another tensor (c). Each dimension of the bias tensor must either match the corresponding dimension of the destination tensor c or be equal to 1. In the latter case, the same value from the bias tensor is used along those dimensions when blending into the destination tensor.

Declaration

public void AddTensor(double alpha, TensorDescriptor aDesc, CudaDeviceVariable<double> a, double beta, TensorDescriptor cDesc, CudaDeviceVariable<double> c)

Parameters

Type	Name	Description
System.Double	alpha	Host-side scaling factor used to blend the source value with the prior value in the destination tensor as follows: dstValue = alpha * srcValue + beta * priorDstValue.
TensorDescriptor	aDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	a	Pointer to data of the tensor described by the aDesc descriptor.
System.Double	beta	Host-side scaling factor used to blend the source value with the prior value in the destination tensor as follows: dstValue = alpha * srcValue + beta * priorDstValue.
TensorDescriptor	cDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	c	Pointer to data of the tensor described by the cDesc descriptor.
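
The broadcasting rule for the bias tensor can be sketched on the host as follows. This is an illustration under stated assumptions, not the library's implementation: AddTensorSketch and AddTensor2D are hypothetical names, and the tensors are reduced to two dimensions held in ordinary arrays.

```csharp
using System;

public static class AddTensorSketch
{
    // Host-side reference for c = alpha * a + beta * c, where every dimension of a
    // either matches c or is 1 (and is then broadcast along that dimension).
    // Hypothetical helper for illustration; not part of ManagedCuda.
    public static void AddTensor2D(double alpha, double[,] a, double beta, double[,] c)
    {
        int rows = c.GetLength(0), cols = c.GetLength(1);
        int aRows = a.GetLength(0), aCols = a.GetLength(1);

        if ((aRows != rows && aRows != 1) || (aCols != cols && aCols != 1))
            throw new ArgumentException("Each dimension of a must match c or be 1.");

        for (int i = 0; i < rows; i++)
        {
            for (int j = 0; j < cols; j++)
            {
                // A dimension of size 1 always reads index 0 (broadcast).
                double src = a[aRows == 1 ? 0 : i, aCols == 1 ? 0 : j];
                c[i, j] = alpha * src + beta * c[i, j];
            }
        }
    }
}
```

Passing a 1 x 2 bias together with a 2 x 2 destination, for example, adds the same bias row to both rows of c.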

AddTensor(Single, TensorDescriptor, CudaDeviceVariable<Single>, Single, TensorDescriptor, CudaDeviceVariable<Single>)

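This function adds the scaled values of a bias tensor (a) to another tensor (c). Each dimension of the bias tensor must either match the corresponding dimension of the destination tensor c or be equal to 1. In the latter case, the same value from the bias tensor is used along those dimensions when blending into the destination tensor.

Declaration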

public void AddTensor(float alpha, TensorDescriptor aDesc, CudaDeviceVariable<float> a, float beta, TensorDescriptor cDesc, CudaDeviceVariable<float> c)

Parameters

Type	Name	Description
System.Single	alpha	Host-side scaling factor used to blend the source value with the prior value in the destination tensor as follows: dstValue = alpha * srcValue + beta * priorDstValue.
TensorDescriptor	aDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	a	Pointer to data of the tensor described by the aDesc descriptor.
System.Single	beta	Host-side scaling factor used to blend the source value with the prior value in the destination tensor as follows: dstValue = alpha * srcValue + beta * priorDstValue.
TensorDescriptor	cDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	c	Pointer to data of the tensor described by the cDesc descriptor.

BatchNormalizationBackward(cudnnBatchNormMode, Double, Double, Double, Double, TensorDescriptor, CudaDeviceVariable<Double>, TensorDescriptor, CudaDeviceVariable<Double>, TensorDescriptor, CudaDeviceVariable<Double>, TensorDescriptor, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Double, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>)

This function performs the backward BatchNormalization layer computation.

Declaration

public void BatchNormalizationBackward(cudnnBatchNormMode mode, double alphaDataDiff, double betaDataDiff, double alphaParamDiff, double betaParamDiff, TensorDescriptor xDesc, CudaDeviceVariable<double> x, TensorDescriptor dyDesc, CudaDeviceVariable<double> dy, TensorDescriptor dxDesc, CudaDeviceVariable<double> dx, TensorDescriptor dBnScaleBiasDesc, CudaDeviceVariable<double> bnScale, CudaDeviceVariable<double> dBnScaleResult, CudaDeviceVariable<double> dBnBiasResult, double epsilon, CudaDeviceVariable<double> savedMean, CudaDeviceVariable<double> savedInvVariance)

Parameters

Type	Name	Description
cudnnBatchNormMode	mode	Mode of operation (spatial or per-activation).
System.Double	alphaDataDiff	Host-side scaling factor used to blend the gradient output dx with a prior value in the destination tensor as follows: dstValue = alphaDataDiff * resultValue + betaDataDiff * priorDstValue.
System.Double	betaDataDiff	Host-side scaling factor used to blend the gradient output dx with a prior value in the destination tensor as follows: dstValue = alphaDataDiff * resultValue + betaDataDiff * priorDstValue.
System.Double	alphaParamDiff	Host-side scaling factor used to blend the gradient outputs dBnScaleResult and dBnBiasResult with prior values in the destination tensor as follows: dstValue = alphaParamDiff * resultValue + betaParamDiff * priorDstValue.
System.Double	betaParamDiff	Host-side scaling factor used to blend the gradient outputs dBnScaleResult and dBnBiasResult with prior values in the destination tensor as follows: dstValue = alphaParamDiff * resultValue + betaParamDiff * priorDstValue.
TensorDescriptor	xDesc	Tensor descriptor for the layer's x data.
CudaDeviceVariable<System.Double>	x	Pointer in device memory for the layer's x data.
TensorDescriptor	dyDesc	Tensor descriptor for the layer's backpropagated differential dy (input).
CudaDeviceVariable<System.Double>	dy	Pointer in device memory for the layer's backpropagated differential dy (input).
TensorDescriptor	dxDesc	Tensor descriptor for the layer's resulting differential with respect to x, dx (output).
CudaDeviceVariable<System.Double>	dx	Pointer in device memory for the layer's resulting differential with respect to x, dx (output).
TensorDescriptor	dBnScaleBiasDesc	Shared tensor descriptor for the 5 tensors below in the argument list (bnScale, dBnScaleResult, dBnBiasResult, savedMean, savedInvVariance). The dimensions of this tensor descriptor depend on the normalization mode. Note: The data type of this tensor descriptor must be 'float' for FP16 and FP32 input tensors, and 'double' for FP64 input tensors.
CudaDeviceVariable<System.Double>	bnScale	Pointer in device memory for the batch normalization scale parameter (in the original paper the scale is referred to as gamma). Note that the bnBias parameter is not needed for this layer's computation.
CudaDeviceVariable<System.Double>	dBnScaleResult	Pointer in device memory for the resulting scale differentials computed by this routine. Note that scale and bias gradients are not backpropagated below this layer (since they are dead-end computation DAG nodes).
CudaDeviceVariable<System.Double>	dBnBiasResult	Pointer in device memory for the resulting bias differentials computed by this routine. Note that scale and bias gradients are not backpropagated below this layer (since they are dead-end computation DAG nodes).
System.Double	epsilon	Epsilon value used in the batch normalization formula. Minimum allowed value is currently 1e-5. The same epsilon value should be used in the forward and backward functions.
CudaDeviceVariable<System.Double>	savedMean	Optional cache parameter containing intermediate results saved during the forward pass. For this to work correctly, the layer's x, bnScale, and bnBias data must remain unchanged until the backward function is called. Note that savedMean and savedInvVariance can be NULL, but only both at the same time. It is recommended to use this cache since the memory overhead is relatively small.
CudaDeviceVariable<System.Double>	savedInvVariance	Optional cache parameter containing intermediate results saved during the forward pass. For this to work correctly, the layer's x, bnScale, and bnBias data must remain unchanged until the backward function is called. Note that savedMean and savedInvVariance can be NULL, but only both at the same time. It is recommended to use this cache since the memory overhead is relatively small.

BatchNormalizationBackward(cudnnBatchNormMode, Single, Single, Single, Single, TensorDescriptor, CudaDeviceVariable<Single>, TensorDescriptor, CudaDeviceVariable<Single>, TensorDescriptor, CudaDeviceVariable<Single>, TensorDescriptor, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Double, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>)

This function performs the backward BatchNormalization layer computation.

Declaration

public void BatchNormalizationBackward(cudnnBatchNormMode mode, float alphaDataDiff, float betaDataDiff, float alphaParamDiff, float betaParamDiff, TensorDescriptor xDesc, CudaDeviceVariable<float> x, TensorDescriptor dyDesc, CudaDeviceVariable<float> dy, TensorDescriptor dxDesc, CudaDeviceVariable<float> dx, TensorDescriptor dBnScaleBiasDesc, CudaDeviceVariable<float> bnScale, CudaDeviceVariable<float> dBnScaleResult, CudaDeviceVariable<float> dBnBiasResult, double epsilon, CudaDeviceVariable<float> savedMean, CudaDeviceVariable<float> savedInvVariance)

Parameters

Type	Name	Description
cudnnBatchNormMode	mode	Mode of operation (spatial or per-activation).
System.Single	alphaDataDiff	Host-side scaling factor used to blend the gradient output dx with a prior value in the destination tensor as follows: dstValue = alphaDataDiff * resultValue + betaDataDiff * priorDstValue.
System.Single	betaDataDiff	Host-side scaling factor used to blend the gradient output dx with a prior value in the destination tensor as follows: dstValue = alphaDataDiff * resultValue + betaDataDiff * priorDstValue.
System.Single	alphaParamDiff	Host-side scaling factor used to blend the gradient outputs dBnScaleResult and dBnBiasResult with prior values in the destination tensor as follows: dstValue = alphaParamDiff * resultValue + betaParamDiff * priorDstValue.
System.Single	betaParamDiff	Host-side scaling factor used to blend the gradient outputs dBnScaleResult and dBnBiasResult with prior values in the destination tensor as follows: dstValue = alphaParamDiff * resultValue + betaParamDiff * priorDstValue.
TensorDescriptor	xDesc	Tensor descriptor for the layer's x data.
CudaDeviceVariable<System.Single>	x	Pointer in device memory for the layer's x data.
TensorDescriptor	dyDesc	Tensor descriptor for the layer's backpropagated differential dy (input).
CudaDeviceVariable<System.Single>	dy	Pointer in device memory for the layer's backpropagated differential dy (input).
TensorDescriptor	dxDesc	Tensor descriptor for the layer's resulting differential with respect to x, dx (output).
CudaDeviceVariable<System.Single>	dx	Pointer in device memory for the layer's resulting differential with respect to x, dx (output).
TensorDescriptor	dBnScaleBiasDesc	Shared tensor descriptor for the 5 tensors below in the argument list (bnScale, dBnScaleResult, dBnBiasResult, savedMean, savedInvVariance). The dimensions of this tensor descriptor depend on the normalization mode. Note: The data type of this tensor descriptor must be 'float' for FP16 and FP32 input tensors, and 'double' for FP64 input tensors.
CudaDeviceVariable<System.Single>	bnScale	Pointer in device memory for the batch normalization scale parameter (in the original paper the scale is referred to as gamma). Note that the bnBias parameter is not needed for this layer's computation.
CudaDeviceVariable<System.Single>	dBnScaleResult	Pointer in device memory for the resulting scale differentials computed by this routine. Note that scale and bias gradients are not backpropagated below this layer (since they are dead-end computation DAG nodes).
CudaDeviceVariable<System.Single>	dBnBiasResult	Pointer in device memory for the resulting bias differentials computed by this routine. Note that scale and bias gradients are not backpropagated below this layer (since they are dead-end computation DAG nodes).
System.Double	epsilon	Epsilon value used in the batch normalization formula. Minimum allowed value is currently 1e-5. The same epsilon value should be used in the forward and backward functions.
CudaDeviceVariable<System.Single>	savedMean	Optional cache parameter containing intermediate results saved during the forward pass. For this to work correctly, the layer's x, bnScale, and bnBias data must remain unchanged until the backward function is called. Note that savedMean and savedInvVariance can be NULL, but only both at the same time. It is recommended to use this cache since the memory overhead is relatively small.
CudaDeviceVariable<System.Single>	savedInvVariance	Optional cache parameter containing intermediate results saved during the forward pass. For this to work correctly, the layer's x, bnScale, and bnBias data must remain unchanged until the backward function is called. Note that savedMean and savedInvVariance can be NULL, but only both at the same time. It is recommended to use this cache since the memory overhead is relatively small.

BatchNormalizationForwardInference(cudnnBatchNormMode, Double, Double, TensorDescriptor, CudaDeviceVariable<Double>, TensorDescriptor, CudaDeviceVariable<Double>, TensorDescriptor, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Double)

This function performs the forward BatchNormalization layer computation for the inference phase. This layer is based on the paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift", S. Ioffe, C. Szegedy, 2015.

Declaration

public void BatchNormalizationForwardInference(cudnnBatchNormMode mode, double alpha, double beta, TensorDescriptor xDesc, CudaDeviceVariable<double> x, TensorDescriptor yDesc, CudaDeviceVariable<double> y, TensorDescriptor bnScaleBiasMeanVarDesc, CudaDeviceVariable<double> bnScale, CudaDeviceVariable<double> bnBias, CudaDeviceVariable<double> estimatedMean, CudaDeviceVariable<double> estimatedVariance, double epsilon)

Parameters

Type	Name	Description
cudnnBatchNormMode	mode	Mode of operation (spatial or per-activation).
System.Double	alpha	Host-side scaling factor used to blend the layer output value with the prior value in the destination tensor as follows: dstValue = alpha * resultValue + beta * priorDstValue.
System.Double	beta	Host-side scaling factor used to blend the layer output value with the prior value in the destination tensor as follows: dstValue = alpha * resultValue + beta * priorDstValue.
TensorDescriptor	xDesc	Tensor descriptor for the layer's x data.
CudaDeviceVariable<System.Double>	x	Pointer in device memory for the layer's x data.
TensorDescriptor	yDesc	Tensor descriptor for the layer's y data.
CudaDeviceVariable<System.Double>	y	Pointer in device memory for the layer's y data.
TensorDescriptor	bnScaleBiasMeanVarDesc	Shared tensor descriptor for the 4 tensors below in the argument list. The dimensions of this tensor descriptor depend on the normalization mode.
CudaDeviceVariable<System.Double>	bnScale	Pointer in device memory for the batch normalization scale parameters (in the original paper the scale is referred to as gamma).
CudaDeviceVariable<System.Double>	bnBias	Pointer in device memory for the batch normalization bias parameters (in the original paper the bias is referred to as beta). Note that the bnBias parameter can replace the previous layer's bias parameter for improved efficiency.
CudaDeviceVariable<System.Double>	estimatedMean	Mean tensor (has the same descriptor as the bias and scale). It is suggested that resultRunningMean from the cudnnBatchNormalizationForwardTraining call accumulated during the training phase be passed as input here.
CudaDeviceVariable<System.Double>	estimatedVariance	Variance tensor (has the same descriptor as the bias and scale). It is suggested that resultRunningVariance from the cudnnBatchNormalizationForwardTraining call accumulated during the training phase be passed as input here.
System.Double	epsilon	Epsilon value used in the batch normalization formula. Minimum allowed value is currently 1e-5. Same epsilon value should be used in forward and backward functions.
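
As a rough host-side illustration of how the parameters above combine, the sketch below applies the normalization from the cited paper, y = bnScale * (x - mean) / sqrt(epsilon + variance) + bnBias, followed by the alpha/beta blend. It is a simplification under stated assumptions: BatchNormInferenceSketch and Normalize are hypothetical names, a single channel is assumed so that the scale, bias, mean, and variance are scalars, and no ManagedCuda or cuDNN call is made.

```csharp
using System;

public static class BatchNormInferenceSketch
{
    // Host-side reference for one channel:
    //   result = bnScale * (x - estimatedMean) / sqrt(epsilon + estimatedVariance) + bnBias
    //   y      = alpha * result + beta * y
    // Hypothetical helper for illustration; not part of ManagedCuda.
    public static void Normalize(double alpha, double beta, double[] x, double[] y,
                                 double bnScale, double bnBias,
                                 double estimatedMean, double estimatedVariance,
                                 double epsilon)
    {
        if (x.Length != y.Length)
            throw new ArgumentException("x and y must have the same length.");

        double invStd = 1.0 / Math.Sqrt(epsilon + estimatedVariance);
        for (int i = 0; i < x.Length; i++)
        {
            double result = bnScale * (x[i] - estimatedMean) * invStd + bnBias;
            y[i] = alpha * result + beta * y[i];
        }
    }
}
```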

BatchNormalizationForwardInference(cudnnBatchNormMode, Single, Single, TensorDescriptor, CudaDeviceVariable<Single>, TensorDescriptor, CudaDeviceVariable<Single>, TensorDescriptor, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Double)

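This function performs the forward BatchNormalization layer computation for the inference phase. This layer is based on the paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift", S. Ioffe, C. Szegedy, 2015.

Declaration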

public void BatchNormalizationForwardInference(cudnnBatchNormMode mode, float alpha, float beta, TensorDescriptor xDesc, CudaDeviceVariable<float> x, TensorDescriptor yDesc, CudaDeviceVariable<float> y, TensorDescriptor bnScaleBiasMeanVarDesc, CudaDeviceVariable<float> bnScale, CudaDeviceVariable<float> bnBias, CudaDeviceVariable<float> estimatedMean, CudaDeviceVariable<float> estimatedVariance, double epsilon)

Parameters

Type	Name	Description
cudnnBatchNormMode	mode	Mode of operation (spatial or per-activation).
System.Single	alpha	Host-side scaling factor used to blend the layer output value with the prior value in the destination tensor as follows: dstValue = alpha * resultValue + beta * priorDstValue.
System.Single	beta	Host-side scaling factor used to blend the layer output value with the prior value in the destination tensor as follows: dstValue = alpha * resultValue + beta * priorDstValue.
TensorDescriptor	xDesc	Tensor descriptor for the layer's x data.
CudaDeviceVariable<System.Single>	x	Pointer in device memory for the layer's x data.
TensorDescriptor	yDesc	Tensor descriptor for the layer's y data.
CudaDeviceVariable<System.Single>	y	Pointer in device memory for the layer's y data.
TensorDescriptor	bnScaleBiasMeanVarDesc	Shared tensor descriptor for the 4 tensors below in the argument list. The dimensions of this tensor descriptor depend on the normalization mode.
CudaDeviceVariable<System.Single>	bnScale	Pointer in device memory for the batch normalization scale parameters (in the original paper the scale is referred to as gamma).
CudaDeviceVariable<System.Single>	bnBias	Pointer in device memory for the batch normalization bias parameters (in the original paper the bias is referred to as beta). Note that the bnBias parameter can replace the previous layer's bias parameter for improved efficiency.
CudaDeviceVariable<System.Single>	estimatedMean	Mean tensor (has the same descriptor as the bias and scale). It is suggested that resultRunningMean from the cudnnBatchNormalizationForwardTraining call accumulated during the training phase be passed as input here.
CudaDeviceVariable<System.Single>	estimatedVariance	Variance tensor (has the same descriptor as the bias and scale). It is suggested that resultRunningVariance from the cudnnBatchNormalizationForwardTraining call accumulated during the training phase be passed as input here.
System.Double	epsilon	Epsilon value used in the batch normalization formula. Minimum allowed value is currently 1e-5. Same epsilon value should be used in forward and backward functions.

BatchNormalizationForwardTraining(cudnnBatchNormMode, Double, Double, TensorDescriptor, CudaDeviceVariable<Double>, TensorDescriptor, CudaDeviceVariable<Double>, TensorDescriptor, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Double, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Double, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>)

This function performs the forward BatchNormalization layer computation for the training phase. This layer is based on the paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift", S. Ioffe, C. Szegedy, 2015.

Declaration

public void BatchNormalizationForwardTraining(cudnnBatchNormMode mode, double alpha, double beta, TensorDescriptor xDesc, CudaDeviceVariable<double> x, TensorDescriptor yDesc, CudaDeviceVariable<double> y, TensorDescriptor bnScaleBiasMeanVarDesc, CudaDeviceVariable<double> bnScale, CudaDeviceVariable<double> bnBias, double exponentialAverageFactor, CudaDeviceVariable<double> resultRunningMean, CudaDeviceVariable<double> resultRunningVariance, double epsilon, CudaDeviceVariable<double> resultSaveMean, CudaDeviceVariable<double> resultSaveVariance)

Parameters

Type	Name	Description
cudnnBatchNormMode	mode	Mode of operation (spatial or per-activation).
System.Double	alpha	Host-side scaling factor used to blend the layer output value with the prior value in the destination tensor as follows: dstValue = alpha * resultValue + beta * priorDstValue.
System.Double	beta	Host-side scaling factor used to blend the layer output value with the prior value in the destination tensor as follows: dstValue = alpha * resultValue + beta * priorDstValue.
TensorDescriptor	xDesc	Tensor descriptor for the layer's x data.
CudaDeviceVariable<System.Double>	x	Pointer in device memory for the layer's x data.
TensorDescriptor	yDesc	Tensor descriptor for the layer's y data.
CudaDeviceVariable<System.Double>	y	Pointer in device memory for the layer's y data.
TensorDescriptor	bnScaleBiasMeanVarDesc	Shared tensor descriptor for the 6 tensors below in the argument list. The dimensions of this tensor descriptor depend on the normalization mode.
CudaDeviceVariable<System.Double>	bnScale	Pointer in device memory for the batch normalization scale parameters (in the original paper the scale is referred to as gamma).
CudaDeviceVariable<System.Double>	bnBias	Pointer in device memory for the batch normalization bias parameters (in the original paper the bias is referred to as beta). Note that the bnBias parameter can replace the previous layer's bias parameter for improved efficiency.
System.Double	exponentialAverageFactor	Factor used in the moving average computation runningMean = newMean * factor + runningMean * (1 - factor). Use factor = 1/(1+n) at the N-th call to the function to get Cumulative Moving Average (CMA) behavior, CMA[n] = (x[1] + ... + x[n])/n, since CMA[n+1] = (n * CMA[n] + x[n+1])/(n+1) = ((n+1) * CMA[n] - CMA[n])/(n+1) + x[n+1]/(n+1) = CMA[n] * (1 - 1/(n+1)) + x[n+1] * 1/(n+1).
CudaDeviceVariable<System.Double>	resultRunningMean	Running mean tensor (it has the same descriptor as the bias and scale). If this tensor is initially uninitialized, exponentialAverageFactor=1 must be used for the very first call of a complete training cycle. This is necessary to properly initialize the moving average. Both resultRunningMean and resultRunningInvVariance can be NULL, but only both at the same time.
CudaDeviceVariable<System.Double>	resultRunningVariance	Running variance tensor (it has the same descriptor as the bias and scale). If this tensor is initially uninitialized, exponentialAverageFactor=1 must be used for the very first call of a complete training cycle. This is necessary to properly initialize the moving average. Both resultRunningMean and resultRunningInvVariance can be NULL, but only both at the same time. The value stored in resultRunningInvVariance (or passed as an input in inference mode) is the moving average of the expression 1 / sqrt(eps+variance[x]), where the variance is computed either over the batch or over the spatial+batch dimensions, depending on the mode.
System.Double	epsilon	Epsilon value used in the batch normalization formula. Minimum allowed value is currently 1e-5. Same epsilon value should be used in forward and backward functions.
CudaDeviceVariable<System.Double>	resultSaveMean	Optional cache to save intermediate results computed during the forward pass - these can then be reused to speed up the backward pass. For this to work correctly, the bottom layer data has to remain unchanged until the backward function is called. Note that both resultSaveMean and resultSaveInvVariance can be NULL but only at the same time. It is recommended to use this cache since memory overhead is relatively small because these tensors have a much lower product of dimensions than the data tensors.
CudaDeviceVariable<System.Double>	resultSaveVariance	Optional cache to save intermediate results computed during the forward pass - these can then be reused to speed up the backward pass. For this to work correctly, the bottom layer data has to remain unchanged until the backward function is called. Note that both resultSaveMean and resultSaveInvVariance can be NULL but only at the same time. It is recommended to use this cache since memory overhead is relatively small because these tensors have a much lower product of dimensions than the data tensors.
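
The factor=1/(1+n) rule from the exponentialAverageFactor description can be checked on the CPU. The following is a minimal sketch in plain C#, not a call into cuDNN: the class and method names are hypothetical, and n is assumed to count from 0 so that the very first call uses a factor of 1.

```csharp
using System;

// Hypothetical CPU-only helper; it only mirrors the moving-average formula.
public static class RunningAverageSketch
{
    // One update step: running = newValue * factor + running * (1 - factor).
    public static double Update(double running, double newValue, double factor)
    {
        return newValue * factor + running * (1.0 - factor);
    }

    // Feeds the samples through Update with factor = 1/(1+n), n = 0, 1, 2, ...
    public static double CumulativeAverage(double[] samples)
    {
        double running = 0.0; // the initial value is overwritten: the first factor is 1
        for (int n = 0; n < samples.Length; n++)
        {
            double factor = 1.0 / (1.0 + n);
            running = Update(running, samples[n], factor);
        }
        return running;
    }

    public static void Main()
    {
        double[] batchMeans = { 2.0, 4.0, 9.0 };
        // Up to rounding, this is the arithmetic mean of the three samples.
        Console.WriteLine(CumulativeAverage(batchMeans));
    }
}
```

With a constant factor instead of 1/(1+n), the same Update step yields an exponential moving average rather than a cumulative one.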

BatchNormalizationForwardTraining(cudnnBatchNormMode, Single, Single, TensorDescriptor, CudaDeviceVariable<Single>, TensorDescriptor, CudaDeviceVariable<Single>, TensorDescriptor, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Double, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Double, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>)

Declaration

public void BatchNormalizationForwardTraining(cudnnBatchNormMode mode, float alpha, float beta, TensorDescriptor xDesc, CudaDeviceVariable<float> x, TensorDescriptor yDesc, CudaDeviceVariable<float> y, TensorDescriptor bnScaleBiasMeanVarDesc, CudaDeviceVariable<float> bnScale, CudaDeviceVariable<float> bnBias, double exponentialAverageFactor, CudaDeviceVariable<float> resultRunningMean, CudaDeviceVariable<float> resultRunningVariance, double epsilon, CudaDeviceVariable<float> resultSaveMean, CudaDeviceVariable<float> resultSaveVariance)

Parameters

Type	Name	Description
cudnnBatchNormMode	mode	Mode of operation (spatial or per-activation).
System.Single	alpha	Scaling factor used to blend the layer output value with the prior value in the destination tensor as follows: dstValue = alpha*resultValue + beta*priorDstValue.
System.Single	beta	Scaling factor used to blend the layer output value with the prior value in the destination tensor as follows: dstValue = alpha*resultValue + beta*priorDstValue.
TensorDescriptor	xDesc	Tensor descriptor for the layer's x data.
CudaDeviceVariable<System.Single>	x	Pointer in device memory for the layer's x data.
TensorDescriptor	yDesc	Tensor descriptor for the layer's y data.
CudaDeviceVariable<System.Single>	y	Pointer in device memory for the layer's y data.
TensorDescriptor	bnScaleBiasMeanVarDesc	Shared tensor descriptor for the six tensors that follow in the argument list. The dimensions of this tensor descriptor depend on the normalization mode.
CudaDeviceVariable<System.Single>	bnScale	Pointer in device memory for the batch normalization scale parameters (in original paper scale is referred to as gamma).
CudaDeviceVariable<System.Single>	bnBias	Pointer in device memory for the batch normalization bias parameters (in the original paper bias is referred to as beta). Note that the bnBias parameter can replace the previous layer's bias parameter for improved efficiency.
System.Double	exponentialAverageFactor	Factor used in the moving average computation runningMean = newMean*factor + runningMean*(1-factor). Use factor=1/(1+n) at the n-th call to the function to get Cumulative Moving Average (CMA) behavior CMA[n] = (x[1]+...+x[n])/n, since CMA[n+1] = (n*CMA[n]+x[n+1])/(n+1) = ((n+1)*CMA[n]-CMA[n])/(n+1) + x[n+1]/(n+1) = CMA[n]*(1-1/(n+1)) + x[n+1]*1/(n+1).
CudaDeviceVariable<System.Single>	resultRunningMean	Running mean tensor (it has the same descriptor as the bias and scale). If this tensor is initially uninitialized, it is required that exponentialAverageFactor=1 is used for the very first call of a complete training cycle. This is necessary to properly initialize the moving average. Both resultRunningMean and resultRunningInvVariance can be NULL but only at the same time.
CudaDeviceVariable<System.Single>	resultRunningVariance	Running variance tensor (it has the same descriptor as the bias and scale). If this tensor is initially uninitialized, it is required that exponentialAverageFactor=1 is used for the very first call of a complete training cycle. This is necessary to properly initialize the moving average. Both resultRunningMean and resultRunningInvVariance can be NULL but only at the same time. The value stored in resultRunningInvVariance (or passed as an input in inference mode) is the moving average of the expression 1 / sqrt(eps+variance[x]) where variance is computed either over batch or spatial+batch dimensions depending on the mode.
System.Double	epsilon	Epsilon value used in the batch normalization formula. Minimum allowed value is currently 1e-5. Same epsilon value should be used in forward and backward functions.
CudaDeviceVariable<System.Single>	resultSaveMean	Optional cache to save intermediate results computed during the forward pass - these can then be reused to speed up the backward pass. For this to work correctly, the bottom layer data has to remain unchanged until the backward function is called. Note that both resultSaveMean and resultSaveInvVariance can be NULL but only at the same time. It is recommended to use this cache since memory overhead is relatively small because these tensors have a much lower product of dimensions than the data tensors.
CudaDeviceVariable<System.Single>	resultSaveVariance	Optional cache to save intermediate results computed during the forward pass - these can then be reused to speed up the backward pass. For this to work correctly, the bottom layer data has to remain unchanged until the backward function is called. Note that both resultSaveMean and resultSaveInvVariance can be NULL but only at the same time. It is recommended to use this cache since memory overhead is relatively small because these tensors have a much lower product of dimensions than the data tensors.

ConvolutionBackwardBias(Double, TensorDescriptor, CudaDeviceVariable<Double>, Double, TensorDescriptor, CudaDeviceVariable<Double>)

This function computes the convolution gradient with respect to the bias, which is the sum of every element belonging to the same feature map across all of the images of the input tensor. Therefore, the number of elements produced is equal to the number of feature maps of the input tensor.

Declaration

public void ConvolutionBackwardBias(double alpha, TensorDescriptor dyDesc, CudaDeviceVariable<double> dy, double beta, TensorDescriptor dbDesc, CudaDeviceVariable<double> db)

Parameters

Type	Name	Description
System.Double	alpha	Scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha*result + beta*priorDstValue.
TensorDescriptor	dyDesc	Handle to the previously initialized input tensor descriptor.
CudaDeviceVariable<System.Double>	dy	Data pointer to GPU memory associated with the tensor descriptor dyDesc.
System.Double	beta	Scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha*result + beta*priorDstValue.
TensorDescriptor	dbDesc	Handle to the previously initialized output tensor descriptor.
CudaDeviceVariable<System.Double>	db	Data pointer to GPU memory associated with the output tensor descriptor dbDesc.
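
The reduction described above (one sum per feature map, taken over all images) can be written as a small CPU reference. This is a sketch under stated assumptions, not the cuDNN implementation: the class and method names are hypothetical, and dy is assumed to be a dense NCHW array.

```csharp
using System;

// Hypothetical CPU reference for the bias gradient; assumes a dense NCHW layout.
public static class BiasGradientSketch
{
    // db[ch] = alpha * (sum of dy over images and spatial positions) + beta * db[ch]
    public static void ConvolutionBackwardBiasReference(
        double alpha, double[] dy, int n, int c, int h, int w, double beta, double[] db)
    {
        int plane = h * w;
        for (int ch = 0; ch < c; ch++)
        {
            double sum = 0.0;
            for (int img = 0; img < n; img++)
            {
                int offset = (img * c + ch) * plane; // start of this feature map
                for (int i = 0; i < plane; i++)
                    sum += dy[offset + i];
            }
            db[ch] = alpha * sum + beta * db[ch];
        }
    }

    public static void Main()
    {
        // 2 images, 2 channels, 1x2 feature maps
        double[] dy = { 1, 2, 3, 4, 5, 6, 7, 8 };
        double[] db = new double[2];
        ConvolutionBackwardBiasReference(1.0, dy, 2, 2, 1, 2, 0.0, db);
        Console.WriteLine(string.Join(", ", db));
    }
}
```

The output has c entries, one per feature map, which matches the statement that the number of elements produced equals the number of feature maps.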

ConvolutionBackwardBias(Single, TensorDescriptor, CudaDeviceVariable<Single>, Single, TensorDescriptor, CudaDeviceVariable<Single>)

Declaration

public void ConvolutionBackwardBias(float alpha, TensorDescriptor dyDesc, CudaDeviceVariable<float> dy, float beta, TensorDescriptor dbDesc, CudaDeviceVariable<float> db)

Parameters

Type	Name	Description
System.Single	alpha	Scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha*result + beta*priorDstValue.
TensorDescriptor	dyDesc	Handle to the previously initialized input tensor descriptor.
CudaDeviceVariable<System.Single>	dy	Data pointer to GPU memory associated with the tensor descriptor dyDesc.
System.Single	beta	Scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha*result + beta*priorDstValue.
TensorDescriptor	dbDesc	Handle to the previously initialized output tensor descriptor.
CudaDeviceVariable<System.Single>	db	Data pointer to GPU memory associated with the output tensor descriptor dbDesc.

ConvolutionBackwardData(Double, FilterDescriptor, CudaDeviceVariable<Double>, TensorDescriptor, CudaDeviceVariable<Double>, ConvolutionDescriptor, cudnnConvolutionBwdDataAlgo, CudaDeviceVariable<Byte>, Double, TensorDescriptor, CudaDeviceVariable<Double>)

This function computes the convolution gradient with respect to the input data of the forward convolution using the specified algo, returning results in dx. Scaling factors alpha and beta can be used to scale the input tensor and the output tensor respectively.

Declaration

public void ConvolutionBackwardData(double alpha, FilterDescriptor wDesc, CudaDeviceVariable<double> w, TensorDescriptor dyDesc, CudaDeviceVariable<double> dy, ConvolutionDescriptor convDesc, cudnnConvolutionBwdDataAlgo algo, CudaDeviceVariable<byte> workSpace, double beta, TensorDescriptor dxDesc, CudaDeviceVariable<double> dx)

Parameters

Type	Name	Description
System.Double	alpha	Scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha*result + beta*priorDstValue.
FilterDescriptor	wDesc	Handle to a previously initialized filter descriptor.
CudaDeviceVariable<System.Double>	w	Data pointer to GPU memory associated with the filter descriptor wDesc.
TensorDescriptor	dyDesc	Handle to the previously initialized input differential tensor descriptor.
CudaDeviceVariable<System.Double>	dy	Data pointer to GPU memory associated with the input differential tensor descriptor dyDesc.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
cudnnConvolutionBwdDataAlgo	algo	Enumerant that specifies which backward data convolution algorithm should be used to compute the results.
CudaDeviceVariable<System.Byte>	workSpace	Data pointer to GPU memory for the workspace needed to execute the specified algorithm. If no workspace is needed for a particular algorithm, that pointer can be nil.
System.Double	beta	Scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha*result + beta*priorDstValue.
TensorDescriptor	dxDesc	Handle to the previously initialized output tensor descriptor.
CudaDeviceVariable<System.Double>	dx	Data pointer to GPU memory associated with the output tensor descriptor dxDesc that carries the result.

ConvolutionBackwardData(Single, FilterDescriptor, CudaDeviceVariable<Single>, TensorDescriptor, CudaDeviceVariable<Single>, ConvolutionDescriptor, cudnnConvolutionBwdDataAlgo, CudaDeviceVariable<Byte>, Single, TensorDescriptor, CudaDeviceVariable<Single>)

Declaration

public void ConvolutionBackwardData(float alpha, FilterDescriptor wDesc, CudaDeviceVariable<float> w, TensorDescriptor dyDesc, CudaDeviceVariable<float> dy, ConvolutionDescriptor convDesc, cudnnConvolutionBwdDataAlgo algo, CudaDeviceVariable<byte> workSpace, float beta, TensorDescriptor dxDesc, CudaDeviceVariable<float> dx)

Parameters

Type	Name	Description
System.Single	alpha	Scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha*result + beta*priorDstValue.
FilterDescriptor	wDesc	Handle to a previously initialized filter descriptor.
CudaDeviceVariable<System.Single>	w	Data pointer to GPU memory associated with the filter descriptor wDesc.
TensorDescriptor	dyDesc	Handle to the previously initialized input differential tensor descriptor.
CudaDeviceVariable<System.Single>	dy	Data pointer to GPU memory associated with the input differential tensor descriptor dyDesc.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
cudnnConvolutionBwdDataAlgo	algo	Enumerant that specifies which backward data convolution algorithm should be used to compute the results.
CudaDeviceVariable<System.Byte>	workSpace	Data pointer to GPU memory for the workspace needed to execute the specified algorithm. If no workspace is needed for a particular algorithm, that pointer can be nil.
System.Single	beta	Scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha*result + beta*priorDstValue.
TensorDescriptor	dxDesc	Handle to the previously initialized output tensor descriptor.
CudaDeviceVariable<System.Single>	dx	Data pointer to GPU memory associated with the output tensor descriptor dxDesc that carries the result.

ConvolutionBackwardFilter(Double, TensorDescriptor, CudaDeviceVariable<Double>, TensorDescriptor, CudaDeviceVariable<Double>, ConvolutionDescriptor, cudnnConvolutionBwdFilterAlgo, CudaDeviceVariable<Byte>, Double, FilterDescriptor, CudaDeviceVariable<Double>)

This function computes the convolution gradient with respect to the filter coefficients using the specified algo, returning results in dw. Scaling factors alpha and beta can be used to scale the input tensor and the output tensor respectively.

Declaration

public void ConvolutionBackwardFilter(double alpha, TensorDescriptor xDesc, CudaDeviceVariable<double> x, TensorDescriptor dyDesc, CudaDeviceVariable<double> dy, ConvolutionDescriptor convDesc, cudnnConvolutionBwdFilterAlgo algo, CudaDeviceVariable<byte> workSpace, double beta, FilterDescriptor dwDesc, CudaDeviceVariable<double> dw)

Parameters

Type	Name	Description
System.Double	alpha	Scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha*result + beta*priorDstValue.
TensorDescriptor	xDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	x	Data pointer to GPU memory associated with the tensor descriptor xDesc.
TensorDescriptor	dyDesc	Handle to the previously initialized input differential tensor descriptor.
CudaDeviceVariable<System.Double>	dy	Data pointer to GPU memory associated with the input differential tensor descriptor dyDesc.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
cudnnConvolutionBwdFilterAlgo	algo	Enumerant that specifies which convolution algorithm should be used to compute the results.
CudaDeviceVariable<System.Byte>	workSpace	Data pointer to GPU memory for the workspace needed to execute the specified algorithm. If no workspace is needed for a particular algorithm, that pointer can be nil.
System.Double	beta	Scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha*result + beta*priorDstValue.
FilterDescriptor	dwDesc	Handle to a previously initialized filter descriptor.
CudaDeviceVariable<System.Double>	dw	Data pointer to GPU memory associated with the filter descriptor dwDesc that carries the result.

ConvolutionBackwardFilter(Single, TensorDescriptor, CudaDeviceVariable<Single>, TensorDescriptor, CudaDeviceVariable<Single>, ConvolutionDescriptor, cudnnConvolutionBwdFilterAlgo, CudaDeviceVariable<Byte>, Single, FilterDescriptor, CudaDeviceVariable<Single>)

Declaration

public void ConvolutionBackwardFilter(float alpha, TensorDescriptor xDesc, CudaDeviceVariable<float> x, TensorDescriptor dyDesc, CudaDeviceVariable<float> dy, ConvolutionDescriptor convDesc, cudnnConvolutionBwdFilterAlgo algo, CudaDeviceVariable<byte> workSpace, float beta, FilterDescriptor dwDesc, CudaDeviceVariable<float> dw)

Parameters

Type	Name	Description
System.Single	alpha	Scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha*result + beta*priorDstValue.
TensorDescriptor	xDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	x	Data pointer to GPU memory associated with the tensor descriptor xDesc.
TensorDescriptor	dyDesc	Handle to the previously initialized input differential tensor descriptor.
CudaDeviceVariable<System.Single>	dy	Data pointer to GPU memory associated with the input differential tensor descriptor dyDesc.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
cudnnConvolutionBwdFilterAlgo	algo	Enumerant that specifies which convolution algorithm should be used to compute the results.
CudaDeviceVariable<System.Byte>	workSpace	Data pointer to GPU memory for the workspace needed to execute the specified algorithm. If no workspace is needed for a particular algorithm, that pointer can be nil.
System.Single	beta	Scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha*result + beta*priorDstValue.
FilterDescriptor	dwDesc	Handle to a previously initialized filter descriptor.
CudaDeviceVariable<System.Single>	dw	Data pointer to GPU memory associated with the filter descriptor dwDesc that carries the result.

ConvolutionBiasActivationForward(Double, TensorDescriptor, CudaDeviceVariable<Double>, FilterDescriptor, CudaDeviceVariable<Double>, ConvolutionDescriptor, cudnnConvolutionFwdAlgo, CudaDeviceVariable<Byte>, Double, TensorDescriptor, CudaDeviceVariable<Double>, TensorDescriptor, CudaDeviceVariable<Double>, ActivationDescriptor, TensorDescriptor, CudaDeviceVariable<Double>)

This function applies a bias and then an activation to the convolutions or cross-correlations of cudnnConvolutionForward(), returning results in y. The full computation follows the equation y = act(alpha1 * conv(x) + alpha2 * z + bias).

The routine cudnnGetConvolution2dForwardOutputDim or cudnnGetConvolutionNdForwardOutputDim can be used to determine the proper dimensions of the output tensor descriptor yDesc with respect to xDesc, convDesc and wDesc.

Declaration

public void ConvolutionBiasActivationForward(double alpha1, TensorDescriptor xDesc, CudaDeviceVariable<double> x, FilterDescriptor wDesc, CudaDeviceVariable<double> w, ConvolutionDescriptor convDesc, cudnnConvolutionFwdAlgo algo, CudaDeviceVariable<byte> workSpace, double alpha2, TensorDescriptor zDesc, CudaDeviceVariable<double> z, TensorDescriptor biasDesc, CudaDeviceVariable<double> bias, ActivationDescriptor activationDesc, TensorDescriptor yDesc, CudaDeviceVariable<double> y)

Parameters

Type	Name	Description
System.Double	alpha1	Scaling factor applied to conv(x) in the above equation.
TensorDescriptor	xDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	x	Data pointer to GPU memory associated with the tensor descriptor xDesc.
FilterDescriptor	wDesc	Handle to a previously initialized filter descriptor.
CudaDeviceVariable<System.Double>	w	Data pointer to GPU memory associated with the filter descriptor wDesc.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
cudnnConvolutionFwdAlgo	algo	Enumerant that specifies which convolution algorithm should be used to compute the results.
CudaDeviceVariable<System.Byte>	workSpace	Data pointer to GPU memory for the workspace needed to execute the specified algorithm. If no workspace is needed for a particular algorithm, that pointer can be nil.
System.Double	alpha2	Scaling factor applied to z in the above equation.
TensorDescriptor	zDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	z	Data pointer to GPU memory associated with the tensor descriptor zDesc.
TensorDescriptor	biasDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	bias	Data pointer to GPU memory associated with the tensor descriptor biasDesc.
ActivationDescriptor	activationDesc	Handle to a previously initialized activation descriptor.
TensorDescriptor	yDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	y	Data pointer to GPU memory associated with the tensor descriptor yDesc that carries the result of the convolution.
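
For orientation, the two pieces of arithmetic mentioned above can be sketched on the CPU: the per-axis output size that cudnnGetConvolution2dForwardOutputDim reports, and the fused per-element epilogue. This is an illustration only. The class and method names are hypothetical, the size formula is the one given in the cuDNN documentation for dense convolutions, and ReLU is an assumed stand-in for act().

```csharp
using System;

// Hypothetical CPU-only helpers; nothing here calls into cuDNN.
public static class ConvOutputDimSketch
{
    // Output extent along one spatial axis (integer division):
    // 1 + (inputDim + 2*pad - ((filterDim - 1)*dilation + 1)) / stride
    public static int OutputDim(int inputDim, int pad, int filterDim, int stride, int dilation)
    {
        int effectiveFilter = (filterDim - 1) * dilation + 1;
        return 1 + (inputDim + 2 * pad - effectiveFilter) / stride;
    }

    // y = act(alpha1 * conv(x) + alpha2 * z + bias) for one element,
    // with ReLU assumed as the activation.
    public static double Epilogue(double alpha1, double convValue, double alpha2, double z, double bias)
    {
        return Math.Max(0.0, alpha1 * convValue + alpha2 * z + bias);
    }

    public static void Main()
    {
        // 224-wide input, 7-wide filter, padding 3, stride 2, no dilation
        Console.WriteLine(OutputDim(224, 3, 7, 2, 1));
        Console.WriteLine(Epilogue(1.0, 2.0, 0.5, 2.0, 0.5));
    }
}
```

Setting alpha2 to 0 removes the z term, in which case the epilogue reduces to a biased, activated convolution.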

ConvolutionBiasActivationForward(Single, TensorDescriptor, CudaDeviceVariable<Single>, FilterDescriptor, CudaDeviceVariable<Single>, ConvolutionDescriptor, cudnnConvolutionFwdAlgo, CudaDeviceVariable<Byte>, Single, TensorDescriptor, CudaDeviceVariable<Single>, TensorDescriptor, CudaDeviceVariable<Single>, ActivationDescriptor, TensorDescriptor, CudaDeviceVariable<Single>)

Declaration

public void ConvolutionBiasActivationForward(float alpha1, TensorDescriptor xDesc, CudaDeviceVariable<float> x, FilterDescriptor wDesc, CudaDeviceVariable<float> w, ConvolutionDescriptor convDesc, cudnnConvolutionFwdAlgo algo, CudaDeviceVariable<byte> workSpace, float alpha2, TensorDescriptor zDesc, CudaDeviceVariable<float> z, TensorDescriptor biasDesc, CudaDeviceVariable<float> bias, ActivationDescriptor activationDesc, TensorDescriptor yDesc, CudaDeviceVariable<float> y)

Parameters

Type	Name	Description
System.Single	alpha1	Scaling factor applied to conv(x) in the above equation.
TensorDescriptor	xDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	x	Data pointer to GPU memory associated with the tensor descriptor xDesc.
FilterDescriptor	wDesc	Handle to a previously initialized filter descriptor.
CudaDeviceVariable<System.Single>	w	Data pointer to GPU memory associated with the filter descriptor wDesc.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
cudnnConvolutionFwdAlgo	algo	Enumerant that specifies which convolution algorithm should be used to compute the results.
CudaDeviceVariable<System.Byte>	workSpace	Data pointer to GPU memory for the workspace needed to execute the specified algorithm. If no workspace is needed for a particular algorithm, that pointer can be nil.
System.Single	alpha2	Scaling factor applied to z in the above equation.
TensorDescriptor	zDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	z	Data pointer to GPU memory associated with the tensor descriptor zDesc.
TensorDescriptor	biasDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	bias	Data pointer to GPU memory associated with the tensor descriptor biasDesc.
ActivationDescriptor	activationDesc	Handle to a previously initialized activation descriptor.
TensorDescriptor	yDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	y	Data pointer to GPU memory associated with the tensor descriptor yDesc that carries the result of the convolution.

ConvolutionForward(Double, TensorDescriptor, CudaDeviceVariable<Double>, FilterDescriptor, CudaDeviceVariable<Double>, ConvolutionDescriptor, cudnnConvolutionFwdAlgo, CudaDeviceVariable<Byte>, Double, TensorDescriptor, CudaDeviceVariable<Double>)

This function executes convolutions or cross-correlations over x using the specified filters w, returning results in y. Scaling factors alpha and beta can be used to scale the input tensor and the output tensor respectively.

Declaration

public void ConvolutionForward(double alpha, TensorDescriptor xDesc, CudaDeviceVariable<double> x, FilterDescriptor wDesc, CudaDeviceVariable<double> w, ConvolutionDescriptor convDesc, cudnnConvolutionFwdAlgo algo, CudaDeviceVariable<byte> workSpace, double beta, TensorDescriptor yDesc, CudaDeviceVariable<double> y)

Parameters

Type	Name	Description
System.Double	alpha	Scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha*result + beta*priorDstValue.
TensorDescriptor	xDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	x	Data pointer to GPU memory associated with the tensor descriptor xDesc.
FilterDescriptor	wDesc	Handle to a previously initialized filter descriptor.
CudaDeviceVariable<System.Double>	w	Data pointer to GPU memory associated with the filter descriptor wDesc.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
cudnnConvolutionFwdAlgo	algo	Enumerant that specifies which convolution algorithm should be used to compute the results.
CudaDeviceVariable<System.Byte>	workSpace	Data pointer to GPU memory for the workspace needed to execute the specified algorithm. If no workspace is needed for a particular algorithm, that pointer can be nil.
System.Double	beta	Scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha*result + beta*priorDstValue.
TensorDescriptor	yDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	y	Data pointer to GPU memory associated with the tensor descriptor yDesc that carries the result of the convolution.
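
The description distinguishes convolutions from cross-correlations. The difference is only whether the filter is flipped before the sliding dot product, which the following one-dimensional CPU sketch illustrates. It is a simplified illustration with hypothetical names (no padding, stride 1, single channel), not the cuDNN algorithm.

```csharp
using System;

// Hypothetical 1D CPU reference; "valid" output only (no padding, stride 1).
public static class Correlate1DSketch
{
    // flipFilter == false: cross-correlation; flipFilter == true: convolution.
    public static double[] Apply(double[] x, double[] w, bool flipFilter)
    {
        int outLen = x.Length - w.Length + 1;
        var y = new double[outLen];
        for (int i = 0; i < outLen; i++)
        {
            for (int k = 0; k < w.Length; k++)
            {
                int wk = flipFilter ? w.Length - 1 - k : k; // flipped index for convolution
                y[i] += x[i + k] * w[wk];
            }
        }
        return y;
    }

    public static void Main()
    {
        double[] x = { 1, 2, 3, 4 };
        double[] w = { 1, 0, -1 };
        Console.WriteLine(string.Join(", ", Apply(x, w, false)));
        Console.WriteLine(string.Join(", ", Apply(x, w, true)));
    }
}
```

For a symmetric filter the two modes give identical results; for the antisymmetric filter used in Main they differ in sign.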

ConvolutionForward(Single, TensorDescriptor, CudaDeviceVariable<Single>, FilterDescriptor, CudaDeviceVariable<Single>, ConvolutionDescriptor, cudnnConvolutionFwdAlgo, CudaDeviceVariable<Byte>, Single, TensorDescriptor, CudaDeviceVariable<Single>)

Declaration

public void ConvolutionForward(float alpha, TensorDescriptor xDesc, CudaDeviceVariable<float> x, FilterDescriptor wDesc, CudaDeviceVariable<float> w, ConvolutionDescriptor convDesc, cudnnConvolutionFwdAlgo algo, CudaDeviceVariable<byte> workSpace, float beta, TensorDescriptor yDesc, CudaDeviceVariable<float> y)

Parameters

Type	Name	Description
System.Single	alpha	Scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha*result + beta*priorDstValue.
TensorDescriptor	xDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	x	Data pointer to GPU memory associated with the tensor descriptor xDesc.
FilterDescriptor	wDesc	Handle to a previously initialized filter descriptor.
CudaDeviceVariable<System.Single>	w	Data pointer to GPU memory associated with the filter descriptor wDesc.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
cudnnConvolutionFwdAlgo	algo	Enumerant that specifies which convolution algorithm should be used to compute the results.
CudaDeviceVariable<System.Byte>	workSpace	Data pointer to GPU memory for the workspace needed to execute the specified algorithm. If no workspace is needed for a particular algorithm, that pointer can be nil.
System.Single	beta	Scaling factor used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha*result + beta*priorDstValue.
TensorDescriptor	yDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	y	Data pointer to GPU memory associated with the tensor descriptor yDesc that carries the result of the convolution.

DeriveBNTensorDescriptor(TensorDescriptor, TensorDescriptor, cudnnBatchNormMode)

Derives a tensor descriptor from the layer data descriptor for the batch normalization scale, invVariance, bnBias and bnScale tensors. Use this tensor descriptor for bnScaleBiasMeanVarDesc and bnScaleBiasDiffDesc in the batch normalization forward and backward functions.

Declaration

public void DeriveBNTensorDescriptor(TensorDescriptor derivedBnDesc, TensorDescriptor xDesc, cudnnBatchNormMode mode)

Parameters

Type	Name	Description
TensorDescriptor	derivedBnDesc
TensorDescriptor	xDesc
cudnnBatchNormMode	mode

Dispose()

Dispose

Declaration

public void Dispose()

Dispose(Boolean)

For IDisposable

Declaration

protected virtual void Dispose(bool fDisposing)

Parameters

Type	Name	Description
System.Boolean	fDisposing
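
The pairing of a public Dispose() with a protected virtual Dispose(bool) is the standard .NET dispose pattern. The sketch below shows how the two overloads usually relate. NativeHandleOwner is a hypothetical stand-in written for this illustration; it is not the ManagedCuda source, and the counter only marks where a native handle would be released.

```csharp
using System;

// Hypothetical stand-in for a class that owns a native handle.
public class NativeHandleOwner : IDisposable
{
    private bool disposed;

    // Counts how many times the (simulated) native release actually ran.
    public int ReleaseCount { get; private set; }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // the finalizer no longer needs to run
    }

    protected virtual void Dispose(bool fDisposing)
    {
        if (disposed) return; // repeated calls are harmless
        // fDisposing == true: called from Dispose(), managed members may be used.
        // fDisposing == false: called from the finalizer, release native state only.
        ReleaseCount++; // stands in for destroying the native handle
        disposed = true;
    }

    ~NativeHandleOwner()
    {
        Dispose(false);
    }

    public static void Main()
    {
        var owner = new NativeHandleOwner();
        using (owner)
        {
            // work with the object here
        }
        Console.WriteLine(owner.ReleaseCount);
    }
}
```

Because CudaDNNContext implements IDisposable, it can be placed in a using statement in the same way, so that the context is released when the block exits.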

DropoutBackward(DropoutDescriptor, TensorDescriptor, CudaDeviceVariable<Double>, TensorDescriptor, CudaDeviceVariable<Double>, CudaDeviceVariable<Byte>)

This function performs backward dropout operation over dy returning results in dx. If during
forward dropout operation value from x was propagated to y then during backward operation value
from dy will be propagated to dx, otherwise, dx value will be set to 0.

Declaration

public void DropoutBackward(DropoutDescriptor dropoutDesc, TensorDescriptor dyDesc, CudaDeviceVariable<double> dy, TensorDescriptor dxDesc, CudaDeviceVariable<double> dx, CudaDeviceVariable<byte> reserveSpace)

Parameters

Type	Name	Description
DropoutDescriptor	dropoutDesc	Handle to a previously created dropout descriptor object.
TensorDescriptor	dyDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	dy	Pointer to data of the tensor described by the dyDesc descriptor.
TensorDescriptor	dxDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	dx	Pointer to data of the tensor described by the dxDesc descriptor.
CudaDeviceVariable<System.Byte>	reserveSpace	Data pointer to GPU memory used by this function. It is expected that the contents of reserveSpace do not change between cudnnDropoutForward and cudnnDropoutBackward calls.

DropoutBackward(DropoutDescriptor, TensorDescriptor, CudaDeviceVariable<Single>, TensorDescriptor, CudaDeviceVariable<Single>, CudaDeviceVariable<Byte>)

Declaration

public void DropoutBackward(DropoutDescriptor dropoutDesc, TensorDescriptor dyDesc, CudaDeviceVariable<float> dy, TensorDescriptor dxDesc, CudaDeviceVariable<float> dx, CudaDeviceVariable<byte> reserveSpace)

Parameters

Type	Name	Description
DropoutDescriptor	dropoutDesc	Handle to a previously created dropout descriptor object.
TensorDescriptor	dyDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	dy	Pointer to data of the tensor described by the dyDesc descriptor.
TensorDescriptor	dxDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	dx	Pointer to data of the tensor described by the dxDesc descriptor.
CudaDeviceVariable<System.Byte>	reserveSpace	Data pointer to GPU memory used by this function. It is expected that the contents of reserveSpace do not change between cudnnDropoutForward and cudnnDropoutBackward calls.

DropoutForward(DropoutDescriptor, TensorDescriptor, CudaDeviceVariable<Double>, TensorDescriptor, CudaDeviceVariable<Double>, CudaDeviceVariable<Byte>)

This function performs the forward dropout operation over x, returning results in y. If dropout was
used as a parameter to cudnnSetDropoutDescriptor, approximately a dropout fraction of the x values
will be replaced by 0, and the rest will be scaled by 1/(1-dropout). This function should not be
run concurrently with another cudnnDropoutForward call using the same states.

Declaration

public void DropoutForward(DropoutDescriptor dropoutDesc, TensorDescriptor xDesc, CudaDeviceVariable<double> x, TensorDescriptor yDesc, CudaDeviceVariable<double> y, CudaDeviceVariable<byte> reserveSpace)

Parameters

Type	Name	Description
DropoutDescriptor	dropoutDesc	Handle to a previously created dropout descriptor object.
TensorDescriptor	xDesc	Handle to the previously initialized input tensor descriptor.
CudaDeviceVariable<System.Double>	x	Data pointer to GPU memory associated with the input tensor descriptor xDesc.
TensorDescriptor	yDesc	Handle to the previously initialized output tensor descriptor.
CudaDeviceVariable<System.Double>	y	Data pointer to GPU memory associated with the output tensor descriptor yDesc.
CudaDeviceVariable<System.Byte>	reserveSpace	Data pointer to GPU memory used by this function. The contents of reserveSpace are expected not to change between cudnnDropoutForward and cudnnDropoutBackward calls.
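
The dropout semantics described for DropoutForward can be sketched on the CPU as follows. This is only an illustration of the documented arithmetic (zero with probability dropout, otherwise scale by 1/(1-dropout)), not the cuDNN kernel. The class DropoutSketch, its methods, and the bool[] mask that stands in for reserveSpace are hypothetical names introduced for this example, and the random number generator differs from the one cuDNN uses.

```csharp
using System;

// Hypothetical CPU reference; not part of ManagedCuda or cuDNN.
public static class DropoutSketch
{
    // Zeroes each element with probability `dropout` and scales the
    // survivors by 1 / (1 - dropout). `kept` plays the role of reserveSpace.
    public static float[] Forward(float[] x, float dropout, int seed, out bool[] kept)
    {
        if (dropout < 0f || dropout >= 1f)
            throw new ArgumentOutOfRangeException(nameof(dropout));

        var rng = new Random(seed);
        float scale = 1f / (1f - dropout);
        var y = new float[x.Length];
        kept = new bool[x.Length];

        for (int i = 0; i < x.Length; i++)
        {
            kept[i] = rng.NextDouble() >= dropout;
            y[i] = kept[i] ? x[i] * scale : 0f;
        }
        return y;
    }

    // Assumed backward rule: the same mask and scale are applied to dy,
    // which is why the mask must stay unchanged between the two calls.
    public static float[] Backward(float[] dy, float dropout, bool[] kept)
    {
        float scale = 1f / (1f - dropout);
        var dx = new float[dy.Length];
        for (int i = 0; i < dy.Length; i++)
            dx[i] = kept[i] ? dy[i] * scale : 0f;
        return dx;
    }
}
```

With dropout = 0.5, every surviving element of x is doubled and every dropped element is 0. Passing the same mask to Backward routes gradients only through the surviving elements.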

DropoutForward(DropoutDescriptor, TensorDescriptor, CudaDeviceVariable<Single>, TensorDescriptor, CudaDeviceVariable<Single>, CudaDeviceVariable<Byte>)

Declaration

public void DropoutForward(DropoutDescriptor dropoutDesc, TensorDescriptor xDesc, CudaDeviceVariable<float> x, TensorDescriptor yDesc, CudaDeviceVariable<float> y, CudaDeviceVariable<byte> reserveSpace)

Parameters

Type	Name	Description
DropoutDescriptor	dropoutDesc	Handle to a previously created dropout descriptor object.
TensorDescriptor	xDesc	Handle to the previously initialized input tensor descriptor.
CudaDeviceVariable<System.Single>	x	Data pointer to GPU memory associated with the input tensor descriptor xDesc.
TensorDescriptor	yDesc	Handle to the previously initialized output tensor descriptor.
CudaDeviceVariable<System.Single>	y	Data pointer to GPU memory associated with the output tensor descriptor yDesc.
CudaDeviceVariable<System.Byte>	reserveSpace	Data pointer to GPU memory used by this function. The contents of reserveSpace are expected not to change between cudnnDropoutForward and cudnnDropoutBackward calls.

Finalize()

Finalizer, part of the dispose pattern.

Declaration

protected void Finalize()

FindConvolutionBackwardDataAlgorithm(FilterDescriptor, TensorDescriptor, ConvolutionDescriptor, TensorDescriptor, Int32)

This function attempts all cuDNN algorithms for cudnnConvolutionBackwardData_v3 and outputs performance metrics to a user-allocated array of cudnnConvolutionBwdDataAlgoPerf_t. These metrics are written in sorted fashion where the first element has the lowest compute time.

Declaration

public cudnnConvolutionBwdDataAlgoPerf[] FindConvolutionBackwardDataAlgorithm(FilterDescriptor wDesc, TensorDescriptor dyDesc, ConvolutionDescriptor convDesc, TensorDescriptor dxDesc, int requestedAlgoCount)

Parameters

Type	Name	Description
FilterDescriptor	wDesc	Handle to a previously initialized filter descriptor.
TensorDescriptor	dyDesc	Handle to the previously initialized input differential tensor descriptor.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
TensorDescriptor	dxDesc	Handle to the previously initialized output tensor descriptor.
System.Int32	requestedAlgoCount	The maximum number of elements to be stored in perfResults.

Returns

Type	Description
cudnnConvolutionBwdDataAlgoPerf[]	An array to store performance metrics sorted ascending by compute time.
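
Because the returned metrics are sorted ascending by compute time, the first entry that fits a given workspace budget is also the fastest one that fits. The sketch below illustrates that selection on plain data. AlgoPerf is a hypothetical stand-in for one entry of the returned array, and its field names are assumptions for this example, not the actual members of cudnnConvolutionBwdDataAlgoPerf.

```csharp
using System;

// Hypothetical helper; not part of ManagedCuda or cuDNN.
public static class AlgoPickSketch
{
    // Stand-in for one performance entry (field names are assumptions).
    public struct AlgoPerf
    {
        public string Algo;
        public float TimeMs;
        public long MemoryBytes;

        public AlgoPerf(string algo, float timeMs, long memoryBytes)
        {
            Algo = algo;
            TimeMs = timeMs;
            MemoryBytes = memoryBytes;
        }
    }

    // Returns the index of the first entry whose workspace requirement is
    // within the limit, or -1 if none fits. The input is assumed to be
    // sorted ascending by time already.
    public static int PickFastestWithin(AlgoPerf[] results, long workspaceLimitBytes)
    {
        for (int i = 0; i < results.Length; i++)
        {
            if (results[i].MemoryBytes <= workspaceLimitBytes)
                return i;
        }
        return -1;
    }
}
```

The same selection applies to the arrays returned by the backward-filter and forward variants of this method.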

FindConvolutionBackwardFilterAlgorithm(TensorDescriptor, TensorDescriptor, ConvolutionDescriptor, FilterDescriptor, Int32)

This function attempts all cuDNN algorithms for cudnnConvolutionBackwardFilter_v3 and outputs performance metrics to a user-allocated array of cudnnConvolutionBwdFilterAlgoPerf_t. These metrics are written in sorted fashion where the first element has the lowest compute time.

Declaration

public cudnnConvolutionBwdFilterAlgoPerf[] FindConvolutionBackwardFilterAlgorithm(TensorDescriptor xDesc, TensorDescriptor dyDesc, ConvolutionDescriptor convDesc, FilterDescriptor dwDesc, int requestedAlgoCount)

Parameters

Type	Name	Description
TensorDescriptor	xDesc	Handle to the previously initialized input tensor descriptor.
TensorDescriptor	dyDesc	Handle to the previously initialized input differential tensor descriptor.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
FilterDescriptor	dwDesc	Handle to a previously initialized filter descriptor.
System.Int32	requestedAlgoCount	The maximum number of elements to be stored in perfResults.

Returns

Type	Description
cudnnConvolutionBwdFilterAlgoPerf[]	An array to store performance metrics sorted ascending by compute time.

FindConvolutionForwardAlgorithm(TensorDescriptor, FilterDescriptor, ConvolutionDescriptor, TensorDescriptor, Int32)

This function attempts all cuDNN algorithms and outputs performance metrics to a user-allocated array of cudnnConvolutionFwdAlgoPerf_t. These metrics are written in sorted fashion where the first element has the lowest compute time.

Declaration

public cudnnConvolutionFwdAlgoPerf[] FindConvolutionForwardAlgorithm(TensorDescriptor srcDesc, FilterDescriptor filterDesc, ConvolutionDescriptor convDesc, TensorDescriptor destDesc, int requestedAlgoCount)

Parameters

Type	Name	Description
TensorDescriptor	srcDesc	Handle to the previously initialized input tensor descriptor.
FilterDescriptor	filterDesc	Handle to a previously initialized filter descriptor.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
TensorDescriptor	destDesc	Handle to the previously initialized output tensor descriptor.
System.Int32	requestedAlgoCount	The maximum number of elements to be stored in perfResults.

Returns

Type	Description
cudnnConvolutionFwdAlgoPerf[]	An array to store performance metrics sorted ascending by compute time.

GetConvolutionBackwardDataAlgorithm(FilterDescriptor, TensorDescriptor, ConvolutionDescriptor, TensorDescriptor, cudnnConvolutionBwdDataPreference, SizeT)

This function serves as a heuristic for obtaining the best suited algorithm for cudnnConvolutionBackwardData_v3 for the given layer specifications. Based on the input preference, this function will either return the fastest algorithm or the fastest algorithm within a given memory limit. For an exhaustive search for the fastest algorithm, please use cudnnFindConvolutionBackwardDataAlgorithm.

Declaration

public cudnnConvolutionBwdDataAlgo GetConvolutionBackwardDataAlgorithm(FilterDescriptor wDesc, TensorDescriptor dyDesc, ConvolutionDescriptor convDesc, TensorDescriptor dxDesc, cudnnConvolutionBwdDataPreference preference, SizeT memoryLimitInbytes)

Parameters

Type	Name	Description
FilterDescriptor	wDesc	Handle to a previously initialized filter descriptor.
TensorDescriptor	dyDesc	Handle to the previously initialized input differential tensor descriptor.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
TensorDescriptor	dxDesc	Handle to the previously initialized output tensor descriptor.
cudnnConvolutionBwdDataPreference	preference	Enumerant to express the preference criteria in terms of memory requirement and speed.
SizeT	memoryLimitInbytes	Specifies the maximum amount of GPU memory the user is willing to use as a workspace. This is currently a placeholder and is not used.

Returns

Type	Description
cudnnConvolutionBwdDataAlgo	Enumerant that specifies which convolution algorithm should be used to compute the results according to the specified preference

GetConvolutionBackwardDataAlgorithm(FilterDescriptor, TensorDescriptor, ConvolutionDescriptor, TensorDescriptor, Int32)

This function serves as a heuristic for obtaining the best suited algorithm for cudnnConvolutionBackwardData for the given layer specifications. This function will return all algorithms sorted by expected (based on internal heuristic) relative performance with fastest being index 0 of perfResults. For an exhaustive search for the fastest algorithm, please use cudnnFindConvolutionBackwardDataAlgorithm.

Declaration

public cudnnConvolutionBwdDataAlgoPerf[] GetConvolutionBackwardDataAlgorithm(FilterDescriptor filterDesc, TensorDescriptor dyDesc, ConvolutionDescriptor convDesc, TensorDescriptor dxDesc, int requestedAlgoCount)

Parameters

Type	Name	Description
FilterDescriptor	filterDesc	Handle to a previously initialized filter descriptor.
TensorDescriptor	dyDesc	Handle to the previously initialized input differential tensor descriptor.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
TensorDescriptor	dxDesc	Handle to the previously initialized output tensor descriptor.
System.Int32	requestedAlgoCount	The maximum number of elements to be stored in perfResults.

Returns

Type	Description
cudnnConvolutionBwdDataAlgoPerf[]	An array to store performance metrics sorted ascending by compute time.

GetConvolutionBackwardDataAlgorithmMaxCount()

Declaration

public int GetConvolutionBackwardDataAlgorithmMaxCount()

Returns

Type	Description
System.Int32

GetConvolutionBackwardDataWorkspaceSize(FilterDescriptor, TensorDescriptor, ConvolutionDescriptor, TensorDescriptor, cudnnConvolutionBwdDataAlgo)

This function returns the amount of GPU memory workspace the user needs to allocate to be able to call cudnnConvolutionBackwardData_v3 with the specified algorithm. The workspace allocated will then be passed to the routine cudnnConvolutionBackwardData_v3. The specified algorithm can be the result of the call to cudnnGetConvolutionBackwardDataAlgorithm or can be chosen arbitrarily by the user. Note that not every algorithm is available for every configuration of the input tensor and/or every configuration of the convolution descriptor.

Declaration

public SizeT GetConvolutionBackwardDataWorkspaceSize(FilterDescriptor wDesc, TensorDescriptor dyDesc, ConvolutionDescriptor convDesc, TensorDescriptor dxDesc, cudnnConvolutionBwdDataAlgo algo)

Parameters

Type	Name	Description
FilterDescriptor	wDesc	Handle to a previously initialized filter descriptor.
TensorDescriptor	dyDesc	Handle to the previously initialized input differential tensor descriptor.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
TensorDescriptor	dxDesc	Handle to the previously initialized output tensor descriptor.
cudnnConvolutionBwdDataAlgo	algo	Enumerant that specifies the chosen convolution algorithm

Returns

Type	Description
SizeT

GetConvolutionBackwardFilterAlgorithm(TensorDescriptor, TensorDescriptor, ConvolutionDescriptor, FilterDescriptor, cudnnConvolutionBwdFilterPreference, SizeT)

This function serves as a heuristic for obtaining the best suited algorithm for cudnnConvolutionBackwardFilter_v3 for the given layer specifications. Based on the input preference, this function will either return the fastest algorithm or the fastest algorithm within a given memory limit. For an exhaustive search for the fastest algorithm, please use cudnnFindConvolutionBackwardFilterAlgorithm.

Declaration

public cudnnConvolutionBwdFilterAlgo GetConvolutionBackwardFilterAlgorithm(TensorDescriptor xDesc, TensorDescriptor dyDesc, ConvolutionDescriptor convDesc, FilterDescriptor dwDesc, cudnnConvolutionBwdFilterPreference preference, SizeT memoryLimitInbytes)

Parameters

Type	Name	Description
TensorDescriptor	xDesc	Handle to the previously initialized input tensor descriptor.
TensorDescriptor	dyDesc	Handle to the previously initialized input differential tensor descriptor.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
FilterDescriptor	dwDesc	Handle to a previously initialized filter descriptor.
cudnnConvolutionBwdFilterPreference	preference	Enumerant to express the preference criteria in terms of memory requirement and speed.
SizeT	memoryLimitInbytes	Specifies the maximum amount of GPU memory the user is willing to use as a workspace. This is currently a placeholder and is not used.

Returns

Type	Description
cudnnConvolutionBwdFilterAlgo	Enumerant that specifies which convolution algorithm should be used to compute the results according to the specified preference

GetConvolutionBackwardFilterAlgorithm(TensorDescriptor, TensorDescriptor, FilterDescriptor, ConvolutionDescriptor, Int32)

This function serves as a heuristic for obtaining the best suited algorithm for cudnnConvolutionBackwardFilter for the given layer specifications. This function will return all algorithms sorted by expected (based on internal heuristic) relative performance with fastest being index 0 of perfResults. For an exhaustive search for the fastest algorithm, please use cudnnFindConvolutionBackwardFilterAlgorithm.

Declaration

public cudnnConvolutionBwdFilterAlgoPerf[] GetConvolutionBackwardFilterAlgorithm(TensorDescriptor xDesc, TensorDescriptor dyDesc, FilterDescriptor filterDesc, ConvolutionDescriptor convDesc, int requestedAlgoCount)

Parameters

Type	Name	Description
TensorDescriptor	xDesc	Handle to the previously initialized input tensor descriptor.
TensorDescriptor	dyDesc	Handle to the previously initialized input differential tensor descriptor.
FilterDescriptor	filterDesc	Handle to a previously initialized filter descriptor.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
System.Int32	requestedAlgoCount	The maximum number of elements to be stored in perfResults.

Returns

Type	Description
cudnnConvolutionBwdFilterAlgoPerf[]	An array to store performance metrics sorted ascending by compute time.

GetConvolutionBackwardFilterAlgorithmMaxCount()

Declaration

public int GetConvolutionBackwardFilterAlgorithmMaxCount()

Returns

Type	Description
System.Int32

GetConvolutionBackwardFilterWorkspaceSize(TensorDescriptor, TensorDescriptor, ConvolutionDescriptor, FilterDescriptor, cudnnConvolutionBwdFilterAlgo)

This function returns the amount of GPU memory workspace the user needs to allocate to be able to call cudnnConvolutionBackwardFilter_v3 with the specified algorithm. The workspace allocated will then be passed to the routine cudnnConvolutionBackwardFilter_v3. The specified algorithm can be the result of the call to cudnnGetConvolutionBackwardFilterAlgorithm or can be chosen arbitrarily by the user. Note that not every algorithm is available for every configuration of the input tensor and/or every configuration of the convolution descriptor.

Declaration

public SizeT GetConvolutionBackwardFilterWorkspaceSize(TensorDescriptor xDesc, TensorDescriptor dyDesc, ConvolutionDescriptor convDesc, FilterDescriptor gradDesc, cudnnConvolutionBwdFilterAlgo algo)

Parameters

Type	Name	Description
TensorDescriptor	xDesc	Handle to the previously initialized input tensor descriptor.
TensorDescriptor	dyDesc	Handle to the previously initialized input differential tensor descriptor.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
FilterDescriptor	gradDesc	Handle to a previously initialized filter descriptor.
cudnnConvolutionBwdFilterAlgo	algo	Enumerant that specifies the chosen convolution algorithm.

Returns

Type	Description
SizeT	Amount of GPU memory needed as workspace to be able to execute a backward filter convolution with the specified algo.

GetConvolutionForwardAlgorithm(TensorDescriptor, FilterDescriptor, ConvolutionDescriptor, TensorDescriptor, cudnnConvolutionFwdPreference, SizeT)

This function serves as a heuristic for obtaining the best suited algorithm for cudnnConvolutionForward for the given layer specifications. Based on the input preference, this function will either return the fastest algorithm or the fastest algorithm within a given memory limit. For an exhaustive search for the fastest algorithm, please use cudnnFindConvolutionForwardAlgorithm.

Declaration

public cudnnConvolutionFwdAlgo GetConvolutionForwardAlgorithm(TensorDescriptor xDesc, FilterDescriptor filterDesc, ConvolutionDescriptor convDesc, TensorDescriptor yDesc, cudnnConvolutionFwdPreference preference, SizeT memoryLimitInbytes)

Parameters

Type	Name	Description
TensorDescriptor	xDesc	Handle to the previously initialized input tensor descriptor.
FilterDescriptor	filterDesc	Handle to a previously initialized filter descriptor.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
TensorDescriptor	yDesc	Handle to the previously initialized output tensor descriptor.
cudnnConvolutionFwdPreference	preference	Enumerant to express the preference criteria in terms of memory requirement and speed.
SizeT	memoryLimitInbytes	Used when preference is set to CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT, to specify the maximum amount of GPU memory the user is willing to use as a workspace.

Returns

Type	Description
cudnnConvolutionFwdAlgo	Enumerant that specifies which convolution algorithm should be used to compute the results according to the specified preference

GetConvolutionForwardAlgorithm(TensorDescriptor, FilterDescriptor, ConvolutionDescriptor, TensorDescriptor, Int32)

This function serves as a heuristic for obtaining the best suited algorithm for cudnnConvolutionForward for the given layer specifications. This function will return all algorithms sorted by expected (based on internal heuristic) relative performance with fastest being index 0 of perfResults. For an exhaustive search for the fastest algorithm, please use cudnnFindConvolutionForwardAlgorithm.

Declaration

public cudnnConvolutionFwdAlgoPerf[] GetConvolutionForwardAlgorithm(TensorDescriptor xDesc, FilterDescriptor filterDesc, ConvolutionDescriptor convDesc, TensorDescriptor yDesc, int requestedAlgoCount)

Parameters

Type	Name	Description
TensorDescriptor	xDesc	Handle to the previously initialized input tensor descriptor.
FilterDescriptor	filterDesc	Handle to a previously initialized filter descriptor.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
TensorDescriptor	yDesc	Handle to the previously initialized output tensor descriptor.
System.Int32	requestedAlgoCount	The maximum number of elements to be stored in perfResults.

Returns

Type	Description
cudnnConvolutionFwdAlgoPerf[]	An array to store performance metrics sorted ascending by compute time.

GetConvolutionForwardAlgorithmMaxCount()

Declaration

public int GetConvolutionForwardAlgorithmMaxCount()

Returns

Type	Description
System.Int32

GetConvolutionForwardWorkspaceSize(TensorDescriptor, FilterDescriptor, ConvolutionDescriptor, TensorDescriptor, cudnnConvolutionFwdAlgo)

This function returns the amount of GPU memory workspace the user needs to allocate to be able to call cudnnConvolutionForward with the specified algorithm. The workspace allocated will then be passed to the routine cudnnConvolutionForward. The specified algorithm can be the result of the call to cudnnGetConvolutionForwardAlgorithm or can be chosen arbitrarily by the user. Note that not every algorithm is available for every configuration of the input tensor and/or every configuration of the convolution descriptor.

Declaration

public SizeT GetConvolutionForwardWorkspaceSize(TensorDescriptor xDesc, FilterDescriptor filterDesc, ConvolutionDescriptor convDesc, TensorDescriptor yDesc, cudnnConvolutionFwdAlgo algo)

Parameters

Type	Name	Description
TensorDescriptor	xDesc	Handle to the previously initialized input tensor descriptor.
FilterDescriptor	filterDesc	Handle to a previously initialized filter descriptor.
ConvolutionDescriptor	convDesc	Previously initialized convolution descriptor.
TensorDescriptor	yDesc	Handle to the previously initialized output tensor descriptor.
cudnnConvolutionFwdAlgo	algo	Enumerant that specifies the chosen convolution algorithm

Returns

Type	Description
SizeT

GetDropoutDescriptor(DropoutDescriptor, ref Single, ref UInt64)

This function queries the fields of a previously initialized dropout descriptor.

Declaration

public CudaDeviceVariable<byte> GetDropoutDescriptor(DropoutDescriptor dropoutDesc, ref float droupout, ref ulong seed)

Parameters

Type	Name	Description
DropoutDescriptor	dropoutDesc	Previously initialized dropout descriptor.
System.Single	droupout	The probability with which the value from input is set to 0 during the dropout layer.
System.UInt64	seed	Seed used to initialize random number generator states.

Returns

Type	Description
CudaDeviceVariable<System.Byte>	User-allocated GPU memory that holds random number generator states.

GetDropoutReserveSpaceSize(TensorDescriptor)

This function is used to query the amount of reserve space needed to run dropout with the input dimensions given by xDesc.
The same reserve space is expected to be passed to cudnnDropoutForward and cudnnDropoutBackward, and its contents are
expected to remain unchanged between cudnnDropoutForward and cudnnDropoutBackward calls.

Declaration

public SizeT GetDropoutReserveSpaceSize(TensorDescriptor xDesc)

Parameters

Type	Name	Description
TensorDescriptor	xDesc	Handle to a previously initialized tensor descriptor, describing input to a dropout operation.

Returns

Type	Description
SizeT

GetDropoutStateSize()

This function is used to query the amount of space required to store the states of the random number generators used by the cudnnDropoutForward function.

Declaration

public SizeT GetDropoutStateSize()

Returns

Type	Description
SizeT

GetReductionIndicesSize(ReduceTensorDescriptor, TensorDescriptor, TensorDescriptor)

Helper function to return the minimum size of the index space to be passed to the reduction given the input and output tensors

Declaration

public SizeT GetReductionIndicesSize(ReduceTensorDescriptor reduceTensorDesc, TensorDescriptor aDesc, TensorDescriptor cDesc)

Parameters

Type	Name	Description
ReduceTensorDescriptor	reduceTensorDesc
TensorDescriptor	aDesc
TensorDescriptor	cDesc

Returns

Type	Description
SizeT

GetReductionWorkspaceSize(ReduceTensorDescriptor, TensorDescriptor, TensorDescriptor)

Helper function to return the minimum size of the workspace to be passed to the reduction given the input and output tensors

Declaration

public SizeT GetReductionWorkspaceSize(ReduceTensorDescriptor reduceTensorDesc, TensorDescriptor aDesc, TensorDescriptor cDesc)

Parameters

Type	Name	Description
ReduceTensorDescriptor	reduceTensorDesc
TensorDescriptor	aDesc
TensorDescriptor	cDesc

Returns

Type	Description
SizeT

GetStream()

This function gets the stream to be used by the cuDNN library to execute its routines.

Declaration

public CudaStream GetStream()

Returns

Type	Description
CudaStream

Im2Col(TensorDescriptor, CudaDeviceVariable<Double>, FilterDescriptor, ConvolutionDescriptor, CudaDeviceVariable<Byte>)

Declaration

public void Im2Col(TensorDescriptor srcDesc, CudaDeviceVariable<double> srcData, FilterDescriptor filterDesc, ConvolutionDescriptor convDesc, CudaDeviceVariable<byte> colBuffer)

Parameters

Type	Name	Description
TensorDescriptor	srcDesc
CudaDeviceVariable<System.Double>	srcData
FilterDescriptor	filterDesc
ConvolutionDescriptor	convDesc
CudaDeviceVariable<System.Byte>	colBuffer

Im2Col(TensorDescriptor, CudaDeviceVariable<Single>, FilterDescriptor, ConvolutionDescriptor, CudaDeviceVariable<Byte>)

Declaration

public void Im2Col(TensorDescriptor srcDesc, CudaDeviceVariable<float> srcData, FilterDescriptor filterDesc, ConvolutionDescriptor convDesc, CudaDeviceVariable<byte> colBuffer)

Parameters

Type	Name	Description
TensorDescriptor	srcDesc
CudaDeviceVariable<System.Single>	srcData
FilterDescriptor	filterDesc
ConvolutionDescriptor	convDesc
CudaDeviceVariable<System.Byte>	colBuffer

OpTensor(OpTensorDescriptor, Double, TensorDescriptor, CudaDeviceVariable<Double>, Double, TensorDescriptor, CudaDeviceVariable<Double>, Double, TensorDescriptor, CudaDeviceVariable<Double>)

This function implements the equation C = op(alpha1[0] * A, alpha2[0] * B) + beta[0] * C, given tensors A, B, and C and scaling factors alpha1, alpha2, and beta. The op to use is indicated by the descriptor opTensorDesc. Currently-supported ops are listed by the cudnnOpTensorOp_t enum. Each dimension of the input tensor A must match the corresponding dimension of the destination tensor C, and each dimension of the input tensor B must match the corresponding dimension of the destination tensor C or must be equal to 1. In the latter case, the same value from the input tensor B for those dimensions will be used to blend into the C tensor. The data types of the input tensors A and B must match. If the data type of the destination tensor C is double, then the data type of the input tensors also must be double. If the data type of the destination tensor C is double, then opTensorCompType in opTensorDesc must be double. Else opTensorCompType must be float. If the input tensor B is the same tensor as the destination tensor C, then the input tensor A also must be the same tensor as the destination tensor C.

Declaration

public void OpTensor(OpTensorDescriptor op_desc, double alpha1, TensorDescriptor a_desc, CudaDeviceVariable<double> a, double alpha2, TensorDescriptor b_desc, CudaDeviceVariable<double> b, double beta, TensorDescriptor c_desc, CudaDeviceVariable<double> c)

Parameters

Type	Name	Description
OpTensorDescriptor	op_desc	Handle to a previously initialized op tensor descriptor.
System.Double	alpha1	Pointer to the scaling factor (in host memory) used to blend the source value with prior value in the destination tensor as indicated by the above op equation.
TensorDescriptor	a_desc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	a	Pointer to data of the tensor described by the a_desc.
System.Double	alpha2	Pointer to the scaling factor (in host memory) used to blend the source value with prior value in the destination tensor as indicated by the above op equation.
TensorDescriptor	b_desc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	b	Pointer to data of the tensor described by the b_desc.
System.Double	beta	Pointer to the scaling factor (in host memory) used to blend the source value with prior value in the destination tensor as indicated by the above op equation.
TensorDescriptor	c_desc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	c	Output pointer to data of the tensor described by the c_desc.
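
The OpTensor equation can be illustrated with a small CPU reference. This is a sketch under simplifying assumptions: tensors are flattened to one dimension, B either matches C in length or has length 1 (the broadcast case), and the op is passed as a delegate rather than through an OpTensorDescriptor. The class OpTensorSketch and its Apply method are hypothetical names, not part of the library.

```csharp
using System;

// Hypothetical CPU reference; not part of ManagedCuda or cuDNN.
public static class OpTensorSketch
{
    // Computes c[i] = op(alpha1 * a[i], alpha2 * b[i]) + beta * c[i] in place.
    // A b of length 1 is broadcast across every element of c.
    public static void Apply(
        Func<double, double, double> op,
        double alpha1, double[] a,
        double alpha2, double[] b,
        double beta, double[] c)
    {
        if (a.Length != c.Length)
            throw new ArgumentException("A must match C in every dimension.");
        if (b.Length != c.Length && b.Length != 1)
            throw new ArgumentException("B must match C or have size 1.");

        for (int i = 0; i < c.Length; i++)
        {
            double bValue = b.Length == 1 ? b[0] : b[i];
            c[i] = op(alpha1 * a[i], alpha2 * bValue) + beta * c[i];
        }
    }
}
```

For example, with op = addition, alpha1 = 2, alpha2 = 1, beta = 0.5, A = {1, 2, 3}, B = {10} and C = {100, 100, 100}, the first element becomes 2*1 + 10 + 0.5*100 = 62.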

OpTensor(OpTensorDescriptor, Single, TensorDescriptor, CudaDeviceVariable<Single>, Single, TensorDescriptor, CudaDeviceVariable<Single>, Single, TensorDescriptor, CudaDeviceVariable<Single>)

Declaration

public void OpTensor(OpTensorDescriptor op_desc, float alpha1, TensorDescriptor a_desc, CudaDeviceVariable<float> a, float alpha2, TensorDescriptor b_desc, CudaDeviceVariable<float> b, float beta, TensorDescriptor c_desc, CudaDeviceVariable<float> c)

Parameters

Type	Name	Description
OpTensorDescriptor	op_desc	Handle to a previously initialized op tensor descriptor.
System.Single	alpha1	Pointer to the scaling factor (in host memory) used to blend the source value with prior value in the destination tensor as indicated by the above op equation.
TensorDescriptor	a_desc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	a	Pointer to data of the tensor described by the a_desc.
System.Single	alpha2	Pointer to the scaling factor (in host memory) used to blend the source value with prior value in the destination tensor as indicated by the above op equation.
TensorDescriptor	b_desc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	b	Pointer to data of the tensor described by the b_desc.
System.Single	beta	Pointer to the scaling factor (in host memory) used to blend the source value with prior value in the destination tensor as indicated by the above op equation.
TensorDescriptor	c_desc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	c	Output pointer to data of the tensor described by the c_desc.

PoolingBackward(PoolingDescriptor, Double, TensorDescriptor, CudaDeviceVariable<Double>, TensorDescriptor, CudaDeviceVariable<Double>, TensorDescriptor, CudaDeviceVariable<Double>, Double, TensorDescriptor, CudaDeviceVariable<Double>)

This function computes the gradient of a pooling operation.

Declaration

public void PoolingBackward(PoolingDescriptor poolingDesc, double alpha, TensorDescriptor yDesc, CudaDeviceVariable<double> y, TensorDescriptor dyDesc, CudaDeviceVariable<double> dy, TensorDescriptor xDesc, CudaDeviceVariable<double> x, double beta, TensorDescriptor dxDesc, CudaDeviceVariable<double> dx)

Parameters

Type	Name	Description
PoolingDescriptor	poolingDesc	Handle to the previously initialized pooling descriptor.
System.Double	alpha	Pointer to scaling factors (in host memory) used to blend the computation result with prior value in the output layer as follows: dstValue = alpha[0]*result + beta[0]*priorDstValue.
TensorDescriptor	yDesc	Handle to the previously initialized input tensor descriptor.
CudaDeviceVariable<System.Double>	y	Data pointer to GPU memory associated with the tensor descriptor yDesc.
TensorDescriptor	dyDesc	Handle to the previously initialized input differential tensor descriptor.
CudaDeviceVariable<System.Double>	dy	Data pointer to GPU memory associated with the tensor descriptor dyDesc.
TensorDescriptor	xDesc	Handle to the previously initialized output tensor descriptor.
CudaDeviceVariable<System.Double>	x	Data pointer to GPU memory associated with the output tensor descriptor xDesc.
System.Double	beta	Pointer to scaling factors (in host memory) used to blend the computation result with prior value in the output layer as follows: dstValue = alpha[0]*result + beta[0]*priorDstValue.
TensorDescriptor	dxDesc	Handle to the previously initialized output differential tensor descriptor.
CudaDeviceVariable<System.Double>	dx	Data pointer to GPU memory associated with the output tensor descriptor dxDesc.
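
The blending rule given for alpha and beta can be sketched on the CPU as below. The class BlendSketch and its Blend method are hypothetical names for illustration only. The sketch shows the arithmetic applied to the destination buffer and says nothing about how the library computes the result itself.

```csharp
using System;

// Hypothetical CPU reference; not part of ManagedCuda or cuDNN.
public static class BlendSketch
{
    // Applies dst[i] = alpha * result[i] + beta * dst[i] in place.
    public static void Blend(double alpha, double[] result, double beta, double[] dst)
    {
        if (result.Length != dst.Length)
            throw new ArgumentException("result and dst must have the same length.");

        for (int i = 0; i < dst.Length; i++)
            dst[i] = alpha * result[i] + beta * dst[i];
    }
}
```

In this arithmetic, alpha = 1 and beta = 0 overwrite the destination with the new result, while alpha = 1 and beta = 1 accumulate the new result onto the prior values.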

PoolingBackward(PoolingDescriptor, Single, TensorDescriptor, CudaDeviceVariable<Single>, TensorDescriptor, CudaDeviceVariable<Single>, TensorDescriptor, CudaDeviceVariable<Single>, Single, TensorDescriptor, CudaDeviceVariable<Single>)

This function computes the gradient of a pooling operation.

Declaration

public void PoolingBackward(PoolingDescriptor poolingDesc, float alpha, TensorDescriptor yDesc, CudaDeviceVariable<float> y, TensorDescriptor dyDesc, CudaDeviceVariable<float> dy, TensorDescriptor xDesc, CudaDeviceVariable<float> x, float beta, TensorDescriptor dxDesc, CudaDeviceVariable<float> dx)

Parameters

Type	Name	Description
PoolingDescriptor	poolingDesc	Handle to the previously initialized pooling descriptor.
System.Single	alpha	Scaling factor applied to the computation result, which is blended with the prior value in the output layer as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	yDesc	Handle to the previously initialized input tensor descriptor.
CudaDeviceVariable<System.Single>	y	Data pointer to GPU memory associated with the tensor descriptor yDesc.
TensorDescriptor	dyDesc	Handle to the previously initialized input differential tensor descriptor.
CudaDeviceVariable<System.Single>	dy	Data pointer to GPU memory associated with the tensor descriptor dyDesc.
TensorDescriptor	xDesc	Handle to the previously initialized output tensor descriptor.
CudaDeviceVariable<System.Single>	x	Data pointer to GPU memory associated with the output tensor descriptor xDesc.
System.Single	beta	Scaling factor applied to the prior value in the output layer, blended as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	dxDesc	Handle to the previously initialized output differential tensor descriptor.
CudaDeviceVariable<System.Single>	dx	Data pointer to GPU memory associated with the output tensor descriptor dxDesc.
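
The role of the y, dy, and x arguments may be easier to see in a small CPU reference. The following is only a sketch under simplifying assumptions (1-D max pooling, window equal to stride, no padding, gradient routed to the first element matching the pooled maximum); the class and method names are hypothetical and are not part of ManagedCuda, and the library's own tie handling is not described here.

```csharp
using System;

public static class PoolingBackwardSketch
{
    // CPU reference for the gradient of 1-D max pooling.
    // Assumptions: window == stride, no padding, ties go to the first match.
    public static void MaxPool1DBackward(
        float alpha, float[] y, float[] dy, float[] x, int window,
        float beta, float[] dx)
    {
        if (dx.Length != x.Length || dy.Length != y.Length ||
            x.Length != y.Length * window)
            throw new ArgumentException("Array lengths do not match the pooling shape.");

        // Scale the prior gradient values first (the beta term).
        for (int i = 0; i < dx.Length; i++)
            dx[i] = beta * dx[i];

        for (int o = 0; o < y.Length; o++)
        {
            // Find the input element that produced the pooled maximum y[o]
            // and add the scaled output gradient there (the alpha term).
            for (int k = 0; k < window; k++)
            {
                int idx = o * window + k;
                if (x[idx] == y[o])
                {
                    dx[idx] += alpha * dy[o];
                    break;
                }
            }
        }
    }
}
```

With x = {1, 3, 2, 5}, y = {3, 5}, dy = {10, 20}, alpha = 1 and beta = 0, only the positions that held each window's maximum receive a gradient.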

PoolingForward(PoolingDescriptor, Double, TensorDescriptor, CudaDeviceVariable<Double>, Double, TensorDescriptor, CudaDeviceVariable<Double>)

This function computes pooling of input values (i.e., the maximum or average of several adjacent values) to produce an output with smaller height and/or width.

Declaration

public void PoolingForward(PoolingDescriptor poolingDesc, double alpha, TensorDescriptor xDesc, CudaDeviceVariable<double> x, double beta, TensorDescriptor yDesc, CudaDeviceVariable<double> y)

Parameters

Type	Name	Description
PoolingDescriptor	poolingDesc	Handle to a previously initialized pooling descriptor.
System.Double	alpha	Scaling factor applied to the computation result, which is blended with the prior value in the output layer as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	xDesc	Handle to the previously initialized input tensor descriptor.
CudaDeviceVariable<System.Double>	x	Data pointer to GPU memory associated with the tensor descriptor xDesc.
System.Double	beta	Scaling factor applied to the prior value in the output layer, blended as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	yDesc	Handle to the previously initialized output tensor descriptor.
CudaDeviceVariable<System.Double>	y	Data pointer to GPU memory associated with the output tensor descriptor yDesc.
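
To make the pooling and blending semantics concrete, here is a minimal CPU sketch. It assumes 1-D max pooling with non-overlapping windows (window equal to stride) and no padding; the class and method names are hypothetical and do not belong to ManagedCuda.

```csharp
using System;

public static class PoolingForwardSketch
{
    // CPU reference for 1-D max pooling with non-overlapping windows.
    // Assumptions: window == stride and no padding.
    public static void MaxPool1D(
        float alpha, float[] x, int window, float beta, float[] y)
    {
        if (window <= 0 || x.Length != y.Length * window)
            throw new ArgumentException("y must hold exactly one value per window.");

        for (int o = 0; o < y.Length; o++)
        {
            // Maximum of the adjacent input values in this window.
            float max = float.NegativeInfinity;
            for (int k = 0; k < window; k++)
                max = Math.Max(max, x[o * window + k]);

            // Blend the pooled result with the prior output value.
            y[o] = alpha * max + beta * y[o];
        }
    }
}
```

With x = {1, 3, 2, 5}, a window of 2, prior y = {10, 20}, alpha = 1 and beta = 0.5, the output has half the input width and each entry combines the window maximum with half of the prior value.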

PoolingForward(PoolingDescriptor, Single, TensorDescriptor, CudaDeviceVariable<Single>, Single, TensorDescriptor, CudaDeviceVariable<Single>)

This function computes pooling of input values (i.e., the maximum or average of several adjacent values) to produce an output with smaller height and/or width.

Declaration

public void PoolingForward(PoolingDescriptor poolingDesc, float alpha, TensorDescriptor xDesc, CudaDeviceVariable<float> x, float beta, TensorDescriptor yDesc, CudaDeviceVariable<float> y)

Parameters

Type	Name	Description
PoolingDescriptor	poolingDesc	Handle to a previously initialized pooling descriptor.
System.Single	alpha	Scaling factor applied to the computation result, which is blended with the prior value in the output layer as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	xDesc	Handle to the previously initialized input tensor descriptor.
CudaDeviceVariable<System.Single>	x	Data pointer to GPU memory associated with the tensor descriptor xDesc.
System.Single	beta	Scaling factor applied to the prior value in the output layer, blended as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	yDesc	Handle to the previously initialized output tensor descriptor.
CudaDeviceVariable<System.Single>	y	Data pointer to GPU memory associated with the output tensor descriptor yDesc.

QueryRuntimeError(cudnnErrQueryMode)

cuDNN library functions perform extensive input argument checking before launching GPU kernels. The last step is to verify that the GPU kernel actually started. When a kernel fails to start, CUDNN_STATUS_EXECUTION_FAILED is returned by the corresponding API call. Typically, after a GPU kernel starts, no runtime checks are performed by the kernel itself; numerical results are simply written to output buffers.

When the CUDNN_BATCHNORM_SPATIAL_PERSISTENT mode is selected in cudnnBatchNormalizationForwardTraining or cudnnBatchNormalizationBackward, the algorithm may encounter numerical overflows where CUDNN_BATCHNORM_SPATIAL performs just fine albeit at a slower speed.

The user can invoke cudnnQueryRuntimeError to make sure numerical overflows did not occur during the kernel execution. Those issues are reported by the kernel that performs computations.

Declaration

public cudnnStatus QueryRuntimeError(cudnnErrQueryMode mode)

Parameters

Type	Name	Description
cudnnErrQueryMode	mode	Remote error query mode.

Returns

Type	Description
cudnnStatus	The error code reported by the query.

ReduceTensor(ReduceTensorDescriptor, CudaDeviceVariable<UInt32>, CudaDeviceVariable<Byte>, SizeT, Double, TensorDescriptor, CudaDeviceVariable<Double>, Double, TensorDescriptor, CudaDeviceVariable<Double>)

This function reduces tensor A by implementing the equation C = alpha * reduce op ( A ) + beta * C, given tensors A and C and scaling factors alpha and beta. The reduction op to use is indicated by the descriptor reduceTensorDesc. Currently supported ops are listed by the cudnnReduceTensorOp_t enum.

Declaration

public void ReduceTensor(ReduceTensorDescriptor reduceTensorDesc, CudaDeviceVariable<uint> indices, CudaDeviceVariable<byte> workspace, SizeT workspaceSizeInBytes, double alpha, TensorDescriptor aDesc, CudaDeviceVariable<double> A, double beta, TensorDescriptor cDesc, CudaDeviceVariable<double> C)

Parameters

Type	Name	Description
ReduceTensorDescriptor	reduceTensorDesc	Handle to a previously initialized reduce tensor descriptor.
CudaDeviceVariable<System.UInt32>	indices	Handle to a previously allocated space for writing indices.
CudaDeviceVariable<System.Byte>	workspace	Handle to a previously allocated space for the reduction implementation.
SizeT	workspaceSizeInBytes	Size of the above previously allocated space.
System.Double	alpha	Scaling factor applied to the reduced source value, as indicated by the above op equation.
TensorDescriptor	aDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	A	Pointer to data of the tensor described by the aDesc descriptor.
System.Double	beta	Scaling factor applied to the prior value in the destination tensor, as indicated by the above op equation.
TensorDescriptor	cDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	C	Pointer to data of the tensor described by the cDesc descriptor.
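
As an illustration of the equation C = alpha * reduce op ( A ) + beta * C, the sketch below performs the reduction on the CPU. It assumes an addition reduction over the columns of a row-major matrix, so that C holds one value per row; the class and method names are hypothetical, and the indices and workspace arguments of the real call are not modeled.

```csharp
using System;

public static class ReduceTensorSketch
{
    // CPU reference for C = alpha * sum(A over columns) + beta * C.
    // Assumption: A is a row-major rows x cols matrix, C has one entry per row.
    public static void ReduceRowsAdd(
        double alpha, double[] a, int rows, int cols, double beta, double[] c)
    {
        if (a.Length != rows * cols || c.Length != rows)
            throw new ArgumentException("Array lengths do not match rows and cols.");

        for (int r = 0; r < rows; r++)
        {
            // Reduce one row of A with the addition op.
            double sum = 0.0;
            for (int k = 0; k < cols; k++)
                sum += a[r * cols + k];

            // Blend the reduced value with the prior destination value.
            c[r] = alpha * sum + beta * c[r];
        }
    }
}
```

For A = {1, 2, 3, 4, 5, 6} viewed as 2 x 3, prior C = {1, 1}, alpha = 2 and beta = 3, each entry of C becomes twice the row sum plus three times its prior value.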

ReduceTensor(ReduceTensorDescriptor, CudaDeviceVariable<UInt32>, CudaDeviceVariable<Byte>, SizeT, Single, TensorDescriptor, CudaDeviceVariable<Single>, Single, TensorDescriptor, CudaDeviceVariable<Single>)

This function reduces tensor A by implementing the equation C = alpha * reduce op ( A ) + beta * C, given tensors A and C and scaling factors alpha and beta. The reduction op to use is indicated by the descriptor reduceTensorDesc. Currently supported ops are listed by the cudnnReduceTensorOp_t enum.

Declaration

public void ReduceTensor(ReduceTensorDescriptor reduceTensorDesc, CudaDeviceVariable<uint> indices, CudaDeviceVariable<byte> workspace, SizeT workspaceSizeInBytes, float alpha, TensorDescriptor aDesc, CudaDeviceVariable<float> A, float beta, TensorDescriptor cDesc, CudaDeviceVariable<float> C)

Parameters

Type	Name	Description
ReduceTensorDescriptor	reduceTensorDesc	Handle to a previously initialized reduce tensor descriptor.
CudaDeviceVariable<System.UInt32>	indices	Handle to a previously allocated space for writing indices.
CudaDeviceVariable<System.Byte>	workspace	Handle to a previously allocated space for the reduction implementation.
SizeT	workspaceSizeInBytes	Size of the above previously allocated space.
System.Single	alpha	Scaling factor applied to the reduced source value, as indicated by the above op equation.
TensorDescriptor	aDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	A	Pointer to data of the tensor described by the aDesc descriptor.
System.Single	beta	Scaling factor applied to the prior value in the destination tensor, as indicated by the above op equation.
TensorDescriptor	cDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	C	Pointer to data of the tensor described by the cDesc descriptor.

RestoreDropoutDescriptor(DropoutDescriptor, CudaDeviceVariable<Byte>, ref Single, ref UInt64)

This function restores a dropout descriptor to a previously saved-off state.

Declaration

public void RestoreDropoutDescriptor(DropoutDescriptor dropoutDesc, CudaDeviceVariable<byte> states, ref float droupout, ref ulong seed)

Parameters

Type	Name	Description
DropoutDescriptor	dropoutDesc	Previously created dropout descriptor.
CudaDeviceVariable<System.Byte>	states	Pointer to GPU memory that holds random number generator states initialized by a prior call to cudnnSetDropoutDescriptor.
System.Single	droupout	Probability with which the value from an input tensor is set to 0 when performing dropout.
System.UInt64	seed	Seed used in the prior call to cudnnSetDropoutDescriptor that initialized the states buffer. Using a different seed from this has no effect. A change of seed, and a subsequent update to the random number generator states, can be achieved by calling cudnnSetDropoutDescriptor.

ScaleTensor(TensorDescriptor, CudaDeviceVariable<Double>, Double)

This function scales all the elements of a tensor by a given factor.

Declaration

public void ScaleTensor(TensorDescriptor yDesc, CudaDeviceVariable<double> y, double alpha)

Parameters

Type	Name	Description
TensorDescriptor	yDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	y	Pointer to data of the tensor described by the yDesc descriptor.
System.Double	alpha	Value by which all elements of the tensor will be scaled.

ScaleTensor(TensorDescriptor, CudaDeviceVariable<Single>, Single)

This function scales all the elements of a tensor by a given factor.

Declaration

public void ScaleTensor(TensorDescriptor yDesc, CudaDeviceVariable<float> y, float alpha)

Parameters

Type	Name	Description
TensorDescriptor	yDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	y	Pointer to data of the tensor described by the yDesc descriptor.
System.Single	alpha	Value by which all elements of the tensor will be scaled.

SetRNNDescriptor(RNNDescriptor, Int32, Int32, DropoutDescriptor, cudnnRNNInputMode, cudnnDirectionMode, cudnnRNNMode, cudnnRNNAlgo, cudnnDataType)

This function initializes a previously created RNN descriptor object.

Declaration

public void SetRNNDescriptor(RNNDescriptor rnnDesc, int hiddenSize, int numLayers, DropoutDescriptor dropoutDesc, cudnnRNNInputMode inputMode, cudnnDirectionMode direction, cudnnRNNMode mode, cudnnRNNAlgo algo, cudnnDataType dataType)

Parameters

Type	Name	Description
RNNDescriptor	rnnDesc	A previously created RNN descriptor.
System.Int32	hiddenSize	Size of the internal hidden state for each layer.
System.Int32	numLayers	Number of stacked layers.
DropoutDescriptor	dropoutDesc	Handle to a previously created and initialized dropout descriptor. Dropout will be applied between layers (e.g., a single-layer network will have no dropout applied).
cudnnRNNInputMode	inputMode	Specifies the behavior at the input to the first layer.
cudnnDirectionMode	direction	Specifies the recurrence pattern (e.g., bidirectional).
cudnnRNNMode	mode	Specifies the type of RNN to compute.
cudnnRNNAlgo	algo	Specifies which RNN algorithm should be used to compute the results.
cudnnDataType	dataType	Compute precision.

SetStream(CudaStream)

This function sets the stream to be used by the cuDNN library to execute its routines.

Declaration

public void SetStream(CudaStream stream)

Parameters

Type	Name	Description
CudaStream	stream	The stream to be used by the library.

SetTensor(TensorDescriptor, CudaDeviceVariable<Double>, Double)

This function sets all the elements of a tensor to a given value.

Declaration

public void SetTensor(TensorDescriptor yDesc, CudaDeviceVariable<double> y, double value)

Parameters

Type	Name	Description
TensorDescriptor	yDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	y	Pointer to data of the tensor described by the yDesc descriptor.
System.Double	value	Value that all elements of the tensor will be set to.

SetTensor(TensorDescriptor, CudaDeviceVariable<Single>, Single)

This function sets all the elements of a tensor to a given value.

Declaration

public void SetTensor(TensorDescriptor yDesc, CudaDeviceVariable<float> y, float value)

Parameters

Type	Name	Description
TensorDescriptor	yDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	y	Pointer to data of the tensor described by the yDesc descriptor.
System.Single	value	Value that all elements of the tensor will be set to.

SoftmaxBackward(cudnnSoftmaxAlgorithm, cudnnSoftmaxMode, Double, TensorDescriptor, CudaDeviceVariable<Double>, TensorDescriptor, CudaDeviceVariable<Double>, Double, TensorDescriptor, CudaDeviceVariable<Double>)

This routine computes the gradient of the softmax function.

Declaration

public void SoftmaxBackward(cudnnSoftmaxAlgorithm algorithm, cudnnSoftmaxMode mode, double alpha, TensorDescriptor yDesc, CudaDeviceVariable<double> y, TensorDescriptor dyDesc, CudaDeviceVariable<double> dy, double beta, TensorDescriptor dxDesc, CudaDeviceVariable<double> dx)

Parameters

Type	Name	Description
cudnnSoftmaxAlgorithm	algorithm	Enumerant to specify the softmax algorithm.
cudnnSoftmaxMode	mode	Enumerant to specify the softmax mode.
System.Double	alpha	Scaling factor applied to the computation result, which is blended with the prior value in the output layer as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	yDesc	Handle to the previously initialized input tensor descriptor.
CudaDeviceVariable<System.Double>	y	Data pointer to GPU memory associated with the tensor descriptor yDesc.
TensorDescriptor	dyDesc	Handle to the previously initialized input differential tensor descriptor.
CudaDeviceVariable<System.Double>	dy	Data pointer to GPU memory associated with the tensor descriptor dyDesc.
System.Double	beta	Scaling factor applied to the prior value in the output layer, blended as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	dxDesc	Handle to the previously initialized output differential tensor descriptor.
CudaDeviceVariable<System.Double>	dx	Data pointer to GPU memory associated with the output tensor descriptor dxDesc.
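
The gradient computed here can be written as dx[i] = y[i] * (dy[i] - sum over j of dy[j] * y[j]) for the standard softmax. The CPU sketch below applies that formula to a single vector and then blends it with alpha and beta; it is an illustrative assumption about the math only, the class and method names are hypothetical, and the algorithm and mode arguments of the real call are not modeled.

```csharp
using System;

public static class SoftmaxBackwardSketch
{
    // CPU reference for the softmax gradient over one vector.
    // y is the forward softmax output, dy the incoming gradient.
    public static void Backward(
        double alpha, double[] y, double[] dy, double beta, double[] dx)
    {
        if (dy.Length != y.Length || dx.Length != y.Length)
            throw new ArgumentException("y, dy and dx must have the same length.");

        // Inner product of the incoming gradient with the softmax output.
        double dot = 0.0;
        for (int j = 0; j < y.Length; j++)
            dot += dy[j] * y[j];

        for (int i = 0; i < y.Length; i++)
        {
            double grad = y[i] * (dy[i] - dot);
            // Blend the gradient with the prior value of dx.
            dx[i] = alpha * grad + beta * dx[i];
        }
    }
}
```

Note that only y and dy are needed; the forward input x does not appear, which matches the parameter list above.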

SoftmaxBackward(cudnnSoftmaxAlgorithm, cudnnSoftmaxMode, Single, TensorDescriptor, CudaDeviceVariable<Single>, TensorDescriptor, CudaDeviceVariable<Single>, Single, TensorDescriptor, CudaDeviceVariable<Single>)

This routine computes the gradient of the softmax function.

Declaration

public void SoftmaxBackward(cudnnSoftmaxAlgorithm algorithm, cudnnSoftmaxMode mode, float alpha, TensorDescriptor yDesc, CudaDeviceVariable<float> y, TensorDescriptor dyDesc, CudaDeviceVariable<float> dy, float beta, TensorDescriptor dxDesc, CudaDeviceVariable<float> dx)

Parameters

Type	Name	Description
cudnnSoftmaxAlgorithm	algorithm	Enumerant to specify the softmax algorithm.
cudnnSoftmaxMode	mode	Enumerant to specify the softmax mode.
System.Single	alpha	Scaling factor applied to the computation result, which is blended with the prior value in the output layer as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	yDesc	Handle to the previously initialized input tensor descriptor.
CudaDeviceVariable<System.Single>	y	Data pointer to GPU memory associated with the tensor descriptor yDesc.
TensorDescriptor	dyDesc	Handle to the previously initialized input differential tensor descriptor.
CudaDeviceVariable<System.Single>	dy	Data pointer to GPU memory associated with the tensor descriptor dyDesc.
System.Single	beta	Scaling factor applied to the prior value in the output layer, blended as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	dxDesc	Handle to the previously initialized output differential tensor descriptor.
CudaDeviceVariable<System.Single>	dx	Data pointer to GPU memory associated with the output tensor descriptor dxDesc.

SoftmaxForward(cudnnSoftmaxAlgorithm, cudnnSoftmaxMode, Double, TensorDescriptor, CudaDeviceVariable<Double>, Double, TensorDescriptor, CudaDeviceVariable<Double>)

This routine computes the softmax function.

Declaration

public void SoftmaxForward(cudnnSoftmaxAlgorithm algorithm, cudnnSoftmaxMode mode, double alpha, TensorDescriptor xDesc, CudaDeviceVariable<double> x, double beta, TensorDescriptor yDesc, CudaDeviceVariable<double> y)

Parameters

Type	Name	Description
cudnnSoftmaxAlgorithm	algorithm	Enumerant to specify the softmax algorithm.
cudnnSoftmaxMode	mode	Enumerant to specify the softmax mode.
System.Double	alpha	Scaling factor applied to the computation result, which is blended with the prior value in the output layer as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	xDesc	Handle to the previously initialized input tensor descriptor.
CudaDeviceVariable<System.Double>	x	Data pointer to GPU memory associated with the tensor descriptor xDesc.
System.Double	beta	Scaling factor applied to the prior value in the output layer, blended as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	yDesc	Handle to the previously initialized output tensor descriptor.
CudaDeviceVariable<System.Double>	y	Data pointer to GPU memory associated with the output tensor descriptor yDesc.
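
For reference, the softmax of a vector x is exp(x[i]) divided by the sum of exp(x[j]) over all j. The CPU sketch below computes it for one vector, subtracting the maximum before exponentiating to avoid overflow, and then applies the alpha and beta blend. The class and method names are hypothetical, and which variant the library runs depends on the algorithm argument, which is not modeled here.

```csharp
using System;

public static class SoftmaxForwardSketch
{
    // CPU reference for softmax over one vector, using max subtraction
    // so that Math.Exp never receives a large positive argument.
    public static void Forward(double alpha, double[] x, double beta, double[] y)
    {
        if (x.Length == 0 || y.Length != x.Length)
            throw new ArgumentException("x and y must be non-empty and of equal length.");

        double max = double.NegativeInfinity;
        foreach (double v in x)
            max = Math.Max(max, v);

        // Exponentiate the shifted inputs and accumulate the normalizer.
        var e = new double[x.Length];
        double sum = 0.0;
        for (int i = 0; i < x.Length; i++)
        {
            e[i] = Math.Exp(x[i] - max);
            sum += e[i];
        }

        // Normalize and blend with the prior output value.
        for (int i = 0; i < x.Length; i++)
            y[i] = alpha * (e[i] / sum) + beta * y[i];
    }
}
```

Because of the shift, inputs such as {1000, 1000} still yield finite probabilities instead of overflowing.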

SoftmaxForward(cudnnSoftmaxAlgorithm, cudnnSoftmaxMode, Single, TensorDescriptor, CudaDeviceVariable<Single>, Single, TensorDescriptor, CudaDeviceVariable<Single>)

This routine computes the softmax function.

Declaration

public void SoftmaxForward(cudnnSoftmaxAlgorithm algorithm, cudnnSoftmaxMode mode, float alpha, TensorDescriptor xDesc, CudaDeviceVariable<float> x, float beta, TensorDescriptor yDesc, CudaDeviceVariable<float> y)

Parameters

Type	Name	Description
cudnnSoftmaxAlgorithm	algorithm	Enumerant to specify the softmax algorithm.
cudnnSoftmaxMode	mode	Enumerant to specify the softmax mode.
System.Single	alpha	Scaling factor applied to the computation result, which is blended with the prior value in the output layer as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	xDesc	Handle to the previously initialized input tensor descriptor.
CudaDeviceVariable<System.Single>	x	Data pointer to GPU memory associated with the tensor descriptor xDesc.
System.Single	beta	Scaling factor applied to the prior value in the output layer, blended as follows: dstValue = alpha * result + beta * priorDstValue.
TensorDescriptor	yDesc	Handle to the previously initialized output tensor descriptor.
CudaDeviceVariable<System.Single>	y	Data pointer to GPU memory associated with the output tensor descriptor yDesc.

TransformTensor(Double, TensorDescriptor, CudaDeviceVariable<Double>, Double, TensorDescriptor, CudaDeviceVariable<Double>)

This function copies the scaled data from one tensor to another tensor with a different layout. The two tensor descriptors must have the same dimensions but not necessarily the same strides. The input and output tensors must not overlap in any way (i.e., tensors cannot be transformed in place). This function can be used to convert a tensor with an unsupported format to a supported one.

Declaration

public void TransformTensor(double alpha, TensorDescriptor xDesc, CudaDeviceVariable<double> x, double beta, TensorDescriptor yDesc, CudaDeviceVariable<double> y)

Parameters

Type	Name	Description
System.Double	alpha	Scaling factor applied to the source value, which is blended with the prior value in the destination tensor as follows: dstValue = alpha * srcValue + beta * priorDstValue.
TensorDescriptor	xDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	x	Pointer to data of the tensor described by the xDesc descriptor.
System.Double	beta	Scaling factor applied to the prior value in the destination tensor, blended as follows: dstValue = alpha * srcValue + beta * priorDstValue.
TensorDescriptor	yDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Double>	y	Pointer to data of the tensor described by the yDesc descriptor.
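
A layout change with identical dimensions but different strides can be pictured with the CPU sketch below, which copies a densely packed NCHW tensor into NHWC order while applying the alpha and beta blend. The choice of these two layouts, and the class and method names, are assumptions made for illustration; the real call takes arbitrary descriptors.

```csharp
using System;

public static class TransformTensorSketch
{
    // CPU reference: copy x (packed NCHW) into y (packed NHWC) with
    // y = alpha * x + beta * y. x and y must be separate arrays.
    public static void NchwToNhwc(
        float alpha, float[] x, int n, int c, int h, int w,
        float beta, float[] y)
    {
        int count = n * c * h * w;
        if (x.Length != count || y.Length != count)
            throw new ArgumentException("Array lengths do not match the dimensions.");
        if (ReferenceEquals(x, y))
            throw new ArgumentException("The transform cannot be done in place.");

        for (int ni = 0; ni < n; ni++)
        for (int ci = 0; ci < c; ci++)
        for (int hi = 0; hi < h; hi++)
        for (int wi = 0; wi < w; wi++)
        {
            // Same logical element, two different linear offsets.
            int src = ((ni * c + ci) * h + hi) * w + wi;
            int dst = ((ni * h + hi) * w + wi) * c + ci;
            y[dst] = alpha * x[src] + beta * y[dst];
        }
    }
}
```

For a 1 x 2 x 1 x 2 tensor stored as {1, 2, 3, 4} in NCHW order, the NHWC result interleaves the two channels.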

TransformTensor(Single, TensorDescriptor, CudaDeviceVariable<Single>, Single, TensorDescriptor, CudaDeviceVariable<Single>)

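This function copies the scaled data from one tensor to another tensor with a different layout. The two tensor descriptors must have the same dimensions but not necessarily the same strides. The input and output tensors must not overlap in any way (i.e., tensors cannot be transformed in place). This function can be used to convert a tensor with an unsupported format to a supported one.

Declaration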

public void TransformTensor(float alpha, TensorDescriptor xDesc, CudaDeviceVariable<float> x, float beta, TensorDescriptor yDesc, CudaDeviceVariable<float> y)

Parameters

Type	Name	Description
System.Single	alpha	Scaling factor applied to the source value, which is blended with the prior value in the destination tensor as follows: dstValue = alpha * srcValue + beta * priorDstValue.
TensorDescriptor	xDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	x	Pointer to data of the tensor described by the xDesc descriptor.
System.Single	beta	Scaling factor applied to the prior value in the destination tensor, blended as follows: dstValue = alpha * srcValue + beta * priorDstValue.
TensorDescriptor	yDesc	Handle to a previously initialized tensor descriptor.
CudaDeviceVariable<System.Single>	y	Pointer to data of the tensor described by the yDesc descriptor.

Implements

System.IDisposable