Class CudaBlas
Wrapper class for a cuBLAS library handle.
Namespace: ManagedCuda.CudaBlas
Assembly: CudaBlas.dll
Syntax
public class CudaBlas
Constructors
CudaBlas()
Creates a new cuBLAS handle.
Declaration
public CudaBlas()
CudaBlas(CUstream)
Creates a new cuBLAS handle that uses the given stream.
Declaration
public CudaBlas(CUstream stream)
Parameters
| Type | Name | Description |
|---|---|---|
| CUstream | stream | |
CudaBlas(CUstream, AtomicsMode)
Creates a new cuBLAS handle that uses the given stream and atomics mode.
Declaration
public CudaBlas(CUstream stream, AtomicsMode atomicsmode)
Parameters
| Type | Name | Description |
|---|---|---|
| CUstream | stream | |
| AtomicsMode | atomicsmode | |
CudaBlas(CUstream, PointerMode)
Creates a new cuBLAS handle that uses the given stream and pointer mode.
Declaration
public CudaBlas(CUstream stream, PointerMode pointermode)
Parameters
| Type | Name | Description |
|---|---|---|
| CUstream | stream | |
| PointerMode | pointermode | |
CudaBlas(CUstream, PointerMode, AtomicsMode)
Creates a new cuBLAS handle that uses the given stream, pointer mode, and atomics mode.
Declaration
public CudaBlas(CUstream stream, PointerMode pointermode, AtomicsMode atomicsmode)
Parameters
| Type | Name | Description |
|---|---|---|
| CUstream | stream | |
| PointerMode | pointermode | |
| AtomicsMode | atomicsmode | |
CudaBlas(AtomicsMode)
Creates a new cuBLAS handle that uses the given atomics mode.
Declaration
public CudaBlas(AtomicsMode atomicsmode)
Parameters
| Type | Name | Description |
|---|---|---|
| AtomicsMode | atomicsmode | |
CudaBlas(PointerMode)
Creates a new cuBLAS handle that uses the given pointer mode.
Declaration
public CudaBlas(PointerMode pointermode)
Parameters
| Type | Name | Description |
|---|---|---|
| PointerMode | pointermode | |
CudaBlas(PointerMode, AtomicsMode)
Creates a new cuBLAS handle that uses the given pointer mode and atomics mode.
Declaration
public CudaBlas(PointerMode pointermode, AtomicsMode atomicsmode)
Parameters
| Type | Name | Description |
|---|---|---|
| PointerMode | pointermode | |
| AtomicsMode | atomicsmode | |
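As a rough illustration of how these overloads might be used, the sketch below creates one handle with the parameterless constructor and one bound to a stream. `CudaContext`, `CudaStream`, and its `Stream` property come from the core ManagedCuda assembly rather than from this page, and the enum member names `PointerMode.Host` and `AtomicsMode.NotAllowed` are assumptions that should be checked against the installed version.

```csharp
using System;
using ManagedCuda;
using ManagedCuda.CudaBlas;

class CudaBlasConstructionSketch
{
    static void Main()
    {
        // A CUDA context is assumed to be required before any handle is created.
        using (var context = new CudaContext(0))
        using (var stream = new CudaStream())
        {
            // Parameterless overload.
            using (var blas = new CudaBlas())
            {
                Console.WriteLine(blas.PointerMode);
            }

            // Overload taking a stream, a pointer mode, and an atomics mode.
            // The enum member names are assumptions, not taken from this page.
            using (var blas = new CudaBlas(stream.Stream, PointerMode.Host, AtomicsMode.NotAllowed))
            {
                Console.WriteLine(blas.AtomicsMode);
            }
        }
    }
}
```

The `using` blocks rely on the `Dispose()` method documented further down this page, so each handle is released before the context that owns it.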
Properties
AtomicsMode
Declaration
public AtomicsMode AtomicsMode { get; set; }
Property Value
| Type | Description |
|---|---|
| AtomicsMode | |
CublasHandle
Returns the wrapped cuBLAS handle.
Declaration
public CudaBlasHandle CublasHandle { get; }
Property Value
| Type | Description |
|---|---|
| CudaBlasHandle | |
MathMode
Declaration
public Math MathMode { get; set; }
Property Value
| Type | Description |
|---|---|
| Math | |
PointerMode
Declaration
public PointerMode PointerMode { get; set; }
Property Value
| Type | Description |
|---|---|
| PointerMode | |
Stream
Declaration
public CUstream Stream { get; set; }
Property Value
| Type | Description |
|---|---|
| CUstream | |
Methods
AbsoluteSum(CudaDeviceVariable<cuDoubleComplex>, Int32)
This function computes the sum of the absolute values of the elements of vector x. For complex vectors, each element contributes the sum of the absolute values of its real and imaginary parts, not its modulus.
Declaration
public double AbsoluteSum(CudaDeviceVariable<cuDoubleComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
Returns
| Type | Description |
|---|---|
| System.Double | |
AbsoluteSum(CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<Double>)
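This function computes the sum of the absolute values of the elements of vector x. For complex vectors, each element contributes the sum of the absolute values of its real and imaginary parts, not its modulus.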
Declaration
public void AbsoluteSum(CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<double> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Double> | result | |
AbsoluteSum(CudaDeviceVariable<cuDoubleComplex>, Int32, ref Double)
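This function computes the sum of the absolute values of the elements of vector x. For complex vectors, each element contributes the sum of the absolute values of its real and imaginary parts, not its modulus.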
Declaration
public void AbsoluteSum(CudaDeviceVariable<cuDoubleComplex> x, int incx, ref double result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| System.Double | result | |
AbsoluteSum(CudaDeviceVariable<cuFloatComplex>, Int32)
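This function computes the sum of the absolute values of the elements of vector x. For complex vectors, each element contributes the sum of the absolute values of its real and imaginary parts, not its modulus.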
Declaration
public float AbsoluteSum(CudaDeviceVariable<cuFloatComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
Returns
| Type | Description |
|---|---|
| System.Single | |
AbsoluteSum(CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<Single>)
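This function computes the sum of the absolute values of the elements of vector x. For complex vectors, each element contributes the sum of the absolute values of its real and imaginary parts, not its modulus.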
Declaration
public void AbsoluteSum(CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<float> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Single> | result | |
AbsoluteSum(CudaDeviceVariable<cuFloatComplex>, Int32, ref Single)
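This function computes the sum of the absolute values of the elements of vector x. For complex vectors, each element contributes the sum of the absolute values of its real and imaginary parts, not its modulus.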
Declaration
public void AbsoluteSum(CudaDeviceVariable<cuFloatComplex> x, int incx, ref float result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| System.Single | result | |
AbsoluteSum(CudaDeviceVariable<Double>, Int32)
This function computes the sum of the absolute values of the elements of vector x.
Declaration
public double AbsoluteSum(CudaDeviceVariable<double> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx | |
Returns
| Type | Description |
|---|---|
| System.Double | |
AbsoluteSum(CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>)
This function computes the sum of the absolute values of the elements of vector x.
Declaration
public void AbsoluteSum(CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Double> | result | |
AbsoluteSum(CudaDeviceVariable<Double>, Int32, ref Double)
This function computes the sum of the absolute values of the elements of vector x.
Declaration
public void AbsoluteSum(CudaDeviceVariable<double> x, int incx, ref double result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx | |
| System.Double | result | |
AbsoluteSum(CudaDeviceVariable<Single>, Int32)
This function computes the sum of the absolute values of the elements of vector x.
Declaration
public float AbsoluteSum(CudaDeviceVariable<float> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx | |
Returns
| Type | Description |
|---|---|
| System.Single | |
AbsoluteSum(CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>)
This function computes the sum of the absolute values of the elements of vector x.
Declaration
public void AbsoluteSum(CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Single> | result | |
AbsoluteSum(CudaDeviceVariable<Single>, Int32, ref Single)
This function computes the sum of the absolute values of the elements of vector x.
Declaration
public void AbsoluteSum(CudaDeviceVariable<float> x, int incx, ref float result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx | |
| System.Single | result | |
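To make the arithmetic behind the AbsoluteSum overloads concrete, here is a small CPU-only reference sketch. It is not the wrapper's implementation: it assumes a positive stride, takes an explicit element count `n` (which the wrapper signatures do not expose), and uses `System.Numerics.Complex` as a stand-in for `cuDoubleComplex`. It follows the cuBLAS convention that a complex element contributes |Re| + |Im|.

```csharp
using System;
using System.Numerics;

static class AsumReference
{
    // Real case: sum of |x[i * incx]| for i = 0 .. n-1 (incx assumed positive).
    public static double Asum(double[] x, int n, int incx)
    {
        double sum = 0.0;
        for (int i = 0; i < n; i++)
            sum += Math.Abs(x[i * incx]);
        return sum;
    }

    // Complex case: each element contributes |Re| + |Im|, not its modulus.
    public static double Asum(Complex[] x, int n, int incx)
    {
        double sum = 0.0;
        for (int i = 0; i < n; i++)
        {
            Complex z = x[i * incx];
            sum += Math.Abs(z.Real) + Math.Abs(z.Imaginary);
        }
        return sum;
    }

    static void Main()
    {
        double[] real = { 1.0, -2.0, 3.0, -4.0 };
        Console.WriteLine(Asum(real, 4, 1)); // 1 + 2 + 3 + 4
        Console.WriteLine(Asum(real, 2, 2)); // stride 2 visits 1 and 3

        Complex[] z = { new Complex(3, 4), new Complex(-1, -2) };
        Console.WriteLine(Asum(z, 2, 1));    // (3 + 4) + (1 + 2)
    }
}
```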
Axpy(CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function multiplies the vector x by the scalar alpha and adds the result to the vector y, overwriting y: y = alpha * x + y.
Declaration
public void Axpy(CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | alpha | |
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuDoubleComplex> | y | |
| System.Int32 | incy | |
Axpy(CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function multiplies the vector x by the scalar alpha and adds the result to the vector y, overwriting y: y = alpha * x + y.
Declaration
public void Axpy(CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | alpha | |
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuFloatComplex> | y | |
| System.Int32 | incy | |
Axpy(CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
This function multiplies the vector x by the scalar alpha and adds the result to the vector y, overwriting y: y = alpha * x + y.
Declaration
public void Axpy(CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | alpha | |
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Double> | y | |
| System.Int32 | incy | |
Axpy(CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
This function multiplies the vector x by the scalar alpha and adds the result to the vector y, overwriting y: y = alpha * x + y.
Declaration
public void Axpy(CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | alpha | |
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Single> | y | |
| System.Int32 | incy | |
Axpy(cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function multiplies the vector x by the scalar alpha and adds the result to the vector y, overwriting y: y = alpha * x + y.
Declaration
public void Axpy(cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| cuDoubleComplex | alpha | |
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuDoubleComplex> | y | |
| System.Int32 | incy | |
Axpy(cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function multiplies the vector x by the scalar alpha and adds the result to the vector y, overwriting y: y = alpha * x + y.
Declaration
public void Axpy(cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| cuFloatComplex | alpha | |
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuFloatComplex> | y | |
| System.Int32 | incy | |
Axpy(Double, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
This function multiplies the vector x by the scalar alpha and adds the result to the vector y, overwriting y: y = alpha * x + y.
Declaration
public void Axpy(double alpha, CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Double | alpha | |
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Double> | y | |
| System.Int32 | incy | |
Axpy(Single, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
This function multiplies the vector x by the scalar alpha and adds the result to the vector y, overwriting y: y = alpha * x + y.
Declaration
public void Axpy(float alpha, CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Single | alpha | |
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Single> | y | |
| System.Int32 | incy | |
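A minimal usage sketch for the host-scalar `Axpy(Single, ...)` overload follows. It assumes the `CudaContext` and `CudaDeviceVariable<T>` members from the core ManagedCuda assembly (`CopyToDevice`, `CopyToHost`, and a size-taking constructor), none of which are documented on this page. Since the overload takes no element count, the sketch also assumes the count is derived from the size of `x`.

```csharp
using System;
using ManagedCuda;
using ManagedCuda.CudaBlas;

class AxpySketch
{
    static void Main()
    {
        float[] hostX = { 1f, 2f, 3f };
        float[] hostY = { 10f, 20f, 30f };

        using (var context = new CudaContext(0))
        using (var x = new CudaDeviceVariable<float>(hostX.Length))
        using (var y = new CudaDeviceVariable<float>(hostY.Length))
        using (var blas = new CudaBlas())
        {
            // Upload both vectors to device memory.
            x.CopyToDevice(hostX);
            y.CopyToDevice(hostY);

            // y = 2 * x + y, with unit stride on both vectors.
            blas.Axpy(2.0f, x, 1, y, 1);

            // Download the overwritten y and print it.
            y.CopyToHost(hostY);
            Console.WriteLine(string.Join(", ", hostY));
        }
    }
}
```

The overloads that take `alpha` as a `CudaDeviceVariable` read the scalar from device memory instead; in cuBLAS that requires the handle to be in device pointer mode.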
Copy(CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function copies the vector x into the vector y.
Declaration
public void Copy(CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuDoubleComplex> | y | |
| System.Int32 | incy | |
Copy(CudaDeviceVariable<cuDoubleReal>, Int32, CudaDeviceVariable<cuDoubleReal>, Int32)
This function copies the vector x into the vector y.
Declaration
public void Copy(CudaDeviceVariable<cuDoubleReal> x, int incx, CudaDeviceVariable<cuDoubleReal> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleReal> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuDoubleReal> | y | |
| System.Int32 | incy | |
Copy(CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function copies the vector x into the vector y.
Declaration
public void Copy(CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuFloatComplex> | y | |
| System.Int32 | incy | |
Copy(CudaDeviceVariable<cuFloatReal>, Int32, CudaDeviceVariable<cuFloatReal>, Int32)
This function copies the vector x into the vector y.
Declaration
public void Copy(CudaDeviceVariable<cuFloatReal> x, int incx, CudaDeviceVariable<cuFloatReal> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatReal> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuFloatReal> | y | |
| System.Int32 | incy | |
Copy(CudaDeviceVariable<double1>, Int32, CudaDeviceVariable<double1>, Int32)
This function copies the vector x into the vector y.
Declaration
public void Copy(CudaDeviceVariable<double1> x, int incx, CudaDeviceVariable<double1> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<double1> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<double1> | y | |
| System.Int32 | incy | |
Copy(CudaDeviceVariable<float1>, Int32, CudaDeviceVariable<float1>, Int32)
This function copies the vector x into the vector y.
Declaration
public void Copy(CudaDeviceVariable<float1> x, int incx, CudaDeviceVariable<float1> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<float1> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<float1> | y | |
| System.Int32 | incy | |
Copy(CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
This function copies the vector x into the vector y.
Declaration
public void Copy(CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Double> | y | |
| System.Int32 | incy | |
Copy(CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
This function copies the vector x into the vector y.
Declaration
public void Copy(CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Single> | y | |
| System.Int32 | incy | |
Ctpttr(FillMode, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the conversion from the triangular packed format to the triangular format.
If uplo == CUBLAS_FILL_MODE_LOWER then the elements of AP are copied into the lower triangular part of the triangular matrix A and the upper part of A is left untouched.
If uplo == CUBLAS_FILL_MODE_UPPER then the elements of AP are copied into the upper triangular part of the triangular matrix A and the lower part of A is left untouched.
Declaration
public void Ctpttr(FillMode uplo, int n, CudaDeviceVariable<cuFloatComplex> AP, CudaDeviceVariable<cuFloatComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether AP contains the lower or upper part of matrix A. |
| System.Int32 | n | number of rows and columns of matrix A. |
| CudaDeviceVariable<cuFloatComplex> | AP | array with A stored in packed format. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda x n, with lda >= max(1,n). The opposite side of A is left untouched. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Ctrttp(FillMode, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>)
This function performs the conversion from the triangular format to the triangular packed format.
If uplo == CUBLAS_FILL_MODE_LOWER then the lower triangular part of the triangular matrix A is copied into the array AP.
If uplo == CUBLAS_FILL_MODE_UPPER then the upper triangular part of the triangular matrix A is copied into the array AP.
Declaration
public void Ctrttp(FillMode uplo, int n, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is referenced. |
| System.Int32 | n | number of rows and columns of matrix A. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda x n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | AP | array with A stored in packed format. |
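The two conversions between packed and full triangular storage can be sketched on the CPU as shown below. This is a reference illustration only, not the wrapper's code: it uses `double` instead of `cuFloatComplex`, a `bool lower` flag instead of `FillMode`, and zero-based indices. It assumes the usual column-by-column BLAS packed layout, in which element (i, j) of the lower triangle sits at `i + j*(2n - j - 1)/2` and element (i, j) of the upper triangle at `i + j*(j + 1)/2`.

```csharp
using System;

static class PackedTriangularReference
{
    // Zero-based position of element (i, j) inside the packed array.
    static int PackedIndex(bool lower, int n, int i, int j) =>
        lower ? i + j * (2 * n - j - 1) / 2
              : i + j * (j + 1) / 2;

    // Packed -> full (column-major, leading dimension lda).
    // Only the selected triangle of a is written; the other side is left untouched.
    public static void Tpttr(bool lower, int n, double[] ap, double[] a, int lda)
    {
        for (int j = 0; j < n; j++)
        {
            int first = lower ? j : 0;
            int last = lower ? n - 1 : j;
            for (int i = first; i <= last; i++)
                a[i + j * lda] = ap[PackedIndex(lower, n, i, j)];
        }
    }

    // Full -> packed: copies the selected triangle of a into ap.
    public static void Trttp(bool lower, int n, double[] a, int lda, double[] ap)
    {
        for (int j = 0; j < n; j++)
        {
            int first = lower ? j : 0;
            int last = lower ? n - 1 : j;
            for (int i = first; i <= last; i++)
                ap[PackedIndex(lower, n, i, j)] = a[i + j * lda];
        }
    }

    static void Main()
    {
        const int n = 3;
        double[] packed = { 1, 2, 3, 4, 5, 6 };
        double[] full = new double[n * n];
        for (int k = 0; k < full.Length; k++) full[k] = -1; // marks untouched entries

        Tpttr(true, n, packed, full, n);
        Console.WriteLine(string.Join(" ", full));

        double[] roundTrip = new double[packed.Length];
        Trttp(true, n, full, n, roundTrip);
        Console.WriteLine(string.Join(" ", roundTrip));
    }
}
```

In the lower-triangle run above, the strictly upper entries of `full` keep their `-1` marker, which mirrors the "left untouched" wording in the Ctpttr description.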
Dgmm(SideMode, Int32, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the matrix-matrix multiplication C = A x diag(X) if mode == CUBLAS_SIDE_RIGHT, or C = diag(X) x A if mode == CUBLAS_SIDE_LEFT.
Here, A and C are matrices stored in column-major format with dimensions m x n. X is a vector of size n if mode == CUBLAS_SIDE_RIGHT and of size m if mode == CUBLAS_SIDE_LEFT. X is gathered from the one-dimensional array x with stride incx. The absolute value of incx is the stride and the sign of incx is the direction of the stride: if incx is positive, x is traversed forward from its first element; otherwise, x is traversed backward from its last element.
Declaration
public void Dgmm(SideMode mode, int m, int n, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> X, int incx, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | mode | left multiply if mode == CUBLAS_SIDE_LEFT or right multiply if mode == CUBLAS_SIDE_RIGHT |
| System.Int32 | m | number of rows of matrix A and C. |
| System.Int32 | n | number of columns of matrix A and C. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda x n with lda >= max(1,m) |
| System.Int32 | lda | leading dimension of two-dimensional array used to store the matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | X | one-dimensional array of size abs(incx)*m if mode == CUBLAS_SIDE_LEFT and abs(incx)*n if mode == CUBLAS_SIDE_RIGHT |
| System.Int32 | incx | stride of one-dimensional array x. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc*n with ldc >= max(1,m). |
| System.Int32 | ldc | leading dimension of a two-dimensional array used to store the matrix C. |
Dgmm(SideMode, Int32, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the matrix-matrix multiplication C = A x diag(X) if mode == CUBLAS_SIDE_RIGHT, or C = diag(X) x A if mode == CUBLAS_SIDE_LEFT.
Here, A and C are matrices stored in column-major format with dimensions m x n. X is a vector of size n if mode == CUBLAS_SIDE_RIGHT and of size m if mode == CUBLAS_SIDE_LEFT. X is gathered from the one-dimensional array x with stride incx. The absolute value of incx is the stride and the sign of incx is the direction of the stride: if incx is positive, x is traversed forward from its first element; otherwise, x is traversed backward from its last element.
Declaration
public void Dgmm(SideMode mode, int m, int n, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> X, int incx, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | mode | left multiply if mode == CUBLAS_SIDE_LEFT or right multiply if mode == CUBLAS_SIDE_RIGHT |
| System.Int32 | m | number of rows of matrix A and C. |
| System.Int32 | n | number of columns of matrix A and C. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda x n with lda >= max(1,m) |
| System.Int32 | lda | leading dimension of two-dimensional array used to store the matrix A. |
| CudaDeviceVariable<cuFloatComplex> | X | one-dimensional array of size abs(incx)*m if mode == CUBLAS_SIDE_LEFT and abs(incx)*n if mode == CUBLAS_SIDE_RIGHT |
| System.Int32 | incx | stride of one-dimensional array x. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc*n with ldc >= max(1,m). |
| System.Int32 | ldc | leading dimension of a two-dimensional array used to store the matrix C. |
Dgmm(SideMode, Int32, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
This function performs the matrix-matrix multiplication C = A x diag(X) if mode == CUBLAS_SIDE_RIGHT, or C = diag(X) x A if mode == CUBLAS_SIDE_LEFT.
Here, A and C are matrices stored in column-major format with dimensions m x n. X is a vector of size n if mode == CUBLAS_SIDE_RIGHT and of size m if mode == CUBLAS_SIDE_LEFT. X is gathered from the one-dimensional array x with stride incx. The absolute value of incx is the stride and the sign of incx is the direction of the stride: if incx is positive, x is traversed forward from its first element; otherwise, x is traversed backward from its last element.
Declaration
public void Dgmm(SideMode mode, int m, int n, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> X, int incx, CudaDeviceVariable<double> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | mode | left multiply if mode == CUBLAS_SIDE_LEFT or right multiply if mode == CUBLAS_SIDE_RIGHT |
| System.Int32 | m | number of rows of matrix A and C. |
| System.Int32 | n | number of columns of matrix A and C. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda x n with lda >= max(1,m) |
| System.Int32 | lda | leading dimension of two-dimensional array used to store the matrix A. |
| CudaDeviceVariable<System.Double> | X | one-dimensional array of size abs(incx)*m if mode == CUBLAS_SIDE_LEFT and abs(incx)*n if mode == CUBLAS_SIDE_RIGHT |
| System.Int32 | incx | stride of one-dimensional array x. |
| CudaDeviceVariable<System.Double> | C | array of dimensions ldc*n with ldc >= max(1,m). |
| System.Int32 | ldc | leading dimension of a two-dimensional array used to store the matrix C. |
Dgmm(SideMode, Int32, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
This function performs the matrix-matrix multiplication C = A x diag(X) if mode == CUBLAS_SIDE_RIGHT, or C = diag(X) x A if mode == CUBLAS_SIDE_LEFT.
Here, A and C are matrices stored in column-major format with dimensions m x n. X is a vector of size n if mode == CUBLAS_SIDE_RIGHT and of size m if mode == CUBLAS_SIDE_LEFT. X is gathered from the one-dimensional array x with stride incx. The absolute value of incx is the stride and the sign of incx is the direction of the stride: if incx is positive, x is traversed forward from its first element; otherwise, x is traversed backward from its last element.
Declaration
public void Dgmm(SideMode mode, int m, int n, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> X, int incx, CudaDeviceVariable<float> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | mode | left multiply if mode == CUBLAS_SIDE_LEFT or right multiply if mode == CUBLAS_SIDE_RIGHT |
| System.Int32 | m | number of rows of matrix A and C. |
| System.Int32 | n | number of columns of matrix A and C. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda x n with lda >= max(1,m) |
| System.Int32 | lda | leading dimension of two-dimensional array used to store the matrix A. |
| CudaDeviceVariable<System.Single> | X | one-dimensional array of size abs(incx)*m if mode == CUBLAS_SIDE_LEFT and abs(incx)*n if mode == CUBLAS_SIDE_RIGHT |
| System.Int32 | incx | stride of one-dimensional array x. |
| CudaDeviceVariable<System.Single> | C | array of dimensions ldc*n with ldc >= max(1,m). |
| System.Int32 | ldc | leading dimension of a two-dimensional array used to store the matrix C. |
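The Dgmm semantics, including the handling of a negative `incx`, can be illustrated with a CPU-only reference sketch. It is an assumption-laden illustration rather than the wrapper's implementation: it works on `double` arrays, replaces `SideMode` with a `bool left` flag, and follows the usual BLAS convention that a negative stride starts at offset `(len - 1) * abs(incx)` and walks backward.

```csharp
using System;

static class DgmmReference
{
    // C = diag(X) * A when left is true, C = A * diag(X) otherwise.
    // A and C are column-major with leading dimensions lda and ldc.
    public static void Dgmm(bool left, int m, int n,
                            double[] a, int lda,
                            double[] x, int incx,
                            double[] c, int ldc)
    {
        int len = left ? m : n;
        // With a negative stride, logical element 0 of X is the last stored one.
        int start = incx >= 0 ? 0 : (len - 1) * -incx;

        for (int j = 0; j < n; j++)
        {
            for (int i = 0; i < m; i++)
            {
                int k = left ? i : j;            // rows scale on the left, columns on the right
                double scale = x[start + k * incx];
                c[i + j * ldc] = a[i + j * lda] * scale;
            }
        }
    }

    static void Main()
    {
        // A = [1 3 5; 2 4 6] stored column by column.
        double[] a = { 1, 2, 3, 4, 5, 6 };
        double[] c = new double[6];

        Dgmm(false, 2, 3, a, 2, new double[] { 10, 100, 1000 }, 1, c, 2);
        Console.WriteLine(string.Join(" ", c)); // columns scaled by 10, 100, 1000

        Dgmm(true, 2, 3, a, 2, new double[] { 2, 3 }, -1, c, 2);
        Console.WriteLine(string.Join(" ", c)); // rows scaled by 3 and 2 (x read backward)
    }
}
```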
Dispose()
Disposes this instance and releases the wrapped cuBLAS handle.
Declaration
public void Dispose()
Dispose(Boolean)
Implements the IDisposable pattern.
Declaration
protected virtual void Dispose(bool fDisposing)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Boolean | fDisposing | |
Dot(CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function computes the dot product of vectors x and y.
Declaration
public cuDoubleComplex Dot(CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuDoubleComplex> | y | |
| System.Int32 | incy | |
Returns
| Type | Description |
|---|---|
| cuDoubleComplex | |
Dot(CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>)
This function computes the dot product of vectors x and y.
Declaration
public void Dot(CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy, CudaDeviceVariable<cuDoubleComplex> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuDoubleComplex> | y | |
| System.Int32 | incy | |
| CudaDeviceVariable<cuDoubleComplex> | result | |
Dot(CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, ref cuDoubleComplex)
This function computes the dot product of vectors x and y.
Declaration
public void Dot(CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy, ref cuDoubleComplex result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuDoubleComplex> | y | |
| System.Int32 | incy | |
| cuDoubleComplex | result | |
Dot(CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function computes the dot product of vectors x and y.
Declaration
public cuFloatComplex Dot(CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuFloatComplex> | y | |
| System.Int32 | incy | |
Returns
| Type | Description |
|---|---|
| cuFloatComplex | |
Dot(CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>)
This function computes the dot product of vectors x and y.
Declaration
public void Dot(CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy, CudaDeviceVariable<cuFloatComplex> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuFloatComplex> | y | |
| System.Int32 | incy | |
| CudaDeviceVariable<cuFloatComplex> | result | |
Dot(CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, ref cuFloatComplex)
This function computes the dot product of vectors x and y.
Declaration
public void Dot(CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy, ref cuFloatComplex result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuFloatComplex> | y | |
| System.Int32 | incy | |
| cuFloatComplex | result | |
Dot(CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
This function computes the dot product of vectors x and y.
Declaration
public double Dot(CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Double> | y | |
| System.Int32 | incy | |
Returns
| Type | Description |
|---|---|
| System.Double | |
Dot(CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>)
This function computes the dot product of vectors x and y.
Declaration
public void Dot(CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> y, int incy, CudaDeviceVariable<double> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Double> | y | |
| System.Int32 | incy | |
| CudaDeviceVariable<System.Double> | result | |
Dot(CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, ref Double)
This function computes the dot product of vectors x and y.
Declaration
public void Dot(CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> y, int incy, ref double result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Double> | y | |
| System.Int32 | incy | |
| System.Double | result | |
Dot(CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
This function computes the dot product of vectors x and y.
Declaration
public float Dot(CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Single> | y | |
| System.Int32 | incy | |
Returns
| Type | Description |
|---|---|
| System.Single | |
Dot(CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>)
This function computes the dot product of vectors x and y.
Declaration
public void Dot(CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> y, int incy, CudaDeviceVariable<float> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Single> | y | |
| System.Int32 | incy | |
| CudaDeviceVariable<System.Single> | result | |
Dot(CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, ref Single)
This function computes the dot product of vectors x and y.
Declaration
public void Dot(CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> y, int incy, ref float result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Single> | y | |
| System.Int32 | incy | |
| System.Single | result |
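
The strided dot product computed by the Dot overloads can be illustrated with a small CPU reference. This is a minimal sketch, not the library's implementation: the class `DotReference`, the explicit element count `n`, and the restriction to positive strides are assumptions made for illustration, and plain arrays stand in for `CudaDeviceVariable<T>`.

```csharp
using System;

public static class DotReference
{
    // Hypothetical CPU reference: sum of x[i * incx] * y[i * incy] for i = 0..n-1.
    // n is passed explicitly in this sketch; the overloads above do not take it.
    public static double Dot(int n, double[] x, int incx, double[] y, int incy)
    {
        if (incx <= 0 || incy <= 0)
            throw new ArgumentException("This sketch handles positive strides only.");

        double sum = 0.0;
        for (int i = 0; i < n; i++)
        {
            sum += x[i * incx] * y[i * incy];
        }
        return sum;
    }
}
```

With `x = {1, 2, 3}`, `y = {4, 5, 6}` and both strides equal to 1, the sketch returns 32. With `incx = 2` only every second element of x contributes.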
DotConj(CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function computes the dot product of vectors x and y.
Note that the conjugate of each element of vector x is used.
Declaration
public cuDoubleComplex DotConj(CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuDoubleComplex> | y | |
| System.Int32 | incy |
Returns
| Type | Description |
|---|---|
| cuDoubleComplex |
DotConj(CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>)
This function computes the dot product of vectors x and y.
Note that the conjugate of each element of vector x is used.
Declaration
public void DotConj(CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy, CudaDeviceVariable<cuDoubleComplex> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuDoubleComplex> | y | |
| System.Int32 | incy | |
| CudaDeviceVariable<cuDoubleComplex> | result |
DotConj(CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, ref cuDoubleComplex)
This function computes the dot product of vectors x and y.
Note that the conjugate of each element of vector x is used.
Declaration
public void DotConj(CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy, ref cuDoubleComplex result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuDoubleComplex> | y | |
| System.Int32 | incy | |
| cuDoubleComplex | result |
DotConj(CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function computes the dot product of vectors x and y.
Note that the conjugate of each element of vector x is used.
Declaration
public cuFloatComplex DotConj(CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuFloatComplex> | y | |
| System.Int32 | incy |
Returns
| Type | Description |
|---|---|
| cuFloatComplex |
DotConj(CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>)
This function computes the dot product of vectors x and y.
Note that the conjugate of each element of vector x is used.
Declaration
public void DotConj(CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy, CudaDeviceVariable<cuFloatComplex> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuFloatComplex> | y | |
| System.Int32 | incy | |
| CudaDeviceVariable<cuFloatComplex> | result |
DotConj(CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, ref cuFloatComplex)
This function computes the dot product of vectors x and y.
Note that the conjugate of each element of vector x is used.
Declaration
public void DotConj(CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy, ref cuFloatComplex result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuFloatComplex> | y | |
| System.Int32 | incy | |
| cuFloatComplex | result |
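
The effect of conjugating x can be seen in a CPU reference that uses `System.Numerics.Complex` in place of `cuDoubleComplex`. This is a hedged sketch: the class `DotConjReference` and the explicit element count `n` are assumptions for illustration only.

```csharp
using System;
using System.Numerics;

public static class DotConjReference
{
    // Hypothetical CPU reference: sum of conj(x[i * incx]) * y[i * incy].
    // n is passed explicitly in this sketch; strides are assumed positive.
    public static Complex DotConj(int n, Complex[] x, int incx, Complex[] y, int incy)
    {
        Complex sum = Complex.Zero;
        for (int i = 0; i < n; i++)
        {
            sum += Complex.Conjugate(x[i * incx]) * y[i * incy];
        }
        return sum;
    }
}
```

For `x = {1 + 2i}` and `y = {3 + 4i}` the sketch returns `11 - 2i`, because `(1 - 2i)(3 + 4i) = 11 - 2i`. Without the conjugation the product would be `-5 + 10i`.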
Dtpttr(FillMode, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32)
This function performs the conversion from the triangular packed format to the triangular format.
If uplo == CUBLAS_FILL_MODE_LOWER then the elements of AP are copied into the lower triangular part of the triangular matrix A and the upper part of A is left untouched.
If uplo == CUBLAS_FILL_MODE_UPPER then the elements of AP are copied into the upper triangular part of the triangular matrix A and the lower part of A is left untouched.
Declaration
public void Dtpttr(FillMode uplo, int n, CudaDeviceVariable<double> AP, CudaDeviceVariable<double> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix AP contains lower or upper part of matrix A. |
| System.Int32 | n | number of rows and columns of matrix A. |
| CudaDeviceVariable<System.Double> | AP | array with A stored in packed format. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda x n, with lda>=max(1,n). The opposite side of A is left untouched. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
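
The packed-to-triangular conversion can be sketched on the CPU as follows. The sketch assumes the standard BLAS column-major packed layout (0-based: lower element (i, j) at `AP[i + (2n - j - 1) * j / 2]`, upper element (i, j) at `AP[i + j * (j + 1) / 2]`). The class `TpttrReference`, the method name `Unpack`, and the `bool lower` flag standing in for `FillMode` are hypothetical.

```csharp
using System;

public static class TpttrReference
{
    // Hypothetical CPU reference: copies the packed triangle ap into the chosen
    // triangle of the column-major matrix a (leading dimension lda).
    // Elements of a outside that triangle are left untouched.
    public static void Unpack(bool lower, int n, double[] ap, double[] a, int lda)
    {
        for (int j = 0; j < n; j++)
        {
            int iStart = lower ? j : 0;
            int iEnd = lower ? n - 1 : j;
            for (int i = iStart; i <= iEnd; i++)
            {
                // Index of element (i, j) inside the packed array.
                int p = lower
                    ? i + ((2 * n - j - 1) * j) / 2
                    : i + (j * (j + 1)) / 2;
                a[i + j * lda] = ap[p];
            }
        }
    }
}
```

For `n = 2`, `ap = {1, 2, 3}` and a pre-filled `a = {9, 9, 9, 9}`, the lower variant leaves `a = {1, 2, 9, 3}`: the upper off-diagonal element keeps its previous value.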
Dtrttp(FillMode, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Single>)
This function performs the conversion from the triangular format to the triangular packed format.
If uplo == CUBLAS_FILL_MODE_LOWER then the lower triangular part of the triangular matrix A is copied into the array AP.
If uplo == CUBLAS_FILL_MODE_UPPER then the upper triangular part of the triangular matrix A is copied into the array AP.
Declaration
public void Dtrttp(FillMode uplo, int n, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<float> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates which part of matrix A, lower or upper, is referenced. |
| System.Int32 | n | number of rows and columns of matrix A. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda x n, with lda>=max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | AP | array with A stored in packed format. |
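
The triangular-to-packed conversion walks the selected triangle column by column and appends each element to AP. The following CPU sketch illustrates this under the assumption of the standard BLAS column-major packed layout. The class `TrttpReference`, the method name `Pack`, and the `bool lower` flag are hypothetical, and the sketch uses `double` for both arrays rather than the mixed types in the declaration above.

```csharp
using System;

public static class TrttpReference
{
    // Hypothetical CPU reference: packs the chosen triangle of the column-major
    // matrix a (leading dimension lda) into a new array of n * (n + 1) / 2 elements.
    public static double[] Pack(bool lower, int n, double[] a, int lda)
    {
        double[] ap = new double[n * (n + 1) / 2];
        int p = 0;
        for (int j = 0; j < n; j++)
        {
            int iStart = lower ? j : 0;
            int iEnd = lower ? n - 1 : j;
            for (int i = iStart; i <= iEnd; i++)
            {
                ap[p++] = a[i + j * lda];
            }
        }
        return ap;
    }
}
```

For the 3 x 3 column-major array `{1, 2, 3, 4, 5, 6, 7, 8, 9}` the lower variant yields `{1, 2, 3, 5, 6, 9}` and the upper variant yields `{1, 4, 5, 7, 8, 9}`.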
Finalize()
Finalizer; part of the dispose pattern.
Declaration
protected void Finalize()
Gbmv(Operation, Int32, Int32, Int32, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the banded matrix-vector multiplication y = alpha * Op(A) * x + beta * y where A is an m*n banded matrix with kl subdiagonals and ku superdiagonals stored in column-major band format, x and y are vectors, and alpha and beta are scalars.
Declaration
public void Gbmv(Operation trans, int m, int n, int kl, int ku, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> beta, CudaDeviceVariable<cuDoubleComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix A. |
| System.Int32 | n | number of columns of matrix A. |
| System.Int32 | kl | number of subdiagonals of matrix A. |
| System.Int32 | ku | number of superdiagonals of matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= kl + ku + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements if trans is non-transpose and m elements otherwise. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with m elements if trans is non-transpose and n elements otherwise. |
| System.Int32 | incy | stride between consecutive elements of y. |
Gbmv(Operation, Int32, Int32, Int32, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the banded matrix-vector multiplication y = alpha * Op(A) * x + beta * y where A is an m*n banded matrix with kl subdiagonals and ku superdiagonals stored in column-major band format, x and y are vectors, and alpha and beta are scalars.
Declaration
public void Gbmv(Operation trans, int m, int n, int kl, int ku, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> beta, CudaDeviceVariable<cuFloatComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix A. |
| System.Int32 | n | number of columns of matrix A. |
| System.Int32 | kl | number of subdiagonals of matrix A. |
| System.Int32 | ku | number of superdiagonals of matrix A. |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= kl + ku + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements if trans is non-transpose and m elements otherwise. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with m elements if trans is non-transpose and n elements otherwise. |
| System.Int32 | incy | stride between consecutive elements of y. |
Gbmv(Operation, Int32, Int32, Int32, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32)
This function performs the banded matrix-vector multiplication y = alpha * Op(A) * x + beta * y where A is an m*n banded matrix with kl subdiagonals and ku superdiagonals stored in column-major band format, x and y are vectors, and alpha and beta are scalars.
Declaration
public void Gbmv(Operation trans, int m, int n, int kl, int ku, CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> beta, CudaDeviceVariable<double> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix A. |
| System.Int32 | n | number of columns of matrix A. |
| System.Int32 | kl | number of subdiagonals of matrix A. |
| System.Int32 | ku | number of superdiagonals of matrix A. |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n, with lda >= kl + ku + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | x | vector with n elements if trans is non-transpose and m elements otherwise. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Double> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Double> | y | vector with m elements if trans is non-transpose and n elements otherwise. |
| System.Int32 | incy | stride between consecutive elements of y. |
Gbmv(Operation, Int32, Int32, Int32, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32)
This function performs the banded matrix-vector multiplication y = alpha * Op(A) * x + beta * y where A is an m*n banded matrix with kl subdiagonals and ku superdiagonals stored in column-major band format, x and y are vectors, and alpha and beta are scalars.
Declaration
public void Gbmv(Operation trans, int m, int n, int kl, int ku, CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> beta, CudaDeviceVariable<float> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix A. |
| System.Int32 | n | number of columns of matrix A. |
| System.Int32 | kl | number of subdiagonals of matrix A. |
| System.Int32 | ku | number of superdiagonals of matrix A. |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n, with lda >= kl + ku + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | x | vector with n elements if trans is non-transpose and m elements otherwise. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Single> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Single> | y | vector with m elements if trans is non-transpose and n elements otherwise. |
| System.Int32 | incy | stride between consecutive elements of y. |
Gbmv(Operation, Int32, Int32, Int32, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the banded matrix-vector multiplication y = alpha * Op(A) * x + beta * y where A is an m*n banded matrix with kl subdiagonals and ku superdiagonals stored in column-major band format, x and y are vectors, and alpha and beta are scalars.
Declaration
public void Gbmv(Operation trans, int m, int n, int kl, int ku, cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> x, int incx, cuDoubleComplex beta, CudaDeviceVariable<cuDoubleComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix A. |
| System.Int32 | n | number of columns of matrix A. |
| System.Int32 | kl | number of subdiagonals of matrix A. |
| System.Int32 | ku | number of superdiagonals of matrix A. |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= kl + ku + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements if trans is non-transpose and m elements otherwise. |
| System.Int32 | incx | stride between consecutive elements of x. |
| cuDoubleComplex | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with m elements if trans is non-transpose and n elements otherwise. |
| System.Int32 | incy | stride between consecutive elements of y. |
Gbmv(Operation, Int32, Int32, Int32, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the banded matrix-vector multiplication y = alpha * Op(A) * x + beta * y where A is an m*n banded matrix with kl subdiagonals and ku superdiagonals stored in column-major band format, x and y are vectors, and alpha and beta are scalars.
Declaration
public void Gbmv(Operation trans, int m, int n, int kl, int ku, cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> x, int incx, cuFloatComplex beta, CudaDeviceVariable<cuFloatComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix A. |
| System.Int32 | n | number of columns of matrix A. |
| System.Int32 | kl | number of subdiagonals of matrix A. |
| System.Int32 | ku | number of superdiagonals of matrix A. |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= kl + ku + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements if trans is non-transpose and m elements otherwise. |
| System.Int32 | incx | stride between consecutive elements of x. |
| cuFloatComplex | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with m elements if trans is non-transpose and n elements otherwise. |
| System.Int32 | incy | stride between consecutive elements of y. |
Gbmv(Operation, Int32, Int32, Int32, Int32, Double, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, Double, CudaDeviceVariable<Double>, Int32)
This function performs the banded matrix-vector multiplication y = alpha * Op(A) * x + beta * y where A is an m*n banded matrix with kl subdiagonals and ku superdiagonals stored in column-major band format, x and y are vectors, and alpha and beta are scalars.
Declaration
public void Gbmv(Operation trans, int m, int n, int kl, int ku, double alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> x, int incx, double beta, CudaDeviceVariable<double> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix A. |
| System.Int32 | n | number of columns of matrix A. |
| System.Int32 | kl | number of subdiagonals of matrix A. |
| System.Int32 | ku | number of superdiagonals of matrix A. |
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n, with lda >= kl + ku + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | x | vector with n elements if trans is non-transpose and m elements otherwise. |
| System.Int32 | incx | stride between consecutive elements of x. |
| System.Double | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Double> | y | vector with m elements if trans is non-transpose and n elements otherwise. |
| System.Int32 | incy | stride between consecutive elements of y. |
Gbmv(Operation, Int32, Int32, Int32, Int32, Single, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, Single, CudaDeviceVariable<Single>, Int32)
This function performs the banded matrix-vector multiplication y = alpha * Op(A) * x + beta * y where A is an m*n banded matrix with kl subdiagonals and ku superdiagonals stored in column-major band format, x and y are vectors, and alpha and beta are scalars.
Declaration
public void Gbmv(Operation trans, int m, int n, int kl, int ku, float alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> x, int incx, float beta, CudaDeviceVariable<float> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix A. |
| System.Int32 | n | number of columns of matrix A. |
| System.Int32 | kl | number of subdiagonals of matrix A. |
| System.Int32 | ku | number of superdiagonals of matrix A. |
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n, with lda >= kl + ku + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | x | vector with n elements if trans is non-transpose and m elements otherwise. |
| System.Int32 | incx | stride between consecutive elements of x. |
| System.Single | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Single> | y | vector with m elements if trans is non-transpose and n elements otherwise. |
| System.Int32 | incy | stride between consecutive elements of y. |
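
How kl, ku and lda relate to the stored array is easiest to see in a CPU reference for the non-transposed case. The sketch assumes the standard BLAS band storage, in which element (i, j) of A (0-based) lives at `A[(ku + i - j) + j * lda]` for `max(0, j - ku) <= i <= min(m - 1, j + kl)`. The class `GbmvReference` is hypothetical, unit strides are assumed, and plain arrays stand in for device memory.

```csharp
using System;

public static class GbmvReference
{
    // Hypothetical CPU reference: y = alpha * A * x + beta * y for a non-transposed
    // m x n banded matrix held in band storage ab with leading dimension lda.
    public static void Gbmv(int m, int n, int kl, int ku, double alpha,
                            double[] ab, int lda, double[] x, double beta, double[] y)
    {
        for (int i = 0; i < m; i++)
        {
            // Only columns inside the band contribute to row i.
            int jStart = Math.Max(0, i - kl);
            int jEnd = Math.Min(n - 1, i + ku);

            double sum = 0.0;
            for (int j = jStart; j <= jEnd; j++)
            {
                sum += ab[(ku + i - j) + j * lda] * x[j];
            }

            // When beta is zero, y is treated as output only and its old value is ignored.
            y[i] = alpha * sum + (beta == 0.0 ? 0.0 : beta * y[i]);
        }
    }
}
```

For the tridiagonal matrix with 2 on the diagonal and 1 on both off-diagonals (`kl = ku = 1`, `lda = 3`), the band array is `{0, 2, 1, 1, 2, 1, 1, 2, 0}`, where the two zeros are unused padding. With `x = {1, 2, 3}`, `alpha = 1` and `beta = 0` the sketch produces `y = {4, 8, 8}`.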
Geam(Operation, Operation, Int32, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the matrix-matrix addition/transposition C = alpha * Op(A) + beta * Op(B) where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m*n, op(B) m*n and C m*n, respectively.
Declaration
public void Geam(Operation transa, Operation transb, int m, int n, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb, CudaDeviceVariable<cuDoubleComplex> beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n if transa is non-transpose and lda * m otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimensions ldb * n if transb is non-transpose and ldb * m otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<cuDoubleComplex> | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Geam(Operation, Operation, Int32, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the matrix-matrix addition/transposition C = alpha * Op(A) + beta * Op(B) where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m*n, op(B) m*n and C m*n, respectively.
Declaration
public void Geam(Operation transa, Operation transb, int m, int n, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb, CudaDeviceVariable<cuFloatComplex> beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n if transa is non-transpose and lda * m otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimensions ldb * n if transb is non-transpose and ldb * m otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<cuFloatComplex> | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Geam(Operation, Operation, Int32, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32)
This function performs the matrix-matrix addition/transposition C = alpha * Op(A) + beta * Op(B) where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m*n, op(B) m*n and C m*n, respectively.
Declaration
public void Geam(Operation transa, Operation transb, int m, int n, CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> B, int ldb, CudaDeviceVariable<double> beta, CudaDeviceVariable<double> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n if transa is non-transpose and lda * m otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | B | array of dimensions ldb * n if transb is non-transpose and ldb * m otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<System.Double> | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Geam(Operation, Operation, Int32, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32)
This function performs the matrix-matrix addition/transposition C = alpha * Op(A) + beta * Op(B) where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m*n, op(B) m*n and C m*n, respectively.
Declaration
public void Geam(Operation transa, Operation transb, int m, int n, CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> B, int ldb, CudaDeviceVariable<float> beta, CudaDeviceVariable<float> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n if transa is non-transpose and lda * m otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | B | array of dimensions ldb * n if transb is non-transpose and ldb * m otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<System.Single> | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Geam(Operation, Operation, Int32, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the matrix-matrix addition/transposition C = alpha * Op(A) + beta * Op(B) where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m*n, op(B) m*n and C m*n, respectively.
Declaration
public void Geam(Operation transa, Operation transb, int m, int n, cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb, cuDoubleComplex beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n if transa is non-transpose and lda * m otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimensions ldb * n if transb is non-transpose and ldb * m otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| cuDoubleComplex | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Geam(Operation, Operation, Int32, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the matrix-matrix addition/transposition C = alpha * Op(A) + beta * Op(B) where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m*n, op(B) m*n and C m*n, respectively.
Declaration
public void Geam(Operation transa, Operation transb, int m, int n, cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb, cuFloatComplex beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n if transa is non-transpose and lda * m otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimensions ldb * n if transb is non-transpose and ldb * m otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| cuFloatComplex | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Geam(Operation, Operation, Int32, Int32, Double, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, Double, CudaDeviceVariable<Double>, Int32)
This function performs the matrix-matrix addition/transposition C = alpha * Op(A) + beta * Op(B) where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m*n, op(B) m*n and C m*n, respectively.
Declaration
public void Geam(Operation transa, Operation transb, int m, int n, double alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> B, int ldb, double beta, CudaDeviceVariable<double> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n if transa is non-transpose and lda * m otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | B | array of dimensions ldb * n if transb is non-transpose and ldb * m otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| System.Double | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Geam(Operation, Operation, Int32, Int32, Single, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, Single, CudaDeviceVariable<Single>, Int32)
This function performs the matrix-matrix addition/transposition C = alpha * Op(A) + beta * Op(B) where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m*n, op(B) m*n and C m*n, respectively.
Declaration
public void Geam(Operation transa, Operation transb, int m, int n, float alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> B, int ldb, float beta, CudaDeviceVariable<float> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n with lda >= max(1,m) if transa is non-transpose, and lda * m with lda >= max(1,n) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| System.Single | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
GelsBatchedC(Operation, Int32, Int32, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>)
This function finds the least squares solution of a batch of overdetermined systems. On exit, each Aarray[i] is overwritten with its QR factorization and each Carray[i] is overwritten with the least squares solution. GelsBatched supports only the non-transpose operation and only solves overdetermined systems (m >= n).
GelsBatched only supports compute capability 2.0 or above.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor.
Declaration
public int GelsBatchedC(Operation trans, int m, int n, int nrhs, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<CUdeviceptr> Carray, int ldc, CudaDeviceVariable<int> devInfoArray)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(Aarray[i]) that is non- or (conj.) transpose. Only non-transpose operation is currently supported. |
| System.Int32 | m | number of rows of each Aarray[i]. |
| System.Int32 | n | number of columns of each Aarray[i] and rows of each Carray[i]. |
| System.Int32 | nrhs | number of columns of each Carray[i]. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of pointers to device array, with each array of dim. m x n with lda>=max(1,m). The array size determines the number of batches. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix Aarray[i] |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of pointers to device array, with each array of dim. m x n with ldc>=max(1,m). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix Carray[i]. |
| CudaDeviceVariable<System.Int32> | devInfoArray | null or optional array of integers of dimension batchsize. |
Returns
| Type | Description |
|---|---|
| System.Int32 | 0 if the parameters passed to the function are valid; a negative value if the parameter in position -value is invalid. |
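The return convention above can be decoded with a few lines of code. The helper below is a hypothetical sketch (its name and messages are not part of ManagedCuda). Whether the reported position counts the parameters of this wrapper or those of the underlying native call is not stated here, so treat the number as a hint rather than an exact index.

```csharp
using System;

// Hypothetical helper that turns the integer returned by the GelsBatched
// methods into a readable message, following the documented convention.
public static class GelsInfo
{
    public static string Describe(int info)
    {
        if (info == 0)
            return "all parameters valid";
        if (info < 0)
            return $"parameter in position {-info} is invalid";
        // Positive values are not described in the documentation.
        return $"unexpected positive value {info}";
    }

    public static void Main()
    {
        Console.WriteLine(Describe(0));    // prints "all parameters valid"
        Console.WriteLine(Describe(-3));   // prints "parameter in position 3 is invalid"
    }
}
```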
GelsBatchedD(Operation, Int32, Int32, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>)
This function finds the least squares solution of a batch of overdetermined systems. On exit, each Aarray[i] is overwritten with its QR factorization and each Carray[i] is overwritten with the least squares solution. GelsBatched supports only the non-transpose operation and only solves overdetermined systems (m >= n).
GelsBatched only supports compute capability 2.0 or above.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor.
Declaration
public int GelsBatchedD(Operation trans, int m, int n, int nrhs, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<CUdeviceptr> Carray, int ldc, CudaDeviceVariable<int> devInfoArray)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(Aarray[i]) that is non- or (conj.) transpose. Only non-transpose operation is currently supported. |
| System.Int32 | m | number of rows of each Aarray[i]. |
| System.Int32 | n | number of columns of each Aarray[i] and rows of each Carray[i]. |
| System.Int32 | nrhs | number of columns of each Carray[i]. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of pointers to device array, with each array of dim. m x n with lda>=max(1,m). The array size determines the number of batches. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix Aarray[i] |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of pointers to device array, with each array of dim. m x n with ldc>=max(1,m). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix Carray[i]. |
| CudaDeviceVariable<System.Int32> | devInfoArray | null or optional array of integers of dimension batchsize. |
Returns
| Type | Description |
|---|---|
| System.Int32 | 0 if the parameters passed to the function are valid; a negative value if the parameter in position -value is invalid. |
GelsBatchedS(Operation, Int32, Int32, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>)
This function finds the least squares solution of a batch of overdetermined systems. On exit, each Aarray[i] is overwritten with its QR factorization and each Carray[i] is overwritten with the least squares solution. GelsBatched supports only the non-transpose operation and only solves overdetermined systems (m >= n).
GelsBatched only supports compute capability 2.0 or above.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor.
Declaration
public int GelsBatchedS(Operation trans, int m, int n, int nrhs, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<CUdeviceptr> Carray, int ldc, CudaDeviceVariable<int> devInfoArray)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(Aarray[i]) that is non- or (conj.) transpose. Only non-transpose operation is currently supported. |
| System.Int32 | m | number of rows of each Aarray[i]. |
| System.Int32 | n | number of columns of each Aarray[i] and rows of each Carray[i]. |
| System.Int32 | nrhs | number of columns of each Carray[i]. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of pointers to device array, with each array of dim. m x n with lda>=max(1,m). The array size determines the number of batches. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix Aarray[i] |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of pointers to device array, with each array of dim. m x n with ldc>=max(1,m). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix Carray[i]. |
| CudaDeviceVariable<System.Int32> | devInfoArray | null or optional array of integers of dimension batchsize. |
Returns
| Type | Description |
|---|---|
| System.Int32 | 0 if the parameters passed to the function are valid; a negative value if the parameter in position -value is invalid. |
GelsBatchedZ(Operation, Int32, Int32, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>)
This function finds the least squares solution of a batch of overdetermined systems. On exit, each Aarray[i] is overwritten with its QR factorization and each Carray[i] is overwritten with the least squares solution. GelsBatched supports only the non-transpose operation and only solves overdetermined systems (m >= n).
GelsBatched only supports compute capability 2.0 or above.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor.
Declaration
public int GelsBatchedZ(Operation trans, int m, int n, int nrhs, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<CUdeviceptr> Carray, int ldc, CudaDeviceVariable<int> devInfoArray)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(Aarray[i]) that is non- or (conj.) transpose. Only non-transpose operation is currently supported. |
| System.Int32 | m | number of rows of each Aarray[i]. |
| System.Int32 | n | number of columns of each Aarray[i] and rows of each Carray[i]. |
| System.Int32 | nrhs | number of columns of each Carray[i]. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of pointers to device array, with each array of dim. m x n with lda>=max(1,m). The array size determines the number of batches. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix Aarray[i] |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of pointers to device array, with each array of dim. m x n with ldc>=max(1,m). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix Carray[i]. |
| CudaDeviceVariable<System.Int32> | devInfoArray | null or optional array of integers of dimension batchsize. |
Returns
| Type | Description |
|---|---|
| System.Int32 | 0 if the parameters passed to the function are valid; a negative value if the parameter in position -value is invalid. |
Gemm(Operation, Operation, Int32, Int32, Int32, half, CudaDeviceVariable<half>, Int32, CudaDeviceVariable<half>, Int32, half, CudaDeviceVariable<half>, Int32)
This function performs the matrix-matrix multiplication C = alpha * op(A) * op(B) + beta * C, where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m x k, op(B) k x n and C m x n, respectively.
Declaration
public void Gemm(Operation transa, Operation transb, int m, int n, int k, half alpha, CudaDeviceVariable<half> A, int lda, CudaDeviceVariable<half> B, int ldb, half beta, CudaDeviceVariable<half> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| System.Int32 | k | number of columns of op(A) and rows of op(B). |
| half | alpha | scalar used for multiplication. |
| CudaDeviceVariable<half> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<half> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| half | beta | scalar used for multiplication. |
| CudaDeviceVariable<half> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Gemm(Operation, Operation, Int32, Int32, Int32, CudaDeviceVariable<half>, CudaDeviceVariable<half>, Int32, CudaDeviceVariable<half>, Int32, CudaDeviceVariable<half>, CudaDeviceVariable<half>, Int32)
This function performs the matrix-matrix multiplication C = alpha * op(A) * op(B) + beta * C, where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m x k, op(B) k x n and C m x n, respectively.
Declaration
public void Gemm(Operation transa, Operation transb, int m, int n, int k, CudaDeviceVariable<half> alpha, CudaDeviceVariable<half> A, int lda, CudaDeviceVariable<half> B, int ldb, CudaDeviceVariable<half> beta, CudaDeviceVariable<half> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| System.Int32 | k | number of columns of op(A) and rows of op(B). |
| CudaDeviceVariable<half> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<half> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<half> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<half> | beta | scalar used for multiplication. |
| CudaDeviceVariable<half> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Gemm(Operation, Operation, Int32, Int32, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the matrix-matrix multiplication C = alpha * op(A) * op(B) + beta * C, where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m x k, op(B) k x n and C m x n, respectively.
Declaration
public void Gemm(Operation transa, Operation transb, int m, int n, int k, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb, CudaDeviceVariable<cuDoubleComplex> beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| System.Int32 | k | number of columns of op(A) and rows of op(B). |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<cuDoubleComplex> | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Gemm(Operation, Operation, Int32, Int32, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the matrix-matrix multiplication C = alpha * op(A) * op(B) + beta * C, where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m x k, op(B) k x n and C m x n, respectively.
Declaration
public void Gemm(Operation transa, Operation transb, int m, int n, int k, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb, CudaDeviceVariable<cuFloatComplex> beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| System.Int32 | k | number of columns of op(A) and rows of op(B). |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<cuFloatComplex> | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Gemm(Operation, Operation, Int32, Int32, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32)
This function performs the matrix-matrix multiplication C = alpha * op(A) * op(B) + beta * C, where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m x k, op(B) k x n and C m x n, respectively.
Declaration
public void Gemm(Operation transa, Operation transb, int m, int n, int k, CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> B, int ldb, CudaDeviceVariable<double> beta, CudaDeviceVariable<double> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| System.Int32 | k | number of columns of op(A) and rows of op(B). |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<System.Double> | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Gemm(Operation, Operation, Int32, Int32, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32)
This function performs the matrix-matrix multiplication C = alpha * op(A) * op(B) + beta * C, where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m x k, op(B) k x n and C m x n, respectively.
Declaration
public void Gemm(Operation transa, Operation transb, int m, int n, int k, CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> B, int ldb, CudaDeviceVariable<float> beta, CudaDeviceVariable<float> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| System.Int32 | k | number of columns of op(A) and rows of op(B). |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<System.Single> | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Gemm(Operation, Operation, Int32, Int32, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the matrix-matrix multiplication C = alpha * op(A) * op(B) + beta * C, where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m x k, op(B) k x n and C m x n, respectively.
Declaration
public void Gemm(Operation transa, Operation transb, int m, int n, int k, cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb, cuDoubleComplex beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| System.Int32 | k | number of columns of op(A) and rows of op(B). |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| cuDoubleComplex | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Gemm(Operation, Operation, Int32, Int32, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the matrix-matrix multiplication C = alpha * op(A) * op(B) + beta * C, where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m x k, op(B) k x n and C m x n, respectively.
Declaration
public void Gemm(Operation transa, Operation transb, int m, int n, int k, cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb, cuFloatComplex beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| System.Int32 | k | number of columns of op(A) and rows of op(B). |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| cuFloatComplex | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Gemm(Operation, Operation, Int32, Int32, Int32, Double, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, Double, CudaDeviceVariable<Double>, Int32)
This function performs the matrix-matrix multiplication C = alpha * op(A) * op(B) + beta * C, where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m x k, op(B) k x n and C m x n, respectively.
Declaration
public void Gemm(Operation transa, Operation transb, int m, int n, int k, double alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> B, int ldb, double beta, CudaDeviceVariable<double> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| System.Int32 | k | number of columns of op(A) and rows of op(B). |
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| System.Double | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
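Because the matrices are expected in column-major order while C# two-dimensional arrays are indexed row by row, host data usually has to be repacked before it is copied to the device. The sketch below shows one way to do that on the CPU. `ColumnMajor.Pack` is a hypothetical helper, not a ManagedCuda API, and it assumes the untransposed case, where the leading dimension must be at least the row count.

```csharp
using System;

// Hypothetical helper: flattens a C# 2D array into a column-major buffer.
public static class ColumnMajor
{
    // Element (row, col) lands at index row + col * ld, so the buffer holds
    // ld * cols values. Any padding rows (ld > rows) are left at zero.
    public static double[] Pack(double[,] matrix, int ld)
    {
        int rows = matrix.GetLength(0);
        int cols = matrix.GetLength(1);
        if (ld < Math.Max(1, rows))
            throw new ArgumentException("ld must be at least max(1, rows)", nameof(ld));

        double[] flat = new double[ld * cols];
        for (int col = 0; col < cols; col++)
            for (int row = 0; row < rows; row++)
                flat[row + col * ld] = matrix[row, col];
        return flat;
    }

    public static void Main()
    {
        double[,] a = { { 1, 2, 3 }, { 4, 5, 6 } };   // 2 x 3 matrix

        // Tightly packed: lda equals the number of rows.
        Console.WriteLine(string.Join(", ", Pack(a, 2)));   // prints "1, 4, 2, 5, 3, 6"
    }
}
```

The resulting flat array has the `lda * k` shape described for A in the table above (with k equal to the column count) and could then be copied into a `CudaDeviceVariable<double>`.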
Gemm(Operation, Operation, Int32, Int32, Int32, Single, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, Single, CudaDeviceVariable<Single>, Int32)
This function performs the matrix-matrix multiplication C = alpha * op(A) * op(B) + beta * C, where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m x k, op(B) k x n and C m x n, respectively.
Declaration
public void Gemm(Operation transa, Operation transb, int m, int n, int k, float alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> B, int ldb, float beta, CudaDeviceVariable<float> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| System.Int32 | k | number of columns of op(A) and rows of op(B). |
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| System.Single | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
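To make the roles of m, n, k and the leading dimensions concrete, the following CPU-only sketch evaluates the same formula, C = alpha * op(A) * op(B) + beta * C, on plain arrays. It is a reference for checking results, not the library's implementation. The names are hypothetical and `bool` flags replace the `Operation` arguments so the example runs without a GPU.

```csharp
using System;

// CPU reference for C = alpha * op(A) * op(B) + beta * C in column-major storage.
// Hypothetical helper for illustration only; it does not call ManagedCuda.
public static class GemmReference
{
    // Element (row, col) of op(X) for a column-major X with leading dimension ld.
    static float At(float[] x, int ld, bool transpose, int row, int col)
        => transpose ? x[col + row * ld] : x[row + col * ld];

    public static void Gemm(bool transA, bool transB, int m, int n, int k,
                            float alpha, float[] a, int lda,
                            float[] b, int ldb, float beta,
                            float[] c, int ldc)
    {
        for (int col = 0; col < n; col++)
        {
            for (int row = 0; row < m; row++)
            {
                float sum = 0f;
                for (int p = 0; p < k; p++)
                    sum += At(a, lda, transA, row, p) * At(b, ldb, transB, p, col);

                int idx = row + col * ldc;
                // Skip reading C when beta is zero so stale values cannot leak in.
                float prior = beta == 0f ? 0f : beta * c[idx];
                c[idx] = alpha * sum + prior;
            }
        }
    }

    public static void Main()
    {
        // Column-major 2 x 2 inputs: A = [1 3; 2 4], B = [5 7; 6 8], C = all ones.
        float[] a = { 1, 2, 3, 4 };
        float[] b = { 5, 6, 7, 8 };
        float[] c = { 1, 1, 1, 1 };

        Gemm(false, false, 2, 2, 2, 1f, a, 2, b, 2, 1f, c, 2);
        Console.WriteLine(string.Join(", ", c));   // prints "24, 35, 32, 47"
    }
}
```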
GemmBatched(Operation, Operation, Int32, Int32, Int32, half, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, half, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function performs the matrix-matrix multiplications of an array of matrices: C[i] = alpha * op(A[i]) * op(B[i]) + beta * C[i], where alpha and beta are scalars, and A, B and C are arrays of pointers to matrices stored in column-major format with dimensions op(A[i]) m x k, op(B[i]) k x n and C[i] m x n, respectively.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor. For small sizes, typically smaller than 100x100, this function significantly improves performance compared to making calls to its corresponding cublas<type>gemm routine. However, on GPU architectures that support concurrent kernels, it might be advantageous to make multiple calls to cublas<type>gemm into different streams as the matrix sizes increase.
Declaration
public void GemmBatched(Operation transa, Operation transb, int m, int n, int k, half alpha, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<CUdeviceptr> Barray, int ldb, half beta, CudaDeviceVariable<CUdeviceptr> Carray, int ldc, int batchCount)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A[i]) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B[i]) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A[i]) and C[i]. |
| System.Int32 | n | number of columns of op(B[i]) and C[i]. |
| System.Int32 | k | number of columns of op(A[i]) and rows of op(B[i]). |
| half | alpha | scalar used for multiplication. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of device pointers, with each array/device pointer of dim. lda x k with lda>=max(1,m) if transa==CUBLAS_OP_N and lda x m with lda>=max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | Barray | array of device pointers, with each array of dim. ldb x n with ldb>=max(1,k) if transb==CUBLAS_OP_N and ldb x k with ldb>=max(1,n) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store each matrix B[i]. |
| half | beta | scalar used for multiplication. If beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of device pointers. It has dimensions ldc x n with ldc>=max(1,m). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix C[i]. |
| System.Int32 | batchCount | number of pointers contained in A, B and C. |
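The leading-dimension constraints in the table can be checked on the host before a batch is launched. The sketch below encodes them as plain C# predicates. The class and method names are hypothetical, `bool` flags stand in for the `Operation` values, and it assumes (as in cuBLAS) that the constraint on B depends on `transb`.

```csharp
using System;

// Hypothetical host-side checks for the GemmBatched size constraints.
public static class GemmBatchedChecks
{
    // lda >= max(1,m) when A is not transposed, max(1,k) otherwise;
    // ldb >= max(1,k) when B is not transposed, max(1,n) otherwise;
    // ldc >= max(1,m) in all cases.
    public static bool LeadingDimensionsAreValid(bool transA, bool transB,
                                                 int m, int n, int k,
                                                 int lda, int ldb, int ldc)
    {
        int minLda = Math.Max(1, transA ? k : m);
        int minLdb = Math.Max(1, transB ? n : k);
        int minLdc = Math.Max(1, m);
        return lda >= minLda && ldb >= minLdb && ldc >= minLdc;
    }

    // Element counts each device buffer in the batch must provide.
    public static int ElementsPerA(bool transA, int m, int k, int lda) => lda * (transA ? m : k);
    public static int ElementsPerB(bool transB, int n, int k, int ldb) => ldb * (transB ? k : n);
    public static int ElementsPerC(int n, int ldc) => ldc * n;

    public static void Main()
    {
        // 4 x 3 times 3 x 2, tightly packed and not transposed.
        Console.WriteLine(LeadingDimensionsAreValid(false, false, 4, 2, 3, 4, 3, 4)); // prints "True"
        // Same product with A supplied transposed (stored 3 x 4): lda must be >= 3.
        Console.WriteLine(LeadingDimensionsAreValid(true, false, 4, 2, 3, 2, 3, 4));  // prints "False"
        Console.WriteLine(ElementsPerA(false, 4, 3, 4));                              // prints "12"
    }
}
```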
GemmBatched(Operation, Operation, Int32, Int32, Int32, CudaDeviceVariable<half>, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<half>, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function performs the matrix-matrix multiplications of an array of matrices: C[i] = alpha * op(A[i]) * op(B[i]) + beta * C[i], where alpha and beta are scalars, and A, B and C are arrays of pointers to matrices stored in column-major format with dimensions op(A[i]) m x k, op(B[i]) k x n and C[i] m x n, respectively.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor. For small sizes, typically smaller than 100x100, this function significantly improves performance compared to making calls to its corresponding cublas<type>gemm routine. However, on GPU architectures that support concurrent kernels, it might be advantageous to make multiple calls to cublas<type>gemm into different streams as the matrix sizes increase.
Declaration
public void GemmBatched(Operation transa, Operation transb, int m, int n, int k, CudaDeviceVariable<half> alpha, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<CUdeviceptr> Barray, int ldb, CudaDeviceVariable<half> beta, CudaDeviceVariable<CUdeviceptr> Carray, int ldc, int batchCount)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A[i]) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B[i]) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A[i]) and C[i]. |
| System.Int32 | n | number of columns of op(B[i]) and C[i]. |
| System.Int32 | k | number of columns of op(A[i]) and rows of op(B[i]). |
| CudaDeviceVariable<half> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of device pointers, with each array/device pointer of dim. lda x k with lda>=max(1,m) if transa==CUBLAS_OP_N and lda x m with lda>=max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | Barray | array of device pointers, with each array of dim. ldb x n with ldb>=max(1,k) if transb==CUBLAS_OP_N and ldb x k with ldb>=max(1,n) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store each matrix B[i]. |
| CudaDeviceVariable<half> | beta | scalar used for multiplication. If beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of device pointers. It has dimensions ldc x n with ldc>=max(1,m). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix C[i]. |
| System.Int32 | batchCount | number of pointers contained in A, B and C. |
GemmBatched(Operation, Operation, Int32, Int32, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function performs the matrix-matrix multiplications of an array of matrices: C[i] = alpha * op(A[i]) * op(B[i]) + beta * C[i] for i = 0, ..., batchCount - 1, where alpha and beta are scalars, and A, B and C are arrays of pointers to matrices stored in column-major format with dimensions op(A[i]) m x k, op(B[i]) k x n and C[i] m x n, respectively.
This function is intended for small matrices, where launch overhead is a significant factor. For small sizes, typically smaller than 100x100, it significantly improves performance compared to making separate calls to the corresponding cublas<type>gemm routine. However, on GPU architectures that support concurrent kernels, it might be advantageous to make multiple calls to cublas<type>gemm in different streams as the matrix sizes increase.
Declaration
public void GemmBatched(Operation transa, Operation transb, int m, int n, int k, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<CUdeviceptr> Barray, int ldb, CudaDeviceVariable<cuDoubleComplex> beta, CudaDeviceVariable<CUdeviceptr> Carray, int ldc, int batchCount)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A[i]) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B[i]) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A[i]) and C[i]. |
| System.Int32 | n | number of columns of op(B[i]) and C[i]. |
| System.Int32 | k | number of columns of op(A[i]) and rows of op(B[i]). |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of device pointers, with each array/device pointer of dim. lda x k with lda>=max(1,m) if transa==CUBLAS_OP_N and lda x m with lda>=max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | Barray | array of device pointers, with each array of dim. ldb x n with ldb>=max(1,k) if transb==CUBLAS_OP_N and ldb x k with ldb>=max(1,n) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store each matrix B[i]. |
| CudaDeviceVariable<cuDoubleComplex> | beta | scalar used for multiplication. If beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of device pointers. It has dimensions ldc x n with ldc>=max(1,m). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix C[i]. |
| System.Int32 | batchCount | number of pointers contained in A, B and C. |
GemmBatched(Operation, Operation, Int32, Int32, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function performs the matrix-matrix multiplications of an array of matrices: C[i] = alpha * op(A[i]) * op(B[i]) + beta * C[i] for i = 0, ..., batchCount - 1, where alpha and beta are scalars, and A, B and C are arrays of pointers to matrices stored in column-major format with dimensions op(A[i]) m x k, op(B[i]) k x n and C[i] m x n, respectively.
This function is intended for small matrices, where launch overhead is a significant factor. For small sizes, typically smaller than 100x100, it significantly improves performance compared to making separate calls to the corresponding cublas<type>gemm routine. However, on GPU architectures that support concurrent kernels, it might be advantageous to make multiple calls to cublas<type>gemm in different streams as the matrix sizes increase.
Declaration
public void GemmBatched(Operation transa, Operation transb, int m, int n, int k, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<CUdeviceptr> Barray, int ldb, CudaDeviceVariable<cuFloatComplex> beta, CudaDeviceVariable<CUdeviceptr> Carray, int ldc, int batchCount)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A[i]) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B[i]) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A[i]) and C[i]. |
| System.Int32 | n | number of columns of op(B[i]) and C[i]. |
| System.Int32 | k | number of columns of op(A[i]) and rows of op(B[i]). |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of device pointers, with each array/device pointer of dim. lda x k with lda>=max(1,m) if transa==CUBLAS_OP_N and lda x m with lda>=max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | Barray | array of device pointers, with each array of dim. ldb x n with ldb>=max(1,k) if transb==CUBLAS_OP_N and ldb x k with ldb>=max(1,n) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store each matrix B[i]. |
| CudaDeviceVariable<cuFloatComplex> | beta | scalar used for multiplication. If beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of device pointers. It has dimensions ldc x n with ldc>=max(1,m). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix C[i]. |
| System.Int32 | batchCount | number of pointers contained in A, B and C. |
GemmBatched(Operation, Operation, Int32, Int32, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function performs the matrix-matrix multiplications of an array of matrices: C[i] = alpha * op(A[i]) * op(B[i]) + beta * C[i] for i = 0, ..., batchCount - 1, where alpha and beta are scalars, and A, B and C are arrays of pointers to matrices stored in column-major format with dimensions op(A[i]) m x k, op(B[i]) k x n and C[i] m x n, respectively.
This function is intended for small matrices, where launch overhead is a significant factor. For small sizes, typically smaller than 100x100, it significantly improves performance compared to making separate calls to the corresponding cublas<type>gemm routine. However, on GPU architectures that support concurrent kernels, it might be advantageous to make multiple calls to cublas<type>gemm in different streams as the matrix sizes increase.
Declaration
public void GemmBatched(Operation transa, Operation transb, int m, int n, int k, CudaDeviceVariable<double> alpha, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<CUdeviceptr> Barray, int ldb, CudaDeviceVariable<double> beta, CudaDeviceVariable<CUdeviceptr> Carray, int ldc, int batchCount)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A[i]) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B[i]) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A[i]) and C[i]. |
| System.Int32 | n | number of columns of op(B[i]) and C[i]. |
| System.Int32 | k | number of columns of op(A[i]) and rows of op(B[i]). |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of device pointers, with each array/device pointer of dim. lda x k with lda>=max(1,m) if transa==CUBLAS_OP_N and lda x m with lda>=max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | Barray | array of device pointers, with each array of dim. ldb x n with ldb>=max(1,k) if transb==CUBLAS_OP_N and ldb x k with ldb>=max(1,n) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store each matrix B[i]. |
| CudaDeviceVariable<System.Double> | beta | scalar used for multiplication. If beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of device pointers. It has dimensions ldc x n with ldc>=max(1,m). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix C[i]. |
| System.Int32 | batchCount | number of pointers contained in A, B and C. |
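The leading-dimension constraints in the table above can be checked on the host before a call. The following is a minimal sketch, not part of ManagedCuda: the class `LeadingDimensionCheck` and its methods are hypothetical names, and the `bool` flags stand in for "`transa`/`transb` is a (conjugate) transpose".

```csharp
using System;

// Hypothetical helper (not part of ManagedCuda) that mirrors the constraints
// documented for Aarray, Barray and Carray.
public static class LeadingDimensionCheck
{
    // Linear index of element (row, col) in a column-major matrix
    // stored with leading dimension ld.
    public static int Index(int row, int col, int ld) => row + col * ld;

    // Throws if lda, ldb or ldc is smaller than the documented minimum.
    // transA / transB are true when the operation is a (conjugate) transpose.
    public static void Validate(bool transA, bool transB,
                                int m, int n, int k,
                                int lda, int ldb, int ldc)
    {
        int minLda = Math.Max(1, transA ? k : m); // A stored as m x k, or k x m if transposed
        int minLdb = Math.Max(1, transB ? n : k); // B stored as k x n, or n x k if transposed
        int minLdc = Math.Max(1, m);              // C is always m x n

        if (lda < minLda) throw new ArgumentException($"lda must be >= {minLda}", nameof(lda));
        if (ldb < minLdb) throw new ArgumentException($"ldb must be >= {minLdb}", nameof(ldb));
        if (ldc < minLdc) throw new ArgumentException($"ldc must be >= {minLdc}", nameof(ldc));
    }
}
```

A leading dimension larger than the minimum is valid and simply means each column is padded in memory.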
GemmBatched(Operation, Operation, Int32, Int32, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<CUdeviceptr>, cudaDataType, Int32, CudaDeviceVariable<CUdeviceptr>, cudaDataType, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<CUdeviceptr>, cudaDataType, Int32, Int32, cudaDataType, GemmAlgo)
This function performs the matrix-matrix multiplications of an array of matrices: C[i] = alpha * op(A[i]) * op(B[i]) + beta * C[i] for i = 0, ..., batchCount - 1, where alpha and beta are scalars, and A, B and C are arrays of pointers to matrices stored in column-major format with dimensions op(A[i]) m x k, op(B[i]) k x n and C[i] m x n, respectively.
This function is intended for small matrices, where launch overhead is a significant factor. For small sizes, typically smaller than 100x100, it significantly improves performance compared to making separate calls to the corresponding cublas<type>gemm routine. However, on GPU architectures that support concurrent kernels, it might be advantageous to make multiple calls to cublas<type>gemm in different streams as the matrix sizes increase.
Declaration
public void GemmBatched(Operation transa, Operation transb, int m, int n, int k, CudaDeviceVariable<float> alpha, CudaDeviceVariable<CUdeviceptr> Aarray, cudaDataType Atype, int lda, CudaDeviceVariable<CUdeviceptr> Barray, cudaDataType Btype, int ldb, CudaDeviceVariable<float> beta, CudaDeviceVariable<CUdeviceptr> Carray, cudaDataType Ctype, int ldc, int batchCount, cudaDataType computeType, GemmAlgo algo)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A[i]) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B[i]) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A[i]) and C[i]. |
| System.Int32 | n | number of columns of op(B[i]) and C[i]. |
| System.Int32 | k | number of columns of op(A[i]) and rows of op(B[i]). |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of device pointers, with each array/device pointer of dim. lda x k with lda>=max(1,m) if transa==CUBLAS_OP_N and lda x m with lda>=max(1,k) otherwise. |
| cudaDataType | Atype | |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | Barray | array of device pointers, with each array of dim. ldb x n with ldb>=max(1,k) if transb==CUBLAS_OP_N and ldb x k with ldb>=max(1,n) otherwise. |
| cudaDataType | Btype | |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store each matrix B[i]. |
| CudaDeviceVariable<System.Single> | beta | scalar used for multiplication. If beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of device pointers. It has dimensions ldc x n with ldc>=max(1,m). |
| cudaDataType | Ctype | |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix C[i]. |
| System.Int32 | batchCount | number of pointers contained in A, B and C. |
| cudaDataType | computeType | |
| GemmAlgo | algo | |
GemmBatched(Operation, Operation, Int32, Int32, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function performs the matrix-matrix multiplications of an array of matrices: C[i] = alpha * op(A[i]) * op(B[i]) + beta * C[i] for i = 0, ..., batchCount - 1, where alpha and beta are scalars, and A, B and C are arrays of pointers to matrices stored in column-major format with dimensions op(A[i]) m x k, op(B[i]) k x n and C[i] m x n, respectively.
This function is intended for small matrices, where launch overhead is a significant factor. For small sizes, typically smaller than 100x100, it significantly improves performance compared to making separate calls to the corresponding cublas<type>gemm routine. However, on GPU architectures that support concurrent kernels, it might be advantageous to make multiple calls to cublas<type>gemm in different streams as the matrix sizes increase.
Declaration
public void GemmBatched(Operation transa, Operation transb, int m, int n, int k, CudaDeviceVariable<float> alpha, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<CUdeviceptr> Barray, int ldb, CudaDeviceVariable<float> beta, CudaDeviceVariable<CUdeviceptr> Carray, int ldc, int batchCount)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A[i]) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B[i]) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A[i]) and C[i]. |
| System.Int32 | n | number of columns of op(B[i]) and C[i]. |
| System.Int32 | k | number of columns of op(A[i]) and rows of op(B[i]). |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of device pointers, with each array/device pointer of dim. lda x k with lda>=max(1,m) if transa==CUBLAS_OP_N and lda x m with lda>=max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | Barray | array of device pointers, with each array of dim. ldb x n with ldb>=max(1,k) if transb==CUBLAS_OP_N and ldb x k with ldb>=max(1,n) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store each matrix B[i]. |
| CudaDeviceVariable<System.Single> | beta | scalar used for multiplication. If beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of device pointers. It has dimensions ldc x n with ldc>=max(1,m). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix C[i]. |
| System.Int32 | batchCount | number of pointers contained in A, B and C. |
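To make the batched semantics concrete, here is a CPU-only reference sketch of what each batch entry computes. It is an illustration under stated assumptions, not the library's implementation: `BatchedGemmReference` is a hypothetical name, jagged host arrays stand in for the arrays of device pointers, and only the non-transposed case is covered.

```csharp
using System;

// Hypothetical CPU reference (not part of ManagedCuda). Computes
// C[i] = alpha * A[i] * B[i] + beta * C[i] for every batch entry,
// with all matrices in column-major layout and no transposition.
public static class BatchedGemmReference
{
    public static void GemmBatched(int m, int n, int k, float alpha,
                                   float[][] a, int lda,
                                   float[][] b, int ldb,
                                   float beta,
                                   float[][] c, int ldc,
                                   int batchCount)
    {
        for (int i = 0; i < batchCount; i++)
        {
            for (int col = 0; col < n; col++)
            {
                for (int row = 0; row < m; row++)
                {
                    float sum = 0f;
                    for (int p = 0; p < k; p++)
                    {
                        // A[i](row, p) * B[i](p, col) in column-major storage.
                        sum += a[i][row + p * lda] * b[i][p + col * ldb];
                    }

                    int idx = row + col * ldc;
                    // When beta == 0 the old C value is never read, so C may
                    // hold arbitrary data (even NaN) on input.
                    c[i][idx] = beta == 0f ? alpha * sum
                                           : alpha * sum + beta * c[i][idx];
                }
            }
        }
    }
}
```

A reference like this can be useful for validating GPU results on small inputs, within floating-point tolerance.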
GemmBatched(Operation, Operation, Int32, Int32, Int32, cuDoubleComplex, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, cuDoubleComplex, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function performs the matrix-matrix multiplications of an array of matrices: C[i] = alpha * op(A[i]) * op(B[i]) + beta * C[i] for i = 0, ..., batchCount - 1, where alpha and beta are scalars, and A, B and C are arrays of pointers to matrices stored in column-major format with dimensions op(A[i]) m x k, op(B[i]) k x n and C[i] m x n, respectively.
This function is intended for small matrices, where launch overhead is a significant factor. For small sizes, typically smaller than 100x100, it significantly improves performance compared to making separate calls to the corresponding cublas<type>gemm routine. However, on GPU architectures that support concurrent kernels, it might be advantageous to make multiple calls to cublas<type>gemm in different streams as the matrix sizes increase.
Declaration
public void GemmBatched(Operation transa, Operation transb, int m, int n, int k, cuDoubleComplex alpha, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<CUdeviceptr> Barray, int ldb, cuDoubleComplex beta, CudaDeviceVariable<CUdeviceptr> Carray, int ldc, int batchCount)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A[i]) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B[i]) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A[i]) and C[i]. |
| System.Int32 | n | number of columns of op(B[i]) and C[i]. |
| System.Int32 | k | number of columns of op(A[i]) and rows of op(B[i]). |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of device pointers, with each array/device pointer of dim. lda x k with lda>=max(1,m) if transa==CUBLAS_OP_N and lda x m with lda>=max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | Barray | array of device pointers, with each array of dim. ldb x n with ldb>=max(1,k) if transb==CUBLAS_OP_N and ldb x k with ldb>=max(1,n) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store each matrix B[i]. |
| cuDoubleComplex | beta | scalar used for multiplication. If beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of device pointers. It has dimensions ldc x n with ldc>=max(1,m). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix C[i]. |
| System.Int32 | batchCount | number of pointers contained in A, B and C. |
GemmBatched(Operation, Operation, Int32, Int32, Int32, cuFloatComplex, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, cuFloatComplex, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function performs the matrix-matrix multiplications of an array of matrices: C[i] = alpha * op(A[i]) * op(B[i]) + beta * C[i] for i = 0, ..., batchCount - 1, where alpha and beta are scalars, and A, B and C are arrays of pointers to matrices stored in column-major format with dimensions op(A[i]) m x k, op(B[i]) k x n and C[i] m x n, respectively.
This function is intended for small matrices, where launch overhead is a significant factor. For small sizes, typically smaller than 100x100, it significantly improves performance compared to making separate calls to the corresponding cublas<type>gemm routine. However, on GPU architectures that support concurrent kernels, it might be advantageous to make multiple calls to cublas<type>gemm in different streams as the matrix sizes increase.
Declaration
public void GemmBatched(Operation transa, Operation transb, int m, int n, int k, cuFloatComplex alpha, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<CUdeviceptr> Barray, int ldb, cuFloatComplex beta, CudaDeviceVariable<CUdeviceptr> Carray, int ldc, int batchCount)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A[i]) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B[i]) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A[i]) and C[i]. |
| System.Int32 | n | number of columns of op(B[i]) and C[i]. |
| System.Int32 | k | number of columns of op(A[i]) and rows of op(B[i]). |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of device pointers, with each array/device pointer of dim. lda x k with lda>=max(1,m) if transa==CUBLAS_OP_N and lda x m with lda>=max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | Barray | array of device pointers, with each array of dim. ldb x n with ldb>=max(1,k) if transb==CUBLAS_OP_N and ldb x k with ldb>=max(1,n) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store each matrix B[i]. |
| cuFloatComplex | beta | scalar used for multiplication. If beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of device pointers. It has dimensions ldc x n with ldc>=max(1,m). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix C[i]. |
| System.Int32 | batchCount | number of pointers contained in A, B and C. |
GemmBatched(Operation, Operation, Int32, Int32, Int32, Double, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, Double, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function performs the matrix-matrix multiplications of an array of matrices: C[i] = alpha * op(A[i]) * op(B[i]) + beta * C[i] for i = 0, ..., batchCount - 1, where alpha and beta are scalars, and A, B and C are arrays of pointers to matrices stored in column-major format with dimensions op(A[i]) m x k, op(B[i]) k x n and C[i] m x n, respectively.
This function is intended for small matrices, where launch overhead is a significant factor. For small sizes, typically smaller than 100x100, it significantly improves performance compared to making separate calls to the corresponding cublas<type>gemm routine. However, on GPU architectures that support concurrent kernels, it might be advantageous to make multiple calls to cublas<type>gemm in different streams as the matrix sizes increase.
Declaration
public void GemmBatched(Operation transa, Operation transb, int m, int n, int k, double alpha, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<CUdeviceptr> Barray, int ldb, double beta, CudaDeviceVariable<CUdeviceptr> Carray, int ldc, int batchCount)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A[i]) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B[i]) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A[i]) and C[i]. |
| System.Int32 | n | number of columns of op(B[i]) and C[i]. |
| System.Int32 | k | number of columns of op(A[i]) and rows of op(B[i]). |
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of device pointers, with each array/device pointer of dim. lda x k with lda>=max(1,m) if transa==CUBLAS_OP_N and lda x m with lda>=max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | Barray | array of device pointers, with each array of dim. ldb x n with ldb>=max(1,k) if transb==CUBLAS_OP_N and ldb x k with ldb>=max(1,n) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store each matrix B[i]. |
| System.Double | beta | scalar used for multiplication. If beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of device pointers. It has dimensions ldc x n with ldc>=max(1,m). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix C[i]. |
| System.Int32 | batchCount | number of pointers contained in A, B and C. |
GemmBatched(Operation, Operation, Int32, Int32, Int32, IntPtr, CudaDeviceVariable<CUdeviceptr>, cudaDataType, Int32, CudaDeviceVariable<CUdeviceptr>, cudaDataType, Int32, IntPtr, CudaDeviceVariable<CUdeviceptr>, cudaDataType, Int32, Int32, cudaDataType, GemmAlgo)
This function performs the matrix-matrix multiplications of an array of matrices: C[i] = alpha * op(A[i]) * op(B[i]) + beta * C[i] for i = 0, ..., batchCount - 1, where alpha and beta are scalars, and A, B and C are arrays of pointers to matrices stored in column-major format with dimensions op(A[i]) m x k, op(B[i]) k x n and C[i] m x n, respectively.
This function is intended for small matrices, where launch overhead is a significant factor. For small sizes, typically smaller than 100x100, it significantly improves performance compared to making separate calls to the corresponding cublas<type>gemm routine. However, on GPU architectures that support concurrent kernels, it might be advantageous to make multiple calls to cublas<type>gemm in different streams as the matrix sizes increase.
Declaration
public void GemmBatched(Operation transa, Operation transb, int m, int n, int k, IntPtr alpha, CudaDeviceVariable<CUdeviceptr> Aarray, cudaDataType Atype, int lda, CudaDeviceVariable<CUdeviceptr> Barray, cudaDataType Btype, int ldb, IntPtr beta, CudaDeviceVariable<CUdeviceptr> Carray, cudaDataType Ctype, int ldc, int batchCount, cudaDataType computeType, GemmAlgo algo)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A[i]) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B[i]) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A[i]) and C[i]. |
| System.Int32 | n | number of columns of op(B[i]) and C[i]. |
| System.Int32 | k | number of columns of op(A[i]) and rows of op(B[i]). |
| System.IntPtr | alpha | scalar used for multiplication. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of device pointers, with each array/device pointer of dim. lda x k with lda>=max(1,m) if transa==CUBLAS_OP_N and lda x m with lda>=max(1,k) otherwise. |
| cudaDataType | Atype | |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | Barray | array of device pointers, with each array of dim. ldb x n with ldb>=max(1,k) if transb==CUBLAS_OP_N and ldb x k with ldb>=max(1,n) otherwise. |
| cudaDataType | Btype | |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store each matrix B[i]. |
| System.IntPtr | beta | scalar used for multiplication. If beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of device pointers. It has dimensions ldc x n with ldc>=max(1,m). |
| cudaDataType | Ctype | |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix C[i]. |
| System.Int32 | batchCount | number of pointers contained in A, B and C. |
| cudaDataType | computeType | |
| GemmAlgo | algo | |
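This overload takes `alpha` and `beta` as `System.IntPtr`. One way to obtain a stable pointer to a managed scalar is to pin it, sketched below. This assumes the handle is in host pointer mode, so that the pointers refer to host memory, and that the scalar's type matches what `computeType` expects; neither is stated on this page. `PinnedScalar` is a hypothetical helper, not a ManagedCuda type.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper (not part of ManagedCuda): pins a boxed float so that
// its address stays valid until Dispose is called.
public sealed class PinnedScalar : IDisposable
{
    private GCHandle _handle;

    public PinnedScalar(float value)
    {
        // The value is boxed; pinning the box prevents the GC from moving it.
        _handle = GCHandle.Alloc(value, GCHandleType.Pinned);
    }

    // Address of the pinned float, usable wherever an IntPtr to a host scalar is needed.
    public IntPtr Pointer => _handle.AddrOfPinnedObject();

    public void Dispose()
    {
        if (_handle.IsAllocated)
        {
            _handle.Free();
        }
    }
}
```

Wrapping the helper in a `using` statement keeps the scalar pinned only for the duration of the call.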
GemmBatched(Operation, Operation, Int32, Int32, Int32, Single, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, Single, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function performs the matrix-matrix multiplications of an array of matrices: C[i] = alpha * op(A[i]) * op(B[i]) + beta * C[i] for i = 0, ..., batchCount - 1, where alpha and beta are scalars, and A, B and C are arrays of pointers to matrices stored in column-major format with dimensions op(A[i]) m x k, op(B[i]) k x n and C[i] m x n, respectively.
This function is intended for small matrices, where launch overhead is a significant factor. For small sizes, typically smaller than 100x100, it significantly improves performance compared to making separate calls to the corresponding cublas<type>gemm routine. However, on GPU architectures that support concurrent kernels, it might be advantageous to make multiple calls to cublas<type>gemm in different streams as the matrix sizes increase.
Declaration
public void GemmBatched(Operation transa, Operation transb, int m, int n, int k, float alpha, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<CUdeviceptr> Barray, int ldb, float beta, CudaDeviceVariable<CUdeviceptr> Carray, int ldc, int batchCount)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A[i]) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B[i]) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A[i]) and C[i]. |
| System.Int32 | n | number of columns of op(B[i]) and C[i]. |
| System.Int32 | k | number of columns of op(A[i]) and rows of op(B[i]). |
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of device pointers, with each array/device pointer of dim. lda x k with lda>=max(1,m) if transa==CUBLAS_OP_N and lda x m with lda>=max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | Barray | array of device pointers, with each array of dim. ldb x n with ldb>=max(1,k) if transb==CUBLAS_OP_N and ldb x k with ldb>=max(1,n) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store each matrix B[i]. |
| System.Single | beta | scalar used for multiplication. If beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of device pointers. It has dimensions ldc x n with ldc>=max(1,m). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix C[i]. |
| System.Int32 | batchCount | number of pointers contained in A, B and C. |
GemmEx(Operation, Operation, Int32, Int32, Int32, CudaDeviceVariable<Single>, CUdeviceptr, DataType, Int32, CUdeviceptr, DataType, Int32, CudaDeviceVariable<Single>, CUdeviceptr, DataType, Int32)
This function performs the matrix-matrix multiplication C = alpha * op(A) * op(B) + beta * C, where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m*k, op(B) k*n and C m*n, respectively.
Declaration
public void GemmEx(Operation transa, Operation transb, int m, int n, int k, CudaDeviceVariable<float> alpha, CUdeviceptr A, DataType Atype, int lda, CUdeviceptr B, DataType Btype, int ldb, CudaDeviceVariable<float> beta, CUdeviceptr C, DataType Ctype, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| System.Int32 | k | number of columns of op(A) and rows of op(B). |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CUdeviceptr | A | array of dimensions lda * k. |
| DataType | Atype | enumerant specifying the datatype of matrix A. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CUdeviceptr | B | array of dimensions ldb * n. |
| DataType | Btype | enumerant specifying the datatype of matrix B. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<System.Single> | beta | scalar used for multiplication. |
| CUdeviceptr | C | array of dimensions ldc * n. |
| DataType | Ctype | enumerant specifying the datatype of matrix C. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
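The role of `transa` and `transb` is easiest to see in how elements of op(A) and op(B) are read from column-major storage. The following CPU sketch illustrates this for single precision only; it ignores the mixed data types that `Atype`, `Btype` and `Ctype` allow, and `GemmReference` with its `bool` transpose flags is a hypothetical stand-in rather than ManagedCuda API.

```csharp
using System;

// Hypothetical CPU reference (not part of ManagedCuda) for
// C = alpha * op(A) * op(B) + beta * C in column-major layout.
public static class GemmReference
{
    // Reads element (row, col) of op(X). For a transposed operand the
    // stored matrix is indexed with row and col swapped.
    private static float OpAt(float[] x, int ld, bool transpose, int row, int col)
        => transpose ? x[col + row * ld] : x[row + col * ld];

    public static void Gemm(bool transA, bool transB, int m, int n, int k,
                            float alpha, float[] a, int lda,
                            float[] b, int ldb,
                            float beta, float[] c, int ldc)
    {
        for (int col = 0; col < n; col++)
        {
            for (int row = 0; row < m; row++)
            {
                float sum = 0f;
                for (int p = 0; p < k; p++)
                {
                    sum += OpAt(a, lda, transA, row, p) * OpAt(b, ldb, transB, p, col);
                }

                int idx = row + col * ldc; // C is never transposed
                c[idx] = beta == 0f ? alpha * sum : alpha * sum + beta * c[idx];
            }
        }
    }
}
```

Note that with `transA` set, A is stored as a k x m matrix, which is why the minimum `lda` changes from m to k.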
GemmEx(Operation, Operation, Int32, Int32, Int32, Single, CUdeviceptr, DataType, Int32, CUdeviceptr, DataType, Int32, Single, CUdeviceptr, DataType, Int32)
This function performs the matrix-matrix multiplication C = alpha * op(A) * op(B) + beta * C, where alpha and beta are scalars, and A, B and C are matrices stored in column-major format with dimensions op(A) m*k, op(B) k*n and C m*n, respectively.
Declaration
public void GemmEx(Operation transa, Operation transb, int m, int n, int k, float alpha, CUdeviceptr A, DataType Atype, int lda, CUdeviceptr B, DataType Btype, int ldb, float beta, CUdeviceptr C, DataType Ctype, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | transa | operation op(A) that is non- or (conj.) transpose. |
| Operation | transb | operation op(B) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix op(A) and C. |
| System.Int32 | n | number of columns of matrix op(B) and C. |
| System.Int32 | k | number of columns of op(A) and rows of op(B). |
| System.Single | alpha | scalar used for multiplication. |
| CUdeviceptr | A | array of dimensions lda * k. |
| DataType | Atype | enumerant specifying the datatype of matrix A. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CUdeviceptr | B | array of dimensions ldb * n. |
| DataType | Btype | enumerant specifying the datatype of matrix B. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| System.Single | beta | scalar used for multiplication. |
| CUdeviceptr | C | array of dimensions ldc * n. |
| DataType | Ctype | enumerant specifying the datatype of matrix C. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Gemv(Operation, Int32, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the matrix-vector multiplication y = alpha * Op(A) * x + beta * y where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha and beta are scalars.
Declaration
public void Gemv(Operation trans, int m, int n, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> beta, CudaDeviceVariable<cuDoubleComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix A. |
| System.Int32 | n | number of columns of matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements if op(A) is non-transpose, and m elements otherwise. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with m elements if op(A) is non-transpose, and n elements otherwise. |
| System.Int32 | incy | stride between consecutive elements of y. |
Gemv(Operation, Int32, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the matrix-vector multiplication y = alpha * Op(A) * x + beta * y where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha and beta are scalars.
Declaration
public void Gemv(Operation trans, int m, int n, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> beta, CudaDeviceVariable<cuFloatComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix A. |
| System.Int32 | n | number of columns of matrix A. |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements if op(A) is non-transpose, and m elements otherwise. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with m elements if op(A) is non-transpose, and n elements otherwise. |
| System.Int32 | incy | stride between consecutive elements of y. |
Gemv(Operation, Int32, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32)
This function performs the matrix-vector multiplication y = alpha * Op(A) * x + beta * y where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha and beta are scalars.
Declaration
public void Gemv(Operation trans, int m, int n, CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> beta, CudaDeviceVariable<double> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix A. |
| System.Int32 | n | number of columns of matrix A. |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | x | vector with n elements if op(A) is non-transpose, and m elements otherwise. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Double> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Double> | y | vector with m elements if op(A) is non-transpose, and n elements otherwise. |
| System.Int32 | incy | stride between consecutive elements of y. |
Gemv(Operation, Int32, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32)
This function performs the matrix-vector multiplication y = alpha * Op(A) * x + beta * y where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha and beta are scalars.
Declaration
public void Gemv(Operation trans, int m, int n, CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> beta, CudaDeviceVariable<float> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix A. |
| System.Int32 | n | number of columns of matrix A. |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | x | vector with n elements if op(A) is non-transpose, and m elements otherwise. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Single> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Single> | y | vector with m elements if op(A) is non-transpose, and n elements otherwise. |
| System.Int32 | incy | stride between consecutive elements of y. |
Gemv(Operation, Int32, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the matrix-vector multiplication y = alpha * Op(A) * x + beta * y where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha and beta are scalars.
Declaration
public void Gemv(Operation trans, int m, int n, cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> x, int incx, cuDoubleComplex beta, CudaDeviceVariable<cuDoubleComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix A. |
| System.Int32 | n | number of columns of matrix A. |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements if op(A) is non-transpose, and m elements otherwise. |
| System.Int32 | incx | stride between consecutive elements of x. |
| cuDoubleComplex | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with m elements if op(A) is non-transpose, and n elements otherwise. |
| System.Int32 | incy | stride between consecutive elements of y. |
Gemv(Operation, Int32, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the matrix-vector multiplication y = alpha * Op(A) * x + beta * y where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha and beta are scalars.
Declaration
public void Gemv(Operation trans, int m, int n, cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> x, int incx, cuFloatComplex beta, CudaDeviceVariable<cuFloatComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix A. |
| System.Int32 | n | number of columns of matrix A. |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements if op(A) is non-transpose, and m elements otherwise. |
| System.Int32 | incx | stride between consecutive elements of x. |
| cuFloatComplex | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with m elements if op(A) is non-transpose, and n elements otherwise. |
| System.Int32 | incy | stride between consecutive elements of y. |
Gemv(Operation, Int32, Int32, Double, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, Double, CudaDeviceVariable<Double>, Int32)
This function performs the matrix-vector multiplication y = alpha * Op(A) * x + beta * y where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha and beta are scalars.
Declaration
public void Gemv(Operation trans, int m, int n, double alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> x, int incx, double beta, CudaDeviceVariable<double> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix A. |
| System.Int32 | n | number of columns of matrix A. |
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | x | vector with n elements if op(A) is non-transpose, and m elements otherwise. |
| System.Int32 | incx | stride between consecutive elements of x. |
| System.Double | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Double> | y | vector with m elements if op(A) is non-transpose, and n elements otherwise. |
| System.Int32 | incy | stride between consecutive elements of y. |
Gemv(Operation, Int32, Int32, Single, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, Single, CudaDeviceVariable<Single>, Int32)
This function performs the matrix-vector multiplication y = alpha * Op(A) * x + beta * y where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha and beta are scalars.
Declaration
public void Gemv(Operation trans, int m, int n, float alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> x, int incx, float beta, CudaDeviceVariable<float> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | m | number of rows of matrix A. |
| System.Int32 | n | number of columns of matrix A. |
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | x | vector with n elements if op(A) is non-transpose, and m elements otherwise. |
| System.Int32 | incx | stride between consecutive elements of x. |
| System.Single | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Single> | y | vector with m elements if op(A) is non-transpose, and n elements otherwise. |
| System.Int32 | incy | stride between consecutive elements of y. |
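The interplay of the transpose flag, the vector lengths and the strides in the Gemv overloads can be sketched on the CPU. The following is an illustrative reference only, not the ManagedCuda or cuBLAS implementation: `GemvReference` is a hypothetical name, a `bool` stands in for the `Operation` argument, and only positive strides are assumed.

```csharp
using System;

public static class GemvReference
{
    // Computes y = alpha * op(A) * x + beta * y for a column-major m*n matrix A.
    // transpose == false: x has n elements, y has m elements.
    // transpose == true:  x has m elements, y has n elements.
    public static void Gemv(bool transpose, int m, int n, float alpha,
                            float[] A, int lda, float[] x, int incx,
                            float beta, float[] y, int incy)
    {
        int lenX = transpose ? m : n;
        int lenY = transpose ? n : m;
        for (int i = 0; i < lenY; i++)
        {
            float sum = 0f;
            for (int j = 0; j < lenX; j++)
            {
                // op(A)(i, j) is A(j, i) when transposed, A(i, j) otherwise.
                float a = transpose ? A[j + i * lda] : A[i + j * lda];
                sum += a * x[j * incx];
            }
            int yi = i * incy;
            // When beta == 0 the previous content of y is not read at all,
            // so y may hold arbitrary (even NaN) values on entry.
            y[yi] = beta == 0f ? alpha * sum : alpha * sum + beta * y[yi];
        }
    }
}
```

Note how the lengths of `x` and `y` swap when the matrix is transposed, while `lda` still refers to the stored (untransposed) matrix.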
GeqrfBatchedC(Int32, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>)
This function performs the QR factorization of each Aarray[i] for i = 0, ...,batchSize-1 using Householder reflections. Each matrix Q[i] is represented as a product of elementary reflectors and is stored in the lower part of each Aarray[i]. This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor.
cublas<t>geqrfBatched supports arbitrary dimension.
cublas<t>geqrfBatched only supports compute capability 2.0 or above.
Declaration
public int GeqrfBatchedC(int m, int n, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<CUdeviceptr> TauArray)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | m | number of rows of Aarray[i]. |
| System.Int32 | n | number of columns of Aarray[i]. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of pointers to device array, with each array of dim. m x n with lda>=max(1,m). The array size determines the number of batches. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix Aarray[i]. |
| CudaDeviceVariable<CUdeviceptr> | TauArray | array of pointers to device vector, with each vector of dim. max(1,min(m,n)). |
Returns
| Type | Description |
|---|---|
| System.Int32 | 0, if the parameters passed to the function are valid; a negative value -i, if the i-th parameter is invalid. |
GeqrfBatchedD(Int32, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>)
This function performs the QR factorization of each Aarray[i] for i = 0, ...,batchSize-1 using Householder reflections. Each matrix Q[i] is represented as a product of elementary reflectors and is stored in the lower part of each Aarray[i]. This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor.
cublas<t>geqrfBatched supports arbitrary dimension.
cublas<t>geqrfBatched only supports compute capability 2.0 or above.
Declaration
public int GeqrfBatchedD(int m, int n, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<CUdeviceptr> TauArray)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | m | number of rows of Aarray[i]. |
| System.Int32 | n | number of columns of Aarray[i]. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of pointers to device array, with each array of dim. m x n with lda>=max(1,m). The array size determines the number of batches. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix Aarray[i]. |
| CudaDeviceVariable<CUdeviceptr> | TauArray | array of pointers to device vector, with each vector of dim. max(1,min(m,n)). |
Returns
| Type | Description |
|---|---|
| System.Int32 | 0, if the parameters passed to the function are valid; a negative value -i, if the i-th parameter is invalid. |
GeqrfBatchedS(Int32, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>)
This function performs the QR factorization of each Aarray[i] for i = 0, ...,batchSize-1 using Householder reflections. Each matrix Q[i] is represented as a product of elementary reflectors and is stored in the lower part of each Aarray[i]. This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor.
cublas<t>geqrfBatched supports arbitrary dimension.
cublas<t>geqrfBatched only supports compute capability 2.0 or above.
Declaration
public int GeqrfBatchedS(int m, int n, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<CUdeviceptr> TauArray)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | m | number of rows of Aarray[i]. |
| System.Int32 | n | number of columns of Aarray[i]. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of pointers to device array, with each array of dim. m x n with lda>=max(1,m). The array size determines the number of batches. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix Aarray[i]. |
| CudaDeviceVariable<CUdeviceptr> | TauArray | array of pointers to device vector, with each vector of dim. max(1,min(m,n)). |
Returns
| Type | Description |
|---|---|
| System.Int32 | 0, if the parameters passed to the function are valid; a negative value -i, if the i-th parameter is invalid. |
GeqrfBatchedZ(Int32, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>)
This function performs the QR factorization of each Aarray[i] for i = 0, ...,batchSize-1 using Householder reflections. Each matrix Q[i] is represented as a product of elementary reflectors and is stored in the lower part of each Aarray[i]. This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor.
cublas<t>geqrfBatched supports arbitrary dimension.
cublas<t>geqrfBatched only supports compute capability 2.0 or above.
Declaration
public int GeqrfBatchedZ(int m, int n, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<CUdeviceptr> TauArray)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | m | number of rows of Aarray[i]. |
| System.Int32 | n | number of columns of Aarray[i]. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of pointers to device array, with each array of dim. m x n with lda>=max(1,m). The array size determines the number of batches. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix Aarray[i]. |
| CudaDeviceVariable<CUdeviceptr> | TauArray | array of pointers to device vector, with each vector of dim. max(1,min(m,n)). |
Returns
| Type | Description |
|---|---|
| System.Int32 | 0, if the parameters passed to the function are valid; a negative value -i, if the i-th parameter is invalid. |
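To make the storage scheme of the GeqrfBatched functions more concrete (reflectors in the lower part, one scalar per reflector in the tau vector), here is a CPU sketch of an unblocked Householder QR for a single real matrix. It follows the common LAPACK-style convention H_j = I - tau[j] * v_j * v_j^T with an implicit leading 1 in v_j; whether cuBLAS uses exactly the same sign choices is an assumption, and `HouseholderQrReference` is a hypothetical name.

```csharp
using System;

public static class HouseholderQrReference
{
    // In-place QR of a column-major m*n matrix A with leading dimension lda.
    // On exit, R is in the upper triangle, the reflector vectors v_j are stored
    // below the diagonal (v_j[j] = 1 is implicit), and tau[j] is the scalar of
    // H_j = I - tau[j] * v_j * v_j^T, so that Q = H_0 * H_1 * ...
    // tau must have at least min(m, n) elements.
    public static void Geqrf(int m, int n, double[] A, int lda, double[] tau)
    {
        int steps = Math.Min(m, n);
        for (int j = 0; j < steps; j++)
        {
            double alpha = A[j + j * lda];
            double normSq = 0.0;
            for (int i = j + 1; i < m; i++)
                normSq += A[i + j * lda] * A[i + j * lda];

            if (normSq == 0.0)
            {
                tau[j] = 0.0; // nothing to eliminate below the diagonal: H_j = I
            }
            else
            {
                double sign = alpha >= 0.0 ? 1.0 : -1.0;
                double beta = -sign * Math.Sqrt(alpha * alpha + normSq);
                tau[j] = (beta - alpha) / beta;
                double scale = 1.0 / (alpha - beta);
                for (int i = j + 1; i < m; i++)
                    A[i + j * lda] *= scale;   // store v_j below the diagonal
                A[j + j * lda] = beta;          // diagonal entry of R
            }

            // Apply H_j to the remaining columns j+1 .. n-1.
            for (int c = j + 1; c < n; c++)
            {
                double w = A[j + c * lda];
                for (int i = j + 1; i < m; i++)
                    w += A[i + j * lda] * A[i + c * lda];
                w *= tau[j];
                A[j + c * lda] -= w;
                for (int i = j + 1; i < m; i++)
                    A[i + c * lda] -= w * A[i + j * lda];
            }
        }
    }
}
```

The batched GPU functions apply this kind of factorization to every matrix referenced by Aarray; the sketch handles one matrix at a time and is meant only to show what ends up where in memory.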
Ger(CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
This function performs the rank-1 update A = alpha * x * y^T + A where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha is a scalar. m = x.Size, n = y.Size.
Declaration
public void Ger(CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> y, int incy, CudaDeviceVariable<double> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | x | vector with m elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Double> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Ger(CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
This function performs the rank-1 update A = alpha * x * y^T + A where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha is a scalar. m = x.Size, n = y.Size.
Declaration
public void Ger(CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> y, int incy, CudaDeviceVariable<float> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | x | vector with m elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Single> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Ger(Double, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
This function performs the rank-1 update A = alpha * x * y^T + A where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha is a scalar. m = x.Size, n = y.Size.
Declaration
public void Ger(double alpha, CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> y, int incy, CudaDeviceVariable<double> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | x | vector with m elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Double> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Ger(Single, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
This function performs the rank-1 update A = alpha * x * y^T + A where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha is a scalar. m = x.Size, n = y.Size.
Declaration
public void Ger(float alpha, CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> y, int incy, CudaDeviceVariable<float> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | x | vector with m elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Single> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
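The rank-1 update performed by the Ger overloads can be written out as a short CPU reference. This is a sketch for illustration only: `GerReference` is a hypothetical name, m and n are passed explicitly here (the wrapper derives them from x.Size and y.Size), and positive strides are assumed.

```csharp
using System;

public static class GerReference
{
    // Computes A = alpha * x * y^T + A for a column-major m*n matrix A.
    // x has m elements (stride incx), y has n elements (stride incy).
    public static void Ger(int m, int n, float alpha,
                           float[] x, int incx, float[] y, int incy,
                           float[] A, int lda)
    {
        for (int j = 0; j < n; j++)
        {
            float scaledY = alpha * y[j * incy];
            for (int i = 0; i < m; i++)
            {
                // Column j of A receives x scaled by alpha * y[j].
                A[i + j * lda] += x[i * incx] * scaledY;
            }
        }
    }
}
```

For example, with x = (1, 2), y = (1, 10, 100), alpha = 1 and a zero 2*3 matrix A, the columns of the result are (1, 2), (10, 20) and (100, 200).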
GerC(CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the rank-1 update A = alpha * x * y^H + A where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha is a scalar. m = x.Size, n = y.Size.
Declaration
public void GerC(CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy, CudaDeviceVariable<cuDoubleComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with m elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
GerC(CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the rank-1 update A = alpha * x * y^H + A where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha is a scalar. m = x.Size, n = y.Size.
Declaration
public void GerC(CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy, CudaDeviceVariable<cuFloatComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with m elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
GerC(cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the rank-1 update A = alpha * x * y^H + A where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha is a scalar. m = x.Size, n = y.Size.
Declaration
public void GerC(cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy, CudaDeviceVariable<cuDoubleComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with m elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
GerC(cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the rank-1 update A = alpha * x * y^H + A where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha is a scalar. m = x.Size, n = y.Size.
Declaration
public void GerC(cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy, CudaDeviceVariable<cuFloatComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with m elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
GerU(CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the rank-1 update A = alpha * x * y^T + A where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha is a scalar. m = x.Size, n = y.Size.
Declaration
public void GerU(CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy, CudaDeviceVariable<cuDoubleComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with m elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
GerU(CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the rank-1 update A = alpha * x * y^T + A where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha is a scalar. m = x.Size, n = y.Size.
Declaration
public void GerU(CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy, CudaDeviceVariable<cuFloatComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with m elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
GerU(cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the rank-1 update A = alpha * x * y^T + A where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha is a scalar. m = x.Size, n = y.Size.
Declaration
public void GerU(cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy, CudaDeviceVariable<cuDoubleComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with m elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
GerU(cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the rank-1 update A = alpha * x * y^T + A where A is a m*n matrix stored in column-major format, x and y are vectors, and alpha is a scalar. m = x.Size, n = y.Size.
Declaration
public void GerU(cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy, CudaDeviceVariable<cuFloatComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with m elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,m). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
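The only difference between GerC and GerU is whether y is conjugated (y^H versus y^T). A small CPU sketch makes this visible; it uses `System.Numerics.Complex` as a stand-in for the library's `cuDoubleComplex` type, and `ComplexGerReference` with its `conjugate` flag is a hypothetical helper, not part of ManagedCuda.

```csharp
using System;
using System.Numerics;

public static class ComplexGerReference
{
    // conjugate == true  : A = alpha * x * y^H + A  (as described for GerC)
    // conjugate == false : A = alpha * x * y^T + A  (as described for GerU)
    // A is column-major m*n; x has m elements, y has n elements.
    public static void Ger(bool conjugate, int m, int n, Complex alpha,
                           Complex[] x, int incx, Complex[] y, int incy,
                           Complex[] A, int lda)
    {
        for (int j = 0; j < n; j++)
        {
            Complex yj = y[j * incy];
            if (conjugate)
                yj = Complex.Conjugate(yj);
            for (int i = 0; i < m; i++)
            {
                A[i + j * lda] += alpha * x[i * incx] * yj;
            }
        }
    }
}
```

With x = (i) and y = (i), the unconjugated update adds i * i = -1, while the conjugated update adds i * conj(i) = 1, which is why the two variants must not be confused for complex data.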
GetMatrix<T>(Int32, Int32, CudaDeviceVariable<T>, Int32, T[], Int32)
Copies a tile of rows x cols elements from a matrix devSource in GPU memory space to a matrix hostDest in CPU memory space. Both matrices are assumed to be stored in column-major format, with the leading dimension (i.e. number of rows) of the source matrix devSource provided in ldDevSource, and the leading dimension of matrix hostDest provided in ldHostDest.
Declaration
public static void GetMatrix<T>(int rows, int cols, CudaDeviceVariable<T> devSource, int ldDevSource, T[] hostDest, int ldHostDest)
where T : struct
Parameters
| Type | Name | Description |
|---|---|---|
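| System.Int32 | rows | number of rows of the tile to copy. |
| System.Int32 | cols | number of columns of the tile to copy. |
| CudaDeviceVariable<T> | devSource | source matrix in device memory. |
| System.Int32 | ldDevSource | leading dimension of devSource. |
| T[] | hostDest | destination matrix in host memory. |
| System.Int32 | ldHostDest | leading dimension of hostDest. |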
Type Parameters
| Name | Description |
|---|---|
| T |
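The role of the two leading dimensions can be sketched on the CPU as follows. This is only an illustration of the column-major indexing described above, with a hypothetical `TileCopyReference` helper operating on two host arrays; it does not call ManagedCuda or touch GPU memory.

```csharp
// Hypothetical CPU reference, for illustration only (not part of ManagedCuda).
public static class TileCopyReference
{
    // Copies a rows x cols tile between two column-major buffers.
    // Element (i, j) sits at index i + j * ld, so each buffer may have its own
    // leading dimension as long as ld >= rows.
    public static void CopyTile<T>(int rows, int cols,
                                   T[] source, int ldSource,
                                   T[] dest, int ldDest)
    {
        for (int j = 0; j < cols; j++)
        {
            for (int i = 0; i < rows; i++)
            {
                dest[i + j * ldDest] = source[i + j * ldSource];
            }
        }
    }
}
```

For example, copying the top 2 x 2 tile of a 3 x 2 source (leading dimension 3) into a tightly packed destination (leading dimension 2) skips the third row of every source column.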
GetMatrixAsync<T>(Int32, Int32, CudaDeviceVariable<T>, Int32, T[], Int32, CUstream)
Copies a tile of rows x cols elements from a matrix devSource in GPU memory
space to a matrix hostDest in CPU memory space. Both matrices are assumed to be stored in column-major
format, with the leading dimension (i.e. number of rows) of the
source matrix devSource provided in ldDevSource, and the leading dimension of the matrix hostDest
provided in ldHostDest.
Declaration
public static void GetMatrixAsync<T>(int rows, int cols, CudaDeviceVariable<T> devSource, int ldDevSource, T[] hostDest, int ldHostDest, CUstream stream)
where T : struct
Parameters
| Type | Name | Description |
|---|---|---|
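| System.Int32 | rows | number of rows of the tile to copy. |
| System.Int32 | cols | number of columns of the tile to copy. |
| CudaDeviceVariable<T> | devSource | source matrix in device memory. |
| System.Int32 | ldDevSource | leading dimension of devSource. |
| T[] | hostDest | destination matrix in host memory. |
| System.Int32 | ldHostDest | leading dimension of hostDest. |
| CUstream | stream | CUDA stream used for the asynchronous copy. |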
Type Parameters
| Name | Description |
|---|---|
| T |
GetrfBatchedC(Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, CudaDeviceVariable<Int32>, Int32)
This function performs the LU factorization of an array of n x n matrices.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor. The current implementation limits the dimension n to 32.
Declaration
public void GetrfBatchedC(int n, CudaDeviceVariable<CUdeviceptr> A, int lda, CudaDeviceVariable<int> P, CudaDeviceVariable<int> INFO, int batchSize)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | n | number of rows and columns of A[i]. |
| CudaDeviceVariable<CUdeviceptr> | A | array of device pointers with each array/device pointer of dim. n x n with lda>=max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix A[i]. |
| CudaDeviceVariable<System.Int32> | P | array of size n x batchSize that contains the permutation vector of each factorization of A[i] stored in a linear fashion. |
| CudaDeviceVariable<System.Int32> | INFO | array of size batchSize; INFO[i] reports the result of the factorization of A[i]. If info = 0, the execution is successful. If info = -j, the j-th parameter had an illegal value. If info = k, U(k,k) is 0: the factorization has been completed, but U is exactly singular. |
| System.Int32 | batchSize | number of pointers contained in A |
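To make the meaning of the packed LU factors, the pivot vector, and the info value concrete, here is a minimal CPU sketch of LU factorization with partial pivoting for a single column-major matrix. It is an assumption-laden reference, not the batched GPU implementation: `LuReference` is a hypothetical helper, it works on `double`, and it assumes the LAPACK convention of 1-based pivot indices.

```csharp
using System;

// Hypothetical CPU reference, for illustration only (not part of ManagedCuda).
public static class LuReference
{
    // Factorizes the n x n column-major matrix a in place so that P * A = L * U.
    // L (unit diagonal, not stored) ends up below the diagonal, U on and above it.
    // pivots[k] is the 1-based row that was swapped with row k + 1 (assumed convention).
    // Returns 0 on success, or k (1-based) if U(k,k) is exactly zero.
    public static int Getrf(int n, double[] a, int lda, int[] pivots)
    {
        int info = 0;
        for (int k = 0; k < n; k++)
        {
            // Pick the row with the largest magnitude in column k.
            int p = k;
            for (int i = k + 1; i < n; i++)
            {
                if (Math.Abs(a[i + k * lda]) > Math.Abs(a[p + k * lda])) p = i;
            }
            pivots[k] = p + 1;

            if (a[p + k * lda] == 0.0)
            {
                // Singular column: remember the first zero pivot and move on.
                if (info == 0) info = k + 1;
                continue;
            }

            if (p != k)
            {
                // Swap rows k and p across all columns.
                for (int j = 0; j < n; j++)
                {
                    double t = a[k + j * lda];
                    a[k + j * lda] = a[p + j * lda];
                    a[p + j * lda] = t;
                }
            }

            // Store the multipliers and update the trailing submatrix.
            for (int i = k + 1; i < n; i++)
            {
                a[i + k * lda] /= a[k + k * lda];
                for (int j = k + 1; j < n; j++)
                {
                    a[i + j * lda] -= a[i + k * lda] * a[k + j * lda];
                }
            }
        }
        return info;
    }
}
```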
GetrfBatchedD(Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, CudaDeviceVariable<Int32>, Int32)
This function performs the LU factorization of an array of n x n matrices.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor. The current implementation limits the dimension n to 32.
Declaration
public void GetrfBatchedD(int n, CudaDeviceVariable<CUdeviceptr> A, int lda, CudaDeviceVariable<int> P, CudaDeviceVariable<int> INFO, int batchSize)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | n | number of rows and columns of A[i]. |
| CudaDeviceVariable<CUdeviceptr> | A | array of device pointers with each array/device pointer of dim. n x n with lda>=max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix A[i]. |
| CudaDeviceVariable<System.Int32> | P | array of size n x batchSize that contains the permutation vector of each factorization of A[i] stored in a linear fashion. |
| CudaDeviceVariable<System.Int32> | INFO | array of size batchSize; INFO[i] reports the result of the factorization of A[i]. If info = 0, the execution is successful. If info = -j, the j-th parameter had an illegal value. If info = k, U(k,k) is 0: the factorization has been completed, but U is exactly singular. |
| System.Int32 | batchSize | number of pointers contained in A |
GetrfBatchedS(Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, CudaDeviceVariable<Int32>, Int32)
This function performs the LU factorization of an array of n x n matrices.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor. The current implementation limits the dimension n to 32.
Declaration
public void GetrfBatchedS(int n, CudaDeviceVariable<CUdeviceptr> A, int lda, CudaDeviceVariable<int> P, CudaDeviceVariable<int> INFO, int batchSize)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | n | number of rows and columns of A[i]. |
| CudaDeviceVariable<CUdeviceptr> | A | array of device pointers with each array/device pointer of dim. n x n with lda>=max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix A[i]. |
| CudaDeviceVariable<System.Int32> | P | array of size n x batchSize that contains the permutation vector of each factorization of A[i] stored in a linear fashion. |
| CudaDeviceVariable<System.Int32> | INFO | array of size batchSize; INFO[i] reports the result of the factorization of A[i]. If info = 0, the execution is successful. If info = -j, the j-th parameter had an illegal value. If info = k, U(k,k) is 0: the factorization has been completed, but U is exactly singular. |
| System.Int32 | batchSize | number of pointers contained in A |
GetrfBatchedZ(Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, CudaDeviceVariable<Int32>, Int32)
This function performs the LU factorization of an array of n x n matrices.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor. The current implementation limits the dimension n to 32.
Declaration
public void GetrfBatchedZ(int n, CudaDeviceVariable<CUdeviceptr> A, int lda, CudaDeviceVariable<int> P, CudaDeviceVariable<int> INFO, int batchSize)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | n | number of rows and columns of A[i]. |
| CudaDeviceVariable<CUdeviceptr> | A | array of device pointers with each array/device pointer of dim. n x n with lda>=max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix A[i]. |
| CudaDeviceVariable<System.Int32> | P | array of size n x batchSize that contains the permutation vector of each factorization of A[i] stored in a linear fashion. |
| CudaDeviceVariable<System.Int32> | INFO | array of size batchSize; INFO[i] reports the result of the factorization of A[i]. If info = 0, the execution is successful. If info = -j, the j-th parameter had an illegal value. If info = k, U(k,k) is 0: the factorization has been completed, but U is exactly singular. |
| System.Int32 | batchSize | number of pointers contained in A |
GetriBatchedC(Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, Int32)
Aarray and Carray are arrays of pointers to matrices stored in column-major format with dimensions n*n and leading dimension lda and ldc respectively. This function performs the inversion of matrices A[i] for i = 0, ..., batchSize-1.
Prior to calling GetriBatched, the matrix A[i] must be factorized first using the routine GetrfBatched. After the call to GetrfBatched, the matrix pointed to by Aarray[i] contains the LU factors of the matrix A[i], and the corresponding segment of the pivot array P contains the pivoting sequence.
Following the LU factorization, GetriBatched uses forward and backward triangular solvers to complete the inversion of the matrices A[i] for i = 0, ..., batchSize-1. The inversion is out-of-place, so the memory space of Carray[i] must not overlap the memory space of Aarray[i].
Declaration
public void GetriBatchedC(int n, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<int> P, CudaDeviceVariable<CUdeviceptr> Carray, int ldc, CudaDeviceVariable<int> INFO, int batchSize)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | n | number of rows and columns of Aarray[i]. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of pointers to array, with each array of dimension n*n with lda>=max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix Aarray[i]. |
| CudaDeviceVariable<System.Int32> | P | array of size n*batchSize that contains the pivoting sequence of each factorization of Aarray[i] stored in a linear fashion. |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of pointers to array, with each array of dimension n*n with ldc>=max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix Carray[i]. |
| CudaDeviceVariable<System.Int32> | INFO | array of size batchSize that info(=infoArray[i]) contains the information of inversion of A[i]. If info=0, the execution is successful. If info = k, U(k,k) is 0. The U is exactly singular and the inversion failed. |
| System.Int32 | batchSize | number of pointers contained in A |
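Once the pivot and INFO arrays have been copied back to host memory, they can be inspected per matrix. The sketch below shows one way to do that, under the assumption that "stored in a linear fashion" means matrix i owns the contiguous entries i*n to i*n + n - 1 of the pivot array; `BatchedResultHelpers` and its methods are hypothetical names, not ManagedCuda API.

```csharp
using System;

// Hypothetical host-side helpers, for illustration only (not part of ManagedCuda).
public static class BatchedResultHelpers
{
    // Returns the n pivot entries belonging to matrix `index`,
    // assuming each matrix owns a contiguous n-element segment.
    public static int[] PivotSegment(int[] pivots, int n, int index)
    {
        var segment = new int[n];
        Array.Copy(pivots, index * n, segment, 0, n);
        return segment;
    }

    // Returns the index of the first matrix whose info entry is nonzero,
    // or -1 if every entry reports success.
    public static int FirstFailure(int[] info)
    {
        for (int i = 0; i < info.Length; i++)
        {
            if (info[i] != 0) return i;
        }
        return -1;
    }
}
```

Checking every INFO entry matters because a singular matrix in the batch does not stop the remaining inversions.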
GetriBatchedD(Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, Int32)
Aarray and Carray are arrays of pointers to matrices stored in column-major format with dimensions n*n and leading dimension lda and ldc respectively. This function performs the inversion of matrices A[i] for i = 0, ..., batchSize-1.
Prior to calling GetriBatched, the matrix A[i] must be factorized first using the routine GetrfBatched. After the call to GetrfBatched, the matrix pointed to by Aarray[i] contains the LU factors of the matrix A[i], and the corresponding segment of the pivot array P contains the pivoting sequence.
Following the LU factorization, GetriBatched uses forward and backward triangular solvers to complete the inversion of the matrices A[i] for i = 0, ..., batchSize-1. The inversion is out-of-place, so the memory space of Carray[i] must not overlap the memory space of Aarray[i].
Declaration
public void GetriBatchedD(int n, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<int> P, CudaDeviceVariable<CUdeviceptr> Carray, int ldc, CudaDeviceVariable<int> INFO, int batchSize)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | n | number of rows and columns of Aarray[i]. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of pointers to array, with each array of dimension n*n with lda>=max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix Aarray[i]. |
| CudaDeviceVariable<System.Int32> | P | array of size n*batchSize that contains the pivoting sequence of each factorization of Aarray[i] stored in a linear fashion. |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of pointers to array, with each array of dimension n*n with ldc>=max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix Carray[i]. |
| CudaDeviceVariable<System.Int32> | INFO | array of size batchSize that info(=infoArray[i]) contains the information of inversion of A[i]. If info=0, the execution is successful. If info = k, U(k,k) is 0. The U is exactly singular and the inversion failed. |
| System.Int32 | batchSize | number of pointers contained in A |
GetriBatchedS(Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, Int32)
Aarray and Carray are arrays of pointers to matrices stored in column-major format with dimensions n*n and leading dimension lda and ldc respectively. This function performs the inversion of matrices A[i] for i = 0, ..., batchSize-1.
Prior to calling GetriBatched, the matrix A[i] must be factorized first using the routine GetrfBatched. After the call to GetrfBatched, the matrix pointed to by Aarray[i] contains the LU factors of the matrix A[i], and the corresponding segment of the pivot array P contains the pivoting sequence.
Following the LU factorization, GetriBatched uses forward and backward triangular solvers to complete the inversion of the matrices A[i] for i = 0, ..., batchSize-1. The inversion is out-of-place, so the memory space of Carray[i] must not overlap the memory space of Aarray[i].
Declaration
public void GetriBatchedS(int n, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<int> P, CudaDeviceVariable<CUdeviceptr> Carray, int ldc, CudaDeviceVariable<int> INFO, int batchSize)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | n | number of rows and columns of Aarray[i]. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of pointers to array, with each array of dimension n*n with lda>=max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix Aarray[i]. |
| CudaDeviceVariable<System.Int32> | P | array of size n*batchSize that contains the pivoting sequence of each factorization of Aarray[i] stored in a linear fashion. |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of pointers to array, with each array of dimension n*n with ldc>=max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix Carray[i]. |
| CudaDeviceVariable<System.Int32> | INFO | array of size batchSize that info(=infoArray[i]) contains the information of inversion of A[i]. If info=0, the execution is successful. If info = k, U(k,k) is 0. The U is exactly singular and the inversion failed. |
| System.Int32 | batchSize | number of pointers contained in A |
GetriBatchedZ(Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, Int32)
Aarray and Carray are arrays of pointers to matrices stored in column-major format with dimensions n*n and leading dimension lda and ldc respectively. This function performs the inversion of matrices A[i] for i = 0, ..., batchSize-1.
Prior to calling GetriBatched, the matrix A[i] must be factorized first using the routine GetrfBatched. After the call to GetrfBatched, the matrix pointed to by Aarray[i] contains the LU factors of the matrix A[i], and the corresponding segment of the pivot array P contains the pivoting sequence.
Following the LU factorization, GetriBatched uses forward and backward triangular solvers to complete the inversion of the matrices A[i] for i = 0, ..., batchSize-1. The inversion is out-of-place, so the memory space of Carray[i] must not overlap the memory space of Aarray[i].
Declaration
public void GetriBatchedZ(int n, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<int> P, CudaDeviceVariable<CUdeviceptr> Carray, int ldc, CudaDeviceVariable<int> INFO, int batchSize)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | n | number of rows and columns of Aarray[i]. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of pointers to array, with each array of dimension n*n with lda>=max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix Aarray[i]. |
| CudaDeviceVariable<System.Int32> | P | array of size n*batchSize that contains the pivoting sequence of each factorization of Aarray[i] stored in a linear fashion. |
| CudaDeviceVariable<CUdeviceptr> | Carray | array of pointers to array, with each array of dimension n*n with ldc>=max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store each matrix Carray[i]. |
| CudaDeviceVariable<System.Int32> | INFO | array of size batchSize that info(=infoArray[i]) contains the information of inversion of A[i]. If info=0, the execution is successful. If info = k, U(k,k) is 0. The U is exactly singular and the inversion failed. |
| System.Int32 | batchSize | number of pointers contained in A |
GetrsBatchedC(Operation, Int32, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function solves an array of systems of linear equations of the form:
op(A[i]) X[i] = B[i]
where A[i] is a matrix which has been LU factorized with pivoting, X[i] and B[i] are n x nrhs matrices.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor.
Declaration
public int GetrsBatchedC(Operation trans, int n, int nrhs, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<int> devIpiv, CudaDeviceVariable<CUdeviceptr> Barray, int ldb, int batchSize)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows and columns of Aarray[i]. |
| System.Int32 | nrhs | number of columns of Barray[i]. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of pointers to array, with each array of dim. n x n with lda>=max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix Aarray[i]. |
| CudaDeviceVariable<System.Int32> | devIpiv | array of size n x batchSize that contains the pivoting sequence of each factorization of Aarray[i] stored in a linear fashion. If devIpiv is null, pivoting for all Aarray[i] is ignored. |
| CudaDeviceVariable<CUdeviceptr> | Barray | array of pointers to array, with each array of dim. n x nrhs with ldb>=max(1,n). |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store each solution matrix Barray[i]. |
| System.Int32 | batchSize | number of pointers contained in A |
Returns
| Type | Description |
|---|---|
| System.Int32 | If info=0, the execution is successful. If info = -j, the j-th parameter had an illegal value. |
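The solve step for a single system can be sketched on the CPU as below. This is a minimal reference for the non-transposed case only, assuming packed LU factors with a unit-diagonal L and 1-based pivot indices (the LAPACK convention); `LuSolveReference` is a hypothetical helper and not the library's implementation.

```csharp
// Hypothetical CPU reference, for illustration only (not part of ManagedCuda).
public static class LuSolveReference
{
    // Solves A * X = B for X, overwriting b, given the packed LU factors of A
    // (column-major, leading dimension lda) and the 1-based pivot indices.
    public static void Getrs(int n, int nrhs, double[] lu, int lda,
                             int[] pivots, double[] b, int ldb)
    {
        for (int c = 0; c < nrhs; c++)
        {
            int off = c * ldb;

            // Apply the row interchanges recorded during factorization.
            for (int k = 0; k < n; k++)
            {
                int p = pivots[k] - 1;
                if (p != k)
                {
                    double t = b[off + k];
                    b[off + k] = b[off + p];
                    b[off + p] = t;
                }
            }

            // Forward substitution with the unit lower triangular factor L.
            for (int i = 1; i < n; i++)
            {
                for (int j = 0; j < i; j++)
                {
                    b[off + i] -= lu[i + j * lda] * b[off + j];
                }
            }

            // Backward substitution with the upper triangular factor U.
            for (int i = n - 1; i >= 0; i--)
            {
                for (int j = i + 1; j < n; j++)
                {
                    b[off + i] -= lu[i + j * lda] * b[off + j];
                }
                b[off + i] /= lu[i + i * lda];
            }
        }
    }
}
```

The solution overwrites the right-hand side, which matches the description of Barray as holding each solution matrix.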
GetrsBatchedD(Operation, Int32, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function solves an array of systems of linear equations of the form:
op(A[i]) X[i] = B[i]
where A[i] is a matrix which has been LU factorized with pivoting, X[i] and B[i] are n x nrhs matrices.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor.
Declaration
public int GetrsBatchedD(Operation trans, int n, int nrhs, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<int> devIpiv, CudaDeviceVariable<CUdeviceptr> Barray, int ldb, int batchSize)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows and columns of Aarray[i]. |
| System.Int32 | nrhs | number of columns of Barray[i]. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of pointers to array, with each array of dim. n x n with lda>=max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix Aarray[i]. |
| CudaDeviceVariable<System.Int32> | devIpiv | array of size n x batchSize that contains the pivoting sequence of each factorization of Aarray[i] stored in a linear fashion. If devIpiv is null, pivoting for all Aarray[i] is ignored. |
| CudaDeviceVariable<CUdeviceptr> | Barray | array of pointers to array, with each array of dim. n x nrhs with ldb>=max(1,n). |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store each solution matrix Barray[i]. |
| System.Int32 | batchSize | number of pointers contained in A |
Returns
| Type | Description |
|---|---|
| System.Int32 | If info=0, the execution is successful. If info = -j, the j-th parameter had an illegal value. |
GetrsBatchedS(Operation, Int32, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function solves an array of systems of linear equations of the form:
op(A[i]) X[i] = B[i]
where A[i] is a matrix which has been LU factorized with pivoting, X[i] and B[i] are n x nrhs matrices.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor.
Declaration
public int GetrsBatchedS(Operation trans, int n, int nrhs, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<int> devIpiv, CudaDeviceVariable<CUdeviceptr> Barray, int ldb, int batchSize)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows and columns of Aarray[i]. |
| System.Int32 | nrhs | number of columns of Barray[i]. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of pointers to array, with each array of dim. n x n with lda>=max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix Aarray[i]. |
| CudaDeviceVariable<System.Int32> | devIpiv | array of size n x batchSize that contains the pivoting sequence of each factorization of Aarray[i] stored in a linear fashion. If devIpiv is null, pivoting for all Aarray[i] is ignored. |
| CudaDeviceVariable<CUdeviceptr> | Barray | array of pointers to array, with each array of dim. n x nrhs with ldb>=max(1,n). |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store each solution matrix Barray[i]. |
| System.Int32 | batchSize | number of pointers contained in A |
Returns
| Type | Description |
|---|---|
| System.Int32 | If info=0, the execution is successful. If info = -j, the j-th parameter had an illegal value. |
GetrsBatchedZ(Operation, Int32, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function solves an array of systems of linear equations of the form:
op(A[i]) X[i] = B[i]
where A[i] is a matrix which has been LU factorized with pivoting, X[i] and B[i] are n x nrhs matrices.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor.
Declaration
public int GetrsBatchedZ(Operation trans, int n, int nrhs, CudaDeviceVariable<CUdeviceptr> Aarray, int lda, CudaDeviceVariable<int> devIpiv, CudaDeviceVariable<CUdeviceptr> Barray, int ldb, int batchSize)
Parameters
| Type | Name | Description |
|---|---|---|
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows and columns of Aarray[i]. |
| System.Int32 | nrhs | number of columns of Barray[i]. |
| CudaDeviceVariable<CUdeviceptr> | Aarray | array of pointers to array, with each array of dim. n x n with lda>=max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store each matrix Aarray[i]. |
| CudaDeviceVariable<System.Int32> | devIpiv | array of size n x batchSize that contains the pivoting sequence of each factorization of Aarray[i] stored in a linear fashion. If devIpiv is null, pivoting for all Aarray[i] is ignored. |
| CudaDeviceVariable<CUdeviceptr> | Barray | array of pointers to array, with each array of dim. n x nrhs with ldb>=max(1,n). |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store each solution matrix Barray[i]. |
| System.Int32 | batchSize | number of pointers contained in A |
Returns
| Type | Description |
|---|---|
| System.Int32 | If info=0, the execution is successful. If info = -j, the j-th parameter had an illegal value. |
GetVector<T>(CudaDeviceVariable<T>, Int32, T[], Int32)
Copies elements from a vector devSourceVector in GPU memory space to a vector hostDestVector
in CPU memory space. Storage spacing between consecutive elements
is incrDevSource for the source vector devSourceVector and incrHostDest for the destination vector
hostDestVector. Column-major format for two-dimensional matrices
is assumed throughout CUBLAS. Therefore, an increment equal to 1
accesses a column vector, while an increment
equal to the leading dimension of the respective matrix accesses a
row vector.
Declaration
public static void GetVector<T>(CudaDeviceVariable<T> devSourceVector, int incrDevSource, T[] hostDestVector, int incrHostDest)
where T : struct
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<T> | devSourceVector | Source vector in device memory |
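| System.Int32 | incrDevSource | storage spacing between consecutive elements of devSourceVector. |
| T[] | hostDestVector | Destination vector in host memory |
| System.Int32 | incrHostDest | storage spacing between consecutive elements of hostDestVector. |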
Type Parameters
| Name | Description |
|---|---|
| T | Vector datatype |
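The column versus row remark above follows directly from column-major indexing, as this small CPU sketch shows. `StridedCopyReference` and its `sourceOffset` parameter are hypothetical and exist only for illustration; the method works on host arrays and does not call ManagedCuda.

```csharp
// Hypothetical CPU reference, for illustration only (not part of ManagedCuda).
public static class StridedCopyReference
{
    // Copies n elements, reading every incSource-th element starting at sourceOffset
    // and writing every incDest-th element of dest.
    public static void CopyVector<T>(int n, T[] source, int sourceOffset, int incSource,
                                     T[] dest, int incDest)
    {
        for (int i = 0; i < n; i++)
        {
            dest[i * incDest] = source[sourceOffset + i * incSource];
        }
    }
}
```

For a 2 x 3 column-major matrix stored as { 1, 2, 3, 4, 5, 6 } with leading dimension 2, an increment of 1 from offset 2 reads the second column (3, 4), while an increment of 2 from offset 1 reads the second row (2, 4, 6).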
GetVectorAsync<T>(CudaDeviceVariable<T>, Int32, T[], Int32, CUstream)
Copies elements from a vector devSourceVector in GPU memory space to a vector hostDestVector
in CPU memory space. Storage spacing between consecutive elements
is incrDevSource for the source vector devSourceVector and incrHostDest for the destination vector
hostDestVector. Column-major format for two-dimensional matrices
is assumed throughout CUBLAS. Therefore, an increment equal to 1
accesses a column vector, while an increment
equal to the leading dimension of the respective matrix accesses a
row vector.
Declaration
public static void GetVectorAsync<T>(CudaDeviceVariable<T> devSourceVector, int incrDevSource, T[] hostDestVector, int incrHostDest, CUstream stream)
where T : struct
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<T> | devSourceVector | Source vector in device memory |
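| System.Int32 | incrDevSource | storage spacing between consecutive elements of devSourceVector. |
| T[] | hostDestVector | Destination vector in host memory |
| System.Int32 | incrHostDest | storage spacing between consecutive elements of hostDestVector. |
| CUstream | stream | CUDA stream used for the asynchronous copy. |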
Type Parameters
| Name | Description |
|---|---|
| T | Vector datatype |
GetVersion()
Declaration
public Version GetVersion()
Returns
| Type | Description |
|---|---|
| System.Version |
Hbmv(FillMode, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the Hermitian banded matrix-vector multiplication y = alpha * A * x + beta * y where A is a n*n Hermitian matrix with k subdiagonals and superdiagonals, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Hbmv(FillMode uplo, int k, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> beta, CudaDeviceVariable<cuDoubleComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | k | number of sub- and super-diagonals of matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= k + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Hbmv(FillMode, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the Hermitian banded matrix-vector multiplication y = alpha * A * x + beta * y where A is a n*n Hermitian matrix with k subdiagonals and superdiagonals, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Hbmv(FillMode uplo, int k, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> beta, CudaDeviceVariable<cuFloatComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | k | number of sub- and super-diagonals of matrix A. |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= k + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Hbmv(FillMode, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the Hermitian banded matrix-vector multiplication y = alpha * A * x + beta * y where A is a n*n Hermitian matrix with k subdiagonals and superdiagonals, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Hbmv(FillMode uplo, int k, cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> x, int incx, cuDoubleComplex beta, CudaDeviceVariable<cuDoubleComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | k | number of sub- and super-diagonals of matrix A. |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= k + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| cuDoubleComplex | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Hbmv(FillMode, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the Hermitian banded matrix-vector multiplication y = alpha * A * x + beta * y where A is a n*n Hermitian matrix with k subdiagonals and superdiagonals, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Hbmv(FillMode uplo, int k, cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> x, int incx, cuFloatComplex beta, CudaDeviceVariable<cuFloatComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | k | number of sub- and super-diagonals of matrix A. |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= k + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| cuFloatComplex | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
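The banded layout is easier to see in code. The following CPU sketch computes the Hermitian banded product for the lower-storage case, assuming the usual BLAS band layout in which element A(i,j) with j <= i <= j + k is stored at index (i - j) + j * lda. `HbmvReference` is a hypothetical helper using `System.Numerics.Complex` and unit strides; it is not ManagedCuda code.

```csharp
using System;
using System.Numerics;

// Hypothetical CPU reference, for illustration only (not part of ManagedCuda).
public static class HbmvReference
{
    // Computes y = alpha * A * x + beta * y for an n x n Hermitian band matrix A
    // with k sub-diagonals, of which only the lower band is stored.
    public static void HbmvLower(int n, int k, Complex alpha, Complex[] band, int lda,
                                 Complex[] x, Complex beta, Complex[] y)
    {
        var product = new Complex[n];
        for (int j = 0; j < n; j++)
        {
            int last = Math.Min(n - 1, j + k);
            for (int i = j; i <= last; i++)
            {
                Complex a = band[(i - j) + j * lda];
                if (i == j)
                {
                    // The diagonal of a Hermitian matrix is real; any imaginary part is ignored.
                    product[i] += a.Real * x[j];
                }
                else
                {
                    product[i] += a * x[j];                     // stored element A(i,j)
                    product[j] += Complex.Conjugate(a) * x[i];  // implied element A(j,i)
                }
            }
        }
        for (int i = 0; i < n; i++)
        {
            // When beta is zero, y is not read, so it need not hold valid input.
            y[i] = beta == Complex.Zero ? alpha * product[i] : alpha * product[i] + beta * y[i];
        }
    }
}
```

Each stored column holds at most k + 1 entries, which is why the band array needs a leading dimension of at least k + 1 rather than n.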
Hemm(SideMode, FillMode, Int32, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the Hermitian matrix-matrix multiplication C = alpha * A * B + beta * C if side==SideMode.Left or C = alpha * B * A + beta * C if side==SideMode.Right where A is a Hermitian matrix stored in lower or upper mode, B and C are m*n matrices, and alpha and beta are scalars.
Declaration
public void Hemm(SideMode side, FillMode uplo, int m, int n, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb, CudaDeviceVariable<cuDoubleComplex> beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | m | number of rows of matrix C and B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix C and B, with matrix A sized accordingly. |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * m. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<cuDoubleComplex> | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Hemm(SideMode, FillMode, Int32, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the Hermitian matrix-matrix multiplication C = alpha * A * B + beta * C if side==SideMode.Left or C = alpha * B * A + beta * C if side==SideMode.Right where A is a Hermitian matrix stored in lower or upper mode, B and C are m*n matrices, and alpha and beta are scalars.
Declaration
public void Hemm(SideMode side, FillMode uplo, int m, int n, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb, CudaDeviceVariable<cuFloatComplex> beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | m | number of rows of matrix C and B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix C and B, with matrix A sized accordingly. |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * m. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<cuFloatComplex> | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Hemm(SideMode, FillMode, Int32, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the Hermitian matrix-matrix multiplication C = alpha * A * B + beta * C if side==SideMode.Left or C = alpha * B * A + beta * C if side==SideMode.Right where A is a Hermitian matrix stored in lower or upper mode, B and C are m*n matrices, and alpha and beta are scalars.
Declaration
public void Hemm(SideMode side, FillMode uplo, int m, int n, cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb, cuDoubleComplex beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | m | number of rows of matrix C and B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix C and B, with matrix A sized accordingly. |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * m. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| cuDoubleComplex | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Hemm(SideMode, FillMode, Int32, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the Hermitian matrix-matrix multiplication C = alpha * A * B + beta * C if side==SideMode.Left or C = alpha * B * A + beta * C if side==SideMode.Right where A is a Hermitian matrix stored in lower or upper mode, B and C are m*n matrices, and alpha and beta are scalars.
Declaration
public void Hemm(SideMode side, FillMode uplo, int m, int n, cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb, cuFloatComplex beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | m | number of rows of matrix C and B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix C and B, with matrix A sized accordingly. |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * m. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| cuFloatComplex | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
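To make the role of `side` concrete, here is a minimal CPU reference sketch for column-major data with the lower part of A stored. It is an illustration only: `HemmLowerReference` and `HermitianLowerAt` are hypothetical names, a `bool` stands in for `SideMode`, and `System.Numerics.Complex` replaces the CUDA complex types.

```csharp
using System;
using System.Numerics;

// A = [ 1     2-i ]   stored lower, column-major, lda = 2.
//     [ 2+i   3   ]   The upper slot (index 2) holds a sentinel that is never read.
Complex[] demoA = { new Complex(1, 0), new Complex(2, 1), new Complex(99, 0), new Complex(3, 0) };
Complex[] demoB = { Complex.One, Complex.ImaginaryOne };

// Left:  B is 2x1 (m = 2, n = 1), C = A * B.
Complex[] cLeft = HemmLowerReference(true, 2, 1, Complex.One, demoA, 2,
                                     demoB, 2, Complex.Zero, new Complex[2], 2);
// Right: B is 1x2 (m = 1, n = 2), C = B * A.
Complex[] cRight = HemmLowerReference(false, 1, 2, Complex.One, demoA, 2,
                                      demoB, 1, Complex.Zero, new Complex[2], 1);
// Worked by hand: cLeft = (2 + 2i, 2 + 4i), cRight = (2i, 2 + 2i).
Console.WriteLine(string.Join(" ", cLeft));
Console.WriteLine(string.Join(" ", cRight));

// Reads A(i,j) from a lower-stored Hermitian matrix.
Complex HermitianLowerAt(Complex[] a, int lda, int i, int j)
{
    if (i == j) return a[i + j * lda].Real;          // diagonal is real
    return i > j ? a[i + j * lda]                    // stored entry
                 : Complex.Conjugate(a[j + i * lda]); // mirrored entry
}

// Hypothetical CPU reference: C = alpha*A*B + beta*C (left) or alpha*B*A + beta*C (right).
Complex[] HemmLowerReference(bool left, int m, int n, Complex alpha, Complex[] a, int lda,
                             Complex[] b, int ldb, Complex beta, Complex[] c, int ldc)
{
    var result = (Complex[])c.Clone();
    for (int col = 0; col < n; col++)
    {
        for (int row = 0; row < m; row++)
        {
            Complex sum = Complex.Zero;
            if (left)
            {
                // A is m x m and multiplies B from the left.
                for (int p = 0; p < m; p++)
                    sum += HermitianLowerAt(a, lda, row, p) * b[p + col * ldb];
            }
            else
            {
                // A is n x n and multiplies B from the right.
                for (int p = 0; p < n; p++)
                    sum += b[row + p * ldb] * HermitianLowerAt(a, lda, p, col);
            }
            Complex old = beta == Complex.Zero ? Complex.Zero : beta * c[row + col * ldc];
            result[row + col * ldc] = alpha * sum + old;
        }
    }
    return result;
}
```

Note how the size of A follows `side`: m*m when A is on the left, n*n when it is on the right.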
Hemv(FillMode, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the Hermitian matrix-vector multiplication y = alpha * A * x + beta * y where A is an n*n Hermitian matrix stored in lower or upper mode, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Hemv(FillMode uplo, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> beta, CudaDeviceVariable<cuDoubleComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Hemv(FillMode, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the Hermitian matrix-vector multiplication y = alpha * A * x + beta * y where A is an n*n Hermitian matrix stored in lower or upper mode, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Hemv(FillMode uplo, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> beta, CudaDeviceVariable<cuFloatComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Hemv(FillMode, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the Hermitian matrix-vector multiplication y = alpha * A * x + beta * y where A is an n*n Hermitian matrix stored in lower or upper mode, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Hemv(FillMode uplo, cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> x, int incx, cuDoubleComplex beta, CudaDeviceVariable<cuDoubleComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| cuDoubleComplex | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Hemv(FillMode, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the Hermitian matrix-vector multiplication y = alpha * A * x + beta * y where A is an n*n Hermitian matrix stored in lower or upper mode, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Hemv(FillMode uplo, cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> x, int incx, cuFloatComplex beta, CudaDeviceVariable<cuFloatComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| cuFloatComplex | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
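The stride parameters `incx` and `incy` can be illustrated with a small CPU reference. This is a sketch under stated assumptions, not the library's implementation: `HemvUpperReference` is a hypothetical helper, strides are assumed positive, the upper part of A is stored column-major, n is passed explicitly, and `System.Numerics.Complex` stands in for the CUDA complex types.

```csharp
using System;
using System.Numerics;

// A = [ 1     2-i ]   stored upper, column-major, lda = 2.
//     [ 2+i   3   ]   Index 1 (the lower slot) holds a sentinel that is never read.
Complex[] demoA = { new Complex(1, 0), new Complex(99, 0), new Complex(2, -1), new Complex(3, 0) };

// x = (1, i) stored with stride 2; the entries in between are skipped.
Complex[] demoX = { Complex.One, new Complex(7, 0), Complex.ImaginaryOne, new Complex(7, 0) };
Complex[] demoY = { new Complex(10, 0), new Complex(20, 0) };

Complex[] demoResult = HemvUpperReference(2, new Complex(2, 0), demoA, 2,
                                          demoX, 2, Complex.One, demoY, 1);
// Worked by hand: A*x = (2 + 2i, 2 + 4i), so 2*A*x + y = (14 + 4i, 24 + 8i).
Console.WriteLine(string.Join(" ", demoResult));

// Hypothetical CPU reference for y = alpha * A * x + beta * y (upper storage).
Complex[] HemvUpperReference(int n, Complex alpha, Complex[] a, int lda,
                             Complex[] x, int incx, Complex beta, Complex[] y, int incy)
{
    var result = (Complex[])y.Clone();
    for (int i = 0; i < n; i++)
    {
        Complex sum = Complex.Zero;
        for (int j = 0; j < n; j++)
        {
            Complex aij;
            if (i == j) aij = a[i + j * lda].Real;                 // real diagonal
            else if (i < j) aij = a[i + j * lda];                  // stored upper entry
            else aij = Complex.Conjugate(a[j + i * lda]);          // mirrored lower entry
            sum += aij * x[j * incx];                              // logical x[j]
        }
        // When beta == 0 the old contents of y are never read.
        Complex old = beta == Complex.Zero ? Complex.Zero : beta * y[i * incy];
        result[i * incy] = alpha * sum + old;
    }
    return result;
}
```

A stride of 2 on x means the buffer needs at least 1 + (n - 1) * incx elements; only every second element is read.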
Her(FillMode, CudaDeviceVariable<Double>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the Hermitian rank-1 update A = alpha * x * x^H + A where A is an n*n Hermitian matrix stored in column-major format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Her(FillMode uplo, CudaDeviceVariable<double> alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Her(FillMode, CudaDeviceVariable<Single>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the Hermitian rank-1 update A = alpha * x * x^H + A where A is an n*n Hermitian matrix stored in column-major format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Her(FillMode uplo, CudaDeviceVariable<float> alpha, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Her(FillMode, Double, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the Hermitian rank-1 update A = alpha * x * x^H + A where A is an n*n Hermitian matrix stored in column-major format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Her(FillMode uplo, double alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Her(FillMode, Single, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the Hermitian rank-1 update A = alpha * x * x^H + A where A is an n*n Hermitian matrix stored in column-major format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Her(FillMode uplo, float alpha, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
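The rank-1 update touches only the triangle selected by `uplo`. The following CPU sketch shows this for lower storage; `HerLowerReference` is a hypothetical helper (not a ManagedCuda method), strides are assumed positive, n is passed explicitly, and `System.Numerics.Complex` stands in for the CUDA complex types.

```csharp
using System;
using System.Numerics;

// 2x2 matrix A, column-major, lda = 2, initially zero in the lower triangle.
// Index 2 is the upper slot A(0,1); it holds a sentinel to show it is left untouched.
Complex[] demoA = { Complex.Zero, Complex.Zero, new Complex(99, 0), Complex.Zero };
Complex[] demoX = { new Complex(1, 1), new Complex(2, 0) };

HerLowerReference(2, 1.0, demoX, 1, demoA, 2);
// Worked by hand for x = (1 + i, 2):
//   A(0,0) = |1 + i|^2 = 2,  A(1,0) = 2 * conj(1 + i) = 2 - 2i,  A(1,1) = 4.
Console.WriteLine(string.Join(" ", demoA));

// Hypothetical CPU reference for A = alpha * x * x^H + A (lower triangle only, in place).
void HerLowerReference(int n, double alpha, Complex[] x, int incx, Complex[] a, int lda)
{
    for (int j = 0; j < n; j++)
    {
        for (int i = j; i < n; i++)
        {
            Complex update = alpha * x[i * incx] * Complex.Conjugate(x[j * incx]);
            // Keep the diagonal exactly real, as required for a Hermitian matrix.
            a[i + j * lda] += (i == j) ? new Complex(update.Real, 0.0) : update;
        }
    }
}
```

Because alpha is real in `Her`, the update term alpha * x * x^H is Hermitian by construction, which is why the scalar parameter is a `float` or `double` rather than a complex type.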
Her2(FillMode, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the Hermitian rank-2 update A = alpha * x * y^H + conj(alpha) * y * x^H + A where A is an n*n Hermitian matrix stored in column-major format, x and y are vectors, and alpha is a scalar. n is given by x.Size = y.Size.
Declaration
public void Her2(FillMode uplo, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy, CudaDeviceVariable<cuDoubleComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Her2(FillMode, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the Hermitian rank-2 update A = alpha * x * y^H + conj(alpha) * y * x^H + A where A is an n*n Hermitian matrix stored in column-major format, x and y are vectors, and alpha is a scalar. n is given by x.Size = y.Size.
Declaration
public void Her2(FillMode uplo, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy, CudaDeviceVariable<cuFloatComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Her2(FillMode, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the Hermitian rank-2 update A = alpha * x * y^H + conj(alpha) * y * x^H + A where A is an n*n Hermitian matrix stored in column-major format, x and y are vectors, and alpha is a scalar. n is given by x.Size = y.Size.
Declaration
public void Her2(FillMode uplo, cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy, CudaDeviceVariable<cuDoubleComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Her2(FillMode, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the Hermitian rank-2 update A = alpha * x * y^H + conj(alpha) * y * x^H + A where A is an n*n Hermitian matrix stored in column-major format, x and y are vectors, and alpha is a scalar. n is given by x.Size = y.Size.
Declaration
public void Her2(FillMode uplo, cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy, CudaDeviceVariable<cuFloatComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
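For reference, the underlying cuBLAS her2 routines are documented as computing A = alpha * x * y^H + conj(alpha) * y * x^H + A. A short check shows why the conjugated scalar is needed: taking the conjugate transpose of the update term only swaps its two summands, so the update is Hermitian for any complex alpha and A stays Hermitian.

$$
\left(\alpha\, x y^{H} + \bar{\alpha}\, y x^{H}\right)^{H}
  = \bar{\alpha}\, y x^{H} + \alpha\, x y^{H}
$$

As a one-element example (n = 1) with x = (1), y = (i), and alpha = i: x * y^H = -i, so alpha * x * y^H = 1; y * x^H = i, so conj(alpha) * y * x^H = 1. The update is 2, a real number, as a Hermitian 1*1 matrix requires.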
Her2k(FillMode, Operation, Int32, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the Hermitian rank-2k update C = alpha * Op(A) * Op(B)^H + conj(alpha) * Op(B) * Op(A)^H + beta * C where alpha and beta are scalars, C is a Hermitian matrix stored in lower or upper mode, and A and B are matrices with dimensions Op(A) n*k and Op(B) n*k, respectively.
Declaration
public void Her2k(FillMode uplo, Operation trans, int n, int k, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb, CudaDeviceVariable<double> beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of matrix op(A) and op(B). |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimensions ldb * k. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<System.Double> | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Her2k(FillMode, Operation, Int32, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the Hermitian rank-2k update C = alpha * Op(A) * Op(B)^H + conj(alpha) * Op(B) * Op(A)^H + beta * C where alpha and beta are scalars, C is a Hermitian matrix stored in lower or upper mode, and A and B are matrices with dimensions Op(A) n*k and Op(B) n*k, respectively.
Declaration
public void Her2k(FillMode uplo, Operation trans, int n, int k, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb, CudaDeviceVariable<float> beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of matrix op(A) and op(B). |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimensions ldb * k. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<System.Single> | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Her2k(FillMode, Operation, Int32, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, Double, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the Hermitian rank-2k update C = alpha * Op(A) * Op(B)^H + conj(alpha) * Op(B) * Op(A)^H + beta * C where alpha and beta are scalars, C is a Hermitian matrix stored in lower or upper mode, and A and B are matrices with dimensions Op(A) n*k and Op(B) n*k, respectively.
Declaration
public void Her2k(FillMode uplo, Operation trans, int n, int k, cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb, double beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of matrix op(A) and op(B). |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimensions ldb * k. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| System.Double | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Her2k(FillMode, Operation, Int32, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, Single, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the Hermitian rank-2k update C = alpha * Op(A) * Op(B)^H + conj(alpha) * Op(B) * Op(A)^H + beta * C where alpha and beta are scalars, C is a Hermitian matrix stored in lower or upper mode, and A and B are matrices with dimensions Op(A) n*k and Op(B) n*k, respectively.
Declaration
public void Her2k(FillMode uplo, Operation trans, int n, int k, cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb, float beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of matrix op(A) and op(B). |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimensions ldb * k. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| System.Single | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
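A small CPU reference can help when checking results of the rank-2k update with a complex alpha. The sketch below assumes the cuBLAS her2k definition C = alpha * A * B^H + conj(alpha) * B * A^H + beta * C for the non-transposed case, with C stored lower and column-major. `Her2kLowerNoTransReference` is a hypothetical helper, not part of ManagedCuda, and `System.Numerics.Complex` stands in for the CUDA complex types.

```csharp
using System;
using System.Numerics;

// n = 2, k = 1: A = (1, i)^T and B = (1, 1)^T are 2x1 column-major matrices.
Complex[] demoA = { Complex.One, Complex.ImaginaryOne };
Complex[] demoB = { Complex.One, Complex.One };
// C is 2x2 with ldc = 2; index 2 is the upper slot and holds a sentinel.
Complex[] demoC = { new Complex(5, 0), new Complex(5, 0), new Complex(99, 0), new Complex(5, 0) };

Her2kLowerNoTransReference(2, 1, Complex.ImaginaryOne, demoA, 2, demoB, 2, 0.0, demoC, 2);
// Worked by hand with alpha = i and beta = 0:
//   C(0,0) = 0,  C(1,0) = -1 - i,  C(1,1) = -2.
Console.WriteLine(string.Join(" ", demoC));

// Hypothetical CPU reference (in place, lower triangle only, no transpose).
void Her2kLowerNoTransReference(int n, int k, Complex alpha, Complex[] a, int lda,
                                Complex[] b, int ldb, double beta, Complex[] c, int ldc)
{
    for (int j = 0; j < n; j++)
    {
        for (int i = j; i < n; i++)
        {
            Complex sum = Complex.Zero;
            for (int p = 0; p < k; p++)
            {
                // (A * B^H)(i,j) and (B * A^H)(i,j), scaled by alpha and conj(alpha).
                sum += alpha * a[i + p * lda] * Complex.Conjugate(b[j + p * ldb])
                     + Complex.Conjugate(alpha) * b[i + p * ldb] * Complex.Conjugate(a[j + p * lda]);
            }
            // When beta == 0 the old contents of C are never read.
            Complex old = beta == 0.0 ? Complex.Zero : beta * c[i + j * ldc];
            Complex updated = sum + old;
            // Keep the diagonal exactly real.
            c[i + j * ldc] = (i == j) ? new Complex(updated.Real, 0.0) : updated;
        }
    }
}
```

The diagonal results come out real even though alpha is purely imaginary, which is the effect of pairing alpha with conj(alpha); beta is real for the same reason.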
Herk(FillMode, Operation, Int32, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the Hermitian rank-k update C = alpha * Op(A) * Op(A)^H + beta * C where alpha and beta are scalars, C is a Hermitian matrix stored in lower or upper mode, and A is a matrix with dimensions Op(A) n*k.
Declaration
public void Herk(FillMode uplo, Operation trans, int n, int k, CudaDeviceVariable<double> alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<double> beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A) and C. |
| System.Int32 | k | number of columns of matrix op(A). |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Herk(FillMode, Operation, Int32, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the Hermitian rank-k update C = alpha * op(A) * op(A)^H + beta * C, where alpha and beta are real scalars, C is an n x n Hermitian matrix stored in lower or upper mode, and op(A) is a matrix with dimensions n x k.
Declaration
public void Herk(FillMode uplo, Operation trans, int n, int k, CudaDeviceVariable<float> alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<float> beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix C is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A) and C. |
| System.Int32 | k | number of columns of matrix op(A). |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Herk(FillMode, Operation, Int32, Int32, Double, CudaDeviceVariable<cuDoubleComplex>, Int32, Double, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the Hermitian rank-k update C = alpha * op(A) * op(A)^H + beta * C, where alpha and beta are real scalars, C is an n x n Hermitian matrix stored in lower or upper mode, and op(A) is a matrix with dimensions n x k.
Declaration
public void Herk(FillMode uplo, Operation trans, int n, int k, double alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, double beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix C is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A) and C. |
| System.Int32 | k | number of columns of matrix op(A). |
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| System.Double | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Herk(FillMode, Operation, Int32, Int32, Single, CudaDeviceVariable<cuFloatComplex>, Int32, Single, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the Hermitian rank-k update C = alpha * op(A) * op(A)^H + beta * C, where alpha and beta are real scalars, C is an n x n Hermitian matrix stored in lower or upper mode, and op(A) is a matrix with dimensions n x k.
Declaration
public void Herk(FillMode uplo, Operation trans, int n, int k, float alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, float beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix C is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A) and C. |
| System.Int32 | k | number of columns of matrix op(A). |
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| System.Single | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
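The update described by the Herk overloads can be checked on small inputs with a host-side reference. The sketch below is an illustration only, not ManagedCuda or cuBLAS code: the names `HerkReference` and `LowerNoTranspose` are hypothetical, `System.Numerics.Complex` stands in for `cuDoubleComplex`, and it assumes column-major storage, no transpose, and lower fill mode.

```csharp
using System;
using System.Numerics;

public static class HerkReference
{
    // Host-side reference for C = alpha * A * A^H + beta * C.
    // Assumptions of this sketch: column-major arrays, op(A) = A (n x k),
    // and only the lower triangle of C is read and written.
    public static void LowerNoTranspose(int n, int k, double alpha,
        Complex[] A, int lda, double beta, Complex[] C, int ldc)
    {
        for (int j = 0; j < n; j++)
        {
            for (int i = j; i < n; i++)
            {
                // (A * A^H)(i, j) = sum over p of A(i, p) * conj(A(j, p))
                Complex sum = Complex.Zero;
                for (int p = 0; p < k; p++)
                {
                    sum += A[i + p * lda] * Complex.Conjugate(A[j + p * lda]);
                }

                Complex updated = alpha * sum + beta * C[i + j * ldc];

                // The diagonal of a Hermitian matrix is real.
                C[i + j * ldc] = (i == j) ? new Complex(updated.Real, 0.0) : updated;
            }
        }
    }
}
```

Comparing a downloaded device result against such a reference on a 2 x 2 or 3 x 3 case is a cheap way to confirm that `uplo`, `trans` and the leading dimensions were passed as intended.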
Herkx(FillMode, Operation, Int32, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs a variation of the Hermitian rank-k update C = alpha * op(A) * op(B)^H + beta * C, where alpha and beta are scalars, C is a Hermitian matrix stored in lower or upper mode, and op(A) and op(B) are matrices with dimensions n x k.
Declaration
public void Herkx(FillMode uplo, Operation trans, int n, int k, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb, CudaDeviceVariable<double> beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix C is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of matrix op(A) and op(B). |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimension lda x k with lda>=max(1,n) if trans == CUBLAS_OP_N and lda x n with lda>=max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimension ldb x k with ldb>=max(1,n) if trans == CUBLAS_OP_N and ldb x n with ldb>=max(1,k) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<System.Double> | beta | real scalar used for multiplication, if beta==0 then C does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimension ldc x n, with ldc>=max(1,n). The imaginary parts of the diagonal elements are assumed and set to zero. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Herkx(FillMode, Operation, Int32, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs a variation of the Hermitian rank-k update C = alpha * op(A) * op(B)^H + beta * C, where alpha and beta are scalars, C is a Hermitian matrix stored in lower or upper mode, and op(A) and op(B) are matrices with dimensions n x k.
Declaration
public void Herkx(FillMode uplo, Operation trans, int n, int k, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb, CudaDeviceVariable<float> beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix C is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of matrix op(A) and op(B). |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimension lda x k with lda>=max(1,n) if trans == CUBLAS_OP_N and lda x n with lda>=max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimension ldb x k with ldb>=max(1,n) if trans == CUBLAS_OP_N and ldb x n with ldb>=max(1,k) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<System.Single> | beta | real scalar used for multiplication, if beta==0 then C does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimension ldc x n, with ldc>=max(1,n). The imaginary parts of the diagonal elements are assumed and set to zero. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Herkx(FillMode, Operation, Int32, Int32, ref cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, ref Double, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs a variation of the Hermitian rank-k update C = alpha * op(A) * op(B)^H + beta * C, where alpha and beta are scalars, C is a Hermitian matrix stored in lower or upper mode, and op(A) and op(B) are matrices with dimensions n x k.
Declaration
public void Herkx(FillMode uplo, Operation trans, int n, int k, ref cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb, ref double beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix C is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of matrix op(A) and op(B). |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimension lda x k with lda>=max(1,n) if trans == CUBLAS_OP_N and lda x n with lda>=max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimension ldb x k with ldb>=max(1,n) if trans == CUBLAS_OP_N and ldb x n with ldb>=max(1,k) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| System.Double | beta | real scalar used for multiplication, if beta==0 then C does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimension ldc x n, with ldc>=max(1,n). The imaginary parts of the diagonal elements are assumed and set to zero. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Herkx(FillMode, Operation, Int32, Int32, ref cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, ref Single, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs a variation of the Hermitian rank-k update C = alpha * op(A) * op(B)^H + beta * C, where alpha and beta are scalars, C is a Hermitian matrix stored in lower or upper mode, and op(A) and op(B) are matrices with dimensions n x k.
Declaration
public void Herkx(FillMode uplo, Operation trans, int n, int k, ref cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb, ref float beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix C is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of matrix op(A) and op(B). |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimension lda x k with lda>=max(1,n) if trans == CUBLAS_OP_N and lda x n with lda>=max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimension ldb x k with ldb>=max(1,n) if trans == CUBLAS_OP_N and ldb x n with ldb>=max(1,k) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| System.Single | beta | real scalar used for multiplication, if beta==0 then C does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimension ldc x n, with ldc>=max(1,n). The imaginary parts of the diagonal elements are assumed and set to zero. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
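The size rules for A and B in the Herkx parameter tables depend on the transpose mode, which is easy to get wrong when allocating buffers. The helper below is a minimal sketch of that rule. `HerkxLayout`, `RequiredElements` and the `noTranspose` flag (standing in for the CUBLAS_OP_N case) are hypothetical names, not part of ManagedCuda.

```csharp
using System;

public static class HerkxLayout
{
    // Element count for an input array (A or B) as stated in the parameter
    // table: ld x k in the no-transpose case, ld x n otherwise, with the
    // leading dimension at least max(1, rows of the stored matrix).
    public static int RequiredElements(bool noTranspose, int n, int k, int ld)
    {
        int rows = noTranspose ? n : k; // rows of the matrix as stored
        int cols = noTranspose ? k : n; // columns of the matrix as stored

        if (ld < Math.Max(1, rows))
        {
            throw new ArgumentException(
                "leading dimension is smaller than max(1, rows)", nameof(ld));
        }

        return ld * cols;
    }
}
```

For example, with n = 4 and k = 3, a non-transposed input with lda = 4 needs 12 elements, while the transposed case with lda = 3 also needs 12 because the stored matrix is 3 x 4.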
Hpmv(FillMode, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the Hermitian packed matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n Hermitian matrix stored in packed format, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Hpmv(FillMode uplo, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> AP, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> beta, CudaDeviceVariable<cuDoubleComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | AP | array with A stored in packed format. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Hpmv(FillMode, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the Hermitian packed matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n Hermitian matrix stored in packed format, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Hpmv(FillMode uplo, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> AP, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> beta, CudaDeviceVariable<cuFloatComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | AP | array with A stored in packed format. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Hpmv(FillMode, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the Hermitian packed matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n Hermitian matrix stored in packed format, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Hpmv(FillMode uplo, cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> AP, CudaDeviceVariable<cuDoubleComplex> x, int incx, cuDoubleComplex beta, CudaDeviceVariable<cuDoubleComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | AP | array with A stored in packed format. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| cuDoubleComplex | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Hpmv(FillMode, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the Hermitian packed matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n Hermitian matrix stored in packed format, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Hpmv(FillMode uplo, cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> AP, CudaDeviceVariable<cuFloatComplex> x, int incx, cuFloatComplex beta, CudaDeviceVariable<cuFloatComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | AP | array with A stored in packed format. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| cuFloatComplex | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
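The Hpmv overloads take A in packed format, so the host array has to be laid out before it is copied to the device. The sketch below shows the index arithmetic, assuming the standard column-major BLAS packed layout in which only one triangle is stored, column after column. `PackedStorage`, `Index` and `Length` are hypothetical helper names, not ManagedCuda API.

```csharp
using System;

public static class PackedStorage
{
    // Number of stored elements for one triangle of an n x n matrix.
    public static int Length(int n) => n * (n + 1) / 2;

    // Zero-based position of element (i, j) in the packed array.
    // lower == true : columns of the lower triangle are stored (requires i >= j).
    // lower == false: columns of the upper triangle are stored (requires i <= j).
    public static int Index(bool lower, int n, int i, int j)
    {
        if (i < 0 || j < 0 || i >= n || j >= n)
        {
            throw new ArgumentOutOfRangeException();
        }

        if (lower)
        {
            if (i < j) throw new ArgumentException("(i, j) is not in the lower triangle");
            // Column j starts after the j previous columns of lengths n, n-1, ...
            return i + j * (2 * n - j - 1) / 2;
        }

        if (i > j) throw new ArgumentException("(i, j) is not in the upper triangle");
        // Column j starts after the j previous columns of lengths 1, 2, ...
        return i + j * (j + 1) / 2;
    }
}
```

For n = 3 in lower mode the elements (0,0), (1,0), (2,0), (1,1), (2,1), (2,2) map to positions 0 through 5, so the packed array holds 6 values instead of 9.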
Hpr(FillMode, CudaDeviceVariable<Double>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>)
This function performs the packed Hermitian rank-1 update A = alpha * x * x^H + A, where A is an n*n Hermitian matrix stored in packed format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Hpr(FillMode uplo, CudaDeviceVariable<double> alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | AP | array with A stored in packed format. |
Hpr(FillMode, CudaDeviceVariable<Single>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>)
This function performs the packed Hermitian rank-1 update A = alpha * x * x^H + A, where A is an n*n Hermitian matrix stored in packed format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Hpr(FillMode uplo, CudaDeviceVariable<float> alpha, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | AP | array with A stored in packed format. |
Hpr(FillMode, Double, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>)
This function performs the packed Hermitian rank-1 update A = alpha * x * x^H + A, where A is an n*n Hermitian matrix stored in packed format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Hpr(FillMode uplo, double alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | AP | array with A stored in packed format. |
Hpr(FillMode, Single, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>)
This function performs the packed Hermitian rank-1 update A = alpha * x * x^H + A, where A is an n*n Hermitian matrix stored in packed format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Hpr(FillMode uplo, float alpha, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | AP | array with A stored in packed format. |
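To make the packed rank-1 update concrete, the following host-side sketch applies A = alpha * x * x^H + A directly to a lower-packed array. It is a reference for checking small cases, not the library's implementation. `HprReference` and `LowerPacked` are hypothetical names, `System.Numerics.Complex` replaces `cuDoubleComplex`, and the column-major lower-packed layout and positive stride are assumptions of the sketch.

```csharp
using System;
using System.Numerics;

public static class HprReference
{
    // Applies A = alpha * x * x^H + A to the lower triangle of A,
    // where AP holds that triangle column by column (n * (n + 1) / 2 values).
    public static void LowerPacked(int n, double alpha, Complex[] x, int incx, Complex[] AP)
    {
        if (incx <= 0)
        {
            throw new ArgumentException("this sketch only handles positive strides", nameof(incx));
        }

        for (int j = 0; j < n; j++)
        {
            // Offset chosen so that AP[i + columnOffset] is A(i, j) for i >= j.
            int columnOffset = j * (2 * n - j - 1) / 2;

            for (int i = j; i < n; i++)
            {
                Complex update = alpha * (x[i * incx] * Complex.Conjugate(x[j * incx]));
                Complex value = AP[i + columnOffset] + update;

                // Keep the diagonal real, as required for a Hermitian matrix.
                AP[i + columnOffset] = (i == j) ? new Complex(value.Real, 0.0) : value;
            }
        }
    }
}
```

With x = (1+i, 2), alpha = 1 and AP initially zero, the packed result is (2, 2-2i, 4), which is the lower triangle of x * x^H.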
Hpr2(FillMode, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>)
This function performs the packed Hermitian rank-2 update A = alpha * x * y^H + conj(alpha) * y * x^H + A, where A is an n*n Hermitian matrix stored in packed format, x and y are vectors, and alpha is a scalar. n is given by x.Size.
Declaration
public void Hpr2(FillMode uplo, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy, CudaDeviceVariable<cuDoubleComplex> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuDoubleComplex> | AP | array with A stored in packed format. |
Hpr2(FillMode, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>)
This function performs the packed Hermitian rank-2 update A = alpha * x * y^H + conj(alpha) * y * x^H + A, where A is an n*n Hermitian matrix stored in packed format, x and y are vectors, and alpha is a scalar. n is given by x.Size.
Declaration
public void Hpr2(FillMode uplo, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy, CudaDeviceVariable<cuFloatComplex> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuFloatComplex> | AP | array with A stored in packed format. |
Hpr2(FillMode, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>)
This function performs the packed Hermitian rank-2 update A = alpha * x * y^H + conj(alpha) * y * x^H + A, where A is an n*n Hermitian matrix stored in packed format, x and y are vectors, and alpha is a scalar. n is given by x.Size.
Declaration
public void Hpr2(FillMode uplo, cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy, CudaDeviceVariable<cuDoubleComplex> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuDoubleComplex> | AP | array with A stored in packed format. |
Hpr2(FillMode, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>)
This function performs the packed Hermitian rank-2 update A = alpha * x * y^H + conj(alpha) * y * x^H + A, where A is an n*n Hermitian matrix stored in packed format, x and y are vectors, and alpha is a scalar. n is given by x.Size.
Declaration
public void Hpr2(FillMode uplo, cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy, CudaDeviceVariable<cuFloatComplex> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other Hermitian part is not referenced and is inferred from the stored elements. |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuFloatComplex> | AP | array with A stored in packed format. |
MatinvBatchedC(Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, Int32)
Declaration
public void MatinvBatchedC(int n, CudaDeviceVariable<CUdeviceptr> A, int lda, CudaDeviceVariable<CUdeviceptr> Ainv, int lda_inv, CudaDeviceVariable<int> INFO, int batchSize)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | n | |
| CudaDeviceVariable<CUdeviceptr> | A | |
| System.Int32 | lda | |
| CudaDeviceVariable<CUdeviceptr> | Ainv | |
| System.Int32 | lda_inv | |
| CudaDeviceVariable<System.Int32> | INFO | |
| System.Int32 | batchSize |
MatinvBatchedD(Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, Int32)
Declaration
public void MatinvBatchedD(int n, CudaDeviceVariable<CUdeviceptr> A, int lda, CudaDeviceVariable<CUdeviceptr> Ainv, int lda_inv, CudaDeviceVariable<int> INFO, int batchSize)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | n | |
| CudaDeviceVariable<CUdeviceptr> | A | |
| System.Int32 | lda | |
| CudaDeviceVariable<CUdeviceptr> | Ainv | |
| System.Int32 | lda_inv | |
| CudaDeviceVariable<System.Int32> | INFO | |
| System.Int32 | batchSize |
MatinvBatchedS(Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, Int32)
Declaration
public void MatinvBatchedS(int n, CudaDeviceVariable<CUdeviceptr> A, int lda, CudaDeviceVariable<CUdeviceptr> Ainv, int lda_inv, CudaDeviceVariable<int> INFO, int batchSize)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | n | |
| CudaDeviceVariable<CUdeviceptr> | A | |
| System.Int32 | lda | |
| CudaDeviceVariable<CUdeviceptr> | Ainv | |
| System.Int32 | lda_inv | |
| CudaDeviceVariable<System.Int32> | INFO | |
| System.Int32 | batchSize |
MatinvBatchedZ(Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<Int32>, Int32)
Declaration
public void MatinvBatchedZ(int n, CudaDeviceVariable<CUdeviceptr> A, int lda, CudaDeviceVariable<CUdeviceptr> Ainv, int lda_inv, CudaDeviceVariable<int> INFO, int batchSize)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | n | |
| CudaDeviceVariable<CUdeviceptr> | A | |
| System.Int32 | lda | |
| CudaDeviceVariable<CUdeviceptr> | Ainv | |
| System.Int32 | lda_inv | |
| CudaDeviceVariable<System.Int32> | INFO | |
| System.Int32 | batchSize |
Max(CudaDeviceVariable<cuDoubleComplex>, Int32)
This function finds the (smallest) index of the element of the maximum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public int Max(CudaDeviceVariable<cuDoubleComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx |
Returns
| Type | Description |
|---|---|
| System.Int32 |
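Because the returned index is 1-based, an off-by-one error is easy to introduce when the result is used to index a C# array. The host-side sketch below mirrors the described behaviour for complex input. It assumes, as in the cuBLAS description of the underlying amax routine, that "magnitude" for complex elements means |Re| + |Im|, and that 0 is returned for an empty or invalid range as in reference BLAS. `MaxReference` and `IndexOfMaxMagnitude` are hypothetical names, not ManagedCuda API.

```csharp
using System;
using System.Numerics;

public static class MaxReference
{
    // Returns the 1-based index of the first element whose |Re| + |Im|
    // is largest among x[0], x[incx], ..., x[(n - 1) * incx].
    // Returns 0 when n < 1 or incx < 1 (assumption of this sketch).
    public static int IndexOfMaxMagnitude(Complex[] x, int n, int incx)
    {
        if (n < 1 || incx < 1)
        {
            return 0;
        }

        int best = 1;
        double bestMagnitude = Math.Abs(x[0].Real) + Math.Abs(x[0].Imaginary);

        for (int i = 1; i < n; i++)
        {
            Complex element = x[i * incx];
            double magnitude = Math.Abs(element.Real) + Math.Abs(element.Imaginary);

            // Strict comparison keeps the smallest index when magnitudes tie.
            if (magnitude > bestMagnitude)
            {
                bestMagnitude = magnitude;
                best = i + 1;
            }
        }

        return best;
    }
}
```

To read the winning element from a zero-based host array, subtract one from the result and multiply by `incx`.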
Max(CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<Int32>)
This function finds the (smallest) index of the element of the maximum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public void Max(CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<int> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Int32> | result |
Max(CudaDeviceVariable<cuDoubleComplex>, Int32, ref Int32)
This function finds the (smallest) index of the element of the maximum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public void Max(CudaDeviceVariable<cuDoubleComplex> x, int incx, ref int result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| System.Int32 | result |
Max(CudaDeviceVariable<cuFloatComplex>, Int32)
This function finds the (smallest) index of the element of the maximum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public int Max(CudaDeviceVariable<cuFloatComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx |
Returns
| Type | Description |
|---|---|
| System.Int32 |
Max(CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<Int32>)
This function finds the (smallest) index of the element of the maximum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public void Max(CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<int> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Int32> | result |
Max(CudaDeviceVariable<cuFloatComplex>, Int32, ref Int32)
This function finds the (smallest) index of the element of the maximum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public void Max(CudaDeviceVariable<cuFloatComplex> x, int incx, ref int result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| System.Int32 | result |
Max(CudaDeviceVariable<Double>, Int32)
This function finds the (smallest) index of the element of the maximum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public int Max(CudaDeviceVariable<double> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx |
Returns
| Type | Description |
|---|---|
| System.Int32 |
Max(CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Int32>)
This function finds the (smallest) index of the element of the maximum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public void Max(CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<int> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Int32> | result |
Max(CudaDeviceVariable<Double>, Int32, ref Int32)
This function finds the (smallest) index of the element of the maximum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public void Max(CudaDeviceVariable<double> x, int incx, ref int result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx | |
| System.Int32 | result |
Max(CudaDeviceVariable<Single>, Int32)
This function finds the (smallest) index of the element of the maximum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public int Max(CudaDeviceVariable<float> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx |
Returns
| Type | Description |
|---|---|
| System.Int32 |
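
The 1-based result is easy to misuse as a C# array index. The following CPU-side sketch mirrors the semantics described above (smallest index on ties, Fortran-style numbering) for a real-valued vector. `AmaxReference`, `IndexOfMaxMagnitude` and the element count `n` are hypothetical names used only for illustration; they are not part of ManagedCuda, and returning 0 for an empty range is a choice made by this sketch.

```csharp
using System;

public static class AmaxReference
{
    // Returns the 1-based index of the first element with the largest |x[i]|,
    // visiting n elements spaced incx apart. Returns 0 when there is nothing
    // to scan (an assumption of this sketch).
    public static int IndexOfMaxMagnitude(float[] x, int n, int incx)
    {
        if (x == null) throw new ArgumentNullException(nameof(x));
        if (n <= 0 || incx <= 0) return 0;

        int best = 1;
        float bestMag = Math.Abs(x[0]);
        for (int i = 1; i < n; i++)
        {
            float mag = Math.Abs(x[i * incx]);
            if (mag > bestMag) // strict '>' keeps the smallest index on ties
            {
                bestMag = mag;
                best = i + 1; // convert the 0-based loop counter to 1-based
            }
        }
        return best;
    }
}
```

Subtract 1 from the returned value before using it to index a managed array.
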
Max(CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Int32>)
This function finds the (smallest) index of the element of the maximum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public void Max(CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<int> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Int32> | result |
Max(CudaDeviceVariable<Single>, Int32, ref Int32)
This function finds the (smallest) index of the element of the maximum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public void Max(CudaDeviceVariable<float> x, int incx, ref int result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx | |
| System.Int32 | result |
Min(CudaDeviceVariable<cuDoubleComplex>, Int32)
This function finds the (smallest) index of the element of the minimum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public int Min(CudaDeviceVariable<cuDoubleComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx |
Returns
| Type | Description |
|---|---|
| System.Int32 |
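
For complex vectors, "magnitude" is probably not the modulus. Assuming this wrapper forwards to the cuBLAS `amin` routine, whose documentation defines the magnitude as |Re| + |Im|, the sketch below shows how that choice can change the answer. It runs on the CPU and uses `System.Numerics.Complex` as a stand-in for `cuDoubleComplex`; `AminReference` and its members are hypothetical names, not ManagedCuda API.

```csharp
using System;
using System.Numerics;

public static class AminReference
{
    // Magnitude as assumed from the cuBLAS documentation: |Re| + |Im|.
    public static double Magnitude(Complex z)
    {
        return Math.Abs(z.Real) + Math.Abs(z.Imaginary);
    }

    // Returns the 1-based index of the first element with the smallest
    // magnitude, visiting n elements spaced incx apart (0 if nothing to scan).
    public static int IndexOfMinMagnitude(Complex[] x, int n, int incx)
    {
        if (x == null) throw new ArgumentNullException(nameof(x));
        if (n <= 0 || incx <= 0) return 0;

        int best = 1;
        double bestMag = Magnitude(x[0]);
        for (int i = 1; i < n; i++)
        {
            double mag = Magnitude(x[i * incx]);
            if (mag < bestMag) // strict '<' keeps the smallest index on ties
            {
                bestMag = mag;
                best = i + 1;
            }
        }
        return best;
    }
}
```

With the inputs 3+4i and 6+0i, the first element has the smaller modulus (5 versus 6), but the second has the smaller |Re| + |Im| (6 versus 7), so this sketch reports index 2.
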
Min(CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<Int32>)
This function finds the (smallest) index of the element of the minimum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public void Min(CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<int> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Int32> | result |
Min(CudaDeviceVariable<cuDoubleComplex>, Int32, ref Int32)
This function finds the (smallest) index of the element of the minimum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public void Min(CudaDeviceVariable<cuDoubleComplex> x, int incx, ref int result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| System.Int32 | result |
Min(CudaDeviceVariable<cuFloatComplex>, Int32)
This function finds the (smallest) index of the element of the minimum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public int Min(CudaDeviceVariable<cuFloatComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx |
Returns
| Type | Description |
|---|---|
| System.Int32 |
Min(CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<Int32>)
This function finds the (smallest) index of the element of the minimum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public void Min(CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<int> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Int32> | result |
Min(CudaDeviceVariable<cuFloatComplex>, Int32, ref Int32)
This function finds the (smallest) index of the element of the minimum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public void Min(CudaDeviceVariable<cuFloatComplex> x, int incx, ref int result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| System.Int32 | result |
Min(CudaDeviceVariable<Double>, Int32)
This function finds the (smallest) index of the element of the minimum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public int Min(CudaDeviceVariable<double> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx |
Returns
| Type | Description |
|---|---|
| System.Int32 |
Min(CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Int32>)
This function finds the (smallest) index of the element of the minimum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public void Min(CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<int> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Int32> | result |
Min(CudaDeviceVariable<Double>, Int32, ref Int32)
This function finds the (smallest) index of the element of the minimum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public void Min(CudaDeviceVariable<double> x, int incx, ref int result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx | |
| System.Int32 | result |
Min(CudaDeviceVariable<Single>, Int32)
This function finds the (smallest) index of the element of the minimum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public int Min(CudaDeviceVariable<float> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx |
Returns
| Type | Description |
|---|---|
| System.Int32 |
Min(CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Int32>)
This function finds the (smallest) index of the element of the minimum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public void Min(CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<int> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Int32> | result |
Min(CudaDeviceVariable<Single>, Int32, ref Int32)
This function finds the (smallest) index of the element of the minimum magnitude.
First index starts at 1 (Fortran notation)
Declaration
public void Min(CudaDeviceVariable<float> x, int incx, ref int result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx | |
| System.Int32 | result |
Norm2(CudaDeviceVariable<cuDoubleComplex>, Int32)
This function computes the Euclidean norm of the vector x.
Declaration
public double Norm2(CudaDeviceVariable<cuDoubleComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx |
Returns
| Type | Description |
|---|---|
| System.Double |
Norm2(CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<Double>)
This function computes the Euclidean norm of the vector x.
Declaration
public void Norm2(CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<double> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Double> | result |
Norm2(CudaDeviceVariable<cuDoubleComplex>, Int32, ref Double)
This function computes the Euclidean norm of the vector x.
Declaration
public void Norm2(CudaDeviceVariable<cuDoubleComplex> x, int incx, ref double result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| System.Double | result |
Norm2(CudaDeviceVariable<cuFloatComplex>, Int32)
This function computes the Euclidean norm of the vector x.
Declaration
public float Norm2(CudaDeviceVariable<cuFloatComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx |
Returns
| Type | Description |
|---|---|
| System.Single |
Norm2(CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<Single>)
This function computes the Euclidean norm of the vector x.
Declaration
public void Norm2(CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<float> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Single> | result |
Norm2(CudaDeviceVariable<cuFloatComplex>, Int32, ref Single)
This function computes the Euclidean norm of the vector x.
Declaration
public void Norm2(CudaDeviceVariable<cuFloatComplex> x, int incx, ref float result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| System.Single | result |
Norm2(CudaDeviceVariable<Double>, Int32)
This function computes the Euclidean norm of the vector x.
Declaration
public double Norm2(CudaDeviceVariable<double> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx |
Returns
| Type | Description |
|---|---|
| System.Double |
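
As a point of comparison for validating results, the Euclidean norm of a strided real vector can be computed on the CPU as sketched below. Scaling by the largest absolute value is one common way to avoid overflow in the sum of squares; it is an illustration only and not a statement about how the GPU routine is implemented. `Nrm2Reference` and the element count `n` are hypothetical names.

```csharp
using System;

public static class Nrm2Reference
{
    // Computes sqrt(sum of x[i*incx]^2) over n elements, scaling by the
    // largest absolute value so that squaring does not overflow.
    public static double Norm2(double[] x, int n, int incx)
    {
        if (x == null) throw new ArgumentNullException(nameof(x));
        if (n <= 0 || incx <= 0) return 0.0;

        double scale = 0.0;
        for (int i = 0; i < n; i++)
        {
            scale = Math.Max(scale, Math.Abs(x[i * incx]));
        }
        if (scale == 0.0) return 0.0; // all visited elements are zero

        double sum = 0.0;
        for (int i = 0; i < n; i++)
        {
            double t = x[i * incx] / scale;
            sum += t * t;
        }
        return scale * Math.Sqrt(sum);
    }
}
```
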
Norm2(CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>)
This function computes the Euclidean norm of the vector x.
Declaration
public void Norm2(CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Double> | result |
Norm2(CudaDeviceVariable<Double>, Int32, ref Double)
This function computes the Euclidean norm of the vector x.
Declaration
public void Norm2(CudaDeviceVariable<double> x, int incx, ref double result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx | |
| System.Double | result |
Norm2(CudaDeviceVariable<Single>, Int32)
This function computes the Euclidean norm of the vector x.
Declaration
public float Norm2(CudaDeviceVariable<float> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx |
Returns
| Type | Description |
|---|---|
| System.Single |
Norm2(CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>)
This function computes the Euclidean norm of the vector x.
Declaration
public void Norm2(CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Single> | result |
Norm2(CudaDeviceVariable<Single>, Int32, ref Single)
This function computes the Euclidean norm of the vector x.
Declaration
public void Norm2(CudaDeviceVariable<float> x, int incx, ref float result)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx | |
| System.Single | result |
Rot(CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<cuDoubleComplex>)
This function applies Givens rotation matrix G = |c s; -s c| to vectors x and y.
Declaration
public void Rot(CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy, CudaDeviceVariable<double> c, CudaDeviceVariable<cuDoubleComplex> s)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuDoubleComplex> | y | |
| System.Int32 | incy | |
| CudaDeviceVariable<System.Double> | c | Cosine component |
| CudaDeviceVariable<cuDoubleComplex> | s | Sine component |
Rot(CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>)
This function applies Givens rotation matrix G = |c s; -s c| to vectors x and y.
Declaration
public void Rot(CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy, CudaDeviceVariable<double> c, CudaDeviceVariable<double> s)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuDoubleComplex> | y | |
| System.Int32 | incy | |
| CudaDeviceVariable<System.Double> | c | Cosine component |
| CudaDeviceVariable<System.Double> | s | Sine component |
Rot(CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, Double, cuDoubleComplex)
This function applies Givens rotation matrix G = |c s; -s c| to vectors x and y.
Declaration
public void Rot(CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy, double c, cuDoubleComplex s)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuDoubleComplex> | y | |
| System.Int32 | incy | |
| System.Double | c | Cosine component |
| cuDoubleComplex | s | Sine component |
Rot(CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, Double, Double)
This function applies Givens rotation matrix G = |c s; -s c| to vectors x and y.
Declaration
public void Rot(CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy, double c, double s)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuDoubleComplex> | y | |
| System.Int32 | incy | |
| System.Double | c | Cosine component |
| System.Double | s | Sine component |
Rot(CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<cuFloatComplex>)
This function applies Givens rotation matrix G = |c s; -s c| to vectors x and y.
Declaration
public void Rot(CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy, CudaDeviceVariable<float> c, CudaDeviceVariable<cuFloatComplex> s)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuFloatComplex> | y | |
| System.Int32 | incy | |
| CudaDeviceVariable<System.Single> | c | Cosine component |
| CudaDeviceVariable<cuFloatComplex> | s | Sine component |
Rot(CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>)
This function applies Givens rotation matrix G = |c s; -s c| to vectors x and y.
Declaration
public void Rot(CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy, CudaDeviceVariable<float> c, CudaDeviceVariable<float> s)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuFloatComplex> | y | |
| System.Int32 | incy | |
| CudaDeviceVariable<System.Single> | c | Cosine component |
| CudaDeviceVariable<System.Single> | s | Sine component |
Rot(CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, Single, cuFloatComplex)
This function applies Givens rotation matrix G = |c s; -s c| to vectors x and y.
Declaration
public void Rot(CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy, float c, cuFloatComplex s)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuFloatComplex> | y | |
| System.Int32 | incy | |
| System.Single | c | Cosine component |
| cuFloatComplex | s | Sine component |
Rot(CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, Single, Single)
This function applies Givens rotation matrix G = |c s; -s c| to vectors x and y.
Declaration
public void Rot(CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy, float c, float s)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<cuFloatComplex> | y | |
| System.Int32 | incy | |
| System.Single | c | Cosine component |
| System.Single | s | Sine component |
Rot(CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>)
This function applies Givens rotation matrix G = |c s; -s c| to vectors x and y.
Declaration
public void Rot(CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> y, int incy, CudaDeviceVariable<double> c, CudaDeviceVariable<double> s)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Double> | y | |
| System.Int32 | incy | |
| CudaDeviceVariable<System.Double> | c | Cosine component |
| CudaDeviceVariable<System.Double> | s | Sine component |
Rot(CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, Double, Double)
This function applies Givens rotation matrix G = |c s; -s c| to vectors x and y.
Declaration
public void Rot(CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> y, int incy, double c, double s)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Double> | y | |
| System.Int32 | incy | |
| System.Double | c | Cosine component |
| System.Double | s | Sine component |
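
Read row by row, the matrix G = |c s; -s c| updates each pair of elements as x[i] = c*x[i] + s*y[i] and y[i] = -s*x[i] + c*y[i], using the old values on the right-hand side. A minimal CPU-side sketch of that update for the real-valued case follows; `RotReference`, `Apply` and the element count `n` are hypothetical names, and the complex overloads are not covered here.

```csharp
using System;

public static class RotReference
{
    // Applies G = |c s; -s c| to n element pairs (x[i*incx], y[i*incy]) in place.
    public static void Apply(double[] x, int incx, double[] y, int incy, int n, double c, double s)
    {
        if (x == null) throw new ArgumentNullException(nameof(x));
        if (y == null) throw new ArgumentNullException(nameof(y));

        for (int i = 0; i < n; i++)
        {
            int ix = i * incx;
            int iy = i * incy;
            double xi = x[ix]; // keep the old values: both outputs depend on both inputs
            double yi = y[iy];
            x[ix] = c * xi + s * yi;
            y[iy] = c * yi - s * xi;
        }
    }
}
```
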
Rot(CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>)
This function applies Givens rotation matrix G = |c s; -s c| to vectors x and y.
Declaration
public void Rot(CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> y, int incy, CudaDeviceVariable<float> c, CudaDeviceVariable<float> s)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Single> | y | |
| System.Int32 | incy | |
| CudaDeviceVariable<System.Single> | c | Cosine component |
| CudaDeviceVariable<System.Single> | s | Sine component |
Rot(CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, Single, Single)
This function applies Givens rotation matrix G = |c s; -s c| to vectors x and y.
Declaration
public void Rot(CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> y, int incy, float c, float s)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Single> | y | |
| System.Int32 | incy | |
| System.Single | c | Cosine component |
| System.Single | s | Sine component |
Rotg(CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>)
This function constructs the Givens rotation matrix G = |c s; -s c| that zeros out the second entry of a 2x1 vector (a, b)^T.
Declaration
public void Rotg(CudaDeviceVariable<cuDoubleComplex> a, CudaDeviceVariable<cuDoubleComplex> b, CudaDeviceVariable<cuDoubleComplex> c, CudaDeviceVariable<cuDoubleComplex> s)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | a | |
| CudaDeviceVariable<cuDoubleComplex> | b | |
| CudaDeviceVariable<cuDoubleComplex> | c | Cosine component |
| CudaDeviceVariable<cuDoubleComplex> | s | Sine component |
Rotg(CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<Single>, CudaDeviceVariable<cuFloatComplex>)
This function constructs the Givens rotation matrix G = |c s; -s c| that zeros out the second entry of a 2x1 vector (a, b)^T.
Declaration
public void Rotg(CudaDeviceVariable<cuFloatComplex> a, CudaDeviceVariable<cuFloatComplex> b, CudaDeviceVariable<float> c, CudaDeviceVariable<cuFloatComplex> s)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | a | |
| CudaDeviceVariable<cuFloatComplex> | b | |
| CudaDeviceVariable<System.Single> | c | Cosine component |
| CudaDeviceVariable<cuFloatComplex> | s | Sine component |
Rotg(CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>)
This function constructs the Givens rotation matrix G = |c s; -s c| that zeros out the second entry of a 2x1 vector (a, b)^T.
Declaration
public void Rotg(CudaDeviceVariable<double> a, CudaDeviceVariable<double> b, CudaDeviceVariable<double> c, CudaDeviceVariable<double> s)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | a | |
| CudaDeviceVariable<System.Double> | b | |
| CudaDeviceVariable<System.Double> | c | Cosine component |
| CudaDeviceVariable<System.Double> | s | Sine component |
Rotg(CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>)
This function constructs the Givens rotation matrix G = |c s; -s c| that zeros out the second entry of a 2x1 vector (a, b)^T.
Declaration
public void Rotg(CudaDeviceVariable<float> a, CudaDeviceVariable<float> b, CudaDeviceVariable<float> c, CudaDeviceVariable<float> s)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | a | |
| CudaDeviceVariable<System.Single> | b | |
| CudaDeviceVariable<System.Single> | c | Cosine component |
| CudaDeviceVariable<System.Single> | s | Sine component |
Rotg(ref cuDoubleComplex, ref cuDoubleComplex, ref Double, ref cuDoubleComplex)
This function constructs the Givens rotation matrix G = |c s; -s c| that zeros out the second entry of a 2x1 vector (a, b)^T.
Declaration
public void Rotg(ref cuDoubleComplex a, ref cuDoubleComplex b, ref double c, ref cuDoubleComplex s)
Parameters
| Type | Name | Description |
|---|---|---|
| cuDoubleComplex | a | |
| cuDoubleComplex | b | |
| System.Double | c | Cosine component |
| cuDoubleComplex | s | Sine component |
Rotg(ref cuFloatComplex, ref cuFloatComplex, ref Single, ref cuFloatComplex)
This function constructs the Givens rotation matrix G = |c s; -s c| that zeros out the second entry of a 2x1 vector (a, b)^T.
Declaration
public void Rotg(ref cuFloatComplex a, ref cuFloatComplex b, ref float c, ref cuFloatComplex s)
Parameters
| Type | Name | Description |
|---|---|---|
| cuFloatComplex | a | |
| cuFloatComplex | b | |
| System.Single | c | Cosine component |
| cuFloatComplex | s | Sine component |
Rotg(ref Double, ref Double, ref Double, ref Double)
This function constructs the Givens rotation matrix G = |c s; -s c| that zeros out the second entry of a 2x1 vector (a, b)^T.
Declaration
public void Rotg(ref double a, ref double b, ref double c, ref double s)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Double | a | |
| System.Double | b | |
| System.Double | c | Cosine component |
| System.Double | s | Sine component |
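
To see why such a matrix zeros the second entry, take r = sqrt(a^2 + b^2), c = a/r and s = b/r: then -s*a + c*b = 0 and c*a + s*b = r. The sketch below computes these values on the CPU for real inputs. It is a simplified illustration: the BLAS routine also applies its own sign convention and writes results back into its arguments, which is omitted here, and `RotgReference` and `Construct` are hypothetical names.

```csharp
using System;

public static class RotgReference
{
    // Returns (c, s, r) such that |c s; -s c| * (a, b)^T = (r, 0)^T.
    // Simplified: r is taken as non-negative and no overflow scaling is done.
    public static (double c, double s, double r) Construct(double a, double b)
    {
        double r = Math.Sqrt(a * a + b * b);
        if (r == 0.0)
        {
            // Nothing to rotate: use the identity rotation.
            return (1.0, 0.0, 0.0);
        }
        return (a / r, b / r, r);
    }
}
```
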
Rotg(ref Single, ref Single, ref Single, ref Single)
This function constructs the Givens rotation matrix G = |c s; -s c| that zeros out the second entry of a 2x1 vector (a, b)^T.
Declaration
public void Rotg(ref float a, ref float b, ref float c, ref float s)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Single | a | |
| System.Single | b | |
| System.Single | c | Cosine component |
| System.Single | s | Sine component |
Rotm(CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>)
This function constructs the modified Givens transformation H = |h11 h12; h21 h22| that zeros out the second entry of a 2x1 vector [sqrt(d1)*x1; sqrt(d2)*y1].
The elements h11, h21, h12 and h22 of 2x2 matrix H are stored in param[1], param[2], param[3] and param[4], respectively.
The flag = param[0] defines the following predefined values for the matrix H entries:
flag=-1.0: H = |h11 h12; h21 h22|
flag= 0.0: H = |1.0 h12; h21 1.0|
flag= 1.0: H = |h11 1.0; -1.0 h22|
flag=-2.0: H = |1.0 0.0; 0.0 1.0|
Notice that the values -1.0, 0.0 and 1.0 implied by the flag are not stored in param.
Declaration
public void Rotm(CudaDeviceVariable<double> d1, CudaDeviceVariable<double> d2, CudaDeviceVariable<double> x1, CudaDeviceVariable<double> y1, CudaDeviceVariable<double> param)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | d1 | |
| CudaDeviceVariable<System.Double> | d2 | |
| CudaDeviceVariable<System.Double> | x1 | |
| CudaDeviceVariable<System.Double> | y1 | |
| CudaDeviceVariable<System.Double> | param |
Rotm(CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>)
This function applies the modified Givens transformation H = |h11 h12; h21 h22| to vectors x and y.
The elements h11, h21, h12 and h22 of 2x2 matrix H are stored in param[1], param[2], param[3] and param[4], respectively.
The flag = param[0] defines the following predefined values for the matrix H entries:
flag=-1.0: H = |h11 h12; h21 h22|
flag= 0.0: H = |1.0 h12; h21 1.0|
flag= 1.0: H = |h11 1.0; -1.0 h22|
flag=-2.0: H = |1.0 0.0; 0.0 1.0|
Notice that the values -1.0, 0.0 and 1.0 implied by the flag are not stored in param.
Declaration
public void Rotm(CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> y, int incy, CudaDeviceVariable<double> param)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Double> | y | |
| System.Int32 | incy | |
| CudaDeviceVariable<System.Double> | param |
Rotm(CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, Double[])
This function applies the modified Givens transformation H = |h11 h12; h21 h22| to vectors x and y.
The elements h11, h21, h12 and h22 of 2x2 matrix H are stored in param[1], param[2], param[3] and param[4], respectively.
The flag = param[0] defines the following predefined values for the matrix H entries:
flag=-1.0: H = |h11 h12; h21 h22|
flag= 0.0: H = |1.0 h12; h21 1.0|
flag= 1.0: H = |h11 1.0; -1.0 h22|
flag=-2.0: H = |1.0 0.0; 0.0 1.0|
Notice that the values -1.0, 0.0 and 1.0 implied by the flag are not stored in param.
Declaration
public void Rotm(CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> y, int incy, double[] param)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Double> | y | |
| System.Int32 | incy | |
| System.Double[] | param |
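
The `param` layout described above (flag in `param[0]`, then h11, h21, h12, h22 in column order) can be decoded on the host as sketched below, which is useful when checking what a given `param` array will do. `RotmParam` and `ExpandH` are hypothetical helper names, not part of ManagedCuda; the returned array is row-major, { h11, h12, h21, h22 }.

```csharp
using System;

public static class RotmParam
{
    // Expands a 5-element param array into the full 2x2 matrix H,
    // returned row-major as { h11, h12, h21, h22 }.
    public static double[] ExpandH(double[] param)
    {
        if (param == null || param.Length < 5)
            throw new ArgumentException("param must hold at least 5 values", nameof(param));

        double flag = param[0];
        if (flag == -1.0) // all four entries are read from param
            return new[] { param[1], param[3], param[2], param[4] };
        if (flag == 0.0)  // the diagonal is implied to be 1.0
            return new[] { 1.0, param[3], param[2], 1.0 };
        if (flag == 1.0)  // the off-diagonal is implied to be 1.0 and -1.0
            return new[] { param[1], 1.0, -1.0, param[4] };
        if (flag == -2.0) // identity: no entries are read from param
            return new[] { 1.0, 0.0, 0.0, 1.0 };

        throw new ArgumentException("flag must be -2.0, -1.0, 0.0 or 1.0", nameof(param));
    }
}
```

Entries that the flag implies are ignored even if the array holds other values in those slots.
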
Rotm(CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>)
This function constructs the modified Givens transformation H = |h11 h12; h21 h22| that zeros out the second entry of a 2x1 vector [sqrt(d1)*x1; sqrt(d2)*y1].
The elements h11, h21, h12 and h22 of 2x2 matrix H are stored in param[1], param[2], param[3] and param[4], respectively.
The flag = param[0] defines the following predefined values for the matrix H entries:
flag=-1.0: H = |h11 h12; h21 h22|
flag= 0.0: H = |1.0 h12; h21 1.0|
flag= 1.0: H = |h11 1.0; -1.0 h22|
flag=-2.0: H = |1.0 0.0; 0.0 1.0|
Notice that the values -1.0, 0.0 and 1.0 implied by the flag are not stored in param.
Declaration
public void Rotm(CudaDeviceVariable<float> d1, CudaDeviceVariable<float> d2, CudaDeviceVariable<float> x1, CudaDeviceVariable<float> y1, CudaDeviceVariable<float> param)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | d1 | |
| CudaDeviceVariable<System.Single> | d2 | |
| CudaDeviceVariable<System.Single> | x1 | |
| CudaDeviceVariable<System.Single> | y1 | |
| CudaDeviceVariable<System.Single> | param |
Rotm(CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>)
This function applies the modified Givens transformation H = |h11 h12; h21 h22| to vectors x and y.
The elements h11, h21, h12 and h22 of 2x2 matrix H are stored in param[1], param[2], param[3] and param[4], respectively.
The flag = param[0] defines the following predefined values for the matrix H entries:
flag=-1.0: H = |h11 h12; h21 h22|
flag= 0.0: H = |1.0 h12; h21 1.0|
flag= 1.0: H = |h11 1.0; -1.0 h22|
flag=-2.0: H = |1.0 0.0; 0.0 1.0|
Notice that the values -1.0, 0.0 and 1.0 implied by the flag are not stored in param.
Declaration
public void Rotm(CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> y, int incy, CudaDeviceVariable<float> param)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Single> | y | |
| System.Int32 | incy | |
| CudaDeviceVariable<System.Single> | param |
Rotm(CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, Single[])
This function applies the modified Givens transformation H = |h11 h12; h21 h22| to vectors x and y.
The elements h11, h21, h12 and h22 of 2x2 matrix H are stored in param[1], param[2], param[3] and param[4], respectively.
The flag = param[0] defines the following predefined values for the matrix H entries:
flag=-1.0: H = |h11 h12; h21 h22|
flag= 0.0: H = |1.0 h12; h21 1.0|
flag= 1.0: H = |h11 1.0; -1.0 h22|
flag=-2.0: H = |1.0 0.0; 0.0 1.0|
Notice that the values -1.0, 0.0 and 1.0 implied by the flag are not stored in param.
Declaration
public void Rotm(CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> y, int incy, float[] param)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx | |
| CudaDeviceVariable<System.Single> | y | |
| System.Int32 | incy | |
| System.Single[] | param |
Rotm(ref Double, ref Double, ref Double, Double, Double[])
This function constructs the modified Givens transformation H = |h11 h12; h21 h22| that zeros out the second entry of a 2x1 vector [sqrt(d1)*x1; sqrt(d2)*y1].
The elements h11, h21, h12 and h22 of 2x2 matrix H are stored in param[1], param[2], param[3] and param[4], respectively.
The flag = param[0] defines the following predefined values for the matrix H entries:
flag=-1.0: H = |h11 h12; h21 h22|
flag= 0.0: H = |1.0 h12; h21 1.0|
flag= 1.0: H = |h11 1.0; -1.0 h22|
flag=-2.0: H = |1.0 0.0; 0.0 1.0|
Notice that the values -1.0, 0.0 and 1.0 implied by the flag are not stored in param.
Declaration
public void Rotm(ref double d1, ref double d2, ref double x1, double y1, double[] param)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Double | d1 | |
| System.Double | d2 | |
| System.Double | x1 | |
| System.Double | y1 | |
| System.Double[] | param |
Rotm(ref Single, ref Single, ref Single, Single, Single[])
This function constructs the modified Givens transformation H = |h11 h12; h21 h22| that zeros out the second entry of a 2x1 vector [sqrt(d1)*x1; sqrt(d2)*y1].
The elements h11, h21, h12 and h22 of 2x2 matrix H are stored in param[1], param[2], param[3] and param[4], respectively.
The flag = param[0] defines the following predefined values for the matrix H entries:
flag=-1.0: H = |h11 h12; h21 h22|
flag= 0.0: H = |1.0 h12; h21 1.0|
flag= 1.0: H = |h11 1.0; -1.0 h22|
flag=-2.0: H = |1.0 0.0; 0.0 1.0|
Notice that the values -1.0, 0.0 and 1.0 implied by the flag are not stored in param.
Declaration
public void Rotm(ref float d1, ref float d2, ref float x1, float y1, float[] param)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Single | d1 | |
| System.Single | d2 | |
| System.Single | x1 | |
| System.Single | y1 | |
| System.Single[] | param |
Sbmv(FillMode, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32)
This function performs the symmetric banded matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n symmetric matrix with k subdiagonals and superdiagonals, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Sbmv(FillMode uplo, int k, CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> beta, CudaDeviceVariable<double> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | k | number of sub- and super-diagonals of matrix A. |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n, with lda >= k + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Double> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Double> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
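
The banded layout of A can be easier to follow with a CPU reference. The sketch below assumes the standard BLAS lower band layout, in which element A(i, j) with i >= j sits in row i - j of column j of a column-major array with leading dimension lda. `SbmvReference` is a hypothetical helper working on plain arrays with unit strides, not part of ManagedCuda.

```csharp
using System;

public static class SbmvReference
{
    // Reads A(i, j) (0-based) from lower band storage: the main diagonal is
    // row 0 of the band array, the first subdiagonal is row 1, and so on.
    public static double GetLower(double[] band, int lda, int k, int i, int j)
    {
        if (i < j) { int t = i; i = j; j = t; }   // A is symmetric
        return (i - j > k) ? 0.0 : band[(i - j) + j * lda];
    }

    // Computes y = alpha * A * x + beta * y for unit strides.
    public static void Multiply(int n, int k, double alpha, double[] band, int lda,
                                double[] x, double beta, double[] y)
    {
        for (int i = 0; i < n; i++)
        {
            double sum = 0.0;
            for (int j = Math.Max(0, i - k); j <= Math.Min(n - 1, i + k); j++)
                sum += GetLower(band, lda, k, i, j) * x[j];
            // When beta is zero, y is treated as output only.
            y[i] = (beta == 0.0) ? alpha * sum : alpha * sum + beta * y[i];
        }
    }
}
```

For the 3x3 tridiagonal matrix with 2 on the diagonal and 1 next to it (k = 1, lda = 2), the band array is { 2, 1, 2, 1, 2, 0 }; the last slot is padding that is never read.
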
Sbmv(FillMode, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32)
This function performs the symmetric banded matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n symmetric matrix with k subdiagonals and superdiagonals, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Sbmv(FillMode uplo, int k, CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> beta, CudaDeviceVariable<float> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | k | number of sub- and super-diagonals of matrix A. |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n, with lda >= k + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Single> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Single> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Sbmv(FillMode, Int32, Double, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, Double, CudaDeviceVariable<Double>, Int32)
This function performs the symmetric banded matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n symmetric matrix with k subdiagonals and superdiagonals, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Sbmv(FillMode uplo, int k, double alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> x, int incx, double beta, CudaDeviceVariable<double> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | k | number of sub- and super-diagonals of matrix A. |
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n, with lda >= k + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| System.Double | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Double> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Sbmv(FillMode, Int32, Single, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, Single, CudaDeviceVariable<Single>, Int32)
This function performs the symmetric banded matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n symmetric matrix with k subdiagonals and superdiagonals, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Sbmv(FillMode uplo, int k, float alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> x, int incx, float beta, CudaDeviceVariable<float> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | k | number of sub- and super-diagonals of matrix A. |
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n, with lda >= k + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| System.Single | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Single> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Scale(CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function scales the vector x by the scalar and overwrites it with the result.
Declaration
public void Scale(CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | alpha | |
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx |
Scale(CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function scales the vector x by the scalar and overwrites it with the result.
Declaration
public void Scale(CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | alpha | |
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx |
Scale(CudaDeviceVariable<Double>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function scales the vector x by the scalar and overwrites it with the result.
Declaration
public void Scale(CudaDeviceVariable<double> alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | alpha | |
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx |
Scale(CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32)
This function scales the vector x by the scalar and overwrites it with the result.
Declaration
public void Scale(CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | alpha | |
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx |
Scale(CudaDeviceVariable<Single>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function scales the vector x by the scalar and overwrites it with the result.
Declaration
public void Scale(CudaDeviceVariable<float> alpha, CudaDeviceVariable<cuFloatComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | alpha | |
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx |
Scale(CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32)
This function scales the vector x by the scalar and overwrites it with the result.
Declaration
public void Scale(CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | alpha | |
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx |
Scale(cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function scales the vector x by the scalar and overwrites it with the result.
Declaration
public void Scale(cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| cuDoubleComplex | alpha | |
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx |
Scale(cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32)
This function scales the vector x by the scalar and overwrites it with the result.
Declaration
public void Scale(cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| cuFloatComplex | alpha | |
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx |
Scale(Double, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function scales the vector x by the scalar and overwrites it with the result.
Declaration
public void Scale(double alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Double | alpha | |
| CudaDeviceVariable<cuDoubleComplex> | x | |
| System.Int32 | incx |
Scale(Double, CudaDeviceVariable<Double>, Int32)
This function scales the vector x by the scalar and overwrites it with the result.
Declaration
public void Scale(double alpha, CudaDeviceVariable<double> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Double | alpha | |
| CudaDeviceVariable<System.Double> | x | |
| System.Int32 | incx |
Scale(Single, CudaDeviceVariable<cuFloatComplex>, Int32)
This function scales the vector x by the scalar and overwrites it with the result.
Declaration
public void Scale(float alpha, CudaDeviceVariable<cuFloatComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Single | alpha | |
| CudaDeviceVariable<cuFloatComplex> | x | |
| System.Int32 | incx |
Scale(Single, CudaDeviceVariable<Single>, Int32)
This function scales the vector x by the scalar and overwrites it with the result.
Declaration
public void Scale(float alpha, CudaDeviceVariable<float> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Single | alpha | |
| CudaDeviceVariable<System.Single> | x | |
| System.Int32 | incx |
SetMatrix<T>(Int32, Int32, T[], Int32, CudaDeviceVariable<T>, Int32)
Copies a tile of rows x cols elements from a matrix hostSource in CPU memory space to a matrix devDest in GPU memory space. Both matrices are assumed to be stored in column-major format, with the leading dimension (i.e. number of rows) of source matrix hostSource provided in ldHostSource, and the leading dimension of matrix devDest provided in ldDevDest.
Declaration
public static void SetMatrix<T>(int rows, int cols, T[] hostSource, int ldHostSource, CudaDeviceVariable<T> devDest, int ldDevDest)
where T : struct
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | rows | |
| System.Int32 | cols | |
| T[] | hostSource | |
| System.Int32 | ldHostSource | |
| CudaDeviceVariable<T> | devDest | |
| System.Int32 | ldDevDest |
Type Parameters
| Name | Description |
|---|---|
| T |
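
The role of the two leading dimensions is easiest to see in the index arithmetic. The following sketch copies a tile between two plain host arrays; `TileCopyReference.CopyTile` is a hypothetical CPU-only helper that mirrors the parameter order of SetMatrix, and it only illustrates the column-major addressing, not the actual host-to-device transfer.

```csharp
public static class TileCopyReference
{
    // Copies a rows x cols tile between two column-major buffers.
    // Element (r, c) lives at index r + c * leadingDimension in each buffer.
    public static void CopyTile<T>(int rows, int cols, T[] source, int ldSource,
                                   T[] dest, int ldDest) where T : struct
    {
        for (int c = 0; c < cols; c++)
            for (int r = 0; r < rows; r++)
                dest[r + c * ldDest] = source[r + c * ldSource];
    }
}
```

Copying the top-left 2 x 2 tile of a 3 x 2 matrix stored as { 1, 2, 3, 4, 5, 6 } (ldSource = 3) into a buffer with ldDest = 2 skips the third row of each source column.
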
SetMatrixAsync<T>(Int32, Int32, T[], Int32, CudaDeviceVariable<T>, Int32, CUstream)
Asynchronously copies, using the given stream, a tile of rows x cols elements from a matrix hostSource in CPU memory space to a matrix devDest in GPU memory space. Both matrices are assumed to be stored in column-major format, with the leading dimension (i.e. number of rows) of source matrix hostSource provided in ldHostSource, and the leading dimension of matrix devDest provided in ldDevDest.
Declaration
public static void SetMatrixAsync<T>(int rows, int cols, T[] hostSource, int ldHostSource, CudaDeviceVariable<T> devDest, int ldDevDest, CUstream stream)
where T : struct
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | rows | |
| System.Int32 | cols | |
| T[] | hostSource | |
| System.Int32 | ldHostSource | |
| CudaDeviceVariable<T> | devDest | |
| System.Int32 | ldDevDest | |
| CUstream | stream |
Type Parameters
| Name | Description |
|---|---|
| T |
SetVector<T>(T[], Int32, CudaDeviceVariable<T>, Int32)
Copies elements from a vector hostSourceVector in CPU memory space to a vector devDestVector in GPU memory space. Storage spacing between consecutive elements is incrHostSource for the source vector hostSourceVector and incrDevDest for the destination vector devDestVector. Column-major format for two-dimensional matrices is assumed throughout CUBLAS. Therefore, an increment equal to 1 accesses a column vector, while an increment equal to the leading dimension of the respective matrix accesses a row vector.
Declaration
public static void SetVector<T>(T[] hostSourceVector, int incrHostSource, CudaDeviceVariable<T> devDestVector, int incrDevDest)
where T : struct
Parameters
| Type | Name | Description |
|---|---|---|
| T[] | hostSourceVector | Source vector in host memory |
| System.Int32 | incrHostSource | |
| CudaDeviceVariable<T> | devDestVector | Destination vector in device memory |
| System.Int32 | incrDevDest |
Type Parameters
| Name | Description |
|---|---|
| T | Vector datatype |
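
The remark about increments and row vectors can be checked with a small CPU sketch. `StridedCopyReference.CopyStrided` is a hypothetical helper on plain arrays, and the explicit element count `n` is an assumed parameter that SetVector itself does not take.

```csharp
public static class StridedCopyReference
{
    // Copies n elements, stepping incSource through the source array and
    // incDest through the destination array.
    public static void CopyStrided<T>(int n, T[] source, int incSource,
                                      T[] dest, int incDest) where T : struct
    {
        for (int i = 0; i < n; i++)
            dest[i * incDest] = source[i * incSource];
    }
}
```

For a 2 x 3 column-major matrix stored as { 1, 2, 3, 4, 5, 6 }, a source increment of 1 reads down the first column (1, 2), while a source increment equal to the leading dimension 2 reads along the first row (1, 3, 5).
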
SetVectorAsync<T>(T[], Int32, CudaDeviceVariable<T>, Int32, CUstream)
Asynchronously copies, using the given stream, elements from a vector hostSourceVector in CPU memory space to a vector devDestVector in GPU memory space. Storage spacing between consecutive elements is incrHostSource for the source vector hostSourceVector and incrDevDest for the destination vector devDestVector. Column-major format for two-dimensional matrices is assumed throughout CUBLAS. Therefore, an increment equal to 1 accesses a column vector, while an increment equal to the leading dimension of the respective matrix accesses a row vector.
Declaration
public static void SetVectorAsync<T>(T[] hostSourceVector, int incrHostSource, CudaDeviceVariable<T> devDestVector, int incrDevDest, CUstream stream)
where T : struct
Parameters
| Type | Name | Description |
|---|---|---|
| T[] | hostSourceVector | Source vector in host memory |
| System.Int32 | incrHostSource | |
| CudaDeviceVariable<T> | devDestVector | Destination vector in device memory |
| System.Int32 | incrDevDest | |
| CUstream | stream |
Type Parameters
| Name | Description |
|---|---|
| T | Vector datatype |
Spmv(FillMode, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32)
This function performs the symmetric packed matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n symmetric matrix stored in packed format, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Spmv(FillMode uplo, CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> AP, CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> beta, CudaDeviceVariable<double> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | AP | array with A stored in packed format. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Double> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Double> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
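
Packed format stores only one triangle of A, column by column, in n * (n + 1) / 2 elements. The sketch below assumes the standard BLAS column-major packed layout and uses 0-based indices; `PackedSymmetric` is a hypothetical CPU helper with unit strides, not part of ManagedCuda.

```csharp
using System;

public static class PackedSymmetric
{
    // Index of A(i, j) with i >= j in lower packed storage.
    public static int LowerIndex(int n, int i, int j) => i + j * (2 * n - j - 1) / 2;

    // Index of A(i, j) with i <= j in upper packed storage.
    public static int UpperIndex(int i, int j) => i + j * (j + 1) / 2;

    // Reads any A(i, j), using symmetry for the triangle that is not stored.
    public static double Get(double[] ap, int n, bool lower, int i, int j)
    {
        int lo = Math.Min(i, j), hi = Math.Max(i, j);
        return lower ? ap[LowerIndex(n, hi, lo)] : ap[UpperIndex(lo, hi)];
    }

    // Computes y = alpha * A * x + beta * y for unit strides.
    public static void Multiply(int n, bool lower, double alpha, double[] ap,
                                double[] x, double beta, double[] y)
    {
        for (int i = 0; i < n; i++)
        {
            double sum = 0.0;
            for (int j = 0; j < n; j++)
                sum += Get(ap, n, lower, i, j) * x[j];
            // When beta is zero, y is treated as output only.
            y[i] = (beta == 0.0) ? alpha * sum : alpha * sum + beta * y[i];
        }
    }
}
```

For the symmetric matrix with rows (1, 2, 3), (2, 4, 5), (3, 5, 6), the lower packed array is { 1, 2, 3, 4, 5, 6 } and the upper packed array is { 1, 2, 4, 3, 5, 6 }.
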
Spmv(FillMode, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32)
This function performs the symmetric packed matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n symmetric matrix stored in packed format, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Spmv(FillMode uplo, CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> AP, CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> beta, CudaDeviceVariable<float> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | AP | array with A stored in packed format. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Single> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Single> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Spmv(FillMode, Double, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, Double, CudaDeviceVariable<Double>, Int32)
This function performs the symmetric packed matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n symmetric matrix stored in packed format, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Spmv(FillMode uplo, double alpha, CudaDeviceVariable<double> AP, CudaDeviceVariable<double> x, int incx, double beta, CudaDeviceVariable<double> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | AP | array with A stored in packed format. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| System.Double | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Double> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Spmv(FillMode, Single, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, Single, CudaDeviceVariable<Single>, Int32)
This function performs the symmetric packed matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n symmetric matrix stored in packed format, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Spmv(FillMode uplo, float alpha, CudaDeviceVariable<float> AP, CudaDeviceVariable<float> x, int incx, float beta, CudaDeviceVariable<float> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | AP | array with A stored in packed format. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| System.Single | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Single> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Spr(FillMode, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>)
This function performs the symmetric rank-1 update A = alpha * x * x^T + A, where A is an n*n symmetric matrix stored in packed format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Spr(FillMode uplo, CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Double> | AP | array with A stored in packed format. |
Spr(FillMode, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>)
This function performs the symmetric rank-1 update A = alpha * x * x^T + A, where A is an n*n symmetric matrix stored in packed format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Spr(FillMode uplo, CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Single> | AP | array with A stored in packed format. |
Spr(FillMode, Double, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>)
This function performs the symmetric rank-1 update A = alpha * x * x^T + A, where A is an n*n symmetric matrix stored in packed format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Spr(FillMode uplo, double alpha, CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Double> | AP | array with A stored in packed format. |
Spr(FillMode, Single, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>)
This function performs the symmetric rank-1 update A = alpha * x * x^T + A, where A is an n*n symmetric matrix stored in packed format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Spr(FillMode uplo, float alpha, CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Single> | AP | array with A stored in packed format. |
Spr2(FillMode, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>)
This function performs the packed symmetric rank-2 update A = alpha * (x * y^T + y * x^T) + A, where A is an n*n symmetric matrix stored in packed format, x and y are vectors, and alpha is a scalar. n is given by x.Size = y.Size.
Declaration
public void Spr2(FillMode uplo, CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> y, int incy, CudaDeviceVariable<double> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Double> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<System.Double> | AP | array with A stored in packed format. |
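
The rank-2 update touches only the stored triangle. As a CPU illustration, assuming the standard BLAS lower packed layout (columns of the lower triangle stored one after another) and unit strides, the update can be written with a single running index. `Spr2Reference` is a hypothetical helper on plain arrays, not part of ManagedCuda.

```csharp
public static class Spr2Reference
{
    // Performs A = alpha * (x * y^T + y * x^T) + A on the lower packed array ap.
    public static void UpdateLower(int n, double alpha, double[] x, double[] y, double[] ap)
    {
        int p = 0; // walks ap column by column: (0,0), (1,0), ..., (n-1,0), (1,1), ...
        for (int j = 0; j < n; j++)
            for (int i = j; i < n; i++)
                ap[p++] += alpha * (x[i] * y[j] + y[i] * x[j]);
    }
}
```

Because x * y^T + y * x^T is symmetric, updating one triangle is enough to describe the whole result.
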
Spr2(FillMode, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>)
This function performs the packed symmetric rank-2 update A = alpha * (x * y^T + y * x^T) + A, where A is an n*n symmetric matrix stored in packed format, x and y are vectors, and alpha is a scalar. n is given by x.Size = y.Size.
Declaration
public void Spr2(FillMode uplo, CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> y, int incy, CudaDeviceVariable<float> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Single> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<System.Single> | AP | array with A stored in packed format. |
Spr2(FillMode, Double, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>)
This function performs the packed symmetric rank-2 update A = alpha * (x * y^T + y * x^T) + A, where A is an n*n symmetric matrix stored in packed format, x and y are vectors, and alpha is a scalar. n is given by x.Size = y.Size.
Declaration
public void Spr2(FillMode uplo, double alpha, CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> y, int incy, CudaDeviceVariable<double> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Double> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<System.Double> | AP | array with A stored in packed format. |
Spr2(FillMode, Single, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>)
This function performs the packed symmetric rank-2 update A = alpha * (x * y^T + y * x^T) + A, where A is an n*n symmetric matrix stored in packed format, x and y are vectors, and alpha is a scalar. n is given by x.Size = y.Size.
Declaration
public void Spr2(FillMode uplo, float alpha, CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> y, int incy, CudaDeviceVariable<float> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Single> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<System.Single> | AP | array with A stored in packed format. |
Stpttr(FillMode, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32)
This function performs the conversion from the triangular packed format to the triangular format.
If uplo == CUBLAS_FILL_MODE_LOWER then the elements of AP are copied into the lower triangular part of the triangular matrix A and the upper part of A is left untouched.
If uplo == CUBLAS_FILL_MODE_UPPER then the elements of AP are copied into the upper triangular part of the triangular matrix A and the lower part of A is left untouched.
Declaration
public void Stpttr(FillMode uplo, int n, CudaDeviceVariable<float> AP, CudaDeviceVariable<float> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix AP contains lower or upper part of matrix A. |
| System.Int32 | n | number of rows and columns of matrix A. |
| CudaDeviceVariable<System.Single> | AP | array with A stored in packed format. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda x n, with lda >= max(1,n). The opposite side of A is left untouched. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
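As a rough illustration of the conversion for the lower fill mode, the following CPU-side sketch copies a packed array into the lower triangle of a column-major matrix. It assumes column-by-column packing and is not the library's code; `StpttrReference` and `UnpackLower` are hypothetical names.

```csharp
public static class StpttrReference
{
    // Copies the lower-packed array ap into the lower triangle of the
    // column-major matrix a (leading dimension lda). Entries above the
    // diagonal are not written.
    public static void UnpackLower(int n, float[] ap, float[] a, int lda)
    {
        int k = 0; // running position in the packed array
        for (int j = 0; j < n; j++)
            for (int i = j; i < n; i++)
                a[i + j * lda] = ap[k++];
    }
}
```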
Strttp(FillMode, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>)
This function performs the conversion from the triangular format to the triangular packed format.
If uplo == CUBLAS_FILL_MODE_LOWER then the lower triangular part of the triangular matrix A is copied into the array AP.
If uplo == CUBLAS_FILL_MODE_UPPER then the upper triangular part of the triangular matrix A is copied into the array AP.
Declaration
public void Strttp(FillMode uplo, int n, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is referenced. |
| System.Int32 | n | number of rows and columns of matrix A. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda x n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | AP | array with A stored in packed format. |
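The reverse direction can be sketched the same way, here for the upper fill mode. This is a CPU-side reference under the assumption of column-by-column packing, not the library's implementation; `StrttpReference` and `PackUpper` are hypothetical names.

```csharp
public static class StrttpReference
{
    // Packs the upper triangle of the column-major matrix a (leading
    // dimension lda) into a new array of n * (n + 1) / 2 elements.
    public static float[] PackUpper(int n, float[] a, int lda)
    {
        var ap = new float[n * (n + 1) / 2];
        int k = 0; // running position in the packed array
        for (int j = 0; j < n; j++)
            for (int i = 0; i <= j; i++)
                ap[k++] = a[i + j * lda];
        return ap;
    }
}
```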
Swap(CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
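This function interchanges the elements of vectors x and y.
Declaration
public void Swap(CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleComplex> | x | vector whose elements are exchanged with those of y. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector whose elements are exchanged with those of x. |
| System.Int32 | incy | stride between consecutive elements of y. |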
Swap(CudaDeviceVariable<cuDoubleReal>, Int32, CudaDeviceVariable<cuDoubleReal>, Int32)
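This function interchanges the elements of vectors x and y.
Declaration
public void Swap(CudaDeviceVariable<cuDoubleReal> x, int incx, CudaDeviceVariable<cuDoubleReal> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuDoubleReal> | x | vector whose elements are exchanged with those of y. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleReal> | y | vector whose elements are exchanged with those of x. |
| System.Int32 | incy | stride between consecutive elements of y. |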
Swap(CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
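This function interchanges the elements of vectors x and y.
Declaration
public void Swap(CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatComplex> | x | vector whose elements are exchanged with those of y. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | y | vector whose elements are exchanged with those of x. |
| System.Int32 | incy | stride between consecutive elements of y. |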
Swap(CudaDeviceVariable<cuFloatReal>, Int32, CudaDeviceVariable<cuFloatReal>, Int32)
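This function interchanges the elements of vectors x and y.
Declaration
public void Swap(CudaDeviceVariable<cuFloatReal> x, int incx, CudaDeviceVariable<cuFloatReal> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<cuFloatReal> | x | vector whose elements are exchanged with those of y. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatReal> | y | vector whose elements are exchanged with those of x. |
| System.Int32 | incy | stride between consecutive elements of y. |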
Swap(CudaDeviceVariable<double1>, Int32, CudaDeviceVariable<double1>, Int32)
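This function interchanges the elements of vectors x and y.
Declaration
public void Swap(CudaDeviceVariable<double1> x, int incx, CudaDeviceVariable<double1> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<double1> | x | vector whose elements are exchanged with those of y. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<double1> | y | vector whose elements are exchanged with those of x. |
| System.Int32 | incy | stride between consecutive elements of y. |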
Swap(CudaDeviceVariable<float1>, Int32, CudaDeviceVariable<float1>, Int32)
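This function interchanges the elements of vectors x and y.
Declaration
public void Swap(CudaDeviceVariable<float1> x, int incx, CudaDeviceVariable<float1> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<float1> | x | vector whose elements are exchanged with those of y. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<float1> | y | vector whose elements are exchanged with those of x. |
| System.Int32 | incy | stride between consecutive elements of y. |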
Swap(CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
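This function interchanges the elements of vectors x and y.
Declaration
public void Swap(CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Double> | x | vector whose elements are exchanged with those of y. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Double> | y | vector whose elements are exchanged with those of x. |
| System.Int32 | incy | stride between consecutive elements of y. |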
Swap(CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
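This function interchanges the elements of vectors x and y.
Declaration
public void Swap(CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| CudaDeviceVariable<System.Single> | x | vector whose elements are exchanged with those of y. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Single> | y | vector whose elements are exchanged with those of x. |
| System.Int32 | incy | stride between consecutive elements of y. |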
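To make the role of the strides concrete, here is a small CPU-side sketch of a strided swap. It is only a reference for the semantics. The explicit element count `n` is a hypothetical parameter that the overloads above do not take, positive strides are assumed, and `SwapReference` is a hypothetical name.

```csharp
public static class SwapReference
{
    // Exchanges n logical elements: the k-th element of x is read at
    // x[k * incx], and the k-th element of y at y[k * incy].
    public static void Swap(int n, float[] x, int incx, float[] y, int incy)
    {
        for (int k = 0; k < n; k++)
        {
            float tmp = x[k * incx];
            x[k * incx] = y[k * incy];
            y[k * incy] = tmp;
        }
    }
}
```

With incx = 2, only every second entry of x takes part in the exchange; the entries in between keep their values.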
Symm(SideMode, FillMode, Int32, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the symmetric matrix-matrix multiplication C = alpha * A * B + beta * C if side == SideMode.Left, or C = alpha * B * A + beta * C if side == SideMode.Right, where A is a symmetric matrix stored in lower or upper mode, B and C are m*n matrices, and alpha and beta are scalars.
Declaration
public void Symm(SideMode side, FillMode uplo, int m, int n, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb, CudaDeviceVariable<cuDoubleComplex> beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | m | number of rows of matrix C and B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix C and B, with matrix A sized accordingly. |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * m if side == SideMode.Left, or lda * n if side == SideMode.Right. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<cuDoubleComplex> | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Symm(SideMode, FillMode, Int32, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the symmetric matrix-matrix multiplication C = alpha * A * B + beta * C if side == SideMode.Left, or C = alpha * B * A + beta * C if side == SideMode.Right, where A is a symmetric matrix stored in lower or upper mode, B and C are m*n matrices, and alpha and beta are scalars.
Declaration
public void Symm(SideMode side, FillMode uplo, int m, int n, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb, CudaDeviceVariable<cuFloatComplex> beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | m | number of rows of matrix C and B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix C and B, with matrix A sized accordingly. |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * m if side == SideMode.Left, or lda * n if side == SideMode.Right. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<cuFloatComplex> | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Symm(SideMode, FillMode, Int32, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32)
This function performs the symmetric matrix-matrix multiplication C = alpha * A * B + beta * C if side == SideMode.Left, or C = alpha * B * A + beta * C if side == SideMode.Right, where A is a symmetric matrix stored in lower or upper mode, B and C are m*n matrices, and alpha and beta are scalars.
Declaration
public void Symm(SideMode side, FillMode uplo, int m, int n, CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> B, int ldb, CudaDeviceVariable<double> beta, CudaDeviceVariable<double> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | m | number of rows of matrix C and B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix C and B, with matrix A sized accordingly. |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * m if side == SideMode.Left, or lda * n if side == SideMode.Right. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<System.Double> | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Symm(SideMode, FillMode, Int32, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32)
This function performs the symmetric matrix-matrix multiplication C = alpha * A * B + beta * C if side == SideMode.Left, or C = alpha * B * A + beta * C if side == SideMode.Right, where A is a symmetric matrix stored in lower or upper mode, B and C are m*n matrices, and alpha and beta are scalars.
Declaration
public void Symm(SideMode side, FillMode uplo, int m, int n, CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> B, int ldb, CudaDeviceVariable<float> beta, CudaDeviceVariable<float> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | m | number of rows of matrix C and B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix C and B, with matrix A sized accordingly. |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * m if side == SideMode.Left, or lda * n if side == SideMode.Right. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<System.Single> | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Symm(SideMode, FillMode, Int32, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the symmetric matrix-matrix multiplication C = alpha * A * B + beta * C if side == SideMode.Left, or C = alpha * B * A + beta * C if side == SideMode.Right, where A is a symmetric matrix stored in lower or upper mode, B and C are m*n matrices, and alpha and beta are scalars.
Declaration
public void Symm(SideMode side, FillMode uplo, int m, int n, cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb, cuDoubleComplex beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | m | number of rows of matrix C and B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix C and B, with matrix A sized accordingly. |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * m if side == SideMode.Left, or lda * n if side == SideMode.Right. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| cuDoubleComplex | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Symm(SideMode, FillMode, Int32, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the symmetric matrix-matrix multiplication C = alpha * A * B + beta * C if side == SideMode.Left, or C = alpha * B * A + beta * C if side == SideMode.Right, where A is a symmetric matrix stored in lower or upper mode, B and C are m*n matrices, and alpha and beta are scalars.
Declaration
public void Symm(SideMode side, FillMode uplo, int m, int n, cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb, cuFloatComplex beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | m | number of rows of matrix C and B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix C and B, with matrix A sized accordingly. |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * m if side == SideMode.Left, or lda * n if side == SideMode.Right. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| cuFloatComplex | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Symm(SideMode, FillMode, Int32, Int32, Double, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, Double, CudaDeviceVariable<Double>, Int32)
This function performs the symmetric matrix-matrix multiplication C = alpha * A * B + beta * C if side == SideMode.Left, or C = alpha * B * A + beta * C if side == SideMode.Right, where A is a symmetric matrix stored in lower or upper mode, B and C are m*n matrices, and alpha and beta are scalars.
Declaration
public void Symm(SideMode side, FillMode uplo, int m, int n, double alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> B, int ldb, double beta, CudaDeviceVariable<double> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | m | number of rows of matrix C and B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix C and B, with matrix A sized accordingly. |
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * m if side == SideMode.Left, or lda * n if side == SideMode.Right. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| System.Double | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Symm(SideMode, FillMode, Int32, Int32, Single, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, Single, CudaDeviceVariable<Single>, Int32)
This function performs the symmetric matrix-matrix multiplication C = alpha * A * B + beta * C if side == SideMode.Left, or C = alpha * B * A + beta * C if side == SideMode.Right, where A is a symmetric matrix stored in lower or upper mode, B and C are m*n matrices, and alpha and beta are scalars.
Declaration
public void Symm(SideMode side, FillMode uplo, int m, int n, float alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> B, int ldb, float beta, CudaDeviceVariable<float> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| System.Int32 | m | number of rows of matrix C and B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix C and B, with matrix A sized accordingly. |
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * m if side == SideMode.Left, or lda * n if side == SideMode.Right. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| System.Single | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
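The semantics of the side == SideMode.Left case with a lower-stored A can be sketched on the CPU as follows. This is a plain reference for checking small results, assuming column-major storage; it is not how the library computes the product. `SymmReference`, `SymLower`, and `SymmLeftLower` are hypothetical names.

```csharp
public static class SymmReference
{
    // Reads A(i, j) of a symmetric matrix from its stored lower triangle;
    // entries above the diagonal are mirrored and never read from memory.
    static float SymLower(float[] a, int lda, int i, int j) =>
        i >= j ? a[i + j * lda] : a[j + i * lda];

    // C = alpha * A * B + beta * C, where A is m*m symmetric (lower part
    // stored) and B and C are m*n, all column-major.
    public static void SymmLeftLower(int m, int n, float alpha, float[] a, int lda,
                                     float[] b, int ldb, float beta, float[] c, int ldc)
    {
        for (int j = 0; j < n; j++)
            for (int i = 0; i < m; i++)
            {
                float sum = 0f;
                for (int k = 0; k < m; k++)
                    sum += SymLower(a, lda, i, k) * b[k + j * ldb];
                c[i + j * ldc] = alpha * sum + beta * c[i + j * ldc];
            }
    }
}
```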
Symv(FillMode, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the symmetric matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n symmetric matrix stored in lower or upper mode, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Symv(FillMode uplo, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> beta, CudaDeviceVariable<cuDoubleComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Symv(FillMode, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the symmetric matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n symmetric matrix stored in lower or upper mode, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Symv(FillMode uplo, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> beta, CudaDeviceVariable<cuFloatComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Symv(FillMode, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32)
This function performs the symmetric matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n symmetric matrix stored in lower or upper mode, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Symv(FillMode uplo, CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> beta, CudaDeviceVariable<double> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Double> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Double> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Symv(FillMode, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32)
This function performs the symmetric matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n symmetric matrix stored in lower or upper mode, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Symv(FillMode uplo, CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> beta, CudaDeviceVariable<float> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Single> | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Single> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Symv(FillMode, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the symmetric matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n symmetric matrix stored in lower or upper mode, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Symv(FillMode uplo, cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> x, int incx, cuDoubleComplex beta, CudaDeviceVariable<cuDoubleComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| cuDoubleComplex | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Symv(FillMode, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the symmetric matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n symmetric matrix stored in lower or upper mode, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Symv(FillMode uplo, cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> x, int incx, cuFloatComplex beta, CudaDeviceVariable<cuFloatComplex> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| cuFloatComplex | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Symv(FillMode, Double, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, Double, CudaDeviceVariable<Double>, Int32)
This function performs the symmetric matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n symmetric matrix stored in lower or upper mode, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Symv(FillMode uplo, double alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> x, int incx, double beta, CudaDeviceVariable<double> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| System.Double | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Double> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
Symv(FillMode, Single, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, Single, CudaDeviceVariable<Single>, Int32)
This function performs the symmetric matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n*n symmetric matrix stored in lower or upper mode, x and y are vectors, and alpha and beta are scalars. n is given by x.Size.
Declaration
public void Symv(FillMode uplo, float alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> x, int incx, float beta, CudaDeviceVariable<float> y, int incy)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| System.Single | beta | scalar used for multiplication, if beta==0 then y does not have to be a valid input. |
| CudaDeviceVariable<System.Single> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
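A CPU-side sketch of the same operation for an upper-stored A may help when validating results on small inputs. It assumes column-major storage and positive strides, and it takes `n` explicitly, which the overloads above derive from x.Size. `SymvReference` and `SymvUpper` are hypothetical names, and this is not the library's implementation.

```csharp
public static class SymvReference
{
    // y = alpha * A * x + beta * y for an n*n symmetric A whose upper
    // triangle is stored column-major with leading dimension lda.
    public static void SymvUpper(int n, float alpha, float[] a, int lda,
                                 float[] x, int incx, float beta, float[] y, int incy)
    {
        for (int i = 0; i < n; i++)
        {
            float sum = 0f;
            for (int j = 0; j < n; j++)
            {
                // Below the diagonal, mirror the index into the stored upper triangle.
                float aij = i <= j ? a[i + j * lda] : a[j + i * lda];
                sum += aij * x[j * incx];
            }
            // When beta == 0 the old value of y is never read, so it may be invalid.
            float prior = beta == 0f ? 0f : beta * y[i * incy];
            y[i * incy] = alpha * sum + prior;
        }
    }
}
```

The beta == 0 branch mirrors the note in the parameter table: y is treated as output only in that case.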
Syr(FillMode, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the symmetric rank-1 update A = alpha * x * x^T + A, where A is an n*n symmetric matrix stored in column-major format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Syr(FillMode uplo, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Syr(FillMode, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the symmetric rank-1 update A = alpha * x * x^T + A, where A is an n*n symmetric matrix stored in column-major format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Syr(FillMode uplo, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Syr(FillMode, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
This function performs the symmetric rank-1 update A = alpha * x * x^T + A, where A is an n*n symmetric matrix stored in column-major format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Syr(FillMode uplo, CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Syr(FillMode, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
This function performs the symmetric rank-1 update A = alpha * x * x^T + A, where A is an n*n symmetric matrix stored in column-major format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Syr(FillMode uplo, CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Syr(FillMode, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the symmetric rank-1 update A = alpha * x * x^T + A, where A is an n*n symmetric matrix stored in column-major format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Syr(FillMode uplo, cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Syr(FillMode, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the symmetric rank-1 update A = alpha * x * x^T + A, where A is an n*n symmetric matrix stored in column-major format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Syr(FillMode uplo, cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Syr(FillMode, Double, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
This function performs the symmetric rank-1 update A = alpha * x * x^T + A, where A is an n*n symmetric matrix stored in column-major format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Syr(FillMode uplo, double alpha, CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Syr(FillMode, Single, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
This function performs the symmetric rank-1 update A = alpha * x * x^T + A, where A is an n*n symmetric matrix stored in column-major format, x is a vector, and alpha is a scalar. n is given by x.Size.
Declaration
public void Syr(FillMode uplo, float alpha, CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
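The rank-1 update described by the Syr overloads can be sketched on the CPU as follows. This is an illustrative reference under stated assumptions, not the library's implementation: `SyrReference` and its `lower` flag are hypothetical names, the data is real double precision, storage is column-major, and strides are positive. Only the triangle selected by `lower` is written, mirroring the `uplo` description above.

```csharp
using System;

// Hypothetical CPU reference; not part of ManagedCuda.
public static class SyrReference
{
    // Performs A = alpha * x * x^T + A on the stored triangle of a
    // column-major n*n matrix with leading dimension lda.
    public static void Syr(bool lower, int n, double alpha,
                           double[] x, int incx, double[] a, int lda)
    {
        for (int j = 0; j < n; j++)
        {
            // Row range of column j that belongs to the stored triangle.
            int first = lower ? j : 0;
            int last = lower ? n - 1 : j;
            for (int i = first; i <= last; i++)
            {
                a[i + j * lda] += alpha * x[i * incx] * x[j * incx];
            }
        }
    }
}
```

Entries in the other triangle keep whatever values they had before the call.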
Syr2(FillMode, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the symmetric rank-2 update A = alpha * (x * y^T + y * x^T) + A, where A is an n*n symmetric matrix stored in column-major format, x and y are vectors, and alpha is a scalar. n is given by x.Size = y.Size.
Declaration
public void Syr2(FillMode uplo, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy, CudaDeviceVariable<cuDoubleComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Syr2(FillMode, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the symmetric rank-2 update A = alpha * (x * y^T + y * x^T) + A, where A is an n*n symmetric matrix stored in column-major format, x and y are vectors, and alpha is a scalar. n is given by x.Size = y.Size.
Declaration
public void Syr2(FillMode uplo, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy, CudaDeviceVariable<cuFloatComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Syr2(FillMode, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
This function performs the symmetric rank-2 update A = alpha * (x * y^T + y * x^T) + A, where A is an n*n symmetric matrix stored in column-major format, x and y are vectors, and alpha is a scalar. n is given by x.Size = y.Size.
Declaration
public void Syr2(FillMode uplo, CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> y, int incy, CudaDeviceVariable<double> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Double> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Syr2(FillMode, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
This function performs the symmetric rank-2 update A = alpha * (x * y^T + y * x^T) + A, where A is an n*n symmetric matrix stored in column-major format, x and y are vectors, and alpha is a scalar. n is given by x.Size = y.Size.
Declaration
public void Syr2(FillMode uplo, CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> y, int incy, CudaDeviceVariable<float> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Single> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Syr2(FillMode, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the symmetric rank-2 update A = alpha * (x * y^T + y * x^T) + A, where A is an n*n symmetric matrix stored in column-major format, x and y are vectors, and alpha is a scalar. n is given by x.Size = y.Size.
Declaration
public void Syr2(FillMode uplo, cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> x, int incx, CudaDeviceVariable<cuDoubleComplex> y, int incy, CudaDeviceVariable<cuDoubleComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuDoubleComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Syr2(FillMode, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the symmetric rank-2 update A = alpha * (x * y^T + y * x^T) + A, where A is an n*n symmetric matrix stored in column-major format, x and y are vectors, and alpha is a scalar. n is given by x.Size = y.Size.
Declaration
public void Syr2(FillMode uplo, cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> x, int incx, CudaDeviceVariable<cuFloatComplex> y, int incy, CudaDeviceVariable<cuFloatComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<cuFloatComplex> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Syr2(FillMode, Double, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
This function performs the symmetric rank-2 update A = alpha * (x * y^T + y * x^T) + A, where A is an n*n symmetric matrix stored in column-major format, x and y are vectors, and alpha is a scalar. n is given by x.Size = y.Size.
Declaration
public void Syr2(FillMode uplo, double alpha, CudaDeviceVariable<double> x, int incx, CudaDeviceVariable<double> y, int incy, CudaDeviceVariable<double> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Double> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Syr2(FillMode, Single, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
This function performs the symmetric rank-2 update A = alpha * (x * y^T + y * x^T) + A, where A is an n*n symmetric matrix stored in column-major format, x and y are vectors, and alpha is a scalar. n is given by x.Size = y.Size.
Declaration
public void Syr2(FillMode uplo, float alpha, CudaDeviceVariable<float> x, int incx, CudaDeviceVariable<float> y, int incy, CudaDeviceVariable<float> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
| CudaDeviceVariable<System.Single> | y | vector with n elements. |
| System.Int32 | incy | stride between consecutive elements of y. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
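As a cross-check for the rank-2 update, the following CPU sketch applies A = alpha * (x * y^T + y * x^T) + A element by element. It is a reference under stated assumptions rather than the library's code: `Syr2Reference` and the `lower` flag are hypothetical, the data is real double precision, storage is column-major, and strides are positive.

```csharp
using System;

// Hypothetical CPU reference; not part of ManagedCuda.
public static class Syr2Reference
{
    // Performs A = alpha * (x * y^T + y * x^T) + A on the stored triangle
    // of a column-major n*n matrix with leading dimension lda.
    public static void Syr2(bool lower, int n, double alpha,
                            double[] x, int incx, double[] y, int incy,
                            double[] a, int lda)
    {
        for (int j = 0; j < n; j++)
        {
            // Row range of column j that belongs to the stored triangle.
            int first = lower ? j : 0;
            int last = lower ? n - 1 : j;
            for (int i = first; i <= last; i++)
            {
                double xi = x[i * incx], xj = x[j * incx];
                double yi = y[i * incy], yj = y[j * incy];
                // The two outer products are transposes of each other,
                // so their sum is symmetric.
                a[i + j * lda] += alpha * (xi * yj + yi * xj);
            }
        }
    }
}
```

The update term is symmetric by construction, which is why updating a single triangle is sufficient.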
Syr2k(FillMode, Operation, Int32, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the symmetric rank-2k update C = alpha * (Op(A)Op(B)^T + Op(B)Op(A)^T) + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A and B are matrices with dimensions Op(A) n*k and Op(B) n*k, respectively.
Declaration
public void Syr2k(FillMode uplo, Operation trans, int n, int k, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb, CudaDeviceVariable<cuDoubleComplex> beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix C is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of op(A) and op(B). |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimensions ldb * k. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<cuDoubleComplex> | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc * n, with ldc >= max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syr2k(FillMode, Operation, Int32, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the symmetric rank-2k update C = alpha * (Op(A)Op(B)^T + Op(B)Op(A)^T) + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A and B are matrices with dimensions Op(A) n*k and Op(B) n*k, respectively.
Declaration
public void Syr2k(FillMode uplo, Operation trans, int n, int k, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb, CudaDeviceVariable<cuFloatComplex> beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix C is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of op(A) and op(B). |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimensions ldb * k. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<cuFloatComplex> | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc * n, with ldc >= max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syr2k(FillMode, Operation, Int32, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32)
This function performs the symmetric rank-2k update C = alpha * (Op(A)Op(B)^T + Op(B)Op(A)^T) + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A and B are matrices with dimensions Op(A) n*k and Op(B) n*k, respectively.
Declaration
public void Syr2k(FillMode uplo, Operation trans, int n, int k, CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> B, int ldb, CudaDeviceVariable<double> beta, CudaDeviceVariable<double> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix C is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of op(A) and op(B). |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | B | array of dimensions ldb * k. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<System.Double> | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | C | array of dimensions ldc * n, with ldc >= max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syr2k(FillMode, Operation, Int32, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32)
This function performs the symmetric rank-2k update C = alpha * (Op(A)Op(B)^T + Op(B)Op(A)^T) + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A and B are matrices with dimensions Op(A) n*k and Op(B) n*k, respectively.
Declaration
public void Syr2k(FillMode uplo, Operation trans, int n, int k, CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> B, int ldb, CudaDeviceVariable<float> beta, CudaDeviceVariable<float> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix C is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of op(A) and op(B). |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | B | array of dimensions ldb * k. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<System.Single> | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | C | array of dimensions ldc * n, with ldc >= max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syr2k(FillMode, Operation, Int32, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the symmetric rank-2k update C = alpha * (Op(A)Op(B)^T + Op(B)Op(A)^T) + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A and B are matrices with dimensions Op(A) n*k and Op(B) n*k, respectively.
Declaration
public void Syr2k(FillMode uplo, Operation trans, int n, int k, cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb, cuDoubleComplex beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix C is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of op(A) and op(B). |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimensions ldb * k. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| cuDoubleComplex | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc * n, with ldc >= max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syr2k(FillMode, Operation, Int32, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the symmetric rank-2k update C = alpha * (Op(A)Op(B)^T + Op(B)Op(A)^T) + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A and B are matrices with dimensions Op(A) n*k and Op(B) n*k, respectively.
Declaration
public void Syr2k(FillMode uplo, Operation trans, int n, int k, cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb, cuFloatComplex beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix C is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of op(A) and op(B). |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimensions ldb * k. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| cuFloatComplex | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc * n, with ldc >= max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syr2k(FillMode, Operation, Int32, Int32, Double, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, Double, CudaDeviceVariable<Double>, Int32)
This function performs the symmetric rank-2k update C = alpha * (Op(A)Op(B)^T + Op(B)Op(A)^T) + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A and B are matrices with dimensions Op(A) n*k and Op(B) n*k, respectively.
Declaration
public void Syr2k(FillMode uplo, Operation trans, int n, int k, double alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> B, int ldb, double beta, CudaDeviceVariable<double> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix C is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of op(A) and op(B). |
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | B | array of dimensions ldb * k. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| System.Double | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | C | array of dimensions ldc * n, with ldc >= max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syr2k(FillMode, Operation, Int32, Int32, Single, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, Single, CudaDeviceVariable<Single>, Int32)
This function performs the symmetric rank-2k update C = alpha * (Op(A)Op(B)^T + Op(B)Op(A)^T) + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A and B are matrices with dimensions Op(A) n*k and Op(B) n*k, respectively.
Declaration
public void Syr2k(FillMode uplo, Operation trans, int n, int k, float alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> B, int ldb, float beta, CudaDeviceVariable<float> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix C is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of op(A) and op(B). |
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | B | array of dimensions ldb * k. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| System.Single | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | C | array of dimensions ldc * n, with ldc >= max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
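The rank-2k update can be checked on the CPU with a small reference routine. The sketch below is illustrative only: `Syr2kReference` and `Syr2kLower` are hypothetical names that are not part of ManagedCuda, and the code assumes column-major storage, the non-transposed case, and lower fill mode.

```csharp
using System;

// Hypothetical helper, not part of ManagedCuda: a CPU reference for the
// rank-2k update with non-transposed inputs and column-major storage.
public static class Syr2kReference
{
    // Updates only the lower triangle of the n x n matrix C in place:
    // C = alpha * (A * B^T + B * A^T) + beta * C, with A and B of size n x k.
    public static void Syr2kLower(int n, int k, double alpha,
        double[] a, int lda, double[] b, int ldb,
        double beta, double[] c, int ldc)
    {
        for (int j = 0; j < n; j++)
        {
            for (int i = j; i < n; i++) // lower triangle: row >= column
            {
                double sum = 0.0;
                for (int p = 0; p < k; p++)
                {
                    // Element (row, col) of a column-major array is at row + col * ld.
                    sum += a[i + p * lda] * b[j + p * ldb]
                         + b[i + p * ldb] * a[j + p * lda];
                }
                c[i + j * ldc] = alpha * sum + beta * c[i + j * ldc];
            }
        }
    }
}
```

Comparing the lower triangle produced by this routine with the result copied back from the device is one way to sanity-check the argument order and leading dimensions of a `Syr2k` call.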
Syrk(FillMode, Operation, Int32, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the symmetric rank-k update C = alpha * Op(A)Op(A)^T + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A is a matrix with dimensions op(A) n x k.
Declaration
public void Syrk(FillMode uplo, Operation trans, int n, int k, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A) and C. |
| System.Int32 | k | number of columns of matrix op(A). |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc * n, with ldc >= max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syrk(FillMode, Operation, Int32, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the symmetric rank-k update C = alpha * Op(A)Op(A)^T + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A is a matrix with dimensions op(A) n x k.
Declaration
public void Syrk(FillMode uplo, Operation trans, int n, int k, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A) and C. |
| System.Int32 | k | number of columns of matrix op(A). |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc * n, with ldc >= max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syrk(FillMode, Operation, Int32, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32)
This function performs the symmetric rank-k update C = alpha * Op(A)Op(A)^T + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A is a matrix with dimensions op(A) n x k.
Declaration
public void Syrk(FillMode uplo, Operation trans, int n, int k, CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> beta, CudaDeviceVariable<double> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A) and C. |
| System.Int32 | k | number of columns of matrix op(A). |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | C | array of dimensions ldc * n, with ldc >= max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syrk(FillMode, Operation, Int32, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32)
This function performs the symmetric rank-k update C = alpha * Op(A)Op(A)^T + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A is a matrix with dimensions op(A) n x k.
Declaration
public void Syrk(FillMode uplo, Operation trans, int n, int k, CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> beta, CudaDeviceVariable<float> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A) and C. |
| System.Int32 | k | number of columns of matrix op(A). |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | C | array of dimensions ldc * n, with ldc >= max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syrk(FillMode, Operation, Int32, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the symmetric rank-k update C = alpha * Op(A)Op(A)^T + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A is a matrix with dimensions op(A) n x k.
Declaration
public void Syrk(FillMode uplo, Operation trans, int n, int k, cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, cuDoubleComplex beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A) and C. |
| System.Int32 | k | number of columns of matrix op(A). |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| cuDoubleComplex | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc * n, with ldc >= max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syrk(FillMode, Operation, Int32, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the symmetric rank-k update C = alpha * Op(A)Op(A)^T + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A is a matrix with dimensions op(A) n x k.
Declaration
public void Syrk(FillMode uplo, Operation trans, int n, int k, cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, cuFloatComplex beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A) and C. |
| System.Int32 | k | number of columns of matrix op(A). |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| cuFloatComplex | beta | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc * n, with ldc >= max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syrk(FillMode, Operation, Int32, Int32, Double, CudaDeviceVariable<Double>, Int32, Double, CudaDeviceVariable<Double>, Int32)
This function performs the symmetric rank-k update C = alpha * Op(A)Op(A)^T + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A is a matrix with dimensions op(A) n x k.
Declaration
public void Syrk(FillMode uplo, Operation trans, int n, int k, double alpha, CudaDeviceVariable<double> A, int lda, double beta, CudaDeviceVariable<double> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A) and C. |
| System.Int32 | k | number of columns of matrix op(A). |
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| System.Double | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | C | array of dimensions ldc * n, with ldc >= max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syrk(FillMode, Operation, Int32, Int32, Single, CudaDeviceVariable<Single>, Int32, Single, CudaDeviceVariable<Single>, Int32)
This function performs the symmetric rank-k update C = alpha * Op(A)Op(A)^T + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A is a matrix with dimensions op(A) n x k.
Declaration
public void Syrk(FillMode uplo, Operation trans, int n, int k, float alpha, CudaDeviceVariable<float> A, int lda, float beta, CudaDeviceVariable<float> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| System.Int32 | n | number of rows of matrix op(A) and C. |
| System.Int32 | k | number of columns of matrix op(A). |
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * k. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| System.Single | beta | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | C | array of dimensions ldc * n, with ldc >= max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
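How `trans`, `uplo` and the leading dimensions interact in the Syrk overloads can be easier to see in a CPU reference. The following is a minimal sketch under stated assumptions: `SyrkReference` is a hypothetical helper (not a ManagedCuda type), storage is column-major, and the two `bool` flags stand in for the `FillMode` and `Operation` enums.

```csharp
using System;

// Hypothetical helper, not part of ManagedCuda.
public static class SyrkReference
{
    // Reads element (row, col) of op(A), where op(A) is always n x k.
    private static double OpA(double[] a, int lda, bool transpose, int row, int col)
    {
        // Non-transposed: A is stored as n x k, so op(A)(row, col) = A(row, col).
        // Transposed: A is stored as k x n, so op(A)(row, col) = A(col, row).
        return transpose ? a[col + row * lda] : a[row + col * lda];
    }

    // C = alpha * op(A) * op(A)^T + beta * C, touching only the selected triangle.
    public static void Syrk(bool upper, bool transpose, int n, int k,
        double alpha, double[] a, int lda, double beta, double[] c, int ldc)
    {
        for (int j = 0; j < n; j++)
        {
            int first = upper ? 0 : j;     // upper: rows 0..j, lower: rows j..n-1
            int last = upper ? j : n - 1;
            for (int i = first; i <= last; i++)
            {
                double sum = 0.0;
                for (int p = 0; p < k; p++)
                {
                    sum += OpA(a, lda, transpose, i, p) * OpA(a, lda, transpose, j, p);
                }
                c[i + j * ldc] = alpha * sum + beta * c[i + j * ldc];
            }
        }
    }
}
```

The same logical matrix gives the same triangle whether it is passed as an n x k array without transposition or as a k x n array with transposition; only the indexing inside `OpA` changes.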
Syrkx(FillMode, Operation, Int32, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs a variation of the symmetric rank-k update C = alpha * Op(A)Op(B)^T + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A and B are matrices with dimensions op(A) n x k and op(B) n x k, respectively.
Declaration
public void Syrkx(FillMode uplo, Operation trans, int n, int k, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb, CudaDeviceVariable<cuDoubleComplex> beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of matrix op(A) and op(B). |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda x k with lda >= max(1,n) if trans == CUBLAS_OP_N, and lda x n with lda >= max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimensions ldb x k with ldb >= max(1,n) if trans == CUBLAS_OP_N, and ldb x n with ldb >= max(1,k) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<cuDoubleComplex> | beta | scalar used for multiplication; if beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc x n with ldc>=max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
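The leading-dimension constraints listed in the table can be checked on the host before the buffers are allocated. The sketch below is one possible way to do that; `SyrkxShapes` and its methods are hypothetical names, and the `bool transpose` flag stands in for the `Operation` argument.

```csharp
using System;

// Hypothetical helper, not part of ManagedCuda.
public static class SyrkxShapes
{
    // Returns ld * columns for the A or B array, after checking the
    // leading dimension against the stored (untransposed) row count.
    public static long RequiredElements(bool transpose, int n, int k, int ld)
    {
        int rows = transpose ? k : n; // rows of the array as stored in memory
        int cols = transpose ? n : k; // columns of the array as stored in memory
        if (ld < Math.Max(1, rows))
        {
            throw new ArgumentException("Leading dimension is smaller than the stored row count.");
        }
        return (long)ld * cols;
    }

    // Returns ldc * n for the n x n output matrix C.
    public static long RequiredElementsC(int n, int ldc)
    {
        if (ldc < Math.Max(1, n))
        {
            throw new ArgumentException("ldc must be at least max(1, n).");
        }
        return (long)ldc * n;
    }
}
```

For example, with n = 4 and k = 3, a non-transposed A with lda = 5 needs 15 elements, while a transposed A with lda = 3 needs 12.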
Syrkx(FillMode, Operation, Int32, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs a variation of the symmetric rank-k update C = alpha * Op(A)Op(B)^T + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A and B are matrices with dimensions op(A) n x k and op(B) n x k, respectively.
Declaration
public void Syrkx(FillMode uplo, Operation trans, int n, int k, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb, CudaDeviceVariable<cuFloatComplex> beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of matrix op(A) and op(B). |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda x k with lda >= max(1,n) if trans == CUBLAS_OP_N, and lda x n with lda >= max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimensions ldb x k with ldb >= max(1,n) if trans == CUBLAS_OP_N, and ldb x n with ldb >= max(1,k) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<cuFloatComplex> | beta | scalar used for multiplication; if beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc x n with ldc>=max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syrkx(FillMode, Operation, Int32, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32)
This function performs a variation of the symmetric rank-k update C = alpha * Op(A)Op(B)^T + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A and B are matrices with dimensions op(A) n x k and op(B) n x k, respectively.
Declaration
public void Syrkx(FillMode uplo, Operation trans, int n, int k, CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> B, int ldb, CudaDeviceVariable<double> beta, CudaDeviceVariable<double> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of matrix op(A) and op(B). |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda x k with lda >= max(1,n) if trans == CUBLAS_OP_N, and lda x n with lda >= max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | B | array of dimensions ldb x k with ldb >= max(1,n) if trans == CUBLAS_OP_N, and ldb x n with ldb >= max(1,k) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<System.Double> | beta | scalar used for multiplication; if beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<System.Double> | C | array of dimensions ldc x n with ldc>=max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syrkx(FillMode, Operation, Int32, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32)
This function performs a variation of the symmetric rank-k update C = alpha * Op(A)Op(B)^T + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A and B are matrices with dimensions op(A) n x k and op(B) n x k, respectively.
Declaration
public void Syrkx(FillMode uplo, Operation trans, int n, int k, CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> B, int ldb, CudaDeviceVariable<float> beta, CudaDeviceVariable<float> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of matrix op(A) and op(B). |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda x k with lda >= max(1,n) if trans == CUBLAS_OP_N, and lda x n with lda >= max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | B | array of dimensions ldb x k with ldb >= max(1,n) if trans == CUBLAS_OP_N, and ldb x n with ldb >= max(1,k) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<System.Single> | beta | scalar used for multiplication; if beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<System.Single> | C | array of dimensions ldc x n with ldc>=max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syrkx(FillMode, Operation, Int32, Int32, ref cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, ref cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs a variation of the symmetric rank-k update C = alpha * Op(A)Op(B)^T + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A and B are matrices with dimensions op(A) n x k and op(B) n x k, respectively.
Declaration
public void Syrkx(FillMode uplo, Operation trans, int n, int k, ref cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb, ref cuDoubleComplex beta, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of matrix op(A) and op(B). |
| cuDoubleComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda x k with lda >= max(1,n) if trans == CUBLAS_OP_N, and lda x n with lda >= max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimensions ldb x k with ldb >= max(1,n) if trans == CUBLAS_OP_N, and ldb x n with ldb >= max(1,k) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| cuDoubleComplex | beta | scalar used for multiplication; if beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc x n with ldc>=max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syrkx(FillMode, Operation, Int32, Int32, ref cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, ref cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs a variation of the symmetric rank-k update C = alpha * Op(A)Op(B)^T + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A and B are matrices with dimensions op(A) n x k and op(B) n x k, respectively.
Declaration
public void Syrkx(FillMode uplo, Operation trans, int n, int k, ref cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb, ref cuFloatComplex beta, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of matrix op(A) and op(B). |
| cuFloatComplex | alpha | scalar used for multiplication. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda x k with lda >= max(1,n) if trans == CUBLAS_OP_N, and lda x n with lda >= max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimensions ldb x k with ldb >= max(1,n) if trans == CUBLAS_OP_N, and ldb x n with ldb >= max(1,k) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| cuFloatComplex | beta | scalar used for multiplication; if beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc x n with ldc>=max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syrkx(FillMode, Operation, Int32, Int32, ref Double, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, ref Double, CudaDeviceVariable<Double>, Int32)
This function performs a variation of the symmetric rank-k update C = alpha * Op(A)Op(B)^T + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A and B are matrices with dimensions op(A) n x k and op(B) n x k, respectively.
Declaration
public void Syrkx(FillMode uplo, Operation trans, int n, int k, ref double alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> B, int ldb, ref double beta, CudaDeviceVariable<double> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of matrix op(A) and op(B). |
| System.Double | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda x k with lda >= max(1,n) if trans == CUBLAS_OP_N, and lda x n with lda >= max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | B | array of dimensions ldb x k with ldb >= max(1,n) if trans == CUBLAS_OP_N, and ldb x n with ldb >= max(1,k) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| System.Double | beta | scalar used for multiplication; if beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<System.Double> | C | array of dimensions ldc x n with ldc>=max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Syrkx(FillMode, Operation, Int32, Int32, ref Single, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, ref Single, CudaDeviceVariable<Single>, Int32)
This function performs a variation of the symmetric rank-k update C = alpha * Op(A)Op(B)^T + beta * C, where alpha and beta are scalars, C is a symmetric matrix stored in lower or upper mode, and A and B are matrices with dimensions op(A) n x k and op(B) n x k, respectively.
Declaration
public void Syrkx(FillMode uplo, Operation trans, int n, int k, ref float alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> B, int ldb, ref float beta, CudaDeviceVariable<float> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix C lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or transpose. |
| System.Int32 | n | number of rows of matrix op(A), op(B) and C. |
| System.Int32 | k | number of columns of matrix op(A) and op(B). |
| System.Single | alpha | scalar used for multiplication. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda x k with lda >= max(1,n) if trans == CUBLAS_OP_N, and lda x n with lda >= max(1,k) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | B | array of dimensions ldb x k with ldb >= max(1,n) if trans == CUBLAS_OP_N, and ldb x n with ldb >= max(1,k) otherwise. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| System.Single | beta | scalar used for multiplication; if beta == 0, C does not have to be a valid input. |
| CudaDeviceVariable<System.Single> | C | array of dimensions ldc x n with ldc>=max(1,n). |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
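Because Syrkx writes only one triangle of C, the result is a valid symmetric matrix only when Op(A)Op(B)^T is itself symmetric, for example when B is A with its columns scaled. The sketch below illustrates that case on the CPU; `SyrkxReference`, `SyrkxLower` and `ScaleColumns` are hypothetical names, and the code assumes column-major storage, non-transposed inputs and lower fill mode.

```csharp
using System;

// Hypothetical helper, not part of ManagedCuda.
public static class SyrkxReference
{
    // Lower triangle of C = alpha * A * B^T + beta * C, with A and B of size n x k.
    public static void SyrkxLower(int n, int k, double alpha,
        double[] a, int lda, double[] b, int ldb,
        double beta, double[] c, int ldc)
    {
        for (int j = 0; j < n; j++)
        {
            for (int i = j; i < n; i++)
            {
                double sum = 0.0;
                for (int p = 0; p < k; p++)
                {
                    sum += a[i + p * lda] * b[j + p * ldb];
                }
                c[i + j * ldc] = alpha * sum + beta * c[i + j * ldc];
            }
        }
    }

    // Builds B = A * diag(d) by scaling column p of A with d[p].
    // With this B, A * B^T = A * diag(d) * A^T, which is symmetric.
    public static double[] ScaleColumns(int n, int k, double[] a, int lda, double[] d)
    {
        var b = new double[lda * k];
        for (int p = 0; p < k; p++)
        {
            for (int i = 0; i < n; i++)
            {
                b[i + p * lda] = a[i + p * lda] * d[p];
            }
        }
        return b;
    }
}
```

If B is unrelated to A, the routine still runs, but the triangle that is not written no longer mirrors the computed one.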
Tbmv(FillMode, Operation, DiagType, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the triangular banded matrix-vector multiplication x = Op(A) x, where A is a triangular banded matrix and x is a vector. n is given by x.Size.
Declaration
public void Tbmv(FillMode uplo, Operation trans, DiagType diag, int k, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | k | number of sub- and super-diagonals of matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= k + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
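The array A is expected in banded storage, not as a dense n x n matrix. As a rough illustration of the BLAS band layout that cuBLAS documents (main diagonal in the first row for lower fill mode, in row k + 1 for upper fill mode), the sketch below packs a dense column-major triangular matrix into a (k + 1) x n band array. `BandStorage.Pack` is a hypothetical helper, not a ManagedCuda API.

```csharp
using System;

// Hypothetical helper, not part of ManagedCuda.
public static class BandStorage
{
    // Packs a dense column-major n x n triangular banded matrix with k
    // sub- or super-diagonals into band storage with lda = k + 1 rows.
    public static double[] Pack(bool upper, int n, int k, double[] dense, int ldDense)
    {
        int lda = k + 1;
        var band = new double[lda * n];
        for (int j = 0; j < n; j++)
        {
            // Rows of column j that lie inside the band and the stored triangle.
            int first = upper ? Math.Max(0, j - k) : j;
            int last = upper ? j : Math.Min(n - 1, j + k);
            for (int i = first; i <= last; i++)
            {
                // Lower: diagonal in band row 0. Upper: diagonal in band row k.
                int bandRow = upper ? k + i - j : i - j;
                band[bandRow + j * lda] = dense[i + j * ldDense];
            }
        }
        return band;
    }
}
```

The packed array can then be copied to the device and passed as A with lda = k + 1.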
Tbmv(FillMode, Operation, DiagType, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the triangular banded matrix-vector multiplication x = Op(A) x, where A is a triangular banded matrix and x is a vector. n is given by x.Size.
Declaration
public void Tbmv(FillMode uplo, Operation trans, DiagType diag, int k, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | k | number of sub- and super-diagonals of matrix A. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= k + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
Tbmv(FillMode, Operation, DiagType, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
This function performs the triangular banded matrix-vector multiplication x = Op(A) x, where A is a triangular banded matrix and x is a vector. n is given by x.Size.
Declaration
public void Tbmv(FillMode uplo, Operation trans, DiagType diag, int k, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | k | number of sub- and super-diagonals of matrix A. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n, with lda >= k + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
Tbmv(FillMode, Operation, DiagType, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
This function performs the triangular banded matrix-vector multiplication x= Op(A) x where A is a triangular banded matrix, and x is a vector. n is given by x.Size.
Declaration
public void Tbmv(FillMode uplo, Operation trans, DiagType diag, int k, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | k | number of sub- and super-diagonals of matrix A. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n, with lda >= k + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
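
The banded layout that Tbmv reads can be illustrated on the CPU. The following is a minimal sketch, not ManagedCuda code: the class and method names are hypothetical, and it assumes the column-major band storage used by BLAS, FillMode.Lower, no transpose, a non-unit diagonal, and incx = 1.

```csharp
using System;

// Hypothetical CPU reference, for illustration only (not part of ManagedCuda).
public static class BandedReference
{
    // Copies the lower band of a dense n x n matrix into band storage:
    // element A(i, j), with 0 <= i - j <= k, goes to band[(i - j) + j * lda].
    public static double[] PackLowerBand(double[,] dense, int k, int lda)
    {
        int n = dense.GetLength(0);
        if (lda < k + 1) throw new ArgumentException("lda must be at least k + 1");
        var band = new double[lda * n];
        for (int j = 0; j < n; j++)
            for (int i = j; i <= Math.Min(n - 1, j + k); i++)
                band[(i - j) + j * lda] = dense[i, j];
        return band;
    }

    // In-place x = A * x for a lower triangular banded matrix.
    public static void TbmvLower(int k, double[] band, int lda, double[] x)
    {
        int n = x.Length;
        // Rows are processed last to first so that every x[j] with j < i
        // still holds its input value when row i is computed.
        for (int i = n - 1; i >= 0; i--)
        {
            double sum = 0.0;
            for (int j = Math.Max(0, i - k); j <= i; j++)
                sum += band[(i - j) + j * lda] * x[j];
            x[i] = sum;
        }
    }
}
```

With k = 1, each column of the band array holds only the diagonal entry and one sub-diagonal entry, so lda = 2 is sufficient regardless of n.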
Tbsv(FillMode, Operation, DiagType, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function solves the triangular banded linear system with a single right-hand side, Op(A) x = b, where A is a triangular banded matrix, and x and b are vectors. The solution x overwrites the right-hand side b on exit. n is given by x.Size. No test for singularity or near-singularity is included in this function.
Declaration
public void Tbsv(FillMode uplo, Operation trans, DiagType diag, int k, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | k | number of sub- and super-diagonals of matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= k + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
Tbsv(FillMode, Operation, DiagType, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function solves the triangular banded linear system with a single right-hand side, Op(A) x = b, where A is a triangular banded matrix, and x and b are vectors. The solution x overwrites the right-hand side b on exit. n is given by x.Size. No test for singularity or near-singularity is included in this function.
Declaration
public void Tbsv(FillMode uplo, Operation trans, DiagType diag, int k, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | k | number of sub- and super-diagonals of matrix A. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= k + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
Tbsv(FillMode, Operation, DiagType, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
This function solves the triangular banded linear system with a single right-hand side, Op(A) x = b, where A is a triangular banded matrix, and x and b are vectors. The solution x overwrites the right-hand side b on exit. n is given by x.Size. No test for singularity or near-singularity is included in this function.
Declaration
public void Tbsv(FillMode uplo, Operation trans, DiagType diag, int k, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | k | number of sub- and super-diagonals of matrix A. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n, with lda >= k + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
Tbsv(FillMode, Operation, DiagType, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
This function solves the triangular banded linear system with a single right-hand side, Op(A) x = b, where A is a triangular banded matrix, and x and b are vectors. The solution x overwrites the right-hand side b on exit. n is given by x.Size. No test for singularity or near-singularity is included in this function.
Declaration
public void Tbsv(FillMode uplo, Operation trans, DiagType diag, int k, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | k | number of sub- and super-diagonals of matrix A. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n, with lda >= k + 1. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
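
The in-place solve described for Tbsv amounts to forward substitution over the band when the lower part is stored. The sketch below is a hypothetical CPU reference, not the ManagedCuda implementation; it assumes FillMode.Lower, no transpose, a non-unit diagonal, incx = 1, and the band layout band[(i - j) + j * lda].

```csharp
using System;

// Hypothetical CPU reference, for illustration only (not part of ManagedCuda).
public static class BandedSolveReference
{
    // Solves A * x = b in place: on entry x holds b, on exit it holds the solution.
    public static void TbsvLower(int k, double[] band, int lda, double[] x)
    {
        int n = x.Length;
        for (int i = 0; i < n; i++)
        {
            double sum = x[i];
            // Only the k entries to the left of the diagonal can be non-zero.
            for (int j = Math.Max(0, i - k); j < i; j++)
                sum -= band[(i - j) + j * lda] * x[j];
            // The diagonal entry A(i, i) sits in row 0 of column i of the band array.
            // As in the documented routine, there is no singularity check here:
            // in this sketch a zero diagonal yields Infinity or NaN.
            x[i] = sum / band[i * lda];
        }
    }
}
```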
Tpmv(FillMode, Operation, DiagType, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the triangular packed matrix-vector multiplication x= Op(A) x where A is a triangular matrix stored in packed format, and x is a vector. n is given by x.Size.
Declaration
public void Tpmv(FillMode uplo, Operation trans, DiagType diag, CudaDeviceVariable<cuDoubleComplex> AP, CudaDeviceVariable<cuDoubleComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| CudaDeviceVariable<cuDoubleComplex> | AP | array holding matrix A in packed format, with n * (n + 1) / 2 elements. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
Tpmv(FillMode, Operation, DiagType, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the triangular packed matrix-vector multiplication x= Op(A) x where A is a triangular matrix stored in packed format, and x is a vector. n is given by x.Size.
Declaration
public void Tpmv(FillMode uplo, Operation trans, DiagType diag, CudaDeviceVariable<cuFloatComplex> AP, CudaDeviceVariable<cuFloatComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| CudaDeviceVariable<cuFloatComplex> | AP | array holding matrix A in packed format, with n * (n + 1) / 2 elements. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
Tpmv(FillMode, Operation, DiagType, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32)
This function performs the triangular packed matrix-vector multiplication x= Op(A) x where A is a triangular matrix stored in packed format, and x is a vector. n is given by x.Size.
Declaration
public void Tpmv(FillMode uplo, Operation trans, DiagType diag, CudaDeviceVariable<double> AP, CudaDeviceVariable<double> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| CudaDeviceVariable<System.Double> | AP | array holding matrix A in packed format, with n * (n + 1) / 2 elements. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
Tpmv(FillMode, Operation, DiagType, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32)
This function performs the triangular packed matrix-vector multiplication x= Op(A) x where A is a triangular matrix stored in packed format, and x is a vector. n is given by x.Size.
Declaration
public void Tpmv(FillMode uplo, Operation trans, DiagType diag, CudaDeviceVariable<float> AP, CudaDeviceVariable<float> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| CudaDeviceVariable<System.Single> | AP | array holding matrix A in packed format, with n * (n + 1) / 2 elements. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
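
Packed format stores only the referenced triangle, column by column, with no leading dimension. The following CPU sketch shows the index mapping and an in-place multiplication for the lower case; the class and method names are hypothetical, and it assumes column-major packing, no transpose, a non-unit diagonal, and incx = 1.

```csharp
using System;

// Hypothetical CPU reference, for illustration only (not part of ManagedCuda).
public static class PackedReference
{
    // Zero-based position of A(i, j), i >= j, in lower packed storage.
    public static int LowerIndex(int i, int j, int n) => i + j * (2 * n - j - 1) / 2;

    // Zero-based position of A(i, j), i <= j, in upper packed storage.
    public static int UpperIndex(int i, int j) => i + j * (j + 1) / 2;

    // In-place x = A * x for a lower triangular matrix in packed storage.
    public static void TpmvLower(double[] ap, double[] x)
    {
        int n = x.Length;
        if (ap.Length < n * (n + 1) / 2)
            throw new ArgumentException("ap must hold n * (n + 1) / 2 elements");
        // Rows are processed last to first so inputs x[j], j < i, are not yet overwritten.
        for (int i = n - 1; i >= 0; i--)
        {
            double sum = 0.0;
            for (int j = 0; j <= i; j++)
                sum += ap[LowerIndex(i, j, n)] * x[j];
            x[i] = sum;
        }
    }
}
```

For n = 3, the lower triangle with rows (1), (2, 3), (4, 5, 6) is packed as 1, 2, 4, 3, 5, 6.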
Tpsv(FillMode, Operation, DiagType, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function solves the packed triangular linear system with a single right-hand side, Op(A) x = b, where A is a triangular matrix stored in packed format, and x and b are vectors. The solution x overwrites the right-hand side b on exit. n is given by x.Size. No test for singularity or near-singularity is included in this function.
Declaration
public void Tpsv(FillMode uplo, Operation trans, DiagType diag, CudaDeviceVariable<cuDoubleComplex> AP, CudaDeviceVariable<cuDoubleComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| CudaDeviceVariable<cuDoubleComplex> | AP | array holding matrix A in packed format, with n * (n + 1) / 2 elements. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
Tpsv(FillMode, Operation, DiagType, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32)
This function solves the packed triangular linear system with a single right-hand side, Op(A) x = b, where A is a triangular matrix stored in packed format, and x and b are vectors. The solution x overwrites the right-hand side b on exit. n is given by x.Size. No test for singularity or near-singularity is included in this function.
Declaration
public void Tpsv(FillMode uplo, Operation trans, DiagType diag, CudaDeviceVariable<cuFloatComplex> AP, CudaDeviceVariable<cuFloatComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| CudaDeviceVariable<cuFloatComplex> | AP | array holding matrix A in packed format, with n * (n + 1) / 2 elements. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
Tpsv(FillMode, Operation, DiagType, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32)
This function solves the packed triangular linear system with a single right-hand side, Op(A) x = b, where A is a triangular matrix stored in packed format, and x and b are vectors. The solution x overwrites the right-hand side b on exit. n is given by x.Size. No test for singularity or near-singularity is included in this function.
Declaration
public void Tpsv(FillMode uplo, Operation trans, DiagType diag, CudaDeviceVariable<double> AP, CudaDeviceVariable<double> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| CudaDeviceVariable<System.Double> | AP | array holding matrix A in packed format, with n * (n + 1) / 2 elements. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
Tpsv(FillMode, Operation, DiagType, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32)
This function solves the packed triangular linear system with a single right-hand side, Op(A) x = b, where A is a triangular matrix stored in packed format, and x and b are vectors. The solution x overwrites the right-hand side b on exit. n is given by x.Size. No test for singularity or near-singularity is included in this function.
Declaration
public void Tpsv(FillMode uplo, Operation trans, DiagType diag, CudaDeviceVariable<float> AP, CudaDeviceVariable<float> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| CudaDeviceVariable<System.Single> | AP | array holding matrix A in packed format, with n * (n + 1) / 2 elements. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
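
For the upper case, the packed solve is a back substitution that starts from the last row. The sketch below is a hypothetical CPU reference rather than ManagedCuda code; it assumes FillMode.Upper, column-major packing (A(i, j) at position i + j * (j + 1) / 2), no transpose, a non-unit diagonal, and incx = 1.

```csharp
using System;

// Hypothetical CPU reference, for illustration only (not part of ManagedCuda).
public static class PackedSolveReference
{
    // Zero-based position of A(i, j), i <= j, in upper packed storage.
    private static int UpperIndex(int i, int j) => i + j * (j + 1) / 2;

    // Solves A * x = b in place for an upper triangular packed matrix:
    // on entry x holds b, on exit it holds the solution.
    public static void TpsvUpper(double[] ap, double[] x)
    {
        int n = x.Length;
        if (ap.Length < n * (n + 1) / 2)
            throw new ArgumentException("ap must hold n * (n + 1) / 2 elements");
        for (int i = n - 1; i >= 0; i--)
        {
            double sum = x[i];
            // Entries right of the diagonal multiply already solved unknowns.
            for (int j = i + 1; j < n; j++)
                sum -= ap[UpperIndex(i, j)] * x[j];
            // No singularity check, mirroring the documented behaviour.
            x[i] = sum / ap[UpperIndex(i, i)];
        }
    }
}
```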
Trmv(FillMode, Operation, DiagType, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the triangular matrix-vector multiplication x= Op(A) x where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, and x is a vector. n is given by x.Size.
Declaration
public void Trmv(FillMode uplo, Operation trans, DiagType diag, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
Trmv(FillMode, Operation, DiagType, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the triangular matrix-vector multiplication x= Op(A) x where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, and x is a vector. n is given by x.Size.
Declaration
public void Trmv(FillMode uplo, Operation trans, DiagType diag, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
Trmv(FillMode, Operation, DiagType, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
This function performs the triangular matrix-vector multiplication x= Op(A) x where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, and x is a vector. n is given by x.Size.
Declaration
public void Trmv(FillMode uplo, Operation trans, DiagType diag, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
Trmv(FillMode, Operation, DiagType, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
This function performs the triangular matrix-vector multiplication x= Op(A) x where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, and x is a vector. n is given by x.Size.
Declaration
public void Trmv(FillMode uplo, Operation trans, DiagType diag, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
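
The roles of lda, uplo and diag in Trmv can be seen in a small CPU sketch. It is hypothetical code, not the ManagedCuda implementation, and assumes column-major storage (A(i, j) at a[i + j * lda]), FillMode.Upper, no transpose, and incx = 1; the unitDiag flag stands in for DiagType.

```csharp
using System;

// Hypothetical CPU reference, for illustration only (not part of ManagedCuda).
public static class TriangularReference
{
    // In-place x = A * x for an upper triangular matrix in column-major storage.
    public static void TrmvUpper(bool unitDiag, double[] a, int lda, double[] x)
    {
        int n = x.Length;
        if (lda < Math.Max(1, n)) throw new ArgumentException("lda must be >= max(1, n)");
        // Rows are processed first to last so inputs x[j], j > i, are not yet overwritten.
        for (int i = 0; i < n; i++)
        {
            // With a unit diagonal the stored diagonal entry is never read.
            double sum = unitDiag ? x[i] : a[i + i * lda] * x[i];
            for (int j = i + 1; j < n; j++)
                sum += a[i + j * lda] * x[j];
            x[i] = sum;
        }
    }
}
```

Entries below the diagonal, and any padding rows when lda > n, are never accessed, so they may hold arbitrary values.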
Trsm(SideMode, FillMode, Operation, DiagType, Int32, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function solves the triangular linear system with multiple right-hand sides, Op(A) * X = alpha * B if side==SideMode.Left, or X * Op(A) = alpha * B if side==SideMode.Right, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, X and B are m*n matrices, and alpha is a scalar.
The solution X overwrites the right-hand sides B on exit.
Declaration
public void Trsm(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of X. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix B, with matrix A sized accordingly. |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. If alpha==0 then A is not referenced and B does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * m with lda >= max(1,m) if side==SideMode.Left, and lda * n with lda >= max(1,n) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
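
The in-place solve with several right-hand sides can be sketched on the CPU as one forward substitution per column of B. The code below is a hypothetical reference, not ManagedCuda code; it assumes SideMode.Left, FillMode.Lower, no transpose, a non-unit diagonal, column-major storage, and a host-side alpha (the documented overload takes alpha as a device variable).

```csharp
using System;

// Hypothetical CPU reference, for illustration only (not part of ManagedCuda).
public static class TriangularSolveReference
{
    // Solves A * X = alpha * B in place; A is m x m lower triangular, B is m x n.
    // On exit b holds X.
    public static void TrsmLeftLower(int m, int n, double alpha,
                                     double[] a, int lda, double[] b, int ldb)
    {
        for (int c = 0; c < n; c++)          // each column of B is an independent system
        {
            for (int i = 0; i < m; i++)
            {
                double sum = alpha * b[i + c * ldb];
                for (int j = 0; j < i; j++)
                    sum -= a[i + j * lda] * b[j + c * ldb]; // b[j] already holds X(j, c)
                b[i + c * ldb] = sum / a[i + i * lda];
            }
        }
    }
}
```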
Trsm(SideMode, FillMode, Operation, DiagType, Int32, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the triangular matrix-matrix multiplication C = alpha * Op(A) * B if side==SideMode.Left, or C = alpha * B * Op(A) if side==SideMode.Right, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, B and C are m*n matrices, and alpha is a scalar.
Notice that in order to achieve better parallelism CUBLAS differs from the BLAS API only for this routine. The BLAS API assumes an in-place implementation (with results written back to B), while the CUBLAS API assumes an out-of-place implementation (with results written into C). The application can obtain the in-place functionality of BLAS in the CUBLAS API by passing the address of the matrix B in place of the matrix C. No other overlapping in the input parameters is supported.
Declaration
public void Trsm(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix B, with matrix A sized accordingly. |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. If alpha==0 then A is not referenced and B does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * m with lda >= max(1,m) if side==SideMode.Left, and lda * n with lda >= max(1,n) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
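
The difference between the out-of-place form and the in-place BLAS form can be illustrated with a CPU sketch. This is hypothetical code, not the ManagedCuda or CUBLAS implementation; it assumes SideMode.Left, FillMode.Lower, no transpose, a non-unit diagonal, column-major storage, and a host-side alpha. The row order in this sketch is chosen so that passing the same array for B and C (with ldb equal to ldc) gives the in-place result.

```csharp
using System;

// Hypothetical CPU reference, for illustration only (not part of ManagedCuda).
public static class TriangularMultiplyReference
{
    // Computes C = alpha * A * B; A is m x m lower triangular, B and C are m x n.
    public static void MultiplyLeftLower(int m, int n, double alpha,
                                         double[] a, int lda,
                                         double[] b, int ldb,
                                         double[] c, int ldc)
    {
        for (int col = 0; col < n; col++)
        {
            // Row i only reads B(j, col) for j <= i, so walking rows from last to
            // first keeps the inputs intact even when c and b are the same array.
            for (int i = m - 1; i >= 0; i--)
            {
                double sum = 0.0;
                for (int j = 0; j <= i; j++)
                    sum += a[i + j * lda] * b[j + col * ldb];
                c[i + col * ldc] = alpha * sum;
            }
        }
    }
}
```

Calling MultiplyLeftLower(m, n, alpha, a, lda, b, ldb, b, ldb) in this sketch overwrites b with the product, which corresponds to the in-place behaviour the BLAS API assumes.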
Trsm(SideMode, FillMode, Operation, DiagType, Int32, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function solves the triangular linear system with multiple right-hand sides, Op(A) * X = alpha * B if side==SideMode.Left, or X * Op(A) = alpha * B if side==SideMode.Right, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, X and B are m*n matrices, and alpha is a scalar.
The solution X overwrites the right-hand sides B on exit.
Declaration
public void Trsm(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of X. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix B, with matrix A sized accordingly. |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. If alpha==0 then A is not referenced and B does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * m with lda >= max(1,m) if side==SideMode.Left, and lda * n with lda >= max(1,n) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
Trsm(SideMode, FillMode, Operation, DiagType, Int32, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the triangular matrix-matrix multiplication C = alpha * Op(A) * B if side==SideMode.Left, or C = alpha * B * Op(A) if side==SideMode.Right, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, B and C are m*n matrices, and alpha is a scalar.
Notice that in order to achieve better parallelism CUBLAS differs from the BLAS API only for this routine. The BLAS API assumes an in-place implementation (with results written back to B), while the CUBLAS API assumes an out-of-place implementation (with results written into C). The application can obtain the in-place functionality of BLAS in the CUBLAS API by passing the address of the matrix B in place of the matrix C. No other overlapping in the input parameters is supported.
Declaration
public void Trsm(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix B, with matrix A sized accordingly. |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. If alpha==0 then A is not referenced and B does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * m with lda >= max(1,m) if side==SideMode.Left, and lda * n with lda >= max(1,n) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Trsm(SideMode, FillMode, Operation, DiagType, Int32, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
This function solves the triangular linear system with multiple right-hand sides, Op(A) * X = alpha * B if side==SideMode.Left, or X * Op(A) = alpha * B if side==SideMode.Right, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, X and B are m*n matrices, and alpha is a scalar.
The solution X overwrites the right-hand sides B on exit.
Declaration
public void Trsm(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> B, int ldb)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of X. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix B, with matrix A sized accordingly. |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. If alpha==0 then A is not referenced and B does not have to be a valid input. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * m. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
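To make the in-place semantics concrete, the following is a minimal CPU sketch of what this overload computes for one configuration only (side==SideMode.Left, lower-triangular A, no transpose, non-unit diagonal). `TrsmReference` and `SolveLeftLower` are hypothetical names introduced for illustration; they are not part of ManagedCuda, and the sketch assumes the column-major storage implied by lda and ldb.

```csharp
using System;

static class TrsmReference
{
    // Hypothetical CPU helper (not a ManagedCuda API): solves A * X = alpha * B
    // for a lower-triangular, non-unit-diagonal A (m x m) and overwrites
    // B (m x n) with the solution X.
    // Assumption: column-major storage, element (i, j) lives at i + j * ld.
    public static void SolveLeftLower(int m, int n, double alpha,
                                      double[] a, int lda, double[] b, int ldb)
    {
        for (int j = 0; j < n; j++)          // each column of B is an independent right-hand side
        {
            for (int i = 0; i < m; i++)      // forward substitution, top row first
            {
                double sum = alpha * b[i + j * ldb];
                for (int k = 0; k < i; k++)
                    sum -= a[i + k * lda] * b[k + j * ldb];   // b[k] already holds x[k]
                b[i + j * ldb] = sum / a[i + i * lda];
            }
        }
    }
}
```

For example, with a = {2, 1, 0, 4} (the 2x2 matrix with rows (2, 0) and (1, 4)), b = {2, 9}, m = 2, n = 1, lda = ldb = 2 and alpha = 1, the call leaves b = {1, 2}.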
Trsm(SideMode, FillMode, Operation, DiagType, Int32, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
This function performs the triangular matrix-matrix multiplication C = alpha * Op(A) * B if side==SideMode.Left, or C = alpha * B * Op(A) if side==SideMode.Right, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, B and C are m*n matrices, and alpha is a scalar.
Notice that in order to achieve better parallelism CUBLAS differs from the BLAS API only for this routine. The BLAS API assumes an in-place implementation (with results written back to B), while the CUBLAS API assumes an out-of-place implementation (with results written into C). The application can obtain the in-place functionality of BLAS in the CUBLAS API by passing the address of the matrix B in place of the matrix C. No other overlapping in the input parameters is supported.
Declaration
public void Trsm(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, CudaDeviceVariable<double> alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> B, int ldb, CudaDeviceVariable<double> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix B, with matrix A sized accordingly. |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. If alpha==0 then A is not referenced and B does not have to be a valid input. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * m. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<System.Double> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
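The out-of-place product, and the option of passing B in place of C, can be illustrated with a small CPU sketch. It covers one configuration only (side==SideMode.Left, lower-triangular A, no transpose, non-unit diagonal) and assumes column-major storage. `TrmmReference` and `MultiplyLeftLower` are hypothetical names, not ManagedCuda APIs, and the bottom-up row order is a property of this sketch rather than a statement about how CUBLAS is implemented.

```csharp
using System;

static class TrmmReference
{
    // Hypothetical CPU helper (not a ManagedCuda API): C = alpha * A * B with a
    // lower-triangular, non-unit-diagonal A (m x m); B and C are m x n.
    // Assumption: column-major storage, element (i, j) lives at i + j * ld.
    // Rows are visited bottom-up so that passing the same array for b and c
    // (with ldb == ldc) only ever reads entries that have not been overwritten.
    public static void MultiplyLeftLower(int m, int n, double alpha,
                                         double[] a, int lda,
                                         double[] b, int ldb,
                                         double[] c, int ldc)
    {
        for (int j = 0; j < n; j++)
        {
            for (int i = m - 1; i >= 0; i--)
            {
                double sum = 0.0;
                for (int k = 0; k <= i; k++)          // only the lower triangle of A is read
                    sum += a[i + k * lda] * b[k + j * ldb];
                c[i + j * ldc] = alpha * sum;
            }
        }
    }
}
```

With a = {2, 1, 0, 4}, b = {1, 2}, m = 2, n = 1, all leading dimensions equal to 2 and alpha = 1, the result is c = {2, 9}. Passing b for both the B and C arguments yields the same values in b.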
Trsm(SideMode, FillMode, Operation, DiagType, Int32, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
This function solves the triangular linear system with multiple right-hand sides Op(A) * X = alpha * B if side==SideMode.Left, or X * Op(A) = alpha * B if side==SideMode.Right, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, X and B are m*n matrices, and alpha is a scalar.
The solution X overwrites the right-hand-sides B on exit.
Declaration
public void Trsm(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> B, int ldb)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of X. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix B, with matrix A sized accordingly. |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. If alpha==0 then A is not referenced and B does not have to be a valid input. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * m. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
Trsm(SideMode, FillMode, Operation, DiagType, Int32, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
This function performs the triangular matrix-matrix multiplication C = alpha * Op(A) * B if side==SideMode.Left, or C = alpha * B * Op(A) if side==SideMode.Right, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, B and C are m*n matrices, and alpha is a scalar.
Notice that in order to achieve better parallelism CUBLAS differs from the BLAS API only for this routine. The BLAS API assumes an in-place implementation (with results written back to B), while the CUBLAS API assumes an out-of-place implementation (with results written into C). The application can obtain the in-place functionality of BLAS in the CUBLAS API by passing the address of the matrix B in place of the matrix C. No other overlapping in the input parameters is supported.
Declaration
public void Trsm(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, CudaDeviceVariable<float> alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> B, int ldb, CudaDeviceVariable<float> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix B, with matrix A sized accordingly. |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication. If alpha==0 then A is not referenced and B does not have to be a valid input. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * m. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<System.Single> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Trsm(SideMode, FillMode, Operation, DiagType, Int32, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function solves the triangular linear system with multiple right-hand sides Op(A) * X = alpha * B if side==SideMode.Left, or X * Op(A) = alpha * B if side==SideMode.Right, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, X and B are m*n matrices, and alpha is a scalar.
The solution X overwrites the right-hand-sides B on exit.
Declaration
public void Trsm(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of X. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix B, with matrix A sized accordingly. |
| cuDoubleComplex | alpha | scalar used for multiplication. If alpha==0 then A is not referenced and B does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * m. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
Trsm(SideMode, FillMode, Operation, DiagType, Int32, Int32, cuDoubleComplex, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the triangular matrix-matrix multiplication C = alpha * Op(A) * B if side==SideMode.Left, or C = alpha * B * Op(A) if side==SideMode.Right, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, B and C are m*n matrices, and alpha is a scalar.
Notice that in order to achieve better parallelism CUBLAS differs from the BLAS API only for this routine. The BLAS API assumes an in-place implementation (with results written back to B), while the CUBLAS API assumes an out-of-place implementation (with results written into C). The application can obtain the in-place functionality of BLAS in the CUBLAS API by passing the address of the matrix B in place of the matrix C. No other overlapping in the input parameters is supported.
Declaration
public void Trsm(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, cuDoubleComplex alpha, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> B, int ldb, CudaDeviceVariable<cuDoubleComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix B, with matrix A sized accordingly. |
| cuDoubleComplex | alpha | scalar used for multiplication. If alpha==0 then A is not referenced and B does not have to be a valid input. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * m. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<cuDoubleComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Trsm(SideMode, FillMode, Operation, DiagType, Int32, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function solves the triangular linear system with multiple right-hand sides Op(A) * X = alpha * B if side==SideMode.Left, or X * Op(A) = alpha * B if side==SideMode.Right, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, X and B are m*n matrices, and alpha is a scalar.
The solution X overwrites the right-hand-sides B on exit.
Declaration
public void Trsm(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of X. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix B, with matrix A sized accordingly. |
| cuFloatComplex | alpha | scalar used for multiplication. If alpha==0 then A is not referenced and B does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * m. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
Trsm(SideMode, FillMode, Operation, DiagType, Int32, Int32, cuFloatComplex, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function performs the triangular matrix-matrix multiplication C = alpha * Op(A) * B if side==SideMode.Left, or C = alpha * B * Op(A) if side==SideMode.Right, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, B and C are m*n matrices, and alpha is a scalar.
Notice that in order to achieve better parallelism CUBLAS differs from the BLAS API only for this routine. The BLAS API assumes an in-place implementation (with results written back to B), while the CUBLAS API assumes an out-of-place implementation (with results written into C). The application can obtain the in-place functionality of BLAS in the CUBLAS API by passing the address of the matrix B in place of the matrix C. No other overlapping in the input parameters is supported.
Declaration
public void Trsm(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, cuFloatComplex alpha, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> B, int ldb, CudaDeviceVariable<cuFloatComplex> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix B, with matrix A sized accordingly. |
| cuFloatComplex | alpha | scalar used for multiplication. If alpha==0 then A is not referenced and B does not have to be a valid input. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * m. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<cuFloatComplex> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Trsm(SideMode, FillMode, Operation, DiagType, Int32, Int32, Double, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
This function solves the triangular linear system with multiple right-hand sides Op(A) * X = alpha * B if side==SideMode.Left, or X * Op(A) = alpha * B if side==SideMode.Right, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, X and B are m*n matrices, and alpha is a scalar.
The solution X overwrites the right-hand-sides B on exit.
Declaration
public void Trsm(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, double alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> B, int ldb)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of X. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix B, with matrix A sized accordingly. |
| System.Double | alpha | scalar used for multiplication. If alpha==0 then A is not referenced and B does not have to be a valid input. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * m. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
Trsm(SideMode, FillMode, Operation, DiagType, Int32, Int32, Double, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
This function performs the triangular matrix-matrix multiplication C = alpha * Op(A) * B if side==SideMode.Left, or C = alpha * B * Op(A) if side==SideMode.Right, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, B and C are m*n matrices, and alpha is a scalar.
Notice that in order to achieve better parallelism CUBLAS differs from the BLAS API only for this routine. The BLAS API assumes an in-place implementation (with results written back to B), while the CUBLAS API assumes an out-of-place implementation (with results written into C). The application can obtain the in-place functionality of BLAS in the CUBLAS API by passing the address of the matrix B in place of the matrix C. No other overlapping in the input parameters is supported.
Declaration
public void Trsm(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, double alpha, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> B, int ldb, CudaDeviceVariable<double> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix B, with matrix A sized accordingly. |
| System.Double | alpha | scalar used for multiplication. If alpha==0 then A is not referenced and B does not have to be a valid input. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * m. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<System.Double> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
Trsm(SideMode, FillMode, Operation, DiagType, Int32, Int32, Single, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
This function solves the triangular linear system with multiple right-hand sides Op(A) * X = alpha * B if side==SideMode.Left, or X * Op(A) = alpha * B if side==SideMode.Right, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, X and B are m*n matrices, and alpha is a scalar.
The solution X overwrites the right-hand-sides B on exit.
Declaration
public void Trsm(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, float alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> B, int ldb)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of X. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix B, with matrix A sized accordingly. |
| System.Single | alpha | scalar used for multiplication. If alpha==0 then A is not referenced and B does not have to be a valid input. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * m. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
Trsm(SideMode, FillMode, Operation, DiagType, Int32, Int32, Single, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
This function performs the triangular matrix-matrix multiplication C = alpha * Op(A) * B if side==SideMode.Left, or C = alpha * B * Op(A) if side==SideMode.Right, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, B and C are m*n matrices, and alpha is a scalar.
Notice that in order to achieve better parallelism CUBLAS differs from the BLAS API only for this routine. The BLAS API assumes an in-place implementation (with results written back to B), while the CUBLAS API assumes an out-of-place implementation (with results written into C). The application can obtain the in-place functionality of BLAS in the CUBLAS API by passing the address of the matrix B in place of the matrix C. No other overlapping in the input parameters is supported.
Declaration
public void Trsm(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, float alpha, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> B, int ldb, CudaDeviceVariable<float> C, int ldc)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A is on the left or right of B. |
| FillMode | uplo | indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B, with matrix A sized accordingly. |
| System.Int32 | n | number of columns of matrix B, with matrix A sized accordingly. |
| System.Single | alpha | scalar used for multiplication. If alpha==0 then A is not referenced and B does not have to be a valid input. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * m. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | B | array of dimensions ldb * n. |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B. |
| CudaDeviceVariable<System.Single> | C | array of dimensions ldc * n. |
| System.Int32 | ldc | leading dimension of two-dimensional array used to store matrix C. |
TrsmBatched(SideMode, FillMode, Operation, DiagType, Int32, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function solves an array of triangular linear systems with multiple right-hand-sides.
The solution overwrites the right-hand-sides on exit.
No test for singularity or near-singularity is included in this function.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor. The current implementation limits the dimensions m and n to 32.
Declaration
public void TrsmBatched(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, CudaDeviceVariable<cuDoubleComplex> alpha, CudaDeviceVariable<CUdeviceptr> A, int lda, CudaDeviceVariable<CUdeviceptr> B, int ldb, int batchCount)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A[i] is on the left or right of X[i]. |
| FillMode | uplo | indicates if matrix A[i] lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A[i]) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A[i] are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B[i], with matrix A[i] sized accordingly. |
| System.Int32 | n | number of columns of matrix B[i], with matrix A[i] sized accordingly. |
| CudaDeviceVariable<cuDoubleComplex> | alpha | scalar used for multiplication. If alpha==0 then A[i] is not referenced and B[i] does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | A | array of device pointers, each pointing to an array of dimensions lda x m with lda>=max(1,m) if side==SideMode.Left, and lda x n with lda>=max(1,n) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | B | array of device pointers, each pointing to an array of dimensions ldb x n with ldb>=max(1,m). |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B[i]. |
| System.Int32 | batchCount | number of pointers contained in A and B. |
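The size constraints for the batched routine can be checked on the host before the call. The sketch below is a hypothetical helper, not part of ManagedCuda: `TrsmBatchedChecks` and `TryValidate` are illustrative names, a plain `bool sideLeft` stands in for SideMode so the block stays self-contained, and the limit of 32 is taken from the description above and may not apply to every CUBLAS version. It also assumes that the order of A[i] follows side (m on the left, n on the right).

```csharp
using System;

static class TrsmBatchedChecks
{
    // Hypothetical host-side check (not a ManagedCuda API). Returns true when
    // the arguments satisfy the documented constraints; otherwise returns
    // false and describes the first violated constraint in 'error'.
    public static bool TryValidate(bool sideLeft, int m, int n,
                                   int lda, int ldb, int batchCount,
                                   out string error)
    {
        error = "";
        if (m < 0 || n < 0) { error = "m and n must be non-negative"; return false; }
        if (m > 32 || n > 32) { error = "m and n are limited to 32"; return false; }

        // Assumption: A[i] is m x m when applied from the left, n x n from the right.
        int orderOfA = sideLeft ? m : n;
        if (lda < Math.Max(1, orderOfA)) { error = "lda is smaller than the order of A[i]"; return false; }
        if (ldb < Math.Max(1, m)) { error = "ldb is smaller than the row count of B[i]"; return false; }
        if (batchCount < 0) { error = "batchCount must be non-negative"; return false; }
        return true;
    }
}
```

A call such as `TrsmBatchedChecks.TryValidate(true, 4, 8, 4, 4, 10, out var error)` returns true, while the same arguments with `sideLeft` set to false fail because lda would then need to be at least 8.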
TrsmBatched(SideMode, FillMode, Operation, DiagType, Int32, Int32, CudaDeviceVariable<cuFloatComplex>, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function solves an array of triangular linear systems with multiple right-hand-sides.
The solution overwrites the right-hand-sides on exit.
No test for singularity or near-singularity is included in this function.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor. The current implementation limits the dimensions m and n to 32.
Declaration
public void TrsmBatched(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, CudaDeviceVariable<cuFloatComplex> alpha, CudaDeviceVariable<CUdeviceptr> A, int lda, CudaDeviceVariable<CUdeviceptr> B, int ldb, int batchCount)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A[i] is on the left or right of X[i]. |
| FillMode | uplo | indicates if matrix A[i] lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A[i]) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A[i] are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B[i], with matrix A[i] sized accordingly. |
| System.Int32 | n | number of columns of matrix B[i], with matrix A[i] sized accordingly. |
| CudaDeviceVariable<cuFloatComplex> | alpha | scalar used for multiplication. If alpha==0 then A[i] is not referenced and B[i] does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | A | array of device pointers, each pointing to an array of dimensions lda x m with lda>=max(1,m) if side==SideMode.Left, and lda x n with lda>=max(1,n) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | B | array of device pointers, each pointing to an array of dimensions ldb x n with ldb>=max(1,m). |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B[i]. |
| System.Int32 | batchCount | number of pointers contained in A and B. |
TrsmBatched(SideMode, FillMode, Operation, DiagType, Int32, Int32, CudaDeviceVariable<Double>, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function solves an array of triangular linear systems with multiple right-hand-sides.
The solution overwrites the right-hand-sides on exit.
No test for singularity or near-singularity is included in this function.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor. The current implementation limits the dimensions m and n to 32.
Declaration
public void TrsmBatched(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, CudaDeviceVariable<double> alpha, CudaDeviceVariable<CUdeviceptr> A, int lda, CudaDeviceVariable<CUdeviceptr> B, int ldb, int batchCount)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A[i] is on the left or right of X[i]. |
| FillMode | uplo | indicates if matrix A[i] lower or upper part is stored, the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A[i]) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A[i] are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B[i], with matrix A[i] sized accordingly. |
| System.Int32 | n | number of columns of matrix B[i], with matrix A[i] sized accordingly. |
| CudaDeviceVariable<System.Double> | alpha | scalar used for multiplication. If alpha==0 then A[i] is not referenced and B[i] does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | A | array of device pointers, each pointing to an array of dimensions lda x m with lda>=max(1,m) if side==SideMode.Left, and lda x n with lda>=max(1,n) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | B | array of device pointers, each pointing to an array of dimensions ldb x n with ldb>=max(1,m). |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B[i]. |
| System.Int32 | batchCount | number of pointers contained in A and B. |
TrsmBatched(SideMode, FillMode, Operation, DiagType, Int32, Int32, CudaDeviceVariable<Single>, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function solves an array of triangular linear systems with multiple right-hand-sides.
The solution overwrites the right-hand-sides on exit.
No test for singularity or near-singularity is included in this function.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor. The current implementation limits the dimensions m and n to 32.
Declaration
public void TrsmBatched(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, CudaDeviceVariable<float> alpha, CudaDeviceVariable<CUdeviceptr> A, int lda, CudaDeviceVariable<CUdeviceptr> B, int ldb, int batchCount)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A[i] is on the left or right of X[i]. |
| FillMode | uplo | indicates whether the lower or upper part of matrix A[i] is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A[i]) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A[i] are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B[i], with matrix A[i] sized accordingly. |
| System.Int32 | n | number of columns of matrix B[i], with matrix A[i] sized accordingly. |
| CudaDeviceVariable<System.Single> | alpha | scalar used for multiplication; if alpha==0 then A[i] is not referenced and B[i] does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | A | array of device pointers, each pointing to an array of dim. lda x m with lda>=max(1,m) if side==CUBLAS_SIDE_LEFT and lda x n with lda>=max(1,n) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | B | array of device pointers, each pointing to an array of dim. ldb x n with ldb>=max(1,m). |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B[i]. |
| System.Int32 | batchCount | number of pointers contained in A and B. |
TrsmBatched(SideMode, FillMode, Operation, DiagType, Int32, Int32, ref cuDoubleComplex, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function solves an array of triangular linear systems with multiple right-hand-sides.
The solution overwrites the right-hand-sides on exit.
No test for singularity or near-singularity is included in this function.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor. The current implementation limits the dimensions m and n to 32.
Declaration
public void TrsmBatched(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, ref cuDoubleComplex alpha, CudaDeviceVariable<CUdeviceptr> A, int lda, CudaDeviceVariable<CUdeviceptr> B, int ldb, int batchCount)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A[i] is on the left or right of X[i]. |
| FillMode | uplo | indicates whether the lower or upper part of matrix A[i] is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A[i]) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A[i] are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B[i], with matrix A[i] sized accordingly. |
| System.Int32 | n | number of columns of matrix B[i], with matrix A[i] sized accordingly. |
| cuDoubleComplex | alpha | scalar used for multiplication; if alpha==0 then A[i] is not referenced and B[i] does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | A | array of device pointers, each pointing to an array of dim. lda x m with lda>=max(1,m) if side==CUBLAS_SIDE_LEFT and lda x n with lda>=max(1,n) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | B | array of device pointers, each pointing to an array of dim. ldb x n with ldb>=max(1,m). |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B[i]. |
| System.Int32 | batchCount | number of pointers contained in A and B. |
TrsmBatched(SideMode, FillMode, Operation, DiagType, Int32, Int32, ref cuFloatComplex, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function solves an array of triangular linear systems with multiple right-hand-sides.
The solution overwrites the right-hand-sides on exit.
No test for singularity or near-singularity is included in this function.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor. The current implementation limits the dimensions m and n to 32.
Declaration
public void TrsmBatched(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, ref cuFloatComplex alpha, CudaDeviceVariable<CUdeviceptr> A, int lda, CudaDeviceVariable<CUdeviceptr> B, int ldb, int batchCount)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A[i] is on the left or right of X[i]. |
| FillMode | uplo | indicates whether the lower or upper part of matrix A[i] is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A[i]) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A[i] are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B[i], with matrix A[i] sized accordingly. |
| System.Int32 | n | number of columns of matrix B[i], with matrix A[i] sized accordingly. |
| cuFloatComplex | alpha | scalar used for multiplication; if alpha==0 then A[i] is not referenced and B[i] does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | A | array of device pointers, each pointing to an array of dim. lda x m with lda>=max(1,m) if side==CUBLAS_SIDE_LEFT and lda x n with lda>=max(1,n) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | B | array of device pointers, each pointing to an array of dim. ldb x n with ldb>=max(1,m). |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B[i]. |
| System.Int32 | batchCount | number of pointers contained in A and B. |
TrsmBatched(SideMode, FillMode, Operation, DiagType, Int32, Int32, ref Double, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function solves an array of triangular linear systems with multiple right-hand-sides.
The solution overwrites the right-hand-sides on exit.
No test for singularity or near-singularity is included in this function.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor. The current implementation limits the dimensions m and n to 32.
Declaration
public void TrsmBatched(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, ref double alpha, CudaDeviceVariable<CUdeviceptr> A, int lda, CudaDeviceVariable<CUdeviceptr> B, int ldb, int batchCount)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A[i] is on the left or right of X[i]. |
| FillMode | uplo | indicates whether the lower or upper part of matrix A[i] is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A[i]) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A[i] are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B[i], with matrix A[i] sized accordingly. |
| System.Int32 | n | number of columns of matrix B[i], with matrix A[i] sized accordingly. |
| System.Double | alpha | scalar used for multiplication; if alpha==0 then A[i] is not referenced and B[i] does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | A | array of device pointers, each pointing to an array of dim. lda x m with lda>=max(1,m) if side==CUBLAS_SIDE_LEFT and lda x n with lda>=max(1,n) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | B | array of device pointers, each pointing to an array of dim. ldb x n with ldb>=max(1,m). |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B[i]. |
| System.Int32 | batchCount | number of pointers contained in A and B. |
TrsmBatched(SideMode, FillMode, Operation, DiagType, Int32, Int32, ref Single, CudaDeviceVariable<CUdeviceptr>, Int32, CudaDeviceVariable<CUdeviceptr>, Int32, Int32)
This function solves an array of triangular linear systems with multiple right-hand-sides.
The solution overwrites the right-hand-sides on exit.
No test for singularity or near-singularity is included in this function.
This function is intended to be used for matrices of small sizes where the launch overhead is a significant factor. The current implementation limits the dimensions m and n to 32.
Declaration
public void TrsmBatched(SideMode side, FillMode uplo, Operation trans, DiagType diag, int m, int n, ref float alpha, CudaDeviceVariable<CUdeviceptr> A, int lda, CudaDeviceVariable<CUdeviceptr> B, int ldb, int batchCount)
Parameters
| Type | Name | Description |
|---|---|---|
| SideMode | side | indicates if matrix A[i] is on the left or right of X[i]. |
| FillMode | uplo | indicates whether the lower or upper part of matrix A[i] is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A[i]) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A[i] are unity and should not be accessed. |
| System.Int32 | m | number of rows of matrix B[i], with matrix A[i] sized accordingly. |
| System.Int32 | n | number of columns of matrix B[i], with matrix A[i] sized accordingly. |
| System.Single | alpha | scalar used for multiplication; if alpha==0 then A[i] is not referenced and B[i] does not have to be a valid input. |
| CudaDeviceVariable<CUdeviceptr> | A | array of device pointers, each pointing to an array of dim. lda x m with lda>=max(1,m) if side==CUBLAS_SIDE_LEFT and lda x n with lda>=max(1,n) otherwise. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A[i]. |
| CudaDeviceVariable<CUdeviceptr> | B | array of device pointers, each pointing to an array of dim. ldb x n with ldb>=max(1,m). |
| System.Int32 | ldb | leading dimension of two-dimensional array used to store matrix B[i]. |
| System.Int32 | batchCount | number of pointers contained in A and B. |
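
To make the semantics of the TrsmBatched overloads concrete, the following is a minimal CPU-only sketch of one case: side = left, uplo = lower, no transpose, non-unit diagonal, real double data. It solves A[i] * X[i] = alpha * B[i] for every batch entry and overwrites B[i] with X[i], using column-major storage with leading dimensions lda and ldb. The class and method names (`TrsmBatchedReference`, `SolveLeftLower`) are hypothetical and are not part of ManagedCuda; the sketch uses plain managed arrays instead of device pointers and is not how the library computes the result.

```csharp
using System;

// Hypothetical CPU reference, not part of ManagedCuda.
public static class TrsmBatchedReference
{
    // For each batch entry i, solves A[i] * X[i] = alpha * B[i] where A[i] is an
    // m x m lower-triangular, non-unit matrix (column-major, leading dimension lda)
    // and B[i] is m x n (column-major, leading dimension ldb). B[i] is overwritten.
    // As in the GPU routine, no singularity check is performed.
    public static void SolveLeftLower(int m, int n, double alpha,
                                      double[][] A, int lda,
                                      double[][] B, int ldb)
    {
        for (int i = 0; i < A.Length; i++)
        {
            for (int col = 0; col < n; col++)
            {
                // Forward substitution on one right-hand-side column.
                for (int row = 0; row < m; row++)
                {
                    double sum = alpha * B[i][row + col * ldb];
                    for (int k = 0; k < row; k++)
                    {
                        // Entries above 'row' already hold the solution values.
                        sum -= A[i][row + k * lda] * B[i][k + col * ldb];
                    }
                    B[i][row + col * ldb] = sum / A[i][row + row * lda];
                }
            }
        }
    }
}
```

A reference of this kind can be useful for checking the GPU result on small inputs, which is the size range these overloads target.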
Trsv(FillMode, Operation, DiagType, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function solves the triangular linear system with a single right-hand-side op(A)x = b, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, and x and b are vectors. The solution x overwrites the right-hand-side b on exit. n is given by x.Size.
Declaration
public void Trsv(FillMode uplo, Operation trans, DiagType diag, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
Trsv(FillMode, Operation, DiagType, CudaDeviceVariable<cuFloatComplex>, Int32, CudaDeviceVariable<cuFloatComplex>, Int32)
This function solves the triangular linear system with a single right-hand-side op(A)x = b, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, and x and b are vectors. The solution x overwrites the right-hand-side b on exit. n is given by x.Size.
Declaration
public void Trsv(FillMode uplo, Operation trans, DiagType diag, CudaDeviceVariable<cuFloatComplex> A, int lda, CudaDeviceVariable<cuFloatComplex> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| CudaDeviceVariable<cuFloatComplex> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuFloatComplex> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
Trsv(FillMode, Operation, DiagType, CudaDeviceVariable<Double>, Int32, CudaDeviceVariable<Double>, Int32)
This function solves the triangular linear system with a single right-hand-side op(A)x = b, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, and x and b are vectors. The solution x overwrites the right-hand-side b on exit. n is given by x.Size.
Declaration
public void Trsv(FillMode uplo, Operation trans, DiagType diag, CudaDeviceVariable<double> A, int lda, CudaDeviceVariable<double> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| CudaDeviceVariable<System.Double> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Double> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
Trsv(FillMode, Operation, DiagType, CudaDeviceVariable<Single>, Int32, CudaDeviceVariable<Single>, Int32)
This function solves the triangular linear system with a single right-hand-side op(A)x = b, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, and x and b are vectors. The solution x overwrites the right-hand-side b on exit. n is given by x.Size.
Declaration
public void Trsv(FillMode uplo, Operation trans, DiagType diag, CudaDeviceVariable<float> A, int lda, CudaDeviceVariable<float> x, int incx)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| Operation | trans | operation op(A) that is non- or (conj.) transpose. |
| DiagType | diag | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| CudaDeviceVariable<System.Single> | A | array of dimensions lda * n, with lda >= max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<System.Single> | x | vector with n elements. |
| System.Int32 | incx | stride between consecutive elements of x. |
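
As an illustration of what the Trsv overloads compute, here is a minimal CPU-only sketch of one case: uplo = upper, no transpose, non-unit diagonal, real double data. It performs back substitution on a column-major matrix with leading dimension lda and honors the stride incx between consecutive elements of x. The names `TrsvReference` and `SolveUpper` are hypothetical, the stride is assumed to be positive, and n is passed explicitly here rather than derived from the vector as the wrapper does.

```csharp
using System;

// Hypothetical CPU reference, not part of ManagedCuda.
public static class TrsvReference
{
    // Solves A * x = b in place, where A is an n x n upper-triangular, non-unit
    // matrix stored column-major with leading dimension lda. On entry x holds b
    // (element j at index j * incx); on exit it holds the solution.
    // Assumes incx > 0. No singularity check is performed.
    public static void SolveUpper(int n, double[] A, int lda, double[] x, int incx)
    {
        // Back substitution: the last unknown is solved first.
        for (int row = n - 1; row >= 0; row--)
        {
            double sum = x[row * incx];
            for (int k = row + 1; k < n; k++)
            {
                sum -= A[row + k * lda] * x[k * incx];
            }
            x[row * incx] = sum / A[row + row * lda];
        }
    }
}
```

With incx greater than 1, the elements of x that lie between the strided positions are left untouched.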
Ztpttr(FillMode, Int32, CudaDeviceVariable<cuDoubleComplex>, CudaDeviceVariable<cuDoubleComplex>, Int32)
This function performs the conversion from the triangular packed format to the triangular format.
If uplo == CUBLAS_FILL_MODE_LOWER then the elements of AP are copied into the lower triangular part of the triangular matrix A and the upper part of A is left untouched.
If uplo == CUBLAS_FILL_MODE_UPPER then the elements of AP are copied into the upper triangular part of the triangular matrix A and the lower part of A is left untouched.
Declaration
public void Ztpttr(FillMode uplo, int n, CudaDeviceVariable<cuDoubleComplex> AP, CudaDeviceVariable<cuDoubleComplex> A, int lda)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether matrix AP contains the lower or upper part of matrix A. |
| System.Int32 | n | number of rows and columns of matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | AP | array with A stored in packed format. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda x n, with lda>=max(1,n). The opposite side of A is left untouched. |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
Ztrttp(FillMode, Int32, CudaDeviceVariable<cuDoubleComplex>, Int32, CudaDeviceVariable<cuDoubleComplex>)
This function performs the conversion from the triangular format to the triangular packed format.
If uplo == CUBLAS_FILL_MODE_LOWER then the lower triangular part of the triangular matrix A is copied into the array AP.
If uplo == CUBLAS_FILL_MODE_UPPER then the upper triangular part of the triangular matrix A is copied into the array AP.
Declaration
public void Ztrttp(FillMode uplo, int n, CudaDeviceVariable<cuDoubleComplex> A, int lda, CudaDeviceVariable<cuDoubleComplex> AP)
Parameters
| Type | Name | Description |
|---|---|---|
| FillMode | uplo | indicates whether the lower or upper part of matrix A is referenced. |
| System.Int32 | n | number of rows and columns of matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | A | array of dimensions lda x n, with lda>=max(1,n). |
| System.Int32 | lda | leading dimension of two-dimensional array used to store matrix A. |
| CudaDeviceVariable<cuDoubleComplex> | AP | array with A stored in packed format. |
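
The two conversions described for Ztpttr and Ztrttp can be sketched on the CPU as follows. The sketch assumes the usual BLAS packed layout, in which the stored triangle is laid out column by column without gaps, and it uses real double values instead of cuDoubleComplex to stay short. The names `PackedFormatReference`, `PackedToFull`, and `FullToPacked` are hypothetical and are not part of ManagedCuda.

```csharp
using System;

// Hypothetical CPU reference, not part of ManagedCuda.
public static class PackedFormatReference
{
    // Copies the packed triangle AP into the lower (lower == true) or upper
    // triangle of the column-major n x n matrix A with leading dimension lda.
    // The opposite triangle of A is left untouched.
    public static void PackedToFull(bool lower, int n, double[] AP, double[] A, int lda)
    {
        int p = 0; // running index into the packed array
        for (int col = 0; col < n; col++)
        {
            int first = lower ? col : 0;
            int last = lower ? n - 1 : col;
            for (int row = first; row <= last; row++)
            {
                A[row + col * lda] = AP[p++];
            }
        }
    }

    // Copies the lower or upper triangle of A into the packed array AP,
    // visiting the elements in the same order as PackedToFull.
    public static void FullToPacked(bool lower, int n, double[] A, int lda, double[] AP)
    {
        int p = 0;
        for (int col = 0; col < n; col++)
        {
            int first = lower ? col : 0;
            int last = lower ? n - 1 : col;
            for (int row = first; row <= last; row++)
            {
                AP[p++] = A[row + col * lda];
            }
        }
    }
}
```

In both directions the packed array needs n * (n + 1) / 2 elements, and converting to packed format and back reproduces the selected triangle exactly.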