Compute Unified Device Architecture (CUDA) programmers typically rely on high-level libraries such as cuBLAS and cuFFT, and use CUDA itself as a glue language that ties them together. Basic Linear Algebra Subprograms (BLAS) dates back to the late 1970s and was originally written in Fortran. cuBLAS is an implementation of BLAS on the CUDA architecture. The cuBLAS Application Programming Interface (API) supports vector and matrix operations such as addition and multiplication, allowing developers to accelerate their programs easily. cuBLAS API functions generally come in four variants, one per data type: single-precision floating point, double-precision floating point, single-precision complex, and double-precision complex. cuFFT is the CUDA Fast Fourier Transform (FFT) library, which enables work in the frequency domain by computing the frequency components of signals such as images or audio.
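To make the four data types concrete, the following is a minimal CPU-only sketch of the BLAS Level-1 AXPY operation (y = alpha * x + y), written once as a template and instantiated for each type. It follows the usual BLAS prefix convention (S, D, C, Z), which cuBLAS mirrors in cublasSaxpy, cublasDaxpy, cublasCaxpy, and cublasZaxpy. The helper names below (axpy, saxpy, daxpy, caxpy, zaxpy) are hypothetical and are not the cuBLAS API; the sketch only illustrates what each variant computes and does not run on the GPU.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for AXPY: y[i] = alpha * x[i] + y[i].
// This is an illustration only; cuBLAS performs the same arithmetic on the GPU.
template <typename T>
void axpy(T alpha, const std::vector<T>& x, std::vector<T>& y) {
    assert(x.size() == y.size());  // both vectors must have the same length
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = alpha * x[i] + y[i];
    }
}

// One wrapper per data type, named after the BLAS prefix convention.
// S: single-precision real
inline void saxpy(float alpha, const std::vector<float>& x,
                  std::vector<float>& y) {
    axpy(alpha, x, y);
}

// D: double-precision real
inline void daxpy(double alpha, const std::vector<double>& x,
                  std::vector<double>& y) {
    axpy(alpha, x, y);
}

// C: single-precision complex
inline void caxpy(std::complex<float> alpha,
                  const std::vector<std::complex<float>>& x,
                  std::vector<std::complex<float>>& y) {
    axpy(alpha, x, y);
}

// Z: double-precision complex
inline void zaxpy(std::complex<double> alpha,
                  const std::vector<std::complex<double>>& x,
                  std::vector<std::complex<double>>& y) {
    axpy(alpha, x, y);
}
```

For example, calling saxpy(2.0f, x, y) with x = {1, 2, 3} and y = {10, 20, 30} leaves y holding {12, 24, 36}. A real cuBLAS call would additionally need a library handle, device memory for both vectors, and stride arguments.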