Parallelized Implementation of Universal Visual Computer

doi:10.1201/b10648-50

ABSTRACT

A cellular neural network (CNN) consists of identical cells arranged in a two-dimensional grid, each connected only to its neighboring cells and each holding an input, a current state, and a next state. Distant cells influence one another indirectly, through data propagation between neighboring cells. With the CNN approach, different applications, such as visual processing and optimization, are achieved using the same algorithm with a different set of parameters. Although the local connectivity of the cells is well suited to implementation on a GPU, two additional issues must be addressed: first, the computational model of the GPU is based on four-channel data, whereas CNN data is conventionally organized in a one-channel format; second, data transfer between the GPU and main memory is much slower than data transfer between the CPU and main memory.
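
The two ideas above (a neighbor-only update driven by a small parameter set, and the mismatch between one-channel CNN data and four-channel GPU data) can be illustrated with a minimal CPU-side sketch. Everything in it is an assumption made for illustration: the update rule is the commonly used CNN state equation x' = -x + A*y + B*u + z integrated with a forward-Euler step, the names `cnn_step`, `pack_rgba`, `A`, `B`, `z`, and `dt` are hypothetical, and packing four horizontally adjacent cells into one texel is only one possible layout, not necessarily the one used in this chapter.

```python
def output(x):
    """Piecewise-linear cell output: the state clamped to [-1, 1]."""
    return max(-1.0, min(1.0, x))


def cnn_step(state, inp, A, B, z, dt=0.1):
    """One forward-Euler step of x' = -x + A*y + B*u + z on a 2D grid.

    A and B are 3x3 templates (feedback and input weights), z is a bias.
    Each cell reads only its 3x3 neighborhood; neighbors outside the grid
    contribute nothing (an assumed boundary condition).
    """
    rows, cols = len(state), len(state[0])
    nxt = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = z
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < rows and 0 <= nj < cols:
                        acc += A[di + 1][dj + 1] * output(state[ni][nj])
                        acc += B[di + 1][dj + 1] * inp[ni][nj]
            # The next state is written to a separate grid, so every cell
            # can be updated independently (and hence in parallel).
            nxt[i][j] = state[i][j] + dt * (-state[i][j] + acc)
    return nxt


def pack_rgba(grid):
    """Pack a one-channel grid into four-channel texels.

    Four horizontally adjacent cells share one texel; the row width is
    assumed to be a multiple of 4.
    """
    return [[tuple(row[k:k + 4]) for k in range(0, len(row), 4)]
            for row in grid]
```

Changing only `A`, `B`, and `z` changes what the network computes while `cnn_step` itself stays the same, which is the sense in which one algorithm serves different applications. Packing four cells per texel lets each four-channel GPU operation advance four cells at once, at the cost of extra care at texel boundaries when gathering neighbors.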