ABSTRACT

Algorithm-level optimization and instruction-level optimization are tightly coupled. Many programmers can optimize the implementation of a specific algorithm using MMX technology; however, without optimization of the algorithm itself, the achievable speedup is limited. Conversely, many algorithm developers can optimize a digital signal processing (DSP) algorithm in terms of its number of operations (multiplications or additions), but without implementation details, the operation count cannot be translated directly into the number of clock cycles spent on the CPU. Moreover, many different algorithms can often accomplish the same task. For the best performance of DSP/multimedia applications on personal computers, we should therefore consider algorithm/MMX technology co-optimization.