ABSTRACT
Efficient rendering of multi-fragment effects has long been a major challenge in
computer graphics, because these effects require fragments to be processed in depth order
rather than rasterization order. The main obstacle is that modern GPUs are
optimized to capture only the nearest or furthest fragment per pixel in each
geometry pass. The classical depth peeling algorithm [Mammen 89, Everitt 01]
provides a simple but robust solution by peeling off one layer per pass, but the
repeated rasterization becomes a performance bottleneck for large-scale scenes with high
complexity. The k-buffer [Bavoil et al. 07, Liu et al. 06] captures k fragments in
a single pass but suffers from serious read-modify-write (RMW) hazards.