Reviews of GPU Gems 3

Chap 1: Generating Complex Procedural Terrains Using the GPU
Chap 2: Crowd Rendering
Chap 3: DirectX 10 Blend Shapes
Chap 4: Next-Generation SpeedTree Rendering
Chap 5: Generic Adaptive Mesh Refinement
Chap 6: GPU-Generated Procedural Wind Animations for Trees
Chap 7: Point-Based Visualization of Metaballs on a GPU
Chap 8: Summed-Area Variance Shadow Maps
Chap 9: Interactive Cinematic Relighting with Global Illumination
Chap 10: Parallel-Split Shadow Maps on Programmable GPUs
Chap 11: Efficient and Robust Shadow Volumes Using Hierarchical Occlusion Culling and Geometry Shaders
Chap 12: High-Quality Ambient Occlusion
Chap 13: Volumetric Light Scattering as a Post-Process
Chap 14: Advanced Techniques for Realistic Real-Time Skin Rendering
Chap 15: Playable Universal Capture
Chap 16: Vegetation Procedural Animation and Shading in Crysis
Chap 17: Robust Multiple Specular Reflections and Refractions
Chap 18: Relaxed Cone Stepping for Relief Mapping
Chap 19: Deferred Shading in Tabula Rasa
Chap 20: GPU-Based Importance Sampling
Chap 21: True Imposters
Chap 22: Baking Normal Maps on the GPU
Chap 23: High-Speed, Off-Screen Particles
Chap 24: The Importance of Being Linear
Chap 25: Rendering Vector Art on the GPU
Chap 26: Object Detection by Color: Using the GPU for Real-Time Video Image Processing
Chap 27: Motion Blur as a Post-Processing Effect
Chap 28: Practical Post-Process Depth of Field
Chap 29: Real-Time Rigid Body Simulation on GPUs
Chap 30: Real-Time Simulation and Rendering of 3D Fluids
Chap 31: Fast N-Body Simulation with CUDA
Chap 32: Broad-Phase Collision Detection with CUDA
Chap 33: LCP Algorithm for Collision Detection Using CUDA
Chap 34: Signed Distance Fields Using Single-Pass GPU Scan Conversion of Tetrahedra
Chap 35: Fast Virus Signature Matching on the GPU
Chap 36: AES Encryption and Decryption on the GPU
Chap 37: Efficient Random Number Generation and Application Using CUDA
Chap 38: Imaging Earth's Subsurface Using CUDA
Chap 39: Parallel Prefix Sum (SCAN) with CUDA
Chap 40: Incremental Computation of the Gaussian
Chap 41: Using the Geometry Shader for Compact and Variable-Length GPU Feedback

GPU Gems 3 is a classic book written by GPU experts from industry (NVIDIA, Adobe, Microsoft, etc.) and academia. It introduces how to use the GPU in graphics applications and offers some tips on CUDA programming.

As a graphics fan (layman) and a student researcher in computer architecture and compilers, I have been writing CUDA for several years (though I'm still a naive CUDA programmer), and I've been wondering why GPUs and CUDA are designed the way they are. Nowadays people use GPUs for ML workloads, but the GPU was not originally designed for ML. If we were to invent a neural network accelerator, which components should be preserved and which could be dropped? (Many prevalent NPUs, such as those from Graphcore and Untether, adopt an at-memory compute design with large on-chip memory; why doesn't NVIDIA do the same?) Also, graphics has been deeply influenced by DL in recent years (NeRF, DLSS, etc.); is there an opportunity to redesign the rendering flow?

With these questions in mind, I started reading this book and am leaving some notes here. I will not copy and paste material from the book (it is completely free online), but will instead record some of my own thoughts and insights.

Note that this book was published in 2007, when the latest GPU generation was Tesla. We should be aware that the technology has evolved a lot since then, and some of the arguments and experience no longer hold on newer GPU architectures. However, some classic algorithms (such as parallel reduction) are still in use.
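To make the parallel reduction mentioned above concrete, here is a minimal sketch of its halving-stride pattern, written as a sequential Python simulation rather than a real CUDA kernel. It is not taken from the book; the function name `tree_reduce` and the zero-padding to a power of two are my own assumptions for illustration.

```python
def tree_reduce(values):
    """Sum `values` using the halving-stride pattern of a GPU block reduction.

    Sequential simulation: the list `data` stands in for a thread block's
    shared memory, and each pass of the inner loop stands in for one thread.
    """
    data = list(values)
    n = len(data)
    if n == 0:
        return 0

    # Pad to the next power of two with the additive identity (an assumption
    # made here to keep the stride logic simple).
    size = 1
    while size < n:
        size *= 2
    data.extend([0] * (size - n))

    stride = size // 2
    while stride > 0:
        # On a GPU, all `stride` additions of one step run in parallel.
        # Reads come from the upper half and writes go to the lower half,
        # so the additions within a step do not conflict with each other.
        for tid in range(stride):
            data[tid] += data[tid + stride]
        # A real kernel would need a block-wide barrier here before the
        # next step reads the partial sums.
        stride //= 2

    return data[0]


if __name__ == "__main__":
    print(tree_reduce(range(1, 9)))
```

The point of the pattern is that n elements are summed in about log2(n) parallel steps instead of n - 1 sequential additions, which is why it has carried over to newer architectures.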

This book is organized as follows:

Chap 1-7: Geometry
Chap 8-13: Light and Shadows
Chap 14-20: Rendering
Chap 21-28: Image Effects
Chap 29-34: Physics Simulation
Chap 35-41: GPU Computing

Chap 1: Generating Complex Procedural Terrains Using the GPU

Chap 2: Crowd Rendering

Chap 3: DirectX 10 Blend Shapes

Chap 4: Next-Generation SpeedTree Rendering

Chap 5: Generic Adaptive Mesh Refinement

Chap 6: GPU-Generated Procedural Wind Animations for Trees

Chap 7: Point-Based Visualization of Metaballs on a GPU

Chap 8: Summed-Area Variance Shadow Maps

Chap 9: Interactive Cinematic Relighting with Global Illumination

Chap 10: Parallel-Split Shadow Maps on Programmable GPUs

Chap 11: Efficient and Robust Shadow Volumes Using Hierarchical Occlusion Culling and Geometry Shaders

Chap 12: High-Quality Ambient Occlusion

Chap 13: Volumetric Light Scattering as a Post-Process

Chap 14: Advanced Techniques for Realistic Real-Time Skin Rendering

Chap 15: Playable Universal Capture

Chap 16: Vegetation Procedural Animation and Shading in Crysis

Chap 17: Robust Multiple Specular Reflections and Refractions

Chap 18: Relaxed Cone Stepping for Relief Mapping

Chap 19: Deferred Shading in Tabula Rasa

Chap 20: GPU-Based Importance Sampling

Chap 21: True Imposters

Chap 22: Baking Normal Maps on the GPU

Chap 23: High-Speed, Off-Screen Particles

Chap 24: The Importance of Being Linear

Chap 25: Rendering Vector Art on the GPU

Chap 26: Object Detection by Color: Using the GPU for Real-Time Video Image Processing

Chap 27: Motion Blur as a Post-Processing Effect

Chap 28: Practical Post-Process Depth of Field

Chap 29: Real-Time Rigid Body Simulation on GPUs

Chap 30: Real-Time Simulation and Rendering of 3D Fluids

Chap 31: Fast N-Body Simulation with CUDA

Chap 32: Broad-Phase Collision Detection with CUDA

Chap 33: LCP Algorithm for Collision Detection Using CUDA

Chap 34: Signed Distance Fields Using Single-Pass GPU Scan Conversion of Tetrahedra