GPU Instancing is a widely used graphics optimization technique in games and simulations that allows rendering multiple copies of the same object (with the same mesh) in a scene efficiently without requiring additional draw calls.
Traditionally, when rendering multiple identical objects, each instance requires a separate draw call to the GPU. With GPU instancing, however, the GPU can render multiple instances of an object using a single draw call.
Draw calls are instructions sent to the GPU to render a specific mesh or set of meshes with associated materials and shaders.
GPU Instancing is particularly beneficial in various scenarios within game development.
The technique works best when multiple instances to be rendered share the same mesh and material with slight variations in properties, such as position, scale, and color. Also, the shader should support instanced rendering so that it can take input from instance data to differentiate between individual instances while rendering. It's essential to identify scenarios where instances are similar enough to leverage this optimization effectively.
Let's say we need to display 1000 copies of a mesh in Unity. Of course, you can use the Instantiate() method within a loop to create the instances, but the CPU then has to send the GPU a separate instruction to draw each mesh, one after the other. This results in a higher number of draw calls.
Fig 3.1 Traditional rendering with over 2000 draw calls
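For reference, the loop-based approach described above could look roughly like the sketch below. The class name `NaiveSpawner` and the `cubePrefab` field are hypothetical and only serve as an illustration; the actual draw call count you see also depends on which batching features are active in your project.

```csharp
using UnityEngine;

// Hypothetical baseline component: spawns one GameObject per cube.
// Names and default values are illustrative assumptions.
public class NaiveSpawner : MonoBehaviour
{
    // Prefab with a MeshFilter and MeshRenderer (assumed to be a cube)
    [SerializeField] private GameObject cubePrefab;
    // Number of copies to create
    [SerializeField] private int count = 1000;

    private void Start()
    {
        for (int i = 0; i < count; i++)
        {
            // Random point within -50 to 50 units on each axis
            var pos = new Vector3(
                Random.Range(-50f, 50f),
                Random.Range(-50f, 50f),
                Random.Range(-50f, 50f));

            // Each call creates a full GameObject that is rendered on its own
            Instantiate(cubePrefab, pos, Random.rotation, transform);
        }
    }
}
```

Every spawned object here carries its own Transform and renderer, which is the overhead that instanced drawing avoids.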
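Unity provides built-in functionality to implement GPU instancing with ease. All you need to do is call the Graphics.DrawMeshInstanced() method every frame and feed it the per-instance data. You need to specify the mesh, the material, and an array of transformation matrices (one per instance).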
Additionally, you can provide SubmeshIndex, Material Property Block, Shadow Parameters, Layer, Camera, and LightProbeUsage.
Let's see how we can create the above floating cubes using GPU Instancing in Unity. We will use the default cube mesh provided in Unity and a material with the Standard shader applied as this shader supports GPU instancing.
```csharp
using UnityEngine;

public class InstanceManager : MonoBehaviour
{
    // Number of instances to be drawn
    [SerializeField, Range(1, 1023)] private int count = 100;
    // Reference to the mesh object
    [SerializeField] private Mesh instanceMesh;
    // Reference to the material that supports GPU instancing
    [SerializeField] private Material instanceMaterial;

    // An array of matrices to handle transformations
    private Matrix4x4[] transformMatrices;

    // Start is called before the first frame update
    private void Start()
    {
        // Initialize matrices array
        transformMatrices = new Matrix4x4[count];

        // Iterate for instance count
        for (int i = 0; i < count; i++)
        {
            // Get a random point within -50 to 50 units on each axis
            // (float arguments, so positions are not limited to whole numbers)
            var pos = new Vector3(Random.Range(-50f, 50f), Random.Range(-50f, 50f), Random.Range(-50f, 50f));
            // Get a random rotation
            var rot = Random.rotation;
            // Assign position and rotation along with unit scale
            transformMatrices[i] = Matrix4x4.TRS(pos, rot, Vector3.one);
        }
    }

    // Update is called once per frame
    private void Update()
    {
        // Rendering should be called in Update to render each frame
        Graphics.DrawMeshInstanced(instanceMesh, 0, instanceMaterial, transformMatrices, count);
    }
}
```
Once you have created the above component class, you can attach it to an empty game object. You can set the count between 1 and 1023, since the maximum number of instances supported per batch is 1023. Also, assign the cube mesh and a material using the Standard shader to the inspector fields of this component, and ensure GPU Instancing is enabled on the material.
Fig 3.2 GPU Instancing enabled material
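If you need more instances than the 1023 per-batch limit allows, one possible workaround is to split the matrices into several arrays and issue one instanced draw per array. The sketch below is an assumption about how this could be done, not part of the original component; `BatchedInstanceManager` and its default count of 5000 are hypothetical.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical variant that draws more than 1023 instances by batching.
public class BatchedInstanceManager : MonoBehaviour
{
    // Maximum number of instances accepted by a single DrawMeshInstanced call
    private const int MaxPerBatch = 1023;

    [SerializeField] private int count = 5000;
    [SerializeField] private Mesh instanceMesh;
    [SerializeField] private Material instanceMaterial;

    // One matrix array per batch, each holding at most MaxPerBatch entries
    private readonly List<Matrix4x4[]> batches = new List<Matrix4x4[]>();

    private void Start()
    {
        for (int start = 0; start < count; start += MaxPerBatch)
        {
            // The last batch may be smaller than MaxPerBatch
            int size = Mathf.Min(MaxPerBatch, count - start);
            var batch = new Matrix4x4[size];

            for (int i = 0; i < size; i++)
            {
                var pos = new Vector3(
                    Random.Range(-50f, 50f),
                    Random.Range(-50f, 50f),
                    Random.Range(-50f, 50f));
                batch[i] = Matrix4x4.TRS(pos, Random.rotation, Vector3.one);
            }

            batches.Add(batch);
        }
    }

    private void Update()
    {
        // One instanced draw per batch instead of one draw per cube
        foreach (var batch in batches)
        {
            Graphics.DrawMeshInstanced(instanceMesh, 0, instanceMaterial, batch, batch.Length);
        }
    }
}
```

With the default count of 5000, this issues five DrawMeshInstanced calls per frame (four full batches of 1023 and one of 908).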
With the count set to 1000, the InstanceManager component renders 1000 cubes using GPU instancing, and you may notice that the overall number of draw calls is now significantly reduced.
Fig 3.3 Cubes drawn with GPU Instancing
What is happening here is that the GPU already has the cube mesh data and only needs to draw copies of it in a single instruction. Though a single material is shared among all the instances, per-instance attributes can still be customized through a material property block or through custom shader code.
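As a rough illustration of the material property block route, the sketch below passes a per-instance color array to the draw call. It assumes the material's shader declares `_Color` as a per-instance (instanced) property; a stock material may not do this, in which case a custom shader would be needed. The class name `ColoredInstanceManager` is hypothetical.

```csharp
using UnityEngine;

// Hypothetical variant of the instance manager with per-instance colors.
public class ColoredInstanceManager : MonoBehaviour
{
    [SerializeField, Range(1, 1023)] private int count = 100;
    [SerializeField] private Mesh instanceMesh;
    // Assumption: this material's shader exposes _Color as an instanced property
    [SerializeField] private Material instanceMaterial;

    private Matrix4x4[] transformMatrices;
    private MaterialPropertyBlock propertyBlock;

    private void Start()
    {
        transformMatrices = new Matrix4x4[count];
        var colors = new Vector4[count];

        for (int i = 0; i < count; i++)
        {
            var pos = new Vector3(
                Random.Range(-50f, 50f),
                Random.Range(-50f, 50f),
                Random.Range(-50f, 50f));
            transformMatrices[i] = Matrix4x4.TRS(pos, Random.rotation, Vector3.one);

            // Color converts implicitly to Vector4 (r, g, b, a)
            colors[i] = Random.ColorHSV();
        }

        // Element i of the array is read by instance i in the shader
        propertyBlock = new MaterialPropertyBlock();
        propertyBlock.SetVectorArray("_Color", colors);
    }

    private void Update()
    {
        // Same draw call as before, with the property block as an extra argument
        Graphics.DrawMeshInstanced(instanceMesh, 0, instanceMaterial,
            transformMatrices, count, propertyBlock);
    }
}
```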
```csharp
...

    // Update is called once per frame
    private void Update()
    {
        // Iterate for instance count
        for (int i = 0; i < count; ++i)
        {
            // Get current position
            var pos = transformMatrices[i].GetPosition();
            // Get current rotation
            var rot = transformMatrices[i].rotation;

            // Create delta rotation
            Quaternion deltaRotation = Quaternion.AngleAxis(rotationSpeed * 10.0f * Time.deltaTime, Vector3.up);
            // Update current rotation
            rot *= deltaRotation;

            // Apply position and rotation back to matrix along with unit scale
            transformMatrices[i] = Matrix4x4.TRS(pos, rot, Vector3.one);
        }

        // Rendering should be called in Update to render each frame
        Graphics.DrawMeshInstanced(instanceMesh, 0, instanceMaterial, transformMatrices, count);
    }

...
```
Finally, if you need to transform the matrices to rotate the instances, you can do so as shown above. One thing to keep in mind is that GPU instancing only reduces draw calls, whereas any matrix transformation applied inside an Update() method still executes on the CPU. Complex transformations can therefore lead to frame rate drops.
To overcome this, you can leverage parallel processing to run your complex math computations across multiple threads using Unity ECS/DOTS/Burst. Alternatively, you could move the math computation to the GPU using compute shaders.
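As one possible take on the multi-threaded route, the sketch below moves the per-instance rotation into a parallel job. It assumes the Burst and Collections packages are installed (the `[BurstCompile]` attribute is optional), and the class name `JobifiedInstanceManager` and the `rotationSpeed` default are hypothetical. Because every instance has unit scale, multiplying its matrix by a small rotation matrix on the right should have the same effect as the `rot *= deltaRotation` step shown earlier.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

// Hypothetical variant that rotates instances on worker threads.
public class JobifiedInstanceManager : MonoBehaviour
{
    [SerializeField, Range(1, 1023)] private int count = 1000;
    [SerializeField] private Mesh instanceMesh;
    [SerializeField] private Material instanceMaterial;
    [SerializeField] private float rotationSpeed = 9f;

    // Native copy used by the job, managed copy used by the draw call
    private NativeArray<Matrix4x4> nativeMatrices;
    private Matrix4x4[] transformMatrices;

    [BurstCompile]
    private struct RotateJob : IJobParallelFor
    {
        public NativeArray<Matrix4x4> matrices;
        public Matrix4x4 delta;

        // Called once per index, spread across worker threads
        public void Execute(int i)
        {
            // Right-multiplying applies the rotation in the instance's local space
            matrices[i] = matrices[i] * delta;
        }
    }

    private void Start()
    {
        transformMatrices = new Matrix4x4[count];
        nativeMatrices = new NativeArray<Matrix4x4>(count, Allocator.Persistent);

        for (int i = 0; i < nativeMatrices.Length; i++)
        {
            var pos = new Vector3(
                Random.Range(-50f, 50f),
                Random.Range(-50f, 50f),
                Random.Range(-50f, 50f));
            nativeMatrices[i] = Matrix4x4.TRS(pos, Random.rotation, Vector3.one);
        }
    }

    private void Update()
    {
        // Build the per-frame rotation once on the main thread
        var deltaRotation = Quaternion.AngleAxis(rotationSpeed * 10.0f * Time.deltaTime, Vector3.up);
        var job = new RotateJob
        {
            matrices = nativeMatrices,
            delta = Matrix4x4.Rotate(deltaRotation)
        };

        // Process in chunks of 64 and wait for completion before drawing
        job.Schedule(nativeMatrices.Length, 64).Complete();

        // DrawMeshInstanced takes a managed array, so copy the results back
        nativeMatrices.CopyTo(transformMatrices);
        Graphics.DrawMeshInstanced(instanceMesh, 0, instanceMaterial,
            transformMatrices, transformMatrices.Length);
    }

    private void OnDestroy()
    {
        // Persistent native memory must be released manually
        if (nativeMatrices.IsCreated) nativeMatrices.Dispose();
    }
}
```

Calling Complete() right after Schedule() still makes the main thread wait, but the matrix math itself is spread across worker threads. Repeated multiplication can also accumulate small floating-point errors over time, so a longer-running version might rebuild the matrices from stored positions and rotations instead.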
Hope you had a great time reading! Cheers 🍻
Copyright © 2024 rendercodeninja. All rights reserved.