We proudly present CGIris – a series of informative blogs, each unpacking a different topic in the field of 3D and computer animation.
This is the second instalment in the CGIris series. In this blog, we look at the science and theory behind 3D rendering. How does 3D rendering work? Why is rendering so complex?
Rendering is essentially a mathematical computation. It takes all of the separate components of an animation and combines them together. The moving images that we see on our television, cinema, or mobile device screens can be summarised by the following equation…
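Assuming the equation referred to here is the standard rendering equation from the computer graphics literature (the symbol names below are the conventional ones, not taken from this blog), a common way to write it is:

```latex
L_o(x, \omega_o) = L_e(x, \omega_o)
  + \int_{\Omega} f_r(x, \omega_i, \omega_o)\, L_i(x, \omega_i)\, (\omega_i \cdot n)\, \mathrm{d}\omega_i
```

Here L_o is the light leaving point x in direction ω_o, L_e is any light the surface emits itself, f_r describes how the surface reflects light, L_i is the light arriving from direction ω_i, and n is the surface normal.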
In non-algebraic terms, rendering can be summarised as the below:
An equation (algebraic or otherwise) probably doesn’t help anybody visualise what’s really happening when we try to animate something. To understand how rendering works, it’s useful to consider how we see light, and therefore objects.
When the human eye observes an object, it is observing how light behaves; how it travels around and interacts with the other elements in the environment. As light reaches an object, it is reflected. This reflection of light contains the information – such as the colour of that object, where it’s located, its size, etc. – that is eventually communicated to the human brain.
This communication occurs when this light hits the human eye and travels through to the retina. Here the light activates light-sensitive receptors, called rods and cones, which contain pigment molecules. When these receptors are triggered, they send an electrical message to the brain, which effectively translates the reflection of light into the correct information.
Now that we know how light interacts with the human eye and brain, we can begin to understand how ‘light’ is rendered in a computer-generated image or scene.
The process of determining how light behaves in a computer generated image is called ray tracing. Ray tracing uses an algorithm to trace and simulate this behaviour. Each ray of light is tracked through a scene, gathering information on how that light interacts with assets, textures and so on. This information is then returned to the camera, which in effect acts as the human eye.
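As a minimal sketch of the idea, the Python below fires one ray per pixel from a camera at the origin and tests it against a single sphere. The scene, the function names `hit_sphere` and `render`, and the text output are all assumptions made for illustration; a production ray tracer does far more than this.

```python
import math

def hit_sphere(origin, direction, center, radius):
    """Return the distance along the ray to the nearest hit in front of it, or None."""
    # Vector from the sphere centre to the ray origin.
    oc = tuple(o - c for o, c in zip(origin, center))
    # Coefficients of the quadratic |origin + t * direction - center|^2 = radius^2.
    a = sum(d * d for d in direction)
    b = 2.0 * sum(o * d for o, d in zip(oc, direction))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4 * a * c
    if disc < 0:
        return None  # the ray misses the sphere entirely
    t = (-b - math.sqrt(disc)) / (2 * a)  # nearer of the two intersections
    return t if t > 0 else None

def render(width, height):
    """Trace one ray per pixel; the camera sits at the origin looking down -z."""
    rows = []
    for j in range(height):
        row = ""
        for i in range(width):
            # Map the pixel centre onto an image plane spanning [-1, 1] at z = -1.
            x = 2 * (i + 0.5) / width - 1
            y = 1 - 2 * (j + 0.5) / height
            t = hit_sphere((0, 0, 0), (x, y, -1), (0, 0, -3), 1.0)
            row += "#" if t is not None else "."
        rows.append(row)
    return rows

for line in render(32, 16):
    print(line)
```

Each `#` marks a pixel whose ray reported a hit back to the camera; a real renderer would go on to work out colour, shading and further bounces at that point.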
This seems relatively straightforward so far, right? A ray of light is traced around a scene, communicating the information it picks up back to the camera. Easy?!
While the concept can be broadly explained in simple terms, the reality is much more complex. For context, each ray typically bounces 2-4 times before it returns to the camera; this means it interacts with around 2-4 objects in any one scene.
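One reason a handful of bounces is often enough is that each surface reflects only part of the light that reaches it, so a ray carries less energy after every bounce. The sketch below illustrates this with made-up reflectance values; the function name `remaining_energy` and the 50% figure are assumptions, not measurements from any real scene.

```python
def remaining_energy(reflectances, max_bounces=4):
    """Fraction of a ray's energy left after each bounce, up to max_bounces."""
    energy = 1.0
    history = []
    for reflectance in reflectances[:max_bounces]:
        energy *= reflectance  # each surface reflects only part of the light
        history.append(energy)
    return history

# Five hypothetical surfaces that each reflect half of the incoming light.
print(remaining_energy([0.5, 0.5, 0.5, 0.5, 0.5]))
# → [0.5, 0.25, 0.125, 0.0625]
```

By the fourth bounce the ray carries about 6% of its original energy in this example, so further bounces add little to the final pixel.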
More complex still is the fact that light rarely follows a single, simple path. Rays also rarely hit objects in a uniform manner and reflect accordingly. Skin, for example, has depth, with multiple layers of differing translucency and colour, so the behaviour of light suddenly becomes difficult to trace. If a ray were traced only from the surface of skin, you’d have a very flat, one-dimensional texture that is not physically realistic. To accurately mimic the complexity of how light interacts with objects, you need a more sophisticated technique called subsurface scattering.
Subsurface scattering describes light that enters a translucent object rather than simply bouncing off its surface; how it behaves depends on how translucent the material is. Skin, for example, has a high degree of translucency. This means that as light hits the surface, some of it passes into the deeper layers. That light is then scattered around inside the material, with part of it absorbed along the way, before the remainder exits the surface at a different location. A good example of how this works in real life is when you hold a light behind someone’s ear: the ear glows red.
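To make the "exits at a different location" idea concrete, here is a toy random walk in Python. It is only a sketch under simplifying assumptions (a flat surface, two dimensions, fixed step length, a made-up absorption probability) and is not how production renderers implement subsurface scattering; the name `scatter_photon` is hypothetical.

```python
import math
import random

def scatter_photon(rng, absorb_prob=0.1, step=1.0, max_steps=1000):
    """Random-walk one light path under a flat surface (depth > 0 is inside).

    Returns the sideways distance between the entry and exit points,
    or None if the path is absorbed before it gets back out.
    """
    x, depth = 0.0, step  # the light enters at x = 0 and moves one step straight down
    for _ in range(max_steps):
        if rng.random() < absorb_prob:
            return None  # absorbed inside the material
        angle = rng.uniform(0.0, 2.0 * math.pi)  # scatter in a random direction
        x += step * math.cos(angle)
        depth += step * math.sin(angle)
        if depth < 0.0:
            return x  # back above the surface: the light exits here
    return None  # treated as absorbed if it never escapes

rng = random.Random(1)
exits = [d for d in (scatter_photon(rng) for _ in range(1000)) if d is not None]
mean_offset = sum(abs(d) for d in exits) / len(exits)
print(f"{len(exits)} of 1000 paths exited; mean sideways offset {mean_offset:.2f} steps")
```

The paths that do escape come out some distance from where they went in, which is the soft glow a surface-only trace would miss.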
To add further complexity, tracing the movement of light is just one piece of the rendering puzzle. As with most things, the accuracy of the final outcome depends directly on how much information you gather through sufficient sampling. As with scientific experiments, the more samples you collect of the environment you’re creating, the more accurate the outcome will be. In this case, the more physically accurate the scene will look.
Undersampling leads to what the industry calls “noise”: a grainy, speckled look that appears when a scene is not sampled enough. It also causes “aliasing”, the term for artefacts such as jagged edges that appear when detail falling between samples is lost.
For example, if each pixel was only sampled once, it would only communicate back one specific colour or visual property. There would, therefore, be gaps between samples and the image would be left to ‘make up’ the information it doesn’t have. With more samples, more information is available about the scene, giving the final product a more complete look.
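The effect of sample count can be sketched in a few lines of Python. The example assumes a single pixel that straddles a hard edge, with 30% of its area white and the rest black; the `scene` and `sample_pixel` functions are hypothetical stand-ins for a real renderer.

```python
import random

def scene(x):
    """Hypothetical scene: white (1.0) left of an edge at x = 0.3, black (0.0) right of it."""
    return 1.0 if x < 0.3 else 0.0

def sample_pixel(n_samples, rng):
    """Average n_samples random samples taken across a pixel spanning x in [0, 1)."""
    total = sum(scene(rng.random()) for _ in range(n_samples))
    return total / n_samples

rng = random.Random(42)
for n in (1, 4, 16, 256, 4096):
    print(n, round(sample_pixel(n, rng), 3))
```

With one sample the pixel can only come out fully white or fully black; as the count grows, the estimate settles towards the true value of 0.3.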
More sampling requires more work, more mathematical calculations, and more computing power.
To give it to you in numbers:
Producing computer-generated images, or CGI, requires immense calculation. As technology advances, so does consumer demand for photorealistic animated features. Over the last twenty-five years, there has already been huge change in the industry. Just take Toy Story as an example. The first feature was released in 1995. It looks worlds apart from Toy Story 4, released this year. In render terms, each frame in the first Toy Story film took anywhere between 45 minutes and 30 hours to render, with a total of 114,240 frames – that’s a long time! When rendering Toy Story 4, each frame took a staggering 60 to 160 hours to render. Comparing the quickest frames in each film (45 minutes versus 60 hours), that is an 80-fold increase in render time per frame!
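The comparison between the two films is simple arithmetic on the figures quoted above, and the sketch below works it through. The variable names are illustrative, and the single-machine total is a hypothetical way of expressing scale, since studios spread rendering across many machines.

```python
FRAMES = 114_240                      # frames in the first Toy Story
TS1_MIN_H, TS1_MAX_H = 0.75, 30.0     # 45 minutes to 30 hours per frame
TS4_MIN_H, TS4_MAX_H = 60.0, 160.0    # 60 to 160 hours per frame

# Ratio of per-frame render times at each end of the quoted ranges.
low_ratio = TS4_MIN_H / TS1_MIN_H
high_ratio = TS4_MAX_H / TS1_MAX_H

# Total render hours for the first film if every frame took the minimum or maximum time.
ts1_total_hours = (FRAMES * TS1_MIN_H, FRAMES * TS1_MAX_H)

print(f"Quickest frames: {low_ratio:.0f}x longer")
print(f"Slowest frames: {high_ratio:.1f}x longer")
print(f"Toy Story total: {ts1_total_hours[0]:,.0f} to {ts1_total_hours[1]:,.0f} machine-hours")
```

The 80-fold figure comes from comparing the quickest frames; comparing the slowest frames gives a ratio of roughly five.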
Fortunately for the CGI industry, as our demand for realism intensifies, technology in the media and entertainment industry advances with it. Nevertheless, it’s always nice to appreciate the level of work that goes into our household favourites!