Showing results for tags 'optimization'.


Found 5 results

  1. Hi Everyone! In the past few months we have worked on optimising a scene for a VR game, which required a terrain shader that runs better than the one Unity currently uses. We first made a shader in URP that can handle 4 terrain layers, then moved on to a version that can handle 8 terrain layers. What's unique about this shader is that we were able to reduce the number of textures sampled per terrain layer, going from Unity's default of 3 (BaseColor, Normal, MaskMap) down to 2. That cuts the texture cost by a third compared with a typical Unity terrain shader. For example, a terrain with 8 layers using Unity's system would require 24 textures (8 layers * 3 textures per layer), whilst our shader would only need 16 textures (8 layers * 2 textures per layer). Another area of improvement was reducing the amount of data tied to the terrain mesh that the shader needs to sample. We were able to remove the need for vertex normals on the terrain mesh, as well as the need for UVs. With some initial testing in RenderDoc, our new shader (with 8 terrain layers) saved 11 textures (Unity's had 32, ours had 21). We also saved 10 variables used in one of the constant buffers (Unity's had 50, ours had 40). From testing this early development shader 3 times in RenderDoc, I was able to infer that the current version of our optimised shader runs about 9.47% faster than Unity's. It still needs more testing in different scenarios and on different devices before we can draw any further conclusions. The aim of this shader is to look identical to Unity's terrain shader, but run a little faster: Unity's Terrain Shader: Our Optimised Shader: Unity's Terrain Shader: Our Optimised Shader (still needs some work on getting the normals correct!): Let us know what you think about this approach!
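The texture arithmetic in the post above can be checked with a few lines of plain Python. This is only a sketch of the counting, not shader code; the function and constant names are made up for illustration.

```python
# Count the textures a terrain shader needs, using the figures quoted in the post.
# Names here are illustrative assumptions, not part of any Unity API.

def terrain_texture_count(layers: int, textures_per_layer: int) -> int:
    """Total textures needed for the given number of terrain layers."""
    return layers * textures_per_layer

UNITY_TEXTURES_PER_LAYER = 3      # BaseColor, Normal, MaskMap
OPTIMISED_TEXTURES_PER_LAYER = 2  # the reduced layout described in the post

unity_total = terrain_texture_count(8, UNITY_TEXTURES_PER_LAYER)
optimised_total = terrain_texture_count(8, OPTIMISED_TEXTURES_PER_LAYER)

# Fraction of textures saved relative to Unity's layout.
saving = 1 - optimised_total / unity_total

print(f"Unity: {unity_total}, optimised: {optimised_total}, saving: {saving:.1%}")
```

Dropping from 3 textures per layer to 2 removes one third of the textures whatever the layer count, so the same ratio holds for the 4-layer version.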
  2. Introduction Optimizing 3D scenes for mobile devices often requires a range of techniques to reduce the amount of work both the CPU and GPU must do to render a scene. Here are 5 quick tips to help increase the performance and framerate of your game on Android and iOS devices. Reduce Polycount Combine Meshes There is an overhead in Unity for keeping track of the many transforms of GameObjects in a scene, even if your meshes are quite low in polygons. Data for each mesh, including position, rotation, scale, vertex colors, vertex normals and UVs, is sent to the GPU for rendering, and the process of sending this group of data is called a batch. The more batches your scene has, the more work the CPU must do to send this data to the GPU. We can reduce the number of batches in a scene by using fewer individual GameObjects with meshes, and we can do this by merging meshes. Unity has built-in API calls that you can code against for this, but a lot of extra work is required on top of them, and it is a slow and painful process. Alternatively, you can use the Combine Meshes tool that comes with the PW Toolbox. More information about the PW Toolbox can be found here: Polygon budget While every game differs in polygon count, and mobile hardware can handle many more polygons than it used to, sticking to a guideline of around 100,000 vertices can help with your game's performance. Ideally you would keep a memory budget of around 1-3 MB per 3D mesh, which could range from 300-1000 vertices. Level of Detail Level of Detail (LOD) works by swapping higher and lower polygon versions of the same asset based on the percentage of the screen that the model takes up. This can be good in situations where your mobile device cannot handle many high polygon models in a scene at once.
However, LODs should only be set up if the number of polygons is affecting performance, as swapping lower poly versions of meshes in and out of view increases the memory load and adds more work on the CPU for the game to decide when to perform the swap. Another use case for LODs in Unity is their ability to hide (also known as culling) the object when it takes up less than a certain percentage of the screen. Hiding the mesh at a distance stops it rendering, thus reducing the total polygon count. This can also be done with the LOD Group component. Some assets come with multiple LOD levels, but this adds even more work for the game to decide which LOD to show, and as the number of levels increases you suffer from diminishing returns, where the LOD switching actually becomes more expensive than the extra polygons. For low poly assets we would typically only have a LOD 0 level and a Cull level. Occlusion Culling Occlusion culling is a method used to hide meshes that are obstructed from view by another mesh. This process works well in small mobile levels where areas are often occluded by large meshes, such as interiors with corridors. However, using this method on large scale open world mobile environments can in some cases cause more overhead than actual performance gains, because the occlusion calculations end up taking more time than the actual rendering of the items in the scene. To make use of occlusion culling, the object needs to be marked as either Occluder static or Occludee static. Setting the object to Occluder static means it can be used to test whether other objects are hidden behind it. Setting the object to Occludee static means it can be hidden if it is behind an Occluder object. Change Resolution Changing the resolution of the game can increase your performance when the main rendering cost is related to the number of pixels the scene must render (also called fill rate).
Settings to change the resolution can be found in Edit->Project Settings->Player->Resolution Scaling. You can set a fixed resolution for mobile devices by setting Resolution Scaling Mode to Fixed DPI (Dots per Inch) and entering a custom DPI in the Target DPI field. The DPI of your mobile device can be set there, but you can also set it to a value lower than the device’s native DPI to render the scene at a lower resolution. Another setting relating to Fixed DPI is located under Edit->Project Settings->Quality->Resolution Scaling Fixed DPI Factor. This setting is a multiplier for the previously mentioned Target DPI field. A value of 1 will result in the same setting as the Target DPI, whereas a value of 0.5 will scale the Target DPI by a half. In the example, the Target DPI is set to 400 and the Resolution Scaling Fixed DPI Factor is set to 0.5, so the resulting resolution would be 200 DPI. Disable unused features Depth and Opaque Textures Disabling the creation of Depth and Opaque textures rendered from the camera can reduce the time taken to render a frame. However, certain effects require these textures, such as post process effects and custom shaders that make use of _CameraDepthTexture and _CameraOpaqueTexture. If you are certain your scene doesn’t make use of these textures, it may be worth disabling them and measuring the performance gained. They can be disabled in the pipeline asset, under the General heading. HDR Turning off HDR in the pipeline asset settings can also reduce the time taken to render each frame, as HDR increases VRAM usage and requires a tone mapping process on top of the rendered image. Anti-Aliasing Anti-aliasing is used to reduce the jagged edges of objects in the scene. There are many ways this can be implemented, but each comes with its own drawbacks and added computations. Experimenting with each technique on the mobile device is recommended to see how it affects the look of the scene.
Unity recommends FXAA, which can be set on the camera component. Also compare how the scene looks without any Anti-Aliasing at all to see if you wish to include it in your project. Because anti-aliasing is mostly a fill rate cost, it is best to avoid the more computationally heavy types of Anti-Aliasing where possible. The MSAA type of Anti-Aliasing works at a hardware level by rendering the borders of polygon edges multiple times at a subpixel level. This effect has a few different levels (2x, 4x, 8x), each with an increase to the rendering cost. The MSAA type of Anti-Aliasing can be turned off in the pipeline asset, under the Quality heading. Measure the performance benefit against the change in visual quality to determine whether it is needed in your project. Lower Lighting Quality Assuming your scene is using forward rendering on a mobile device, reducing the number of real time lights calculated per pixel can benefit performance by reducing the number of polygons and draw calls. Settings for these changes can be found in the pipeline asset under the heading Lighting. Additional lights can be either Per Vertex or Per Pixel. Per Vertex will result in fewer computations but will give lower quality lights due to the data being interpolated. Per Pixel will increase the computation time but will result in higher quality lights. For additional lights viewed at a far distance, the cheaper Per Vertex option should be considered. Opting for a baked lighting approach will be more performant and can allow more lights, as Unity does not include these in any further lighting calculations at runtime. Reduce texture memory Reducing the size of your textures can help speed up the fill rate calculations for your GPU, as it has fewer pixels to process.
A quick way to find which textures are taking up the most space in your scene is by using the free tool Resource Checker (available here: https://assetstore.unity.com/packages/tools/utilities/resource-checker-3224). It provides a quick overview of all the textures, meshes and materials in the scene. Looking at each texture and seeing the biggest sizes can help you identify textures that may need to be reduced. Anything above 2048x2048 pixels on mobile can be considered too much, and an excessive number of textures in the 2048-4096 range will slow down your scene, so it’s best to stick with 512 and 1024 resolution textures. Where possible, try to reduce the number of textures by creating texture atlases. These images contain more than one texture (at a reduced resolution), so that multiple meshes can sample from the one image. For low poly models, having a color palette texture that all your assets reference for their colors is a common approach to reducing the number of textures present. Example of a texture atlas as seen in Minecraft Conclusion For a more in-depth article on optimizing a scene, including other areas to consider beyond the mobile environment, see our other articles: And don’t forget to check out our PW Toolbox, which contains Combine Meshes, an easy and customizable mesh combining tool!
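To put the texture size advice in the article above into numbers, here is a rough Python sketch of texture memory. It assumes uncompressed 32-bit RGBA (4 bytes per pixel) and an optional full mip chain; real projects normally use compressed formats, so treat the figures as an upper bound. The function name is hypothetical.

```python
# Rough estimate of texture memory. Assumption: uncompressed RGBA at
# 4 bytes per pixel; compressed mobile formats will be much smaller.

def texture_bytes(width: int, height: int, bytes_per_pixel: int = 4,
                  mipmaps: bool = True) -> int:
    """Sum the size of the base level and, optionally, every mip level."""
    total = 0
    w, h = width, height
    while True:
        total += w * h * bytes_per_pixel
        if not mipmaps or (w == 1 and h == 1):
            break
        # Each mip level halves both dimensions, down to 1x1.
        w, h = max(1, w // 2), max(1, h // 2)
    return total

for size in (512, 1024, 2048, 4096):
    mib = texture_bytes(size, size, mipmaps=False) / (1024 * 1024)
    print(f"{size}x{size}: {mib:.0f} MiB without mipmaps")
```

Halving a texture's resolution quarters its memory, which is why dropping from 2048 to 1024 tends to pay off quickly. A full mip chain adds roughly another third on top of the base level.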
  3. Hi Everyone! In conjunction with the new GeNa Mesh Extrusions tool, we are also looking at better blending options in the radius where the holes are cut. We are working on a tool that, when given an area, can create a set of PBR textures reflecting the final blended terrain at that position. This saves us from having to sample the terrain shader more times than actually needed, and allows us to explore other areas where sampling the color of the terrain below a surface could be used. Here are some screenshots showing the final result being sampled on a plane gameobject, next to the terrain it was sampled from. Can you notice the seams where the plane intersects the terrain? This tool also supports multiple terrains (4 terrains are used in this picture, intersecting where the area is captured, with one of them highlighted in orange). We also sampled some other data that Unity's terrain component provides as textures, such as its splatmaps, heightmaps and world normals. Here is a comparison without some of the more detailed terrain textures utilised: Before: After: Sampling the color of the terrain below an object also allows us to blend it with objects on top, such as this rock example! What other uses do you think this could have?
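As a rough illustration of what "the final blended terrain" means for a single texel, the Python sketch below blends layer colors using splatmap weights. This is the standard weighted blend and only an assumption about the general idea; it is not the tool's actual implementation, and the function name is hypothetical.

```python
# Blend terrain layer colors for one texel using splatmap weights.
# Illustrative only: a weighted average, not the tool's real code.

def blend_layers(weights, layer_colors):
    """Return the weighted average RGB color of the terrain layers."""
    total = sum(weights)
    if total == 0:
        return (0.0, 0.0, 0.0)
    # Normalise so the weights sum to 1, as splatmap weights usually do.
    normalised = [w / total for w in weights]
    return tuple(
        sum(w * color[channel] for w, color in zip(normalised, layer_colors))
        for channel in range(3)
    )

grass = (0.2, 0.6, 0.2)
rock = (0.5, 0.5, 0.5)
print(blend_layers([0.75, 0.25], [grass, rock]))
```

Baking this result into a texture once means later consumers can read one color rather than re-evaluating every layer.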
  4. The Scene Optimizer (Formerly: PW Toolbox) helps you to optimize your Unity projects to gain better performance in your game. The additional graphical headroom that you get with better frame rates allows you to increase the graphical fidelity in other aspects of your project, or to build up a “performance buffer” that allows users with weaker hardware to enjoy your product as well. Trailer / Tutorial: You can download the Scene Optimizer as a Pro subscriber from here: https://canopy.procedural-worlds.com/files/file/20-pw-toolbox/ Installation You can install the Scene Optimizer like any other product via the Package Manager in the Unity Editor, or by importing the .unitypackage file. Usage To start using the Scene Optimizer, simply open the Main Window by going to: Window > Procedural Worlds > Scene Optimizer > Main Window… This will open the main window of the Scene Optimizer, which lets you access most of its functionality. It is recommended to follow along with the Quickstart Guide in the next chapter to gain a basic understanding of the tool. Quick Start The Scene Optimizer can be accessed from a separate window inside the Unity Editor. 1. Once you have the Scene Optimizer imported into your project, it can be opened through Window -> Procedural Worlds -> Scene Optimizer -> Main Window… Once opened, you’ll be presented with a series of settings to configure: 2. For mesh combination, which is the main feature of the Scene Optimizer, you need to add a ‘Root GameObject’ that contains all of the meshes that you wish to combine. In this example, we have an ‘Original’ GameObject that contains all the original meshes as separate objects. 3. Normally you would configure some of the additional settings, but for the purpose of the quickstart, the default settings are fine. Select the ‘Debug Performance’ checkbox. This will create a test environment for us to compare quickly between the original Game Object setup vs.
the output of the Scene Optimizer, which will allow us to measure FPS improvements. 4. Then, click on ‘Optimize Scene’. This will create a new GameObject in the scene with a pre-configured UI to demonstrate the FPS improvement over the original GameObjects at runtime. You now have a bunch of optimized GameObjects in your scene! 5. Hit play and see the performance improvements: As you can see, the Original GameObjects are running at roughly 28 frames per second. We have quite a lot of separate GameObjects in this scene, so it’s not surprising that Unity has a lot of overhead. If we click on the green ‘Optimize’ button, you’ll notice that we are getting 212 frames per second! That’s roughly a 657% improvement over the original GameObjects in our scene! This quickstart introduction was the quickest possible way to perform an optimization for your scene. You can tweak your optimizations further by tuning the settings, especially by setting up layer based distance culling and adjusting it according to the objects being used in the scene. For more information please see the tutorial video, and take a look at the description of the Interface in the next sections. Clean Up The optimizer can create a lot of data depending on the complexity of your scene. You should do a 'Clean up' operation after you have optimised your scene. This will delete all the components that were added to your original objects and clean up other kinds of scene data. To do this, go to Window > Procedural Worlds > Scene Optimizer > Clean up and wait for the process to finish. Gaia Pro Support By default, the Scene Optimizer supports optimizing Gaia Terrain Streaming in your project. To see this in action, go ahead and create your Gaia Environment using the Terrain Streaming system as you normally would. You should then notice a new option open in the Scene Optimizer window.
This operation performs optimizations on the Gaia loaded terrains one after the other, similar to processing every scene manually by hand. It takes the spawned child game objects below the terrain as root objects for the optimization. Interface This chapter details the individual settings found in the Scene Optimizer window. Global Settings Panel The Settings panel shows the global settings for the various features of the Scene Optimizer. Snap Mode: If enabled, snapping will only happen on Terrains, otherwise snapping will be on Meshes. Offset Check: Position offset of the selected object, helps with checking if objects are under the ground. Distance Check: How far the raycast check will go. The higher the value, the more it will affect performance. Raise & Lower Amount: How much the object is raised or lowered into the ground. Key Bindings Panel Key bindings settings to control various features of the Scene Optimizer. Snap to Ground Key: Button used to snap objects to ground. Align to Slope Key: Button used to align objects to slope. Align & Snap to Ground Key: Button used to align object to slope & snap object to ground. Raise from Ground Key: Button used to raise the object up from the ground. Lower in Ground Key: Button used to lower the object into the ground. Optimize Key: Button used to perform Scene Optimization. Scene Optimization Panel This panel shows all settings pertaining to the operation of combining meshes to optimize the Mesh Renderers in your scene. Drop GameObjects (Drop Area): GameObjects dropped here will be added to the ‘Root GameObjects’ array. Root GameObjects: GameObjects added to this list will be used in the process of optimizing your scene. Optimize Commands: A list of optimization instructions to be performed with all of the gathered objects within the Root GameObjects. Save To Disk: Saves the optimized object's assets to disk (significantly reduces the load time of scenes).
Child Under Roots: Parents the optimized objects under the root GameObjects. Note: This is ignored when 'Debug Performance' is enabled. Debug Performance: Creates FPS UI and environment for testing performance when optimizing. Reset to Defaults: Resets the Scene Optimization settings back to the default settings. Optimize Scene: Performs the Combine Mesh Operation using all the configurations above. Optimization Settings Combine Commands are a set of instructions that are performed top down and filter GameObjects in the process based on Cell Size, Object Size, Filtered Materials, etc. Is Static: Static Meshes are required for Occlusion Culling to work properly. Mesh Format: The index format to restrict meshes to. UInt16 is recommended for Mobile builds. Note: See 'Mesh.IndexFormat' in Unity's Scripting API. Mesh Layer: The Object Layer of the Optimized Objects. Add Layer Culling: Adds Layer Culling to the Camera. Object Distance: The distance that Objects will render for this layer. Shadow Distance: The distance that Shadows will render for this layer. Object Viz Color: The visualization color for the object distance. Shadow Viz Color: The visualization color for the shadow distance. Disable Renderers: Disables the original mesh renderers. Merge Colliders: Merges the original colliders into the optimized objects. Add Colliders: Adds Mesh Colliders to the optimized objects (if missing). Collider Layer: The Layer to put all the newly created Colliders onto. Visualization Color: The global visualization color for the Optimized Objects. Use Large Ranges: Enables the use of larger values in the following properties. LOD Size Multiplier: Applies a multiplier to the final calculated LOD Size. Useful for controlling the render distance of generated LOD Groups. Add LOD Group: Adds LOD Groups to objects that do not have existing LOD Groups. LOD Cull Percentage: The percentage to cull renderers in the LOD Group.
Cell Size: The size of the cell partition to optimize your scene (in world units). Cell Offset: The offset of the cell partition to optimize your scene (in world units). Object Size Range: The minimum size of objects that will be collected for this Optimize Command (Inclusive). Filter Materials: Filter the Optimizing of objects based on Material. Material Entries: List of Material Entries to be used in the Scene Optimizing Process. Provides control over what Shaders can be used or not. Add new Material: Adds a new Material Entry for Manually filtering out GameObjects. Scan for Materials: Scans the Root GameObjects for all Materials being used in the current operation. Save To Disk: Saves the optimized object's assets to disk (significantly reduces the load time of scenes). Child Under Roots: Childs the combined meshes under the root GameObjects. Note: This is ignored when 'Debug Performance' is enabled. Debug Performance: Creates FPS UI and environment for testing performance when Combining Meshes. Optimization Information Keeping track of many individual GameObjects in a scene has a cost to performance. There is a lot of data often associated with each mesh that gets sent to the GPU to render and reducing the amount of data that gets sent to the GPU improves performance. One way of reducing this data is by combining meshes, so instead of information such as position, scale and rotation being sent per mesh, this data is only sent once in this combined mesh. Unity provides a statistics window to profile performance quickly in a scene. It can be accessed through the Game window. Statistics Window Information A brief explanation of the most important data represented in the statistics window is summarized. FPS / ms FPS stands for Frames per Second, which is an indication of how many times Unity can render a single frame in a second. The more Frames per Second, the smoother a scene will run. 
The small ms value pictured in the brackets next to the FPS counter represents milliseconds, which is a unit of measurement showing how long it took Unity to render each frame. So, milliseconds and Frames per Second are directly related to each other. There are 1000 milliseconds in a second. If your scene is targeting a minimum of 60 Frames per Second, the target frame time would be (1000 / 60) = 16.667 milliseconds. Batches Batches refer to groups of data that can be sent by the CPU to the GPU to render. Sometimes objects are grouped together because they share the same material, as this saves the material being included in many different individual gameobjects being sent to the GPU. As a rule, the fewer batches the better the performance, but it can depend on how certain hardware is able to handle certain aspects within these batches. The process of the CPU telling the GPU to render a mesh is referred to as a Draw Call, so the fewer Draw Calls the CPU must send the better. 3 objects with the same material may take 3 Draw Calls to render, whilst combining these objects into 1 single object allows them to be drawn with just 1 Draw Call. Tris / Verts Tris is shorthand for Triangles, and Verts is shorthand for Vertices. These refer to the total number of triangles and vertices being rendered in the frame. Generally, the fewer triangles and vertices in the frame, the better the performance, but certain hardware handles higher numbers of triangles and vertices more efficiently than others. To see if triangles and vertices are the main cause of loss in performance, try changing the resolution in the Game view to a smaller value. If the Frames per Second doesn’t increase, it’s possible that the number of vertices and triangles being rendered is the main bottleneck on your hardware. SetPass calls SetPass calls refers to the number of times the renderer needs to set up information about the current material being drawn.
Having multiple different materials on many individual meshes will increase this number, as the renderer is constantly changing what material needs to be drawn to complete the frame. Shadow casters To render shadows, each individual object is rendered again but with simpler shading. This also adds to the triangle / vertex count, so turning shadow casting off will often reduce these numbers. Combining meshes reduces the number of individual shadow casters required, because the number of individual meshes has also been reduced. Combining Meshes based on Cell Sizes The Scene Optimization tool can group together meshes based on a configurable spatial partition. The spatial partition is a 3D volume whose size is configurable in the menu. Items that occupy each partition are combined based on unique materials, size and LOD groups. Because some hardware copes better with many pixels rendered on screen and other hardware with a high number of vertices, different cell sizes may be better suited to different hardware. Below is a comparison table showing how the data in the statistics tab will change based on the cell size chosen. Frames per Second and milliseconds are excluded as this data changes between hardware.

Cell Size | Large (~128+) | Small (~8+) | Tradeoffs
Batches | Fewer batches than original scene | More batches than a larger cell size, but still fewer batches than original | If your scene performs better with fewer batches, it may be CPU bound, meaning sticking to a higher cell size may give you the best performance, all other aspects considered.
Tris / Verts | More Tris / Verts than original scene | Fewer Tris / Verts than a large cell size, but more Tris / Verts than original scene | The cost of having a larger cell size may be worth paying if your hardware can support more tris / verts without decreasing performance.
SetPass Calls | Fewer SetPass Calls than original scene | More SetPass Calls than a larger cell size, but fewer SetPass Calls than original scene | Much like the number of batches, if the scene runs better with fewer SetPass calls, it might be worth sticking with a higher cell size for best performance, all other aspects considered.
Shadow Casters | Fewer Shadow Casters than original scene | More Shadow Casters than a larger cell size, but fewer Shadow Casters than original scene | The cost of having a larger cell size may be worth paying if your hardware can support more tris / verts without decreasing performance.

Frustum Culling with Cell Sizes Because this tool combines meshes that fall within the bounds of each spatial partition, they are treated like one object when it comes to rendering. Unity’s renderer automatically culls (hides) objects that do not fall into the rendered camera’s view frustum. Because the combined objects are considered one object, even if only one small part of it is occupying the screen, the whole object will be drawn. This causes extra vertices to be rendered even if they are not visible in the camera’s view. A smaller cell size will reduce the visible size of each combined object, thus providing better frustum culling with fewer vertices rendered on screen. If your scene is vertex bound, i.e., performance slows down with more vertices, then choosing a smaller cell size may suit this scenario. Frustum culling roughly visualized in a scene without any combined objects. Frustum culling roughly visualized in a scene with combined objects, Cell Size = 64. Because of the bigger bounds of the combined objects, more is being drawn. Frustum culling roughly visualized in a scene with combined objects, Cell Size = 16. It draws fewer vertices in total than the bigger cell size of 64, but still draws more vertices than the scene without any combined meshes.
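The cell size trade-off described above can be sketched in a few lines of Python. This is a simplified illustration, not the Scene Optimizer's actual algorithm: it buckets objects only by grid cell and material, ignoring object size and LOD groups, and every name in it is hypothetical.

```python
# Simplified grid-cell grouping for mesh combining.
# Assumption: objects are (name, position, material) tuples and are grouped
# by cell and material only; the real tool also considers size and LODs.
import math
from collections import defaultdict

def cell_index(position, cell_size, cell_offset=(0.0, 0.0, 0.0)):
    """Map a world position to the integer coordinates of its cell."""
    return tuple(math.floor((p - o) / cell_size)
                 for p, o in zip(position, cell_offset))

def group_for_combining(objects, cell_size):
    """Bucket objects that share a cell and a material."""
    groups = defaultdict(list)
    for name, position, material in objects:
        groups[(cell_index(position, cell_size), material)].append(name)
    return dict(groups)

objects = [
    ("rock_a", (2.0, 0.0, 3.0), "Rock"),
    ("rock_b", (10.0, 0.0, 5.0), "Rock"),
    ("tree_a", (4.0, 0.0, 4.0), "Bark"),
    ("rock_c", (70.0, 0.0, 3.0), "Rock"),
]

large = group_for_combining(objects, cell_size=64)
small = group_for_combining(objects, cell_size=8)
print(len(large), "combined objects with cell size 64")
print(len(small), "combined objects with cell size 8")
```

Each bucket would become one combined mesh. The larger cell produces fewer, bigger combined objects (fewer batches, coarser frustum culling), while the smaller cell produces more, smaller ones, which is the trade-off the table describes.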
Using the Mesh Combine tool The combine meshes tool works on the currently selected root elements of all gameobjects that should be combined in the hierarchy. To combine meshes, the default hotkey is Ctrl + T.
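The frame rate figures quoted in this article are easy to sanity-check. The short Python sketch below converts FPS to a per-frame time budget and computes the percentage improvement of the quickstart example (28 FPS to 212 FPS); the function names are illustrative only.

```python
# Convert between FPS and frame time, and compute a percentage improvement.

def frame_time_ms(fps: float) -> float:
    """Milliseconds available per frame at the given frame rate."""
    return 1000.0 / fps

def improvement_percent(before_fps: float, after_fps: float) -> float:
    """Relative FPS gain, as a percentage of the original frame rate."""
    return (after_fps - before_fps) / before_fps * 100.0

print(f"60 FPS target: {frame_time_ms(60):.3f} ms per frame")
print(f"28 FPS: {frame_time_ms(28):.2f} ms, 212 FPS: {frame_time_ms(212):.2f} ms")
print(f"Improvement: {improvement_percent(28, 212):.0f}%")
```

Comparing frame times is often more informative than comparing FPS, because saved milliseconds add up linearly across optimizations while FPS gains do not.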
  5. Introduction Optimizing a scene to increase its frame rate can be a difficult process to get right, due to the number of aspects that contribute to the final rendered image. It can even be hard just to find the right settings to change for the result you want, as some of the new render pipeline settings may seem hidden. This article explores ways to find bottlenecks in your Unity scene, the categories they fall into, and some common approaches to remove or reduce them and improve your scene's performance and framerate. The aspects considered in this article are: Profiling General Optimization Fill Rate Optimization Vertex Throughput Optimization Batch / SetPass Call Optimization Profiling Profiling Approach When profiling a scene, it is best to first establish your baseline metrics, recording the current statistics for those aspects that cost the most performance, such as milliseconds, number of batches and number of SetPass calls. Keeping the camera in a stationary position throughout the optimization process will help you take consistent measurements, so you can compare how your fixes have impacted performance. Once you have established which aspects of the scene contribute most to the performance degradation, it is recommended to fix these first before moving on to the smaller contributors. When fixed, you can iterate the process and find the next biggest contributors, continuing until you reach your performance target. When establishing where the bottlenecks are, the main categories that impact scene rendering are: Fill Rate - the number of pixels being processed by the scene. Vertex Throughput - the number of vertices being processed by the scene. Batches / SetPass Calls - groups of data that get sent to the GPU for rendering, i.e. meshes and materials. Simple Fill Rate Profiling Fragment shader operations are tied to fill rate as they contribute to the final pixel of an object.
A relatively quick way to discover if your scene is fill rate limited is to reduce the display resolution. If the scene renders faster with a reduced resolution, this indicates that it may be limited by the fill rate on the GPU. This can be done in the editor as well as on mobile platforms. For desktop testing in-editor, a list of premade resolutions appears in the dropdown under ‘Free Aspect’ in the Game window. A custom resolution can also be added to match the native resolution of mobile devices. For mobile devices, there are a couple of areas in the project settings that change the resolution of the project for a build. The first couple of settings are located under Edit->Project Settings->Player->Resolution Scaling. You can set a fixed resolution for mobile devices by setting Resolution Scaling Mode to Fixed DPI and entering a custom DPI in the Target DPI field. The DPI of your mobile device can be set there, but you can also set it to a value lower than the device’s native DPI to render the scene at a lower resolution. Another setting relating to Fixed DPI is located under Edit->Project Settings->Quality->Resolution Scaling Fixed DPI Factor. This setting is a multiplier for the previously mentioned Target DPI field. A value of 1 will result in the same setting as the Target DPI, whereas a value of 0.5 will scale the Target DPI by a half. In the example, the Target DPI is set to 400 and the Resolution Scaling Fixed DPI Factor is set to 0.5, so the resulting resolution would be 200 DPI. Simple Profiling of Vertex Throughput When reducing the display resolution to test for fill rate, if the FPS rate doesn’t increase significantly, you may be vertex or SetPass call bound. When drawing a scene at a reduced resolution you are still processing the same vertex count and issuing the same number of draw calls. Profiling Batches / SetPass Calls Batching refers to the number of groups of objects that the CPU sends for processing / rendering to the GPU.
A SetPass call is a change of state between the batches being sent to the GPU. When a new batch needs different data for rendering than the previous batch, the GPU has to receive new information on how to render it. For example, if a blue cube is followed by a red cube in the scene, the GPU needs to switch from its current instructions for rendering the blue cube to the instructions for rendering the red cube. The more batches and SetPass calls required, the more work the CPU and GPU must do.

Batches and SetPass calls can be profiled by adjusting the field of view and by turning game objects on and off while playing in the editor.

Detailed Profiling

For more detailed statistics on which areas of your scene are costly to performance, Unity provides the Profiler and the Frame Debugger.

Using Unity’s Profiler

This can be located under Window -> Analysis -> Profiler.

The Profiler is a tool for capturing the events executed in the game over several frames. To understand the loop, refer to the following:

  • https://docs.unity3d.com/Manual/ExecutionOrder.html
  • https://docs.unity.cn/560/Documentation/Manual/ScriptableRenderPipeline.html
  • https://docs.unity3d.com/Packages/com.unity.render-pipelines.high-definition@7.1/manual/Custom-Pass.html

On the left of the Profiler is a color-coded list of toggles for the categories that take up milliseconds. You can turn them on and off to see how much each contributes to the overall frame time.

At the bottom of the window is the Hierarchy view, which gives an overview of the total milliseconds used when executing each function in the different loops that Unity uses.

The Hierarchy view can be swapped for a Timeline view, which shows the main functions executed within a single frame, in order, with their millisecond timings.

Using Unity’s Frame Debugger

This can be located under Window -> Analysis -> Frame Debugger.
The Frame Debugger shows the individual draw calls Unity uses to build up the final frame. These can be stepped through to show how the image is drawn together and which components require more draw calls than others. The fewer draw calls the scene contains, the better. Certain post processing features require additional draw calls, so checking the Frame Debugger early on may hint at where many of the resources are being spent.

Using the Free Asset Resource Checker

The Resource Checker is a helpful tool that provides a summary of the resources (Textures, Materials, Meshes) contained within the scene along with their memory footprint.

The Resource Checker tool can be found here: https://assetstore.unity.com/packages/tools/utilities/resource-checker-3224

Generally, the smaller the memory footprint of a resource, the less time it takes to load. This means you can speed up how certain art assets are loaded for rendering by:

  • Using smaller texture sizes or greater compression (e.g. a 4096x4096 texture reduced to 2048x2048)
  • Lowering the polycount of meshes
  • Stripping out data not used on the mesh (UV channels, lightmaps, vertex normals, vertex colours, flat shading)
  • Optimizing objects to share the same material.

General Optimization

The following sections describe possible optimizations for each of the rendering categories.

Graphics APIs (General)

It is worthwhile testing whether the scene runs faster with a different graphics API. The order in which the APIs are listed determines the fallback options if the device doesn’t support a particular API. These settings can be found in Edit->Project Settings->Player->Other Settings.

To force the device to use a particular API, disable Auto Graphics API. Adding a new API to the Graphics APIs list may require a recompilation of the entire project, so be prepared for this to occur.
In this example, Vulkan has been added as well as OpenGLES3, with Vulkan taking priority as it is at the top of the list.

Fill Rate Optimization

As mentioned in the profiling section, lowering the resolution of a fill rate limited scene can help gain performance. This may prompt design decisions like rendering the game at half of the mobile device’s native resolution or setting the resolution to a fixed size.

Custom render textures within a scene should also be considered. Reducing their size in a fill rate limited scene means fewer pixels for the renderer to read from and write to.

Reduce Buffer Sizes

Disabling the capture of a depth buffer in the render texture’s settings, along with choosing an appropriate color format (e.g. using a single channel color format like R8_UNorm for a greyscale render texture), can help reduce the computations involved in using a render target.

Reduce Overdraw

Particle systems with many transparent effects such as fog and raindrops may quickly add to the overdraw of each frame. Transparent overdraw builds up when many transparent pixels are rendered on top of one another, such as with multiple layers of a fog material or when looking through many panes of glass. For effects like fog sheets, reducing the number spawned but increasing their opacity may achieve a similar effect with fewer pixels being drawn more than once.

Reduce Post Processing

Disable Unused Buffers

Disabling the creation of Depth and Opaque textures rendered from the camera can reduce the time taken to render a frame. However, certain effects require these textures, such as post process effects and custom shaders that make use of _CameraDepthTexture and _CameraOpaqueTexture. If you are certain your scene doesn’t make use of these textures, it may be worth disabling them and measuring the performance gained. They can be disabled in the pipeline asset, under the General heading.
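To put rough numbers on the buffer savings discussed in this section, the sketch below estimates how many bytes a render target occupies at different sizes and formats. It is a back-of-envelope model in Python, not Unity API code: the function name, the bytes-per-pixel table and the assumed 4-byte depth buffer are illustrative, and real GPUs add padding and compression on top.

```python
# Back-of-envelope render target cost. Illustrative only: the names and
# the 4-byte depth assumption are not taken from Unity.

# Bytes per pixel for two colour formats (8 bits per channel).
BYTES_PER_PIXEL = {
    "R8_UNorm": 1,        # single channel, suits greyscale targets
    "R8G8B8A8_UNorm": 4,  # four channels
}

ASSUMED_DEPTH_BYTES = 4  # assumption: a 32-bit depth buffer


def render_target_bytes(width, height, color_format, with_depth):
    """Estimate the memory one render target occupies, in bytes."""
    per_pixel = BYTES_PER_PIXEL[color_format]
    if with_depth:
        per_pixel += ASSUMED_DEPTH_BYTES
    return width * height * per_pixel


full = render_target_bytes(2048, 2048, "R8G8B8A8_UNorm", with_depth=True)
half = render_target_bytes(1024, 1024, "R8G8B8A8_UNorm", with_depth=True)
lean = render_target_bytes(1024, 1024, "R8_UNorm", with_depth=False)

print(full // half)  # halving each dimension quarters the pixel count
print(half // lean)  # dropping depth and using one channel saves more
```

Under these assumptions, halving both dimensions cuts the cost to a quarter, and switching the smaller target to a single channel without depth cuts it by a further factor of eight.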
Precision Reduction

The project’s color mode can be changed under Edit->Project Settings->Player->Other Settings->Rendering. The Color Space option lets you choose Linear or Gamma. As with the Graphics APIs list, there is also an option to support multiple color gamuts, located in the Color Gamut section.

Turning off HDR in the pipeline asset settings can also reduce the time taken to render each frame, as HDR increases VRAM usage and requires a tone mapping pass on top of the rendered image.

Full screen effects

Some post processing effects require expensive calculations at runtime, resulting in slower performance. If your project is not using HDR (as set in the pipeline asset), the grading mode in the pipeline asset under Post-Processing can also be switched to Low Dynamic Range.

A quick way to discover the heavier effects is to watch the Stats dropdown window with a stationary camera while turning individual post process effects off. Pay attention to the number of batches and SetPass calls, as some effects contribute more to the total calls than others.

As previously mentioned, the cost of these effects can also be measured in draw calls through the Frame Debugger, located under Window->Analysis->Frame Debugger. This displays the process used to build up each frame and shows the draw calls for some post process effects.

Removing these effects, or implementing workarounds that achieve a similar result with fewer computations, will increase performance. One such workaround is applying a bloom effect to an HDRI sky image in image editing software before importing it into the project.

Reducing Fragment Shader Complexity

Realtime Lighting

If your scene uses a directional light with mixed / realtime lighting, lowering the quality of the real time shadows can help reduce the time required to render each frame. This can be altered in the pipeline asset for your project.
Reducing the Max Distance value shrinks the area in which shadows are drawn and helps reduce vertex throughput and draw calls. Lowering the Cascade Count reduces the number of staged reductions in shadow map size. Depending on the resolution of the shadow map this results in more pixelated shadows, but it reduces the fill rate cost and SetPass calls. The shadow map resolution can be changed with the Shadow Resolution parameter under the Lighting heading in the pipeline asset.

[Figures: shadows with Cascade Count: 4 and with Cascade Count: 1]

Enabling Soft Shadows helps alleviate the pixelation, but adds to the time taken to render the frame.

Lighting Calculations

Assuming your scene uses forward rendering on a mobile device, reducing the number of real time lights calculated per pixel can benefit performance by reducing vertex throughput and draw calls. Settings for these changes can be found in the pipeline asset under the Lighting heading.

Additional lights can be either Per Vertex or Per Pixel. Per Vertex requires fewer computations but gives lower quality lighting because the data is interpolated. Per Pixel increases the computation time but gives higher quality lighting. For additional lights viewed from far away, the cheaper Per Vertex option should be considered.

Opting for a baked lighting approach is more performant and can allow more lights, as Unity does not include these in any further lighting calculations at runtime.

Anti-aliasing

Anti-aliasing is used to reduce the jagged edges of objects in the scene. There are many ways it can be implemented, but each comes with its own drawbacks and added computations. Experimenting with each technique on the mobile device is recommended to see how it affects the look of the scene. Unity recommends FXAA, which can be set on the camera component. Also compare how the scene looks without any anti-aliasing at all to decide whether you wish to include it in your project.
Because anti-aliasing is mostly limited by fill rate, it is best to avoid the more computationally heavy types where possible.

MSAA works at the hardware level by rendering the borders of polygon edges multiple times at a subpixel level. This effect has several levels (2x, 4x, 16x), each with an increased rendering cost. MSAA can be turned off in the pipeline asset, under the Quality heading. Test how much performance you gain against the change in visual quality to determine whether it is needed in your project.

Mobile Friendly Shaders

Reducing the instruction count of your shaders means fewer computations for the GPU to work through to render an effect. Changing the shader’s precision mode to half on effects that don’t require precise calculations, and deleting unused surface structure inputs, will help lower the instruction count. Moving certain calculations from the fragment program to the vertex program where applicable can reduce the number of times the calculation is executed.

[Figures: lowering precision in the Graph Settings; deleting unused surface structure inputs]

Sampler Settings

Filter modes on a texture affect performance depending on which option you choose. Unity provides 3 filtering options. Point filtering is the cheapest to calculate, followed by Bilinear, then Trilinear:

  • Point. Texture pixels become blocky up close.
  • Bilinear. Texture samples are averaged.
  • Trilinear. Texture samples are averaged and blended between mipmap levels.

https://docs.unity3d.com/ScriptReference/FilterMode.html

The Aniso Level refers to the anisotropic filtering quality of the texture. This improves the look of textures viewed at shallow angles, but also adds to the work required of the GPU.
https://docs.unity3d.com/ScriptReference/Texture-anisoLevel.html

Texture Data

Mip Levels

Enabling mip maps increases the file size but stops the scene from sampling large textures at a distance where their detail isn’t noticeable. This helps the GPU texture cache, as less data needs to be loaded in. Unity also provides a couple of mipmap filtering options to control how textures look when viewed from afar.

Texture Size

Reducing the dimensions of a texture decreases its quality but requires less memory. The dimensions can be reduced in the texture asset’s settings.

Texture Format

Channel Packing

To reduce the number of textures in a project, certain maps can be packed into a single texture using its RGBA channels. For instance, a PBR material may require a roughness, metallic, ambient occlusion and height map for its shading. Since these textures are greyscale images, each can be packed into a single channel of one image, replacing what could otherwise be 4 separate files.

There are many ways to channel pack, such as using online tools, Photoshop, or Substance Designer.

Trim Sheets / Atlases

Trim sheets / atlases store many different textures in one texture for use by multiple objects. This helps reduce the project size, as 4 assets each with their own texture (a total of 4 textures) can be reduced to using a single texture. With the PBR workflow this may result in a trim sheet / atlas for each of the required PBR channels, such as a base color trim sheet, the accompanying normal map trim sheet, and a mask map trim sheet.

This technique is most beneficial at the design stage of props, as having in mind which materials and details are part of a trim sheet will influence the design of the created props.
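As a rough illustration of how several props can share one atlas, the sketch below remaps a mesh's own 0-1 UV coordinates into one tile of a regular grid atlas. The grid layout, the function name and the tile numbering are assumptions made for this example; real trim sheets are usually laid out by hand with uneven regions.

```python
def atlas_uv(u, v, tile_index, tiles_per_row):
    """Map a UV in [0, 1] onto one tile of a square grid atlas.

    Tiles are numbered left to right, then bottom to top
    (an arbitrary convention chosen for this sketch).
    """
    scale = 1.0 / tiles_per_row
    col = tile_index % tiles_per_row
    row = tile_index // tiles_per_row
    return (col + u) * scale, (row + v) * scale


# Four props sharing one 2x2 atlas instead of four separate textures.
for tile in range(4):
    print(tile, atlas_uv(0.5, 0.5, tile, tiles_per_row=2))
```

Each prop keeps its own UV layout and only needs a tile index, so all four can reference the same texture and, potentially, the same material.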
This technique is beneficial if many assets in the scene share the same texture. It may be a waste of memory if only a few assets end up using a small section of the trim sheet, because the whole texture still gets loaded and much of it goes unused.

Compression & Precision

Textures can be reduced in file size and memory footprint by changing their formats and compression settings. Further options for customized compression formats can be found by clicking on the PC, Mac & Linux Standalone Settings tab.

Changing the format to one with a lower number of bits reduces the file size but results in lower quality color data. Other formats support only one or a couple of the RGBA channels, such as R8, which stores only a single Red channel with 8 bits and may be suited to greyscale images.

Be careful when choosing between compression algorithms. Some distribute bits unequally across the color channels, such as DXT1, which stores 5 bits in the red channel, 6 bits in the green channel, and 5 bits in the blue channel.

Vertex Throughput Optimizations

Reduce Vertex Shader Complexity

As with reducing fragment shader complexity, removing surface structure inputs that are not used in the vertex program can reduce the number of shader instructions.

Level of Detail (LOD)

Level of Detail (LOD) should only be used on your gameobjects in specific cases. This optimization works by swapping a higher polygon mesh for a lower polygon version of itself when it is viewed at a specified screen size percentage. This means there are fewer polygons in total to draw when rendering the scene.

If your scene is not vertex bound (i.e. the number of polygons in the scene isn’t the main bottleneck), this method is not effective for increasing performance. Because it swaps out a mesh for a lower quality version of itself, it adds to the SetPass calls.
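The swap described above can be sketched as a lookup against screen size thresholds. This is a simplified Python model rather than Unity's LOD Group implementation: the function name and the example thresholds are hypothetical, and it assumes each level stays active while the object covers at least that fraction of the screen height.

```python
def select_lod(screen_fraction, thresholds):
    """Return the index of the LOD to draw, or None if nothing is drawn.

    thresholds are listed from most to least detailed, each giving the
    minimum fraction of screen height at which that level is still used.
    """
    for index, minimum in enumerate(thresholds):
        if screen_fraction >= minimum:
            return index
    return None  # smaller than the last threshold


# Hypothetical setup: LOD0 above 60%, LOD1 above 30%, LOD2 above 10%.
thresholds = [0.6, 0.3, 0.1]
for size in (0.8, 0.45, 0.15, 0.05):
    print(size, select_lod(size, thresholds))
```

Every distinct index returned here corresponds to a different mesh being submitted, which is one way to see why objects at different levels are harder to group together.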
Individual meshes can’t easily be batched together if they are at differing LOD levels, because there would need to be batches for every combination of LOD levels at each screen size percentage.

However, Unity’s LOD Group component can still be useful for setting up culling distances, hiding objects depending on their screen size. Culling a mesh reduces the polygon count, the number of batches and the SetPass calls.

Occlusion Culling

Occlusion culling is a method used to hide meshes that are obstructed from view by another mesh. It works well in small levels where areas are often occluded by large meshes, such as interiors with corridors. On large scale environment scenes, however, it can cost more than it gains in some cases, where calculating which objects are occluded takes more time than rendering all the items in the scene.

Mesh complexity

Reducing the complexity of the data that a mesh contains can help reduce the instructions required to render it. Data associated with a mesh includes vertex positions, vertex normals, UV channels and vertex colors. If the meshes used in a game do not require some of this extra data, like vertex colors and extra UV channels, it is best to remove it, since it may still get processed and interpolated in the shader.

A mesh that is smooth shaded requires less memory than a mesh that is flat shaded. For a face to be shaded flat, all of its vertices must share the same vertex normal vector. This can mean multiple different vertex normals per vertex, depending on how many flat shaded faces are joined together. The same applies to UV data, where separate UV shells require multiple UV positions per vertex.

Batches / SetPass Call Optimizations

Types of Batching

Batching is a way to reduce the amount of unique data sent from the CPU to the GPU for rendering.
It is more efficient for the CPU to send the GPU a single mesh made up of smaller individual parts than to send those parts as many separate meshes. Below are the types of batching Unity supports.

Static

Static batching requires each mesh to share the same material and shader. The gameobject also needs to be stationary and cannot move. This can be turned on in the Inspector view of the selected gameobject.

Dynamic

Dynamic batching is handled by Unity and requires each mesh to share the same material and shader. However, the combined mesh data must have fewer than 900 vertex attributes, so this method is mainly used for very small objects such as quads in a particle system.

Instanced Batching

Instanced batching requires each game object to share the same mesh, material, and shader. It can be turned on by enabling ‘Enable GPU Instancing’ on a material with a compatible shader.

Instanced Indirect Batching

Instanced Indirect batching requires manual setup on the user’s end, and a custom shader is required to make use of its functionality. This method also requires the same mesh, material, and shader to be used on the objects in the scene.

Methods to Aid the Batching Process

Having assets that already conform to some of the prerequisites of these batching techniques makes the process easier to set up.

Consider allowing multiple assets to share the same material where possible. This may mean developing assets that make use of atlases / trim sheets so that one material can be shared across many unique meshes. Using shaders that do not require unique mesh data and can be placed on many meshes will also help reduce the number of materials used.

Combining meshes that share a single material into one mesh will also reduce the batches required for rendering.

Conclusion

Overall: Determine what aspect of your scene is causing the biggest cost to performance early on in your optimization cycle.
Discover whether your scene is fill rate limited, vertex throughput limited, or batching / SetPass call limited. This will guide you to where your optimization efforts should focus first. Working your way through the heaviest computational areas first will give the biggest boost to performance early on and prevent time and effort being spent in the wrong areas.

Hopefully these suggestions and possible resolutions will help your thinking process as you optimize your scene.
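The first triage step from the profiling section, rendering at a reduced resolution and comparing frame times, can be sketched as a small helper. This is a Python illustration only: the function name and the 15% threshold are assumptions, and a sensible cutoff depends on your own measurements and how noisy they are.

```python
def looks_fill_rate_bound(full_res_ms, reduced_res_ms, min_gain=0.15):
    """Guess whether a scene is fill rate limited.

    Compares frame time at native resolution with frame time at a reduced
    resolution. If the frame got faster by at least `min_gain` (a fraction
    of the original time), fill rate is the likely bottleneck; otherwise
    vertex throughput or batches / SetPass calls are worth checking next.
    """
    gain = (full_res_ms - reduced_res_ms) / full_res_ms
    return gain >= min_gain


print(looks_fill_rate_bound(33.0, 20.0))  # large gain at lower resolution
print(looks_fill_rate_bound(33.0, 32.0))  # almost no gain
```

Measuring both frame times from the same stationary camera position keeps the comparison meaningful.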