

Introduction

 Performance is an important component of any game: its popularity, user comfort, and overall impression all depend on it [1]. Performance usually means the average FPS in the game over a certain period of time. FPS (frames per second) is the number of frames displayed per second, where a frame is an image that the graphics adapter renders at the command of the CPU. The more frames shown per second, the smoother and more comfortable the game feels.
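 The definition above can be sketched in a few lines of Java. This is only an illustration: the class name FpsMeter and its methods are hypothetical and do not come from any engine. It computes the average as frames rendered divided by elapsed time.

```java
/** Hypothetical helper: average FPS over a measurement window. */
final class FpsMeter {
    private long frames;
    private double elapsedSeconds;

    /** Records one rendered frame that took frameSeconds to produce. */
    void addFrame(double frameSeconds) {
        frames++;
        elapsedSeconds += frameSeconds;
    }

    /** Average FPS = frames rendered / total elapsed time (0 if nothing was recorded). */
    double averageFps() {
        return elapsedSeconds == 0.0 ? 0.0 : frames / elapsedSeconds;
    }
}
```

 Note that an average can hide short dips: two frames of 0.25 s followed by one of 0.5 s still average 3 FPS, although the last frame alone ran at 2 FPS.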

 Contrary to expectations, few players require high FPS: for most, an average of 30 FPS is enough, and for some even less. Studies conducted by Lesta Studio show that, across the entire user population, the relationship between performance and perceived comfort is rather weak (Fig. 1).

Figure 1 – Diagram of the relationship between comfort perception and average FPS

 However, the level of comfort also depends on the genre of the game and on the individual user. For example, many console games run at 30 FPS, but at that rate the picture is not smooth enough and the response to the player's input is delayed, which prevents full immersion in the game. For casual arcade games, 40+ FPS is enough; first-person shooters, strategy games, rhythm games, and MOBAs need about 60-100 FPS [2]. More than 100 FPS is required by players of competitive shooters and by esports players.

 Thus, setting aside the preferences of professional players, an average of 60 FPS is comfortable for the majority; this figure is considered the comfortable minimum.

What affects performance? [3]

Consider the most common factors affecting optimization:


Figure 2 – Key Factors Affecting Optimization


  1. Client software
    If the system is loaded with background processes or programs are running that access the disk during the game, then there may not be enough resources for the game itself.
  2. PC Configuration
    The performance directly depends on the computer components.
  3. Graphics Settings
    Users usually configure the graphics themselves or use presets, but neither option guarantees optimal settings, so the user loses performance.
  4. Network problems
    Client-server interaction is not always stable: users often have network problems, which affect the performance of online games. Possible causes include problems on the provider's side, a large number of wireless clients, a suboptimal route for traffic delivery, and poor-quality network equipment.
  5. Modifications
    Modifications of the original game by third-party developers can negatively affect performance. They not only reduce the average FPS but also cause sharp dips in situations where a clean client runs stably.
  6. The speed of the application
    There are a huge number of game engines, each with its pros and cons; they consume different amounts of resources depending on the quality of their architecture and the effects they support. The programming language in which the engine is written also matters, because different languages manage resources differently. This is discussed further below.

Ways to optimize the code

 In order for the engine to produce the maximum number of fps, it is necessary not only to be able to work effectively with it, but also to know the subtleties of development in the language in which the engine is written, and the principles of the engine for a better understanding of the resource allocation process. [4]

 Many factors affect frame processing speed. Let's look at general code optimization methods that are suitable in almost any situation:

  1. Minimizing the influence of objects outside the screen
    Minimizing the amount of computation by optimizing the rendering of objects is extremely important. Often this is done by the engine or the graphics processor. To implement it yourself, split the object into two layers: the first is the graphical representation of the object, and the second is its data and functions. Then, if the object is off screen, it no longer needs to be drawn, while its data can still be updated.
  2. Independence from frame updates
    In game engines, most objects are usually updated every frame, which loads the processor heavily and lowers performance. Where possible, avoid per-frame updates: separate the rendering function and call it only when the state of the object changes.
  3. Direct calculations and value search
    Caching trigonometric functions in a lookup table can give a decent performance boost, since reading values from a large precomputed table can be cheaper than performing the calculations on the fly.
  4. Downtime
    Identify computations that are not time-critical, such as weather conditions, and perform them when the user has stepped away, is busy reading, or is doing other things that are not resource-intensive. The time freed up while the user is occupied with tasks that do not load the processor can be used to calculate many other events.
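 The lookup-table idea from item 3 can be illustrated with a minimal Java sketch. The class name SineTable and the table size of 4096 entries are assumptions chosen for the example; the table trades a small loss of precision (about 0.002 here) for a single array read per call. Whether this beats Math.sin on a given device should be measured rather than assumed.

```java
/** Hypothetical precomputed sine table; trades accuracy for speed. */
final class SineTable {
    private static final int SIZE = 4096;            // power of two, so a bit mask wraps the index
    private static final int MASK = SIZE - 1;
    private static final float HALF_PI = (float) (Math.PI / 2.0);
    private static final float RAD_TO_INDEX = (float) (SIZE / (2.0 * Math.PI));
    private static final float[] TABLE = new float[SIZE];

    static {
        // Fill the table once, at class-load time.
        for (int i = 0; i < SIZE; i++) {
            TABLE[i] = (float) Math.sin(i * 2.0 * Math.PI / SIZE);
        }
    }

    /** Approximate sine: one multiplication, one mask, one array read. */
    static float sin(float radians) {
        return TABLE[(int) (radians * RAD_TO_INDEX) & MASK];
    }

    /** Approximate cosine, using cos(x) = sin(x + pi/2). */
    static float cos(float radians) {
        return TABLE[(int) ((radians + HALF_PI) * RAD_TO_INDEX) & MASK];
    }
}
```

 The mask also handles negative angles, because the index wraps around the table in the same way the angle wraps around the circle.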

Optimization of engines based on the Java language

 It is also worth mentioning code optimization based on the features of the Java programming language and of the engines built on it [5]:

  1. Loops
    If you need to traverse a large array of data, for example a list of rectangles (for rendering), enemies, or any other heavy objects, and you are going to use a for loop, a reverse for loop may be preferable. The reverse loop can be more efficient because it does not need to read the size of the array on every iteration, and because a comparison with zero is cheaper than a comparison with an arbitrary integer. If a for loop is not required, a while loop may be more efficient still. On modern JVMs the JIT compiler often removes these differences, so it is worth measuring before relying on them.
  2. Threads
    Split heavy work across separate threads and do not run complex logic in the rendering thread. When profiling the CPU, you will see a thread named GLThread. Most engines use GL 2.0/3.0 for rendering, and this is the thread that owns the GL context. This means that every user interface change must be made through this thread; otherwise bad things happen: textures fail to load, UI elements are changed from several places at once, and the result can be a mess. The catch is that running heavy work on this thread leaves the user with a frozen application.
  3. Memory
    Do not forget to release components that are no longer needed. Java has its own garbage collector that cleans up unreachable objects, but in some engines, such as libGDX, it does not cover every resource. This is because OpenGL memory is not managed by the JVM garbage collector, so if you create your own textures you have to dispose of them manually; otherwise you risk a hard-to-debug memory leak.
  4. Batches
    Beginning and ending a SpriteBatch or ShapeRenderer is expensive, so try to call begin and end only once per frame for each. Render all your sprites first and then all your shapes, instead of rendering sprites, then shapes, and then re-opening the SpriteBatch for more sprites.
  5. Individual methods
    Some methods require more resources than others when they are used, such as the intersection test of the Rectangle class. For example, when working through an array of rectangles, it is better to first check whether two rectangles are close to each other and only then call the Intersects method for them, instead of calling Intersects for every rectangle in the array.
  6. Visualization
    Drawing is another operation that requires a lot of resources. If you need to change the state of an element even while it is off screen, update its logic and variables every time, but call the drawing method only if the element is inside the screen coordinates.
  7. Variables
    Declaring variables or objects in the wrong places can cause stutter. Avoid creating objects in the rendering loop, and in general do not create them inside something like a for loop; instead, create them outside the loop and update their values inside it. Fortunately, this does not matter for primitive data types: if a variable is of type int, boolean, or float, declaring it inside a loop will not have a big impact on performance.
  8. Patterns
    Patterns greatly simplify development and code maintenance and can improve the efficiency of the application. If you want one element that is accessible from any class of your program, something ubiquitous such as a player class, you can make it a Singleton (this is very simple and gives you more convenient code). If you do not want your program to create instances that may never be used, you can use the Factory design pattern.
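 The threading advice in item 2 can be illustrated with a plain-Java sketch. It is not tied to any engine: RenderThreadQueue is a hypothetical stand-in for whatever mechanism an engine offers to run a task on its render thread, and heavyWork is a placeholder for real loading or computation. The heavy part runs on a worker thread, and only the cheap state change is handed back to the render thread.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for an engine's "run this on the render thread" queue. */
final class RenderThreadQueue {
    private final Queue<Runnable> pending = new ConcurrentLinkedQueue<>();

    /** Safe to call from any thread. */
    void post(Runnable task) {
        pending.add(task);
    }

    /** Call once per frame from the render thread; returns how many tasks ran. */
    int drain() {
        int ran = 0;
        Runnable task;
        while ((task = pending.poll()) != null) {
            task.run();
            ran++;
        }
        return ran;
    }
}

final class BackgroundWorkSketch {
    // Written only by tasks that are drained on the "render thread".
    private long levelChecksum;

    /** Placeholder for expensive work such as decoding assets or path-finding. */
    static long heavyWork() {
        long sum = 0;
        for (int i = 1; i <= 1_000_000; i++) {
            sum += i;
        }
        return sum;
    }

    long loadInBackground() {
        RenderThreadQueue queue = new RenderThreadQueue();
        ExecutorService worker = Executors.newSingleThreadExecutor();
        worker.execute(() -> {
            long result = heavyWork();                 // runs off the render thread
            queue.post(() -> levelChecksum = result);  // state change deferred to the render thread
        });
        worker.shutdown();
        try {
            // A real game loop would keep rendering frames here instead of blocking.
            worker.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        queue.drain();                                 // what the render loop would do each frame
        return levelChecksum;
    }
}
```

 The demo blocks while waiting only so that it stays deterministic; in a game, the render loop would simply call drain() every frame and pick up the result whenever it arrives.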

OpenGL-based optimization

 At first glance, it may seem that the performance of applications based on OpenGL is primarily determined by the performance of the implementation of the OpenGL library itself [6]. This is true, but the organization of the entire application (the use of local and global variables, data structures, libraries, command execution sequence) is also very important.

High-level optimization

 Usually, an OpenGL program requires high-quality visualization at interactive speeds. But, as a rule, it is not possible to get both at once. Therefore, it is necessary to find a compromise between quality and performance. There are many different approaches to this issue [7]:

  • displaying the scene geometry at low quality during animation and at the best quality when the animation stops;

  • culling objects that are completely out of view before they are passed to the OpenGL pipeline, by checking whether the simple volumes (spheres or boxes) that bound them fall inside the view frustum;

  • rendering the model with a reduced number of primitives during interactive rotation (for example, while a mouse button is held down) and displaying the full model when drawing a static image;

  • disabling dithering, smooth shading, and texturing during animation and enabling them again when static images are shown (this approach is especially effective on systems without OpenGL hardware support).
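 The bounding-volume check from the list above can be sketched as a sphere-against-planes test. The classes Plane and FrustumCuller are hypothetical, and the demo "frustum" is an axis-aligned box used only to keep the example short; in a real application the six planes would be derived from the camera's view and projection matrices.

```java
/** Plane with a unit normal (nx, ny, nz) pointing into the visible volume. */
final class Plane {
    final float nx, ny, nz, d;

    Plane(float nx, float ny, float nz, float d) {
        this.nx = nx;
        this.ny = ny;
        this.nz = nz;
        this.d = d;
    }

    /** Positive inside the visible half-space, negative outside. */
    float signedDistance(float x, float y, float z) {
        return nx * x + ny * y + nz * z + d;
    }
}

final class FrustumCuller {
    private final Plane[] planes;

    FrustumCuller(Plane... planes) {
        this.planes = planes;
    }

    /** Demo volume: the box [-h, h]^3, standing in for a real view frustum. */
    static FrustumCuller boxForDemo(float h) {
        return new FrustumCuller(
            new Plane(1, 0, 0, h), new Plane(-1, 0, 0, h),
            new Plane(0, 1, 0, h), new Plane(0, -1, 0, h),
            new Plane(0, 0, 1, h), new Plane(0, 0, -1, h));
    }

    /** Returns false only when the sphere lies entirely outside at least one plane. */
    boolean mayBeVisible(float cx, float cy, float cz, float radius) {
        for (Plane p : planes) {
            if (p.signedDistance(cx, cy, cz) < -radius) {
                return false;   // completely behind this plane: skip drawing the object
            }
        }
        return true;            // inside or intersecting: pass the object to the pipeline
    }
}
```

 The test is conservative: it never rejects a visible object, but it may keep a few objects that are just outside a corner of the volume.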

Low-level optimization

 Objects displayed using OpenGL are stored in some data structures. The speed of visualization is determined by the efficiency of using such structures. It is desirable to use data structures that can be quickly and efficiently transferred to the OpenGL pipeline. For example, if you need to display an array of triangles, then using a pointer to this array is much more efficient than passing it to OpenGL piecemeal.

 Suppose an application is being created that draws a terrain map. One of the components of its database is a list of cities with their latitude, longitude, and name.

 A corresponding data structure is created to store information about a city, and the list of cities can be stored as an array of such structures. A function is then created that draws the cities on the map as labeled points of different sizes: 2 px for a small city and 4 px for a large one.

 The first implementation variant is inefficient for the following reasons:

  • glPointSize() is called on every iteration of the loop;

  • only one point is drawn between glBegin() and glEnd();

  • the vertices are specified in a non-optimal format.

 In the second implementation, glPointSize() is called only twice, and the number of vertices between glBegin() and glEnd() increases. However, there is still room for optimization: changing the data structures can further increase the efficiency of drawing points.

 The third variant can be considered the most efficient. After the reorganization, cities of different sizes are stored in separate lists, and the positions of the points are stored separately in a dynamic array. The conditional statement inside glBegin()/glEnd() is no longer needed, and vertex arrays can be used for further optimization.
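 The difference between the first and third variants can be made concrete with a small Java sketch. It does not call OpenGL: CallCounter is a hypothetical recorder that only counts the calls a real implementation would make, and City and CityDrawing are illustrative names, not the original listings. The sketch assumes city positions are mapped directly to point coordinates.

```java
/** Hypothetical recorder standing in for OpenGL; it only counts calls. */
final class CallCounter {
    int pointSizeCalls;
    int beginEndPairs;
    int vertices;

    void pointSize(float size) { pointSizeCalls++; }
    void begin() { beginEndPairs++; }
    void vertex(float x, float y) { vertices++; }
    void end() { }
}

final class City {
    final float latitude;
    final float longitude;
    final String name;
    final boolean large;

    City(float latitude, float longitude, String name, boolean large) {
        this.latitude = latitude;
        this.longitude = longitude;
        this.name = name;
        this.large = large;
    }
}

final class CityDrawing {
    /** First variant: a state change and a begin/end pair for every city. */
    static void drawNaive(CallCounter gl, City[] cities) {
        for (City c : cities) {
            gl.pointSize(c.large ? 4f : 2f);
            gl.begin();
            gl.vertex(c.longitude, c.latitude);
            gl.end();
        }
    }

    /** Third variant: positions pre-split by size into flat arrays [x0, y0, x1, y1, ...]. */
    static void drawSplit(CallCounter gl, float[] smallPositions, float[] largePositions) {
        drawPoints(gl, 2f, smallPositions);
        drawPoints(gl, 4f, largePositions);
    }

    private static void drawPoints(CallCounter gl, float size, float[] xy) {
        gl.pointSize(size);                  // one state change per size class
        gl.begin();
        for (int i = 0; i < xy.length; i += 2) {
            gl.vertex(xy[i], xy[i + 1]);     // no conditional inside begin/end
        }
        gl.end();
    }
}
```

 For n cities the first variant issues n point-size changes and n begin/end pairs, while the split variant issues two of each regardless of n. The flat coordinate arrays are also the layout that vertex arrays expect.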

Optimization of games for smartphones [8]

  1. Loading speed
    Players want to immerse themselves in the action of your game as quickly as possible, so it is important to reduce the loading time of your game as much as possible. The following measures usually help to reduce the loading time:

    – Perform lazy loading. If you use the same assets in consecutive scenes or levels of the game, load these assets only once;

    – Reduce the size of your assets. That way, you can bundle uncompressed versions of these assets with the APK of your game;

    – Use a disk-efficient compression method. An example of such a method is zlib.

  2. Keep memory-heavy threads on a single CPU
    On many mobile devices, the L1 caches belong to specific CPUs, and the L2 caches belong to a set of CPUs that share a common clock. To maximize L1 cache hits, it is generally best to keep the main thread of your game, along with any other memory-heavy threads, running on a single CPU.
  3. Defer short-duration work to lower-powered CPUs
    Most game engines are able to defer worker thread operations to a different CPU from the one running the main thread of your game. However, the engine does not know the specific architecture of the device and cannot anticipate the workload of your game as well as you can [9]. Most system-on-chip devices have at least two shared clocks: one for the fast CPUs of the device and one for the slow CPUs. A consequence of this architecture is that if one fast CPU needs to run at maximum speed, all the other fast CPUs also run at maximum speed. The example report in Figure 3 shows a game that takes advantage of the fast CPUs. However, this high level of activity quickly consumes a large amount of power and generates heat.

    Figure 3 – Demonstration of suboptimal thread assignment to device processors

  4. Thermal load
    When devices overheat, they may throttle the CPU and/or GPU, which can affect games in unexpected ways. Games that involve complex graphics, heavy computation, or sustained network activity are more likely to run into problems. Use the thermal API to monitor temperature changes on the device and take measures to keep power consumption and device temperature low. When the device reports thermal stress, scale back ongoing activity to reduce power consumption, for example by reducing the frame rate or polygon tessellation.

  5. Load from user interface elements
    To maintain a constant frame rate, it is important to take into account the relatively small size of mobile displays and to keep the user interface as simple as possible.

    The report shown in Figure 4 is an example of a user interface frame that tries to render too many elements relative to the capabilities of a mobile device. A good goal is to reduce the user interface update time to 2-3 milliseconds. Such fast updates can be achieved with optimizations similar to the following:

    – Update only the elements on the screen that have moved.

    – Limit the number of user interface textures and layers. Consider combining graphics calls, such as shaders and textures, that use the same material.

    – Defer element animation operations to the GPU.

    – Perform more aggressive frustum and occlusion culling.

    – If possible, perform drawing operations using the Vulkan API, where the overhead of draw calls is lower. [10]

    Figure 4 – Report for a game in which dozens of user interface elements are displayed simultaneously
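 The advice to update only the interface elements that have moved can be sketched with a dirty flag. The classes UiElement and UiLayer are hypothetical and the "drawing" is only counted, not performed; the sketch assumes the interface is rendered into a surface that keeps its previous contents between frames.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical UI element that remembers whether it needs to be redrawn. */
final class UiElement {
    private float x;
    private float y;
    private boolean dirty = true;   // a new element must be drawn once
    int redraws;                    // counts simulated draw operations

    /** Marks the element dirty only if its position actually changes. */
    void moveTo(float newX, float newY) {
        if (newX != x || newY != y) {
            x = newX;
            y = newY;
            dirty = true;
        }
    }

    /** Redraws the element if needed; returns true when a redraw happened. */
    boolean redrawIfDirty() {
        if (!dirty) {
            return false;
        }
        redraws++;                  // a real UI would issue its draw commands here
        dirty = false;
        return true;
    }
}

final class UiLayer {
    private final List<UiElement> elements = new ArrayList<>();

    void add(UiElement element) {
        elements.add(element);
    }

    /** Called once per frame; returns the number of elements redrawn. */
    int update() {
        int redrawn = 0;
        for (UiElement element : elements) {
            if (element.redrawIfDirty()) {
                redrawn++;
            }
        }
        return redrawn;
    }
}
```

 With this structure, a frame in which nothing moved costs one flag check per element, and moving an element to the position it already occupies triggers no redraw.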

Conclusions

 There are a huge number of game engines that provide the programmer with many different functions and make it possible to create high-quality games. However, if a game is intended for users with weak devices, achieving satisfactory performance requires knowledge of the programming language, the architecture of the engine, and the operating system. Creating such projects takes a lot of experience and a considerable set of skills.

 In further research, it is planned to test the above methods in practice and to present the resulting metrics as graphs and diagrams in order to refine the methods.

List of sources