Systems

COS 486 - Game Engine Architecture

slide deck credit

original slide content prepared by Roger Mailler, Ph.D., Assoc Prof of Computer Science, University of Tulsa

overview

subsystem startup and shutdown
memory management
containers
strings and localization
engine configuration

systems

subsystems startup and shutdown

startup and shutdown

game engines are complex beasts
- rendering system (OpenGL / DirectX / Vulkan contexts)
- audio system
- memory managers
- human input device system
- networking
- user: authentication, leaderboard, badges
- ...
when it first starts up, subsystems must be configured and initialized
order is relevant, because of subsystem dependencies
typically started in one direction and shutdown in the opposite

singleton

a common pattern for game engine components is to create managers using a singleton

class RenderManager {
    public:
    RenderManager() {
        // start up the manager
    }
    ~RenderManager() {
        // shut down the manager
    }
};

// singleton instance
static RenderManager gRenderManager;

singleton: a software design pattern that restricts the instantiation of a class to one

[ wikipedia ]

initialization and singleton

C++ constructs global and static objects just before main() is called
The construction order across different source files is unspecified
Not very good for controlling the order of startup and shutdown

controlling the singleton

One trick you can use is that a static variable declared within a function is not initialized at startup; it is constructed the first time the function is called

class RenderManager {
    public:
    static RenderManager& get() {
        static RenderManager sSingleton;
        return sSingleton;
    }
    RenderManager() {
        VideoManager::get();
        TextureManager::get();
    }
    ~RenderManager() {
        // shut down the manager
    }
};

This works pretty well, but you cannot control the shutdown at all

do it directly

create startup and shutdown methods in all the managers
- it's simple
- it's understandable
- easy to debug and maintain
you could also keep singletons, but have main() create the objects explicitly using new
take a look at the code from OGRE and Naughty Dog's games in the book
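
A minimal sketch of the direct approach, assuming two hypothetical managers with startUp() and shutDown() methods (the names and the gLog vector are illustrative, not taken from OGRE or Naughty Dog). Constructors do nothing, so the order in which the globals are constructed stops mattering.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order of events so the example can be checked.
static std::vector<std::string> gLog;

class MemoryManager {
public:
    // Constructor and destructor intentionally do no work.
    void startUp()  { gLog.push_back("memory up"); }
    void shutDown() { gLog.push_back("memory down"); }
};

class RenderManager {
public:
    void startUp()  { gLog.push_back("render up"); }
    void shutDown() { gLog.push_back("render down"); }
};

static MemoryManager gMemoryManager;
static RenderManager gRenderManager;

void runEngine() {
    // Start up in dependency order...
    gMemoryManager.startUp();
    gRenderManager.startUp();

    // ... the main game loop would run here ...

    // ...and shut down in the opposite order.
    gRenderManager.shutDown();
    gMemoryManager.shutDown();
}
```

The explicit calls in runEngine() are the whole point: the startup order is visible in one place and is easy to change.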

Godot

Godot splits startup across several functions

main.cpp setup()
- handles command-line args
- determine which systems and system configs to use
main.cpp setup2()
- brings up systems

systems

memory management

The performance of your game engine depends not only on your algorithms but also on the way you use memory

malloc() and new are slow!
access pattern and memory fragmentation seriously impact caching performance
- see How Stardock's Elemental: War of Magic Failed | War Stories | Ars Technica

dynamic allocation

Heap allocation is slow, because it has to be general
- Handles requests for 1 byte, 1 MB, 1 GB, ...
- Creates a ton of management overhead
Also slow because it may require a context switch
- When the heap has to ask the OS for more memory, it switches from user to kernel mode
- Then has to switch back
Engines cannot completely avoid dynamic allocation, but can create custom allocators to work around the performance issues

stack-based allocator

The easiest way is to use a pre-allocated memory block and use it as a stack
When a level is loaded, add it to the stack. When it is finished, move the stack pointer back.
Order is important: memory must be freed in the reverse order of allocation, otherwise you can overwrite a memory location that is still in use
Often done using rollback markers
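
A minimal sketch of a stack allocator with rollback markers. The class and method names are assumptions made for illustration, and alignment is ignored here to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

class StackAllocator {
public:
    using Marker = std::size_t;  // offset of the current stack top

    explicit StackAllocator(std::size_t sizeBytes) : mBuffer(sizeBytes), mTop(0) {}

    // Returns nullptr when the request does not fit in the remaining space.
    void* alloc(std::size_t sizeBytes) {
        if (sizeBytes > mBuffer.size() - mTop) return nullptr;
        void* p = mBuffer.data() + mTop;
        mTop += sizeBytes;
        return p;
    }

    // Remember the current top so we can roll back to it later.
    Marker getMarker() const { return mTop; }

    // Frees everything that was allocated after the marker was taken.
    void freeToMarker(Marker marker) {
        assert(marker <= mTop);
        mTop = marker;
    }

private:
    std::vector<std::uint8_t> mBuffer;  // the pre-allocated block
    std::size_t mTop;                   // first free byte
};
```

A typical use would be to take a marker before loading a level and roll back to it when the level is unloaded.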

stacks

double-ended stack

Another method uses a double-ended stack

useful for having big allocations on one side and small temporary allocations on the other

pool allocator

this technique works by allocating a large number of fixed-sized memory blocks
- ex: 1000 4x4 matrices
When you need a new matrix, you get it from the pool
Return it to the pool when you are done
The pool can be managed by a linked list
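
A minimal sketch of a pool whose free list lives inside the free blocks themselves, so the list costs no extra memory. Names are illustrative assumptions. Blocks are only guaranteed to be pointer-aligned relative to the start of the buffer, which is enough for this example but not for every type.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

class PoolAllocator {
    struct FreeNode { FreeNode* next; };

public:
    PoolAllocator(std::size_t blockSize, std::size_t blockCount)
        : mBlockSize(roundUp(blockSize)), mStorage(mBlockSize * blockCount) {
        // Push blocks in reverse so the first alloc() returns block 0.
        for (std::size_t i = blockCount; i-- > 0;) {
            release(mStorage.data() + i * mBlockSize);
        }
    }

    // Pops a block off the free list; nullptr when the pool is exhausted.
    void* alloc() {
        if (mFreeList == nullptr) return nullptr;
        FreeNode* node = mFreeList;
        mFreeList = node->next;
        return node;
    }

    // Pushes a block back onto the free list.
    void release(void* block) {
        assert(block != nullptr);
        mFreeList = new (block) FreeNode{mFreeList};
    }

private:
    // Each block must be able to hold a FreeNode while it is free.
    static std::size_t roundUp(std::size_t n) {
        const std::size_t a = sizeof(FreeNode);
        return n < a ? a : (n + a - 1) / a * a;
    }

    std::size_t mBlockSize;
    std::vector<unsigned char> mStorage;
    FreeNode* mFreeList = nullptr;
};
```

Both alloc() and release() are a couple of pointer assignments, which is why pools are so much cheaper than a general heap.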

aligned allocation

one problem with pooled memory is that every variable and object has an alignment requirement
The memory allocator must be able to return aligned memory, otherwise you have serious trouble
Not a serious problem as an aligned allocator is easy to write
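
A minimal sketch of the pointer arithmetic behind an aligned allocator, assuming power-of-two alignments. The helper names alignUp and alignedAllocFrom are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rounds addr up to the next multiple of align (align must be a power of two).
inline std::uintptr_t alignUp(std::uintptr_t addr, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    const std::uintptr_t mask = align - 1;
    return (addr + mask) & ~mask;
}

// Carves an aligned block of `size` bytes out of [buffer, buffer + bufferSize).
// Returns nullptr if the block does not fit once padding is added.
inline void* alignedAllocFrom(void* buffer, std::size_t bufferSize,
                              std::size_t size, std::size_t align) {
    const std::uintptr_t start = reinterpret_cast<std::uintptr_t>(buffer);
    const std::size_t padding = alignUp(start, align) - start;
    if (padding > bufferSize || size > bufferSize - padding) return nullptr;
    return static_cast<unsigned char*>(buffer) + padding;
}
```

The cost is a few wasted padding bytes in front of the block, at most align - 1 of them.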

single or double frame

you often need to allocate memory every frame
One way to do this is to use a single frame buffer
allocate the memory once and free it only when the rendering is complete
- very fast, but you have to be careful
a double-frame buffer might be better in a multi-core setup
- allocate memory at frame \(i\) for use in \(i+1\)
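
A minimal sketch of a double-buffered frame allocator, assuming a hypothetical beginFrame() call at the top of the game loop. Memory handed out during frame \(i\) stays valid through frame \(i+1\) and is recycled at the start of frame \(i+2\).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

class DoubleFrameAllocator {
public:
    explicit DoubleFrameAllocator(std::size_t bytesPerFrame) {
        assert(bytesPerFrame > 0);
        mBuffers[0].resize(bytesPerFrame);
        mBuffers[1].resize(bytesPerFrame);
    }

    // Call once per frame: switch buffers and recycle the older one.
    void beginFrame() {
        mCurrent = 1 - mCurrent;
        mTop[mCurrent] = 0;
    }

    // Bump allocation from the current frame's buffer; nullptr when full.
    void* alloc(std::size_t sizeBytes) {
        std::vector<std::uint8_t>& buffer = mBuffers[mCurrent];
        std::size_t& top = mTop[mCurrent];
        if (sizeBytes > buffer.size() - top) return nullptr;
        void* p = buffer.data() + top;
        top += sizeBytes;
        return p;
    }

private:
    std::vector<std::uint8_t> mBuffers[2];
    std::size_t mTop[2] = {0, 0};
    int mCurrent = 0;
};
```

Nothing is ever freed individually, so the caller must not keep a pointer for more than two frames.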

fragmentation

doing dynamic allocation can create memory fragments
- this slows down memory copies
- can prevent allocation when a contiguous block is not available
pooled and stack allocators avoid this problem

fragmentation

if you need random allocation/deallocation, you may require a defragmentation routine

defrag

defrag memory by shifting each block toward one end
requires sorting blocks by starting memory address

defrag

the defrag operation can occur over several frames
- just move one thing at a time
moving memory is tricky though
- all pointers to it need to be updated!
rather than using pointers directly, handles are frequently used
- handles refer to an entry in a table that never moves; the entry contains the current pointer to the object (yes, a pointer to a pointer)
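
A minimal sketch of a handle table, assuming handles are plain indices. HandleTable and its methods are hypothetical names, and handle reuse and validation are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Handle = std::uint32_t;  // index into the handle table

class HandleTable {
public:
    // Registers an object and returns a handle that survives relocation.
    Handle add(void* object) {
        assert(object != nullptr);
        mPointers.push_back(object);
        return static_cast<Handle>(mPointers.size() - 1);
    }

    // Called by the defragmenter after it has moved the object.
    void relocate(Handle h, void* newAddress) { mPointers.at(h) = newAddress; }

    // Resolve right before use; the result may change after a defrag step.
    void* resolve(Handle h) const { return mPointers.at(h); }

private:
    std::vector<void*> mPointers;
};
```

Game code stores the Handle, and only the table needs to be updated when a block moves.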

cache

by now, you should understand how cache works (or at least understand the effects of cache)
to avoid cache misses on data, we try to keep data chunks small, contiguous in memory, and access them sequentially
instructions are also held in cache

i-cache

the compiler and linker handle most of the details about how code is represented in memory
we can help it because it follows certain rules:
- machine code from a function is contiguous in memory
- functions are stored in the order they appear in the source file
- functions in a single file are almost always contiguous

taking advantage

high-performance code

keep code small
avoid making function calls in a performance critical section
- if you have to call a function, put it as close as possible to the caller and never in another file
don't overuse inline functions; they can bloat the code
- besides, inline is only a suggestion to the compiler

systems

containers

game developers use many different data structures

array, dynamic array,
linked list, queue, deque,
tree, binary search tree,
binary heap,
priority queue,
dictionary,
set,
graph, directed acyclic graph,
...

many of these can be found in STL (standard template library)

container operations

Containers store, retrieve, and operate on data

insert, remove
sequential access (iteration), random access
find
sort

remember that different containers have different costs for these various operations!

iterators

STL has an iterator class that works much like the one in Java
Allows you to maintain encapsulation in the container object
Easier to use than pointer manipulation

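#include <list>

void processList(std::list<int>& container) {
    std::list<int>::iterator pBegin = container.begin();
    std::list<int>::iterator pEnd   = container.end();  // one past the last element
    std::list<int>::iterator p;
    for (p = pBegin; p != pEnd; ++p) {  // pre-increment avoids copying the iterator
        int element = *p;               // dereference like a pointer
        (void)element;                  // silences the unused-variable warning
        // do stuff
    }
}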

building custom containers

Many reasons to build your own custom containers

Total control: you dictate the algorithms, memory use, etc.
Opportunities to optimize: optimize based on a specific hardware platform or your use patterns
Customizability: you can add custom features specific to your purpose
Elimination of external dependencies: no licensing fees, you can fix it yourself
Control of concurrent data structures: you have total control over concurrent access

standard template library

Benefits

rich set of features
robust implementations on a wide variety of platforms
comes standard with most C++ compilers

Drawbacks

steep learning curve
often slower than a custom crafted data structure
eats up a lot of memory
does a lot of dynamic memory allocation
performance varies based on the compiler

rules for using STL

be aware of the performance and memory characteristics

avoid heavyweight STL classes in critical code sections
don't use it when memory is at a premium

use STLPort if you plan to create multiplatform games

note: big-Oh notation hides constant factors that could be huge! this is why we used tilde notation in COS265

boost

Boost is another library that aims to extend and work with STL

Benefits

provides things not available in STL
provides some workarounds to problems with STL
handles complex problems like smart pointers
documentation is really good

Drawbacks

most core classes are templates; has large .lib files
no guarantees: if you find a bug, it's your issue
boost license: not very restrictive

Some companies have strict development policies against using boost

Loki

“
Loki is a C++ library of designs, containing flexible implementations of common design patterns and idioms.
”

Written by Andrei Alexandrescu
- GitHub, Wikipedia
Very powerful, but hard to understand
Less portable because it uses sophisticated compiler tricks
Look for the book Modern C++ Design by Alexandrescu

systems

strings and localization

Strings

Strings are extremely important in games

They are far from simple to manage

What if they need to be resized?
How do you deal with localization issues?
- Different character sets
- Different lengths for translations
- Different display layouts
Checking for equality is an \(O(N)\) operation

string classes

They come with an overhead
- Copy constructors on a function call
- Dynamic memory allocation
Probably should be avoided by using fixed sized wchar_t arrays
Path classes might be the exception
- Often include more information than a string

unique ids

Objects within the game need a way to be identified

Strings seem like a natural choice
- Comparison costs are not good, though
GUIDs (numbers) are a lot faster to compare
- Harder to remember

hashed string ids

We can use hash functions to map our strings to numbers

best of both worlds
remember: collisions are possible
- a good hash function makes collisions rare, but cannot rule them out
- using a uint32 for the values gives over 4 billion possible values

These are sometimes referred to as string ids
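
A minimal sketch of a hashed string id. The slides do not name a hash function, so 32-bit FNV-1a is used here as an assumption. StringId and hashString are illustrative names.

```cpp
#include <cassert>
#include <cstdint>

using StringId = std::uint32_t;

// 32-bit FNV-1a. Being constexpr (C++14 or later), it can be evaluated by
// the compiler when the argument is a string literal.
constexpr StringId hashString(const char* str) {
    StringId hash = 2166136261u;  // FNV offset basis
    while (*str != '\0') {
        hash ^= static_cast<unsigned char>(*str++);
        hash *= 16777619u;        // FNV prime
    }
    return hash;
}

// Checked at compile time: the empty string hashes to the offset basis.
static_assert(hashString("") == 2166136261u, "unexpected FNV-1a result");
```

Comparing two StringId values is a single integer comparison, no matter how long the original strings were.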

implementation ideas

Runtime hashing can be slow
- Doing many of them can take a long time
One way to avoid this is to offline process the source code
- look for occurrences of the function call and replace it with the hashed number
Another way is to create a static variable to intern the string ids
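
A minimal sketch of the static-variable idea, under the assumption that "interning" means hashing once and remembering the original text. internString, playerOneScoreId, and the use of FNV-1a are all illustrative choices.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

using StringId = std::uint32_t;

// 32-bit FNV-1a, chosen only for illustration.
inline StringId hashString(const std::string& text) {
    StringId hash = 2166136261u;
    for (unsigned char c : text) {
        hash ^= c;
        hash *= 16777619u;
    }
    return hash;
}

// Hashes the string, keeps the original text for debugging,
// and asserts if two different strings map to the same id.
inline StringId internString(const std::string& text) {
    static std::unordered_map<StringId, std::string> sTable;  // id -> text
    const StringId id = hashString(text);
    auto result = sTable.emplace(id, text);
    assert(result.first->second == text && "string id collision");
    return id;
}

inline StringId playerOneScoreId() {
    // Hashed on the first call only; later calls reuse the stored id.
    static const StringId sId = internString("p1score");
    return sId;
}
```

The table also gives you a way to print the readable name of an id while debugging.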

localization

best to plan for localization from day 1
important to understand that ASCII character codes don't support localization at all
retrain your brain to think in Unicode

unicode

Like ASCII, unicode assigns a unique code point to every character or glyph
When storing characters we use a particular encoding
The combination of encoding and code point yields a character or glyph
Common encodings are
- UTF-32
- UTF-8
- UTF-16

UTF-32

The simplest encoding because all code points are stored in a 32-bit value
Wasteful
- Most western languages don't use high value code points (wastes at least 2 bytes per character)
- the highest unicode code point is 0x10FFFF (only 21 bits)
Easy because we can figure out the length of a string by dividing number of bytes by 4

UTF-8

Code points are stored in one-byte granularity; a code point takes one to four bytes
It's called Variable Length Encoding (VLE) or MultiByte Character Sets (MBCS)
It's backwards compatible with ASCII
- Needs only 7 bits to represent the 128 ASCII characters
Every byte of a multibyte character has its highest bit set to one
- The leading bits of the first byte indicate how many bytes are in the code point
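
A minimal sketch of how the high bits of a UTF-8 byte can be inspected. In standard UTF-8 a sequence is one to four bytes long, and continuation bytes always look like 10xxxxxx. The function names are hypothetical, and no validation of malformed input is attempted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Length of the sequence that starts with this byte,
// or 0 if the byte cannot start a sequence.
inline std::size_t utf8SequenceLength(unsigned char lead) {
    if (lead < 0x80) return 1;            // 0xxxxxxx: plain ASCII
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;                             // continuation byte or invalid
}

// Counts code points by counting every byte that is not a continuation byte.
inline std::size_t utf8CodePointCount(const std::string& text) {
    std::size_t count = 0;
    for (unsigned char byte : text) {
        if ((byte & 0xC0) != 0x80) ++count;
    }
    return count;
}
```

Note that the byte length and the character count differ as soon as the text leaves ASCII, so size() is not a character count.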

utf-16

a bit simpler than utf-8, but more expensive
code points stored as one or two 16-bit values
also called Wide Character Set (WCS)
Unicode contains 17 planes that each contain \(2^{16}\) code points
First plane called Basic Multilingual Plane (BMP)
- Most characters are present in this plane
The other planes are called supplementary planes
- Requires two 16-bit values (a surrogate pair)
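
A minimal sketch of UTF-16 encoding for a single code point. BMP code points fit in one 16-bit value; supplementary code points are split into a high and a low surrogate. encodeUtf16 is a hypothetical name, and the input is assumed not to be a surrogate value itself.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

inline std::vector<std::uint16_t> encodeUtf16(std::uint32_t codePoint) {
    assert(codePoint <= 0x10FFFF);
    if (codePoint < 0x10000) {
        // Basic Multilingual Plane: stored as-is.
        return { static_cast<std::uint16_t>(codePoint) };
    }
    // Supplementary planes: 20 remaining bits, 10 per surrogate.
    const std::uint32_t v = codePoint - 0x10000;
    return {
        static_cast<std::uint16_t>(0xD800 + (v >> 10)),    // high surrogate
        static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF)),  // low surrogate
    };
}
```

For example, U+00E9 stays a single value, while U+1F600 becomes the pair 0xD83D, 0xDE00.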

ucs-2

A subset of UTF-16 containing only the BMP
Its main advantage is that it is fixed length
- UTF-16 and UTF-8 are variable length
Can be stored in little endian or big endian
- Often stored with a Byte Order Marker (BOM)

`char` and `wchar_t`

standard C/C++ define two data types for characters
- char is used for legacy ANSI strings or for MBCS
- wchar_t is used to represent any valid code point
  - could be 8, 16, or 32 bits
to write truly platform-independent code you will need to define your own character data types

Unicode on Windows

In Windows, wchar_t is exclusively for UTF-16 encoding and char for ANSI encoding
Windows API defines three sets of character/string functions

ANSI	WCS	MBCS
`strcmp()`	`wcscmp()`	`_mbscmp()`
`strcpy()`	`wcscpy()`	`_mbscpy()`
`strlen()`	`wcslen()`	`_mbstrlen()`

There are also translation functions like wcstombs()

unicode on consoles

Xbox 360 uses WCS strings
At Naughty Dog, they only use char strings
- foreign languages are handled with UTF-8 encoding

other localization concerns

Need to translate more than strings
- audio clips
- textures, if they have English words on them
- symbols may not mean what you think they mean
- market specific game-rating issues (ex: blood changes the teen-rating in Japan)
Localization database
- need a way to convert string ids to human readable strings
- form is up to you. varies from CSV to full databases

id	english	french
`p1score`	`"Player 1 Score"`	`"Grade Joueur 1"`
`p2score`	`"Player 2 Score"`	`"Grade Joueur 2"`
`p1wins`	`"Player one wins!"`	`"Joueur un gagne!"`
`p2wins`	`"Player two wins!"`	`"Joueur deux gagne!"`
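
A minimal sketch of a lookup over a table like the one above, assuming an in-memory map per language. LocalizationDb, Language, and the fallback behavior are illustrative choices, not a prescribed design.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

enum class Language { English = 0, French = 1 };

class LocalizationDb {
    using Table = std::unordered_map<std::string, std::string>;

public:
    void add(Language lang, const std::string& id, const std::string& text) {
        assert(!id.empty());
        mTables[static_cast<int>(lang)][id] = text;
    }

    // Falls back to the id itself so missing translations are easy to spot.
    std::string lookup(Language lang, const std::string& id) const {
        const Table& table = mTables[static_cast<int>(lang)];
        const auto it = table.find(id);
        return it != table.end() ? it->second : id;
    }

private:
    Table mTables[2];  // one table per Language value
};
```

In a real engine the tables would be loaded from the CSV file or database, and the keys would more likely be hashed string ids than raw strings.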

final notes on localization

Establish a set of functions early to handle localization
Force developers to use those functions instead of using string literals in the code
Create a configuration system to allow the language to be set

systems

engine configuration

most engines require the ability to save and load config files

many ways to do this

text files
compressed binary files
windows registry
command line options
environment variables
online user profiles

game vs user options

Be careful to separate the game settings from those of a particular user
On Windows machines you can create your own folder inside the user's Application Data directory
You can also use the special key HKEY_CURRENT_USER in the registry to store settings information

examples

Quake uses CVARs, named values with a set of flags
- These are stored in a linked list and the values are retrieved by name
- The flags indicate, among other things, whether the value should persist by being written to the config file
OGRE3D
- uses text files in the Windows INI format
Uncharted
- Several mechanisms including Scheme
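
A minimal sketch loosely inspired by the CVAR idea above: named values with flags, kept in a linked list and looked up by name. This is not Quake's actual code. The type names, the flag values, and the output format of serializeArchived() are all assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <string>

enum CvarFlags : std::uint32_t {
    CVAR_NONE    = 0,
    CVAR_ARCHIVE = 1u << 0,  // value should be written to the config file
};

struct Cvar {
    std::string name;
    std::string value;
    std::uint32_t flags;
};

class CvarRegistry {
public:
    // Updates an existing cvar or appends a new one.
    void set(const std::string& name, const std::string& value,
             std::uint32_t flags = CVAR_NONE) {
        for (Cvar& c : mCvars) {
            if (c.name == name) { c.value = value; c.flags |= flags; return; }
        }
        mCvars.push_back({name, value, flags});
    }

    // Linear search by name; nullptr when the cvar does not exist.
    const Cvar* find(const std::string& name) const {
        for (const Cvar& c : mCvars) {
            if (c.name == name) return &c;
        }
        return nullptr;
    }

    // Text for the config file: one line per cvar flagged CVAR_ARCHIVE.
    std::string serializeArchived() const {
        std::string out;
        for (const Cvar& c : mCvars) {
            if (c.flags & CVAR_ARCHIVE) out += c.name + " \"" + c.value + "\"\n";
        }
        return out;
    }

private:
    std::list<Cvar> mCvars;
};
```

The linear search is fine for a handful of settings. With many cvars, a hashed string id as the key would make lookups cheaper.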
