Systems
COS 486 - Game Engine Architecture
slide deck credit
original slide content prepared by Roger Mailler, Ph.D., Assoc Prof of Computer Science, University of Tulsa
overview
- subsystem startup and shutdown
- memory management
- containers
- strings and localization
- engine configuration
subsystem startup and shutdown
startup and shutdown
- game engines are complex beasts
- rendering system (OpenGL / DirectX / Vulkan contexts)
- audio system
- memory managers
- human input device system
- networking
- user: authentication, leaderboard, badges
- ...
- when the engine first starts up, subsystems must be configured and initialized
- order matters, because of subsystem dependencies
- subsystems are typically started in one order and shut down in the reverse order
singleton
a common pattern for game engine components is to create managers using a singleton
class RenderManager {
public:
    RenderManager() {
        // start up the manager
    }
    ~RenderManager() {
        // shut down the manager
    }
};
// singleton instance
static RenderManager gRenderManager;
- singleton
- a software design pattern that restricts the instantiation of a class to a single object
initialization and singleton
- C++ constructs global and static objects before main() is called
- The construction order across translation units is unspecified
- Not very good for controlling the order of startup and shutdown
controlling the singleton
One trick you can use is that a static variable declared within a function is not constructed at program startup, but the first time that function is called
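class RenderManager {
public:
    // Returns the one and only instance, constructing it on the first call.
    static RenderManager& get() {
        static RenderManager sSingleton;
        return sSingleton;
    }
    RenderManager() {
        // Start up the managers we depend on first, by requesting them.
        VideoManager::get();
        TextureManager::get();
        // now start up the render manager
    }
    ~RenderManager() {
        // shut down the manager
    }
};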
This works pretty well, but you cannot control the shutdown at all
do it directly
- create startup and shutdown methods in all the managers
- it's simple
- it's understandable
- easy to debug and maintain
- you could also use singletons where main creates the objects using new
- take a look at the code from OGRE and Naughty Dog's games in the book
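A minimal sketch of the direct approach, assuming three hypothetical managers (MemoryManager, FileSystemManager, RenderManager) and a hypothetical gLog that exists only to make the call order visible. The constructors and destructors do nothing; all the work happens in explicit startUp() and shutDown() calls. This is an illustration, not the code from OGRE or Naughty Dog.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log so the order of calls can be observed.
static std::vector<std::string> gLog;

// Each manager has a trivial constructor and destructor; the real work
// is done in startUp() and shutDown(), which are called explicitly.
class MemoryManager {
public:
    void startUp()  { gLog.push_back("memory up"); }
    void shutDown() { gLog.push_back("memory down"); }
};

class FileSystemManager {
public:
    void startUp()  { gLog.push_back("files up"); }
    void shutDown() { gLog.push_back("files down"); }
};

class RenderManager {
public:
    void startUp()  { gLog.push_back("render up"); }
    void shutDown() { gLog.push_back("render down"); }
};

// Global instances: constructing them is harmless because nothing happens yet.
static MemoryManager gMemoryManager;
static FileSystemManager gFileSystemManager;
static RenderManager gRenderManager;

// Hypothetical entry point; main() would simply call this.
void runEngine() {
    // Start up in dependency order.
    gMemoryManager.startUp();
    gFileSystemManager.startUp();
    gRenderManager.startUp();

    // ... run the game loop here ...

    // Shut down in the reverse order.
    gRenderManager.shutDown();
    gFileSystemManager.shutDown();
    gMemoryManager.shutDown();
}
```

The startup and shutdown order is readable in one place, which is what makes this approach easy to debug.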
Godot
Godot splits startup across several functions
memory management
The performance of your game engine depends not only on your algorithms but also on the way you use memory
malloc() and new are slow!
- access patterns and memory fragmentation seriously impact caching performance
dynamic allocation
- Heap allocation is slow, because it has to be general
- Handles requests for 1 byte, 1 MB, 1 GB, ...
- Creates a ton of management overhead
- Also slow on many systems because it can require a context switch
- First switches from user to kernel mode
- Then has to switch back
- Engines cannot completely avoid dynamic allocation, but can create custom allocators to work around the performance issues
stack-based allocator
- The easiest way is to use a pre-allocated memory block and use it as a stack
- When a level is loaded, add it to the stack. When it is finished, move the stack pointer back.
- Order is important: memory must be freed in the reverse of the order it was allocated, otherwise you can overwrite a memory location that is still in use
- Often done using rollback markers
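The rollback-marker idea can be sketched as follows. This is a minimal version under stated assumptions: the class name StackAllocator and its method names are hypothetical, the block is a std::vector of bytes, and alignment is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

class StackAllocator {
public:
    // A marker is just the offset of the top of the stack at some moment.
    using Marker = std::size_t;

    explicit StackAllocator(std::size_t sizeBytes)
        : buffer_(sizeBytes), top_(0) {}

    // Returns nullptr when the pre-allocated block is exhausted.
    void* alloc(std::size_t sizeBytes) {
        if (sizeBytes > buffer_.size() - top_) return nullptr;
        void* p = buffer_.data() + top_;
        top_ += sizeBytes;
        return p;
    }

    Marker getMarker() const { return top_; }

    // Rolls the stack back; everything allocated after the marker is freed at once.
    void freeToMarker(Marker marker) {
        assert(marker <= top_);
        top_ = marker;
    }

    void clear() { top_ = 0; }

private:
    std::vector<std::uint8_t> buffer_;  // the pre-allocated block
    std::size_t top_;                   // offset of the first free byte
};
```

A level loader would take a marker before loading, allocate the level's data, and roll back to the marker when the level is finished.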
stacks
double-ended stack
Another method uses a double-ended stack
- useful for having big allocations on one side and small temporary allocations on the other
pool allocator
- this technique works by allocating a large number of fixed-sized memory blocks
- When you need a new matrix, you get it from the pool
- Return it to the pool when you are done
- The pool can be managed by a linked list
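A minimal sketch of a pool whose free blocks are threaded onto a linked list, assuming a hypothetical PoolAllocator class. The list nodes live inside the free blocks themselves, so the list costs no extra memory; the block size is rounded up to a multiple of the pointer size to make that possible.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

class PoolAllocator {
public:
    PoolAllocator(std::size_t blockSize, std::size_t blockCount)
        : blockSize_(roundUp(blockSize)),
          storage_(new unsigned char[blockSize_ * blockCount]),
          freeList_(nullptr) {
        // Thread every block onto the free list, last block first,
        // so that the first allocation returns the first block.
        for (std::size_t i = blockCount; i-- > 0;) {
            void* block = storage_.get() + i * blockSize_;
            freeList_ = new (block) FreeNode{freeList_};
        }
    }

    // Pops a block off the free list; nullptr when the pool is empty.
    void* alloc() {
        if (freeList_ == nullptr) return nullptr;
        FreeNode* node = freeList_;
        freeList_ = node->next;
        return node;
    }

    // Pushes a block (previously returned by alloc) back onto the free list.
    void free(void* block) {
        assert(block != nullptr);
        freeList_ = new (block) FreeNode{freeList_};
    }

private:
    struct FreeNode { FreeNode* next; };

    // Each block must be able to hold a FreeNode and keep the next block aligned for one.
    static std::size_t roundUp(std::size_t n) {
        const std::size_t a = sizeof(FreeNode);
        if (n < a) n = a;
        return ((n + a - 1) / a) * a;
    }

    std::size_t blockSize_;
    std::unique_ptr<unsigned char[]> storage_;
    FreeNode* freeList_;
};
```

Both alloc() and free() are a couple of pointer operations, and because every block has the same size the pool cannot fragment.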
aligned allocation
- one problem with pooled memory is that every variable and object has an alignment requirement
- The memory allocator must be able to return aligned memory, otherwise you have serious trouble
- Not a serious problem as an aligned allocator is easy to write
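One common way to write an aligned allocator is to over-allocate, round the address up, and remember how far the pointer was shifted. The sketch below assumes hypothetical function names (alignAddress, allocAligned, freeAligned), uses std::malloc as the underlying allocator, and limits the alignment to 128 bytes so the shift fits in one byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Rounds addr up to the next multiple of align (align must be a power of two).
inline std::uintptr_t alignAddress(std::uintptr_t addr, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    const std::uintptr_t mask = align - 1;
    return (addr + mask) & ~mask;
}

// Allocates bytes + align, shifts the pointer up to an aligned address,
// and stores the shift in the byte just before the returned pointer.
inline void* allocAligned(std::size_t bytes, std::size_t align) {
    assert(align >= 1 && align <= 128);
    unsigned char* raw = static_cast<unsigned char*>(std::malloc(bytes + align));
    if (raw == nullptr) return nullptr;
    // Start from raw + 1 so there is always at least one byte for the shift value.
    unsigned char* aligned = reinterpret_cast<unsigned char*>(
        alignAddress(reinterpret_cast<std::uintptr_t>(raw) + 1, align));
    aligned[-1] = static_cast<unsigned char>(aligned - raw);
    return aligned;
}

// Reads the stored shift to recover the pointer originally returned by malloc.
inline void freeAligned(void* p) {
    if (p == nullptr) return;
    unsigned char* aligned = static_cast<unsigned char*>(p);
    std::free(aligned - aligned[-1]);
}
```

The same rounding step can be applied inside a stack or pool allocator instead of on top of malloc.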
single or double frame
- you often need to allocate memory every frame
- One way to do this is to use a single frame buffer
- allocate the memory once and free it only when the rendering is complete
- very fast, but you have to be careful
- a double-frame buffer might be better in a multi-core setup
- allocate memory at frame \(i\) for use in \(i+1\)
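A double-buffered frame allocator can be sketched as two stack-style buffers that swap roles every frame. The class and method names below (DoubleBufferedAllocator, beginFrame) are hypothetical, and alignment is ignored to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

class DoubleBufferedAllocator {
public:
    explicit DoubleBufferedAllocator(std::size_t bytesPerBuffer) : current_(0) {
        for (int i = 0; i < 2; ++i) {
            buffers_[i].resize(bytesPerBuffer);
            top_[i] = 0;
        }
    }

    // Call at the start of every frame: switch to the other buffer and reset it.
    // The buffer filled during the previous frame is left untouched.
    void beginFrame() {
        current_ = 1 - current_;
        top_[current_] = 0;
    }

    // Bump allocation from the current frame's buffer; nullptr when it is full.
    void* alloc(std::size_t bytes) {
        std::vector<std::uint8_t>& buffer = buffers_[current_];
        if (bytes > buffer.size() - top_[current_]) return nullptr;
        void* p = buffer.data() + top_[current_];
        top_[current_] += bytes;
        return p;
    }

private:
    std::vector<std::uint8_t> buffers_[2];
    std::size_t top_[2];
    int current_;
};
```

Data allocated during frame \(i\) stays valid throughout frame \(i+1\) and is only overwritten once frame \(i+2\) begins.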
fragmentation
- doing dynamic allocation can create memory fragments
- this slows down memory copies
- can prevent allocation when a contiguous block is not available
- pooled and stack allocators avoid this problem
fragmentation
if you need random allocation/deallocation, you may require a defragmentation routine
defrag
- defrag memory by shifting each block toward one end
- requires sorting blocks by starting memory address
defrag
- the defrag operation can occur over several frames
- just move one thing at a time
- moving memory is tricky though
- all pointers to it need to be updated!
- rather than using pointers directly, handles are frequently used
- handles refer to non-relocatable memory (typically an entry in a table) that contains the current pointer to the object (yes, a pointer to a pointer)
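A minimal sketch of a handle table, assuming a hypothetical HandleTable class in which a handle is simply an index. Game code keeps the handle; only the table entry has to be patched when the defragmenter moves a block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A handle is an index into the table, not a raw pointer.
using Handle = std::uint32_t;

class HandleTable {
public:
    // Registers an object and returns the handle that refers to it.
    Handle add(void* object) {
        table_.push_back(object);
        return static_cast<Handle>(table_.size() - 1);
    }

    // Looks up the object's current address.
    void* resolve(Handle h) const {
        assert(h < table_.size());
        return table_[h];
    }

    // Called by the defragmenter after it has moved the object's memory.
    void relocate(Handle h, void* newAddress) {
        assert(h < table_.size());
        table_[h] = newAddress;
    }

private:
    std::vector<void*> table_;
};
```

The cost is one extra indirection on every access, which is why resolved pointers should not be cached across a defragmentation step.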
cache
- by now, you should understand how cache works (or at least understand the effects of cache)
- to avoid cache misses on data, we try to keep data chunks small, contiguous in memory, and access them sequentially
- instructions are also held in cache
i-cache
- the compiler and linker handle most of the details about how code is represented in memory
- we can help it because it follows certain rules:
- machine code from a function is contiguous in memory
- functions are stored in the order they appear in the source file
- functions in a single file are almost always contiguous
taking advantage
high-performance code
- keep code small
- avoid making function calls in a performance critical section
- if you have to call a function, put it as close as possible to the caller and never in another file
- don't overuse inline functions; they can bloat the code
- besides, inline is only a suggestion to the compiler
containers
game developers use many different data structures
- array, dynamic array,
- linked list, queue, deque,
- tree, binary search tree,
- binary heap,
- priority queue,
- dictionary,
- set,
- graph, directed acyclic graph,
- ...
many of these can be found in STL (standard template library)
container operations
Containers store, retrieve, and operate on data
- insert, remove
- sequential access (iteration), random access
- find
- sort
remember that different containers have different costs for these various operations!
iterators
- STL has an iterator class that works much like the one in Java
- Allows you to maintain encapsulation in the container object
- Easier to use than pointer manipulation
#include <list>

void processList(std::list<int>& container) {
    // iterate from the first element up to (but not including) end()
    std::list<int>::iterator pEnd = container.end();
    for (std::list<int>::iterator p = container.begin(); p != pEnd; ++p) {
        int element = *p;
        // do stuff
    }
}
building custom containers
Many reasons to build your own custom containers
- Total control: you dictate the algorithms, memory use, etc.
- Opportunities to optimize: optimize based on a specific hardware platform or your use patterns
- Customizability: you can add custom features specific to your purpose
- Elimination of external dependencies: no licensing fees, you can fix it yourself
- Control of concurrent data structures: you have total control over concurrent access
standard template library
Benefits
- rich set of features
- robust implementations on a wide variety of platforms
- comes standard with most C++ compilers
Drawbacks
- steep learning curve
- often slower than a custom crafted data structure
- eats up a lot of memory
- does a lot of dynamic memory allocation
- performance varies based on the compiler
rules for using STL
be aware of the performance and memory characteristics
- avoid heavyweight STL classes in critical code sections
- don't use it when memory is at a premium
use STLport if you plan to create multiplatform games
note: big-Oh notation hides constant factors that could be huge!
this is why we used tilde notation in COS265
boost
Boost is another library that aims to extend and work with the STL
Benefits
- provides things not available in STL
- provides some workarounds to problems with STL
- handles complex problems like smart pointers
- documentation is really good
Drawbacks
- most core classes are templates, but some libraries build into large .lib files
- no guarantees: if you find a bug, it's your issue
- boost license: not very restrictive
Some companies have strict development policies against using boost
Loki
“Loki is a C++ library of designs, containing flexible implementations of common design patterns and idioms.”
- Written by Andrei Alexandrescu
- Very powerful, but hard to understand
- Less portable because it uses sophisticated compiler tricks
- Look for the book Modern C++ Design by Alexandrescu
strings and localization
Strings
Strings are extremely important in games
They are far from simple to manage
- What if they need to be resized?
- How do you deal with localization issues?
- Different character sets
- Different lengths for translations
- Different display layouts
- Checking for equality is an \(O(N)\) operation
string classes
- They come with an overhead
- Copy constructors on a function call
- Dynamic memory allocation
- Probably best avoided in favor of fixed-size wchar_t arrays
- Path classes might be the exception
- Often include more information than a string
unique ids
Objects within the game need a way to be identified
- Strings seem like a natural choice
- Comparison costs are not good, though
- GUIDs (numbers) are a lot faster to compare
hashed string ids
We can use hash functions to map our strings to numbers
- best of both worlds
- remember: collisions are possible
- a good hash function makes them rare, but cannot rule them out
- using a uint32 for the values gives over 4 billion possible values
These are sometimes referred to as string ids
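A minimal sketch of a hashed string id. The hash used here is 32-bit FNV-1a, chosen only as a well-known example; the names StringId and hashString are hypothetical, and a real engine may prefer a different hash function.

```cpp
#include <cassert>
#include <cstdint>

using StringId = std::uint32_t;

// 32-bit FNV-1a over a null-terminated string.
// constexpr lets the compiler hash string literals at compile time.
constexpr StringId hashString(const char* s) {
    std::uint32_t h = 2166136261u;  // FNV offset basis
    while (*s != '\0') {
        h ^= static_cast<unsigned char>(*s++);
        h *= 16777619u;             // FNV prime
    }
    return h;
}

// Comparing two ids is a single integer comparison, regardless of string length.
constexpr bool sameId(StringId a, StringId b) { return a == b; }

static_assert(hashString("") == 2166136261u, "empty string hashes to the offset basis");
```

Because the function is constexpr, an id such as hashString("p1score") can also be used as a case label in a switch statement.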
implementation ideas
- Runtime hashing can be slow
- Doing many of them can take a long time
- One way to avoid this is to offline process the source code
- look for occurrences of the function call and replace it with the hashed number
- Another way is to create a static variable to intern the string ids
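The static-variable approach can be sketched like this, assuming hypothetical helpers hashString (32-bit FNV-1a here), stringTable and internString. Interning also records the original text, which allows collisions to be detected and ids to be printed while debugging.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

using StringId = std::uint32_t;

// Hypothetical runtime hash (32-bit FNV-1a); any good string hash would do.
inline StringId hashString(const std::string& s) {
    std::uint32_t h = 2166136261u;
    for (unsigned char c : s) {
        h ^= c;
        h *= 16777619u;
    }
    return h;
}

// Global table mapping each id back to the text it was created from.
inline std::unordered_map<StringId, std::string>& stringTable() {
    static std::unordered_map<StringId, std::string> table;
    return table;
}

// Hashes the string, records it in the table, and returns its id.
inline StringId internString(const std::string& s) {
    const StringId id = hashString(s);
    const auto result = stringTable().emplace(id, s);
    // If the id already existed it must belong to the same text,
    // otherwise two different strings have collided.
    assert(result.first->second == s);
    return id;
}

inline bool isPlayerOneWin(StringId id) {
    // The static local is initialized on the first call only,
    // so the string is hashed once no matter how often this runs.
    static const StringId sidP1Wins = internString("p1wins");
    return id == sidP1Wins;
}
```

Every later call to isPlayerOneWin() costs only an integer comparison.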
localization
- best to plan for localization from day 1
- important to understand that ASCII character codes don't support localization at all
- retrain your brain to think in Unicode
unicode
- Like ASCII, Unicode assigns a unique code point to every character or glyph
- When storing characters we use a particular encoding
- The combination of encoding and code point yields a character or glyph
- Common encodings are UTF-32, UTF-8, and UTF-16
UTF-32
- The simplest encoding because all code points are stored in a 32-bit value
- Wasteful
- Most western languages don't use high value code points (wastes at least 2 bytes per character)
- the highest Unicode code point is 0x10FFFF (only 21 bits)
- Easy because we can figure out the length of a string by dividing number of bytes by 4
UTF-8
- Code points are stored with one-byte granularity, but many need more than one byte (up to four)
- It's called Variable Length Encoding (VLE) or MultiByte Character Sets (MBCS)
- It's backwards compatible with 7-bit ASCII
- The 128 ASCII characters need only 7 bits, so each fits in one byte with the first bit clear
- Multibyte characters have the first bit of every byte set to one
- The leading bits of the first byte indicate how many bytes (two to four) are in the code point
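A short sketch of how the leading bits are read in practice. The function names are hypothetical, and the code assumes its input is already valid UTF-8 (it does no validation).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of bytes in the sequence that starts with lead byte b,
// or 0 if b is a continuation byte (10xxxxxx) or not a valid lead byte.
inline int utf8SequenceLength(unsigned char b) {
    if ((b & 0x80) == 0x00) return 1;  // 0xxxxxxx: plain ASCII
    if ((b & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((b & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((b & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;
}

// Counts code points by counting every byte that is not a continuation byte.
inline std::size_t countCodePoints(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) != 0x80) ++count;
    }
    return count;
}
```

For the string "héllo", where é is encoded as the two bytes 0xC3 0xA9, the byte length is 6 but the code point count is 5, which is why byte counts cannot be used as character counts.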
UTF-16
- a bit simpler than UTF-8, but more expensive
- code points stored as one or two 16-bit values
- also called Wide Character Set (WCS)
- Unicode contains 17 planes that each contain \(2^{16}\) code points
- First plane called Basic Multilingual Plane (BMP)
- Most commonly used characters are present in this plane
- The other planes are called supplementary planes
- Code points in these planes require two 16-bit values (a surrogate pair)
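The split into one or two 16-bit values can be sketched as follows, assuming a hypothetical encodeUtf16 function that takes a single code point. Code points in the BMP are stored directly; the rest are split into a high and a low surrogate.

```cpp
#include <cassert>
#include <string>

// Encodes one Unicode code point as UTF-16.
inline std::u16string encodeUtf16(char32_t codePoint) {
    // Must be a valid code point and not itself in the surrogate range.
    assert(codePoint <= 0x10FFFF);
    assert(!(codePoint >= 0xD800 && codePoint <= 0xDFFF));

    std::u16string out;
    if (codePoint < 0x10000) {
        // Basic Multilingual Plane: a single 16-bit value.
        out.push_back(static_cast<char16_t>(codePoint));
    } else {
        // Supplementary planes: subtract 0x10000, leaving a 20-bit value,
        // then store the top 10 bits and the bottom 10 bits separately.
        const char32_t v = codePoint - 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));    // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));  // low surrogate
    }
    return out;
}
```

For example, U+00E9 stays a single value, while U+1F600 becomes the pair 0xD83D 0xDE00.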
UCS-2
- A subset of UTF-16 containing only the BMP
- Its main advantage is that it is fixed length
- UTF-16 and UTF-8 are variable length
- Can be stored in little endian or big endian
- Often stored with a Byte Order Mark (BOM)
char and wchar_t
- standard C/C++ define two data types for characters
- char is used for legacy ANSI strings or for MBCS
- wchar_t is used to represent any valid code point
- could be 8, 16, or 32 bits
- to write truly platform-independent code you will need to define your own character data types
Unicode on Windows
- In Windows, wchar_t is used exclusively for UTF-16 encoding and char for ANSI encoding
- Windows API defines three sets of character/string functions
ANSI | WCS | MBCS |
strcmp() | wcscmp() | _mbscmp() |
strcpy() | wcscpy() | _mbscpy() |
strlen() | wcslen() | _mbstrlen() |
- There are also translation functions like wcstombs()
unicode on consoles
- Xbox 360 uses WCS strings
- At Naughty Dog, they only use char strings
- foreign languages are handled with UTF-8 encoding
other localization concerns
- Need to translate more than strings
- audio clips
- textures, if they have English words on them
- symbols may not mean what you think they mean
- market-specific game-rating issues (e.g., blood changes the teen rating in Japan)
- Localization database
- need a way to convert string ids to human readable strings
- the form is up to you; it varies from CSV files to full databases
id | english | french |
p1score | "Player 1 Score" | "Grade Joueur 1" |
p2score | "Player 2 Score" | "Grade Joueur 2" |
p1wins | "Player one wins!" | "Joueur un gagne!" |
p2wins | "Player two wins!" | "Joueur deux gagne!" |
final notes on localization
- Establish a set of functions early to handle localization
- Force developers to use those functions instead of using string literals in the code
- Create a configuration system to allow the language to be set
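A minimal sketch of such a lookup function, using the rows of the example localization table. The names Language, gLanguage, Translations and localize are hypothetical, and the table is hard-coded here only for illustration; in a real engine it would be loaded from the localization database.

```cpp
#include <cassert>
#include <map>
#include <string>

enum class Language { English, French };

// Hypothetical setting; a real engine would read this from its configuration system.
static Language gLanguage = Language::English;

struct Translations {
    std::string english;
    std::string french;
};

// Hard-coded stand-in for the localization database.
static const std::map<std::string, Translations>& localizationTable() {
    static const std::map<std::string, Translations> table = {
        {"p1score", {"Player 1 Score", "Grade Joueur 1"}},
        {"p2score", {"Player 2 Score", "Grade Joueur 2"}},
        {"p1wins", {"Player one wins!", "Joueur un gagne!"}},
        {"p2wins", {"Player two wins!", "Joueur deux gagne!"}},
    };
    return table;
}

// Game code calls localize("p1wins") instead of embedding the literal text.
std::string localize(const std::string& id) {
    const auto& table = localizationTable();
    const auto it = table.find(id);
    if (it == table.end()) {
        // Make missing translations obvious on screen instead of failing silently.
        return "<missing:" + id + ">";
    }
    return gLanguage == Language::French ? it->second.french : it->second.english;
}
```

Returning a visible placeholder for unknown ids makes untranslated strings easy to spot during testing.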
engine configuration
most engines require the ability to save and load config files
many ways to do this
- text files
- compressed binary files
- windows registry
- command line options
- environment variables
- online user profiles
game vs user options
- Be careful to separate the game settings from those of a particular user
- On Windows machines you can create your own folder under the user's Application Data directory
- You can also use the special key HKEY_CURRENT_USER in the registry to store settings information
examples
- Quake uses CVARs, named values with a set of flags
- These are stored in a linked list and the values are retrieved by name
- The flags indicate, for example, whether the value should persist by being written to a file
- OGRE3D
- uses text files in the Windows INI format
- Uncharted
- Several mechanisms including Scheme
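The CVAR idea described for Quake (named values with flags, kept in a linked list and looked up by name) can be sketched as follows. This is not Quake's code: the names CVar, CVarRegistry and kCVarPersist are hypothetical, values are kept as strings, and std::forward_list stands in for a hand-written linked list.

```cpp
#include <cassert>
#include <forward_list>
#include <string>

// Hypothetical flag bits; the names are illustrative only.
enum CVarFlags : unsigned {
    kCVarNone = 0,
    kCVarPersist = 1u << 0  // write this value to the config file
};

struct CVar {
    std::string name;
    std::string value;
    unsigned flags;
};

class CVarRegistry {
public:
    // Creates the variable, or overwrites the value of an existing one.
    void set(const std::string& name, const std::string& value,
             unsigned flags = kCVarNone) {
        if (CVar* existing = find(name)) {
            existing->value = value;
            existing->flags |= flags;
        } else {
            vars_.push_front(CVar{name, value, flags});
        }
    }

    // Linear search by name; returns nullptr when there is no such variable.
    CVar* find(const std::string& name) {
        for (CVar& v : vars_) {
            if (v.name == name) return &v;
        }
        return nullptr;
    }

    // Produces one "name value" line for every variable flagged as persistent.
    std::string serializePersistent() const {
        std::string out;
        for (const CVar& v : vars_) {
            if (v.flags & kCVarPersist) out += v.name + " " + v.value + "\n";
        }
        return out;
    }

private:
    std::forward_list<CVar> vars_;
};
```

Only the flagged variables end up in the saved file, which keeps temporary or debug values out of the user's configuration.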