Saturday, September 24, 2011

Havok's memory system

Chapter 2. Memory Management

Havok's memory management system is designed to maximize performance based on how Havok uses memory, as well as to provide the flexibility needed to fit the memory requirements of your application. This chapter describes the components of the memory system, how to customize the system to your needs, and how to get detailed reports on memory use at any time.

1. Overview

There are three major components in the Havok memory system:

hkMemoryAllocator
The lowest level interface through which blocks can be allocated and freed
hkMemoryRouter
A per-thread collection of hkMemoryAllocators, e.g. it contains allocators for regular heap, temporary allocations and debug allocations.
hkMemorySystem
High-level interface to the system as a whole, e.g. gathering statistics or setting up a new thread's memory router. Generally contains all the allocators for all threads.
Their relationships are as follows: the memory system owns and arranges the plumbing of the allocators; the memory routers point to allocators and only see a small part of the system; and the allocators themselves have an even more restricted overall view.
Each of these components is discussed in more detail in the following sections.

1.1. hkMemoryRouter Class

hkMemoryRouter is a facade object through which all allocations and deallocations are routed. It is little more than a structure containing pointers to allocators for each allocation pattern. The class is nonvirtual - its behaviour is customized by replacing the allocator pointers:

heap
Allocations which will typically last longer than a single frame, for example rigid bodies, constraints, the world and their contained arrays. May be freed by any thread and have no particular lifetime.
temp
Temporary allocation. Lifetime is restricted to the current scope, thus the same thread allocates and frees. Very often but not always in a LIFO pattern.
debug
Allocator used by profiling and checking code so as not to change the main memory profile. For example monitor streams, memory checking utilities and debug visualizers all use this allocator.
solver
Special allocator used exclusively by the physics constraint solver. Requests are made only from the constraint solver and their lifetime is temporary, so solver memory may be freed outside of the physics step. Requests are often for large contiguous blocks, so this allocator has been separated from the temp allocator and, unlike the temp allocator, will likely be shared between all threads.
stack
Allocator for pure last-in-first-out allocations, essentially a restricted form of the temp allocator. On most platforms with the default setup, this is the temp allocator. (On the PlayStation®3 SPU, this is not actually a virtual allocator class but is a simple pointer bump and all calls are inlined.)
Each thread has its own router instance, though usually many of the objects behind the facade are shared across multiple threads.
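The facade idea can be sketched in a few lines. This is a minimal illustration, not the Havok classes: Allocator, CountingAllocator, Router and makeRouter are hypothetical names, and the choice of which allocators are shared is an assumption based on the descriptions above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for an allocator interface (not the Havok class).
struct Allocator {
    virtual ~Allocator() = default;
    virtual void* blockAlloc(std::size_t numBytes) = 0;
    virtual void blockFree(void* p, std::size_t numBytes) = 0;
};

// malloc-backed allocator that counts the bytes currently handed out.
struct CountingAllocator : Allocator {
    std::size_t liveBytes = 0;
    void* blockAlloc(std::size_t numBytes) override {
        void* p = std::malloc(numBytes);
        if (p) liveBytes += numBytes;
        return p;
    }
    void blockFree(void* p, std::size_t numBytes) override {
        if (!p) return;
        liveBytes -= numBytes;
        std::free(p);
    }
};

// Nonvirtual facade: nothing but one allocator pointer per allocation pattern.
struct Router {
    Allocator* heap = nullptr;
    Allocator* temp = nullptr;
    Allocator* debug = nullptr;
    Allocator* solver = nullptr;
    Allocator* stack = nullptr;
};

// Build the router for one thread. Assumption for this sketch: heap, solver
// and debug are shared between threads, temp is thread-local, and stack
// aliases temp as in the default setup described above.
inline Router makeRouter(Allocator& sharedHeap, Allocator& sharedSolver,
                         Allocator& sharedDebug, Allocator& threadTemp) {
    Router r;
    r.heap = &sharedHeap;
    r.solver = &sharedSolver;
    r.debug = &sharedDebug;
    r.temp = &threadTemp;
    r.stack = &threadTemp;
    return r;
}
```

Two routers built over the same shared allocators let one thread free a heap block that another thread allocated, while each thread keeps its own temp allocator.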

1.2. hkMemoryAllocator Interface

hkMemoryAllocator is the basic interface for allocations, providing the pure virtual methods blockAlloc and blockFree to allocate and deallocate blocks of memory. Note that the size must be explicitly given to blockFree, which reduces per-allocation overhead because the allocator does not have to store it.

Note

By default, Havok expects allocated memory to be 16-byte aligned. You will need to take this into account if you override the memory allocation methods.
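As a rough illustration of both points (sizes passed back on free, and 16-byte alignment), the sketch below implements a tiny fixed-capacity arena. BlockAllocator and ArenaAllocator are hypothetical names, and the method signatures are assumptions rather than the real hkMemoryAllocator declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical interface in the spirit of blockAlloc/blockFree.
struct BlockAllocator {
    virtual ~BlockAllocator() = default;
    virtual void* blockAlloc(std::size_t numBytes) = 0;
    virtual void blockFree(void* p, std::size_t numBytes) = 0;
};

// Round n up to the next multiple of 16.
inline std::size_t roundUp16(std::size_t n) {
    return (n + 15) & ~static_cast<std::size_t>(15);
}

// Fixed-capacity bump arena. No per-block header is stored: the caller
// passes the size back to blockFree, which is all the bookkeeping needs.
class ArenaAllocator : public BlockAllocator {
public:
    void* blockAlloc(std::size_t numBytes) override {
        const std::size_t padded = roundUp16(numBytes);
        if (padded > kCapacity - m_top) return nullptr;  // arena exhausted
        void* p = m_storage + m_top;  // 16-aligned base + multiple of 16
        m_top += padded;
        m_inUse += padded;
        return p;
    }
    void blockFree(void* p, std::size_t numBytes) override {
        if (!p) return;
        m_inUse -= roundUp16(numBytes);
        if (m_inUse == 0) m_top = 0;  // everything returned: rewind the arena
    }
    std::size_t bytesInUse() const { return m_inUse; }

private:
    static const std::size_t kCapacity = 4096;
    alignas(16) unsigned char m_storage[kCapacity];
    std::size_t m_top = 0;
    std::size_t m_inUse = 0;
};
```

Every block starts at a multiple of 16 bytes from a 16-byte aligned base, so the alignment requirement holds without storing anything in front of the block.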

1.2.1. Buffer allocations

The bufAlloc, bufFree and bufRealloc methods may be overridden to optimize handling of resizable blocks. They are used where the caller can make use of a block larger than the requested size, should the allocator return one, and where the allocator should be given the opportunity to resize an allocation without copying. hkArray is the major user of this interface. An implementation may take advantage of this fact by trying to leave space after buffer allocations so that a future reallocation will succeed without moving the buffer. The default implementation of the buffer allocator is implemented using the block allocator.
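A minimal sketch of how buffer methods can fall back to block methods is shown below. The class names and the in/out size parameter are assumptions made for illustration and are not taken from the Havok headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical allocator whose buffer methods default to the block methods.
struct BufAllocator {
    virtual ~BufAllocator() = default;
    virtual void* blockAlloc(std::size_t numBytes) { return std::malloc(numBytes); }
    virtual void blockFree(void* p, std::size_t /*numBytes*/) { std::free(p); }

    // reqNumBytesInOut holds the requested size on entry and the granted
    // size on return. The default grants exactly what was asked for.
    virtual void* bufAlloc(std::size_t& reqNumBytesInOut) {
        return blockAlloc(reqNumBytesInOut);
    }
    virtual void bufFree(void* p, std::size_t numBytes) { blockFree(p, numBytes); }

    // Default realloc: allocate a new block, copy, free the old one.
    virtual void* bufRealloc(void* old, std::size_t oldNumBytes,
                             std::size_t& reqNumBytesInOut) {
        void* fresh = blockAlloc(reqNumBytesInOut);
        if (!fresh) return nullptr;  // old block is left untouched on failure
        if (old) {
            const std::size_t keep =
                oldNumBytes < reqNumBytesInOut ? oldNumBytes : reqNumBytesInOut;
            std::memcpy(fresh, old, keep);
            blockFree(old, oldNumBytes);
        }
        return fresh;
    }
};

// An override that rounds requests up to a power of two and reports the
// extra capacity to the caller, which an array class could then use.
struct RoundingAllocator : BufAllocator {
    void* bufAlloc(std::size_t& reqNumBytesInOut) override {
        std::size_t granted = 16;
        while (granted < reqNumBytesInOut) granted *= 2;
        reqNumBytesInOut = granted;
        return blockAlloc(granted);
    }
};
```

A growable array that receives 128 bytes for a 100-byte request can record the larger capacity and postpone its next reallocation.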

1.2.2. Optimizing for space

Memory managers often store "magic" information, such as size, just in front of the allocated memory. On modern machines the overhead for this may be up to 64 bytes because of alignment issues. Havok's memory manager tries to save this space using two techniques:
  • Many simple classes (e.g. hkAabb) know their size, and so the memory allocator doesn't need to explicitly store it - the class knows how much space to free when it is deleted. These classes have their sizes compiled directly into their overridden operators new and delete. See HK_DECLARE_NONVIRTUAL_CLASS_ALLOCATOR.
  • For classes derived from hkReferencedObject, the allocated size is stored directly in the instance. See m_memSizeAndFlags in HK_DECLARE_CLASS_ALLOCATOR.
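The first technique can be sketched as follows. SizedHeap, DECLARE_SIZED_ALLOCATOR and Aabb are hypothetical names invented for this illustration; the real macros may expand quite differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical backend that needs the size on free and stores no header.
struct SizedHeap {
    static std::size_t& liveBytes() {
        static std::size_t n = 0;
        return n;
    }
    static void* alloc(std::size_t numBytes) {
        void* p = std::malloc(numBytes);
        if (!p) throw std::bad_alloc();
        liveBytes() += numBytes;
        return p;
    }
    static void free(void* p, std::size_t numBytes) {
        if (!p) return;
        liveBytes() -= numBytes;
        std::free(p);
    }
};

// Hypothetical macro: the class size is compiled into both operators, so
// delete can hand the correct size back without reading it from memory.
// Only valid for classes that are never deleted through a base pointer.
#define DECLARE_SIZED_ALLOCATOR(TYPE)                                   \
    static void* operator new(std::size_t) {                            \
        return SizedHeap::alloc(sizeof(TYPE));                          \
    }                                                                   \
    static void operator delete(void* p) {                              \
        SizedHeap::free(p, sizeof(TYPE));                               \
    }

struct Aabb {
    DECLARE_SIZED_ALLOCATOR(Aabb)
    float min[4];
    float max[4];
};
```

After `Aabb* box = new Aabb();` the backend has handed out exactly sizeof(Aabb) bytes, and `delete box;` returns the same amount.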

1.2.3. Allocator utilities

The default allocators all supply 16-byte aligned allocations and require the size to be given when deallocating. hkMemoryRouter has utility methods - easyAlloc, alignedAlloc - to allocate blocks with greater alignment and to stash the allocation size so it need not be supplied. They work by allocating a little more than requested and storing some bookkeeping information before the returned pointer, so the corresponding utility free must be used.
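The size-stashing trick can be sketched like this. SizedBackend, easyAllocSketch and easyFreeSketch are hypothetical names, and the 16-byte header is an assumption chosen so that the user pointer keeps the backend's alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical backend that requires the size on free.
struct SizedBackend {
    std::size_t liveBytes = 0;
    void* blockAlloc(std::size_t n) {
        void* p = std::malloc(n);
        if (p) liveBytes += n;
        return p;
    }
    void blockFree(void* p, std::size_t n) {
        if (!p) return;
        liveBytes -= n;
        std::free(p);
    }
};

// Space reserved in front of the user pointer (assumed value).
const std::size_t kHeaderBytes = 16;

// Allocate a little extra and remember the total size in the header.
inline void* easyAllocSketch(SizedBackend& backend, std::size_t numBytes) {
    const std::size_t total = numBytes + kHeaderBytes;
    unsigned char* raw = static_cast<unsigned char*>(backend.blockAlloc(total));
    if (!raw) return nullptr;
    std::memcpy(raw, &total, sizeof(total));
    return raw + kHeaderBytes;
}

// Recover the stashed size, then free the original block with it.
inline void easyFreeSketch(SizedBackend& backend, void* p) {
    if (!p) return;
    unsigned char* raw = static_cast<unsigned char*>(p) - kHeaderBytes;
    std::size_t total = 0;
    std::memcpy(&total, raw, sizeof(total));
    backend.blockFree(raw, total);
}
```

Passing a pointer from easyAllocSketch directly to blockFree would free the wrong address with the wrong size, which is why the matching utility free must be used.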

1.2.4. Marking unused memory

Some memory allocator implementations overwrite freed/uninitialized memory with known values to aid in debugging. hkFreeListAllocator overwrites freed memory and newly allocated memory with 0x7ffa110c when HK_DEBUG is defined. The allocator used in hkCheckingMemorySystem always overwrites its memory.
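A sketch of this kind of scrubbing is shown below. Only the value 0x7ffa110c comes from the text above; scrub and looksScrubbed are hypothetical helpers, and filling in 32-bit words is an assumption about how the pattern is applied.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Fill value quoted in the text above.
const std::uint32_t kFreePattern = 0x7ffa110c;

// Overwrite a block with the repeated 32-bit pattern. Any trailing bytes
// beyond a multiple of four are left untouched in this sketch.
inline void scrub(void* p, std::size_t numBytes) {
    unsigned char* bytes = static_cast<unsigned char*>(p);
    for (std::size_t i = 0; i + sizeof(kFreePattern) <= numBytes;
         i += sizeof(kFreePattern)) {
        std::memcpy(bytes + i, &kFreePattern, sizeof(kFreePattern));
    }
}

// Debugging aid: does the word at p still hold the fill pattern?
inline bool looksScrubbed(const void* p) {
    std::uint32_t word = 0;
    std::memcpy(&word, p, sizeof(word));
    return word == kFreePattern;
}
```

Seeing this value in a pointer or float field in the debugger is a strong hint that the object was read after being freed or before being initialized.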

1.3. hkMemorySystem Interface

The hkMemorySystem is for operations on the memory system as a whole. For example it has methods for getting memory statistics and for setting up and tearing down the memory system.

1.3.1. Initialization

The memory system used by Havok must be initialized before the hkBaseSystem is initialized.
To initialize a memory system, two tasks need to be performed:
  • Set a global memory system object for global methods on the memory system
  • Initialize each of the allocators in a hkMemoryRouter instance and return it
Since all knowledge of the memory setup is encapsulated in the memory system class, each thread must ask the system to set up its memory router appropriately. One thread must be designated the "main" thread - not necessarily the main system thread, just the first Havok-aware thread to be initialized and the last to be deinitialized. The relevant methods on hkMemorySystem are mainInit and mainQuit. (Note that hkMemoryInitUtil calls these methods for you.)
After the main thread has been set up, each worker thread should call threadInit and threadQuit. These methods are implicitly called from the mainInit and mainQuit so should not be called in the main thread.
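The ordering rules can be captured in a small single-threaded model. MemorySystemModel is a hypothetical class that only borrows the method names; it performs no allocation and makes no claim about how the real system detects misuse.

```cpp
#include <cassert>
#include <stdexcept>

// Toy model of the init/quit ordering described above.
class MemorySystemModel {
public:
    // First call of all: sets up the system and the main thread.
    void mainInit() {
        if (m_mainReady) throw std::logic_error("mainInit called twice");
        m_mainReady = true;
    }
    // Worker threads may only join once the main thread is set up.
    void threadInit() {
        if (!m_mainReady) throw std::logic_error("threadInit before mainInit");
        ++m_workers;
    }
    void threadQuit() {
        if (m_workers == 0) throw std::logic_error("threadQuit without threadInit");
        --m_workers;
    }
    // Last call of all: every worker must have quit already.
    void mainQuit() {
        if (!m_mainReady) throw std::logic_error("mainQuit before mainInit");
        if (m_workers != 0) throw std::logic_error("mainQuit while workers remain");
        m_mainReady = false;
    }
    bool mainReady() const { return m_mainReady; }
    int workers() const { return m_workers; }

private:
    bool m_mainReady = false;
    int m_workers = 0;
};
```

In this model the only valid sequence is mainInit, any number of matched threadInit and threadQuit pairs, then mainQuit.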

1.3.2. Garbage Collection

Many allocators cache their freed memory to satisfy future requests and do not release it to their parents unless explicitly requested. The memory system has several methods to release such cached memory. Because of the lockless nature of the thread local allocators, normally each thread needs to release its own cache back to the shared pool and a later call releases the shared cache back to the higher level allocator. This must be done manually by calling the memory system's garbageCollect method (see also garbageCollectThread and garbageCollectShared). This method will cause the allocator to optimize memory in space - potentially making more, and larger, chunks of memory available. Garbage collection takes some work, so calling it every frame is not recommended. It is better to call it in between levels, or at a point where you know your memory usage is going to change significantly.
Note that a garbage collection cannot make all 'free' memory available - it only frees a block of memory if all of the contained blocks are free. Therefore if you allocated all of the memory in 128 byte chunks and then freed every other chunk, a 256 byte allocation would still fail: no pool will have been freed, and moreover memory is now fragmented such that there is no contiguous block of 256 bytes.
When hkFreeListAllocator cannot allocate it will automatically perform a garbage collection and attempt to allocate again. It is still recommended that you perform a garbage collection independently of this mechanism, as otherwise the automatic garbage collection could happen in a time-critical section of your application.
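The 128 byte example can be checked with a toy model. ChunkPoolModel is a hypothetical structure that only tracks which chunks are in use; the page size of eight chunks is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: memory is a set of pages, each split into equal-sized chunks.
// A page can be released only when every chunk in it is free.
struct ChunkPoolModel {
    std::size_t chunkBytes;
    std::vector<std::vector<bool> > pages;  // true = chunk in use

    // Start with every chunk of every page allocated.
    ChunkPoolModel(std::size_t chunkBytes_, std::size_t chunksPerPage,
                   std::size_t numPages)
        : chunkBytes(chunkBytes_),
          pages(numPages, std::vector<bool>(chunksPerPage, true)) {}

    void freeChunk(std::size_t page, std::size_t chunk) {
        pages[page][chunk] = false;
    }

    // Release fully free pages and return how many were released.
    std::size_t garbageCollect() {
        std::size_t released = 0;
        for (std::size_t i = 0; i < pages.size();) {
            bool anyUsed = false;
            for (std::size_t c = 0; c < pages[i].size(); ++c) {
                anyUsed = anyUsed || pages[i][c];
            }
            if (anyUsed) {
                ++i;
            } else {
                pages.erase(pages.begin() + static_cast<std::ptrdiff_t>(i));
                ++released;
            }
        }
        return released;
    }

    // Size in bytes of the longest run of adjacent free chunks in any page.
    std::size_t largestFreeRunBytes() const {
        std::size_t best = 0;
        for (std::size_t p = 0; p < pages.size(); ++p) {
            std::size_t run = 0;
            for (std::size_t c = 0; c < pages[p].size(); ++c) {
                run = pages[p][c] ? 0 : run + 1;
                if (run > best) best = run;
            }
        }
        return best * chunkBytes;
    }
};
```

With every other chunk freed, half of the memory is free, yet garbageCollect releases no page and the largest contiguous free run is still a single 128 byte chunk.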
Note that calling threadQuit implicitly releases cached memory. Also hkFreeListMemorySystem will release cached memory when the temporary parts are released.

1.3.3. Reusing Temporary Memory

Some memory is only needed during specific intervals of execution. For instance the temp allocators, solver allocators and all thread stacks are completely empty outside of Havok SDK calls and could be reused while the Havok SDK is not being used. Most calls will require both a stack and a temp allocator, but the solver allocator is only ever accessed from the constraint solver, so can safely be reused when the world is not being stepped.
To support this, memory systems may allow a thread's resources to be partially torn down. The following example shows a thread freeing its temporary parts while sleeping between workloads. The main thread would have a similar mainInit and mainQuit pair which would free any shared resources.
// One-time setup: initialize the persistent parts of this thread's router.
hkMemorySystem& memSystem = hkMemorySystem::getInstance();
hkMemoryRouter memRouter;
memSystem.threadInit( memRouter, "worker", hkMemorySystem::FLAG_PERSISTENT );
hkBaseSystem::initThread(&memRouter);

while( waitForStartSignal() == HK_SUCCESS )
{
    // Set up the temporary parts only while there is work to do.
    memSystem.threadInit( memRouter, "worker", hkMemorySystem::FLAG_TEMPORARY );
    doWork();
    // Free the temporary parts so the memory can be reused while idle.
    memSystem.threadQuit( memRouter, hkMemorySystem::FLAG_TEMPORARY );
    sendWorkDoneSignal();
}

// Final teardown: release the persistent parts.
hkBaseSystem::quitThread();
memSystem.threadQuit( memRouter, hkMemorySystem::FLAG_PERSISTENT );

1.3.4. Memory Statistics

All memory systems support getting a summary of all memory allocated and in use by the memory system, via the getHeapStatistics method. This contains information for each allocator in the router. Values which are not applicable for a given memory system are set to -1.
Some memory systems also support getting snapshots to gather detailed data about the memory that is currently allocated by the memory system, via the getMemorySnapshot method. These snapshots contain the following information about each allocation, and can be used to give detailed reports on memory use:
  • A pointer to the allocation.
  • The size (in bytes) of the allocation.
  • A trace of the call stack when the allocation was made.
Please see the Memory Reporting section for more information on working with memory statistics.
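To show the kind of report such per-allocation data makes possible, here is a small sketch that totals live bytes per allocating function. AllocationRecord and bytesByCallSite are hypothetical; the real snapshot type is not shown in this chapter, and the sketch assumes the call stack has already been resolved to symbol names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical record holding the three items listed above.
struct AllocationRecord {
    const void* address;                 // pointer to the allocation
    std::size_t numBytes;                // size of the allocation in bytes
    std::vector<std::string> callStack;  // innermost frame first (assumed)
};

// Total the live bytes attributed to each innermost call site.
inline std::map<std::string, std::size_t> bytesByCallSite(
        const std::vector<AllocationRecord>& snapshot) {
    std::map<std::string, std::size_t> totals;
    for (const AllocationRecord& rec : snapshot) {
        const std::string site =
            rec.callStack.empty() ? "<unknown>" : rec.callStack.front();
        totals[site] += rec.numBytes;
    }
    return totals;
}
```

Sorting the resulting totals by size gives a quick view of which code paths hold the most memory at the moment the snapshot was taken.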

1.4. Overriding Class New and Delete

Havok uses several macros to declare class local memory management. These ensure that all Havok objects are allocated by the appropriate allocators in the Havok memory system.
The HK_DECLARE_CLASS_ALLOCATOR macro overrides class new and delete so that classes derived from hkReferencedObject are automatically handled correctly by the memory system. Note that the memory manager expects to be able to find a hkReferencedObject at the object address, so when using multiple inheritance the hkReferencedObject should come first in the inheritance list.
Unfortunately, nonvirtual classes cannot inherit from a memory management class, since some compilers do not implement the empty base object optimization. Each of these classes is instead required to use the HK_DECLARE_NONVIRTUAL_CLASS_ALLOCATOR macro. The memory class parameter is the same as in the virtual case. The name of the class must also be passed so that its size can be compiled into its operator new and delete.
Some classes can only usefully provide placement operators new and delete (most notably the memory classes themselves). These classes use HK_DECLARE_PLACEMENT_ALLOCATOR() and must be allocated statically or via another memory system and constructed in place.
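In plain C++ terms, a placement-only class looks roughly like the sketch below. PlacementOnly is a hypothetical class written for this illustration and does not show what HK_DECLARE_PLACEMENT_ALLOCATOR() actually expands to.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// A class that declares only the placement forms of operator new/delete.
// Declaring them at class scope hides the global operators, so a plain
// `new PlacementOnly(1)` or `delete ptr` does not compile.
class PlacementOnly {
public:
    explicit PlacementOnly(int id) : m_id(id) {}
    int id() const { return m_id; }

    // Construct into storage the caller already owns.
    static void* operator new(std::size_t, void* where) { return where; }
    // Matching placement delete, used only if the constructor throws.
    static void operator delete(void*, void*) {}

private:
    int m_id;
};
```

A caller provides suitably aligned storage, for example `alignas(PlacementOnly) static unsigned char storage[sizeof(PlacementOnly)];`, constructs with `new (storage) PlacementOnly(7)`, and later ends the object's lifetime by calling its destructor explicitly.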
