Resource Management — Generation-Based Handles, Buffers, Images, and Samplers
The RHI resource layer is the memory backbone of the renderer. Every GPU buffer, texture, and sampler flows through a single class — ResourceManager — which wraps raw Vulkan objects behind lightweight, validated handles. The design solves two problems at once: it hides VMA allocation boilerplate behind clean creation descriptors, and it prevents an entire class of use-after-free bugs through a generation counter baked into every handle. This page dissects the handle mechanism, the three resource pools (buffers, images, samplers), the upload pipeline, and the format abstraction layer that decouples upper code from Vulkan enum types.
Sources: types.h, resources.h, resources.cpp
Generation-Based Handle Architecture
Why Generations?
In a renderer, GPU resources are created and destroyed frequently — textures load, render targets resize, buffers get replaced. A raw index into an array is dangerous: if slot 3 is freed and then reused for a different texture, any stale handle still pointing to slot 3 will silently reference the wrong resource. The classic fix is a generation counter per slot.
Each handle stores two 32-bit values: an index (which slot in the pool) and a generation (which "version" of that slot). When a slot is freed, its generation increments. When the slot is reused, the new handle captures the incremented generation. Any stale handle from before the free will carry the old generation, and a single uint32_t comparison catches the mismatch in debug builds via assert.
Sources: types.h
The Three Handle Types
All three handle types — ImageHandle, BufferHandle, and SamplerHandle — share the same structure and semantics. They are value types (copyable, comparable), default-constructed to an invalid state (index = UINT32_MAX), and validated through a valid() method:
| Handle Type | Default Index | Invalid Condition | Equality Operator |
|---|---|---|---|
ImageHandle |
UINT32_MAX |
index == UINT32_MAX |
default (compares both fields) |
BufferHandle |
UINT32_MAX |
index == UINT32_MAX |
default |
SamplerHandle |
UINT32_MAX |
index == UINT32_MAX |
default |
A fourth related type, BindlessIndex, breaks the pattern — it is a single uint32_t with no generation counter, because bindless texture indices are never recycled within a session (they index into a grow-only descriptor array).
Sources: types.h
Slot Lifecycle Diagram
The following diagram shows the complete lifecycle of any resource slot — creation, lookup, destruction, and the safety net that the generation provides:
Sources: resources.cpp, resources.cpp, resources.cpp
The Slot Pool Pattern
All three resource pools (buffers, images, samplers) use an identical allocation strategy. Each pool is a std::vector<T> paired with a std::vector<uint32_t> free list. The free list behaves like a LIFO stack — the most recently freed slot is the first to be reused:
- Allocation (
allocate_*_slot()): If the free list is non-empty, pop the back element. Otherwise,emplace_back()a default-constructed slot and returnsize - 1. - Release (
destroy_*()): Destroy the Vulkan/VMA object, null out the handle, increment the generation counter, and push the index onto the free list. - Lookup (
get_*()): Assert the handle's generation matches the slot's generation, then return aconstreference to the internal storage.
This design avoids slot fragmentation (no hole-punching in a dense array), keeps iteration over all slots trivial for bulk cleanup, and makes the free-list push/pop operations O(1).
Sources: resources.h, resources.cpp
Buffers — Creation, Memory Strategy, and Device Address
BufferDesc and BufferUsage
A BufferDesc carries three fields: size (bytes), usage (bitmask of BufferUsage flags), and memory (one of three MemoryUsage strategies). The BufferUsage enum is designed for bitwise composition — the RHI provides operator|, operator&=, and a has_flag() free function for testing individual bits:
| BufferUsage Flag | Vulkan Equivalent | Typical Use |
|---|---|---|
VertexBuffer |
VERTEX_BUFFER_BIT |
Mesh vertex data |
IndexBuffer |
INDEX_BUFFER_BIT |
Mesh index data |
UniformBuffer |
UNIFORM_BUFFER_BIT |
Per-frame/per-draw constants |
StorageBuffer |
STORAGE_BUFFER_BIT |
GPU-read/write structured data |
TransferSrc |
TRANSFER_SRC_BIT |
Staging buffer source |
TransferDst |
TRANSFER_DST_BIT |
Upload destination |
ShaderDeviceAddress |
SHADER_DEVICE_ADDRESS_BIT |
Ray tracing AS references |
AccelStructBuildInput |
ACCELERATION_STRUCTURE_BUILD_INPUT_READ_ONLY_BIT_KHR |
RT geometry input |
Sources: resources.h, resources.cpp
Memory Strategies via VMA
The MemoryUsage enum maps directly to VMA allocation flags. All strategies use VMA_MEMORY_USAGE_AUTO as the base, which lets VMA choose between pooled and dedicated allocations based on the driver's VkMemoryDedicatedRequirements:
| MemoryUsage | VMA Flags | CPU Access | Typical Use |
|---|---|---|---|
GpuOnly |
None (auto) | No | Vertex/index buffers, storage buffers, images |
CpuToGpu |
HOST_ACCESS_SEQUENTIAL_WRITE | MAPPED |
Write-only, persistent map | Dynamic uniform buffers, staging uploads |
GpuToCpu |
HOST_ACCESS_RANDOM | MAPPED |
Read-write, persistent map | Readback buffers (e.g. GPU query results) |
For CpuToGpu buffers, VMA provides a persistently mapped pointer through VmaAllocationInfo::pMappedData, so the upper layer can write directly without calling vmaMapMemory each frame.
Sources: types.h, resources.cpp
Internal Buffer Storage
Each buffer slot stores: the VkBuffer handle, the VmaAllocation, the VmaAllocationInfo (which contains the mapped pointer for CPU-visible buffers), the original BufferDesc, and the generation counter. The allocation info is preserved because dynamic uniform buffers need the mapped pointer every frame — it would be wasteful to re-query it.
Sources: resources.h
Device Address Retrieval
For ray tracing, buffers need a GPU-visible address (used in acceleration structure build inputs and shader device address references). get_buffer_device_address() wraps vkGetBufferDeviceAddress and asserts that the buffer was created with ShaderDeviceAddress usage. The address is a raw VkDeviceAddress (a 64-bit GPU virtual pointer) that shaders can use directly.
Sources: resources.cpp
Images — Views, Cubemaps, External Registration, and Mip Generation
ImageDesc and ImageUsage
An ImageDesc describes the full image dimensionality: width, height, depth, mip levels, array layers, sample count, format, and usage flags. The ImageUsage enum mirrors Vulkan's image usage bits:
| ImageUsage Flag | Vulkan Equivalent | Typical Use |
|---|---|---|
Sampled |
SAMPLED_BIT |
Texture sampling in shaders |
Storage |
STORAGE_BIT |
Compute write targets |
ColorAttachment |
COLOR_ATTACHMENT_BIT |
Render targets |
DepthAttachment |
DEPTH_STENCIL_ATTACHMENT_BIT |
Depth buffers |
TransferSrc |
TRANSFER_SRC_BIT |
Blit source (mip generation) |
TransferDst |
TRANSFER_DST_BIT |
Upload destination |
Sources: resources.h, resources.cpp
Automatic View Creation
create_image() always creates a default image view alongside the image itself. The view type is inferred from the descriptor: a 6-layer array gets VK_IMAGE_VIEW_TYPE_CUBE, any other multi-layer array gets VK_IMAGE_VIEW_TYPE_2D_ARRAY, and a single-layer image gets VK_IMAGE_VIEW_TYPE_2D. Cubemap images also receive the VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT flag on the VkImageCreateInfo. This convention means callers rarely need to create views manually — they can use the default view directly for sampling or attachment binding.
Both the image and the view receive debug names (the view is named "ImageName [View]") via VK_EXT_debug_utils, making them identifiable in RenderDoc and Nsight Graphics.
Sources: resources.cpp
External Image Registration
Swapchain images are created by Vulkan's swapchain extension, not by the resource manager. To integrate them into the handle system, register_external_image() allocates a slot and records the VkImage, VkImageView, and ImageDesc — but sets allocation = VK_NULL_HANDLE to mark it as externally owned. When the slot is released via unregister_external_image(), the manager increments generation and frees the slot without destroying the Vulkan objects. The bulk destroy() method detects external images by their null allocation and emits a warning instead of destroying them.
Sources: resources.cpp, resources.cpp
Per-Layer Views for Shadow Cascades
For shadow mapping with CSM, each cascade needs its own VkImageView targeting a single layer of a layered depth image. create_layer_view() creates a VK_IMAGE_VIEW_TYPE_2D view with baseArrayLayer = layer and layerCount = 1. The returned VkImageView is caller-owned — it must be destroyed via destroy_layer_view() before the parent image is freed. This is one of the few places where the API returns a raw Vulkan handle instead of an RHI handle, because layer views are short-lived, per-pass resources.
Sources: resources.cpp
Mip Generation
generate_mips() records a chain of vkCmdBlitImage2 calls into the active immediate command buffer. It expects mip level 0 to already be populated (e.g. from upload_image()). The algorithm transitions mip 0 from SHADER_READ_ONLY to TRANSFER_SRC, then iterates from level 1 upward — each level is transitioned from UNDEFINED to TRANSFER_DST, blitted from the previous level with linear filtering, then transitioned to TRANSFER_SRC for the next iteration. Finally, all levels are transitioned to SHADER_READ_ONLY. The image must have both TransferSrc and TransferDst usage flags, and more than one mip level.
Sources: resources.cpp
Samplers — Filtering, Wrapping, and Comparison
SamplerDesc Fields
A SamplerDesc controls every parameter of a VkSampler through a Vulkan-agnostic interface:
| Field | Type | Purpose |
|---|---|---|
mag_filter / min_filter |
Filter (Nearest, Linear) |
Magnification/minification filter |
mip_mode |
SamplerMipMode (Nearest, Linear) |
Inter-mip interpolation |
wrap_u / wrap_v |
SamplerWrapMode (Repeat, ClampToEdge, MirroredRepeat) |
UV address mode |
max_anisotropy |
float |
Max anisotropic filtering level (0 = disabled) |
max_lod |
float |
Maximum accessible mip level (0 = base only, VK_LOD_CLAMP_NONE = all) |
compare_enable |
bool |
Enable depth comparison mode |
compare_op |
CompareOp |
Comparison function for shadow sampling |
Anisotropy is conditionally enabled — only when max_anisotropy > 0.0f. The device's supported maximum is queried at startup and exposed via max_sampler_anisotropy(), so the application can pass it directly to the sampler description.
Sources: resources.h, resources.cpp
Sampler Catalog in Practice
The renderer creates five application-wide samplers at initialization, each tuned for a specific purpose:
| Sampler | Filter | Wrap | Aniso | Compare | Use Case |
|---|---|---|---|---|---|
| Default | Linear/Linear/Linear | Repeat | Max device | No | Standard PBR textures |
| Shadow Comparison | Linear/Linear/Nearest | ClampToEdge | 0 | Yes (GreaterOrEqual) |
PCF shadow sampling |
| Shadow Depth | Nearest/Nearest/Nearest | ClampToEdge | 0 | No | Shadow map depth attachment |
| Nearest Clamp | Nearest/Nearest/Nearest | ClampToEdge | 0 | No | AO/depth buffer reads |
| Linear Clamp | Linear/Linear/Nearest | ClampToEdge | 0 | No | Bilateral filter taps |
Additionally, the scene loader creates per-texture samplers from glTF sampler descriptors, mapping fastgltf::Sampler properties to SamplerDesc fields.
Sources: renderer_init.cpp
Data Upload Pipeline
Immediate Command Scope
All upload operations must occur within a Context::begin_immediate() / Context::end_immediate() scope. This provides a command buffer from a dedicated pool (decoupled from per-frame pools) and guarantees GPU completion via vkQueueWaitIdle before end_immediate() returns. The resource manager asserts context_->is_immediate_active() at the top of every upload method.
Sources: resources.cpp, context.h
Staging Buffer Pattern
Both upload_buffer() and upload_image() follow the same staging pattern:
- Create a CPU-visible staging buffer (
VMA_MEMORY_USAGE_AUTO+HOST_ACCESS_SEQUENTIAL_WRITE | MAPPED). - Memcpy the source data into the mapped pointer.
- Record a copy command (
vkCmdCopyBufferorvkCmdCopyBufferToImage) into the immediate command buffer. - Register the staging buffer for deferred cleanup via
context_->push_staging_buffer(). - Cleanup happens when
end_immediate()destroys all staging buffers after GPU completion.
This design avoids per-upload synchronization — the single vkQueueWaitIdle at the end of the scope covers all uploads within the scope.
Sources: resources.cpp
Multi-Level Upload for Compressed Textures
upload_image_all_levels() handles the KTX2/mip-chain case where all levels are uploaded in a single batch. It creates one staging buffer for the entire mip chain, records one VkBufferImageCopy2 per level, and issues a single vkCmdCopyBufferToImage2 call. For cubemaps (array_layers = 6), each region covers all six faces — matching the KTX2 face-contiguous layout. The MipUploadRegion struct carries buffer_offset, width, and height per level, giving the caller precise control over where each level's data starts within the source buffer.
Sources: resources.cpp, resources.h
Format Abstraction Layer
The Format enum decouples upper-layer code from VkFormat constants. It covers the formats needed by the renderer: 8-bit UNORM/SRGB, 16-bit float, 32-bit float, packed formats (A2B10G10R10, B10G11R11), block-compressed formats (BC5, BC6H, BC7), and depth formats (D32Sfloat, D24UnormS8Uint). Three utility functions bridge the abstraction:
| Function | Input → Output | Use Case |
|---|---|---|
to_vk_format(Format) |
Format → VkFormat | Resource creation |
from_vk_format(VkFormat) |
VkFormat → Format | KTX2 loading |
aspect_from_format(Format) |
Format → VkImageAspectFlags | Barrier construction |
Sources: types.h
Block-Compressed Format Support
BC formats require special handling because they operate on 4×4 texel blocks rather than individual pixels. Three utilities provide the necessary metadata:
format_bytes_per_block(Format): Returns 16 bytes for BC formats (4×4 pixels compressed into one block), 1–16 bytes for uncompressed formats.format_block_extent(Format): Returns{4, 4}for BC formats,{1, 1}for uncompressed.format_is_block_compressed(Format): Simple boolean check for BC5/BC6H/BC7.
These are used by the texture pipeline when computing buffer sizes and copy regions for BC-compressed textures loaded from KTX2 files.
Sources: types.h
Alignment Utility
The template function align_up<T>(value, alignment) rounds up to the next power-of-two boundary using the standard bitmask trick: (value + alignment - 1) & ~(alignment - 1). This is used throughout the codebase for uniform buffer alignment, SBT entry sizing, and scratch buffer offset computation.
Sources: types.h
Integration with the Render Graph
The Render Graph is the primary consumer of ResourceManager at the framework layer. It manages a collection of managed images — persistent GPU textures that survive across frames (color buffers, depth buffers, AO outputs, temporal history). Each managed image holds an ImageHandle to its backing resource. When the window resizes, the render graph calls set_reference_resolution(), which triggers a rebuild of all size-relative managed images: each is destroyed via destroy_image() and recreated with the new dimensions via create_image(). The generation mechanism ensures that any stale render graph references to the old image are caught.
Temporal managed images (used for temporal anti-aliasing and temporal AO) maintain a double buffer — a current backing and a history_backing. Both are destroyed and recreated together on resize, and the history is marked invalid until one full frame has been rendered.
Sources: render_graph.cpp, render_graph.h
Bulk Destruction and Leak Detection
ResourceManager::destroy() iterates all slots in all three pools and destroys any remaining resources. It counts "leaked" resources (slots where the Vulkan handle is not null) and logs a warning with the counts. For images, it distinguishes between owned images (VMA allocation present → destroy view + image + allocation) and external images (null allocation → emit warning about forgotten unregister_external_image() call). This catch-all destruction provides a safety net: even if upper layers forget to clean up, the resource manager won't leak GPU objects past shutdown.
Sources: resources.cpp
Next steps: With resource handles understood, the natural progression is to see how images and samplers are wired into the GPU via the Bindless Descriptor Architecture — Three-Set Layout and Texture Registration, and how the Render Graph orchestrates handle-to-image resolution and barrier insertion across render passes.