- Reimplements render target views used for sampling
- Optimizes access using an encoded control token
- Adds proper encoding for 24-bit textures (DRGB8 -> ORGB/OBGR)
- Adds proper encoding for ABGR textures (ABGR8 -> ARGB8)
- Silence some compiler warnings as well
- TODO: Real texture views for OGL current method is a hack
- Separate TXB from TXL: They are completely different!
- Properly perform TMU emulation in the fragment shader. Implemens SRGB conversion and alphakill at the moment
- Properly perform ROP emulation in the fragment shader. Implements FRAMEBUFFER_SRGB. While support on the chip looks to be incomplete (and wierd), it does work
- Document some more bits in SHADER_CONTROL register
- Export some debug information in the free texture register space components zw
Very useful when analysing renderdoc captures
- Enable shadow comparison on depth as long as compare function is active and texture is uploaded for depth read
Some engines (UE3) read all the components in the shader and use mul/mad with the result
- Track asynchronous operations in RSX core
- Add read barriers to force pending writes to finish.
Fixes zcull delay flicker in all UE3 titles without forcing hard stall
- Increase zcull latency as all writes should be synchronized now
- ZCULL unit emulation rewritten
- ZCULL reports are now deferred avoiding pipeline stalls
- Minor optimizations; replaced std::mutex with shared_mutex where contention is rare
- Silence unnecessary error message
- Small improvement to out of memory handling for vulkan and slightly bump vertex buffer heap
- Removes the duplicate local_transform_constants
- Resets the transform constants on every context reset
- Simplifies the code abit which should make it faster
- NOTE: Transform constants are persistent across context re-init events (VF5)
- TODO: Alot of work is still needed to execute draw commands out of order
Thats the only solution to games sending many draw calls with high frequency of state changes
- Do not bother rechecking the dirty sampler pool for hits. Its faster to create new sampler than to search the pool
- Reserve some memory on vertex layout struct to reduce reallocation penalty
- opengl driver optimization for nvidia. On nvidia glTextureBufferRange performance is horrendous
-- Initialize texture buffer to whole buffer at startup and use absolute offsets to read data instead
-- Over 2x performance in some cases (Resogun, TNT racers)
- gl/vk: Do not flip non-existent display buffers. Fixes spec violation at boot in TNT racers demo
- whitespace fixes for sys_rsx
- The scale offset matrix is fine but on real hardware the z results seem to be independent of near/far clipping distances
-- If depth falls within near/far, clamp depth value to [0,1]
- Track last working state and reset to it if RSX starts to desync
-- This is especially useful when running vulkan since the renderer will easily outpace the rest of the system when merely recording draw commands
- Ignore empty sets
-- Mark empty/invalid IB sets as having 0 element counts.
- Reorganize storage hash vs ucode hash
- Scan for actual fragment program start in case leading NOPed code precedes the actual instructions
-- e.g FEAR2 Demo has over 32k of padding before actual program code that messes up hashes
- Use AA mode to predict surface compression. Compression mode is useless without AA activated
- Rewrites most image subresource fetch routines to use the new heuristic
- Fix rsx:🧵:find_tile. FEED000(X) can be substituted for (X) in the code
-- Fixes alot of failures when looking for tiled regions
rsx: Fix antialiased unnormalized coords
- scaling factors are inverse to allow proper coordinates to be computed in fs
- Reimplement fragment program fetch and rewrite texture upload mechanism
-- All of these steps should only be done at most once per draw call
-- Eliminates continously checking the surface store for overlapping addresses as well
addenda - critical fixes
- gl: Bind TIU before starting texture operations as they will affect the currently bound texture
- vk: Reuse sampler objects if possible
- rsx: Support for depth resampling for depth textures obtained via blit engine
vk/rsx: Minor fixes
- Fix accidental imageview dereference when using WCB if texture memory occupies FB memory
- Invalidate dirty framebuffers (strict mode only)
- Normalize line endings because VS is dumb
rsx: Conditional lock hack removed
vulkan - Fixes
- Remove unused texture class
- Fix native pitch calculation (WCB)
rsx: Catch hanging begin/end pairs when flushing deferred draw calls
vulkan: Register DXT compressed formats
vulkan: Register depth formats
gl: Workaround for 'texture stitching' when gathering flip surface
- TODO: Add a proper flip hack option
rsx: Fix texture memory size calculation
- DXT textures dont have real pitch. Since pitch is used to calculate memory size, make sure it always evaluates to rsx_size
rsx: Fix cpu copy detection
rsx: Validate blit dst surface and dont make assumptions about region blit order
- Also relax restrictions on memory owned by the blit engine if strict rendering is not enabled
rsx: Fix depth texture detection
rsx: Do not manually offset into dst. The overlapped range check does so automatically
rsx: Minor optimizations
rsx: Minor fixes
- Fix to detect incompatible formats when using GPU texture scaling and show message
- Better 'is_depth_texture' algorithm to eliminate false positives
- Limits buffer size to min 720 in the Y axis (1024 section causes conflicts in some cases - TODO)
rsx: Fixups to allow large textures for blit operation
- Also includes checks for both leaking sections and blit regions for vulkan
hotfix for hanging when using WCB
addendum - unlock both ro and no blocks before attempting to copy memory blocks
gl: Fixups for ARB_explicit_uniform_location
- Forces glsl v 430 to make use of the extension
rsx/vk: Rework texture cache to minimize recursive access violations
- Also modifies the vulkan commandbuffer begin/end/submit mechanism
gl: Fix cached_texture_section::is_flushable to take memory protection into account
rsx: Fix blit dst offset calculation