- The fixed-point D24S8 format does special Z clamping during compare which matches PS3 behaviour
- D32S8 is a floating point format and comparison with Dref > 1 always fails causing black edges/borders
formats
- Allows D24S8 and D32S8 transport via typeless channels
- Allows uploading and downloading D24S8 data easily
- TODO: Implement optional byteswapping to fix flushed readbacks with
the same method
- Apply dither to edges that almost fail the straight-up alpha test
- Significantly improves alpha tested geometry far from the camera
- Also removes blend factor overrides/hacks as they give incorrect results due to background bleeding
- Remove the required_xxx_pitch constraint as it makes no sense. The pitch controls what can be written per line.
- It is possible to have a huge surface width but only render to a small region at the beginning and have a smaller pitch than can fit the surface (NFS carbon)
- Do not bind companion framebuffer when clearing single aspect; let the
contest mechanism sort it out instead
- Do not prematurely tag framebuffers, instead only do so at
write-confirmation time. Should avoid false tagging if setup does not
allow a render to occur.
- Immediate mode is isolated from the rest of the vertex configuration
- TODO: Verify register behaviour when immediate mode is used
Check if per-primitive const register values are supported (likely are)
* Restore stack in fifo error handling
* Update get register after the cmd execution
* Fix put pause in the middle of command
* Add restore points when branching to self
* Precise nopcmd detection
* Test all invalid cmds for early treatment of queue corruption
- Do not assume flip marks end-of-frame if executed via syscall
- Also disables skip_frame for these applications as there is no frame boundary
- NOTE: QUEUE_HEAD cannot be relied on as it is seemingly possible to flip the same head and not need to queue it
- Tries to detect when FIFO preprocessing is beneficial and only enables optimizations if the benefit outweighs the cost
- Current threshold is at least 500 draw calls saved at over 2000 draw calls to justify the overhead
- TODO: More tuning for other CPUs
- Improve vertex attribute layout format. Allows for full 16-bit attribute divisor
- Use actual pitch when declaring framebuffer rsx pitch instead of register value in case of swizzle? rendering
- Also fix visual corruption when using disjoint indexed draws
- Refactor draw call emit again (vk)
- Improve execution barrier resolve
- Allow vertex/index rebase inside begin/end pair
- Add ALPHA_TEST to list of excluded methods [TODO: defer raster state]
- gl bringup
- Simplify
- using the simple_array gets back a few more fps :)