Carry bit is now resident in a register-allocated GPR instead of being backed directly into IML instructions
All the PowerPC carry ADD* and SUB* instructions as well as SRAW/SRAWI have been reworked to use more generalized IML instructions for handling carry
IML instructions now support two named output registers instead of only one (easily extendable to arbitrary count)
Storing the condition result in a register instead of imitating PPC CR lets us simplify the backend a lot. Only implemented as PoC for BDZ/BDNZ so far.
Instead of having fixed macros for BCCTR/BCCTRL/BCLR/BCLRL we now have only one single macro instruction that takes the jump destination as a register parameter.
This also allows us to reuse an already loaded LR register (by something like MTLR) instead of loading it again from memory.
As a necessary requirement for this: The register allocator now has support for read operations in suffix instructions
Intermediate commit while I'm still fixing things but I didn't want to pile on too many changes in a single commit.
New:
Reworked PPC->IML converter to first create a graph of basic blocks and then turn those into IML segment(s). This was mainly done to decouple IML design from having PPC specific knowledge like branch target addresses. The previous design also didn't allow to preserve cycle counting properly in all cases since it was based on IML instruction counting.
The new solution supports functions with non-continuous body. A pretty common example for this is when functions end with a trailing B instruction to some other place.
Current limitations:
- BL inlining not implemented
- MFTB not implemented
- BCCTR and BCLR are only partially implemented
Undo vcpkg change
We were using VK_EXT_DEPTH_CLIP_ENABLE but didn't actually request it.
Also fixed an assert when closing Cemu caused by incorrectly tracking the number of allocated pipelines
This allows a savy user, developer or modder to change the comment field of a logging breakpoint to include placeholders such as {r3} or {f3} to log the register values whenever that code is hit.
The result is treated as signed in most cases, but the calculation uses unsigned arithmetic.
As a concrete example where this matters, DS VC passes -1 (2^64-1) to OSWaitEventWithTimeout which internally causes an overflow. But only with unsigned arithmetic this will result in a large positive number that behaves like the intended infinite timeout. With signed arithmetic the result is negative and the events will timeout immediately.