* Import memory
* 64K pages and fix memory mapping
* Queue coverage
* Buffer syncing, faulted readback adn BDA in Buffer
* Base DMA implementation
* Preparations for implementing SPV DMA access
* Base impl (pending 16K pages and getbuffersize)
* 16K pages and stack overflow fix
* clang-format
* clang-format but for real this time
* Try to fix macOS build
* Correct decltype
* Add testing log
* Fix stride and patch phi node blocks
* No need to check if it is a deleted buffer
* Clang format once more
* Offset in bytes
* Removed host buffers (may do it in another PR)
Also some random barrier fixes
* Add IR dumping from my read-const branch
* clang-format
* Correct size insteed of end
* Fix incorrect assert
* Possible fix for NieR deadlock
* Copy to avoid deadlock
* Use 2 mutexes insteed of copy
* Attempt to range sync error
* Revert "Attempt to range sync error"
This reverts commit dd287b48682b50f215680bb0956e39c2809bf3fe.
* Fix size truncated when syncing range
And memory barrier
* Some fixes (and async testing (doesn't work))
* Use compute to parse fault buffer
* Process faults on submit
* Only sync in the first time we see a readconst
Thsi is partialy wrong. We need to save the state into the submission context itself, not the rasterizer since we can yield and process another sumission (if im not understanding wrong).
* Use spec const and 32 bit atomic
* 32 bit counter
* Fix store_index
* Better sync (WIP, breaks PR now)
* Fixes for better sync
* Better sync
* Remove memory coveragte logic
* Point sirit to upstream
* Less waiting and barriers
* Correctly checkout moltenvk
* Bring back applying pending operations in wait
* Sync the whole buffer insteed of only the range
* Implement recursive shared/scoped locks
* Iterators
* Faster syncing with ranges
* Some alignment fixes
* fixed clang format
* Fix clang-format again
* Port page_manager from readbacks-poc
* clang-format
* Defer memory protect
* Remove RENDERER_TRACE
* Experiment: only sync on first readconst
* Added profiling (will be removed)
* Don't sync entire buffers
* Added logging for testing
* Updated temporary workaround to use 4k pages
* clang.-format
* Cleanup part 1
* Make ReadConst a SPIR-V function
---------
Co-authored-by: georgemoralis <giorgosmrls@gmail.com>
* Squashed initial implementation
* Logging for checking if buffers are memory contiguous
* Add check to see if first instruction is valid in the next buffer to avoid false positives
* Oof
* Replace old code with IndecisiveTurtle's new, better implementation
* Add `unlikely` keyword to the split packet handling branches
Co-authored-by: TheTurtle <47210458+raphaelthegreat@users.noreply.github.com>
---------
Co-authored-by: IndecisiveTurtle <47210458+raphaelthegreat@users.noreply.github.com>
* ir_passes: Add barrier at end of block too
* vk_platform: Always assign names to resources
* texture_cache: Better overlap handling
* liverpool: Avoid resuming ce_task when its finished
* spirv_quad_rect: Skip default attributes
Fixes some crashes
* memory: Improve buffer size clamping
* liverpool: Relax binary header validity check
* liverpool: Stub SetPredication with a warning
* Better than outright crash
* emit_spirv: Implement round to zero mode
* liverpool: queue::pop takes the front element
* image_info: Remove obsolete assert
The old code assumed the mip only had 1 layer thus a right overlap could not return mip 0. But with the new path we handle images that are both mip-mapped and multi-layer, thus this can happen
* tile_manager: Fix size calculation
* spirv_quad_rect: Skip default attributes
---------
Co-authored-by: poly <47796739+polybiusproxy@users.noreply.github.com>
Co-authored-by: squidbus <175574877+squidbus@users.noreply.github.com>
* coroutine code prettification
* asc queues submission refactoring
* better asc ring context handling
* final touches and review notes
* even more simplification for context saving
* libkernel: Cleanup some function places
* kernel: Refactor thread functions
* kernel: It builds
* kernel: Fix a bunch of bugs, kernel thread heap
* kernel: File cleanup pt1
* File cleanup pt2
* File cleanup pt3
* File cleanup pt4
* kernel: Add missing funcs
* kernel: Add basic exceptions for linux
* gnmdriver: Add workload functions
* kernel: Fix new pthreads code on macOS. (#1441)
* kernel: Downgrade edeadlk to log
* gnmdriver: Add sceGnmSubmitCommandBuffersForWorkload
* exception: Add context register population for macOS. (#1444)
* kernel: Pthread rewrite touchups for Windows
* kernel: Multiplatform thread implementation
* mutex: Remove spamming log
* pthread_spec: Make assert into a log
* pthread_spec: Zero initialize array
* Attempt to fix non-Windows builds
* hotfix: change incorrect NID for scePthreadAttrSetaffinity
* scePthreadAttrSetaffinity implementation
* Attempt to fix Linux
* windows: Address a bunch of address space problems
* address_space: Fix unmap of region surrounded by placeholders
* libs: Reduce logging
* pthread: Implement condvar with waitable atomics and sleepqueue
* sleepq: Separate and make faster
* time: Remove delay execution
* Causes high cpu usage in Tohou Luna Nights
* kernel: Cleanup files again
* pthread: Add missing include
* semaphore: Use binary_semaphore instead of condvar
* Seems more reliable
* libraries/sysmodule: log module on `sceSysmoduleIsLoaded`
* libraries/kernel: implement `scePthreadSetPrio`
---------
Co-authored-by: squidbus <175574877+squidbus@users.noreply.github.com>
Co-authored-by: Daniel R. <47796739+polybiusproxy@users.noreply.github.com>
* devtools: fix showing entire depth instead of bits
* devtools: show button for stage instead of menu bar
- fix batch view dockspace not rendering when window collapsed
* devtools: removed useless "Batch" collapse & don't collapse last batch
* devtools: refactor DrawRow to templating
* devtools: reg popup size adjusted to the content
* devtools: better window names
* devtools: regview layout compacted
* devtools: option to show collapsed frame dump
keep most popups open when selection changes
best popup windows positioning
* devtools: show compute shader regs
* devtools: tips popup
* devtools: pm4 - show markers
* SaveDataDialogLib: fix compile with mingw
* devtools: pm4 - show program state
* devtools: pm4 - show program disassembly
* devtools: pm4 - show frame regs
* devtools: pm4 - show color buffer info as popup
add ux improvements for open new windows with shift+click
better window titles
* imgui: skip all textures to avoid hanging with crash diagnostic enabled
not sure why this happens :c
* devtools: pm4 - show reg depth buffer
* shader_recompiler: Add more format swap modes
* texture_cache: Handle stencil texture reads
* emulator: Support loading font library
* readme: Add thanks section
* shader_recompiler: Constant buffers as integers
* shader_recompiler: Typed buffers as integers
* shader_recompiler: Separate thread bit scalars
* We can assume guest shader never mixes them with normal sgprs. This helps avoid errors where ssa could view an sgpr write dominating a thread bit read, due to how control flow is structurized, even though its not possible in actual control flow
* shader_recompiler: Implement data append/consume operations
* clang format
* buffer_cache: Simplify invalidation scheme
* video_core: Remove some invalidation remnants
* adjust