* Implement sceKernelMapDirectMemory2
Behaves similarly to sceKernelMapDirectMemory, but has a type parameter.
* Simplify
No need to copy all the MapDirectMemory code over, can just call the function, then do the SetDirectMemoryType call
* Clang
* Update sceKernelMemoryPoolExpand
Hardware tests show that this function is basically the same as sceKernelAllocateDirectMemory, with some minor differences.
Update the memory searching code to match my updated AllocateDirectMemory code, with appropriate error conditions.
* Update MemoryPoolReserve
Only difference between real hw and our code is behavior with addr = 0.
* Don't coalesce PoolReserved areas.
Real hardware doesn't coalesce them.
* Update PoolCommit
Plenty of edge case behaviors to handle here.
Addresses are treated as fixed, EINVAL is returned for bad mappings, name should be preserved from PoolReserving, committed areas should coalesce, reserved areas get their phys_base updated
* Formatting
* Adjust fixed PoolReserve path
Hardware tests suggest this will overwrite all VMAs in the range. Run UnmapMemoryImpl on the full area, then reserve. Same logic applies to normal reservations too.
Also adjusts logic of the non-fixed path to more closely align with hardware observations.
* Remove phys_base modifications
This can be handled later. Doing the logic properly would likely take work in MergeAdjacent, and would probably need to be applied to normal dmem mappings too.
* Use VMAHandle.Contains()
Why do extra math when we have a function specifically for this?
* Update memory.cpp
* Remove unnecessary code
Since I've removed those two asserts, these two lines of code effectively do nothing.
* Clang
* Fix names
* Fix PoolDecommit
Should fix the address space regressions in UE titles on Windows.
* Fix error log
Should make the cause of this clearer?
* Clang
* Oops
* Remove coalesce on PoolCommit
Windows makes this more difficult.
* Track pool budgets
If you try to commit more pooled memory than is allocated, PoolCommit returns ENOMEM.
Also fixes error conditions for PoolDecommit, that should return EINVAL if given an address that isn't part of the pool.
Note: Seems like the pool budget can't hit zero? I used a <= comparison based on hardware tests, otherwise we're able to make more mappings than real hardware can.
* Fix VirtualQuery behavior on low addresses.
* Fix VirtualQuery struct
Somewhere in our BitField and array use, the size of our VirtualQuery struct became larger than the struct used on real hardware.
Fixing this fixes some data corruption visible in the name parameter during my tests.
* Default name to anon
On real hardware, nameless mappings are given the name "anon:address" where address appears to be the address that made the memory call.
For simplicity sake, I'll stick to the name "anon" for now.
* Place an upper bound on returns from SearchFree
Right now, this upper bound is set based on the limitations of our GPU buffer cache and page table.
Someone with more experience in that area of code should probably fix that at some point.
* More anons
* Clang
* Fix name in sceKernelMapNamedDirectMemory
* strncpy instead of strcpy
Hardcoded the constant size for now, I need to review how real hardware behaves here to determine if anything else is necessary for this to be accurate.
* Fix name behavior
All memory naming functions restrict the name size to a 31 character limit, and return `ORBIS_KERNEL_ERROR_ENAMETOOLONG` if that limit is exceeded.
Since this value is constant for all functions involving names, I've defined it as a constant in kernel's memory.h, and used that in place of any hardcoded 32 character limits.
* Error logging
Hopefully this helps in catching the UFC regression?
* Increase address space upper bound
Probably needs heavy testing, especially on Mac/Windows.
This increases the address space, as needed to accommodate strange memory behaviors seen in UFC.
* VirtualQuery fix
Due to limitations of certain platforms, we initialize our vma_map with 3 separate free mappings.
As such, we need to use a while loop here to accurately query mappings with high addresses
* Fix mappings to high addresses
The PS4's GPU can only handle 40bit addresses. Our texture cache and buffer cache were designed around these limits, and mapping to higher addresses would cause segmentation faults and access violations.
To fix these crashes, only map to the GPU if the mapping is fully contained within the address space the GPU should access.
I'm open to suggestions on how to make this cleaner
* Revert "Increase address space upper bound"
This reverts commit 3d50eeeebb.
* Revert VirtualQuery while loop
Windows wasn't happy with this, again.
Will try to debug and properly fix this when I have a good chance.
* Fix asserts
FindVMA, due to the way it's programmed, never actually returns vma_map.end(), the furthest it ever returns is the last valid memory area. All those asserts we involving vma_map.end() never actually trigger due to this.
This commit removes redundant asserts, adds messages to asserts that were lacking them, and fixes all asserts designed to detect out of bounds memory accesses so they actually trigger.
I've also fixed some potential memory safety issues.
* Proper error behavior in QueryProtection
Might as well handle this properly while I'm here.
* Clang
* More information about ReserveVirtualRange results
Should help debug issues like the one in The Order: 1886 (CUSA00076)
* Fix assert message
* Update assert message
Extra space
* Fix my bug
Oh hey, finally something that's my fault.
* Fix rasterizer unmaps
Should use adjusted_size here, otherwise we could unmap too much.
Thanks to diegolix29 for spotting this.
* Fix edge case in MapMemory
Code comments explain everything.
This should fix some memory asserts.
* Fix fix
Avoid running the code path if it's unnecessary, since there are many additional edge cases to handle when the VMA map is small.
* Fix fix fix
Should prevent infinite loops, haven't tested properly yet though.
* Split logging for inputs and out_addr in ReserveVirtualRange
Addresses review comments.
* Update memory.cpp
* Clean logic
FindDmemArea guarantees that the first dmem area we check contains search_start. Any dmem areas beyond the first one will be entirely past search_start, so checking against it in the loop is unnecessary.
* Emulate memory behavior of libSceGnmDriver _DT_INIT
Due to the unique way some games check for sceKernelAllocateDirectMemory failures, emulating this properly is necessary.
* Clang
* Fix address input for direct memory call
* Fix bug with DirectQueryAvailable
Missed this in my prior PR.
* DirectQueryAvailable fix
Fixes error cases to be more hardware accurate.
* AllocateDirectMemory fixes
search_start and search_end were ignored in certain cases, this fixes that issue.
I've also basically rewritten the function in the process, since the lack of documentation made it difficult to make the proper adjustments.
* DirectQueryAvailable fixes
remaining_size was calculated incorrectly in cases where a free dmem_area had a base earlier than search_start, or an end after search_end.
* Reduce sceKernelGetDirectMemorySize log severity
By this point, we've confirmed that sceKernelGetDirectMemorySize is hardware-accurate. There's no reason to clog logs with this function, which games usually call before every sceKernelAllocateDirectMemory call.
* Clang
* Fix phys_addr_out
phys_addr_out should be equal to search_start in cases where search_start is greater than the dmem_area base.
* Dividing by zero is fun
Need to check for alignment when aligning things.
* Update memory.cpp
* Clang
* ir_passes: Add barrier at end of block too
* vk_platform: Always assign names to resources
* texture_cache: Better overlap handling
* liverpool: Avoid resuming ce_task when its finished
* spirv_quad_rect: Skip default attributes
Fixes some crashes
* memory: Improve buffer size clamping
* liverpool: Relax binary header validity check
* liverpool: Stub SetPredication with a warning
* Better than outright crash
* emit_spirv: Implement round to zero mode
* liverpool: queue::pop takes the front element
* image_info: Remove obsolete assert
The old code assumed the mip only had 1 layer thus a right overlap could not return mip 0. But with the new path we handle images that are both mip-mapped and multi-layer, thus this can happen
* tile_manager: Fix size calculation
* spirv_quad_rect: Skip default attributes
---------
Co-authored-by: poly <47796739+polybiusproxy@users.noreply.github.com>
Co-authored-by: squidbus <175574877+squidbus@users.noreply.github.com>
* Implement protecting multiple VMAs
A handful of games expect this to work, and updated versions of Grand Theft Auto V crash if it doesn't work.
* Clang
* memory: Consider flexible mappings as gpu accessible
Multiple guest apps do this with perfectly valid sharps in simple shaders. This needs some hw testing to see how it is handled but for now doesnt hurt to handle it
* memory: Clamp large buffers to mapped area
Sometimes huge buffers can be bound that start on some valid mapping but arent fully contained by it. It is not reasonable to expect the game needing all of the memory, so clamp the size to avoid the gpu tracking assert
* clang-format fix
---------
Co-authored-by: georgemoralis <giorgosmrls@gmail.com>
* Handle errors in sceKernelAllocateDirectMemory
* Improve accuracy of error cases
Some of our existing cases are normally EAGAIN returns.
* Improve logging on errors
* Clang
* TEMPORARY HACK FOR NBA TESTS
This will be removed before this PR is marked as ready, and is only here to make sure the other NBA games (and perhaps DOA3) work if some missing init behavior is handled.
* Revert "TEMPORARY HACK FOR NBA TESTS"
This reverts commit a0e27b0229.
* Change error message
* Unmap memory in chunks if spanning over multiple VMAs
* clang
* Merge fixups
* Minor code style changes
* Update function declarations
---------
Co-authored-by: Marcin Mikołajczyk <marcinmikolajcz@gmail.com>
* core: Split error codes into separate files
* Reduces build times and is cleaner
* core: Bring structs and enums to codebase style
* core: More style changes
* libkernel: Cleanup some function places
* kernel: Refactor thread functions
* kernel: It builds
* kernel: Fix a bunch of bugs, kernel thread heap
* kernel: File cleanup pt1
* File cleanup pt2
* File cleanup pt3
* File cleanup pt4
* kernel: Add missing funcs
* kernel: Add basic exceptions for linux
* gnmdriver: Add workload functions
* kernel: Fix new pthreads code on macOS. (#1441)
* kernel: Downgrade edeadlk to log
* gnmdriver: Add sceGnmSubmitCommandBuffersForWorkload
* exception: Add context register population for macOS. (#1444)
* kernel: Pthread rewrite touchups for Windows
* kernel: Multiplatform thread implementation
* mutex: Remove spamming log
* pthread_spec: Make assert into a log
* pthread_spec: Zero initialize array
* Attempt to fix non-Windows builds
* hotfix: change incorrect NID for scePthreadAttrSetaffinity
* scePthreadAttrSetaffinity implementation
* Attempt to fix Linux
* windows: Address a bunch of address space problems
* address_space: Fix unmap of region surrounded by placeholders
* libs: Reduce logging
* pthread: Implement condvar with waitable atomics and sleepqueue
* sleepq: Separate and make faster
* time: Remove delay execution
* Causes high cpu usage in Tohou Luna Nights
* kernel: Cleanup files again
* pthread: Add missing include
* semaphore: Use binary_semaphore instead of condvar
* Seems more reliable
* libraries/sysmodule: log module on `sceSysmoduleIsLoaded`
* libraries/kernel: implement `scePthreadSetPrio`
---------
Co-authored-by: squidbus <175574877+squidbus@users.noreply.github.com>
Co-authored-by: Daniel R. <47796739+polybiusproxy@users.noreply.github.com>
* Skip destruction of adaptive mutex initializers
Based around similar behaviors implemented in the more-kernel branch. Hatsune Miku Project Diva X needs this.
* Fix posix_lseek result overflow
Seen when testing Spider-Man Miles Morales on more-kernel.
* Add posix_fsync
Used by Spider-Man Miles Morales. I've added the normal posix error handling to this function, though I'm aware sceKernelFsync doesn't return any errors currently. This is for future proofing and accuracy, as the actual libkernel does the usual error handling too.
* Properly handle VirtualQuery calls on PoolReserved memory.
* Export posix_getpagesize under libkernel
Bloons TD 5 needs this.
* Clang
* I hate programming and will furiously smash my monitor if I ever see another oversight of this caliber ever again in my goddamn life
* Merge both protect functions together
The memory_type default is based on fpPS4 behavior.
I'm not entirely sure how the offset should be handled, but since the value we use defaults to 0 anyway, that should be better than leaving random data in that area.
Found this issue while looking at code from fpPS4. VirtualQuery was setting is_commited to true when the queried region was reserved.
Also sets the protection value in the VirtualQueryInfo, as I'd assume not storing that could cause issues in games.
This fixes all games currently hanging on the sceKernelmprotect stub.
* memory: Size direct memory based on requested flexible memory.
* memory: Guard against OrbisProcParam without an OrbisKernelMemParam.
* memory: Account for alignment in direct memory suitability checks and add more debugging.
fix: file name typo constant_propogation_pass.cpp
fix typo from 'symbol_vitrual_addr' variable
fix typo in emit_spirv_context_get_set.cpp
fix typo from constant_propagation_pass.cpp in CMakeLists
fix typo in these some config.cpp functions
- setSliderPosition
- setSliderPositionGrid
- getSliderPosition
- getSliderPositionGrid
fix typo inside src\core\aerolib\stubs.cpp
fix typo in a comment from src\core\file_format\pkg.cpp
fix typo inside src\core\file_sys\fs.cpp + fs.h
- NeedsCaseInsensiveSearch -> NeedsCaseInsensitiveSearch
fix 2 function typos: sceAppContentAddcontEnqueueDownloadByEntitlemetId and sceAppContentAddcontMountByEntitlemetId
fix typo on comment inside src\core\libraries\kernel\file_system.cpp
fix typo on src\core\libraries\videoout\driver.cpp
fix typo in src\core\memory.cpp
fix typo from comment in src\qt_gui\game_list_utils.h
fix typo in src\video_core\amdgpu\liverpool.h
- window_offset_disble to window_offset_disable
fix typo from comments in src\video_core\host_shaders\detile_m32x1.comp + detile_m32x2.comp
- subotimal -> suboptimal
fix typo from comment in src\video_core\renderer_vulkan\renderer_vulkan.cpp
- dimentions -> dimensions
fix typo from enum in src\common\debug.h and other files
- MarkersPallete -> MarkersPalette
fix last typo in src\video_core\amdgpu\pm4_opcodes.h
- PremableCntl -> PreambleCntl
* core/libraries/kernel: Fix inaccurate direct memory size
* core/memory: Fix available dmem query on non-free dmem areas
* core/kernel: return ENOMEM if memory area size is zero
* core/kernel: Fix returns on `sceKernelAvailableDirectMemorySize`
* core/memory: Remove unneeded size alignment
* Fix in searchFree should fix#337
* clang format fix
* sceKernelSetVirtualRangeName implementation
* improved vaddr conversion
* updated VirtualQuery to include name too
* unmap also removed name thanks @red_prig
* fixed copy...