shadPS4/src/core/memory.h
Stephen Miller 6477dc4f1e
Some checks are pending
Build and Release / reuse (push) Waiting to run
Build and Release / clang-format (push) Waiting to run
Build and Release / get-info (push) Waiting to run
Build and Release / windows-sdl (push) Blocked by required conditions
Build and Release / windows-qt (push) Blocked by required conditions
Build and Release / macos-sdl (push) Blocked by required conditions
Build and Release / macos-qt (push) Blocked by required conditions
Build and Release / linux-sdl (push) Blocked by required conditions
Build and Release / linux-qt (push) Blocked by required conditions
Build and Release / linux-sdl-gcc (push) Blocked by required conditions
Build and Release / linux-qt-gcc (push) Blocked by required conditions
Build and Release / pre-release (push) Blocked by required conditions
Core: Memory Fixes (#2872)
* Fix VirtualQuery behavior on low addresses.

* Fix VirtualQuery struct

Somewhere in our BitField and array use, the size of our VirtualQuery struct became larger than the struct used on real hardware.
Fixing this fixes some data corruption visible in the name parameter during my tests.

* Default name to anon

On real hardware, nameless mappings are given the name "anon:address" where address appears to be the address that made the memory call.
For simplicity sake, I'll stick to the name "anon" for now.

* Place an upper bound on returns from SearchFree

Right now, this upper bound is set based on the limitations of our GPU buffer cache and page table.
Someone with more experience in that area of code should probably fix that at some point.

* More anons

* Clang

* Fix name in sceKernelMapNamedDirectMemory

* strncpy instead of strcpy

Hardcoded the constant size for now, I need to review how real hardware behaves here to determine if anything else is necessary for this to be accurate.

* Fix name behavior

All memory naming functions restrict the name size to a 31 character limit, and return `ORBIS_KERNEL_ERROR_ENAMETOOLONG` if that limit is exceeded.
Since this value is constant for all functions involving names, I've defined it as a constant in kernel's memory.h, and used that in place of any hardcoded 32 character limits.

* Error logging

Hopefully this helps in catching the UFC regression?

* Increase address space upper bound

Probably needs heavy testing, especially on Mac/Windows.
This increases the address space, as needed to accommodate strange memory behaviors seen in UFC.

* VirtualQuery fix

Due to limitations of certain platforms, we initialize our vma_map with 3 separate free mappings.
As such, we need to use a while loop here to accurately query mappings with high addresses

* Fix mappings to high addresses

The PS4's GPU can only handle 40bit addresses. Our texture cache and buffer cache were designed around these limits, and mapping to higher addresses would cause segmentation faults and access violations.
To fix these crashes, only map to the GPU if the mapping is fully contained within the address space the GPU should access.
I'm open to suggestions on how to make this cleaner

* Revert "Increase address space upper bound"

This reverts commit 3d50eeeebb.

* Revert VirtualQuery while loop

Windows wasn't happy with this, again.
Will try to debug and properly fix this when I have a good chance.

* Fix asserts

FindVMA, due to the way it's programmed, never actually returns vma_map.end(), the furthest it ever returns is the last valid memory area. All those asserts we involving vma_map.end() never actually trigger due to this.
This commit removes redundant asserts, adds messages to asserts that were lacking them, and fixes all asserts designed to detect out of bounds memory accesses so they actually trigger.
I've also fixed some potential memory safety issues.

* Proper error behavior in QueryProtection

Might as well handle this properly while I'm here.

* Clang

* More information about ReserveVirtualRange results

Should help debug issues like the one in The Order: 1886 (CUSA00076)

* Fix assert message

* Update assert message

Extra space

* Fix my bug

Oh hey, finally something that's my fault.

* Fix rasterizer unmaps

Should use adjusted_size here, otherwise we could unmap too much.
Thanks to diegolix29 for spotting this.

* Fix edge case in MapMemory

Code comments explain everything.
This should fix some memory asserts.

* Fix fix

Avoid running the code path if it's unnecessary, since there are many additional edge cases to handle when the VMA map is small.

* Fix fix fix

Should prevent infinite loops, haven't tested properly yet though.

* Split logging for inputs and out_addr in ReserveVirtualRange

Addresses review comments.
2025-05-09 12:33:04 -07:00

284 lines
7.8 KiB
C++

// SPDX-FileCopyrightText: Copyright 2024 shadPS4 Emulator Project
// SPDX-License-Identifier: GPL-2.0-or-later
#pragma once
#include <map>
#include <mutex>
#include <string_view>
#include "common/enum.h"
#include "common/singleton.h"
#include "common/types.h"
#include "core/address_space.h"
#include "core/libraries/kernel/memory.h"
namespace Vulkan {
class Rasterizer;
}
namespace Libraries::Kernel {
struct OrbisQueryInfo;
}
namespace Core::Devtools::Widget {
class MemoryMapViewer;
}
namespace Core {
enum class MemoryProt : u32 {
NoAccess = 0,
CpuRead = 1,
CpuReadWrite = 2,
GpuRead = 16,
GpuWrite = 32,
GpuReadWrite = 48,
};
DECLARE_ENUM_FLAG_OPERATORS(MemoryProt)
enum class MemoryMapFlags : u32 {
NoFlags = 0,
Shared = 1,
Private = 2,
Fixed = 0x10,
NoOverwrite = 0x0080,
NoSync = 0x800,
NoCore = 0x20000,
NoCoalesce = 0x400000,
};
DECLARE_ENUM_FLAG_OPERATORS(MemoryMapFlags)
enum class VMAType : u32 {
Free = 0,
Reserved = 1,
Direct = 2,
Flexible = 3,
Pooled = 4,
PoolReserved = 5,
Stack = 6,
Code = 7,
File = 8,
};
struct DirectMemoryArea {
PAddr base = 0;
size_t size = 0;
int memory_type = 0;
bool is_pooled = false;
bool is_free = true;
PAddr GetEnd() const {
return base + size;
}
bool CanMergeWith(const DirectMemoryArea& next) const {
if (base + size != next.base) {
return false;
}
if (is_free != next.is_free) {
return false;
}
return true;
}
};
struct VirtualMemoryArea {
VAddr base = 0;
size_t size = 0;
PAddr phys_base = 0;
VMAType type = VMAType::Free;
MemoryProt prot = MemoryProt::NoAccess;
bool disallow_merge = false;
std::string name = "";
uintptr_t fd = 0;
bool is_exec = false;
bool Contains(VAddr addr, size_t size) const {
return addr >= base && (addr + size) <= (base + this->size);
}
bool IsFree() const noexcept {
return type == VMAType::Free;
}
bool IsMapped() const noexcept {
return type != VMAType::Free && type != VMAType::Reserved && type != VMAType::PoolReserved;
}
bool CanMergeWith(const VirtualMemoryArea& next) const {
if (disallow_merge || next.disallow_merge) {
return false;
}
if (base + size != next.base) {
return false;
}
if (type == VMAType::Direct && phys_base + size != next.phys_base) {
return false;
}
if (prot != next.prot || type != next.type) {
return false;
}
return true;
}
};
class MemoryManager {
using DMemMap = std::map<PAddr, DirectMemoryArea>;
using DMemHandle = DMemMap::iterator;
using VMAMap = std::map<VAddr, VirtualMemoryArea>;
using VMAHandle = VMAMap::iterator;
public:
explicit MemoryManager();
~MemoryManager();
void SetRasterizer(Vulkan::Rasterizer* rasterizer_) {
rasterizer = rasterizer_;
}
AddressSpace& GetAddressSpace() {
return impl;
}
u64 GetTotalDirectSize() const {
return total_direct_size;
}
u64 GetTotalFlexibleSize() const {
return total_flexible_size;
}
u64 GetAvailableFlexibleSize() const {
return total_flexible_size - flexible_usage;
}
VAddr SystemReservedVirtualBase() noexcept {
return impl.SystemReservedVirtualBase();
}
bool IsValidGpuMapping(VAddr virtual_addr, u64 size) {
// The PS4's GPU can only handle 40 bit addresses.
const VAddr max_gpu_address{0x10000000000};
return virtual_addr + size < max_gpu_address;
}
bool IsValidAddress(const void* addr) const noexcept {
const VAddr virtual_addr = reinterpret_cast<VAddr>(addr);
const auto end_it = std::prev(vma_map.end());
const VAddr end_addr = end_it->first + end_it->second.size;
return virtual_addr >= vma_map.begin()->first && virtual_addr < end_addr;
}
u64 ClampRangeSize(VAddr virtual_addr, u64 size);
bool TryWriteBacking(void* address, const void* data, u32 num_bytes);
void SetupMemoryRegions(u64 flexible_size, bool use_extended_mem1, bool use_extended_mem2);
PAddr PoolExpand(PAddr search_start, PAddr search_end, size_t size, u64 alignment);
PAddr Allocate(PAddr search_start, PAddr search_end, size_t size, u64 alignment,
int memory_type);
void Free(PAddr phys_addr, size_t size);
int PoolReserve(void** out_addr, VAddr virtual_addr, size_t size, MemoryMapFlags flags,
u64 alignment = 0);
int Reserve(void** out_addr, VAddr virtual_addr, size_t size, MemoryMapFlags flags,
u64 alignment = 0);
int PoolCommit(VAddr virtual_addr, size_t size, MemoryProt prot);
int MapMemory(void** out_addr, VAddr virtual_addr, size_t size, MemoryProt prot,
MemoryMapFlags flags, VMAType type, std::string_view name = "anon",
bool is_exec = false, PAddr phys_addr = -1, u64 alignment = 0);
int MapFile(void** out_addr, VAddr virtual_addr, size_t size, MemoryProt prot,
MemoryMapFlags flags, uintptr_t fd, size_t offset);
void PoolDecommit(VAddr virtual_addr, size_t size);
s32 UnmapMemory(VAddr virtual_addr, size_t size);
int QueryProtection(VAddr addr, void** start, void** end, u32* prot);
s32 Protect(VAddr addr, size_t size, MemoryProt prot);
s64 ProtectBytes(VAddr addr, VirtualMemoryArea vma_base, size_t size, MemoryProt prot);
int VirtualQuery(VAddr addr, int flags, ::Libraries::Kernel::OrbisVirtualQueryInfo* info);
int DirectMemoryQuery(PAddr addr, bool find_next,
::Libraries::Kernel::OrbisQueryInfo* out_info);
int DirectQueryAvailable(PAddr search_start, PAddr search_end, size_t alignment,
PAddr* phys_addr_out, size_t* size_out);
int GetDirectMemoryType(PAddr addr, int* directMemoryTypeOut, void** directMemoryStartOut,
void** directMemoryEndOut);
void NameVirtualRange(VAddr virtual_addr, size_t size, std::string_view name);
void InvalidateMemory(VAddr addr, u64 size) const;
private:
VMAHandle FindVMA(VAddr target) {
return std::prev(vma_map.upper_bound(target));
}
DMemHandle FindDmemArea(PAddr target) {
return std::prev(dmem_map.upper_bound(target));
}
template <typename Handle>
Handle MergeAdjacent(auto& handle_map, Handle iter) {
const auto next_vma = std::next(iter);
if (next_vma != handle_map.end() && iter->second.CanMergeWith(next_vma->second)) {
iter->second.size += next_vma->second.size;
handle_map.erase(next_vma);
}
if (iter != handle_map.begin()) {
auto prev_vma = std::prev(iter);
if (prev_vma->second.CanMergeWith(iter->second)) {
prev_vma->second.size += iter->second.size;
handle_map.erase(iter);
iter = prev_vma;
}
}
return iter;
}
VAddr SearchFree(VAddr virtual_addr, size_t size, u32 alignment = 0);
VMAHandle CarveVMA(VAddr virtual_addr, size_t size);
DMemHandle CarveDmemArea(PAddr addr, size_t size);
VMAHandle Split(VMAHandle vma_handle, size_t offset_in_vma);
DMemHandle Split(DMemHandle dmem_handle, size_t offset_in_area);
u64 UnmapBytesFromEntry(VAddr virtual_addr, VirtualMemoryArea vma_base, u64 size);
s32 UnmapMemoryImpl(VAddr virtual_addr, u64 size);
private:
AddressSpace impl;
DMemMap dmem_map;
VMAMap vma_map;
std::mutex mutex;
size_t total_direct_size{};
size_t total_flexible_size{};
size_t flexible_usage{};
Vulkan::Rasterizer* rasterizer{};
friend class ::Core::Devtools::Widget::MemoryMapViewer;
};
using Memory = Common::Singleton<MemoryManager>;
} // namespace Core