Commit graph

100 commits

Author SHA1 Message Date
squidbus
fc90f279e2
vulkan: Limit multisampling to supported sample counts. (#828) 2024-09-12 22:59:23 +03:00
psucien
8a76cd888f hot-fix: mark null image as tracked by default to avoid its updates 2024-09-11 22:39:21 +02:00
TheTurtle
b0bbb16aae
video_core: Add fallback path for pipelines with more than 32 bindings (#837)
* video_core: Small fixes

* renderer_vulkan: Add fallback path for pipelines with more than 32 bindings

* vk_resource_pool: Rewrite desc heap

* work
2024-09-10 20:54:39 +03:00
psucien
56cc70dc97 fix for image view storage flag handling 2024-09-09 00:09:38 +02:00
psucien
f1becb2507 hot-fix: linear cubemaps check assert removed (verified) 2024-09-08 14:18:48 +02:00
psucien
047a115b3e hot-fix: exclude tiling condition from promotion of textures to depth 2024-09-08 11:12:25 +02:00
TheTurtle
13743b27fc
shader_recompiler: Implement data share append and consume operations (#814)
* shader_recompiler: Add more format swap modes

* texture_cache: Handle stencil texture reads

* emulator: Support loading font library

* readme: Add thanks section

* shader_recompiler: Constant buffers as integers

* shader_recompiler: Typed buffers as integers

* shader_recompiler: Separate thread bit scalars

* We can assume guest shader never mixes them with normal sgprs. This helps avoid errors where ssa could view an sgpr write dominating a thread bit read, due to how control flow is structurized, even though its not possible in actual control flow

* shader_recompiler: Implement data append/consume operations

* clang format

* buffer_cache: Simplify invalidation scheme

* video_core: Remove some invalidation remnants

* adjust
2024-09-07 00:14:51 +03:00
Daniel R.
416e23fe76
Fix incompatible format images being passed on overlap resolve (#794) 2024-09-06 20:09:28 +03:00
TheTurtle
b08baaeb13
video_core: Improve handling of image buffer aliases (#757)
* texture_cache: Use invalidate threshhold

* It's possible for shaders to bind huge buffers and only write to lower portion of it. This is a problem if upper parts of the buffer overlap with render targets. If the image is very far away from buffer base it's unlikely the shader will want to write it, so skip invalidation for it

* video_core: Allow using texture cache to validate texture buffers

* texture_cache: Use buffer cache in all cases for data source

* Allows to correctly handle compute written micro tiled textures

* texture_cache: Fix depth pitch

* kernel: Remove missed code

* clang format

* video_core: Adjust depth format

* buffer_cache: Do not cache buffer views

* thread_management: Do not call createMutex on unlock

* temp: Revert this when pr is done

* buffer_cache: Dont skip cpu uploads with image sync

* Sometimes image does not fully overlap with a region

* fix build

* video_core: Improve invalidate heuristic

* small fixes

* video_core: Hopefully fix some vertex explosions
2024-09-05 17:25:45 +03:00
psucien
28feb77982
Surface management rework (3/3) (#370)
* texture_cache: images overlap support

* renderer_vk: log messages on surfaces which require degamma

* missing barriers

* forced sync2 + better barriers

* Handling of depth target aliasing; added formats compatibility check

* Don't bind empty texel buffers

* Promote r32f textures to depth target if shader expects so

* Promote textures to depth if they use depth tiling

* fix for image leaking; detiler stream buffer removed
2024-09-04 23:47:57 +03:00
psucien
34ffd95306
video_core: added VK_LAYER_LUNARG_crash_diagnostic (#751) 2024-09-03 21:56:23 +02:00
oltolm
e9ef726185
Fix warnings (#749)
* suppress warning in vk_mem_alloc.h

* fix warnings in cheats_patches.cpp
2024-09-03 21:41:59 +03:00
adjonesey
0f87d1e3d4 Remove from_compute check in texture cache invalidation (#665)
* Remove from_compute check in texture cache invalidation (hack)

* Remove from_compute parameter

---------

Co-authored-by: Adam Jones <a.c.jones@outlook.com>
2024-08-30 13:01:59 +03:00
TheTurtle
66e96dd944
video_core: Account of runtime state changes when compiling shaders (#575)
* video_core: Compile shader permutations

* spirv: Only specific storage image format for atomics

* ir: Avoid cube coord patching for storage image

* spirv: Fix default attributes

* data_share: Add more instructions

* video_core: Query storage flag with runtime state

* kernel: Use std::list for semaphore

* video_core: Use texture buffers for untyped format load/store

* buffer_cache: Limit view usage

* vk_pipeline_cache: Fix invalid iterator

* image_view: Reduce log spam when alpha=1 in storage swizzle

* video_core: More features and proper spirv feature detection

* video_core: Attempt no2 for specialization

* spirv: Remove conflict

* vk_shader_cache: Small cleanup
2024-08-29 19:29:54 +03:00
psucien
371d1d009a Added missing headers and 2D MSAA image type 2024-08-27 19:17:23 +02:00
DanielSvoboda
2a737d0800
V_NOP | PfpSyncMe | S_CMPK_EQ_U32 (#426)
* V_NOP

V_NOP = Do nothing

* PfpSyncMe

PfpSyncMe ensures that all previous commands are completed before continuing.
'break' should be enough for now

* S_CMPK_EQ_U32

S_CMPK_EQ_U32
SCC = (D.u == SIMM16)

* S_CMPK_EQ_U32

* OperandField::Undefined:

* Update translate.cpp

remove  OperandField::Undefined:

* Update image_view.cpp

[Render.Vulkan] <Error> image_view.cpp:ImageViewInfo:109: Storage image (num_comps = 4) requires swizzling [BGRA]
format 43 dst_sel 3886

* Update liverpool_to_vk.cpp

* S_CMPK_EQ_U32

* S_CMPK_EQ_U32
2024-08-25 22:07:46 +02:00
psucien
b687ae5e34
GnmDriver: Clear context support (#567)
* gnmdriver: added support for gpu context reset

* shader_recompiler: minor validation fixes

* shader_recompiler: added `V_CMPX_GT_I32`

* shader_recompiler: fix for crash on inline sampler access

* compilation warnings and dead code elimination

* amdgpu: fix for registers addressing

* libraries: videoout: reduce logging pressure

* shader_recompiler: fix for devergence scope detection
2024-08-25 23:01:05 +03:00
TheTurtle
c79b10edc1
video_core: Bloodborne stabilization pt1 (#543)
* shader_recompiler: Writelane elimination pass + null image fix

* spirv: Implement image derivatives

* texture_cache: Reduce page bit size

* clang format

* slot_vector: Back to debug assert

* vk_graphics_pipeline: Handle null tsharp

* spirv: Revert some change

* vk_instance: Support primitive restart on list topology

* page_manager: Adjust windows exception handler

* clang format

* Remove subres tracking

* Will be done separately
2024-08-24 22:51:47 +03:00
Vladislav Mikhalin
41dec15869 Fixed video dimensions alignment and image cache 2024-08-24 16:59:30 +03:00
Random
fc745ee767
Fix a few issues with the intel anv vulkan driver from mesa (#514)
* add fallback format for d16UnormS8Uint which is not supported by intel

* fix depth/stencil buffer creation issues causing asserts in intel driver
2024-08-24 14:50:46 +02:00
¥IGA
0c5b91e1fb
Warnings fixes (#541)
* Warnings fixes

* Warnings fixes
2024-08-23 22:38:55 +03:00
Vladislav Mikhalin
79680c50c0
Misc fixes (#517)
* Misc fixes

* Removed the skip for draw calls without RTs

* Remove Srgb image stores to rework later
2024-08-21 23:54:23 +03:00
TheTurtle
3f9c86ad33
vk_pipeline_cache: Avoid recompiling new shaders on each new PL (#480)
* cfg: Add one more divergence case

* Seen in RDR shaders

* renderer_vulkan: Reduce number of compiled shaders

* vk_pipeline_cache: Remove some unnecessary checks
2024-08-21 02:00:24 +03:00
Dzmitry Dubrova
6f4e1a47b9
core: misc changes (#430)
* core: misc changes

* video_core: add some formats for detiling

* clang format
2024-08-14 20:37:05 +02:00
psucien
27cb218584
video_core: CPU flip relay (#415)
* video_core: cpu flip is propagated via gpu thread now

* tentative fix for cpu flips racing

* libraries: videoout: better flip status handling
2024-08-14 11:36:11 +02:00
TheTurtle
1fb0da9b89
video_core: Crucial buffer cache fixes + proper GPU clears (#414)
* translator: Use templates for stronger type guarantees

* spirv: Define buffer offsets upfront

* Saves a lot of shader instructions

* buffer_cache: Use dynamic vertex input when available

* Fixes issues when games like dark souls rebind vertex buffers with different stride

* externals: Update boost

* spirv: Use runtime array for ssbos

* ssbos can be large and typically their size will vary, especially in generic copy/clear cs shaders

* fs: Lock when doing case insensitive search

* Dark Souls does fs lookups from different threads

* texture_cache: More precise invalidation from compute

* Fixes unrelated render targets being cleared

* texture_cache: Use hashes for protect gpu modified images from reupload

* translator: Treat V_CNDMASK as float

* Sometimes it can have input modifiers. Worst this will cause is some extra calls to uintBitsToFloat and opposite. But most often this is used as float anyway

* translator: Small optimization for V_SAD_U32

* Fix review

* clang format
2024-08-13 09:21:48 +03:00
psucien
3d0fdf11f0
Build stabilization (#413)
* shader_recompiler: fix for float convert and debug asserts

* libraries: kernel: correct return code on invalid semaphore

* amdgpu: additional case for cb extents retrieval heuristic

* removed redundant check in assert

* amdgpu: fix for linear tiling mode detection fin color buffers

* texture_cache: fix for unexpected scheduler flushes by detiler

* renderer_vulkan: missing depth barrier

* texture_cache: missed slices in rt view; + detiler format
2024-08-12 17:23:01 +03:00
psucien
ace39957ef
Video Core: debug tools (#412)
* video_core: better use of rdoc markers

* renderer_vulkan: added gpu assisted validation

* renderer_vulkan: make nv_checkpoints operational

* video_core: unified Vulkan objects names
2024-08-12 13:46:45 +02:00
TheTurtle
381ba8c7a5
video_core: Implement guest buffer manager (#373)
* video_core: Introduce buffer cache

* video_core: Use multi level page table for caches

* renderer_vulkan: Remove unused stream buffer

* fix build

* oops forgot optimize off
2024-08-08 15:02:10 +03:00
TheTurtle
a7c9bfa5c5
shader_recompiler: Small instruction parsing refactor/bugfixes (#340)
* translator: Implemtn f32 to f16 convert

* shader_recompiler: Add bit instructions

* shader_recompiler: More data share instructions

* shader_recompiler: Remove exec contexts, fix S_MOV_B64

* shader_recompiler: Split instruction parsing into categories

* shader_recompiler: Better BFS search

* shader_recompiler: Constant propagation pass for cmp_class_f32

* shader_recompiler: Partial readfirstlane implementation

* shader_recompiler: Stub readlane/writelane only for non-compute

* hack: Fix swizzle on RDR

* Will properly fix this when merging this

* clang format

* address_space: Bump user area size to full

* shader_recompiler: V_INTERP_MOV_F32

* Should work the same as spirv will emit flat decoration on demand

* kernel: Add MAP_OP_MAP_FLEXIBLE

* image_view: Attempt to apply storage swizzle on format

* vk_scheduler: Barrier attachments on renderpass end

* clang format

* liverpool: cs state backup

* shader_recompiler: More instructions and formats

* vector_alu: Proper V_MBCNT_U32_B32

* shader_recompiler: Port some dark souls things

* file_system: Implement sceKernelRename

* more formats

* clang format

* resource_tracking_pass: Back to assert

* translate: Tracedata

* kernel: Remove tracy lock

* Solves random crashes in Dark Souls

* code: Review comments
2024-07-30 23:32:40 +02:00
Vasyl Baran
3e6af54ea3 Fixup for detiler artifacts on macOS 2024-07-28 22:21:18 +03:00
psucien
30198d5ffc
Surface management rework (2/3) (#329)
* texture_cache: interface refactoring

* a bit of fixes and improvements

* texture_cache: macro tile extents for bpp 128

* texture_cache: detiler: prefer host memory for large buffers upload
2024-07-28 17:20:42 +02:00
TheTurtle
0d6edaa0a0
Move presentation to separate thread/improve sync (#303)
* video_out: Move presentation to separate thread

* liverpool: Better sync for CPU flips

* driver: Make flip blocking

* videoout: Proper flip rate and vblank management

* config: Add vblank divider option

* clang format

* videoout: added `sceVideoOutWaitVblank`

* clang format

* vk_scheduler: Silly merge conflict

* externals: Add renderdoc API

* clang format

* reuse

* rdoc: manual capture trigger

* clang fmt

---------

Co-authored-by: psucien <168137814+psucien@users.noreply.github.com>
2024-07-28 15:54:09 +02:00
squidbus
175ffe8ce3 Add fallback system for unsupported pixel formats. 2024-07-21 22:36:12 +03:00
squidbus
66fa29059c Add initial macOS support. 2024-07-21 22:36:12 +03:00
psucien
64459f1a76
Surface management rework (1/3) (#307)
* amdgpu: proper CB and DB sizes calculation; minor refactoring

* texture_cache: separate file for image_info

* texture_cache: image guest address moved into image info

* texture_cache: surface size calculation

* shader_recompiler: fixed sin/cos

Thanks to red_pring and gandalfthewhite0173

* initial preparations for subresources upload

* review comments
2024-07-20 12:51:21 +03:00
Vladislav Mikhalin
989f88837d
Filesystem errors and Base Array Layers (#280)
* Filesystem errors and Base Array Layers

* Fixed build for POSIX

* forgot 1 file
2024-07-11 14:37:21 +03:00
psucien
6dbb842bec renderer: a bit more formats to support 2024-07-07 14:34:36 +02:00
TheTurtle
6ceab6dfac
shader_recompiler: Implement most integer image atomics, workgroup barriers and shared memory load/store (#231)
* shader_recompiler: Add LDEXP

* shader_recompiler: Add most image integer atomic ops

* shader_recompiler: Implement shared memory load/store

* shader_recompiler: More image atomics

* externals: Update sirit

* clang format

* cmake: Add missing files

* shader_recompiler: Fix some atomic bugs

* shader_recompiler: Vs outputs

* shader_recompiler: Shared mem has side-effects, fix format component order

* shader_recompiler: Inline constant buffer impl

* video_core: Fix regressions

* Work

* Fixup a few things
2024-07-05 00:15:44 +03:00
IndecisiveTurtle
fe5bfa9d61 texture_cache: Always validate for now 2024-07-01 22:53:01 +03:00
IndecisiveTurtle
20e83b4d53 clang format 2024-07-01 13:56:14 +03:00
IndecisiveTurtle
b4d24d8737 renderer_vulkan: Prefer depth stencil read-only layout when possible
* Persona reads a depth attachment while it is being attached with writes disabled. Now this works without spamming vk validation errors
2024-07-01 13:56:14 +03:00
IndecisiveTurtle
22b930ba5e video_core: Track renderpass scopes properly 2024-07-01 13:56:14 +03:00
IndecisiveTurtle
ad10020836 video_core: Fix a few problems 2024-07-01 13:56:14 +03:00
IndecisiveTurtle
10ef357f1f image: Fix image type of 1D Array 2024-07-01 13:56:14 +03:00
psucien
f03262421e texture_cache: force storage usage bit to all images 2024-07-01 09:58:52 +02:00
psucien
14377b39b5 texture_cache: detiler: added missing micro8x2 2024-06-30 15:54:59 +02:00
psucien
2cbbcbd371
Metadata support (#223)
* texture_cache: more image usage flags

* texture_cache: metadata registration

* renderer_vulkan: initial CMask support

* renderer_vulkan: skip redundant FCE and FMask decompression passes

* renderer_vulkan: redundant VO surface registration removed

* renderer_vulkan: initial HTile support

* renderer_vulkan: added support for MSAA attachments

* renderer_vulkan: skip unnecessary metadata updates
2024-06-29 16:49:59 +03:00
psucien
8475a62a46 common: Common namespace for the slot vector container 2024-06-25 09:31:32 +02:00
psucien
cb6b21de1f
Initial instancing and asynchronous compute queues (#207)
* gnm_driver: added `sceGnmRegisterOwner` and `sceGnmRegisterResource`

* video_out: `sceVideoOutGetDeviceCapabilityInfo` for sdk runtime

* gnm_driver: correct vqid index range

* amdgpu: indirect buffer, release mem and some additional irq modes

* amdgpu: added ASC commands processor

* shader_recompiler: added support for fetch instance id

* amdgpu: classic bitfields for T# representation (debugging experience)

* renderer_vulkan: skip zero sized VBs from binding

* texture_cache: image upload logic moved into `Image` object

* gnm_driver: `sceGnmDingDong` implementation

* texture_cache: `Image` usage flags moved; correct VO buffer pitch
2024-06-22 19:50:20 +03:00