Commit graph

553 commits

Author SHA1 Message Date
psucien
345d55669e texture_cache: 8bpp macro detiler 2025-01-02 23:27:18 +01:00
TheTurtle
77d2172441
renderer_vulkan: Cleanup and improve barriers in caches (#1865)
* texture_cache: Stricter barriers on image upload

* buffer_cache: Stricter barrier for vkCmdUpdateBuffer

* vk_rasterizer: Barrier also normal buffers and make it apply to all stages

* texture_cache: Minor barrier cleanup

* Batch image and buffer barriers in a single command

* clang format
2025-01-02 19:43:56 +01:00
psucien
f7a8e2409c hot-fix: debug build 2025-01-02 19:41:15 +01:00
liberodark
596f4cdf0e
Fix amdgpu & other issues (#2000) 2025-01-02 15:39:39 +02:00
TheTurtle
c25447097e
buffer_cache: Improve buffer cache locking contention (#1973)
* Improve buffer cache locking contention

* buffer_cache: Revert some changes

* clang fmt 1

* clang fmt 2

* clang fmt 3

* buffer_cache: Fix build
2025-01-02 15:39:02 +02:00
hspir404
6862c9aad7
Speed up LiverpoolToVK::SurfaceFormat (#1982)
* Speed up LiverpoolToVK::SurfaceFormat

In Bloodborne this shows up as the function with the very highest cumulative "exclusive time". This is true both in scenes that perform poorly, and scenes that perform well.

I took (approximately) 10 s samples using an 8 kHz sampling profiler.

In the Nightmare Grand Cathedral (looking towards the stairs, at the rest of the level):
- Reduced total time from 757.34ms to 82.61ms (out of ~10000ms).
- Reduced average frame times by 2ms (though according to the graph, the gap may be as big as 9ms every N frames).

In the Hunter's Dream (in the spawn position):
- Reduced the total time from 486.50ms to 53.83ms (out of ~10000ms).
- Average frame times appear to be roughly the same.

These are profiles of the change vs the version currently in the main branch. These improvements also improve things in the `threading` branch. They might improve them even more in that branch, but I didn't bother keeping track of my measurements as well in that branch. I believe this change will still be useful even when that branch is stabilized and merged.

It could be there are other bottlenecks in rendering on this branch that are preventing this code from being the critical path in places like the Hunter's Dream, where performance isn't currently as constrained. That might explain why the reduction in call times isn't resulting in a higher frame rate.

* Implement SurfaceFormat with derived lookup table instead of switch

* Clang format fixes
2025-01-02 15:38:51 +02:00
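
The "derived lookup table instead of switch" idea from the SurfaceFormat commit above can be sketched roughly as follows. This is a minimal illustration, not the code from that commit: the enums `DataFormat`, `NumberFormat`, and `HostFormat`, the `kMappings` list, and the function `SurfaceFormatLookup` are all hypothetical stand-ins, and the sketch assumes C++17. A sparse, readable list of supported format pairs is expanded once, at compile time, into a dense array, so each lookup becomes a single indexed load.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins; the real guest and host format enums are far larger.
enum class DataFormat : std::uint8_t { Invalid, Format8, Format16, Format8_8, Count };
enum class NumberFormat : std::uint8_t { Unorm, Snorm, Uint, Count };
enum class HostFormat : std::uint8_t {
    Undefined, // value 0, so a zero-initialized table entry means "unsupported"
    R8Unorm, R8Snorm, R8Uint, R16Unorm, R16Uint, R8G8Unorm,
};

struct Mapping {
    DataFormat data;
    NumberFormat num;
    HostFormat host;
};

// Sparse, human-readable list of supported combinations (illustrative only).
constexpr Mapping kMappings[] = {
    {DataFormat::Format8, NumberFormat::Unorm, HostFormat::R8Unorm},
    {DataFormat::Format8, NumberFormat::Snorm, HostFormat::R8Snorm},
    {DataFormat::Format8, NumberFormat::Uint, HostFormat::R8Uint},
    {DataFormat::Format16, NumberFormat::Unorm, HostFormat::R16Unorm},
    {DataFormat::Format16, NumberFormat::Uint, HostFormat::R16Uint},
    {DataFormat::Format8_8, NumberFormat::Unorm, HostFormat::R8G8Unorm},
};

constexpr std::size_t kNumData = static_cast<std::size_t>(DataFormat::Count);
constexpr std::size_t kNumNumber = static_cast<std::size_t>(NumberFormat::Count);
constexpr std::size_t kTableSize = kNumData * kNumNumber;

// Flatten the (data format, number format) pair into one array index.
constexpr std::size_t Index(DataFormat d, NumberFormat n) {
    return static_cast<std::size_t>(d) * kNumNumber + static_cast<std::size_t>(n);
}

// Derive the dense table from the sparse list; unlisted pairs stay Undefined.
constexpr std::array<HostFormat, kTableSize> BuildTable() {
    std::array<HostFormat, kTableSize> table{};
    for (const Mapping& m : kMappings) {
        table[Index(m.data, m.num)] = m.host;
    }
    return table;
}

constexpr auto kTable = BuildTable();

// One bounds check and one indexed load replace the switch over every pair.
constexpr HostFormat SurfaceFormatLookup(DataFormat d, NumberFormat n) {
    const std::size_t i = Index(d, n);
    return i < kTableSize ? kTable[i] : HostFormat::Undefined;
}
```

Because the table is derived from the same list a switch would enumerate, adding a format remains a one-line change in this sketch, while the per-call cost no longer depends on how many formats exist.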
Mahmoud Adel
099e685bff
add R16Uint to Format Detiler (#1995)
helps with Matterfall
2025-01-02 14:29:57 +02:00
polybiusproxy
a76e8f0211
clang-format 2025-01-01 13:21:00 +01:00
psucien
d69341fd31 hot-fix: detiler: forgotten lut optimizations 2025-01-01 03:40:28 +01:00
squidbus
927dc6d95c
vk_platform: Fix incorrect type for MVK debug flag. (#1993) 2024-12-31 12:38:30 +02:00
squidbus
41d64a200d
shader_recompiler: Add swizzle support for unsupported formats. (#1869)
* shader_recompiler: Add swizzle support for unsupported formats.

* renderer_vulkan: Rework MRT swizzles and add unsupported format swizzle support.

* shader_recompiler: Clean up swizzle handling and handle ImageRead storage swizzle.

* shader_recompiler: Fix type errors

* liverpool_to_vk: Remove redundant clear color swizzles.

* shader_recompiler: Reduce CompositeConstruct to constants where possible.

* shader_recompiler: Fix ImageRead/Write and StoreBufferFormatF32 types.

* amdgpu: Add a few more unsupported format remaps.
2024-12-31 06:14:47 +02:00
squidbus
38f1cc2652
renderer_vulkan: Render polygons using triangle fans. (#1969) 2024-12-29 12:30:37 +01:00
Quang Ngô
1bc27135e3
renderer_vulkan: fix deadlock when resizing the SDL window (#1860)
* renderer_vulkan: Fix deadlock when resizing the SDL window

* Address review comment
2024-12-29 13:22:35 +02:00
TheTurtle
f09a95453e
hot-fix: Correct queue id in dispatch indirect
I missed this
2024-12-29 12:48:45 +02:00
Mahmoud Adel
e952013fe0
add EventWrite and DispatchIndirect to ProcessCompute (#1948)
* add EventWrite and DispatchIndirect to ProcessCompute

helps Alienation go Ingame

* apply review changes

Co-authored-by: TheTurtle <47210458+raphaelthegreat@users.noreply.github.com>

---------

Co-authored-by: TheTurtle <47210458+raphaelthegreat@users.noreply.github.com>
2024-12-29 12:47:15 +02:00
Quang Ngô
202c1046a1
Fix loading RenderDoc in offline mode for Linux (#1968) 2024-12-29 12:36:29 +02:00
Quang Ngô
99e1e028c0
texture_cache: Don't read max aniso value if not aniso filter (#1942)
Fix Sonic Forces.
2024-12-28 13:18:56 +02:00
Quang Ngô
0351b864d0
texture_cache: Enable anisotropic filtering (#1872) 2024-12-27 16:47:26 +02:00
squidbus
a86ee7e7f5
vk_platform: Enable MoltenVK debug if crash diagnostics is enabled. (#1887)
* vk_platform: Enable MoltenVK debug if crash diagnostics is enabled.

* build: Make sure MoltenVK gets re-bundled when changed.
2024-12-27 16:46:31 +02:00
¥IGA
cf84c46a49
Fix for D32Sfloat and R8Snorm Tiled image (#1898)
* Fix for D32Sfloat Tiled image

* Fix for R8Snorm Tiled image
2024-12-27 16:43:44 +02:00
Vinicius Rangel
edc027a8bc
Devtools IV (#1910)
* devtools: fix popen in non-windows environment

* devtools: fix frame crash assertion when hidden

* devtools: add search to shader list

* devtools: add copy name to shader list

* devtools: frame dump: search by shader name
2024-12-26 23:08:47 +02:00
¥IGA
3ab118837a
Fix for D16Unorm Tiled image (#1863) 2024-12-25 16:06:12 +02:00
squidbus
3c111202e1
renderer_vulkan: Make sure at least one viewport is set (#1859) 2024-12-25 16:05:51 +02:00
squidbus
a89c29c2ca
shader_recompiler: Rework image read/write emit. (#1819) 2024-12-25 01:13:32 +02:00
squidbus
6d728ec7ed
renderer_vulkan: Enable LDS barriers for MoltenVK (#1866) 2024-12-24 23:03:04 +02:00
Daniel R.
c284cf72e1
Switch remaining CRLF terminated files to LF 2024-12-24 13:56:31 +01:00
squidbus
0a4453b912
renderer_vulkan: Simplify depth pipeline state and move stencil to dynamic state. (#1854)
* renderer_vulkan: Simplify depth pipeline state and move stencil to dynamic state.

* Change graphics key depth-stencil flags to bitfields.
2024-12-24 13:45:11 +02:00
TheTurtle
092d42e981
renderer_vulkan: Implement rectlist emulation with tessellation (#1857)
* renderer_vulkan: Implement rectlist emulation with tessellation

* clang format

* renderer_vulkan: Use tessellation for quad primitive as well

* vk_rasterizer: Handle viewport enable flags

* review

* shader_recompiler: Fix quad/rect list FS passthrough semantics.

* spirv: Bump to 1.5

* remove pragma

---------

Co-authored-by: squidbus <175574877+squidbus@users.noreply.github.com>
2024-12-24 13:28:47 +02:00
psucien
c2e9c877dd hot-fix: missing fce barrier 2024-12-23 18:20:37 +01:00
Quang Ngô
400da1aa8d
Handle swapchain recreation (#1830) 2024-12-23 16:21:48 +02:00
Emulator-Team-2
94f861588d
added B5G6R5UnormPack16 format (#1856) 2024-12-23 15:52:29 +02:00
psucien
2dc5755799 build: exclude Tracy from release builds 2024-12-22 22:51:48 +01:00
psucien
8abc43a03d
texture_cache: 32bpp and 64bpp macro detilers (#1852)
* added 32bpp macro detiler

* added 64bpp macro detiler

* consider 3d depth alignment in size calculations
2024-12-22 19:43:44 +01:00
Vladislav Mikhalin
7fe4df85ab
Clear color attachment if FCE was invoked before any draws (#1851)
* Clear RT if FCE was invoked before any draws

Co-authored-by: psucien <bad_cast@protonmail.com>

* address review comments

---------

Co-authored-by: psucien <bad_cast@protonmail.com>
2024-12-22 18:12:43 +01:00
setepenre
8a409d86d4
post-processing: rework gamma correction (#1756) 2024-12-22 16:18:07 +01:00
squidbus
14dc136832
renderer_vulkan: Various attachment cleanup and fixes. (#1795) 2024-12-22 16:08:48 +02:00
TheTurtle
5eebb04de9
vk_rasterizer: hot fix 2024-12-22 15:31:10 +02:00
TheTurtle
fb2c035c05
vk_rasterizer: Fix stencil clears (#1840) 2024-12-22 02:49:42 +02:00
Daniel R.
8d8bb05055
renderer_vulkan: add support for Polygon draws (#1798) 2024-12-21 10:20:24 +01:00
TheTurtle
188eebb92a
ir: Add heuristic based LDS barrier pass (#1801)
* ir: Add heuristic based LDS barrier pass

* Attempts to insert barriers after zero-depth divergent conditional blocks in shaders that use shared memory

* lds_barriers: Limit to nvidia

* Intel has historically had problems with cs barriers, will debug other time
2024-12-19 10:18:28 +02:00
Mahmoud Adel
1e08099036
add R8Uint in image Detiling (#1812)
used by InFamous, and maybe other games
2024-12-18 22:06:30 +02:00
squidbus
ccfb1bbfa8
vk_instance: Add additional fallback for missing D16UnormS8Uint. (#1810) 2024-12-18 07:56:08 +02:00
squidbus
87773a417b
mac: Choose whether system Vulkan is needed at runtime. (#1780) 2024-12-17 15:04:19 +02:00
psucien
e7c4ffe032 hot-fix: Tracy operation restored; memory leak fix as a bonus 2024-12-15 20:53:29 +01:00
psucien
0fd1ab674b
GPU processor refactoring (#1787)
* coroutine code prettification

* asc queues submission refactoring

* better asc ring context handling

* final touches and review notes

* even more simplification for context saving
2024-12-15 00:54:46 +02:00
squidbus
8b88344679
vk_instance: Remove unused dynamic state 2 features struct (#1791) 2024-12-14 22:46:19 +02:00
TheTurtle
e9ede8d627
Revert "DmaData and Recompiler fixes (#1775)" (#1784)
This reverts commit cafd40f2c2.
2024-12-14 16:17:14 +02:00
squidbus
e752f04cde
shader_recompiler: Fixups from stencil changes (#1776) 2024-12-14 14:33:24 +02:00
Vladislav Mikhalin
cafd40f2c2
DmaData and Recompiler fixes (#1775)
* liverpool: fix dmadata packet handling

* recompiler: emit a label right after s_branch to prevent dead code interference

* specialize barriers
2024-12-14 14:33:06 +02:00
baggins183
3c0c921ef5
Tessellation (#1528)
* shader_recompiler: Tessellation WIP

* fix compiler errors after merge

DONT MERGE set log file to /dev/null

DONT MERGE linux pthread bb fix

save work

DONT MERGE dump ir

save more work

fix mistake with ES shader

skip list

add input patch control points dynamic state

random stuff

* WIP Tessellation partial implementation. Squash commits

* test: make local/tcs use attr arrays

* attr arrays in TCS/TES

* dont define empty attr arrays

* switch to special opcodes for tess tcs/tes reads and tcs writes

* impl tcs/tes read attr insts

* rebase fix

* save some work

* save work probably broken and slow

* put Vertex LogicalStage after TCS and TES to fix bindings

* more refactors

* refactor pattern matching and optimize modulos (disabled)

* enable modulo opt

* copyright

* rebase fixes

* remove some prints

* remove some stuff

* Add TCS/TES support for shader patching and use LogicalStage

* refactor and handle wider DS instructions

* get rid of GetAttributes for special tess constants reads. Immediately replace some upon seeing readconstbuffer. Gets rid of some extra passes over IR

* stop relying on GNMX HsConstants struct. Change runtime_info.hs_info and some regs

* delete some more stuff

* update comments for current implementation

* some cleanup

* uint error

* more cleanup

* remove patch control points dynamic state (because runtime_info already depends on it)

* fix potential problem with determining passthrough

---------

Co-authored-by: IndecisiveTurtle <47210458+raphaelthegreat@users.noreply.github.com>
2024-12-14 12:56:17 +02:00