Commit graph

111 commits

Author SHA1 Message Date
squidbus
4d12de8149 hotfix: 64-bit shift fixups 2025-01-24 03:14:37 -08:00
baggins183
c13b29662e
handle control point strides that aren't a multiple of 16 (#2172) 2025-01-17 10:14:54 +02:00
squidbus
3b474a12f9
shader_recompiler: Improvements to buffer addressing implementation. (#2123) 2025-01-16 18:40:03 -08:00
squidbus
da2b58f66e
resource_tracking_pass: Persist image resource atomic designation. (#2158) 2025-01-16 12:36:41 +02:00
squidbus
82cb298c5c
shader_recompiler: Remove AMD native CubeFaceCoord. (#2129) 2025-01-11 13:57:49 -08:00
squidbus
e656093d85
shader_recompiler: Fix some image view type issues. (#2118) 2025-01-10 12:35:03 -08:00
squidbus
725814ce01
shader_recompiler: Improvements to array and cube handling. (#2083)
* shader_recompiler: Account for instruction array flag in image type.

* shader_recompiler: Check da flag for all mimg instructions.

* shader_recompiler: Convert cube images into 2D arrays.

* shader_recompiler: Move image resource functions into sharp type.

* shader_recompiler: Use native AMD cube instructions when possible.

* specialization: Fix buffer storage mistake.
2025-01-10 10:48:12 +02:00
squidbus
b0d7feb292
video_core: Implement conversion for uncommon/unsupported number formats. (#2047)
* video_core: Implement conversion for uncommon/unsupported number formats.

* shader_recompiler: Reinterpret image sample output as well.

* liverpool_to_vk: Remove mappings for remapped number formats.

These were poorly supported by drivers anyway.

* resource_tracking_pass: Fix image write swizzle mistake.

* amdgpu: Add missing specialization and move format mapping data to types

* reinterpret: Fix U/SToF input type.
2025-01-07 12:21:49 +02:00
TheTurtle
dcc662ff1a
ir_passes: Integrate DS barriers in block (#2020) 2025-01-02 22:52:10 +02:00
squidbus
41d64a200d
shader_recompiler: Add swizzle support for unsupported formats. (#1869)
* shader_recompiler: Add swizzle support for unsupported formats.

* renderer_vulkan: Rework MRT swizzles and add unsupported format swizzle support.

* shader_recompiler: Clean up swizzle handling and handle ImageRead storage swizzle.

* shader_recompiler: Fix type errors

* liverpool_to_vk: Remove redundant clear color swizzles.

* shader_recompiler: Reduce CompositeConstruct to constants where possible.

* shader_recompiler: Fix ImageRead/Write and StoreBufferFormatF32 types.

* amdgpu: Add a few more unsupported format remaps.
2024-12-31 06:14:47 +02:00
baggins183
62c47cb1b7
recompiler: handle reads of output variables in hull shaders (#1962)
* Handle output control point reads in hull shader. Might need additional barriers

* output storage class
2024-12-29 12:37:15 +02:00
squidbus
b1f74660df
shader_recompiler: Implement S_BCNT1_I32_B64 and S_FF1_I32_B64 (#1889)
* shader_recompiler: Implement S_BCNT1_I32_B64

* shader_recompiler: Implement S_FF1_I32_B64

* shader_recompiler: Implement IEqual for 64-bit.

* shader_recompiler: Fix immediate type in S_FF1_I32_B32
2024-12-27 16:46:07 +02:00
squidbus
a89c29c2ca
shader_recompiler: Rework image read/write emit. (#1819) 2024-12-25 01:13:32 +02:00
Daniel R.
c284cf72e1
Switch remaining CRLF terminated files to LF 2024-12-24 13:56:31 +01:00
georgemoralis
b0b74243af clang-fix 2024-12-19 10:25:03 +02:00
TheTurtle
188eebb92a
ir: Add heuristic based LDS barrier pass (#1801)
* ir: Add heuristic based LDS barrier pass

* Attempts to insert barriers after zero-depth divergent conditional blocks in shaders that use shared memory

* lds_barriers: Limit to nvidia

* Intel has historically had problems with cs barriers, will debug other time
2024-12-19 10:18:28 +02:00
baggins183
9aa1c13c7e
Fix some compiler problems with ds3 (#1793)
- Implement S_CMOVK_I32
- Handle Isoline abstract patch type
2024-12-15 16:30:19 +02:00
squidbus
f93677b953
resource_tracking_pass: Fix converting dimensions to float for normalization. (#1790) 2024-12-14 22:46:35 +02:00
baggins183
3c0c921ef5
Tessellation (#1528)
* shader_recompiler: Tessellation WIP

* fix compiler errors after merge

DONT MERGE set log file to /dev/null

DONT MERGE linux pthread bb fix

save work

DONT MERGE dump ir

save more work

fix mistake with ES shader

skip list

add input patch control points dynamic state

random stuff

* WIP Tessellation partial implementation. Squash commits

* test: make local/tcs use attr arrays

* attr arrays in TCS/TES

* dont define empty attr arrays

* switch to special opcodes for tess tcs/tes reads and tcs writes

* impl tcs/tes read attr insts

* rebase fix

* save some work

* save work probably broken and slow

* put Vertex LogicalStage after TCS and TES to fix bindings

* more refactors

* refactor pattern matching and optimize modulos (disabled)

* enable modulo opt

* copyright

* rebase fixes

* remove some prints

* remove some stuff

* Add TCS/TES support for shader patching and use LogicalStage

* refactor and handle wider DS instructions

* get rid of GetAttributes for special tess constants reads. Immediately replace some upon seeing readconstbuffer. Gets rid of some extra passes over IR

* stop relying on GNMX HsConstants struct. Change runtime_info.hs_info and some regs

* delete some more stuff

* update comments for current implementation

* some cleanup

* uint error

* more cleanup

* remove patch control points dynamic state (because runtime_info already depends on it)

* fix potential problem with determining passthrough

---------

Co-authored-by: IndecisiveTurtle <47210458+raphaelthegreat@users.noreply.github.com>
2024-12-14 12:56:17 +02:00
squidbus
8caca4df32
shader_recompiler: Support VK_AMD_shader_image_load_store_lod for IMAGE_STORE_MIP (#1770)
* shader_recompiler: Support VK_AMD_shader_image_load_store_lod for IMAGE_STORE_MIP

* emit_spirv: Fix missing extension declaration.
2024-12-14 12:03:42 +02:00
squidbus
f1c23d514b
shader_recompiler: Implement FREXP instructions. (#1766) 2024-12-13 21:51:39 +02:00
TheTurtle
722a0e36be
graphics: Improve handling of color buffer and storage image swizzles (#1763)
* liverpool_to_vk: Remove wrong component swap formats

* shader_recompiler: Handle storage and buffer format swizzles

* shader_recompiler: Skip unsupported depth export

* image_view: Remove image format swizzle

* Platform support is not always guaranteed
2024-12-13 21:49:37 +02:00
squidbus
028be3ba5d
shader_recompiler: Emulate unnormalized sampler coordinates in shader. (#1762)
* shader_recompiler: Emulate unnormalized sampler coordinates in shader.

* Address review comments.
2024-12-13 21:49:07 +02:00
TheTurtle
22a2741ea0
shader_recompilers: Improvements to SSA phi generation and lane instruction elimination (#1667)
* shader_recompiler: Add use tracking for Insts

* ssa_rewrite: Recursively remove phis

* ssa_rewrite: Correct recursive trivial phi elimination

* ir: Improve read lane folding pass

* control_flow: Avoid adding unnecessary divergent blocks

* clang format

* externals: Update ext-boost

---------

Co-authored-by: Frodo Baggins <baggins31084@proton.me>
2024-12-05 23:14:16 +02:00
squidbus
920acb8d8b
renderer_vulkan: Parse fetch shader per-pipeline (#1656)
* shader_recompiler: Read image format info directly from sharps instead of storing in shader info.

* renderer_vulkan: Parse fetch shader per-pipeline

* Few minor fixes.

* shader_recompiler: Specialize on vertex attribute number types.

* shader_recompiler: Move GetDrawOffsets to fetch shader
2024-12-04 13:03:47 +02:00
Jamie Tong
b0860d6e8c
implement DS_AND_B32, DS_OR_B32, DS_XOR_B32 (#1593)
* implement DS_OR_B32

* implement DS_AND_B32, DS_XOR_B32
2024-11-30 22:39:11 +02:00
baggins183
fde1726af5
recompiler: fix how srt pass handles step rate sharps in special case (#1587) 2024-11-24 11:49:59 +01:00
¥IGA
c83ac654ce
Bump to Clang 18 (#1549) 2024-11-21 12:08:22 +02:00
Daniel R.
17c47bcd96
shader_recompiler/frontend: Implement bitcmp instructions (#1550) 2024-11-19 21:38:32 +01:00
Lander Gallastegi
b64dcd2f56
Assert fix (#1521) 2024-11-12 09:26:48 +02:00
psucien
204bba9be8 hot-fix: pr merge conflict resolved 2024-11-05 22:59:45 +01:00
Lander Gallastegi
aa4c6c0178
shader_recompiler: patch fmask access instructions (#1439)
* Fix multisample texture fetch

* Patch some fmask reads

* clang-format

* Assert instead of ignore, coordinate fixes

* Patch ImageQueryDimensions
2024-11-05 22:39:57 +01:00
baggins183
9ec75c3feb
Implement shader resource tables (#1165)
* Implement shader resource tables

* fix after rebase + squash

* address some review comments

* fix pipeline_common

* cleanup debug stuff

* switch to using single codegenerator
2024-11-01 08:55:53 +02:00
TheTurtle
87f8fea4de
renderer_vulkan: Commonize and adjust buffer bindings (#1412)
* shader_recompiler: Implement finite cmp class

* shader_recompiler: Implement more opcodes

* renderer_vulkan: Commonize buffer binding

* liverpool: More dma data impl

* fix

* copy_shader: Handle additional instructions from Knack

* translator: Add V_CMPX_GE_I32
2024-10-19 15:30:58 +03:00
Herman Semenoff
96ea686eb6
Fixed return strict const iterator, replace to range-based loop C++17 and code refactor (#548)
Signed-off-by: Herman Semenov <GermanAizek@yandex.ru>
Co-authored-by: georgemoralis <giorgosmrls@gmail.com>
2024-10-18 11:06:11 +03:00
squidbus
0f91661660
resource_tracking_pass: Make sure immediate offset is accessed as correct type. (#1339) 2024-10-10 23:58:01 +03:00
squidbus
2f80d7565d
resource_tracking_pass: Fix type handling of sample offsets. (#1337) 2024-10-10 23:30:09 +03:00
squidbus
21eb175aa1
shader_recompiler: Add asserts for get/set register bounds. (#1336) 2024-10-10 23:14:50 +03:00
squidbus
0df0d0cb66
shader_recompiler: Fix last image sample address parameter. (#1334) 2024-10-10 22:51:11 +03:00
squidbus
d91ad6174e
shader_recompiler: Move sampling parameter resolution to tracking pass and support more derivative types. (#1290)
* shader_recompiler: Move sampling parameter resolution to tracking pass and support more derivative types.

* shader_recompiler: Only track sampler sharp on sample instructions.

* shader_recompiler: Fix Inst args size.
2024-10-10 19:27:34 +03:00
TheTurtle
100036aecf
spirv: Flush denormals if possible (#1302) 2024-10-10 17:47:39 +03:00
baggins183
3c0255b953
DebugPrintf in shaders (#1252)
* Add shader debug print opcode that uses NonSemantic DebugPrintf extension

* small correction for flags in Inst

* Fix IR Debug Print. Add StringLiteral op

* add missing microinstruction changes for debugprint

* cleanup. delete vaarg stuff. Smuggle format string in Info and flags

* more cleanup

* more

* (dont merge??) update sirit submodule

* fix num args 4 -> 5

* add notes about DebugPrint IR op

* use NumArgsOf again

* copyright

* update sirit submodule

* fix clangformat

* add new Value variant for string literal. Use arg0 for fmt string

* remove string pool changes

* Update src/shader_recompiler/ir/value.cpp

Co-authored-by: TheTurtle <47210458+raphaelthegreat@users.noreply.github.com>

---------

Co-authored-by: TheTurtle <47210458+raphaelthegreat@users.noreply.github.com>
2024-10-06 22:34:40 +03:00
psucien
927bb0c175
Initial support of Geometry shaders (#1244)
* video_core: initial GS support

* fix for components mapping; missing prim type
2024-10-06 01:26:50 +03:00
korenkonder
9f79764b01
Add various V_CVT opcodes (#1223) 2024-10-04 08:48:05 +02:00
TheTurtle
ee38eec7fe
shader_recompiler: Additional scope handling and user data as push constants (#1013)
* shader_recompiler: Use push constants for user data regs

* shader: Add some GR2 instructions

* shader: Add some instructions

* shader: Add instructions for knack

* touchups

* spirv: Better names

* buffer_cache: Ignore non gpu modified images

* clang format

* Add log

* more fixes
2024-09-23 08:55:43 +02:00
psucien
fb5bc371cb hot-fix: unnecessary optimization removed 2024-09-22 19:56:07 +02:00
IndecisiveTurtle
e1d03c35fd hotfix: Fix mipmap query for images 2024-09-22 19:17:54 +03:00
korenkonder
5db27109c9
Optimise out unnecessary shifts (#1021) 2024-09-22 15:02:20 +03:00
psucien
5f4ddc14fc
Image subresources barriers (#904)
* video_core: texture: image subresources state tracking

* shader_recompiler: use one binding if the same image is read and written

* video_core: added rebinding of changed textures after overlap resolve

* don't use pointers; slight `FindTexture` refactoring

* video_core: buffer_cache: don't copy over the image size

* redundant barriers removed; fixes

* regression fixes

* texture_cache: 3d texture layers count fixup

* shader_recompiler: support for partially bound cubemaps

* added support for cubemap arrays

* don't bind unused color buffers

* fixed depth promotion to not use stencil

* doors

* bonfire lit

* cubemap array index calculation

* final touches
2024-09-21 21:45:56 +02:00
squidbus
913a46173a
resource_tracking_pass: Allow derivatives for 2D array images. (#1000) 2024-09-21 14:19:01 +02:00