Commit graph

72 commits

Author SHA1 Message Date
squidbus
17abbcd74d
misc: Fix clang format (#1673) 2024-12-06 02:21:35 +02:00
IndecisiveTurtle
77da8bac00 core: Return proper address of eh frame/add more opcodes 2024-12-06 00:47:11 +02:00
TheTurtle
22a2741ea0
shader_recompilers: Improvements to SSA phi generation and lane instruction elimination (#1667)
* shader_recompiler: Add use tracking for Insts

* ssa_rewrite: Recursively remove phis

* ssa_rewrite: Correct recursive trivial phi elimination

* ir: Improve read lane folding pass

* control_flow: Avoid adding unnecessary divergant blocks

* clang format

* externals: Update ext-boost

---------

Co-authored-by: Frodo Baggins <baggins31084@proton.me>
2024-12-05 23:14:16 +02:00
Marcin Mikołajczyk
642dedea8c
Handle INDIRECT_BUFFER_CONST in ProcessCeUpdate (#1613) 2024-12-05 23:09:59 +02:00
Daniel R.
98f0cb65d7
The way to Unity, pt.1 (#1659) 2024-12-05 17:21:35 +01:00
psucien
16e1d679dc
video_core: clean-up of indirect draws logic (#1589) 2024-11-24 15:43:28 +01:00
TheTurtle
c4506da0ae
kernel: Rewrite pthread emulation (#1440)
* libkernel: Cleanup some function places

* kernel: Refactor thread functions

* kernel: It builds

* kernel: Fix a bunch of bugs, kernel thread heap

* kernel: File cleanup pt1

* File cleanup pt2

* File cleanup pt3

* File cleanup pt4

* kernel: Add missing funcs

* kernel: Add basic exceptions for linux

* gnmdriver: Add workload functions

* kernel: Fix new pthreads code on macOS. (#1441)

* kernel: Downgrade edeadlk to log

* gnmdriver: Add sceGnmSubmitCommandBuffersForWorkload

* exception: Add context register population for macOS. (#1444)

* kernel: Pthread rewrite touchups for Windows

* kernel: Multiplatform thread implementation

* mutex: Remove spamming log

* pthread_spec: Make assert into a log

* pthread_spec: Zero initialize array

* Attempt to fix non-Windows builds

* hotfix: change incorrect NID for scePthreadAttrSetaffinity

* scePthreadAttrSetaffinity implementation

* Attempt to fix Linux

* windows: Address a bunch of address space problems

* address_space: Fix unmap of region surrounded by placeholders

* libs: Reduce logging

* pthread: Implement condvar with waitable atomics and sleepqueue

* sleepq: Separate and make faster

* time: Remove delay execution

* Causes high cpu usage in Tohou Luna Nights

* kernel: Cleanup files again

* pthread: Add missing include

* semaphore: Use binary_semaphore instead of condvar

* Seems more reliable

* libraries/sysmodule: log module on `sceSysmoduleIsLoaded`

* libraries/kernel: implement `scePthreadSetPrio`

---------

Co-authored-by: squidbus <175574877+squidbus@users.noreply.github.com>
Co-authored-by: Daniel R. <47796739+polybiusproxy@users.noreply.github.com>
2024-11-21 22:59:38 +02:00
psucien
8fbd9187f8
libraries: gnmdriver: few more functions implemented (#1544) 2024-11-18 11:23:21 +02:00
TheTurtle
87f8fea4de
renderer_vulkan: Commize and adjust buffer bindings (#1412)
* shader_recompiler: Implement finite cmp class

* shader_recompiler: Implement more opcodes

* renderer_vulkan: Commonize buffer binding

* liverpool: More dma data impl

* fix

* copy_shader: Handle additional instructions from Knack

* translator: Add V_CMPX_GE_I32
2024-10-19 15:30:58 +03:00
Vinicius Rangel
25de4d6b65
Devtools improvements I (#1392)
* devtools: fix showing entire depth instead of bits

* devtools: show button for stage instead of menu bar

- fix batch view dockspace not rendering when window collapsed

* devtools: removed useless "Batch" collapse & don't collapse last batch

* devtools: refactor DrawRow to templating

* devtools: reg popup size adjusted to the content

* devtools: better window names

* devtools: regview layout compacted

* devtools: option to show collapsed frame dump

keep most popups open when selection changes
best popup windows positioning

* devtools: show compute shader regs

* devtools: tips popup
2024-10-16 13:12:46 +03:00
Vinicius Rangel
cf2e617f08
Devtools - Inspect regs/User data/Shader disassembly (#1358)
* devtools: pm4 - show markers

* SaveDataDialogLib: fix compile with mingw

* devtools: pm4 - show program state

* devtools: pm4 - show program disassembly

* devtools: pm4 - show frame regs

* devtools: pm4 - show color buffer info as popup

add ux improvements for open new windows with shift+click
better window titles

* imgui: skip all textures to avoid hanging with crash diagnostic enabled

not sure why this happens :c

* devtools: pm4 - show reg depth buffer
2024-10-13 15:02:22 +03:00
korenkonder
6e986f8133
video_core: Implement sceGnmInsertPushColorMarker (#989) 2024-10-10 18:03:12 +03:00
Daniel R.
80bf46da4c
core/memory: Pooled memory implementation (#1085) 2024-09-29 10:28:41 +03:00
baggins183
bc66fe8fb5
Fix copyGpuBuffers when resize invalidates commands in flight (#876)
* Fix copyGpuBuffers when resize invalidates commands in flight

* Use _MB macro for size constant
2024-09-12 21:54:54 +02:00
psucien
adfb3af95f hot-fix: nullGpu functionality restored 2024-09-09 08:59:47 +02:00
TheTurtle
13743b27fc
shader_recompiler: Implement data share append and consume operations (#814)
* shader_recompiler: Add more format swap modes

* texture_cache: Handle stencil texture reads

* emulator: Support loading font library

* readme: Add thanks section

* shader_recompiler: Constant buffers as integers

* shader_recompiler: Typed buffers as integers

* shader_recompiler: Separate thread bit scalars

* We can assume guest shader never mixes them with normal sgprs. This helps avoid errors where ssa could view an sgpr write dominating a thread bit read, due to how control flow is structurized, even though its not possible in actual control flow

* shader_recompiler: Implement data append/consume operations

* clang format

* buffer_cache: Simplify invalidation scheme

* video_core: Remove some invalidation remnants

* adjust
2024-09-07 00:14:51 +03:00
psucien
34ffd95306
video_core: added VK_LAYER_LUNARG_crash_diagnostic (#751) 2024-09-03 21:56:23 +02:00
baggins183
3f8a8d3a24
video_core: Add bounds checking for subspan use in liverpool functions (#717) 2024-09-03 13:58:45 +03:00
psucien
ca1613258f
video_core: added support for indirect draws (#678)
* video_core: added support for indirect draws

* barriers simplified
2024-08-30 22:59:56 +02:00
psucien
9d349a1308 video_core: added support for indirect dispatches (gfx only) 2024-08-29 12:32:37 +02:00
georgemoralis
be49871c68
Merge pull request #618 from vertver/main
video_core: Added copyGPUCmdBuffers option
2024-08-28 14:00:26 +03:00
Anton Kovalev
dfb30ea955 Use pair of spans instead of references in copy command buffers function 2024-08-28 11:24:15 +02:00
Random
c37679154e
Handle PM4 type-2 packets (#556)
* video_core: handle PM4 type-2 packets

* video_core: rewrite pm4 comand type handling into a switch statement
2024-08-28 09:53:27 +02:00
Anton Kovalev
87ccfdfbbd Fixed type on function 2024-08-28 09:42:31 +02:00
Anton Kovalev
1a02efbd15 clang-format style fix 2024-08-28 05:42:48 +02:00
Anton Kovalev
3842993a43 Use input dcb and ccb instead of copy 2024-08-28 00:21:12 +02:00
Anton Kovalev
3d46a5d492 Do not shrink buffer's size on submit 2024-08-27 23:33:24 +02:00
Anton Kovalev
595b845df0 clang-format fix 2024-08-27 23:31:04 +02:00
Anton Kovalev
659e7a4675 video_core: Added copyGPUCmdBuffers option 2024-08-27 23:16:14 +02:00
DanielSvoboda
2a737d0800
V_NOP | PfpSyncMe | S_CMPK_EQ_U32 (#426)
* V_NOP

V_NOP = Do nothing

* PfpSyncMe

PfpSyncMe ensures that all previous commands are completed before continuing.
'break' should be enough for now

* S_CMPK_EQ_U32

S_CMPK_EQ_U32
SCC = (D.u == SIMM16)

* S_CMPK_EQ_U32

* OperandField::Undefined:

* Update translate.cpp

remove  OperandField::Undefined:

* Update image_view.cpp

[Render.Vulkan] <Error> image_view.cpp:ImageViewInfo:109: Storage image (num_comps = 4) requires swizzling [BGRA]
format 43 dst_sel 3886

* Update liverpool_to_vk.cpp

* S_CMPK_EQ_U32

* S_CMPK_EQ_U32
2024-08-25 22:07:46 +02:00
psucien
b687ae5e34
GnmDriver: Clear context support (#567)
* gnmdriver: added support for gpu context reset

* shader_recompiler: minor validation fixes

* shader_recompiler: added `V_CMPX_GT_I32`

* shader_recompiler: fix for crash on inline sampler access

* compilation warnings and dead code elimination

* amdgpu: fix for registers addressing

* libraries: videoout: reduce logging pressure

* shader_recompiler: fix for devergence scope detection
2024-08-25 23:01:05 +03:00
psucien
27cb218584
video_core: CPU flip relay (#415)
* video_core: cpu flip is propagated via gpu thread now

* tentative fix for cpu flips racing

* libraries: videoout: better flip status handling
2024-08-14 11:36:11 +02:00
TheTurtle
d8b9d82ffa
video_core: Various fixes (#423)
* video_core: Various fixes

* clang format
2024-08-13 20:05:10 +03:00
psucien
3d0fdf11f0
Build stabilization (#413)
* shader_recompiler: fix for float convert and debug asserts

* libraries: kernel: correct return code on invalid semaphore

* amdgpu: additional case for cb extents retrieval heuristic

* removed redundant check in assert

* amdgpu: fix for linear tiling mode detection fin color buffers

* texture_cache: fix for unexpected scheduler flushes by detiler

* renderer_vulkan: missing depth barrier

* texture_cache: missed slices in rt view; + detiler format
2024-08-12 17:23:01 +03:00
psucien
ace39957ef
Video Core: debug tools (#412)
* video_core: better use of rdoc markers

* renderer_vulkan: added gpu assisted validation

* renderer_vulkan: make nv_checkpoints operational

* video_core: unified Vulkan objects names
2024-08-12 13:46:45 +02:00
TheTurtle
a7c9bfa5c5
shader_recompiler: Small instruction parsing refactor/bugfixes (#340)
* translator: Implemtn f32 to f16 convert

* shader_recompiler: Add bit instructions

* shader_recompiler: More data share instructions

* shader_recompiler: Remove exec contexts, fix S_MOV_B64

* shader_recompiler: Split instruction parsing into categories

* shader_recompiler: Better BFS search

* shader_recompiler: Constant propagation pass for cmp_class_f32

* shader_recompiler: Partial readfirstlane implementation

* shader_recompiler: Stub readlane/writelane only for non-compute

* hack: Fix swizzle on RDR

* Will properly fix this when merging this

* clang format

* address_space: Bump user area size to full

* shader_recompiler: V_INTERP_MOV_F32

* Should work the same as spirv will emit flat decoration on demand

* kernel: Add MAP_OP_MAP_FLEXIBLE

* image_view: Attempt to apply storage swizzle on format

* vk_scheduler: Barrier attachments on renderpass end

* clang format

* liverpool: cs state backup

* shader_recompiler: More instructions and formats

* vector_alu: Proper V_MBCNT_U32_B32

* shader_recompiler: Port some dark souls things

* file_system: Implement sceKernelRename

* more formats

* clang format

* resource_tracking_pass: Back to assert

* translate: Tracedata

* kernel: Remove tracy lock

* Solves random crashes in Dark Souls

* code: Review comments
2024-07-30 23:32:40 +02:00
TheTurtle
0d6edaa0a0
Move presentation to separate thread/improve sync (#303)
* video_out: Move presentation to separate thread

* liverpool: Better sync for CPU flips

* driver: Make flip blocking

* videoout: Proper flip rate and vblank management

* config: Add vblank divider option

* clang format

* videoout: added `sceVideoOutWaitVblank`

* clang format

* vk_scheduler: Silly merge conflict

* externals: Add renderdoc API

* clang format

* reuse

* rdoc: manual capture trigger

* clang fmt

---------

Co-authored-by: psucien <168137814+psucien@users.noreply.github.com>
2024-07-28 15:54:09 +02:00
squidbus
66fa29059c Add initial macOS support. 2024-07-21 22:36:12 +03:00
psucien
dc50cc55fb
missing line fix 2024-07-14 17:11:54 +02:00
psucien
b8916787b2 renderer: debug markers for ability to match cmdlists with rdoc captures 2024-07-14 11:37:52 +02:00
psucien
8144f835a9 amdgpu: additional heuristic for CB extents detection
Found in CUSA00144
2024-07-14 10:59:22 +02:00
psucien
986ed0662c gnmdriver, amdgpu: added gpu idle IRQ; submission lock logic improved 2024-07-06 18:03:34 +02:00
TheTurtle
6ceab6dfac
shader_recompiler: Implement most integer image atomics, workgroup barriers and shared memory load/store (#231)
* shader_recompiler: Add LDEXP

* shader_recompiler: Add most image integer atomic ops

* shader_recompiler: Implement shared memory load/store

* shader_recompiler: More image atomics

* externals: Update sirit

* clang format

* cmake: Add missing files

* shader_recompiler: Fix some atomic bugs

* shader_recompiler: Vs outputs

* shader_recompiler: Shared mem has side-effects, fix format component order

* shader_recompiler: Inline constant buffer impl

* video_core: Fix regressions

* Work

* Fixup a few things
2024-07-05 00:15:44 +03:00
IndecisiveTurtle
22b930ba5e video_core: Track renderpass scopes properly 2024-07-01 13:56:14 +03:00
psucien
91940781b8 libraries: gnmdriver: complete HW stat init functions 2024-06-27 13:36:55 +02:00
IndecisiveTurtle
550bfa1c88 liverpool: Fix assert for compute queues 2024-06-26 20:00:09 +03:00
psucien
cb6b21de1f
Initial instancing and asynchronous compute queues (#207)
* gnm_driver: added `sceGnmRegisterOwner` and `sceGnmRegisterResource`

* video_out: `sceVideoOutGetDeviceCapabilityInfo` for sdk runtime

* gnm_driver: correct vqid index range

* amdgpu: indirect buffer, release mem and some additional irq modes

* amdgpu: added ASC commands processor

* shader_recompiler: added support for fetch instance id

* amdgpu: classic bitfields for T# representation (debugging experience)

* renderer_vulkan: skip zero sized VBs from binding

* texture_cache: image upload logic moved into `Image` object

* gnm_driver: `sceGnmDingDong` implementation

* texture_cache: `Image` usage flags moved; correct VO buffer pitch
2024-06-22 19:50:20 +03:00
TheTurtle
c5d1d579b1
core: Many things (#194)
* video_core: Add a few missed things

* libkernel: More proper memory mapped files

* memory: Fix tessellation buffer mapping

* Cuphead work

* sceKernelPollSema fix

* clang format

* fixed ngs2 lle loading and rtc lib

* draft pthreads keys implementation

* fixed return codes

* return error code if sceKernelLoadStartModule module is invalid

* re-enabled system modules and disable debug in libs.h

* Improve linux support

* fix windows build

* kernel: Rework keys

---------

Co-authored-by: georgemoralis <giorgosmrls@gmail.com>
2024-06-15 14:36:07 +03:00
psucien
04b1226e9c tracy: basic markup and project palette 2024-06-11 12:14:33 +02:00
raphaelthegreat
ae7e6dafd5 shader_recompiler: Add more instructions and fix a few thinhs 2024-06-05 22:22:34 +03:00