Optimize AttributeBuffer to OutputVertex conversion (#3283)
Optimize AttributeBuffer to OutputVertex conversion First I unrolled the inner loop, then I pushed semantics validation outside of the hotloop. I also added overflow slots to avoid conditional branches. Super Mario 3D Land's intro runs at almost full speed when compiled with Clang, and theres a noticible speed increase in MSVC. GCC hasn't been tested but I'm confident in its ability to optimize this code.
This commit is contained in:
parent
3f7f2b42c0
commit
41929371dc
4 changed files with 34 additions and 18 deletions
|
@ -50,6 +50,7 @@ struct OutputVertex {
|
|||
INSERT_PADDING_WORDS(1);
|
||||
Math::Vec2<float24> tc2;
|
||||
|
||||
static void ValidateSemantics(const RasterizerRegs& regs);
|
||||
static OutputVertex FromAttributeBuffer(const RasterizerRegs& regs,
|
||||
const AttributeBuffer& output);
|
||||
};
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue