From: "Rémi Bernon"
Subject: Re: [PATCH vkd3d 1/2] vkd3d-shader: Optimize get_opcode_info with direct opcode_table access.
Message-Id: <95a6f7bc-cce7-b86a-8516-d47e49308154@codeweavers.com>
Date: Fri, 4 Oct 2019 10:05:46 +0200
In-Reply-To:
References: <20191003170933.12734-1-rbernon@codeweavers.com>

On 10/3/19 9:05 PM, Henri Verbeet wrote:
> On Thu, 3 Oct 2019 at 20:42, Rémi Bernon wrote:
>> The shader_sm4_read_instruction function shows up in perf report when
>> running SOTTR on Intel because of this loop.
>>
> That seems like a questionable claim. Does this actually improve
> things? Do you have numbers? Direct3D 12 applications should ideally
> not be creating pipeline states at all during rendering, but if they
> do, actual shader compilation is going to be much more expensive than
> anything we do here.
>
> That's not to say this can't be improved though.
>

Yes, I did the measurements: perf (with default settings) reports the
function going from ~2.5% self overhead down to 0.6% with this patch.

For the second patch, perf reported the other function at 1.7% self
overhead before the patch and does not report it at all afterwards,
because it gets inlined somewhere, but the sum of all the vkd3d_spirv
function overhead is lowered.

My interpretation is that the shader compilation happens at startup but
is CPU bound, and it is noticeable. I didn't let the game run for very
long after that, but it is highly GPU bound afterwards, so nothing in
particular shows up in perf.

-- 
Rémi Bernon
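
For readers without the patch at hand, the kind of change named in the
subject line (replacing a per-instruction linear search with a direct
table index) might look roughly like the sketch below. This is not the
actual patch: the opcode values, the example_* names and both lookup
functions are hypothetical and only illustrate the general technique.

```c
#include <stddef.h>

/* Hypothetical opcode values; the real SM4 opcodes in vkd3d-shader differ. */
enum example_opcode
{
    EXAMPLE_OP_ADD = 0,
    EXAMPLE_OP_MOV = 2,   /* value 1 is deliberately left unused */
    EXAMPLE_OP_MUL = 3,
    EXAMPLE_OP_COUNT
};

struct example_opcode_info
{
    enum example_opcode opcode;
    const char *name;     /* NULL marks an unused slot in the direct table */
};

/* Variant 1: unordered list, searched linearly on every lookup (O(n)). */
static const struct example_opcode_info example_list[] =
{
    {EXAMPLE_OP_ADD, "add"},
    {EXAMPLE_OP_MOV, "mov"},
    {EXAMPLE_OP_MUL, "mul"},
};

static const struct example_opcode_info *lookup_linear(unsigned int opcode)
{
    size_t i;

    for (i = 0; i < sizeof(example_list) / sizeof(*example_list); ++i)
    {
        if ((unsigned int)example_list[i].opcode == opcode)
            return &example_list[i];
    }
    return NULL;
}

/* Variant 2: table indexed by the opcode value itself (O(1)).
 * Slots without an initializer are zero-filled, so their name is NULL. */
static const struct example_opcode_info example_table[EXAMPLE_OP_COUNT] =
{
    [EXAMPLE_OP_ADD] = {EXAMPLE_OP_ADD, "add"},
    [EXAMPLE_OP_MOV] = {EXAMPLE_OP_MOV, "mov"},
    [EXAMPLE_OP_MUL] = {EXAMPLE_OP_MUL, "mul"},
};

static const struct example_opcode_info *lookup_direct(unsigned int opcode)
{
    /* Reject out-of-range values and holes in the table. */
    if (opcode >= EXAMPLE_OP_COUNT || !example_table[opcode].name)
        return NULL;
    return &example_table[opcode];
}
```

Both variants return the same entry for a known opcode and NULL for an
unknown one; the direct table trades a few unused slots for a lookup
cost that no longer depends on the number of opcodes.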