From: "Rémi Bernon" Subject: Re: [PATCH] winebuild: Clear RFLAGS before entering syscall. Message-Id: Date: Wed, 2 Jun 2021 22:10:32 +0200 In-Reply-To: References: <20210601084301.2074741-1-rbernon@codeweavers.com> <324e7d38-0d9e-79d6-c9a0-26818ac1c7fb@codeweavers.com> On 6/2/21 10:00 PM, Rémi Bernon wrote: > On 6/2/21 5:45 PM, Rémi Bernon wrote: >> On 6/2/21 5:35 PM, Jacek Caban wrote: >>> Hi Rémi, >>> >>> On 6/1/21 10:43 AM, Rémi Bernon wrote: >>>> We pushed the flags, but kept them set. Far Cry sets NT flags, which >>>> causes later iretd instruction to raise a GP fault exception. >>>> >>>> This fixes a regression from e341d1f695311725752c287057f6c6ab60fdf2a3. >>> >>> >>> iret is responsible for a fair chunk of syscall dispatcher overhead. >>> I plan to submit patches optimizing syscall dispatcher to not do iret >>> in the usual code path and use signal_restore_full_cpu_context if >>> context was modified during the syscall. I don't have 32-bit version >>> implemented yet, but the idea would be the same. I think those >>> optimization should fix the regression without adjusting flags on >>> each syscall. signal_restore_full_cpu_context would still have the >>> same problem, but maybe then we could do flags adjustment only there? >>> Or are there other reasons to do the adjustment (in which case I >>> would need to take that into account in my patches as well)? >>> >>> >>> Thanks, >>> >>> Jacek >>> >> >> I don't know for sure but if some other code inside the syscall does >> iretd with the NT flag still being set, it may cause a crash again. >> >> This was actually an issue a long time ago as [1] suggest, and it's >> been fixed in the Linux kernel [2]. As the Linux kernel now clears the >> flags on syscall entry it doesn't seem completely crazy to clear it on >> NT syscall entry (although probably not useful if you intend to >> replace the iretd). >> >> [1] https://bugs.winehq.org/show_bug.cgi?id=33275 >> [2] https://lkml.org/lkml/2014/10/27/825 > > Also FWIW, on my desktop (with a simple NtQPC benchmark, with perf > sampling) I don't see any overhead caused by iret, most of the syscall > overhead comes from xsavec64 / xrstor64. It may be hardware dependent. Nevermind, I can see it (although FPU state still is much bigger). -- Rémi Bernon