From: Matteo Bruni Subject: Re: [PATCH] include: Use __cdecl calling convention for wine_dbg_log(). Message-Id: Date: Sat, 19 Sep 2020 09:05:25 +0200 In-Reply-To: <8832f565-e0ee-763d-f712-7dccd83f6c4b@codeweavers.com> References: <20200910193811.420427-1-mbruni@codeweavers.com> <0af3be11-847d-0548-02d3-bf4eaa363480@codeweavers.com> <8832f565-e0ee-763d-f712-7dccd83f6c4b@codeweavers.com> On Fri, Sep 18, 2020 at 6:52 PM Rémi Bernon wrote: > > On 2020-09-10 22:30, Rémi Bernon wrote: > > On 2020-09-10 21:38, Matteo Bruni wrote: > >> The practical consequence is that registers are saved / restored > >> inside wine_dbg_log() instead of in the caller, which means not > >> incurring in the overhead when debug channels are disabled. > >> > >> Signed-off-by: Matteo Bruni > >> --- > >> This came up from hacking around yesterday with Paul and Rémi. I > >> haven't tested the patch extensively but it seems to do the > >> trick. wine_dbg_log() in practice is never inlined for me and that's > >> the reason why this is enough. It's not possible to remove the inline > >> keyword from the definition because that makes it trigger the "unused > >> static function" warning all over Wine. > >> > >> Worth mentioning that this patch makes sure that wine_dbg_log() has > >> the same calling convention as __wine_dbg_header() and the other > >> debug functions in ntdll. > >> --- > >> include/wine/debug.h | 12 ++++++------ > >> 1 file changed, 6 insertions(+), 6 deletions(-) > >> > >> diff --git a/include/wine/debug.h b/include/wine/debug.h > >> index 912d61a90af0..462a43e22e20 100644 > >> --- a/include/wine/debug.h > >> +++ b/include/wine/debug.h > >> @@ -193,12 +193,12 @@ static inline int __wine_dbg_cdecl > >> wine_dbg_printf( const char *format, ... ) > >> return __wine_dbg_output( buffer ); > >> } > >> -static int __wine_dbg_cdecl wine_dbg_log( enum __wine_debug_class cls, > >> - struct __wine_debug_channel > >> *channel, const char *func, > >> - const char *format, ... ) > >> __WINE_PRINTF_ATTR(4,5); > >> -static inline int __wine_dbg_cdecl wine_dbg_log( enum > >> __wine_debug_class cls, > >> - struct > >> __wine_debug_channel *channel, > >> - const char > >> *function, const char *format, ... ) > >> +static int __cdecl wine_dbg_log( enum __wine_debug_class cls, > >> + struct __wine_debug_channel > >> *channel, const char *func, > >> + const char *format, ... ) > >> __WINE_PRINTF_ATTR(4,5); > >> +static inline int __cdecl wine_dbg_log( enum __wine_debug_class cls, > >> + struct __wine_debug_channel > >> *channel, > >> + const char *function, const > >> char *format, ... ) > >> { > >> char buffer[1024]; > >> __wine_dbg_va_list args; > >> > > > > Well, if we intend to make these functions __cdecl, probably they all > > should be, in this header, including the inline helpers, as we've seen > > that any missing attribute can make the spilling leak to its callers. > > > > Then, I didn't look very precisely either, but I believe there's > > actually a big issue that makes it not so simple after all (correct me > > if I'm wrong though, maybe it's all good): > > > > If we set "ms_abi" attribute to these vararg functions, then their > > calling convention will be as such. Thus, we cannot use anymore the sysv > > va_list in there, and we have instead to use the __ms_va_list (defined > > to __builtin_ms_va_list in this case). > > > > Now, doing so will break the vsnprintf call, when it goes to libc, as it > > expects a sysv va_list argument instead. > > > > One possible way that I see would be instead to have a > > __wine_dbg_vsnprintf export in ntdll, that either forward the call to > > msvcrt vsnprintf (which already expects a __ms_va_list argument), or to > > an ntdll.so implementation. > > > > The ntdll.so implementation will have to be here anyway, to support its > > own traces, as it cannot go back to msvcrt for that. It will have to > > handle the __ms_va_arg somehow, possibly translating it for libc (which > > actually may not be so hard, with snprintf and one argument at a time), > > or handle the formatting on its own. > > > > The latter could actually make the whole thing worth the effort, if we > > consider that TRACE messages could benefit from a specific printf > > implementation with some extensions. For instances, it could print > > strings and wstrings at the same time, in a safe way, without needing > > the debugstr_a/w anymore. Same for guids for instance. > > > Hi! > > > I actually pushed the idea a little bit further and implemented it, in > the attached patches, to illustrate a bit and because I think it could > possibly still be interesting. > > > The SSE register spilling savings are actually a little bit > underwhelming. As we have seen, any ms_abi function that calls a > sysv_abi function (or a non specified, in an ELF builtin) will spill the > XMM registers to the stack and restore them after the call. > > Marking the debug functions ms_abi mostly helps with the leaf functions > that do not call anything else, but for instance in wined3d, there's a > lot of unspecified helpers that will cause the same spilling to happen. > We will have to specify ms_abi on all of them to solve the issue. > > Then, making all functions ms_abi pushes the spilling down the call > chain, and if there's a system function to call in the end, it will have > to spill the registers anyway. If that function is called much more > often than the Wine entry point, then it may also not be ideal to force > the register spilling there, and having it on the entry point would be > preferable. > > Anyway, making debug functions ms_abi should not hurt in any case, they > are not supposed to be called often, they never go back from sysv_abi to > ms_abi, and keeping their own induced spilling internally is better > regardless of what the surrounding code does. > > > Now, what I find the most useful, is what this implementation allows us > to do with debug TRACE messages. As it makes every debug formatting go > through ntdll.so, I think it has at least two interesting properties: > > 1) We can control the formatting, and as I did in the patch, extend the > format specifiers as we like, and get rid of most of the static inline > debug helpers (I don't know if that's a good thing or not, but I > personally like to be able to do so). > > 2) The actual value formatting is done consistently by glibc. There's no > value formatting difference between PE code and ELF code TRACE messages. > And thanks to 1) we can still support MSVC-specific format specifiers, > we just need to convert them to the glibc equivalents. > > > Cheers, > -- > Rémi Bernon Nice! I've been tinkering with your idea too but I didn't get to the end. One roadblock was that I wasn't sure that va_arg() would do the correct thing for a __ms_va_list. I guess it does :) I haven't reviewed your patch in detail but it looks great at a glance. It's pretty similar to what I have here except it's complete and nicer :P FWIW, I think it's worthwhile to have it even if only trivial functions get any improvement. In fact, those will get the best relative improvement (i.e. percentage of code moved out of the fast path) and they tend to be those called most frequently. For reference, the function that started the whole search into this on my side is wined3d_buffer_get_resource(). As it turns out, I had at least one game where (after a whole bunch of other wined3d changes, but that's another story...) getting rid of that one TRACE() made a tiny but measurable difference.