From: Jinoh Kang Subject: [PATCH 4/4] dlls/ntdll: Call try_recv inside sock_recv only if it is safe to do so. Message-Id: <518be0f1-0fd6-6b01-d061-eb6092b413aa@gmail.com> Date: Wed, 19 Jan 2022 14:50:16 +0900 In-Reply-To: <6cc2faff-cb91-7422-987e-6ca32cfba378@codeweavers.com> References: <2745e856-7ed9-e815-099f-9d2a75c399d2@codeweavers.com> <6cc2faff-cb91-7422-987e-6ca32cfba378@codeweavers.com> On 1/19/22 08:48, Zebediah Figura (she/her) wrote: >> If that is considered to be the likely case, the question is: if there is a planned change which is going to increase amount of server calls per every sync socket operation from 2 to 3 should it be presumed ok and subject to be proven otherwise? Or should instead some metrics and reasoning suggesting that the added overhead is not sufficient should be provided instead before making such a change? I am not sure if the issue addressed by these patches is a regression, maybe if that is a regression it can give a motivation to fix that first and think about optimization next. Otherwise I was under impression that (as usual for Wine patches?) it is the subject for patch justification to make some argument that it doesn't regress performance considerably. >> > > What concerns me is the prospect that once we do one server call we've already lost. It leads me to wonder if three server calls is perceptibly worse than two—and if it is, perhaps we should find a way to eliminate those two server calls instead. We can eliminate wait_async() only if: 1. The file object does not have any associated completion ports. 2. The request doesn't have a completion APC. 3. The request doesn't have a completion event. 4. Either the object has FILE_SKIP_SET_EVENT_ON_HANDLE flag, or is already signaled. 5. The server-side async object doesn't have completion_callback assigned. However, these are not the prerequisite. Additional measures have to be taken: 1. We need to make sure that listened FD events (poll flags) are set correctly at all times, before and after try_recv(). 2. We need to implement some kind of opportunistic lock (not unlike RCU); Later calls to receive_sock shall probe for any pending synchronous I/O on the same file object. Normally this would involve IPC, but we can short circuit if all I/O to the file object are happening in a single thread.* 3. Some other cases I haven't thought of. We can take a step further and remove the server calls entirely. This would however require some kind of shared memory to flag handles as async-safe. Such effort would apply generally to the entire Wine server-client architecture, though. *In principle, it's impossible to perform multiple synchronous I/O simultaneously. In the event Wine supports special user-mode APCs, we can simply block SIGUSR1 in the critical region. > > It's quite likely that I'm biased from working with frame-time performance, though. It is probably very hard to not make any server calls at all when doing socket I/O. So depending on how ugly it gets to cut out the third server call it's probably not worth arguing against it. If the server calls make up the majority of overhead, then it's possible that each removed call will translate to noticeable drop in I/O latency. Not that I have a concrete example, though. -- Sincerely, Jinoh Kang