[ros-kernel] Advice needed: thread termination, APCs
Ge van Geldorp
gvg at reactos.com
Tue Sep 14 11:40:03 CEST 2004
This is going to be a little bit hard to explain, so please bear with me.
I've got CMD.EXE running (process id 9). Its main thread (let's call it
thread 91) has started MC.EXE (process id 10, a console app). Thread 91
is waiting for process 10 to finish.
The first event happening is that process 10 is terminated. This causes
all threads in process 10 to terminate, but the process itself isn't
cleaned up yet because thread 91 still has a handle to the process.
However, thread 91 will get out of its wait state and resume execution.
It will call NtClose() to close the handle it kept to process 10. Since
this is the last handle around, that will trigger the cleanup of process
10. As part of that cleanup, the memory space of process 10 is released.
At one point MmFreeMemoryArea() is called. Since thread 91 belongs to
process 9, but we need to work on the memory map of process 10,
KeAttachProcess() is called, attaching process 10 to thread 91. It then
starts freeing pages.
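
To make the sequence above concrete, here is a minimal sketch in plain C of "last handle closed, so the closing thread attaches to the dying process and frees its pages". It is a toy model under stated assumptions, not ReactOS code: the types `ToyProcess` and `ToyCleanupThread` and all function names are hypothetical stand-ins for the real NtClose() / KeAttachProcess() / MmFreeMemoryArea() / KeDetachProcess() path.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, not a ReactOS symbol. */
typedef struct ToyProcess {
    int id;
    int handle_count;   /* open handles that refer to this process */
    int pages;          /* pages still mapped in its address space */
} ToyProcess;

typedef struct ToyCleanupThread {
    ToyProcess *owner;     /* process the thread belongs to */
    ToyProcess *attached;  /* foreign address space in use, or NULL */
} ToyCleanupThread;

/* Stand-in for KeAttachProcess(): borrow another process's address space. */
static void toy_attach_process(ToyCleanupThread *t, ToyProcess *p)
{
    assert(t->attached == NULL);
    t->attached = p;
}

/* Stand-in for KeDetachProcess(): go back to the thread's own process. */
static void toy_detach_process(ToyCleanupThread *t)
{
    assert(t->attached != NULL);
    t->attached = NULL;
}

/* Stand-in for the page-freeing loop: runs in the context of thread t. */
static void toy_free_address_space(ToyCleanupThread *t, ToyProcess *p)
{
    int need_attach = (t->owner != p);

    if (need_attach)
        toy_attach_process(t, p);
    while (p->pages > 0)
        p->pages--;          /* the window in which thread 92 can interfere */
    if (need_attach)
        toy_detach_process(t);
}

/* Stand-in for NtClose() on a process handle.
 * Returns 1 if this close released the last handle and ran the cleanup. */
int toy_close_process_handle(ToyCleanupThread *t, ToyProcess *p)
{
    assert(p->handle_count > 0);
    if (--p->handle_count > 0)
        return 0;
    toy_free_address_space(t, p);
    return 1;
}
```

The point of the sketch is only that the cleanup runs in the context of whichever thread closes the last handle, so that thread spends some time attached to a process it does not belong to.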
At that point, a new thread is created in process 9 using
CreateRemoteThread(), let's call it thread 92. This thread 92 starts
running and almost immediately calls ExitProcess(). This will start the
termination of process 9. There are two threads belonging to process 9,
thread 91 and 92. To force thread 91 to terminate, an APC is queued to
it that will cause 91 to self-destruct on return to usermode. The
crucial part is that the APC is queued while thread 91 is attached to
process 10. Thread 92 then self-destructs.
Thread 91 resumes execution in MmFreeMemoryArea(), attached to process
10. When its work is done in that function, it will call
KeDetachProcess() to clean up. KeDetachProcess() will call
KiSwapApcEnvironment(). Gunnar put in the following comment in that function:
"FIXME: Deliver any pending apc's queued to the attached environment
before switching back to the original environment. This is not crucial
thou, since i don't think we'll ever target apc's to attached
environments (or?)... Remove the following asserts if implementing
followed by asserts which check that no APCs have been queued. In the
scenario described above, the assert fails.
I don't have enough knowledge of APCs to figure out how this should be
fixed. My gut feeling tells me that in this particular case we should
just keep the APC queued but I could be totally wrong.
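
For illustration, the "just keep the APC queued" idea could look roughly like the sketch below. This is a toy model in plain C, assuming a thread has one APC queue per environment (original and attached) and that the detach step simply carries anything queued in the attached environment over to the original one instead of asserting. The struct and the `toy_*` functions are hypothetical and are not the actual KiSwapApcEnvironment() code; whether carrying the APC over is the right fix is exactly the open question.

```c
#include <assert.h>

/* Toy model of per-thread APC environments; all names are hypothetical. */
enum { ORIGINAL_ENV = 0, ATTACHED_ENV = 1 };

typedef struct ToyThread {
    int pending[2];   /* number of queued APCs per environment */
    int current;      /* environment the thread currently runs in */
    int delivered;    /* APCs delivered so far */
} ToyThread;

/* Attach to another process: switch to the attached environment. */
void toy_attach(ToyThread *t)
{
    assert(t->current == ORIGINAL_ENV);
    t->current = ATTACHED_ENV;
}

/* Queue an APC into whatever environment is current at that moment. */
void toy_queue_apc(ToyThread *t)
{
    t->pending[t->current]++;
}

/* Detach: rather than asserting that nothing was queued while attached,
 * keep those APCs by moving them to the original environment. */
void toy_detach(ToyThread *t)
{
    assert(t->current == ATTACHED_ENV);
    t->pending[ORIGINAL_ENV] += t->pending[ATTACHED_ENV];
    t->pending[ATTACHED_ENV] = 0;
    t->current = ORIGINAL_ENV;
}

/* Return to user mode: deliver what is pending in the original
 * environment and report how many APCs ran. */
int toy_return_to_user(ToyThread *t)
{
    int n;

    assert(t->current == ORIGINAL_ENV);
    n = t->pending[ORIGINAL_ENV];
    t->pending[ORIGINAL_ENV] = 0;
    t->delivered += n;
    return n;
}
```

In the scenario above, thread 91 would attach, receive the termination APC while attached, detach without tripping an assert, and then run the APC on its way back to usermode.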
While trying to figure out what was going on here, I ran into what seems
to be another problem with APCs. According to the QueueUserAPC()
documentation, user-mode APCs are only delivered on return from a
handful of APIs: SleepEx(), SignalObjectAndWait(),
WaitForSingleObjectEx(), WaitForMultipleObjectsEx() and
MsgWaitForMultipleObjectsEx(). Sure enough, a small test showed that a
usermode APC is not delivered while waiting in GetMessage(). However, it
seems that in ReactOS the usermode APC is delivered after any syscall. I
assume that will have to be changed at some point. But when that's
changed, the current thread termination mechanism (queue a usermode APC
to self-destruct) won't work anymore. It would be possible for a thread
to call NtUserGetMessage(), be terminated while in kernel mode waiting
for a message, and then return from NtUserGetMessage() back to usermode
because the APC wouldn't be delivered.
So I've been thinking, maybe we shouldn't rely on an APC to terminate
the thread? It would be easy to check the thread state in
KiAfterSystemCall() (which is called just before returning to usermode).
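
A rough sketch of what I mean, as a toy model in plain C rather than real kernel code: termination is recorded as a flag on the thread and checked on every return to usermode, while usermode APCs are only delivered after an alertable wait. The `SimThread` struct, its fields and the `sim_*` functions are all hypothetical; the real KiAfterSystemCall() looks nothing like this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-thread state; not the real ETHREAD/KTHREAD layout. */
typedef struct SimThread {
    bool terminate_requested;  /* set by whoever terminates the thread */
    int  pending_user_apcs;    /* usermode APCs waiting for delivery */
    int  delivered_user_apcs;  /* usermode APCs that have run */
    bool alive;
} SimThread;

/* Termination no longer queues an APC, it just marks the thread. */
void sim_request_termination(SimThread *t)
{
    t->terminate_requested = true;
}

/* Model of the check just before returning to usermode.
 * Returns true if the thread may continue in usermode, false if it
 * was terminated here instead. */
bool sim_after_system_call(SimThread *t, bool alertable_wait)
{
    assert(t->alive);  /* a dead thread never gets this far */

    /* The termination check does not depend on APC delivery rules. */
    if (t->terminate_requested) {
        t->alive = false;
        return false;
    }

    /* Usermode APCs are only delivered after an alertable wait. */
    if (alertable_wait) {
        t->delivered_user_apcs += t->pending_user_apcs;
        t->pending_user_apcs = 0;
    }
    return true;
}
```

With this split, a thread sitting in NtUserGetMessage() (a non-alertable wait) would not get usermode APCs delivered, but would still be stopped before it gets back to usermode once it has been marked for termination.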
Gé van Geldorp.