[ros-dev] [ros-diffs] [sir_richard] 45052: Patch that fixes VMWare boot (and

Ros Arm ros.arm at reactos.org
Tue Jan 12 15:08:51 CET 2010


Hi Timo,

Regarding #1, we actually have this working in our tree. It was required for the new SYSENTRY C handler. So thank you for the advice and for pointing that out. Unfortunately I will not credit you for the idea since the code was already written, I hope this is understandable. We had to do this not for performance, but for compatibility with the Eoi helper function which is exported. The performance difference between pushad and doing the instructions manually (and later fixing up) was actually not that big, since pushad has certain pipeline advantages. However, when dealing with interrupts, it starts becoming more of a problem (although that code is not in mainline yet). Also your sample code has some bugs: the last integer register save is duplicated, and it would be better to use a symbolic name instead of 10 * 4. That value is also wrong, since it depends on whether or not the trap generates an error code or not. x86 is not easy, have to be careful!

Regarding #2, I unfortunately do not see the need to add more assembly when it is not needed. There are no subtle compiler breakages involved, since it is impossible for the compiler to add a DS/ES access for no reason at all (can you come up with an example?), this is part of the contract the compiler has to make (if compilers did this, it would break a whole class of embedded and real-mode software). Because the stack is safe, I think it's worth avoiding the extra lines of assembly. The C inlines will produce exactly the same code anyway. Also, it's not as easy as you make it seem, since for a FAST-V86 trap, we don't want to save the registers, just change them. So we need to insert a check for EFLAGS to see if the V8086 flag is there or not, then conditionally do the registers, then jump to the C code that will do another conditional check, and pick the right trap entry C code... it gets quite complicated. Since there are no "real" issue with doing it in C, other than personal choice and style, I think we'll stick with what we have.

The patch for #1 will also start introducing some other performance enhancing features, such as branch prediction hints for the CPU and compiler, and other little boosts. With the pusha conversion code gone, the C code is actually more efficient than previous assembly implementations.

Thanks for your comments and suggestions, please keep them coming. Do not be discouraged by this particular decision regarding #2 (it's also slightly political) as a sign or pattern that any suggestion will be ignored (you were not the first to bring up #1, otherwise we'd have done it thanks to you).

Also, if anyone has any experience with MSC++ and could port the GCC-centric macros, that would be much appreciated. I've given up on MSC++ after the compiler and linker have tried, year after year, to add more and more shit to what used to be a simple portable codebase. Buffer this, buffer that, nx this, dep that, safe seh here, hotpatch there, manifest files to even get it to run... ARGH!

Finally, the recent fixes have solved the double fault problem, but it seems certain flavours of VMWare use a different Video BIOS than my test machines and that V8086 C emulation code must be hitting a bug. It could also be we're seeing the BOP and not handling it. I'll look into this today. Note that the ASM implementation of PUSHF  (or was it POPF) was completely broken. Someone must've made some serious typos when *ahem* writing the code. Hopefully there are no dependencies on that bug.

-r

> Hi,
> 
> I suggest the following changes to the current implementation:
> 
> 1.) Replace the PUSHA in the trap entry code, by a series of MOVs to the
> correct stack positions. The same for the trap exit code.
> KiTrapFrameFromPushaStack can be removed then. This would result in more
> clear, less complex and faster code, with only  a few additional
> assembly instructions in  the trap entry macro. The exit could be done
> either by normal call/return or by a jmp to an exit handler.
> 
> 2.) Segements should be fixed up before entering C code, as not doing so
> may introduce possible compiler dependend breakages. It's also much
> cleaner and there's no reason to do the same stuff later in inline
> assembly instead of direcly in the asm entry point.
> 
> The resulting code might looks something like this:
> 
>     /* Allocate KTRAP_FRAME */
>     sub esp, KTRAP_FRAME_LENGTH - 10 * 4
> 
>     /* Save integer registers */
>     mov [esp + KTRAP_FRAME_EBP], ebp
>     mov [esp + KTRAP_FRAME_EBX], ebx
>     mov [esp + KTRAP_FRAME_ESI], esi
>     mov [esp + KTRAP_FRAME_EDI], edi
>     mov [esp + KTRAP_FRAME_EAX], eax
>     mov [esp + KTRAP_FRAME_ECX], ecx
>     mov [esp + KTRAP_FRAME_EDX], edx
>     mov [esp + KTRAP_FRAME_EBX], ebx
> 
>     /* Save segment regs */
>     mov [esp + KTRAP_FRAME_SEGDS], ds
>     mov [esp + KTRAP_FRAME_SEGES], es
> 
>     /* Fixup segment regs */
>     mov ax, KGDT_R3_DATA | RPL_MASK
>     mov ds, ax
>     mov es, ax
> 
> 
> Timo
> 
> 
> _______________________________________________
> Ros-dev mailing list
> Ros-dev at reactos.org
> http://www.reactos.org/mailman/listinfo/ros-dev





More information about the Ros-dev mailing list