As several people have mentioned over the last few years, it would be beneficial if we could port ReactOS to ARM without cannibalizing the human resources that are dedicated to x86-32/x86-64/PPC development. Thinking about the nature of Win32 applications, I was wondering what all of you might think of the following idea. It takes a somewhat different approach from pure run-time emulators and, assuming that it works, would allow x86-32 binaries to run on ARM devices at native ARM speed, without anyone having to recompile the x86-32 applications manually.
First, some observations:
- The vast majority of Win32 applications consists of one or more EXE modules and one or more DLL modules.
- The format of these modules is well-documented, and obviously understood by the members of the ReactOS team that created the ReactOS ring-3 loader.
See: http://en.wikipedia.org/wiki/Portable_Executable
- An EXE or DLL essentially runs in an execution "bubble". Anything beyond register/RAM state manipulation goes through a documented interface: Win32.
- The calls that are made to the Win32 API are readily observable from examination of the invoking module's import table. Calls invoked via the LoadLibrary/GetProcAddress sequence are special, of course.
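To illustrate the last observation: calls that go through the import table are usually visible in the machine code as well, because x86-32 compilers commonly emit them as `call dword ptr [slot]` (bytes FF 15 followed by a 32-bit absolute address). Here is a minimal Python sketch of spotting that pattern. The `iat_slots` map and the function name are hypothetical; a real tool would build the map by parsing the import directory, and would disassemble properly rather than scan raw bytes, since a naive scan can match FF 15 in the middle of another instruction.

```python
import struct

CALL_INDIRECT = b"\xff\x15"  # x86-32 encoding of: call dword ptr [imm32]


def find_import_calls(code, iat_slots):
    """Return (offset, api_name) for each `call [slot]` whose slot is known.

    code      -- raw bytes of a code section
    iat_slots -- hypothetical map: import-table slot address -> API name
    """
    hits = []
    pos = code.find(CALL_INDIRECT)
    # Each match needs 2 opcode bytes plus a 4-byte absolute address.
    while pos != -1 and pos + 6 <= len(code):
        (target,) = struct.unpack_from("<I", code, pos + 2)
        if target in iat_slots:
            hits.append((pos, iat_slots[target]))
        pos = code.find(CALL_INDIRECT, pos + 1)
    return hits


# Hand-assembled example bytes; the addresses are made up for illustration.
slots = {0x00402000: "CreateFileW", 0x00402004: "CloseHandle"}
code = (
    b"\x6a\x00"                    # push 0
    b"\xff\x15\x00\x20\x40\x00"    # call dword ptr [0x00402000]
    b"\x50"                        # push eax
    b"\xff\x15\x04\x20\x40\x00"    # call dword ptr [0x00402004]
    b"\xc3"                        # ret
)
print(find_import_calls(code, slots))
```

Calls resolved through LoadLibrary/GetProcAddress would not show up this way, which is why they are the special case.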
Now, the proposed procedure:
- Install the x86-32 application onto the ARM device as usual. The layout of the x86-32 application on the ARM file system would be essentially identical to what it would have been had the application been ARM-targeted. [This is an inductive step. See explanation below.]
- Execute the x86-32 application using the regular mechanism: CreateProcess will be invoked against the initial binary for the application, which of course, at this point, will still contain x86-32 machine code.
- The ReactOS ring-3 loader would need to be modified so that it can detect whether the executable image is x86-32-based or ARM-based.
- If the image is ARM-based, fine. There is nothing to do. If the image is x86-32-based, then the loader would perform the translation as follows:
- Reverse-compile the x86-32 image using a machine-code-to-C decompiler.
See: http://en.wikibooks.org/wiki/X86_Disass ... ecompilers
- Compile the generated C code to ARM machine code.
- Take advantage of the fact that the code sequence for invoking functions of the Win32 API is highly likely to follow a regular, recognizable pattern. For example, when the function CreateFile is called, there would be a code sequence where the DLL import table is consulted in preparation for the CreateFile call. This is where the loader (or, equivalently, a user-mode agent that acts on behalf of the loader) performs any necessary modification so that the ARM machine code performs the same operation as the corresponding x86-32 code would.
- The conversion of an EXE, and of each DLL upon which it depends, would be a one-time operation. After the translation occurs, the generated ARM EXE and ARM DLLs would be stored in a cache on disk. The cache might be a set of files, each named after the EXE or DLL with the string-encoded SHA-256 hash of the file's pre-translation contents affixed. The contents of each cache file would be the machine code of the EXE or DLL after translation. A small bit of fancy footwork would be necessary to move the ARM image so that it temporarily replaces the x86-32 image, so that GetCurrentDirectory/SetCurrentDirectory/etc. work as expected.
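The detection and caching steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not loader code: the function names, the `.` separator in the cache name, and the `fake_pe` helper (which builds header-only test data, not a loadable image) are all hypothetical. The header offsets and Machine values are the ones given in the PE documentation.

```python
import hashlib
import struct

IMAGE_FILE_MACHINE_I386 = 0x014C  # x86-32
IMAGE_FILE_MACHINE_ARM = 0x01C0   # ARM little-endian


def pe_machine(image):
    """Return the Machine field of a PE image, or None if it is not a PE file."""
    if len(image) < 0x40 or image[:2] != b"MZ":
        return None
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew
    if pe_offset + 6 > len(image) or image[pe_offset:pe_offset + 4] != b"PE\0\0":
        return None
    (machine,) = struct.unpack_from("<H", image, pe_offset + 4)
    return machine


def needs_translation(image):
    """True when the image is x86-32 and would have to be translated on ARM."""
    return pe_machine(image) == IMAGE_FILE_MACHINE_I386


def cache_name(module_name, image):
    """Cache file name: module name plus SHA-256 of the untranslated contents.

    The '.' separator is an assumption made for this sketch.
    """
    return module_name + "." + hashlib.sha256(image).hexdigest()


def fake_pe(machine):
    """Build a header-only stand-in for a PE file (test data, hypothetical)."""
    dos_header = b"MZ" + b"\0" * 0x3A + struct.pack("<I", 0x40)  # e_lfanew = 0x40
    return dos_header + b"PE\0\0" + struct.pack("<H", machine)


x86_image = fake_pe(IMAGE_FILE_MACHINE_I386)
arm_image = fake_pe(IMAGE_FILE_MACHINE_ARM)
print(needs_translation(x86_image), needs_translation(arm_image))
print(cache_name("notepad.exe", x86_image))
```

Keying the cache on the hash of the original contents means that an updated EXE or DLL gets a new cache entry automatically, while an unchanged one is translated only once.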
How is it possible to install an x86-32 application onto an OS that is purely ARM?
If the installation is primarily driven by an EXE, then the same mechanism described above would work. If the installation is driven by an MSI database, then something equivalent to msiexec.exe would need to be used.
Naturally, there are some problems with this technique. It will only work for well-behaved applications, that is, applications that consistently delegate OS-specific functions to the OS itself. There is also the matter of thread-local storage, where, I believe, the FS register on x86-32 (GS on x86-64) is accessed directly. Then there are specialized applications, like those that go snooping in their own thread environment block (TEB).
One other benefit of this approach is that the hordes of programmers who are comfortable using Visual Studio Express, for example, to write native C/C++ applications under Windows would be able to target ARM without targeting ARM, simply by continuing to write well-behaved applications for x86-32.
Any thoughts appreciated.