Difference between revisions of "GDB"

From ReactOS Wiki

Revision as of 19:15, 10 January 2006

Introduction

This HOWTO describes the installation and setup of GDB versions as kernel debuggers for ReactOS. It focuses on the CygWin GDB since it comes with a GUI in the form of the "insight" executable; however, most of the information should be applicable to other GDB versions too. Note that it is also possible to debug user mode code using this setup: you can jump between kernel mode and user mode.

Hardware setup

Since we're using the "remote" capabilities of GDB, we have to make a serial connection from the host (where GDB is running) to the target (where ReactOS is running), using a null modem cable. A cable with the following layout works (DB9 connectors):

  • On each connector, tie 1, 4 and 6 (CD, DTR, DSR) together
  • Cross 2 and 3 from one connector to 3 and 2 on the other one (Rx, Tx)
  • Connect 5 on both connectors (GND)
  • Cross 7 and 8 from one connector to 8 and 7 on the other one (RTS, CTS)

The target machine can be another computer, or it can be a VMware virtual machine if your computer has two serial ports. To use a VMware virtual machine, connect the null modem cable between the two serial ports of your computer and configure VMware to connect the COM port of the virtual machine to one of your physical serial ports.

Installing CygWin

Download http://www.cygwin.com/setup.exe and start it. You can use default options throughout; just make sure you select "gdb" from the "Devel" category in the package selection dialog.
Although the standard CygWin gdb/insight should work, there is a slightly modified version available at http://www.reactos.nl/downloads/gdb_reactos.zip
Changes in the modified version:

  • Fix to allow hardware watchpoints for remote i386 targets
  • Removed a check to allow stack back tracing from kernel mode back into user mode
  • Mark insight.exe as a Windows subsystem executable (as opposed to a console subsys executable)

You can download the .zip file and replace the CygWin gdb.exe, gdbtui.exe and insight.exe (if you used defaults they will be located in C:\cygwin\bin) with the versions in the .zip file. If you use the modified version, you still need to install the CygWin gdb package, since it contains some support files that are not included in the .zip file.
For instructions on how to build the modified version from source see the end of this document.

Preparing ReactOS

At the moment, the GDB remote stub in ReactOS is hardcoded to use COM2. If you need to make it use COM1, edit reactos\ntoskrnl\kd\wrappers\gdbstub.c, search for GdbPortInfo and make the appropriate change (2 -> 1) there. Normally ReactOS is built with -Os optimization, which means the compiler can rearrange code and keep variables in registers. Both make stepping through the source and inspecting variables harder, so I recommend building without optimization by disabling the lines

 <compilerflag>-Os</compilerflag>
 <compilerflag>-Wno-strict-aliasing</compilerflag>
 <compilerflag>-ftracer</compilerflag>
 <compilerflag>-momit-leaf-frame-pointer</compilerflag>
 <compilerflag>-mpreferred-stack-boundary=2</compilerflag>

in reactos/ReactOS.xml.
GDB needs the "stabs" debug info which is normally stripped from the executables during build. To prevent this from happening, set an environment variable "ROS_BUILDNOSTRIP" to the value "yes". Then do a "make clean" and a "make".
If you check e.g. output-i386\ntoskrnl, you'll see a "ntoskrnl.exe" and a "ntoskrnl.nostrip.exe". The "ntoskrnl.exe" version is the one that is installed in the ReactOS image, the ntoskrnl.nostrip.exe is there just for the benefit of GDB. "make install" knows it should use "ntoskrnl.exe".
Change your freeldr.ini file to activate GDB debugging. The magic word is "Options=/DEBUGPORT=GDB". Personally I also like to capture the debug log even when I'm using GDB, so I have "Options=/DEBUGPORT=COM1 /DEBUGPORT=GDB". The virtual COM1 port from my VMware machine is then redirected to a file, the virtual COM2 port to a physical port.

Starting a debug session

You're all set to start a debug session now. Boot ReactOS and in freeldr select the entry with the "/DEBUGPORT=GDB" option. Very early on in the boot process, the screen should display "Waiting for GDB to attach".
This is your cue to start Insight (insight.exe from c:\cygwin\bin). Open its console window (Console from the View menu, Ctrl+N shortcut or the "C:\" button in the button bar). You now need to tell GDB where your source is located. Since Insight is a CygWin app, you have to do so in "CygWin notation", e.g. if the root of your source tree is "H:\ros\reactos", you will specify "/cygdrive/h/ros/reactos". Give the following commands in the command window (adjusted for your own situation):

 directory /cygdrive/h/ros/reactos
 symbol-file /cygdrive/h/ros/reactos/output-i386/ntoskrnl/ntoskrnl.nostrip.exe

The first tells GDB the location of the source tree, the second tells it to load symbols from the ntoskrnl.nostrip.exe executable. Now, tell GDB that we're debugging remotely:

 set remotebaud 115200
 target remote /dev/ttyS0

(The 115200 baud rate is again hardcoded; you can change it in reactos/ntoskrnl/kd/wrappers/gdbstub.c.)
/dev/ttyS0 is the CygWin equivalent of COM1; if your setup is such that GDB should be using COM2, substitute /dev/ttyS1.
These commands should produce a message telling you that you're currently in DbgBreakPointWithStatus@4(). At this point GDB is active and you can use e.g. the "where" command to get a stack backtrace, although I personally use the Insight Stack view more than the "where" command. Note that the status of the program is listed as "Program not running. Click on run icon to start.", but that's a lie. You need to use the "continue" (or just "c") command to continue execution. When you give the continue command, booting ReactOS should proceed normally. GDB will stay out of the way as long as no breakpoint is hit and no exception occurs. You can trigger activation of GDB by typing a "K" while holding down the "Tab" key in ReactOS.
The initial four commands given above are commands you will need to give on every invocation, so it's easier to store them in a .gdbinit file. I have my .gdbinit file stored in C:\cygwin\bin and use a shortcut with C:\cygwin\bin\insight.exe as target and C:\cygwin\bin as start directory.

Making GDB useful

When ReactOS is waiting for GDB to attach, only ntoskrnl and hal are loaded. This means that at that point you can only set breakpoints in ntoskrnl and hal (and as it turns out it is not easy to set a breakpoint in hal). The easiest way around this is to just put a breakpoint in the component you want to debug by inserting a line

 __asm__("int $3\n");

in the source of the component and then recompiling/installing it. When this breakpoint is hit, GDB is activated. But there's a slight problem: we only told GDB about the symbols in ntoskrnl, not about symbols in other components. This can be fixed by giving the command (example):

 add-symbol-file /cygdrive/h/ros/reactos/output-i386/lib/kernel32/kernel32.nostrip.dll

This tells GDB to additionally load the symbols for kernel32.dll, which is useful if you're debugging kernel32.dll. Incidentally, you can load these symbols during GDB startup, when the module itself hasn't been loaded by ReactOS yet. In other words, you can add these add-symbol-file commands to your .gdbinit file.
This (usually) works fine for user mode components. It kind of breaks when the .DLL needs to be relocated by ReactOS because its memory space overlaps with the memory space of another component. Unless you tell it otherwise, GDB will assume a component is loaded at the address specified in the PE header of the file. If the component is actually loaded at a different address, you'll have to inform GDB of this.
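The address GDB assumes by default is the ImageBase field stored in the PE optional header of the file. As a minimal sketch, assuming a 32-bit (PE32) image that has already been read into memory, the field can be located as follows. The function name pe32_image_base is hypothetical and not part of ReactOS or GDB; the offsets are those of the standard PE32 layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: extract the preferred load address (ImageBase)
 * from a PE32 image held in memory.
 * Returns 0 on success, -1 if the buffer does not look like a PE32 image.
 */
int pe32_image_base(const unsigned char *image, size_t size,
                    uint32_t *image_base)
{
    uint32_t pe_offset;
    size_t optional_header;

    /* DOS header: "MZ" magic, e_lfanew stored at offset 0x3c. */
    if (size < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return -1;

    pe_offset = read_le32(image + 0x3c);

    /* Need signature (4) + COFF header (20) + 32 bytes of optional header. */
    if (pe_offset > size || size - pe_offset < 4 + 20 + 32)
        return -1;

    if (image[pe_offset] != 'P' || image[pe_offset + 1] != 'E' ||
        image[pe_offset + 2] != 0 || image[pe_offset + 3] != 0)
        return -1;

    optional_header = (size_t)pe_offset + 4 + 20;

    /* Optional header magic 0x010b identifies PE32 (as opposed to PE32+). */
    if (image[optional_header] != 0x0b || image[optional_header + 1] != 0x01)
        return -1;

    /* In a PE32 optional header, ImageBase sits at offset 28. */
    *image_base = read_le32(image + optional_header + 28);
    return 0;
}
```

If the loader places the module anywhere other than this preferred base, GDB has to be told the real address explicitly.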
This is especially true for kernel mode components. Only ntoskrnl has the correct load address (0x80000000) in its header; all other components are loaded in the first free available kernel memory slot. Even worse, this means that kernel components might be loaded at different addresses on a subsequent boot. To get around this, I use the following patch to ntoskrnl/ldr/loader.c:

Index: ntoskrnl/ldr/loader.c
===================================================================
--- ntoskrnl/ldr/loader.c       (revision 20628)
+++ ntoskrnl/ldr/loader.c       (working copy)
@@ -747,13 +747,26 @@

     /*  Allocate a virtual section for the module  */
     DriverBase = NULL;
+    if (0 == wcscmp(L"\\??\\C:\\ReactOS\\system32\\win32k.sys", FileName->Buffer))
+    {
+      DriverBase = (PVOID) 0xe8000000;
+    }
+    if (0 == wcscmp(L"\\SystemRoot\\system32\\drivers\\afd.sys", FileName->Buffer))
+    {
+      DriverBase = (PVOID) 0xe8100000;
+    }
+    if (0 == wcscmp(L"\\SystemRoot\\system32\\drivers\\tcpip.sys", FileName->Buffer))
+    {
+      DriverBase = (PVOID) 0xe8300000;
+    }
+
     DriverBase = MmAllocateSection(DriverSize, DriverBase);
     if (DriverBase == 0)
     {
         CPRINT("Failed to allocate a virtual section for driver\n");
         return STATUS_UNSUCCESSFUL;
     }
-    DPRINT("DriverBase for %wZ: %x\n", FileName, DriverBase);
+    DPRINT1("DriverBase for %wZ: %x\n", FileName, DriverBase);

     /*  Copy headers over */
     memcpy(DriverBase, ModuleLoadBase, PENtHeaders->OptionalHeader.SizeOfHeaders);

This causes win32k.sys, afd.sys and tcpip.sys to be loaded by preference at 0xe8000000, 0xe8100000 and 0xe8300000 respectively. I then add the following lines to my .gdbinit:

 add-symbol-file /cygdrive/h/ros/reactos/output-i386/subsys/win32k/win32k.nostrip.sys 0xe8001000
 add-symbol-file /cygdrive/h/ros/reactos/output-i386/drivers/net/afd/afd.nostrip.sys 0xe8101000
 add-symbol-file /cygdrive/h/ros/reactos/output-i386/drivers/net/tcpip/tcpip.nostrip.sys 0xe8301000

Note the correspondence between the addresses in the loader patch and the addresses in the .gdbinit file: those in .gdbinit are 0x1000 higher than the ones in the loader patch. This is because in the loader we specify the start of the module, while in the .gdbinit file we need to specify the start of the code section. Since there's a 4096 (0x1000) byte header at the start of the module, the code section starts 4096 bytes after the start of the module.
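The arithmetic can be written out as a small C sketch. It assumes the 0x1000 byte header size stated above holds for the module in question; the function names code_section_address and format_add_symbol_file are hypothetical and exist only for this illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Size of the header that precedes the code section, as stated in the text. */
#define MODULE_HEADER_SIZE 0x1000u

/* Address to pass to add-symbol-file for a module loaded at module_base. */
uint32_t code_section_address(uint32_t module_base)
{
    return module_base + MODULE_HEADER_SIZE;
}

/*
 * Hypothetical helper: write the matching .gdbinit line into buf.
 * Returns the snprintf() result, i.e. the length of the full line.
 */
int format_add_symbol_file(char *buf, size_t size,
                           const char *symbol_path, uint32_t module_base)
{
    return snprintf(buf, size, "add-symbol-file %s 0x%08lx",
                    symbol_path,
                    (unsigned long)code_section_address(module_base));
}
```

For the afd.sys base address from the loader patch, code_section_address(0xe8100000) gives 0xe8101000, which matches the value used in the .gdbinit lines above.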

Known problems

Although debugging with GDB generally works quite nicely, there are still a few problems. The first is that you can't inspect the value of a static variable in kernel mode components. This is also caused by the relocation of the module. The "add-symbol-file <module> <code-address>" command tells GDB that the code has been relocated to a new address, but it doesn't tell GDB that the data has been relocated too. It's possible to tell GDB to relocate other sections too, but usually I'm too lazy to bother with that and just make a small change in the source code, setting a local (stack-based) pointer variable to the static data and then inspecting the value pointed to by that local variable.
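The local-pointer workaround might look roughly like this. All names are hypothetical and do not refer to actual ReactOS code; the volatile qualifier on the pointer is an addition of this sketch, intended to keep the variable alive in case the build is optimized after all.

```c
/* Hypothetical static variable that GDB cannot show correctly after relocation. */
static int HypotheticalCallCount;

/* Hypothetical routine in the component being debugged. */
int HypotheticalRoutine(void)
{
    /*
     * Local, stack-based pointer to the static data. Its value is computed
     * by the running (relocated) code, so it holds the real address.
     */
    int *volatile CallCountView = &HypotheticalCallCount;

    ++*CallCountView;

    /* Break here and inspect *CallCountView instead of the static itself. */
    return *CallCountView;
}
```

With a breakpoint inside the routine, "print *CallCountView" in the GDB console then shows the current value of the static variable.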
Another problem is that GDB is triggered by "first-chance" exceptions. This means that GDB will get control, even if the code itself is prepared to handle the exception. The GDB stub should probably be changed to default to second-chance exception handling only (where it would only get invoked if the code didn't handle the exception).
When a thread raises an exception, other threads are not blocked from running. This means that the other threads can cause changes in the machine state, basically making them able to change stuff behind your back. It also means that another thread can raise an exception, and then we have two threads competing for GDB's attention. To guard against this, a fast mutex is used to prevent a second thread from entering the GDB stub. Since this raises IRQL to APC_LEVEL, this has its own set of problems.
GDB "knows" about setjmp/longjmp. This is a problem, because it will try to set a breakpoint of its own on the longjmp() call in MSVCRT. That's fine when MSVCRT is actually loaded. If you're debugging a process which doesn't load MSVCRT, but you have "add-symbol-file msvcrt" in your .gdbinit, you'll run into trouble: GDB will try to set the breakpoint, fail, and as a result your "continue" command will fail. Workaround: don't put the add-symbol-file command for msvcrt in your .gdbinit unless you really need it.

Building the modified GDB version from source

To build from source, you need the following CygWin packages:

Category Devel

  • bison
  • flex
  • gcc
  • gdb (with source)
  • make
  • patchutils

Category Libs

  • libncurses-devel
  • tcltk (with source)

After downloading and installing these, download http://www.reactos.nl/downloads/gdb_reactos_patch.zip and unpack in /usr/src (C:\cygwin\usr\src if you unpack from Windows).
Start a Cygwin bash shell and check that /usr/src/tcltk.patch and /usr/src/gdb.patch are present. Then build the modified GDB using the following commands (from your bash shell):

cd /usr/src/tcltk-20030901-1; patch -p 0 < ../tcltk.patch
cd /usr/src/gdb-20041228-3; patch -p 0 < ../gdb.patch
ln -s ../tcltk-20030901-1/tk tk
ln -s ../tcltk-20030901-1/tcl tcl
ln -s ../tcltk-20030901-1/itcl itcl
cd /usr/src; mkdir gdb-build; cd gdb-build
../gdb-20041228-3/configure --with-prefix=/usr
make
cd gdb
strip gdb.exe; strip gdbtui.exe; strip insight.exe