
Laptop – hibernation problems

Since last weekend my Dell Latitude E6410 refuses to hibernate. It worked perfectly before, so I’m not sure what’s going on. The OS is Windows 7 Professional x64. It won’t hibernate when the lid is closed, when the power button is pressed, or even when the Hibernate option is selected from the Start menu. What happens is this: the screen switches off as it should, but immediately brightens a bit and the machine enters sleep mode instead of hibernating. It can then be woken up by mouse movement etc. The hiberfil.sys file is being created. I’ve looked into the advanced power options: hybrid sleep is off, and the actions for lid closing and the power button are both set to ‘hibernate’. The event log does not show anything interesting. Nothing changes when I change the power plan or log on as a different user.

Tried suspend and hibernate from a Backtrack installation: all works fine. I’m not in the mood to debug this problem now, too many other things to do…

JMP $FCE2

So, I’ve finally gathered some inner strength and created a blog. It became inconvenient to not have my own site when I wanted to publish something interesting. Well, I have had my own site or two for quite some time, but my laziness prevented me from making use of that. I relied on OpenRCE after I abandoned my old hand-crafted abomination of a site. Now I have this blog and I intend to use it! 😉

What can you expect to see here then? Posts about general programming (mostly C# nowadays), Windows internals, reverse engineering/software protections, rants, personal thoughts – the usual stuff. I’ll try to repost my old articles here if they are worth preserving.

(Yet another) Memory dumper [OpenRCE import]

I wrote a simple process memory dumper recently. Actually, it started as an in-memory string replacer, but I’m only posting the dumper part for now – the rest is in a terrible mess. 😉

The dumper saves all process memory to a single file. It uses NTFS sparse files though, so any non-committed memory range does not use physical disk space (sparse zeros). It also checks the target process’ ACL for entries that limit VM operations and can print a nice memory map. Nothing fancy, but just what I needed for some work.
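To illustrate the sparse-file idea, here is a minimal Python sketch. It is not the dumper's code; `write_dump` and `demo` are made-up names, and the regions are fake data rather than real process memory. Only committed ranges are written, each at a file offset equal to its address, and the gaps are skipped with a seek. Note one assumption: on NTFS a file has to be marked sparse first (the `FSCTL_SET_SPARSE` control code) before such gaps stop taking disk space, and this sketch does not do that. It only shows the layout.

```python
import os
import tempfile


def write_dump(path, regions):
    """Write committed regions at file offsets equal to their addresses.

    regions: iterable of (address, data) pairs, committed ranges only.
    Gaps between regions are never written; they read back as zeros.
    Returns the resulting file size.
    """
    end = 0
    with open(path, "wb") as f:
        for address, data in sorted(regions):
            f.seek(address)   # jump over the non-committed gap
            f.write(data)
            end = max(end, address + len(data))
        f.truncate(end)
    return end


def demo():
    # Two fake committed pages with a large gap between them.
    regions = [(0x10000, b"A" * 0x1000), (0x50000, b"B" * 0x1000)]
    fd, path = tempfile.mkstemp(suffix=".mem")
    os.close(fd)
    try:
        size = write_dump(path, regions)
        with open(path, "rb") as f:
            f.seek(0x20000)   # an address inside the gap
            gap = f.read(16)
        return size, gap
    finally:
        os.remove(path)
```

Because file offsets equal virtual addresses, an address from the memory map can be looked up in the dump directly, with no translation.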

It’s officially 32-bit only (DWORDs for addresses etc), but seems to somewhat work with 64-bit processes. I’ll do a proper 64-bit version later (maybe ;)).

Sample output:

c:\code\MemoryDump\Release>MemoryDump.exe explorer.exe v
 Searching for target process...
 Failed to open process 0x0: 0x57
 Failed to open process 0x4: 0x5
 [...]
 Checking target process' ACL for problematic entries...
 Opened \Device\HarddiskVolume3\Windows\explorer.exe as PID 0xb30
 Target process suspended, 31 threads
 Proceeding with memory dump

 Address   Size     Type    State   Protect
    10000:    10000 MAPPED  COMMIT  READ&WRITE
    20000:     2000 MAPPED  COMMIT  READONLY
    22000:     e000 0       FREE    NOACCESS
    30000:     4000 MAPPED  COMMIT  READONLY
    34000:     c000 0       FREE    NOACCESS
    40000:     2000 MAPPED  COMMIT  READONLY
    42000:     e000 0       FREE    NOACCESS
    50000:     1000 PRIVATE COMMIT  READ&WRITE
    51000:     f000 0       FREE    NOACCESS
    60000:    10000 PRIVATE COMMIT  READ&WRITE
    70000:     7000 MAPPED  COMMIT  READONLY
    77000:     9000 0       FREE    NOACCESS
 [...]
 77610000:     3000 IMAGE   COMMIT  READONLY
 77613000:  79cd000 0       FREE    NOACCESS
 7efe0000:     5000 MAPPED  COMMIT  READONLY
 7efe5000:    fb000 MAPPED  RESERVE 0
 7f0e0000:   f00000 PRIVATE RESERVE 0
 7ffe0000:     1000 PRIVATE COMMIT  READONLY
 7ffe1000:     f000 PRIVATE RESERVE 0

 Process resumed. Memory dumped to 2864.mem

Get source & binary here

Kernel debugger vs user mode exceptions [OpenRCE import]

A kernel debugger is a nice and nifty tool that allows us to do things not otherwise possible. Total control over the debugged OS and all its processes is the main reason to use it. However, there are some hiccups and obstacles that may disrupt our work. One of the most common is intercepting user-mode exceptions with a kernel-mode debugger.

Let’s assume we have windbg connected to the debuggee OS as a kernel-mode debugger. What can we do to catch user-mode exceptions that interest us? First, there is the ‘Debug | Event Filters’ menu (or the sx* commands) that controls the debugger’s behavior when it encounters an exception in debugged code. In short, the ‘Execution – Enabled’ option tells the debugger to break on the specific exception. There is a catch though – it only works for kernel-mode code ‘out of the box’. That is, if we enable breaks on ‘Illegal instruction’ and run some user-mode program on the debugged OS that generates it, windbg won’t break. Why? Well, we’re in the kernel debug mode after all.

How to make it work then? It’s pretty simple. All NT-based Windows systems support the ‘Global Flags’ debugging mechanism in the kernel, which is a collection of system-wide debugging flags. From within windbg we can access it using the ‘!gflag’ extension command. One of the flags is ‘Stop on exception’, which means the kernel debugger will be notified not only of kernel-mode exceptions, but also of user-mode ones. Neat. To activate it, use the ‘!gflag +soe’ windbg command.
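Under the hood the global flags are just bits in one 32-bit value, and ‘+soe’ sets the lowest one. The sketch below models that bit arithmetic in Python. The abbreviations and values are the ones listed in the GFlags flag table (only a small subset is included); the helper `apply_gflag` is a made-up name and has nothing to do with how windbg implements the command.

```python
# A few global flag bits from the GFlags flag table (subset only).
GLOBAL_FLAGS = {
    "soe": 0x00000001,  # Stop on exception
    "sls": 0x00000002,  # Show loader snaps
    "htc": 0x00000010,  # Enable heap tail checking
    "hfc": 0x00000020,  # Enable heap free checking
    "hpa": 0x02000000,  # Enable page heap
}


def apply_gflag(value, command):
    """Apply a '+abbr' or '-abbr' token to a global flags value.

    Hypothetical helper: mimics the effect of '!gflag +soe' on the
    flags value, nothing more.
    """
    sign, abbr = command[0], command[1:]
    bit = GLOBAL_FLAGS[abbr]
    if sign == "+":
        return value | bit
    if sign == "-":
        return value & ~bit
    raise ValueError("command must start with '+' or '-'")


flags = apply_gflag(0, "+soe")      # flags now has bit 0 set
flags = apply_gflag(flags, "+htc")  # other flags are independent bits
```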

Now all is well, we can see that windbg breaks on every exception in user-mode code. Or does it? There is still one special case that evades our cleverly laid traps. If user-mode program A is being debugged (using the user-mode Debug API) by user-mode program B, we (windbg running as a kernel-mode debugger) won’t get exceptions coming from program A – program B will get them instead. It’s a bit counter-intuitive, as one would think that a kernel-mode debugger should receive every exception before user-mode debuggers. That isn’t the case, and it seems to be a design decision by Microsoft. All is not lost though – we can still force windbg to receive each and every exception before it gets to any user-mode debugger in the debugged OS.

To learn how to do that, we need to dive deep into the Windows kernel function responsible for exception dispatching – KiDispatchException. This is the ‘main’ code responsible for deciding what to do with an exception that was encountered. It services both kernel-mode and user-mode exceptions, first- and second-chance ones, and most importantly – decides whether to notify the kernel debugger about the event or not. Not all events are forwarded to kd (kernel debugger), as we’ve learned before. But because we are in control of the target system, we can modify the KiDispatchException routine to do our bidding, that is, to route ALL exceptions to the kernel debugger first.
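The routing rules described so far can be summed up in a toy model. It is a sketch of the behavior this post describes, not a transcription of KiDispatchException: the function name, its parameters and the returned labels are all invented for illustration, and second-chance handling is left out.

```python
def first_receiver(user_mode, stop_on_exception, has_user_debugger,
                   patched=False):
    """Return who sees a first-chance exception first (toy model).

    user_mode          exception was raised in user-mode code
    stop_on_exception  the 'soe' global flag is set
    has_user_debugger  the process is being debugged via the Debug API
    patched            the dispatcher was patched to always ask kd first
    """
    if patched:
        return "kd"                 # the patch: kd always goes first
    if not user_mode:
        return "kd"                 # kernel-mode code works out of the box
    if has_user_debugger:
        return "user debugger"      # program B wins over kd
    if stop_on_exception:
        return "kd"                 # what '!gflag +soe' buys us
    return "SEH handlers"           # kd is never told


# The special case from the text: A debugged by B, soe enabled.
unpatched = first_receiver(True, True, True)
patched = first_receiver(True, True, True, patched=True)
```

The last two calls show the one case the global flag does not cover and what the patch changes about it.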

The exact details of the patch vary between systems, but the structure of the KiDispatchException function is pretty much the same. Using IDA to reverse engineer the kernel and studying the Windows Research Kernel or ReactOS sources certainly helps. Disassembly of the original KiDispatchException function along with the patch point from two Windows systems is provided below – 32-bit Windows XP Pro and 64-bit Windows 7 with all updates as of 2010-07-14. Modifying other kernels is left as an exercise to the reader. 🙂

Windows 7 64-bit
Windows XP 32-bit

VC++ asm intrinsics [OpenRCE import]

Microsoft’s Visual C++ supports less and less inline assembly in later versions, and none at all on the x64 platform. However, it provides a hefty number of intrinsics that are basically equivalents of single instructions.

http://msdn.microsoft.com/en-us/library/x8zs5twb.aspx

Handy reference if you like writing low-level but somewhat portable code.

There are also architecture-specific intrinsics:
x86
x64

Self-modifying TLS callbacks [OpenRCE import]

A simple yet not widely known trick. If your PE image has TLS callbacks, these callbacks can alter the TLS table while executing. That means you can have one callback at the start, but if this callback adds some other callbacks, those will execute as well. There are a few interesting possibilities, because the PE loader doesn’t cache the TLS table at the beginning of image load. 🙂
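Why this works is easy to show with a toy model. The Python sketch below is only an analogy: the real TLS callback table is a null-terminated array of function pointers inside the image, and every name here is made up. The first function walks the table live, which is the behavior described above; the second takes a snapshot first, which is what a caching loader would do.

```python
def run_callbacks(table):
    """Walk the table live, re-checking its end after every call."""
    called = []
    index = 0
    while index < len(table):
        callback = table[index]
        called.append(callback.__name__)
        callback(table)      # the callback may grow the table
        index += 1
    return called


def run_callbacks_cached(table):
    """Hypothetical caching loader: snapshot the table up front."""
    called = []
    for callback in list(table):
        called.append(callback.__name__)
        callback(table)
    return called


def hidden(table):
    # Not present in the table until 'first' adds it.
    pass


def first(table):
    if hidden not in table:
        table.append(hidden)


live = run_callbacks([first])
cached = run_callbacks_cached([first])
```

With the live walk both `first` and `hidden` run; with the snapshot only `first` does, so a static look at the table before execution would miss the second callback.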

Sample code

Asm code

Non-continuable exception trick [OpenRCE import]

I haven’t seen this before in public, but it’s possible I’m not the first one who researched this subject. I implemented similar code about a year ago in my “ever unfinished” crackme, but since I doubt I’ll finish the crackme, here it goes.

The idea revolves around non-continuable exceptions, that is, exceptions with the EXCEPTION_NONCONTINUABLE flag set in the exception record. Normally, if your SEH procedure gets such an exception, you’re basically screwed: you can’t return the ‘continue execution’ status, and your process is going to be mercilessly killed. If you try to continue, you will get STATUS_NONCONTINUABLE_EXCEPTION thrown by the Windows exception dispatcher – there is no way out. Or is there? 😉

What if we patch or hook the Windows exception dispatcher (in our process only) and just clear the noncontinuable bit if it’s present before dispatching the exception down to SEH? It turns out that it works as expected – we can now escape and continue even after an originally non-continuable exception. Furthermore, debuggers don’t seem to like it much. Olly simply refuses to continue even if we clear the noncontinuable flag (but Olly can’t even properly handle hardware BPs set in the code, so who cares ;). Windbg fares a bit better, but still falls into an infinite loop (maybe more experienced users could overcome that). IDA doesn’t seem to handle the “rethrow” of the division by zero exception at the end properly (but I hardly use IDA’s debugger, so others may have more luck). Also, I’ve heard it doesn’t run properly on WINE, but more tests would be nice. 🙂
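The logic of the trick fits in a few lines of Python. This is a toy model and not the FASM code: the flag value (0x1) and the two status codes are the real Windows constants, while the record class, both dispatcher functions and the returned strings are invented for the example.

```python
EXCEPTION_NONCONTINUABLE = 0x1
STATUS_NONCONTINUABLE_EXCEPTION = 0xC0000025
STATUS_INTEGER_DIVIDE_BY_ZERO = 0xC0000094
CONTINUE_EXECUTION = "continue"


class ExceptionRecord:
    def __init__(self, code, flags):
        self.code = code
        self.flags = flags


def dispatch(record, handler):
    """Toy dispatcher: refuses to continue a non-continuable exception."""
    if handler(record) == CONTINUE_EXECUTION:
        if record.flags & EXCEPTION_NONCONTINUABLE:
            return STATUS_NONCONTINUABLE_EXCEPTION
        return "resumed"
    return "unhandled"


def hooked_dispatch(record, handler):
    """The hook: strip the flag, then fall through to the original."""
    record.flags &= ~EXCEPTION_NONCONTINUABLE
    return dispatch(record, handler)


def always_continue(record):
    # A handler that asks to resume no matter what it was given.
    return CONTINUE_EXECUTION


plain = dispatch(
    ExceptionRecord(STATUS_INTEGER_DIVIDE_BY_ZERO, EXCEPTION_NONCONTINUABLE),
    always_continue)
hooked = hooked_dispatch(
    ExceptionRecord(STATUS_INTEGER_DIVIDE_BY_ZERO, EXCEPTION_NONCONTINUABLE),
    always_continue)
```

The handler is identical in both runs; only the dispatcher in front of it differs, which is why the hook has to sit in the dispatcher rather than in SEH.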

Anyway, it’s quite fun code, maybe it will be useful to someone. Below is the FASM source, and here is the source+exe.

Asm code

Null pointer dereference in win32k [OpenRCE import]

Totally forgot about this. Some time ago I accidentally found an unhandled exception condition in kernel-mode GDI. Microsoft is aware of this, but they still haven’t fixed it. Well, it may not be a security issue, but who knows 😉

Offending function: win32k!NtUserGetDCEx or its user-mode wrapper, GetDCEx.
Crash condition: call it with all 0s before any desktops are created (I’m not 100% sure of this, but it seems to be the case).
Sample scenario: Create a DLL that calls GetDCEx(0,0,0) in DllMain. MessageBox works too (that’s how I first stumbled on it). Add the DLL to the AppInit_DLLs registry key. Reboot. Upon next system start the DLL will be mapped into winlogon’s memory and the fatal function called before any windows are present. Boom, BSOD.

PROCESS_NAME:  winlogon.exe

 FAULTING_IP: 
 win32k!NtUserGetDCEx+29
 bf83c00f 8b4904          mov     ecx,dword ptr [ecx+4]

 EXCEPTION_RECORD:  ffffffff -- (.exr 0xffffffffffffffff)
 ExceptionAddress: bf83c00f (win32k!NtUserGetDCEx+0x00000029)
    ExceptionCode: c0000005 (Access violation)
   ExceptionFlags: 00000000
 NumberParameters: 2
    Parameter[0]: 00000000
    Parameter[1]: 00000004
 Attempt to read from address 00000004

 ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".

 READ_ADDRESS:  00000004 

 BUGCHECK_STR:  ACCESS_VIOLATION

 DEFAULT_BUCKET_ID:  NULL_CLASS_PTR_DEREFERENCE

 LAST_CONTROL_TRANSFER:  from 8053ca28 to bf83c00f

 STACK_TEXT:  
 f88d9d50 8053ca28 00000000 00000000 00000003 win32k!NtUserGetDCEx+0x29
 f88d9d50 7c90eb94 00000000 00000000 00000003 nt!KiFastCallEntry+0xf8
 0006fb24 7e41e881 7e43a383 00000000 00000000 ntdll!KiFastSystemCallRet
 0006fddc 7e43a284 0006ff38 00000005 00000004 USER32!NtUserGetDCEx+0xc
 0006ff2c 7e4661d3 0006ff38 00000028 00000000 USER32!MessageBoxWorker+0x2ba
 0006ff84 7e4505f3 00000000 1000d9e8 1000b370 USER32!MessageBoxTimeoutW+0x7a
 0006ffa4 7e46634f 00000000 1000d9e8 1000b370 USER32!MessageBoxExW+0x1b
 0006ffc0 1000105f 00000000 1000d9e8 1000b370 USER32!MessageBoxW+0x45

Here’s the appropriate disassembly:

 bf83c003 a138a59abf      mov     eax,dword ptr [win32k!gptiCurrent (bf9aa538)]
 bf83c008 f6404b20        test    byte ptr [eax+4Bh],20h
 bf83c00c 8b483c          mov     ecx,dword ptr [eax+3Ch]
 bf83c00f 8b4904          mov     ecx,dword ptr [ecx+4] ds:0023:00000004=????????
 bf83c012 8b7108          mov     esi,dword ptr [ecx+8]

Interestingly, NtUserGetDC isn’t just a wrapper to the …Ex function. It has different code and isn’t vulnerable to this.

Dancing with exceptions [OpenRCE import]

During development of my unfinished crackme I encountered several interesting discrepancies in exception handling on various OS/VM configurations.

The first of them caused my 32-bit code to run fine on 32-bit XP but crash on 64-bit XP. It was caused by a difference in the behavior of (Rtl)RaiseException: on 32-bit XP it captured the full thread context, but on 64-bit XP it didn’t – debug registers were missing. And that caused problems, because I wanted to play with DRs in my SEH. 😉

The second “trick” is the behavior of the system exception dispatcher regarding DR6 and its status bits. Intel docs say that the CPU itself never clears these bits after a hardware breakpoint occurs. 32-bit Windows clears them though, but it also depends on whether the OS is running in a VM or not…
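For reference, the status bits in question are B0–B3 (bits 0–3, one per hardware breakpoint), BD (bit 13), BS (bit 14, single step) and BT (bit 15, task switch), as laid out in the Intel manuals. The Python sketch below decodes them and shows the mask a handler would apply to clear them by hand, which removes the dependence on whether the OS or the VM has already done so. The helper names are made up for the example.

```python
# DR6 status bits per the Intel SDM; other bits are reserved.
DR6_BITS = {
    0: "B0", 1: "B1", 2: "B2", 3: "B3",  # breakpoint condition detected
    13: "BD",                            # debug register access detected
    14: "BS",                            # single step
    15: "BT",                            # task switch
}

# Mask covering every status bit listed above (0xE00F).
STATUS_MASK = sum(1 << bit for bit in DR6_BITS)


def decode_dr6(value):
    """Return the names of the status bits set in a DR6 value."""
    return [name for bit, name in sorted(DR6_BITS.items())
            if value & (1 << bit)]


def clear_status(value):
    """Clear all status bits, leaving the reserved bits untouched."""
    return value & ~STATUS_MASK


stale = 0xFFFF4FF1              # B0 and BS left over from earlier events
names = decode_dr6(stale)
cleaned = clear_status(stale)
```

If the bits are preserved, a handler that trusts DR6 can mistake a stale B0 for a fresh hit, so clearing them explicitly before returning is the safe choice.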

The third glitch is similar to the old “prefetch queue” tricks. I didn’t have time to investigate it much – maybe its mechanism is completely different, but it surely looks familiar. Let’s say we execute a UD2 instruction or cause an exception in some other way. Then, in the SEH handler, we set a hardware breakpoint on the instruction immediately following the one that caused the exception. What will happen after returning from SEH? It depends… On some systems the bpx will be triggered, on some not… Adding a single NOP between the exception trigger and the bpx target ensures that the bpx always hits.

The following table contains the results of my experiments in various environments. The number after the OS name indicates whether it’s the 32- or 64-bit version; a number after the slash means that it’s running in VMware Server on a 32- or 64-bit XP host.

2k 32 = 32-bit 2k on real machine
xp 32/64 = 32-bit XP inside VM hosted on 64-bit XP

Most OSes were fully updated.

The dr6 column shows whether the OS clears the DR6 status bits after a hardware breakpoint hits.
The context column shows whether (Rtl)RaiseException captures the full CPU context or misses the debug registers.
The prefetch column shows the last test – whether the tricky breakpoint gets hit without the “padding” NOP or not.

 os           dr6       context  prefetch
 ----------------------------------------
 2k    32     clear     full     ?
 2k    32/32  preserve  full     hit
 2k    32/64  clear     full     hit
 xp    32     clear     full     hit
 xp    32/32  preserve  full     hit
 xp    32/64  preserve  full     hit
 xp    64     clear     partial  miss
 2k3   32     clear     partial  ?
 2k3   32/32  preserve  partial  miss
 2k3   32/64  preserve  partial  miss
 vista 32     clear     partial  ?
 vista 32/64  preserve  full     hit
 vista 64/64  clear     full?    hit

vista 64/64: In the “context capturing” test I got different results today than some time ago. Not sure what caused the difference – maybe the earlier test was run on a pre-release build, maybe some updates…

FASM code for all examples