In this guest blog, researcher Marcin Wiązowski details CVE-2023-21822, a Use-After-Free (UAF) in win32kfull that could lead to privilege escalation. The bug was reported through the ZDI program and later patched by Microsoft. Marcin has graciously provided this detailed write-up of the vulnerability, an examination of how it could be exploited, and a look at the patch Microsoft released to address the bug.
In the Windows kernel, there are three APIs intended for general use by device drivers for the purpose of creating bitmaps: EngCreateBitmap, EngCreateDeviceBitmap, and EngCreateDeviceSurface. Each of these APIs returns a bitmap handle. If the caller wants to perform drawing operations on the bitmap, it must first lock the bitmap by passing its handle to EngLockSurface. EngLockSurface increases the bitmap’s reference counter and returns a pointer to a corresponding SURFOBJ record. SURFOBJ is a structure located in kernel memory containing all the information regarding the bitmap, such as the bitmap’s dimensions, pixel format, a pointer to the pixel buffer, and so forth. We’ll take a closer look at the SURFOBJ structure later. After calling EngLockSurface, the obtained SURFOBJ pointer can be passed to various drawing APIs such as EngLineTo and EngBitBlt. See winddi.h for the complete list of these drawing APIs. After the caller is finished with drawing operations, it should call EngUnlockSurface. At this point, the bitmap’s reference counter decreases to zero again, and the caller is no longer allowed to use the SURFOBJ pointer. Finally, the caller can delete the bitmap by calling EngDeleteSurface on its handle. Typical usage of these APIs is shown below:
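The lock, draw, unlock, delete lifecycle described above can be sketched as a small portable C model. This is only an illustration of the reference-counting contract, not driver code: ModelBitmap and the model_ functions are hypothetical stand-ins for the real winddi.h APIs, and the rule that deletion is refused while a lock is held is a modelling assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel bitmap; not the real win32k layout. */
typedef struct {
    uint32_t ref_count;  /* incremented by lock, decremented by unlock */
    int      deleted;    /* set once the bitmap has been destroyed */
} ModelBitmap;

/* Models EngLockSurface: take a reference and hand back the object. */
static ModelBitmap *model_lock(ModelBitmap *bmp)
{
    if (bmp->deleted)
        return NULL;
    bmp->ref_count++;
    return bmp;
}

/* Models EngUnlockSurface: drop the reference taken by model_lock. */
static void model_unlock(ModelBitmap *bmp)
{
    assert(bmp->ref_count > 0);
    bmp->ref_count--;
}

/* Models EngDeleteSurface. Assumption: deletion is refused while locked. */
static int model_delete(ModelBitmap *bmp)
{
    if (bmp->deleted || bmp->ref_count != 0)
        return 0;
    bmp->deleted = 1;
    return 1;
}
```

In this model, a caller that locks, draws, unlocks, and then deletes sees the counter go 0, 1, 0 before the delete succeeds.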
All APIs discussed above are exported from the win32k.sys kernel-mode module. Note, though, that the functions in win32k.sys are only wrappers, and the implementations are in win32kbase.sys and win32kfull.sys.
Many years ago, both display drivers and printer drivers worked in kernel mode, but since Windows Vista, printer drivers work only in user mode (hence User-Mode Printer Drivers, or UMPD). Two important facts emerge from this change:
-- During printing operations, the kernel must now perform some callbacks to user mode to call the appropriate user-mode printer driver.
-- To allow printer driver code to run in user mode, some kernel APIs have now been made available from user mode.
As a result, all the kernel APIs described above now have user-mode counterparts, exported from the gdi32.dll user-mode module. Let’s try to execute the same code shown above, but, this time, from user mode:
Note the reference counter values shown in the comments. The value is still zero after locking the bitmap. Why is this?
Kernel-mode code is always trusted, while user-mode code is always untrusted. So, now that printer drivers execute in user mode, they are considered untrusted and potentially malicious.
Suppose that the user-mode EngLockSurface call increased the bitmap’s reference counter in the same way that the kernel-mode version does. An attacker, acting as a user-mode printer driver, could call EngLockSurface many times in a loop on a bitmap in order to overflow the bitmap’s reference counter, causing it to wrap around to zero. Then the bitmap could be deleted, leading to a use-after-free on the bitmap.
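The wrap-around risk is easy to see with a deliberately tiny counter. The sketch below uses an 8-bit field purely so the overflow happens after a few hundred iterations; the width of the real counter is not stated here, and TinyBitmap and locks_until_wrap are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* 8-bit counter chosen for demonstration only; the real width is not modelled. */
typedef struct {
    uint8_t ref_count;
} TinyBitmap;

/* Take references in a loop, never releasing them, until the counter wraps
 * back to zero. Returns how many "locks" that took. */
static unsigned locks_until_wrap(TinyBitmap *bmp)
{
    unsigned locks = 0;
    do {
        bmp->ref_count++;   /* unsigned arithmetic wraps modulo 256 */
        locks++;
    } while (bmp->ref_count != 0);
    return locks;
}
```

Once the counter reads zero again, an object that is still referenced looks unreferenced, which is exactly the state that would allow a premature delete.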
For this reason, the Windows kernel has implemented a different approach. The EngLockSurface API is expected to return a pointer to the bitmap’s SURFOBJ record, and it does, but in user mode this is a user-mode copy of the “true”, kernel-mode SURFOBJ record. We can reconstruct this user-mode data structure as follows:
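A rough model of such a record, using only the three field names that appear in this write-up (magic, hsurf, so), might look like the following. The field order, the types, and the cut-down ModelSurfObj are assumptions made for illustration; only the magic value 0x554D534F, which spells "UMSO" in ASCII, is taken from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UMSO_MAGIC 0x554D534Fu  /* the bytes 'U', 'M', 'S', 'O' */

/* Cut-down stand-in for SURFOBJ; only two fields are modelled here. */
typedef struct {
    uintptr_t dhsurf;
    uintptr_t hsurf;
} ModelSurfObj;

/* Assumed layout: the field names come from the text, the order does not. */
typedef struct {
    uint32_t     magic;  /* expected to equal UMSO_MAGIC */
    uintptr_t    hsurf;  /* handle of the bitmap this record describes */
    ModelSurfObj so;     /* user-mode copy of the kernel-mode record */
} ModelUmso;

/* Recover the enclosing record from the pointer the caller was handed,
 * i.e. from the address of the embedded "so" field. */
static ModelUmso *umso_from_so(ModelSurfObj *so)
{
    return (ModelUmso *)((char *)so - offsetof(ModelUmso, so));
}
```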
The user-mode EngLockSurface implementation returns a pointer to the UMSO.so field, which is a copy of the true, kernel-mode SURFOBJ record, so that everything will work as expected. Internally, the user-mode EngLockSurface call jumps to its kernel-mode implementation win32kfull.sys!NtGdiEngLockSurface, where the user-mode UMSO record is allocated and filled in. In kernel mode, the “true”, kernel-mode EngLockSurface call is made on the bitmap, which is needed to access the bitmap’s SURFOBJ record so its data can be copied into the UMSO.so field. Afterwards, though, NtGdiEngLockSurface calls the kernel-mode EngUnlockSurface, which decreases the bitmap’s reference counter to zero again. This explains the observed reference counter values.
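The lock, copy, unlock sequence just described can be modelled in a few lines. This is a sketch under stated assumptions: the structures are simplified stand-ins, model_user_lock is a hypothetical name, and the real NtGdiEngLockSurface also allocates the record and validates the handle, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins; the width and height fields are illustrative only. */
typedef struct { uintptr_t hsurf; int32_t width, height; } ModelSurfObj;
typedef struct { uint32_t ref_count; ModelSurfObj so; } ModelBitmap;
typedef struct { uint32_t magic; uintptr_t hsurf; ModelSurfObj so; } ModelUmso;

/* Models the described sequence: lock in the kernel, snapshot the record
 * for user mode, then unlock again before returning. */
static void model_user_lock(ModelBitmap *bmp, ModelUmso *out)
{
    bmp->ref_count++;                            /* kernel-mode lock      */
    out->magic = 0x554D534Fu;
    out->hsurf = bmp->so.hsurf;
    memcpy(&out->so, &bmp->so, sizeof out->so);  /* copy handed to caller */
    bmp->ref_count--;                            /* kernel-mode unlock    */
}
```

After the call returns, the caller holds a snapshot while the bitmap's counter is back at zero, and later changes to the snapshot do not touch the kernel-side record.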
Once we call the user-mode EngLockSurface, we are allowed to pass its result (which is a pointer to the copied SURFOBJ data) to various drawing functions, such as EngLineTo or EngBitBlt. When corresponding calls are made from kernel mode, it works in a straightforward manner, but when calling from user mode, an additional layer is needed to translate the user-mode SURFOBJ pointers into true, kernel-mode pointers. So, for example, if the user-mode code calls gdi32.dll!EngLineTo, this will jump to the kernel-mode win32kfull.sys!NtGdiEngLineTo wrapper. The wrapper will obtain the bitmap’s true kernel-mode SURFOBJ record, so the kernel-mode win32kfull.sys!EngLineTo drawing handler can ultimately be executed.
How does the kernel obtain the needed kernel-mode SURFOBJ record? A SURFOBJ record contains sensitive data such as the bitmap’s pixel buffer pointer, so the kernel never relies on the contents of SURFOBJ records coming from user mode. Otherwise, there would be a security risk from malicious user-mode code that tampers with the contents of UMSO.so structures. Instead, inside the wrapper function (such as win32kfull.sys!NtGdiEngLineTo in the example above), the kernel verifies the UMSO.magic value, and then uses the UMSO.hsurf bitmap handle value to lock the bitmap by calling EngLockSurface. In this way, the kernel safely obtains the requested bitmap’s kernel-mode SURFOBJ record, which it can then pass to the appropriate kernel-mode win32kfull.sys!EngXXX drawing function.
The Vulnerability
The user-mode EngLockSurface function performs some validation on the supplied bitmap handle, meaning that not every kind of bitmap can be passed successfully to this call (we will discuss this in more detail later). But malicious user-mode code can now bypass this in any of these ways:
1) After making the EngLockSurface call, we can delete the already-validated bitmap and create some other bitmap with the same handle value. We could choose to create a bitmap of a kind that couldn’t be successfully passed to EngLockSurface.
2) After making the EngLockSurface call, we receive a pointer to a user-mode SURFOBJ record, which, as we already know, is a part of a UMSO record. So, we can overwrite the UMSO.hsurf field, setting it to the handle of any bitmap that we want. We can choose to set it to the handle of a bitmap that couldn’t be successfully passed to EngLockSurface.
3) Simplest of all, we could prepare a UMSO record from scratch, without making any EngLockSurface call first. All we need to do is allocate some user-mode memory, set UMSO.magic to 0x554D534F, and set UMSO.hsurf to the handle of a bitmap of our choice. The remaining part of this record (the UMSO.so field, containing the SURFOBJ record under normal circumstances) can be zeroed, as it will be disregarded by the kernel in any event.
Each of the three possibilities above will allow us to bypass the bitmap validation performed by the user-mode version of the EngLockSurface API.
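The gap between the two code paths can be captured in a toy model. Everything below is hypothetical scaffolding: the handle table, the screen_related flag, and the function names are invented for illustration, and the real validation criteria are only summarized in this write-up. What the sketch does reflect is the described pre-patch logic: the lock path validates the bitmap, whereas the drawing-wrapper path only checks the magic value and then trusts the handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UMSO_MAGIC 0x554D534Fu

typedef struct { uintptr_t hsurf; int screen_related; } KernelSurf;
typedef struct { uint32_t magic; uintptr_t hsurf; unsigned char so[32]; } ModelUmso;

/* Hypothetical handle table standing in for the kernel's handle manager. */
static KernelSurf g_table[] = {
    { 0x10, 0 },  /* ordinary bitmap       */
    { 0x20, 1 },  /* screen-related bitmap */
};

static KernelSurf *lookup(uintptr_t hsurf)
{
    for (size_t i = 0; i < sizeof g_table / sizeof g_table[0]; i++)
        if (g_table[i].hsurf == hsurf)
            return &g_table[i];
    return NULL;
}

/* Models the validated path taken by the user-mode EngLockSurface. */
static KernelSurf *model_user_lock(uintptr_t hsurf)
{
    KernelSurf *s = lookup(hsurf);
    return (s != NULL && !s->screen_related) ? s : NULL;
}

/* Models the pre-patch drawing wrapper: magic check, then trust hsurf.
 * The contents of u->so are never consulted. */
static KernelSurf *resolve_prepatch(const ModelUmso *u)
{
    if (u == NULL || u->magic != UMSO_MAGIC)
        return NULL;
    return lookup(u->hsurf);
}
```

In this model, a record built from scratch with the right magic value and a zeroed so field resolves to the screen-related bitmap even though the lock path would have rejected that handle.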
Now that we have seen that the validation can be bypassed, we must ask: what is the purpose of that validation, and what ramifications does it have for security? To answer this question, we must look at the SURFOBJ record definition. Some fields are publicly documented, while others have been reconstructed, as shown below:
The bitmap’s flags field is undocumented, but it is known to contain some documented HOOK_XXX flags found in the winddi.h header file. These flags tell the win32k subsystem which drawing operations should be handled by win32k itself, and which should instead be directed to a specialized device driver. The device driver is indicated by the bitmap’s hdev field.
For example, suppose we want to draw a line on some bitmap. We’ll call EngLineTo, passing a pointer to the bitmap’s SURFOBJ record. Internally, the kernel will convert the requested line into a more general drawing construct known as a “path” (which can be a sequence of lines and curves). It will then check if the bitmap’s SURFOBJ.flags field has the HOOK_STROKEPATH flag set. If this flag is not present, it will use the generic code for drawing (“stroking”) paths provided by win32kfull. If HOOK_STROKEPATH is present, though, the kernel will direct the drawing request to the device driver specified by the SURFOBJ.hdev field. The latter case, where possible, offers improved performance, as it allows individual device drivers to take advantage of accelerations offered by the specific hardware. For example, a graphics adapter may offer hardware-accelerated path stroking. Similarly, printer devices have specialized acceleration for outputting text.
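The routing decision for the line-drawing example reduces to a single bit test on the bitmap's flags. The sketch below is a simplified model: ModelSurfObj, Route, and route_stroke_path are hypothetical names, and MODEL_HOOK_STROKEPATH is a placeholder bit, so the real constant should be taken from winddi.h.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit for illustration; use the real HOOK_STROKEPATH from winddi.h. */
#define MODEL_HOOK_STROKEPATH 0x20u

typedef enum {
    ROUTE_GENERIC_WIN32K,  /* win32kfull's own path-stroking code   */
    ROUTE_DEVICE_DRIVER    /* driver selected by the bitmap's hdev  */
} Route;

/* Only the two fields that drive the decision are modelled. */
typedef struct {
    uint32_t  flags;
    uintptr_t hdev;
} ModelSurfObj;

/* Decide where a "stroke path" request for this bitmap would be sent. */
static Route route_stroke_path(const ModelSurfObj *surf)
{
    return (surf->flags & MODEL_HOOK_STROKEPATH) ? ROUTE_DEVICE_DRIVER
                                                 : ROUTE_GENERIC_WIN32K;
}
```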
So, if we prepare a bitmap that has a screen-related SURFOBJ.hdev value, and also has the appropriate HOOK_XXX flag set, and we pass it to one of the EngXXX drawing APIs, there is the possibility of reaching an entry point of a specialized display driver, working in kernel mode. This could be cdd.dll!DrvXXX in the single-monitor case or win32kfull.sys!MulXXX in the multi-monitor case (though there is not always a simple relationship between the requested functionality and the driver entry point ultimately called, as noted in the example above). The pointer to the bitmap’s SURFOBJ record will be passed as a parameter to the driver’s entry point.
Further note that some EngXXX APIs take not just one bitmap as a parameter, but two: a source bitmap and a destination bitmap. (Some optionally also take a mask bitmap, but that is not relevant here.) An example of such an API is EngBitBlt, which copies a rectangle of pixels from a source bitmap to a destination bitmap. APIs that work on two bitmaps use the SURFOBJ.flags and SURFOBJ.hdev values of the destination bitmap when determining the ultimate device driver to receive the call. Nonetheless, when the final driver’s entry point is called, both the source and destination bitmaps are passed to it.
Hence, a properly prepared, screen-related bitmap, when passed to some EngXXX API as the destination bitmap, allows us to reach a kernel-mode display driver, while also allowing an arbitrary bitmap of our choice to be passed as the source bitmap.
There is still no obvious security problem here, but let’s look at the SURFOBJ record definition once again. It contains a dhsurf field (not to be confused with the hsurf field discussed above). The win32k subsystem treats SURFOBJ.dhsurf as an opaque value. It is reserved for individual device drivers to use for their internal purposes. Setting this field on a new bitmap is easy: the EngCreateDeviceBitmap and EngCreateDeviceSurface bitmap creation APIs just take the dhsurf value as a parameter. Both the Canonical Display Driver (cdd.dll, used for single-monitor graphics output) and the multi-display driver (win32kfull.sys!MulXXX) expect to work only with their own bitmaps (bitmaps with SURFOBJ.dhsurf values set by that specific driver) rather than on arbitrary bitmaps created from user mode (or by other drivers). Internally, each of these drivers uses the SURFOBJ.dhsurf value as a pointer to a block of kernel-mode memory, containing private data owned by that driver.
But we can reach a kernel-mode display driver by passing a properly prepared destination bitmap to the EngXXX call, and we can also pass some arbitrary bitmap of our choice as the source bitmap to the same EngXXX call. This source bitmap can be an arbitrary bitmap we created, and its SURFOBJ.dhsurf value may point to arbitrary controllable memory. The kernel-mode display driver, such as the Canonical Display Driver, will work on this block of memory as if it were its own block of kernel-mode memory. This means “game over”.
For these reasons, the user-mode EngLockSurface implementation has validation to reject screen-related bitmaps that could be used to reach a kernel-mode display driver. But, thanks to the vulnerability described above, we can bypass this EngLockSurface validation easily. In fact, we can get away with not calling EngLockSurface at all, and just preparing the needed UMSO record from scratch instead, as we have explained.
Exploitation
We must first notice that user-mode EngXXX calls are intended to be used by user-mode printer drivers only, so most of these APIs will fail unless they are called during a callback from kernel to user mode for a printing operation. But this doesn’t complicate things too much: the user-mode part of the callback is implemented as a gdi32.dll!GdiPrinterThunk function, which is a public export from gdi32.dll. It’s enough to hook or patch this function and perform our main exploitation there. This function receives four parameters (the input buffer, the input buffer size, the output buffer, and the output buffer size), but we don’t need the parameters during our exploitation at all. (However, if you are interested in more details, see Selecting Bitmaps into Mismatched Device Contexts. In particular, see sections titled “User-Mode Printer Drivers (UMPD)” and “Hooking the UMPD implementation”.)
We first need to get a callback from the kernel to our hooked gdi32.dll!GdiPrinterThunk function. To achieve this, we need to initiate some printing operation. First we must locate an installed printer. There is at least one virtual printer installed by default on every Windows machine. We can locate installed printers using a call to the user-mode winspool.drv!EnumPrintersA/W API. Then we must create a printer-related device context:
This call will go down to kernel mode, which will then perform several callbacks to user mode again, so our hooked gdi32.dll!GdiPrinterThunk function will be invoked, exactly as we need. Our main exploitation phase starts here.
First, we need to obtain a bitmap with a screen-related SURFOBJ.hdev value and a useful HOOK_XXX flag set in its SURFOBJ.flags field. To obtain such a bitmap, we can create a window with proper parameters, obtain the window’s device context, and grab the underlying bitmap. The obtained bitmap will act as our destination bitmap:
We also need a source bitmap, with its SURFOBJ.dhsurf field pointing to controlled user-mode memory (our FakeDhsurfBlock):
Now we can prepare two UMSO records, one for the destination bitmap and one for the source bitmap:
At this point, we have everything that we need to make a malicious EngXXX call with our bitmaps. Our screen-related destination bitmap will have all the defined HOOK_XXX flags set, so we are free to choose any of the EngXXX APIs that accept two bitmaps:
Through reverse engineering the Canonical Display Driver or multi-display driver internals, we can learn how to prepare the user-mode FakeDhsurfBlock so that the call to the display driver yields exploitable memory primitives.
The Patch
As discussed earlier, each of the user-mode EngXXX drawing APIs (such as EngLineTo and EngBitBlt) calls its corresponding kernel-mode win32kfull.sys!NtGdiEngXXX wrapper, where, amongst other things, user-mode SURFOBJ pointers are converted to kernel-mode SURFOBJ pointers. Afterwards, a kernel-mode win32kfull.sys!EngXXX driver endpoint is called to perform the requested drawing operation.
Although it’s not related to our vulnerability, it’s worth mentioning that, for the duration of the gdi32.dll!GdiPrinterThunk user-mode callback, the kernel maintains a mapping of known user-mode SURFOBJ records to kernel-mode SURFOBJ records. When a user-mode printer driver passes a user-mode SURFOBJ pointer to some user-mode EngXXX call, the kernel will try to use the mapping to find the corresponding kernel-mode SURFOBJ pointer so it can be passed to the corresponding kernel-mode EngXXX call.
The mapping is prepared before the user-mode GdiPrinterThunk callback begins. This is because some bitmaps may be passed to the callback as parameters (though, during our exploitation, we made no use of the GdiPrinterThunk input data). However, this means that bitmaps “locked” later, that is, by calls to EngLockSurface made from inside the callback, are not present in the mapping.
Whenever a win32kfull.sys!NtGdiEngXXX wrapper receives a user-mode SURFOBJ pointer as a parameter and is not able to find it in the mapping, it assumes that the received SURFOBJ record is contained in a UMSO record (as its UMSO.so field).
Before the patch, such cases were directed to the internal win32kfull.sys!UMPDSURFOBJ::GetLockedSURFOBJ function, where the UMSO.magic value would be verified against the 0x554D534F value, and then the kernel-mode EngLockSurface call would be made on the UMSO.hsurf handle value, yielding the needed pointer to the “true”, kernel-mode SURFOBJ record, as discussed earlier.
As you may have noticed, the name GetLockedSURFOBJ is misleading, as it suggests that the bitmap is already locked. In reality, when coming from user mode, a bitmap’s reference counter is still zero. And as we saw above, a malicious user-mode printer driver may not have called EngLockSurface at all, but instead just prepared the needed UMSO record from scratch.
After the patch, the function name was changed to GetLockableSURFOBJ. A user-mode printer driver can still perform all the manipulations described above, but now GetLockableSURFOBJ considers the received bitmap handle (UMSO.hsurf) as untrusted. After using the UMSO.hsurf value to lock the bitmap in kernel mode, GetLockableSURFOBJ now once again performs the same bitmap validation that is performed when calling the user-mode EngLockSurface API. This validation is performed by calling win32kfull.sys!IsSurfaceLockable. In this way, screen-related bitmaps that could be used to reach the kernel-mode display driver from within the user-mode printer driver are now rejected by GetLockableSURFOBJ.
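The shape of the fix can be sketched with the same kind of toy model: check the magic value, lock by handle, then re-run the lockability check and back out if it fails. As before, the handle table, the screen_related flag, and every function name here are hypothetical; model_is_lockable merely stands in for win32kfull.sys!IsSurfaceLockable, whose actual criteria are not reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UMSO_MAGIC 0x554D534Fu

typedef struct { uintptr_t hsurf; int screen_related; uint32_t ref_count; } KernelSurf;
typedef struct { uint32_t magic; uintptr_t hsurf; } ModelUmso;

/* Hypothetical handle table standing in for the kernel's handle manager. */
static KernelSurf g_table[] = {
    { 0x10, 0, 0 },  /* ordinary bitmap       */
    { 0x20, 1, 0 },  /* screen-related bitmap */
};

/* Find the surface for a handle and take a reference on it. */
static KernelSurf *lookup_and_lock(uintptr_t hsurf)
{
    for (size_t i = 0; i < sizeof g_table / sizeof g_table[0]; i++) {
        if (g_table[i].hsurf == hsurf) {
            g_table[i].ref_count++;
            return &g_table[i];
        }
    }
    return NULL;
}

/* Stand-in for the lockability check; the real criteria are assumed. */
static int model_is_lockable(const KernelSurf *s)
{
    return !s->screen_related;
}

/* Models the patched behaviour: the handle is treated as untrusted and
 * validated again after the lock, releasing the reference on failure. */
static KernelSurf *resolve_patched(const ModelUmso *u)
{
    if (u == NULL || u->magic != UMSO_MAGIC)
        return NULL;
    KernelSurf *s = lookup_and_lock(u->hsurf);
    if (s == NULL)
        return NULL;
    if (!model_is_lockable(s)) {
        s->ref_count--;
        return NULL;
    }
    return s;
}
```

With this check in place, a forged record that names a screen-related bitmap no longer resolves, while an ordinary bitmap is still returned locked.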
Thanks again to Marcin for providing this thorough write-up. He has contributed multiple bugs to the ZDI program over the last few years, and we certainly hope to see more submissions from him in the future. Until then, follow the team on Twitter, Mastodon, LinkedIn, or Instagram for the latest in exploit techniques and security patches.