This was my first time diving into raw USB packet analysis — and I knew I had to start by finding something both interesting and useful to guide the investigation forward. The only clue I had? This PCAP file was somehow linked to a keylogger. With that in mind, I set out to uncover what secrets were hidden in the HID traffic.
USB defines four fundamental types of data transfers, each serving a different purpose:
1) Control Transfers: Used for short, command-based communication to set up or control a USB device. Sends small packets (e.g., to configure a device or get its details, like vendor ID).
2) Interrupt Transfers: For small, regular data updates that need quick responses. Examples: key presses on a USB keyboard or mouse movements.
3) Isochronous Transfers: For continuous, time-sensitive data streams where timing matters but errors are okay. Sends data at a steady rate without retrying errors (e.g., audio or video streams).
4) Bulk Transfers: For sending large amounts of data where accuracy matters more than speed. Sends data in big chunks, retrying if errors occur (e.g., file transfers). Examples: copying files to a USB flash drive or sending print jobs to a printer.
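To tie these four types to what Wireshark displays, here is a small sketch that tallies transfer types from a list of usb.transfer_type values. The numeric codes are the ones Wireshark shows for URBs (0x00 isochronous, 0x01 interrupt, 0x02 control, 0x03 bulk). The function name tally_transfer_types and the sample values are made up for illustration, and the input is assumed to be one hex value per line, as you might get from something like -T fields -e usb.transfer_type.

```python
from collections import Counter

# URB transfer type codes as displayed in Wireshark's usb.transfer_type field
TRANSFER_TYPES = {
    0x00: "URB_ISOCHRONOUS",
    0x01: "URB_INTERRUPT",
    0x02: "URB_CONTROL",
    0x03: "URB_BULK",
}

def tally_transfer_types(values):
    """Count transfer types from an iterable of hex strings such as '0x01'."""
    counts = Counter()
    for value in values:
        value = value.strip()
        if not value:
            continue  # skip blank lines
        code = int(value, 16)  # accepts '0x01' as well as '01'
        counts[TRANSFER_TYPES.get(code, "UNKNOWN")] += 1
    return counts

# Hypothetical sample values, not taken from the capture
sample = ["0x01", "0x01", "0x03", "0x02", "0x01", ""]
print(tally_transfer_types(sample).most_common())
```

A quick tally like this is one way to see which transfer types dominate a capture before deciding where to dig.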
While analyzing the USB traffic in Wireshark, we noticed that most of the packets fall into two categories: URB_BULK out and URB_INTERRUPT in.
Since we know this capture involves keylogging activity, our main focus shifted to the URB_INTERRUPT in packets. These hold the actual keystroke data, and decoding them is the key to revealing what the user typed.
I’m using TShark to extract USB HID data from the PCAP file.
tshark -r klogger.pcapng -Y "usb.endpoint_address.direction == IN " -T fields -e usb.capdata
The output contains multiple USB HID input reports, each following the standard 8-byte structure used by USB keyboards. Below is the breakdown of that structure.
13 05 01 00 01 01 00 00
Byte | Value | Meaning
--------------------------------------------------------
0 | 13 | Modifier byte
1 | 05 | Reserved (ignore)
2–7 | 01 00 01 01 00 00 | Keycodes (up to 6 keys)
Note: In this post, I’ve intentionally kept the explanation of the modifier byte and the reserved byte minimal to maintain focus on the core keystroke decoding process.
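For readers who do want to peek at the modifier byte, here is a minimal sketch that splits one report into its three parts. The bit layout (bit 0 = Left Ctrl through bit 7 = Right GUI) follows the standard HID boot keyboard report. The helper name split_report is hypothetical and not part of the script in this post.

```python
# Bit positions in the HID keyboard modifier byte (bit 0 is the least significant)
MODIFIER_BITS = [
    "Left Ctrl", "Left Shift", "Left Alt", "Left GUI",
    "Right Ctrl", "Right Shift", "Right Alt", "Right GUI",
]

def split_report(hex_report):
    """Split an 8-byte report (hex string) into (modifier names, reserved byte, keycodes)."""
    data = bytes.fromhex(hex_report)
    if len(data) != 8:
        raise ValueError("expected an 8-byte keyboard report")
    # Collect the name of every modifier whose bit is set in byte 0
    modifiers = [name for bit, name in enumerate(MODIFIER_BITS) if data[0] & (1 << bit)]
    return modifiers, data[1], list(data[2:])

# The example report from the table above
modifiers, reserved, keycodes = split_report("1305010001010000")
print(modifiers, reserved, keycodes)
```

For the sample report, byte 0 is 0x13, which has bits 0, 1 and 4 set.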
Next, I created a file named usbkeystrok.txt to store all the extracted USB HID keyboard input reports. Then, I wrote a simple Python script to decode the hexadecimal HID data into actual human-readable keystrokes.
tshark -r klogger.pcapng -Y "usb.endpoint_address.direction == IN " -T fields -e usb.capdata > usbkeystrok.txt
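Before decoding, it can help to clean up the extracted lines. The sketch below assumes that packets without capture data show up as empty lines and that some TShark versions may print usb.capdata with colons between bytes; both are assumptions about the output, and normalize_capdata is a hypothetical helper, not part of the original script. It keeps only lines that are exactly 8 bytes long.

```python
HEX_DIGITS = set("0123456789abcdef")

def normalize_capdata(lines):
    """Return 8-byte reports as lowercase hex strings without separators."""
    reports = []
    for line in lines:
        # Drop the newline and any colon separators between bytes
        cleaned = line.strip().replace(":", "").lower()
        if len(cleaned) != 16:
            continue  # blank line, or data that is not an 8-byte report
        if not set(cleaned) <= HEX_DIGITS:
            continue  # not valid hex
        reports.append(cleaned)
    return reports

# Hypothetical sample lines, not taken from the capture
sample = ["0000040000000000\n", "\n", "00:00:05:00:00:00:00:00\n", "0102\n"]
print(normalize_capdata(sample))
```

Filtering on length also drops IN packets that are not keyboard reports, since the display filter above matches every IN transfer rather than interrupt transfers only.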
# usb_hid_decoder.py
HID_KEYCODES = {
    0x04: 'a', 0x05: 'b', 0x06: 'c', 0x07: 'd', 0x08: 'e',
    0x09: 'f', 0x0A: 'g', 0x0B: 'h', 0x0C: 'i', 0x0D: 'j',
    0x0E: 'k', 0x0F: 'l', 0x10: 'm', 0x11: 'n', 0x12: 'o',
    0x13: 'p', 0x14: 'q', 0x15: 'r', 0x16: 's', 0x17: 't',
    0x18: 'u', 0x19: 'v', 0x1A: 'w', 0x1B: 'x', 0x1C: 'y',
    0x1D: 'z', 0x1E: '1', 0x1F: '2', 0x20: '3', 0x21: '4',
    0x22: '5', 0x23: '6', 0x24: '7', 0x25: '8', 0x26: '9',
    0x27: '0', 0x28: '\n', 0x2C: ' ', 0x2D: '-', 0x2E: '=',
    0x2F: '[', 0x30: ']', 0x31: '\\', 0x33: ';', 0x34: "'",
    0x35: '`', 0x36: ',', 0x37: '.', 0x38: '/'
}

SHIFTED_CHARS = {
    '1': '!', '2': '@', '3': '#', '4': '$', '5': '%',
    '6': '^', '7': '&', '8': '*', '9': '(', '0': ')',
    '-': '_', '=': '+', '[': '{', ']': '}', '\\': '|',
    ';': ':', "'": '"', '`': '~', ',': '<', '.': '>', '/': '?'
}

def decode_hid_report(line):
    line = line.strip()
    if len(line) < 16:
        return ''
    bytes_list = [int(line[i:i+2], 16) for i in range(0, 16, 2)]
    modifier = bytes_list[0]
    keycodes = bytes_list[2:]
    shift = modifier & 0x22  # Left or Right Shift
    keys = []
    for keycode in keycodes:
        if keycode == 0 or keycode == 1 or keycode == 255:
            continue
        char = HID_KEYCODES.get(keycode)
        if char:
            if shift and char in SHIFTED_CHARS:
                keys.append(SHIFTED_CHARS[char])
            elif shift:
                keys.append(char.upper())
            else:
                keys.append(char)
    return ''.join(keys)

# ---- MAIN ----
decoded = []
previous_keys = set()
with open("usbkeystrok.txt") as f:
    for line in f:
        line = line.strip()
        if len(line) < 16:
            continue  # skip blank or truncated lines
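The main loop in the listing stops after the length check, so how previous_keys was used is not shown. One plausible way to finish it, offered as an assumption rather than the original code, is to compare each report with the one before it and emit only the keys that just appeared, since a held key repeats across consecutive reports. The sketch is self-contained: decode_stream and the reduced LETTERS table are hypothetical stand-ins for the full tables in the script.

```python
import string

# Reduced keycode table for this sketch: letters a-z are HID usages 0x04-0x1D
LETTERS = {0x04 + i: ch for i, ch in enumerate(string.ascii_lowercase)}
LETTERS[0x2C] = ' '
LETTERS[0x28] = '\n'

def decode_stream(reports):
    """Decode hex report strings, emitting a key only when it is newly pressed."""
    decoded = []
    previous_keys = set()
    for line in reports:
        line = line.strip()
        if len(line) < 16:
            continue  # skip blank or truncated lines
        data = bytes.fromhex(line[:16])
        shift = data[0] & 0x22  # Left or Right Shift
        current_keys = {code for code in data[2:] if code in LETTERS}
        for code in data[2:]:
            # Keys already present in the previous report are still being held
            if code in LETTERS and code not in previous_keys:
                ch = LETTERS[code]
                decoded.append(ch.upper() if shift else ch)
        previous_keys = current_keys
    return ''.join(decoded)

# Hypothetical reports: Shift+h held for two reports, release, then i
sample = [
    "02000b0000000000",
    "02000b0000000000",
    "0000000000000000",
    "00000c0000000000",
    "0000000000000000",
]
print(decode_stream(sample))
```

Because an all-zero release report clears previous_keys, typing the same letter twice in a row is still decoded as two keystrokes.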
This will help you understand HID keyboard usage.