Linux Keylogger: How to Read the Linux Keyboard Buffer

Have you ever wanted to build a Linux keylogger? It turns out quite a few people have at one point or another: a quick GitHub search shows a bunch of results written in multiple languages. In my own journey to create one, I decided to document the main ideas so that anyone else can use this as a bit of a reference.

How do I find the keyboard buffer file?

All the attributes of input devices for your Linux computer can be found with the following:

cat /proc/bus/input/devices

For a full explanation you can reference this Unix Stack Exchange answer, but what you will be looking for is EV=120013, the bitmask of supported event types that keyboards typically report. Once you find that value, we need the eventx value from Handlers=eventx for the same device. You can use this command to grab the device for you:

grep -E  'Handlers|EV=' /proc/bus/input/devices | \
grep -B1 'EV=120013' | \
grep -Eo 'event[0-9]+'

This command prints one eventx name for each matching device; mine says event2, for example. The keyboard file is then located at /dev/input/event2, so we can check that we’re correct by running:

sudo cat /dev/input/event2

Once you run that, start typing to see if the output responds. It will all be unreadable garbage on your screen, but that’s ok: we’re at least on the right track.
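If you'd rather do the lookup in a script, here is a minimal Python sketch of the same idea as the grep pipeline above. The function name keyboard_handlers and the SAMPLE text are made up for illustration (real output has more fields per device), and it assumes devices are separated by blank lines with "H: Handlers=" and "B: EV=" lines as shown.

```python
import re


def keyboard_handlers(devices_text, ev_mask="120013"):
    """Return the eventN handler names of devices whose EV bitmask matches."""
    found = []
    # Each device is a block of lines; blocks are separated by a blank line.
    for block in devices_text.strip().split("\n\n"):
        ev = re.search(r"^B: EV=(\w+)", block, re.MULTILINE)
        handlers = re.search(r"^H: Handlers=(.*)$", block, re.MULTILINE)
        if ev and handlers and ev.group(1) == ev_mask:
            found.extend(re.findall(r"event\d+", handlers.group(1)))
    return found


# Hypothetical excerpt, for illustration only.
SAMPLE = """\
I: Bus=0011 Vendor=0001 Product=0001 Version=ab41
N: Name="AT Translated Set 2 keyboard"
H: Handlers=sysrq kbd event2 leds
B: EV=120013

I: Bus=0019 Vendor=0000 Product=0001 Version=0000
N: Name="Power Button"
H: Handlers=kbd event0
B: EV=3
"""

print(keyboard_handlers(SAMPLE))
```

On a real machine you would pass in the contents of /proc/bus/input/devices instead of SAMPLE, for example open("/proc/bus/input/devices").read().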

How do I read this gobbledygook?

The screen looks unreadable because we’re reading binary data, the 1’s and 0’s you hear people talk about. Unlike a text file, raw binary can’t be deciphered by eye, so we’re going to use xxd to help us understand things. Let’s try to read the data for pressing the letter a on our keyboard. It’s a bit tricky, because any extra key presses generate more output, so here’s how I accomplished it:

  • In a terminal run sleep 2 && sudo cat /dev/input/event2 | xxd
  • Within 2 seconds move focus to any other window so when you type the letter a it won’t appear in the terminal
  • ONLY PRESS a, then look at the address on the left side: the last line should start at 00000080. Any other key you touch will pollute the output.

The following is a sample of text from when I pressed a:

00000000: 696c 645d 0000 0000 bbd7 0700 0000 0000  ild]............
00000010: 0400 0400 0400 0700 696c 645d 0000 0000  ........ild]....
00000020: bbd7 0700 0000 0000 0100 1e00 0100 0000  ................
00000030: 696c 645d 0000 0000 bbd7 0700 0000 0000  ild]............
00000040: 0000 0000 0000 0000 696c 645d 0000 0000  ........ild]....
00000050: b0d1 0800 0000 0000 0400 0400 0400 0700  ................
00000060: 696c 645d 0000 0000 b0d1 0800 0000 0000  ild]............
00000070: 0100 1e00 0000 0000 696c 645d 0000 0000  ........ild]....
00000080: b0d1 0800 0000 0000 0000 0000 0000 0000  ................

We now have something to work with. Next we have to decipher the binary data which will take a bit of explanation.

How do I interpret the binary data?

If you don’t know how binary and hexadecimal work I’ll try to give it a super quick rundown. Each hexadecimal character (0-f) is represented by 4 bits (a bit is a 1 or 0). Here’s the conversion table:

Binary   Hex
0000   =  0
0001   =  1
0010   =  2
0011   =  3
0100   =  4
0101   =  5
0110   =  6
0111   =  7
1000   =  8
1001   =  9
1010   =  a
1011   =  b
1100   =  c
1101   =  d
1110   =  e
1111   =  f

A “byte” of data is 8 bits (e.g. 10110011), which means that a “byte” is represented by 2 hexadecimal characters:

1011 0011
  b    3
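The same split can be checked in a few lines of Python. This is just a sketch to make the table concrete; the helper name nibbles is mine.

```python
def nibbles(byte):
    """Split one byte into its high and low 4-bit halves."""
    return byte >> 4, byte & 0x0F


value = 0b10110011
high, low = nibbles(value)

print(format(value, "08b"), "=", format(value, "02x"))  # whole byte
print(format(high, "04b"), "=", format(high, "x"))      # first hex character
print(format(low, "04b"), "=", format(low, "x"))        # second hex character
```

Each half of the byte maps to exactly one row of the conversion table, which is why two hex characters are always enough for one byte.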

Hexadecimal makes it easier to display larger volumes of binary data and is also less confusing to read than a screen full of 1’s and 0’s. If you run xxd -b it will show the 1’s and 0’s instead of hex. Let’s take the first line of the keyboard output and break it down:

00000000: 696c 645d 0000 0000 bbd7 0700 0000 0000  ild]............

The line has three parts. First, 00000000 is the address in hexadecimal.

Next come the 16 bytes of data starting at 00000000, numbered here from 1 to 16:

696c 645d 0000 0000 bbd7 0700 0000 0000
1 2  3 4  5 6  7 8  9 10 1112 1314 1516

And finally:

ild]............

This is the ascii representation of the binary (the garbage printed to the screen earlier). Here’s an ascii table for reference. The table shows that ascii has 128 values, but a byte can represent a total of 256 values. Bytes above 7f, and control characters with no printable symbol, just show up as a dot. Let’s take the first 4 bytes of our binary data and look them up on the ascii table:

69 = 'i'
6c = 'l'
64 = 'd'
5d = ']'

Everything looks hunky dory. Looking at bytes 9-11:

bb is > 7f so it shows as .
d7 is > 7f so it shows as .
07 = 'BEL', a control character, so it shows as .
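To see the rule in action, here is a small Python sketch that rebuilds the right-hand column for the first line of output. The function name ascii_column is made up, and the rule it applies (printable ascii from 20 to 7e is shown as-is, everything else becomes a dot) is my understanding of what xxd does.

```python
def ascii_column(data):
    """Render bytes the way the right-hand column of the hex dump does."""
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in data)


first_line = bytes.fromhex("696c645d00000000bbd7070000000000")
print(ascii_column(first_line))  # ild]............
```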

Hopefully I haven’t confused you too much and you’ve picked up a basic love for reading binary data. Binary data means nothing until an encoding is applied to it, and ascii is just one way to decode it. To see how Linux stores key presses as binary data we need to reference the input_event struct inside the input.h file. If you aren’t familiar with C preprocessor defines, the struct probably looks like this once they are evaluated:

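struct input_event {
    __kernel_ulong_t __sec;   /* timestamp: seconds */
    __kernel_ulong_t __usec;  /* timestamp: microseconds */
    __u16 type;               /* event type, e.g. EV_KEY */
    __u16 code;               /* which key, e.g. KEY_A */
    __s32 value;              /* 1 = pressed, 0 = released */
};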

__kernel_ulong_t is 64 bits on a 64-bit system like mine, __u16 is, you guessed it, 16 bits, and I’m sure you can figure out how many bits are in a __s32. That adds up to 24 bytes per event. All of this data is stored sequentially in memory, so let’s break down what it stores.

type: represents an event type, as defined in input-event-codes.h which you can see is included in input.h, and we will want to look for EV_KEY for a key press, which is 01.

code: represents the key that was pressed, again defined in input-event-codes.h. We are pressing a, which is KEY_A, defined as 30 in decimal or 1e in hexadecimal. NOTE that 1e is not the same as the ascii representation of a, which is 61 (ascii table reference).

value: represents whether the key was pressed or released. This is pretty simple: 1 means pressed and 0 means released. (A value of 2 shows up when a key is held down and auto-repeats.)

So let’s build out an example of the data that we’re looking for in the stream for a keypress event:

                                             Code
                                         Type  |
|----- seconds ---| |---- useconds ---|   |    |  |-Value-|
0000 0000 0000 0000 0000 0000 0000 0000 0001 001e 0000 0001
       64 bits             64 bits      16b  16b   32 bits
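The hand-built record above can be reproduced with Python's struct module. This is only a sketch: it assumes the 64-bit field sizes described earlier, and the helper grouped_hex is mine. The ">" in the format string writes each number with its most significant byte first, the way we naturally write numbers.

```python
import struct

EV_KEY = 0x01  # event type for key events (input-event-codes.h)
KEY_A = 30     # key code for the letter a, 1e in hex


def grouped_hex(data):
    """Format bytes as groups of four hex characters, like the diagram."""
    h = data.hex()
    return " ".join(h[i:i + 4] for i in range(0, len(h), 4))


# Q = 64-bit unsigned, H = 16-bit unsigned, i = 32-bit signed.
record = struct.pack(">QQHHi", 0, 0, EV_KEY, KEY_A, 1)

print(len(record), "bytes")
print(grouped_hex(record))
```

The printed groups line up with the diagram: two zeroed 64-bit timestamps, then 0001, 001e and 0000 0001.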

Skipping the seconds and useconds, let’s look at some of the output I saved from before and scan for the bytes representing (type, code, value):

00000010: 0400 0400 0400 0700 696c 645d 0000 0000  ........ild]....
00000020: bbd7 0700 0000 0000 0100 1e00 0100 0000  ................
                              ^ this looks similar to our example bytes

But wait? The data is kinda backwards? What gives?

0001 001e 0000 0001 <- Our "constructed" data
0100 1e00 0100 0000 <- Data from the buffer

Maybe your machine doesn’t show the data backwards; that means you have a Big Endian system! If yours is backwards like mine, you have a Little Endian system. How does this work? Big endian stores numbers left to right, with the most significant byte (the big end) at the lowest address. Little endian stores the least significant byte (the little end) at the lowest address. Let’s look at a couple of examples:

16 bit big endian

0123
1 2

16 bit little endian

2301
2 1

32 bit big endian

0123 4567
1 2  3 4

32 bit little endian

6745 2301
4 3  2 1
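If you want to check these byte orders yourself, Python's int.to_bytes does the conversion. A quick sketch (the helper name hex_bytes is mine):

```python
import sys


def hex_bytes(value, size, byteorder):
    """Show how an integer is laid out in memory for a given byte order."""
    return value.to_bytes(size, byteorder).hex()


print(hex_bytes(0x0123, 2, "big"))
print(hex_bytes(0x0123, 2, "little"))
print(hex_bytes(0x01234567, 4, "big"))
print(hex_bytes(0x01234567, 4, "little"))

# Which kind of machine is this script running on?
print(sys.byteorder)
```

Note that only the order of whole bytes changes. The two hex characters inside each byte stay in the same order.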

So our example bytes of:

0001 001e 0000 0001
1 2  1 2  1 2  3 4

Gets stored as:

0100 1e00 0100 0000
2 1  2 1  4 3  2 1

Which matches the line from our keyboard buffer, SUCCESS! How about the seconds? Linux stores the time as a Unix timestamp (seconds since January 1, 1970 UTC), and we know the seconds field takes 64 bits, so looking at the output we can see where the seconds are stored:

                              |-- Starting here
00000010: 0400 0400 0400 0700 696c 645d 0000 0000  ........ild]....
00000020: bbd7 0700 0000 0000 0100 1e00 0100 0000  ................
696c 645d 0000 0000

It’s little endian so the actual value is

696c 645d 0000 0000
1 2  3 4  5 6  7 8

0000 0000 5d64 6c69
8 7  6 5  4 3  2 1

And now we can convert 5d64 6c69 into a decimal number with this calculator, which gives 1566862441. You can take that and use a timestamp converter to get: 08/26/2019 @ 11:34pm (UTC). Pretty cool huh? I didn’t go deep enough to figure out the rest of the binary output, but most of the code repos I looked at while researching this just skipped that data. Hopefully you had some fun getting some insight into how keyboards work in Linux.
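To tie it all together, here is a minimal Python sketch that decodes the whole sample dump from earlier, 24 bytes at a time. It assumes the little-endian, 64-bit layout worked out above. The names DUMP_HEX, EVENT and decode_events are mine, and the type names in TYPE_NAMES are the ones defined in input-event-codes.h for the values that appear in the dump.

```python
import struct
from datetime import datetime, timezone

# Hex columns of the sample dump (address and ascii columns removed).
DUMP_HEX = """
696c 645d 0000 0000 bbd7 0700 0000 0000
0400 0400 0400 0700 696c 645d 0000 0000
bbd7 0700 0000 0000 0100 1e00 0100 0000
696c 645d 0000 0000 bbd7 0700 0000 0000
0000 0000 0000 0000 696c 645d 0000 0000
b0d1 0800 0000 0000 0400 0400 0400 0700
696c 645d 0000 0000 b0d1 0800 0000 0000
0100 1e00 0000 0000 696c 645d 0000 0000
b0d1 0800 0000 0000 0000 0000 0000 0000
"""

# Assumption: little-endian, 64-bit layout -> sec, usec, type, code, value.
EVENT = struct.Struct("<QQHHi")

# Event type names from input-event-codes.h.
TYPE_NAMES = {0x00: "EV_SYN", 0x01: "EV_KEY", 0x04: "EV_MSC"}


def decode_events(raw):
    """Split raw bytes into (sec, usec, type, code, value) tuples."""
    usable = len(raw) - len(raw) % EVENT.size  # ignore a trailing partial event
    return [EVENT.unpack_from(raw, off) for off in range(0, usable, EVENT.size)]


raw = bytes.fromhex("".join(DUMP_HEX.split()))
events = decode_events(raw)

for sec, usec, ev_type, code, value in events:
    when = datetime.fromtimestamp(sec, tz=timezone.utc).isoformat()
    name = TYPE_NAMES.get(ev_type, hex(ev_type))
    print(f"{when} +{usec}us {name} code={code} value={value:#x}")
```

The dump splits cleanly into six events. The second one decodes to type 1, code 30, value 1 (the a key going down) and the fifth to type 1, code 30, value 0 (the key coming back up), both stamped 2019-08-26T23:34:01 UTC. The events around them have type 4 (EV_MSC) and type 0 (EV_SYN), which seems to be the data most keyloggers skip. To read live events instead, you could open /dev/input/event2 in binary mode and read EVENT.size bytes at a time, but I'll leave that as an exercise.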