How to Test for Format String Vulnerabilities


Applies To


*Compiled code (C/C++)

Summary

Testing for format string vulnerabilities involves the following 4 steps:
  1. Identify entry points.
  2. Craft attack data for each entry point.
  3. Pass attack data to each entry point.
  4. Look for application crashes or corrupt output.

1. Identify entry points

Entry points are the means by which you can provide input to the application under test. The following are common entry points:
*Public APIs
*Web service methods
*DCOM methods
*Network ports
*UI input fields
*File input (can take the form of configuration, data, or serialization information)
*Registry input

As you explore the set of entry points, look for the ability to provide string input, or data that may later be written to output (either to stdout or to a log), by the printf() family of functions or any other function that takes a format string. A format string is a template used by functions such as printf and scanf: conversion specifiers within the string, such as %d and %s, are replaced with (or, for scanf, filled from) the arguments passed to the function. Functions that take a format string include:
*sprintf
*_snprintf
*printf
*fprintf
*scanf
*fscanf
*sscanf
*swprintf
*wsprintfA
*wsprintfW
*vsprintf
*vswprintf
*_snwprintf
*_vsnprintf
*_vsnwprintf
*vprintf
*vwprintf
*vfprintf
*vfwprintf
*fwscanf
*wscanf
*swscanf
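As a minimal sketch of how a format string is normally used, the helper below (format_error is a hypothetical name, not part of any library) passes a constant format string to snprintf and supplies the error code and file name as separate data arguments. This is the well-behaved case that the vulnerable pattern deviates from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats a "file not found" message into out.
 * The format string is a literal; code and name are plain data. */
int format_error(char *out, size_t out_size, int code, const char *name)
{
    assert(out != NULL && name != NULL);
    /* %d is replaced by code, %s by the string that name points to. */
    return snprintf(out, out_size,
                    "Error code %d: File %s not found", code, name);
}
```

Calling format_error(buf, sizeof buf, 42, "a.txt") fills buf with "Error code 42: File a.txt not found" and returns the message length.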
Some examples of string input entry points are:
*Web application UI input field such as username, password, or search box.
*Thick client application UI input field such as configuration options, search, or username
*Public API with a string parameter type
*String data in a file
*String data in a registry key
*String data in a network packet
There are a couple of common scenarios in which input data may end up in a call to a function that takes a format string:
*File names – usually an error message of the type (“Unable to find file %s”) is printed to the log or stdout.
*Logged entities – such as usernames, request identifiers, and/or methods
For instance, the following code formats a file name into a buffer and then passes that buffer to fprintf as the format string:
		 snprintf(buf, BUFSIZE, "Error code %d: File %s not found", code, s);
		 fprintf (stderr, buf);
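The difference between input used as data and input used as the format can be observed safely with the "%%" specifier, which consumes no arguments. The sketch below uses two hypothetical helpers, render_as_data and render_as_format, that are illustrative only. The second one reproduces the vulnerable pattern and must only be given input whose specifiers take no arguments; anything else is undefined behavior. Most compilers warn about the non-literal format string, which is expected here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Safe pattern: the input is passed as data for a constant "%s" format. */
int render_as_data(char *out, size_t out_size, const char *input)
{
    assert(out != NULL && input != NULL);
    return snprintf(out, out_size, "%s", input);
}

/* Vulnerable pattern: the input itself is used as the format string.
 * Only call this with specifiers that consume no arguments, e.g. "%%". */
int render_as_format(char *out, size_t out_size, const char *input)
{
    assert(out != NULL && input != NULL);
    return snprintf(out, out_size, input);
}
```

Given the five-character input 100%%, render_as_data copies it unchanged, while render_as_format collapses the "%%" and produces the four-character string 100%. Output that differs from the input in this way shows the input reached a format string parameter.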
	

2. Craft attack data for an entry point

When crafting attack data for an input to code that may be vulnerable to a format string attack, include the following:
*“%x” specifiers – to print values from the stack in hexadecimal (32 bits each on a 32-bit platform)
*“%s” specifiers – to print string data. This may also crash the application, as it treats a value from the stack as a pointer and prints from that arbitrary memory location until a null byte is encountered
*“%n” specifiers – to write to the arbitrary memory location pointed to by a value on the stack. This will very likely cause an application crash
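Attack strings of this kind are repetitive, so they are easy to generate. The following is a minimal sketch of a generator; build_payload is a hypothetical helper name, and the prefix and repeat count are whatever the tester chooses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes prefix followed by count copies of
 * specifier (for example "%x") into out.
 * Returns 0 on success, or -1 if out is too small. */
int build_payload(char *out, size_t out_size,
                  const char *prefix, const char *specifier, size_t count)
{
    assert(out != NULL && prefix != NULL && specifier != NULL);
    size_t prefix_len = strlen(prefix);
    size_t spec_len = strlen(specifier);

    /* Reject counts that cannot fit; this also avoids overflow below. */
    if (spec_len != 0 && count > out_size / spec_len)
        return -1;
    if (prefix_len + spec_len * count + 1 > out_size)  /* +1 for '\0' */
        return -1;

    memcpy(out, prefix, prefix_len);
    char *p = out + prefix_len;
    for (size_t i = 0; i < count; i++) {
        memcpy(p, specifier, spec_len);
        p += spec_len;
    }
    *p = '\0';
    return 0;
}
```

For example, build_payload(buf, sizeof buf, "Foo", "%x", 10) produces Foo%x%x%x%x%x%x%x%x%x%x, and the same call with "%n" and a count of 3 produces Foo%n%n%n.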
To craft the attack data properly, it is important to understand exactly how a format string attack works.
Format string bugs are exploitable because of three properties of the *printf() functions. These properties are common to ALL variants, but for brevity, we discuss printf() alone:

  1. Variable argument list – All printf functions accept a variable number of arguments, so an arbitrary number of arguments may be passed to printf. Under the traditional 32-bit x86 C calling convention, arguments are pushed onto the stack in right-to-left order (from last to first). Because the stack is LIFO (last-in-first-out), printf reads them back in left-to-right order. printf therefore obtains the format string (its first or second argument, depending on the variant) and then reads one further argument for each conversion specifier in the format string. It has no way of knowing how many arguments were actually passed, so excess “%” specifiers make it read past its real arguments into whatever else is on the stack.
  2. The “%n” format specifier – this specifier expects a pointer argument and writes to the location that pointer refers to. This gives printf() the powerful ability to write into memory the number of characters output so far.
  3. The value written by %n is controllable by whoever supplies the input. By using a field width (e.g. %20x), an attacker can make printf() consume a single stack argument (4 bytes) while adding an arbitrary number of characters to the count (in this example, 20).
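The interaction between a field width and %n described above can be observed in a well-defined way when the format string is a constant. The sketch below assumes a C library that honors %n (some runtimes, such as the Microsoft CRT in its default configuration, disable it); count_with_width is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Formats value in hexadecimal with the given minimum field width,
 * then uses %n to record how many characters had been produced.
 * The format string is a constant, so this call is well defined. */
int count_with_width(char *out, size_t out_size, int width, unsigned value)
{
    int written = -1;
    assert(out != NULL);
    /* "%*x" takes the width from the argument list, like "%20x". */
    snprintf(out, out_size, "%*x%n", width, value, &written);
    return written;
}
```

With a 64-byte buffer, count_with_width(buf, sizeof buf, 20, 0x41) stores 20 through the %n pointer even though only one value was formatted. In an attack, the pointer is not supplied by the program but taken from whatever happens to be on the stack.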

3. Pass attack data to an entry point

It is possible to test for format string vulnerabilities manually. However, because of the large number of variations in attack strings and the ease of automation, it is preferable to use automation that can pass your attack strings to the target interface. This is easier with programmatic entry points such as an API or web service, but is also possible through a UI. A combination of automated attack generation and automated attack execution allows your automation to find format string vulnerabilities while you focus on other tasks.
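For a command-line entry point, automated execution can be as simple as launching the target once per attack string and recording how it terminated. The following is a minimal sketch that assumes a POSIX system (fork, execv, waitpid); run_target and struct run_result are hypothetical names, and a Windows harness would need a different process API.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Outcome of one run of the target. */
struct run_result {
    int crashed;  /* 1 if the target was terminated by a signal */
    int detail;   /* signal number if crashed, else exit status (-1 on error) */
};

/* Runs argv[0] with the given NULL-terminated argument vector
 * and waits for it to finish. */
struct run_result run_target(char *const argv[])
{
    struct run_result r = { 0, -1 };
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return r;            /* fork failed */
    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127);          /* only reached if exec failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return r;
    if (WIFSIGNALED(status)) {
        r.crashed = 1;
        r.detail = WTERMSIG(status);
    } else if (WIFEXITED(status)) {
        r.detail = WEXITSTATUS(status);
    }
    return r;
}
```

To test a program that takes its input from the command line, build the vector as { "./target", payload, NULL } for each attack string (the path is a placeholder) and flag any run where crashed is set.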

4. Look for application crashes or corrupt output

If the application’s output appears corrupt after testing, it is a strong sign of a format string vulnerability: the application is interpreting the “%” specifiers in its input as formatting directives and replacing them with data from memory.
If the application crashes after being given input containing “%n”, it is a strong indication that a format string vulnerability is present in your application. Investigate the crash to confirm the cause.
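Checking for corrupt output can also be automated. One simple heuristic, sketched below with the hypothetical helper output_looks_interpolated, is to check whether an attack string that contains “%” comes back verbatim in the captured output. If it does not, the specifiers were probably interpreted. This is only a heuristic: a target that truncates or escapes its input for other reasons would also trigger it.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if payload contains a '%' but does not appear verbatim in
 * output, which suggests it was interpreted as a format string.
 * Returns 0 otherwise. */
int output_looks_interpolated(const char *payload, const char *output)
{
    assert(payload != NULL && output != NULL);
    if (strchr(payload, '%') == NULL)
        return 0;  /* nothing that could be interpolated */
    return strstr(output, payload) == NULL;
}
```

Flagged results should still be reviewed by hand, since the helper cannot tell interpolated memory contents from other kinds of output rewriting.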

Repro Example

Flawed Code


In this example, the user input is treated as a filename to be opened. If the file is not found, an error message is formatted with the filename embedded in it. Note that the use of snprintf() prevents a buffer overflow. A format string bug, however, still lurks in the code: buf, which now contains the user-supplied filename, is passed to fprintf() as the format string.

		 #include <stdio.h>

		 #define BUFSIZE 1024
		 #define ERR_FILE_NOT_FOUND 42

		 void error(int code, char *s)
		 {
		   char buf[BUFSIZE];

		   switch (code)
		   {
		      case ERR_FILE_NOT_FOUND:
		         snprintf(buf, BUFSIZE,
		                 "Error code %d: File %s not found", code, s);
		         break;
		   }

		   /* Log to standard error.                              */
		   /* FLAW: buf contains user input and is used as the    */
		   /* format string.                                      */
		   fprintf (stderr, buf);

		   /* Also, possibly write contents of buf to a logfile */
		 }

		 int main(int argc, char **argv)
		 {
		   /* attempt to open command line argument as a file */
		   /* if unsuccessful, report an error                */
		   if (argc > 1)
		     error(ERR_FILE_NOT_FOUND, argv[1]);
		   return 0;
		 }


Test Example

To exploit the flawed code above, the following attack strings can be used:

*Passing “Foo%x%x%x%x%x%x%x%x%x%x” as the command line argument causes fprintf() to read values from the stack and print them in hexadecimal, which shows up as corrupt output.
*Passing “Foo%n%n%n” as the command line argument causes fprintf() to write through pointers taken from the stack, which typically shows up as an application crash.