Stack Smashing Protector

From OSDev Wiki

The Stack Smashing Protector (SSP) compiler feature helps detect stack buffer overruns by aborting if a secret value on the stack, the stack canary, is changed. This serves a dual purpose: it makes the occurrence of such bugs visible, and it mitigates exploits that overwrite the return address, such as return-oriented programming. SSP merely detects stack buffer overruns; it does not prevent them. The protection is not perfect either: the detection can be beaten by preparing the input such that the stack canary is overwritten with the correct value. The stack canary is the size of a native word, so if it is chosen randomly, an attacker has to guess the right value among 2^32 or 2^64 combinations (revealing the bug if the guess is wrong), or resort to clever means of determining it.

Description

Compilers implement this feature by selecting appropriate functions, storing the stack canary during the function prologue and checking the value at the epilogue, invoking a failure handler if it was changed. For instance, consider the code:

#include <string.h>

void foo(const char* str)
{
	char buffer[16];
	strcpy(buffer, str);
}

Illustratively, SSP automatically transforms that code into this:

#include <stdint.h>
#include <stdnoreturn.h>
#include <string.h>

/* Note that buffer overruns are undefined behavior, and compilers tend to
   optimize these checks away if you write them yourself. This only works
   robustly because the compiler inserts them itself. */
extern uintptr_t __stack_chk_guard;
noreturn void __stack_chk_fail(void);
void foo(const char* str)
{
	uintptr_t canary = __stack_chk_guard;
	char buffer[16];
	strcpy(buffer, str);
	if ( (canary = canary ^ __stack_chk_guard) != 0 )
		__stack_chk_fail();
}

Note how the secret value is stored in a global variable (initialized at program load time) and is copied into the stack frame, and how it is safely erased from the stack as part of the check. Since stacks grow downwards on many architectures, the canary sits above the buffer and gets overwritten whenever the input to strcpy is at least 16 characters long (strcpy also writes the terminating null byte). The caller return pointer exploited in return-oriented programming attacks is not accessed until after the value was validated, thus defusing such attacks.

The detection is perfect if it is impossible to fake the correct value, i.e. if the attacker doesn't have full control over which bytes can be written. The attacker cannot change further stack contents undetected if faking the correct value stops the output. For instance, if the canary in the strcpy example above contains a zero byte, it is impossible to fake that byte in the canary without stopping the copy. This forces the attacker to either not attack, be detected, or not change any further stack contents. This doesn't mean the buffer overrun is always unexploitable: the string is now 16 characters long instead of the intended limit of 15 characters, which can cause other unintended behavior as the program continues.
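
The zero-byte property can also be arranged deliberately when the guard is chosen. The following is a minimal sketch and not taken from any particular libc: the helper name guard_with_terminator is hypothetical, and it assumes a little-endian machine, where the least significant byte of the word is the first canary byte an upward string overrun reaches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: turn random bits into a guard whose least significant
   byte is zero. On a little-endian machine that is the canary byte closest
   to the buffers, so a string copy has to stop there to reproduce it. The
   price is 8 bits of randomness. */
uintptr_t guard_with_terminator(uintptr_t random_bits)
{
	uintptr_t guard = random_bits & ~(uintptr_t) 0xff;
	assert((guard & 0xff) == 0);
	return guard;
}
```

Whether giving up 8 bits of randomness is worth it depends on how likely string-based overruns are compared to overruns that can write arbitrary bytes.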

Note how there is only a single protective value; not every variable is protected in this manner. A heuristic is often used that (going downwards) first stores the canary, then the buffers (which might overflow into each other), and finally all the small variables, which are then unaffected by overruns. This is based on the idea that it is generally less dangerous if arrays are modified, compared to variables that hold flags, pointers and function pointers, which may more seriously alter execution.

Some compilers randomize the order of stack variables and randomize the stack frame layout, which further complicates determining the right input with the intended malicious effect.

Usage

Compilers such as GCC enable this feature if requested through compiler options, or if the compiler supplier enabled it by default. It is worth considering enabling it by default if your operating system is security conscious and you provide the run-time support. It is possible to use it in your entire operating system (even the kernel and the standard library), perhaps excusing ports with really poor code quality. The feature is enabled with the right -ffoo option and can be disabled with the -fno-foo counterpart. Several options exist that provide different variants of SSP:

-fstack-protector: Check for stack smashing in functions with vulnerable objects. This includes functions with buffers larger than 8 bytes or calls to alloca.

-fstack-protector-strong: Like -fstack-protector, but also includes functions with local arrays or references to local frame addresses.

-fstack-protector-all: Check for stack smashing in every function.
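
To make the difference between the three options concrete, here is a small sketch with one function per category. The function names are made up for illustration, and the comments describe what the options are documented to cover; the actual outcome can vary with compiler version and optimization level, so inspect the generated assembly if it matters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* No arrays and no address taken: typically only -fstack-protector-all
   adds a canary here. */
int add(int a, int b)
{
	return a + b;
}

/* A 4-byte array is below the 8-byte threshold of -fstack-protector, but
   -fstack-protector-strong covers any local array. */
int sum_small(int x)
{
	unsigned char small[4];
	memset(small, x, sizeof small);
	return small[0] + small[3];
}

/* A 64-byte character buffer: all three options are expected to protect it. */
size_t copy_len(const char* str)
{
	char buffer[64];
	assert(str != NULL);
	strncpy(buffer, str, sizeof buffer - 1);
	buffer[sizeof buffer - 1] = '\0';
	return strlen(buffer);
}
```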

Some operating systems have extended their compiler with more relevant options:

-fstack-shuffle: (Found in OpenBSD) Randomize the order of stack variables at compile time. This helps find bugs.

When you activate the feature, the compiler will attempt to link in libssp and libssp_nonshared (if statically linked) for run-time support. This is disabled if you pass -nostdlib, as you do when linking a kernel, in which case you need to supply your own implementation. For user-space, you have two options:

  • Supply your own implementation in libc (so libc can take advantage of the feature) and install empty libssp and libssp_nonshared libraries (or change your toolchain to not use them).
  • Use the libssp implementation that comes with GCC.

Implementation

Run-time support needs only two components: A global variable and a check failure handler. For instance, a minimal implementation could be:

#include <stdint.h>
#include <stdlib.h>

/* Example guard values; pick the one matching the pointer width. */
#if UINT32_MAX == UINTPTR_MAX
#define STACK_CHK_GUARD 0xe2dee396
#else
#define STACK_CHK_GUARD 0x595e9fbd94fda766
#endif

uintptr_t __stack_chk_guard = STACK_CHK_GUARD;

__attribute__((noreturn))
void __stack_chk_fail(void)
{
#if __STDC_HOSTED__
	abort();
#elif __is_myos_kernel
	panic("Stack smashing detected");
#endif
	/* Never return, even if neither branch above applied. */
	for ( ; ; ) { }
}

Note how the secret guard value is hard-coded rather than being decided during program load. You should have the program loader (the bootloader in the case of the kernel) randomize the value. You can do this by putting the guard value in a special segment that the loader knows to randomize. The numbers shown here are not special; they are merely examples of randomly generated numbers. You can still take advantage of the bug-discovering properties of SSP even if the guard value is not cryptographically secure (unless you anticipate sufficiently obscure bugs that intelligently circumvent SSP).
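
As a rough illustration of the loader-side step, the sketch below fills a guard region with bytes from an entropy callback. Everything in it is hypothetical: the names loader_randomize_guard, entropy_byte_fn and demo_counter_byte are invented for this example, how the loader finds the region (for instance a dedicated segment) is left out, and the demo entropy source is deliberately not random.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entropy callback: returns one random byte per call. */
typedef uint8_t (*entropy_byte_fn)(void* ctx);

/* Hypothetical loader step: overwrite the guard region of a freshly loaded
   image with random bytes before transferring control to it. */
void loader_randomize_guard(void* region, size_t length,
                            entropy_byte_fn next_byte, void* ctx)
{
	assert(region != NULL && next_byte != NULL);
	uint8_t* bytes = region;
	for ( size_t i = 0; i < length; i++ )
		bytes[i] = next_byte(ctx);
}

/* Stand-in entropy source for trying the code out: NOT random. A real
   loader would use a hardware or kernel randomness source here. */
uint8_t demo_counter_byte(void* ctx)
{
	uint8_t* counter = ctx;
	return (*counter)++;
}
```

Passing the entropy source as a callback keeps the loader step independent of where the randomness comes from, which tends to differ between a bootloader and a user-space program loader.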

Alternatively, you could have an early phase in your code that initializes the guard value, perhaps written in assembly, or in C but built without stack smash protection. This approach adds code complexity and early phases where language features are not online. You may take such approaches with thread-local storage, errno, paging, the GDT, scheduling, and so on, and suddenly a bootstrap is very complex with many dependencies between language features. Once a function built with stack-smashing protection is run, the guard value cannot be changed or a spurious failure will occur.
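
A minimal sketch of such an early initialization step in C follows. It assumes a compiler that offers the no_stack_protector function attribute (recent GCC and Clang versions do); on other compilers the file would have to be built with -fno-stack-protector instead. The names NO_SSP, early_guard_value and early_init_guard are hypothetical, and a stand-in variable is used in place of __stack_chk_guard so the sketch can be tried in a hosted program.

```c
#include <assert.h>
#include <stdint.h>

/* Use the no_stack_protector attribute where the compiler offers it;
   otherwise this file would need to be built with -fno-stack-protector. */
#if defined(__has_attribute)
#if __has_attribute(no_stack_protector)
#define NO_SSP __attribute__((no_stack_protector))
#endif
#endif
#ifndef NO_SSP
#define NO_SSP
#endif

/* Stand-in for __stack_chk_guard, renamed so the sketch does not clash
   with the real symbol. */
uintptr_t early_guard_value;

/* Hypothetical early-boot step: must run before any protected function.
   This sketch treats an all-zero value as a sign that the entropy source
   was not ready. */
NO_SSP void early_init_guard(uintptr_t entropy)
{
	assert(entropy != 0);
	early_guard_value = entropy;
}
```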

Secure Handling

Beware how you implement the stack smash detection handler: this code only runs when the bug was triggered innocently or when the bug is being exploited maliciously. By now the attacker is assumed to have, at least, corrupted an unknown amount of this thread's stack. This means the environment is hostile. The stack from this point on is under your control and none of the new local variables are affected. Note, however, that the stack smash detection may have occurred in a signal handler or at another inopportune time, where another thread holds locks on critical standard library state or the like. Beware also that if pointers to caller stack variables are currently stored inside the standard library, and the standard library functions you use access that memory, the attacker may even control the stack smash detection handler.

Assuming a handler invocation implies an intelligent exploit is happening, the best course of action is:

  • Eliminate attacker influence.
  • Alert user or system administrator of a potential breach.
  • Diagnose the details of the buffer overrun so the defect can be fixed.

You should assume the worst if you wish to eliminate the attacker influence. The exploit used may well be combined with other exploited vulnerabilities, and a sufficiently skilled attacker may even influence and exploit the actions of the handler. There are many creative ways an attacker could influence the handler or even take advantage of it:

  • Pointers to earlier stack variables (now to be considered potentially corrupted) could be stored somewhere and accessed by the functions you use.
  • The handler could be run at a very inopportune time where the process is fragile, perhaps from a signal handler, or perhaps while the current thread owns non-recursive locks on which you could deadlock.
  • Printing a stack trace (if at all possible) and other diagnostic information to the stderr file descriptor (which might not even exist in this process, with fd 2 instead being used for another purpose) might result in the output being sent to the attacker. This is imaginable for a webserver, which perhaps includes the stderr contents in an error response. The attacker could learn things this way that they aren't supposed to.
  • The process might be multi-threaded, and who knows how the other threads might interact with a thread that is malfunctioning and compromised. They could have pointers to variables on the stack of the compromised thread, and SSP won't help if they access those.

Your approach should be to discard the process as soon as possible. Use only async-signal-safe functions, preferably without state that could influence them. Don't write to any standard streams but open the terminal anew or write to the system log. Ensure none of these operations fail (for instance, if the process is in a chroot or out of file descriptors).
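
For a POSIX-like user-space, the advice above could be sketched as follows. This is an illustration under assumptions, not a recommended final handler: the names report_and_die and run_handler_in_child are hypothetical, /dev/tty is assumed to be the terminal to report to, and failures to open it are silently ignored. It uses only open, write and _exit, which POSIX lists as async-signal-safe. The second function exists only so the handler can be tried out in a child process.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handler body using only async-signal-safe POSIX calls. */
void report_and_die(void)
{
	static const char message[] = "stack smashing detected\n";
	/* Open the terminal anew instead of trusting fd 2. */
	int fd = open("/dev/tty", O_WRONLY | O_NOCTTY);
	if ( 0 <= fd )
	{
		ssize_t ignored = write(fd, message, sizeof message - 1);
		(void) ignored;
	}
	_exit(127); /* no atexit handlers, no stdio flushing */
}

/* Helper for trying the handler safely: run it in a child process and
   return the child's exit status, or -1 on error. */
int run_handler_in_child(void)
{
	pid_t child = fork();
	if ( child < 0 )
		return -1;
	if ( child == 0 )
		report_and_die();
	int status;
	pid_t reaped = waitpid(child, &status, 0);
	if ( reaped < 0 )
		return -1;
	assert(reaped == child);
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The exit status 127 is an arbitrary choice for this sketch; a real handler might prefer to terminate with a signal so that the parent can tell a crash from a normal exit.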

The ideal approach is perhaps to have a special system call that does these tasks, and to invoke it unconditionally and immediately. Kernel code must not trust user-space code or be unsafely influenced by it, so it can be considered safe. It can then stop all threads in the process, investigate where the issue seemed to occur in the process, and alert the user or system administrator appropriately.

libssp

As an alternative to your own implementation, you can use the implementation that comes with GCC. This means you have to build libssp as part of your toolchain.

TODO: I have never built it for osdev purposes before, but I guess that you run make all-target-libssp and make install-target-libssp, like with libstdc++. It probably depends on libc for no good reason at all (as the GCC developers put fortify source functions in it and it wants to check whether they work).

The libssp approach is to have an initialization function marked as attribute constructor, which is run among the global constructors during process startup. This means SSP isn't properly online during the early parts of process initialization (but perhaps that's not a problem if all those C stack frames are gone before that point and the default null guard value was used until now). The startup code then proceeds to attempt opening /dev/urandom which might fail if you are in a chroot, are out of file descriptors, or your system doesn't have such a file (perhaps by design). If it fails, it falls back on a reasonable but known value. You can read the libssp initialization code here.
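
The general shape of such a constructor-based setup can be sketched as below. This is not the libssp source: the names startup_guard_value and startup_guard_setup are hypothetical, a stand-in variable is used instead of __stack_chk_guard, and the fallback constant is arbitrary.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Stand-in for __stack_chk_guard so the sketch does not clash with the
   real symbol. */
uintptr_t startup_guard_value;

/* Runs among the global constructors, before main. */
__attribute__((constructor))
static void startup_guard_setup(void)
{
	uintptr_t value = 0;
	int fd = open("/dev/urandom", O_RDONLY);
	if ( 0 <= fd )
	{
		ssize_t amount = read(fd, &value, sizeof value);
		close(fd);
		if ( amount != (ssize_t) sizeof value )
			value = 0;
	}
	/* Fallback if the random source was unavailable: a fixed, known value
	   (arbitrary in this sketch). */
	if ( value == 0 )
		value = (uintptr_t) 0xff0a0000;
	assert(value != 0);
	startup_guard_value = value;
}
```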

The libssp __stack_chk_fail implementation tries to open the terminal, constructs an error message with alloca, and then uses write to output it; if the terminal isn't accessible, it tries the system log. It then attempts to destroy the process by invoking __builtin_trap(), writing a 0 int to the int at address -1 (which is also undefined behavior and an unaligned pointer, in addition to probably crashing), and finally attempting to _exit(). This exiting strategy doesn't feel super robust. You can read the libssp handler code here.

Read the secure handling section above and read the code, then decide whether you want this linked into your programs, or whether it is cleaner to make your own implementation. You can also modify this code as part of your OS Specific Toolchain.

See Also

Threads

External Links
