119

I need to catch segmentation fault in third party library cleanup operations. This happens sometimes just before my program exits, and I cannot fix the real reason of this. In Windows programming I could do this with __try - __catch. Is there cross-platform or platform-specific way to do the same? I need this in Linux, gcc.

0

5 Answers 5

106

On Linux we can have these as exceptions, too.

Normally, when your program performs a segmentation fault, it is sent a SIGSEGV signal. You can set up your own handler for this signal and mitigate the consequences. Of course you should really be sure that you can recover from the situation. In your case, I think, you should debug your code instead.

Back to the topic. I recently encountered a library (short manual) that transforms such signals to exceptions, so you can write code like this:

try
{
    *(int*) 0 = 0;
}
catch (std::exception& e)
{
    std::cerr << "Exception caught : " << e.what() << std::endl;
}

Didn't check it, though. Works on my x86-64 Gentoo box. It has a platform-specific backend (borrowed from gcc's java implementation), so it can work on many platforms. It just supports x86 and x86-64 out of the box, but you can get backends from libjava, which resides in gcc sources.

Sign up to request clarification or add additional context in comments.

7 Comments

+1 for be sure that you can recover before catching sig segfault
Throwing from a signal handler is a very dangerous thing to do. Most compilers assume that only calls can generate exceptions, and set up unwind information accordingly. Languages that transform hardware exceptions into software exceptions, like Java and C#, are aware that anything can throw; this is not the case with C++. With GCC, you at least need -fnon-call-exceptions to ensure that it works–and there is a performance cost to that. There is also a danger that you'll be throwing from a function without exception support (like a C function) and leak/crash later.
I agree with zneak. Don't throw from a signal handler.
The library is now in github.com/Plaristote/segvcatch, but I couldn't find the manual or compile it. ./build_gcc_linux_release gives several errors.
@Gulzar, why would you catch an exception if you don't know how to recover from it? If you don't know for sure you can recover from an exception, you simply shouldn't catch it.
|
68
+50

Here's an example of how to do it in C.

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void segfault_sigaction(int signal, siginfo_t *si, void *arg)
{
    printf("Caught segfault at address %p\n", si->si_addr);
    exit(0);
}

int main(void)
{
    int *foo = NULL;
    struct sigaction sa;

    memset(&sa, 0, sizeof(struct sigaction));
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = segfault_sigaction;
    sa.sa_flags   = SA_SIGINFO;

    sigaction(SIGSEGV, &sa, NULL);

    /* Cause a seg fault */
    *foo = 1;

    return 0;
}

12 Comments

Doing IO in a signal handler is a recipe for disaster.
@TimSeguine: that's is not true. You just need to make sure you know what you are doing. signal(7) lists all async-signal-safe functions that can be used with relatively little care. In the example above it is also completely safe because nothing else in the program is touching stdout but the printf call in the handler.
@stefanct This is a toy example. Virtually any non-toy program is going to hold the lock on stdout at some point. With this signal handler, the worst that can probably happen is a deadlock on segfault, but that can be bad enough if you currently have no mechanism to kill rogue processes in your use case.
according to 2.4.3 Signal Actions, calling printf from within a signal handler which is called as a result of an illegal indirection, whether the program is multithreaded or not is just plain undefined behavior period.
@TylerDurden First of all, I printf out of exceptions all the time. So you have low standards for the code you write, and you publicly attach your name to that fact. I'm not sure why you're proud of that, but OK. Secondly your idea that printf uses the heap is totally wrong and you would know that if actually read the source code to a typical printf function. ORLY?!?! You just might want to take your own advice about reading source code before posting something.
|
24
+100

For portability, one should probably use std::signal from the standard C++ library, but there is a lot of restriction on what a signal handler can do. Unfortunately, it is not possible to catch a SIGSEGV from within a C++ program without introducing undefined behavior because the specification says:

  1. it is undefined behavior to call any library function from within the handler other than a very narrow subset of the standard library functions (abort, exit, some atomic functions, reinstall current signal handler, memcpy, memmove, type traits, std::move, std::forward, and some more).
  2. it is undefined behavior if handler use a throw expression.
  3. it is undefined behavior if the handler returns when handling SIGFPE, SIGILL, SIGSEGV

This proves that it is impossible to catch SIGSEGV from within a program using strictly standard and portable C++. SIGSEGV is still caught by the operating system and is normally reported to the parent process when a wait family function is called.

You will probably run into the same kind of trouble using POSIX signal because there is a clause that says in 2.4.3 Signal Actions:

The behavior of a process is undefined after it returns normally from a signal-catching function for a SIGBUS, SIGFPE, SIGILL, or SIGSEGV signal that was not generated by kill(), sigqueue(), or raise().

A word about the longjumps. Assuming we are using POSIX signals, using longjump to simulate stack unwinding won't help:

Although longjmp() is an async-signal-safe function, if it is invoked from a signal handler which interrupted a non-async-signal-safe function or equivalent (such as the processing equivalent to exit() performed after a return from the initial call to main()), the behavior of any subsequent call to a non-async-signal-safe function or equivalent is undefined.

This means that the continuation invoked by the call to longjump cannot reliably call usually useful library function such as printf, malloc or exit or return from main without inducing undefined behavior. As such, the continuation can only do a restricted operations and may only exit through some abnormal termination mechanism.

To put things short, catching a SIGSEGV and resuming execution of the program in a portable is probably infeasible without introducing undefined behavior. Even if you are working on a Windows platform for which you have access to Structured exception handling, it is worth mentioning that MSDN suggest to never attempt to handle hardware exceptions: Hardware Exceptions.

At last but not least, whether any SIGSEGV would be raised when dereferencing a null valued pointer (or invalid valued pointer) is not a requirement from the standard. Because indirection through a null valued pointer or any invalid valued pointer is an undefined behaviour, which means the compiler assumes your code will never attempt such a thing at runtime, the compiler is free to make code transformation that would elide such undefined behavior. For example, from cppreference,

int foo(int* p) {
    int x = *p;
    if(!p)
        return x; // Either undefined behavior above or this branch is never taken
    else
        return 0;
}
 
int main() {
    int* p = nullptr;
    std::cout << foo(p);
}

Here the true path of the if could be completely elided by the compiler as an optimization; only the else part could be kept. Said otherwise, the compiler infers foo() will never receive a null valued pointer at runtime since it would lead to an undefined behaviour. Invoking it with a null valued pointer, you may observe the value 0 printed to standard output and no crash, you may observe a crash with SIGSEG, in fact you could observe anything since no sensible requirements are imposed on programs that are not free of undefined behaviors.

1 Comment

SIGSEGV is hardly a hardware exception, though. One could always use a parent-child architecture where the parent is able to detect the case of a child that got killed by the kernel and use IPC to share relevant program state in order to resume where we left of. I believe modern browsers can be seen this way, as they use IPC mechanisms to communicate with that one process per browser tab. Obviously the security boundary between processes is a bonus in the browser scenario.
8

C++ solution found here (http://www.cplusplus.com/forum/unices/16430/)

#include <signal.h>
#include <stdio.h>
#include <unistd.h>
void ouch(int sig)
{
    printf("OUCH! - I got signal %d\n", sig);
}
int main()
{
    struct sigaction act;
    act.sa_handler = ouch;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    sigaction(SIGINT, &act, 0);
    while(1) {
        printf("Hello World!\n");
        sleep(1);
    }
}

10 Comments

I know this is just an example that you didn't write, but doing IO in a signal handler is a recipe for disaster.
@TimSeguine: repeating stuff that is at best very misleading is not a good idea (cf. stackoverflow.com/questions/2350489/…)
@stefanct The precautions necessary in order to use printf safely in a signal handler are not trivial. There is nothing misleading about that. This is a toy example. And even in this toy example it is possible to deadlock if you time the SIGINT right. Deadlocks are dangerous precisely BECAUSE they are rare. If you think this advice was misleading, then stay away from my code, because I don't trust you within a mile of it.
@stefanct If you want to nitpick and ignore the context of the statement, then that is your problem. Who said I was talking about I/O in general? You. I just have a major problem with people posting toy answers to difficult problems. Even in the case you use async safe functions, there is still a lot to think about and this answer makes it seem like it is trivial.
@Gulzar There are a lot of different reasons for segmentation faults to occur, so it is not possible to give a general answer. But the most common reason to encounter one is that there is a bug in your code that causes unsafe memory access. In those cases trying to catch them and recover is not usually possible or advisable because stack or heap corruption has already occurred and the only safe thing to do is die. The simplest advice would be to try to find the bug by running it with valgrind or by recompiling with "address sanitizer" to try to find the bug. Both give hints about where to look
|
7

Sometimes we want to catch a SIGSEGV to find out if a pointer is valid, that is, if it references a valid memory address. (Or even check if some arbitrary value may be a pointer.)

One option is to check it with isValidPtr() (worked on Android):

int isValidPtr(const void*p, int len) {
    if (!p) {
    return 0;
    }
    int ret = 1;
    int nullfd = open("/dev/random", O_WRONLY);
    if (write(nullfd, p, len) < 0) {
    ret = 0;
    /* Not OK */
    }
    close(nullfd);
    return ret;
}
int isValidOrNullPtr(const void*p, int len) {
    return !p||isValidPtr(p, len);
}

Another option is to read the memory protection attributes, which is a bit more tricky (worked on Android):

re_mprot.c:

#include <errno.h>
#include <malloc.h>
//#define PAGE_SIZE 4096
#include "dlog.h"
#include "stdlib.h"
#include "re_mprot.h"

struct buffer {
    int pos;
    int size;
    char* mem;
};

char* _buf_reset(struct buffer*b) {
    b->mem[b->pos] = 0;
    b->pos = 0;
    return b->mem;
}

struct buffer* _new_buffer(int length) {
    struct buffer* res = malloc(sizeof(struct buffer)+length+4);
    res->pos = 0;
    res->size = length;
    res->mem = (void*)(res+1);
    return res;
}

int _buf_putchar(struct buffer*b, int c) {
    b->mem[b->pos++] = c;
    return b->pos >= b->size;
}

void show_mappings(void)
{
    DLOG("-----------------------------------------------\n");
    int a;
    FILE *f = fopen("/proc/self/maps", "r");
    struct buffer* b = _new_buffer(1024);
    while ((a = fgetc(f)) >= 0) {
    if (_buf_putchar(b,a) || a == '\n') {
        DLOG("/proc/self/maps: %s",_buf_reset(b));
    }
    }
    if (b->pos) {
    DLOG("/proc/self/maps: %s",_buf_reset(b));
    }
    free(b);
    fclose(f);
    DLOG("-----------------------------------------------\n");
}

unsigned int read_mprotection(void* addr) {
    int a;
    unsigned int res = MPROT_0;
    FILE *f = fopen("/proc/self/maps", "r");
    struct buffer* b = _new_buffer(1024);
    while ((a = fgetc(f)) >= 0) {
    if (_buf_putchar(b,a) || a == '\n') {
        char*end0 = (void*)0;
        unsigned long addr0 = strtoul(b->mem, &end0, 0x10);
        char*end1 = (void*)0;
        unsigned long addr1 = strtoul(end0+1, &end1, 0x10);
        if ((void*)addr0 < addr && addr < (void*)addr1) {
            res |= (end1+1)[0] == 'r' ? MPROT_R : 0;
            res |= (end1+1)[1] == 'w' ? MPROT_W : 0;
            res |= (end1+1)[2] == 'x' ? MPROT_X : 0;
            res |= (end1+1)[3] == 'p' ? MPROT_P
                 : (end1+1)[3] == 's' ? MPROT_S : 0;
            break;
        }
        _buf_reset(b);
    }
    }
    free(b);
    fclose(f);
    return res;
}

int has_mprotection(void* addr, unsigned int prot, unsigned int prot_mask) {
    unsigned prot1 = read_mprotection(addr);
    return (prot1 & prot_mask) == prot;
}

char* _mprot_tostring_(char*buf, unsigned int prot) {
    buf[0] = prot & MPROT_R ? 'r' : '-';
    buf[1] = prot & MPROT_W ? 'w' : '-';
    buf[2] = prot & MPROT_X ? 'x' : '-';
    buf[3] = prot & MPROT_S ? 's' : prot & MPROT_P ? 'p' :  '-';
    buf[4] = 0;
    return buf;
}

re_mprot.h:

#include <alloca.h>
#include "re_bits.h"
#include <sys/mman.h>

void show_mappings(void);

enum {
    MPROT_0 = 0, // not found at all
    MPROT_R = PROT_READ,                                 // readable
    MPROT_W = PROT_WRITE,                                // writable
    MPROT_X = PROT_EXEC,                                 // executable
    MPROT_S = FIRST_UNUSED_BIT(MPROT_R|MPROT_W|MPROT_X), // shared
    MPROT_P = MPROT_S<<1,                                // private
};

// returns a non-zero value if the address is mapped (because either MPROT_P or MPROT_S will be set for valid addresses)
unsigned int read_mprotection(void* addr);

// check memory protection against the mask
// returns true if all bits corresponding to non-zero bits in the mask
// are the same in prot and read_mprotection(addr)
int has_mprotection(void* addr, unsigned int prot, unsigned int prot_mask);

// convert the protection mask into a string. Uses alloca(), no need to free() the memory!
#define mprot_tostring(x) ( _mprot_tostring_( (char*)alloca(8) , (x) ) )
char* _mprot_tostring_(char*buf, unsigned int prot);

PS DLOG() is printf() to the Android log. FIRST_UNUSED_BIT() is defined here.

PPS It may not be a good idea to call alloca() in a loop -- the memory may be not freed until the function returns.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.