I have a Python script that seems to work in an Eclipse run configuration. When I run it from the Ubuntu command line, I get a segmentation fault after the main program ends. Why is this happening, and how can I fix it or at least debug it?

$:~/ober/code/impute/impute/batch-beagle$ python ~/ober/code/impute/bin/ibd_segments.py -v 1 ~/ober/data/hutt/chr22/hutt.stage5.npz ~/ober/data/hutt/hutt.kinship /home/oren/ober/code/impute/impute/batch-beagle/out/node-0/node-0/ibd-segment-0.in
[[0 0]]
Pair 1/4: (0,0) (0,0)
0 3218 16484792 51156934 0 0
Pair 2/4: (0,0) (0,1)
Pair 3/4: (0,1) (0,0)
Pair 4/4: (0,1) (0,1)
0 3218 16484792 51156934 0 1
Done
Segmentation fault (core dumped)

Script:

import os, sys, impute as im, itertools, csv, optparse, traceback, util, numpy as np

####################################################################################
if __name__ == '__main__':
    '''
    --------------------------------------------------
    Main program
    --------------------------------------------------
    '''
    # Parse and validate command-line arguments
    PROGRAM = os.path.basename(sys.argv[0])
    usage = 'Usage: %s [flags] <phased-data-file> <kinship-file> <input-file>\n\n' \
        'Locate IBD segments among a subset of sample in an NPZ phased data set.\n' \
        'Sample pairs are read from standard input. Segments are written to standard output.\n' \
        '\tphased-data-file - NPZ file containing phasing results\n' \
        '\tkinship-file - Sorted identity coefficient file\n' \
        '\tpair-list-file - Sorted identity coefficient file\n' \
        '\tout-file - File to output segments to\n' \
        '\n' \
        'Example:\n' \
        'phased-data-file = /home/oren/ober/data/hutt/chr22/hutt.stage5.npz\n' \
        'kinship-file = /home/oren/ober/data/hutt/hutt.kinship\n' \
        'pair-list-file contains the lines\n' \
        '0 1\n' \
        '...\n' \
        '0 100\n' \
        '\n' \
        'Type ''%s -h'' to display full help.' % (PROGRAM, PROGRAM)
    parser = optparse.OptionParser(usage=usage)
    parser.add_option('-v', '--debug', type='int', dest='debug', default=0,
                      help='Debug Level (0=quiet; 1=summary; 2=full debug)')
    (options, args) = parser.parse_args(sys.argv[1:])
    if len(args) != 3:
        print usage
        sys.exit(1)
    phased_data_file, kinship_file, input_file = args

    try:
        # Load data
        problem = im.io.read_npz(phased_data_file)
        params = im.PhaseParam(kinship_file=kinship_file, debug=(options.debug >= 2))

        # Read all pairs from stdin first
        # pairs = [(int(line[0]), int(line[1])) for line in csv.reader(sys.stdin, delimiter=' ', skipinitialspace=True) if line]
        pairs = np.loadtxt(input_file, dtype=np.uint)
        if len(pairs.shape) < 2: 
            pairs = pairs[np.newaxis]
        print pairs

        # Loop over pairs and output segments to output file
        num_pairs = 4 * len(pairs)
        for k, ((i, j), (a, b)) in enumerate(itertools.product(pairs, itertools.product(im.constants.ALLELES, im.constants.ALLELES))):
            if options.debug >= 1:
                print 'Pair %d/%d: (%d,%d) (%d,%d)' % (k + 1, num_pairs, i, a, j, b)
            segments = im.ih.hap_segments(problem, i, a, j, b, params)
            segments.save(sys.stdout)
        print 'Done'
    except:
        traceback.print_exc(file=sys.stdout)
        sys.exit(util.EXIT_FAILURE)
  • Could be related to numpy. Output from running under gdb ($ gdb python): Program received signal SIGSEGV, Segmentation fault. 0x00007ffff5d37bf8 in PyArray_Item_XDECREF (data=0x41113e1 "(\371\322\002\304Z|\n\200\361\214?\270\211", <incomplete sequence \373>, descr=0x2d3c5d0) at numpy/core/src/multiarray/refcount.c:71. 71 numpy/core/src/multiarray/refcount.c: No such file or directory. Commented Jan 29, 2013 at 21:47
  • Did it happen to generate a core file? Also, have you tried running it with strace? Commented Jan 29, 2013 at 21:49
  • To get a meaningful traceback on a segfault, you could use the faulthandler module (in the stdlib since Python 3.3; it can be installed separately on Python 2.7). Commented Jan 29, 2013 at 22:55
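
The faulthandler suggestion in the last comment could look roughly like the sketch below. It is written for Python 3, where faulthandler is part of the standard library; the helper name `enable_crash_tracebacks` and the log path are hypothetical and not part of the original script.

```python
import faulthandler


def enable_crash_tracebacks(log_path):
    """Ask the interpreter to dump Python tracebacks to log_path on a crash.

    The name and signature of this helper are illustrative only.
    """
    # faulthandler needs a real file descriptor, so open a regular file.
    log_file = open(log_path, "w")
    # On SIGSEGV, SIGFPE, SIGABRT, SIGBUS or SIGILL, write the traceback
    # of every thread to the log file before the process dies.
    faulthandler.enable(file=log_file, all_threads=True)
    # Return the file object so the caller keeps it open for the
    # lifetime of the process.
    return log_file
```

Calling `enable_crash_tracebacks("crash.log")` at the top of the script would leave a Python-level traceback in `crash.log` if the interpreter segfaults. On Python 3 the same behaviour is available without code changes by running `python -X faulthandler script.py`, which writes to stderr instead.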

1 Answer

It turns out the NPZ file I was loading with numpy.load() was corrupt; I had transferred it via rsync from my home computer to this machine. After I regenerated the NPZ file on this machine, everything worked. Thanks for your feedback.
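
For anyone hitting something similar, here is a minimal sketch of how a transferred NPZ file might be sanity-checked before loading it. It relies only on the fact that an .npz file is a zip archive of .npy members, uses the Python 3 standard library, and the function names `sha256_of` and `npz_is_intact` are hypothetical. It only detects damage at the archive level (truncation, CRC mismatches); it does not prove that the arrays inside are valid, and it is not a diagnosis of what went wrong in the transfer above.

```python
import hashlib
import zipfile
import zlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def npz_is_intact(path):
    """Return True if the zip container of an .npz file passes its CRC checks."""
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() reads every member and returns the name of the
            # first one with a bad CRC or header, or None if all are fine.
            return archive.testzip() is None
    except (zipfile.BadZipFile, zlib.error, EOFError, OSError):
        # Truncated or otherwise unreadable archive.
        return False
```

Running `sha256_of` on the source and destination copies and comparing the two digests shows whether the bytes changed in transit, while `npz_is_intact` can be called right before `numpy.load()` to fail early with a clear message instead of a segfault at interpreter exit.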