1

I have the following merge sort algorithm that works when I test with 100, 1000, or 10000 long linked lists. However, it returns a segmentation fault on the line containing Node->Next=Merge(Front,Back->Next,Type); when using 100000 or 1000000 long linked lists. This has lead me to believe it is a stack overflow rather than a segmentation fault. According to gdb at the time of the error the stack is extraordinarily full. I cannot find a way to examine exactly how many items are in the call stack to give an exact figure. Any help with reworking merge sort to handle large amounts of data would be greatly appreciated.

struct IntList
{
int Value;
int Frequency;
struct IntList* Next;
struct IntList* Prev;
};//Struct for Integer Linked List
void SortList(struct IntList** Values,enum SortBy Type)
{
    struct IntList* Head = *Values;
    if(Head==NULL||Head->Next==NULL)
    {
        return;
    }//If Base Case
    struct IntList* Front;
    struct IntList* Back;
    Split(Head,&Front,&Back);//Splits Linked List
    SortList(&Front,Type);
    SortList(&Back,Type);//Recursively Sorts
    *Values=Merge(Front,Back,Type);//Merges Halves
    return;
}

void Split(struct IntList* Head,struct IntList** Front,struct IntList** Back)
{
    struct IntList* Fast;
    struct IntList* Slow;
    if (Head==NULL||Head->Next==NULL)
    {
        *Front=Head;
        *Back=NULL;
    }//If Length <2
    else
    {
        Slow=Head;
        Fast=Head->Next;
    }
    while(Fast!=NULL)
    {
        Fast=Fast->Next;
        if(Fast!=NULL)
        {
            Fast=Fast->Next;
            Slow=Slow->Next;
        }
    }//Find Midpoint
    *Front=Head;
    *Back=Slow->Next;
    Slow->Next=NULL;//Breaks Link
    return;
}


struct IntList* Merge(struct IntList* Front,struct IntList* Back,enum SortBy Type)
{
    if(Front==NULL)
    {
        return Back;
    }
    if (Back==NULL)
    {
        return Front;
    }//Base Cases

    struct IntList* Node;
    if(Type==DATA)
    {
        if(Front->Value <= Back->Value)
        {
           Node=Front;
           Node->Next=Merge(Front->Next,Back,Type);
        }
        else
        {
            Node=Back;
            Node->Next=Merge(Front,Back->Next,Type);
        }//Takes Greatest Value for Sorted List
    }//If Sorting by Data
    if(Type==FREQUENCY)
    {
        if(Front->Frequency < Back->Frequency)
        {
           Node=Front;
           Node->Next=Merge(Front->Next,Back,Type);
        }
        else
        {
            Node=Back;
            Node->Next=Merge(Front,Back->Next,Type);
        }//Takes Greatest Frequency for Sorted List
    }//If Sorting by Frequency
    return(Node);
3
  • 8
    Stop using recursion to merge lists; do it iteratively. Commented Aug 30, 2017 at 22:51
  • 1
    Well, you could block it in your hosts file, you could use a parental controls app... Oh! You meant a literal stack overflow! Well, in that case, user2357112 is right, you've probably hit the recursion limit. ;-) Commented Aug 30, 2017 at 22:59
  • Using recursion produces elegant code,, but you see first hand here its trouble and limitations. At the segfault (or any breakpoint) in gdb you can type bt (short for backtrace) and it will print out the call stack ... can't quiiite tell if you know this already. Commented Aug 30, 2017 at 23:12

2 Answers 2

1

If you want to use recursion, it is best to try to express it in tail-call form (so that nothing is done after the recursive call returns other than a return). That way most compilers will optimize the tail-call into a simple jump and not use any stack space. For your Merge function, it becomes something like:

void Merge(struct IntList **merged, struct IntList* Front,struct IntList* Back,enum SortBy Type)
{
    if(Front==NULL) {
        *merged = Back;
    } else if (Back==NULL) {
        *merged = Front;
    } else if(Type==DATA) {
        if(Front->Value <= Back->Value) {
            *merged = Front;
            Merge(&Front->Next, Front->Next, Back, Type);
        } else {
            *merged = Back;
            Merge(&Back->Next, Front, Back->Next,Type);
        }//Takes Greatest Value for Sorted List if Sorting by Value
    } else if(Type==FREQUENCY) {
        if(Front->Frequency < Back->Frequency) {
            *merged = Front;
            Merge(&Front->next, Front->Next, Back, Type);
        } else {
            *merged = Back;
            Merge(&Back->Next, Front, Back->Next, Type);
        }//Takes Greatest Frequency for Sorted List
    }//If Sorting by Frequency
}

If your compiler doesn't support tail recursion optimization, you can do it yourself by making the body of the function a loop:

void Merge(struct IntList **merged, struct IntList* Front,struct IntList* Back,enum SortBy Type)
{
  while(Front || Back) {
    if(Front==NULL) {
        *merged = Back;
        Back = NULL;
    } else if (Back==NULL) {
        *merged = Front;
        Front = NULL
    } else if(Type==DATA) {
        if(Front->Value <= Back->Value) {
            *merged = Front;
            merged = &Front->Next;
            Front = Front->Next;
        } else {
            *merged = Back;
            merged = &Back->Next;
            Back = Back->Next;
        }//Takes Greatest Value for Sorted List if Sorting by Value
    } else if(Type==FREQUENCY) {
        if(Front->Frequency < Back->Frequency) {
            *merged = Front;
            merged = &Front->Next;
            Front = Front->Next;
        } else {
            *merged = Back;
            merged = &Back->Next;
            Back = Back->Next;
        }//Takes Greatest Frequency for Sorted List
    }//If Sorting by Frequency
  }
}
Sign up to request clarification or add additional context in comments.

Comments

0

This simplest answer is not to use recursion.

However, if you are dead set on using recursion you can limit what goes on the stack by moving temp variables used in the function to outside the scope of the function or declaring them static so that they can be reused rather than making room for them every time merge is called (can have adverse effects if you are interesting in multi-threaded applications). It doesn't look like you have any variables that would be safe to do this with.

You can also encapsulate parameters with another struct so that you don't have to pass in three pointers onto the new stack call, I.E., one parameter takes up less space than 3.

Something like:

struct IntList* Merge(struct mergeData* data)

There are also ways of adjusting the stack size so that your application can handle the data sets you expect.

Overall, this is a limitation of recursion. If you deal with embedded systems that have limited resources (like myself), then you usually avoid it.

2 Comments

You can also encapsulate parameters with another struct so that you don't have to pass in three pointers onto the new stack call, I.E., one parameter takes up less space than 3. --- what? You do realize that you will have to make space for the struct on the caller stack?
Allocate in the heap is another option.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.