17

I want to use a structure like:

struct node {
   char[10] tag;
   struct node *next;
};

I want to use the above structure to create a doubly-linked list. Is that possible and if yes, then how I can achieve it?

5
  • Your accepted answer is C++ only. Please remove the C tag, as you seem looking for C++. Commented Nov 20, 2015 at 11:42
  • 3
    While the example was in C++ the core of the question and answer is the same for both. Commented Nov 20, 2015 at 14:50
  • 1
    If you detail the operations needed on your list - even better answers can be had. The XOR approach does not provide "the ability to delete a node from the list knowing only its address or the ability to insert a new node before or after an existing node when knowing only the address of the existing node." So unless you are giving up some feature of a doubly linked list, code cannot solve the issue with a single pointer/data. Commented Nov 20, 2015 at 16:59
  • <sarcasm>Good question, since the original doubly-linked list appears to be patented: link You wouldn't want to be accused of stealing someone else's intellectual property...</sarcasm> Commented Nov 20, 2015 at 18:18
  • You can trivially make a doubly linked list of length 2 ;) Commented Nov 25, 2015 at 14:57

5 Answers 5

31

Yes, it's possible, but it's a dirty hack.

It's called XOR linked list. (https://en.wikipedia.org/wiki/XOR_linked_list)

Each node stores a XOR of next and prev as a uintptr_t.


Here is an example:

#include <cstddef>
#include <iostream>

struct Node
{
    int num;
    uintptr_t ptr;
};

int main()
{
    Node *arr[4];
    // Here we create a new list.
    int num = 0;
    for (auto &it : arr)
    {
        it = new Node;
        it->num = ++num;
    }
    arr[0]->ptr =                     (uintptr_t)arr[1];
    arr[1]->ptr = (uintptr_t)arr[0] ^ (uintptr_t)arr[2];
    arr[2]->ptr = (uintptr_t)arr[1] ^ (uintptr_t)arr[3];
    arr[3]->ptr = (uintptr_t)arr[2];

    // And here we iterate over it
    Node *cur = arr[0], *prev = 0;
    do
    {
        std::cout << cur->num << ' ';
        prev = (Node *)(cur->ptr ^ (uintptr_t)prev);
        std::swap(cur, prev);
    }
    while (cur);
    return 0;
}

It prints 1 2 3 4 as expected.

Sign up to request clarification or add additional context in comments.

18 Comments

Nice trick, hadn't seen that before. You do need the previous or next node as well as the current node in order to traverse the list though.
Do it have any issue like performance or something else.
@Avi Yes, performance can be an issue. Normal double linked list is faster. You should use XOR linked list only if you need to keep memory usage very low.
@Avi If performance is a concern, the performance of a linked-list (single or double) or any linked structure in general heavily depends on the memory layout given to it by its memory allocator. Whether trees or linked lists, if you want to take the performance of such structures to the next level, you want to look at it from the standpoint of memory allocation, and getting back some of those contiguous qualities that arrays have which make them so well-suited for exploiting locality of reference.
@Deduplicator: Ah! I see. Indeed, it could for example work within a single shared memory area.
|
18

I'd like to offer an alternative answer which boils down to "yes and no".

First, it's "sort of impossible" if you want to get the full benefits of a doubly-linked list with only one single pointer per node.

XOR List

Yet cited here was also the XOR linked list. It retains one main benefit by having a lossy compression of two pointers fitting into one that you lose with a singly-linked list: the ability to traverse it in reverse. It cannot do things like remove elements from the middle of the list in constant-time given only the node address, and being able to go back to a previous element in a forward iteration and removing an arbitrary element in linear time is even simpler without the XOR list (you likewise keep two node pointers there: previous and current).

Performance

Yet also cited in the comments was a desire for performance. Given that, I think there are some practical alternatives.

First, a next/prev pointer in a doubly-linked list doesn't have to be, say, a 64-bit pointer on 64-bit systems. It can be two indices into a 32-bit contiguous address space. Now you got two indices for the memory price of one pointer. Nevertheless, trying to emulate 32-bit addressing on 64-bit is quite involved, maybe not exactly what you want.

However, to reap the full performance benefits of a linked structure (trees included) often requires you to get back control over how the nodes are allocated and distributed in memory. Linked structures tend to be bottlenecky because, if you just use malloc or plain operator new for every node, e.g., you lose control over memory layout. Often (not always -- you can get lucky depending on the memory allocator, and whether you allocate all your nodes at once or not) this means a loss of contiguity, which means a loss of spatial locality.

It's why data-oriented design stresses arrays more than anything else: linked structures are not normally very friendly for performance. The process of moving chunks from larger memory to smaller, faster memory likes it if you are going to access neighboring data within the same chunk (cache line/page, e.g.) prior to eviction.

The Not-So-Often-Cited Unrolled List

So there's a hybrid solution here which is not so often discussed, which is the unrolled list. Example:

struct Element
{
    ...
};

struct UnrolledNode
{
    struct Element elements[32];
    struct UnrolledNode* prev;
    struct UnrolledNode* next;
};

An unrolled list combines the characteristics of arrays and doubly-linked lists all into one. It'll give you back a lot of spatial locality without having to look to the memory allocator.

It can traverse forwards and backwards, it can remove arbitrary elements from the middle at any given time for cheap.

And it reduces the linked list overhead to the absolute minimum: in this case I hard-coded an unrolled array size of 32 elements per node. That means the cost of storing the list pointers has shrunk to 1/32th of its normal size. That's even cheaper from a list pointer overhead standpoint than a singly-linked list, with often faster traversal (because of cache locality).

It's not a perfect replacement for a doubly-linked list. For a start, if you are worried about the invalidation of existing pointers to elements in the list on removal, then you have to start worrying about leaving vacant spaces (holes/tombstones) behind that get reclaimed (possibly by associating free bits in each unrolled node). At that point you're dealing with many similar concerns of implementing a memory allocator, including some minor forms of fragmentation (ex: having an unrolled node with 31 vacant spaces and only one element occupied -- the node still has to stay around in memory to avoid invalidation until it becomes completely empty).

An "iterator" to it which allows insertion/removal to/from the middle typically has to be larger than a pointer (unless, as noted in the comments, you store additional metadata with each element). It can waste memory (typically moot unless you have really teeny lists) by requiring, say, the memory for 32 elements even if you have a list of only 1 element. It does tend to be a little more complex to implement than any of these above solutions. But it's a very useful solution in a performance-critical scenario, and often one that probably deserves more attention. It's one that's not brought up so much in computer science since it doesn't do any better from an algorithmic perspective than a regular linked list, but locality of reference has a significant impact on performance as well in real-world scenarios.

16 Comments

One thing you could do is take your unrolled list and add a field to each Element which combines a flag saying whether the element is in use and also a number saying which index the Element is within it's parent UnrolledNode object. This would allow you to remove an Element or iterate through the list given only a pointer to an Element.
@Ike: In any case, for alignment reasons, packing the tombstone flags in a bit array (a single uint32_t in your case) outside the elements array is probably a good idea. Don't want to accidentally bump the size of the elements array just for 1 bit per element... It also hints that on a 64 bits system you would probably aim for a 64 elements array.
@Ike: Note that looking through 32 or 64 bits is a constant-time operation as it does not depend on the number of elements in the sequence. Regarding packing an "index" into the element, a potential trick is using struct NodeData: Element, Index {} where the latter only contains a single byte. If there is a spare byte at the end of Element, it'll be for free... but I'd personally favor having meta-data in the iterator.
@Ike: First of all, looping over 32 or 64 values is constant time, in that it does not vary depending on the number of elements in the list (that is O(64) = O(1)). That being said, compilers generally offer built-ins for bit operations (which if available map to equivalent CPU instructions). Finding the first 0 bit of an integer can be accomplished by calling __builtin_clz on gcc (count leading zeroes) on the negated integer. See gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html for a list of plenty of built-ins :)
@Ike: O(...) in itself does not mean anything more than 1 or 2, it's just a quantity without unit. In order to be fully formal, one should thus specify the quantity, for example insertion in a binary search tree of N elements is O(log N) comparisons. I am afraid we were a bit sloppy earlier, and thus ended up talking past each others as we each had our own units in mind... Beyond unit, note that the O(...), o(...) and Θ(...) notations express the algorithmic complexity when N tends toward infinity. I do not think they should really be used for bounded quantities normally.
|
8

It's not completely possible. A doubly linked list requires two pointers, one for the link in each direction.

Depending on what you need the XOR linked list may do what you need (see HolyBlackCat's answer).

Another option is to work around this limitation a little by doing things like remembering the last node you processed as you iterate through the list. This will let you go back one step during the processing but it doesn't make the list doubly linked.

Comments

4

You can declare and support two initial pointers to nodes head and tail. In this case you will be able to add nodes to the both ends of the list.

Such a list sometimes called a two-sided list.

However the list itself will be a forward list.

Using such a list you can for example simulate a queue.

4 Comments

While suitable for some applicatoins of doubly linked lists, this does not have O(1) backwards iteration, O(1) insertion at pointed-to position, O(1) deletion at pointed-to position. -- One could even get rid of all pointers if one wanted to implement a queue of bounded length ...
@HagenvonEitzen It does not has backward iterations And I said about this in my answer. So you comment does not make any sense.
@HagenvonEitzen: Actually, it has O(1) insertion/deletion at any position... but you need to give the position of the element prior to the one you wish to remove.
it's a deque really.
1

It is not possible in a portable way without invoking undefined behavior: Can an XOR linked list be implemented in C++ without causing undefined behavior?

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.