1

I am new to golang and I am writing some graph routing algorithms. My graph representation looks like so

type Vertex [2]float64
type Edges map[Vertex]BagOfVertices
type BagOfVertices map[*Vertex]bool

I want to be able to represent edges from a particular Vertex, as a set of references to other Vertices. I am very memory constrained. To avoid memory costs of allocating lots of duplicate Vertex objects, I want to use pointers to vertex objects.

I have 1,335,262 nodes and 4,895,070 edges and about 800MB of RAM.

Here is my attempt at doing that

func (e *Edges) GetOrCreateVertex(vertex Vertex) *Vertex {
    edges := *e
    if _, ok := edges[vertex]; ok {
        fmt.Println("Found val")
        return &vertex
    }
    edges[vertex] = make(BagOfVertices)
    fmt.Println("Create val")
    return &vertex
}

func TestEdges(t *testing.T) {
    var edges Edges = make(map[Vertex]BagOfVertices)

    // Create edge from vertex 0 to vertex 1
    v0 := edges.GetOrCreateVertex(Vertex{0, 0})
    v1 := edges.GetOrCreateVertex(Vertex{1, 1})
    edges[*v0][v1] = true

    // Check edge exist from vertex 0 to vertex 1
    v0 = edges.GetOrCreateVertex(Vertex{0, 0})
    v1 = edges.GetOrCreateVertex(Vertex{1, 1})
    if _, ok := edges[*v0][v1]; !ok {
        t.Errorf("Edge from %v to %v does not exist", v0, v1)
    }
}

Clearly the pointer returned by GetOrCreateVertex points to the value which was just created rather than the key of Edges. How can I get GetOrCreateVertex to return the pointer to the key in the Edges map?

My work around was to create

Demo of failing test


My workaround is to have a second map to store the pointers to the vertices.

type Vertex [2]float64
type GraphType struct {
    vertices Vertices
    edges    Edges
}
type Vertices map[Vertex]*Vertex
type Edges map[*Vertex]BagOfVertices
type BagOfVertices map[*Vertex]bool

func (graph *GraphType) Init() {
    graph.vertices = make(Vertices)
    graph.edges = make(Edges)
}
func (graph *GraphType) GetOrCreateVertex(vertex Vertex) *Vertex {
    if val, ok := graph.vertices[vertex]; ok {
        fmt.Println("Found val")
        return val
    }
    graph.vertices[vertex] = &vertex
    graph.edges[&vertex] = make(BagOfVertices)
    fmt.Println("Create val")
    return &vertex
}

func TestEdges(t *testing.T) {
    var graph GraphType
    graph.Init()
    // Create vertex 0 and vertex 1
    graph.GetOrCreateVertex(Vertex{0, 0})
    graph.GetOrCreateVertex(Vertex{1, 1})

    // Create edge from vertex 0 to vertex 1
    v0 := graph.GetOrCreateVertex(Vertex{0, 0})
    v1 := graph.GetOrCreateVertex(Vertex{1, 1})
    graph.edges[v0][v1] = true

    // Check edge exist from vertex 0 to vertex 1
    v0 = graph.GetOrCreateVertex(Vertex{0, 0})
    v1 = graph.GetOrCreateVertex(Vertex{1, 1})
    if _, ok := graph.edges[v0][v1]; !ok {
        t.Errorf("Edge from %v to %v does not exist", v0, v1)
    }
}

5 Answers 5

2

Worth noting: the size of Vertex is the size of 2 float64s, or 16 bytes. The size of a pointer to Vertex is probably 8 bytes. So de-duplicating Vertex instances could, at least potentially, cut the size in half, if there are lots of duplicate vertices.

If you choose to do this, you do need something like your second version of your code. You can, however, either use a per-graph de-duplicator, as you are doing here, or you could simply use a global vertex de-duplicator. The latter means that the de-duplicator cannot be garbage collected when any one graph is discarded.

How many vertices will be active in any one graph? How many graphs will you create and destroy over time, in what pattern? The answers to these two questions will determine whether the space-savings from de-duplication / interning Vertex instances is worthwhile.

Sign up to request clarification or add additional context in comments.

4 Comments

Updated question - I have 1,335,262 nodes and 4,895,070 edges and about 800MB of RAM.
Only 800 MB? Are you on a 32-bit system? If so, the pointers will be just 32 bits. In any case: 5 million unique edges will consume at least 80 MB for the edge values (5 * 16). To store them via pointers, add another 20 or 40 MB for the pointers.
Free tier on heroku 😜 demo of what I am building here 7gl5f.csb.app source code here github.com/mfbx9da4/brightpath-backend
If you have 5 million vertices but only, say, 1 million of them are unique, the deduplication would save ~64 MB of data while costing either 16 or 32 MB of pointers.
2

Do you really need this many indirections? If you change vertex representation to keep its own edges, I think the representation becomes much cleaner, it is easier to work with, and with a small memory footprint.

type Vertex struct {
   Values [2]float64
   Edges  map[*Vertex]struct{}
}

Comments

1

Since these are just vertices comprised of coordinates, do you really need the memory access?

Here's your test case passing without using pointers: https://play.golang.org/p/RBV0NNf9F_m

Here's a version with pointers but please note how I passed the same v1 and v2 instance in the second call. With pointers, even the same (x,y) would cause a new memory address. So please be aware of the consequences of that.

Here's the code with pointers: https://play.golang.org/p/y9UCNUVIMVP

10 Comments

Yes but won't this use at least double the memory than a pointer?
I do not have deep expertise in Go but as far as I know things are different in Go compared to C/C++ in regards to how pointers work. So to understand your point better, which part might be consuming more memory in your opinion?
I don't think using pointers would save memory. It might help in performance since every function gets a copy of the data and memory addresses are faster to copy. But dereferencing those are not free either. Apart from that, the data is still stored in memory and the pointer is the address to that memory. Not sure if that may somehow consume more memory.
using a *Vertex as an index to your map would be a bad idea - as an ok check would need the exact same *Vertex memory pointer to return true. With this answer you can use Vertex{0, 0} in the second call - and it will get a match. Using say &Vertex{0, 0} will create a brand new memory pointer (of the same point) and fail.
Here's some coding style suggestions on method receivers etc. play.golang.org/p/y0lHPLhJRu2
|
0

I managed to find a way to keep my compact representation, it does unfortunately come with cost degree(Vertex) to see if an edge exists.

https://play.golang.org/p/xnw2p6Ut78p

type Vertex [2]float64
type Edges map[Vertex]BagOfVertices
type BagOfVertices map[*Vertex]bool

func (e *Edges) AddEdge(v1 Vertex, v2 *Vertex) {
    edges := *e
    if edges[v1] == nil {
        edges[v1] = make(BagOfVertices)
    }
    if edges[*v2] == nil {
        edges[*v2] = make(BagOfVertices)
    }
    edges[v1][v2] = true
}

func (e *Edges) HasEdge(v0 Vertex, v1 Vertex) bool {
    edges := *e
    found := false
    for child := range edges[v0] {
        if *child == v1 {
            found = true
        }
    }
    return found
}

func TestEdges(t *testing.T) {
    var edges Edges = make(Edges)

    // Create edge from vertex 0 to vertex 1
    v0 := Vertex{0, 0}
    v1 := Vertex{1, 1}
    edges.AddEdge(v0, &v1)

    // Check edge exist from vertex 0 to vertex 1
    v0 = Vertex{0, 0}
    v1 = Vertex{1, 1}
    if !edges.HasEdge(v0, v1) {
        t.Errorf("Edge from %v to %v does not exist", v0, v1)
    }
}

In my case I am iterating through all children so checking HasEdge will rarely get called and space is a more important constraint so this works best for me, though might not be suited to all cases.

Comments

0

One way to tackle this is to add a reference to the Vector key in its value like so

https://play.golang.org/p/s4T8cV8uYm8

type Vertex [2]float64
type Edges map[Vertex]EdgeVal
type EdgeVal struct {
    Ref *Vertex
    BagOfVertices BagOfVertices
}
type BagOfVertices map[*Vertex]bool

func (e *Edges) GetOrCreateVertex(vertex Vertex) *Vertex {
    edges := *e
    if val, ok := edges[vertex]; ok {
        fmt.Println("Found val")
        return val.Ref
    }
    edges[vertex] = EdgeVal{&vertex, make(BagOfVertices)}
    fmt.Println("Create val")
    return &vertex
}

func TestEdges(t *testing.T) {
    var edges Edges = make(map[Vertex]EdgeVal)

    // Create edge from vertex 0 to vertex 1
    v0 := edges.GetOrCreateVertex(Vertex{0, 0})
    v1 := edges.GetOrCreateVertex(Vertex{1, 1})
    edges[*v0].BagOfVertices[v1] = true

    // Check edge exist from vertex 0 to vertex 1
    v0 = edges.GetOrCreateVertex(Vertex{0, 0})
    v1 = edges.GetOrCreateVertex(Vertex{1, 1})
    if _, ok := edges[*v0].BagOfVertices[v1]; !ok {
        t.Errorf("Edge from %v to %v does not exist", v0, v1)
    }
}

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.