
I'm conducting research on temporal graph data using PyTorch Geometric (PyG).

I'm running into memory-usage issues when converting PyG data to dense format (with to_dense_batch() and to_dense_adj()).
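
For context, here is a minimal sketch of the dense conversion I mean (toy data and sizes; my real snapshots come from a temporal dataset):

    import torch
    from torch_geometric.data import Data, Batch
    from torch_geometric.utils import to_dense_batch, to_dense_adj

    # Two toy snapshots with different node counts.
    snapshots = [
        Data(x=torch.randn(10, 16), edge_index=torch.randint(0, 10, (2, 40))),
        Data(x=torch.randn(25, 16), edge_index=torch.randint(0, 25, (2, 60))),
    ]
    batch = Batch.from_data_list(snapshots)

    # Dense node features: [num_graphs, max_num_nodes, num_features] plus a mask.
    x_dense, mask = to_dense_batch(batch.x, batch.batch)     # [2, 25, 16], [2, 25]
    # Dense adjacency: [num_graphs, max_num_nodes, max_num_nodes].
    adj_dense = to_dense_adj(batch.edge_index, batch.batch)  # [2, 25, 25]
    # The dense adjacency grows as O(num_graphs * max_num_nodes^2),
    # which is where my memory problems come from.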

I have tried three batching approaches, but ran into 1) memory-usage problems, 2) inconsistent tensor sizes, and 3) overly sparse snapshots (a rough sketch of all three approaches follows the list below):

  1. each batch contains multiple edges (e.g., 400 edges per batch)
  2. each batch contains multiple snapshots (one snapshot per timestamp)
  3. each batch contains multiple graph sequences (e.g., each sequence contains 5 snapshots)
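
Concretely, this is roughly how I build the three kinds of batches from the raw edge list (the file name, df, and constants like EDGES_PER_BATCH are just illustrative):

    import pandas as pd

    # Placeholder path; columns: source, target, interaction_type, timestamp.
    df = pd.read_csv("interactions.csv").sort_values("timestamp")

    # 1) A fixed number of edges per batch (e.g., 400).
    EDGES_PER_BATCH = 400
    edge_batches = [df.iloc[i:i + EDGES_PER_BATCH]
                    for i in range(0, len(df), EDGES_PER_BATCH)]

    # 2) One snapshot per timestamp; batches then group several snapshots.
    snapshots = [g for _, g in df.groupby("timestamp", sort=True)]

    # 3) Sequences of consecutive snapshots (e.g., 5 snapshots per sequence).
    SEQ_LEN = 5
    sequences = [snapshots[i:i + SEQ_LEN]
                 for i in range(0, len(snapshots) - SEQ_LEN + 1, SEQ_LEN)]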

I'm wondering:

  1. Is it possible to treat a batch of snapshots (with different numbers of nodes) as a sequence? If so, how can one feed such a batch into LSTM- or Transformer-based architectures? (See the first sketch below for what I'm attempting.)

  2. Is it possible to batch multiple graph sequences (e.g., a batch of 4 sequences, each containing 5 snapshots) and feed their dense node/edge embeddings into LSTM- or Transformer-based models? Or is it better to use sparse matrices? (See the second sketch below.)

  3. How should one split CSV data whose rows are interaction records (with columns ['source', 'target', 'interaction_type', 'timestamp']) so that 1) the density of each snapshot is sufficiently high (e.g., above 0.5) and 2) the number of nodes stays consistent across snapshots? (See the third sketch below.)
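
For question 1, this is the kind of thing I'm attempting: run a GNN over all snapshots at once, pool each snapshot down to a fixed-size vector, and treat the T snapshots as one sequence. GCNConv and the mean pooling are just stand-ins for my actual model, and pooling obviously throws away node-level information, which is part of what I'm unsure about:

    import torch
    from torch_geometric.data import Data, Batch
    from torch_geometric.nn import GCNConv, global_mean_pool

    # T toy snapshots (one per timestamp) with varying node counts.
    T = 5
    snaps = [Data(x=torch.randn(n, 16), edge_index=torch.randint(0, n, (2, 3 * n)))
             for n in torch.randint(5, 20, (T,)).tolist()]
    batch = Batch.from_data_list(snaps)

    gnn = GCNConv(16, 32)
    lstm = torch.nn.LSTM(input_size=32, hidden_size=64, batch_first=True)

    h = gnn(batch.x, batch.edge_index)    # [total_nodes, 32]
    g = global_mean_pool(h, batch.batch)  # [T, 32]: one vector per snapshot
    out, _ = lstm(g.unsqueeze(0))         # sequence of length T, batch size 1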
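For question 2, what I imagine is padding every snapshot in every sequence to a global maximum node count, giving a dense [batch, time, nodes, features] tensor. All names and sizes here are illustrative, and I'm skipping the GNN step by putting features directly on each snapshot:

    import torch
    from torch_geometric.data import Data

    # 4 toy sequences of 5 snapshots each, with varying node counts.
    B, T, F_DIM = 4, 5, 32
    seqs = [[Data(x=torch.randn(int(torch.randint(5, 12, (1,))), F_DIM))
             for _ in range(T)] for _ in range(B)]
    n_max = max(s.num_nodes for seq in seqs for s in seq)

    dense = torch.zeros(B, T, n_max, F_DIM)
    for b, seq in enumerate(seqs):
        for t, snap in enumerate(seq):
            dense[b, t, :snap.num_nodes] = snap.x  # zero-pad missing nodes

    # Flatten nodes into the feature dimension before the LSTM
    # (pooling per snapshot would be the alternative).
    lstm = torch.nn.LSTM(input_size=n_max * F_DIM, hidden_size=64, batch_first=True)
    out, _ = lstm(dense.reshape(B, T, n_max * F_DIM))  # [B, T, 64]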
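For question 3, the only idea I've come up with so far is to fix the node set up front (e.g., the k most active nodes), which keeps the node count consistent, and then bin edges into equal-width time windows and check each window's density (k, num_windows, and the file name are placeholders):

    import pandas as pd

    df = pd.read_csv("interactions.csv")  # placeholder path

    # Keep only the k most active nodes so every snapshot shares one node set.
    k = 50
    counts = pd.concat([df["source"], df["target"]]).value_counts()
    keep = set(counts.head(k).index)
    df = df[df["source"].isin(keep) & df["target"].isin(keep)].copy()

    # Bin edges into equal-width time windows and inspect each window's density.
    num_windows = 20
    df["window"] = pd.cut(df["timestamp"], bins=num_windows, labels=False)
    for w, snap in df.groupby("window"):
        density = len(snap) / (k * (k - 1))  # directed graph, no self-loops
        print(w, len(snap), round(density, 3))

This counts repeated interactions within a window as separate edges, which inflates the density; I'm not sure whether deduplicating (source, target) pairs per window is the right call either.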

Looking forward to suggestions from anyone with experience handling temporal graph data. Any suggestion will be greatly appreciated. Thanks.
