Description
Background and motivation
The new Tensor<T> family is not trivially serializable, especially when strides are used for sparse tensors. I would like to ask for guidelines on how to implement round-trip serialization for the Tensor<T> family.
API Proposal
To begin with, serializing Tensor<T> is not trivial because T can be any type, although most of the time it will be a known, serializable type.
Then there is the format: I am not aware of any standard for it, so my API proposal would be:
```csharp
namespace System.Numerics.Tensors;

public class SerializableTensor<T>
{
    public SerializableTensor(Tensor<T> sourceTensor) { ... }

    public nint[] Lengths { get; }
    public nint[] Strides { get; }

    // Not sure how to deal with this one, but it needs to cover more than 2 GB and sparse data.
    public List<T[]> InternalData { get; }

    public Tensor<T> ToTensor() { ... }

    // Easter eggs
    public void WriteBinary(Stream s) { ... }
    public void ReadBinary(Stream s) { ... }
}
```
Or maybe, instead of InternalData, it would be possible to create some kind of "visitor" API for reading and writing chunks of the internal tensor data, which would avoid creating a copy of the data.
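To make the "visitor" idea more concrete, here is a minimal sketch of what chunk-wise writing could look like. Everything in it is hypothetical: `ChunkVisitor<T>`, `ChunkedWriter`, and `VisitChunks` are names invented for illustration, and a flat array stands in for the tensor's internal storage, since that storage is exactly what is not accessible today.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical callback shape: receives one contiguous run of elements at a time.
public delegate void ChunkVisitor<T>(ReadOnlySpan<T> chunk);

public static class ChunkedWriter
{
    // Stand-in for tensor storage: walks a flat array in chunks of at most chunkLength elements.
    public static void VisitChunks<T>(T[] data, int chunkLength, ChunkVisitor<T> visitor)
    {
        if (chunkLength <= 0)
            throw new ArgumentOutOfRangeException(nameof(chunkLength));

        for (int offset = 0; offset < data.Length; offset += chunkLength)
        {
            int count = Math.Min(chunkLength, data.Length - offset);
            visitor(data.AsSpan(offset, count));
        }
    }

    // Writes the raw bytes of each chunk to the stream without building one large buffer.
    // The bytes are emitted in the machine's native byte order.
    public static void WriteRaw<T>(T[] data, int chunkLength, Stream destination)
        where T : unmanaged
    {
        VisitChunks(data, chunkLength, chunk => destination.Write(MemoryMarshal.AsBytes(chunk)));
    }
}
```

A real API would presumably hand out spans over the tensor's own buffers, so no intermediate array would be needed. The `unmanaged` constraint is one possible way to restrict T to types whose bytes can be written directly.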
API Usage
```csharp
Tensor<float> someTensor = ...
var sTensor = new SerializableTensor<float>(someTensor);
using Stream s = File.Create("tensordata");
sTensor.WriteBinary(s);
```
Alternative Designs
Maybe putting a class such as SerializableTensor in System.Numerics.Tensors is too much, so I would be happy to have these helpers elsewhere, perhaps in a separate Microsoft.Extensions.Tensors library.
But I would be glad to just have a code snippet, because right now I can't see a way forward, given that some of the tensor internals are not directly accessible.
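For reference, the kind of snippet I have in mind could look roughly like the sketch below. It is only an illustration under several assumptions: `TensorBlob` is a hypothetical helper, the binary layout (rank, lengths, strides, element count, elements) is made up rather than any standard, lengths and strides are stored as 64-bit integers so the file does not depend on the size of nint, and the data is assumed to have already been copied out of the tensor into a single float[] (so it does not yet cover the more-than-2 GB case).

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of System.Numerics.Tensors.
public static class TensorBlob
{
    // Layout (made up for this sketch): rank, lengths, strides, element count, elements.
    public static void Write(Stream stream, long[] lengths, long[] strides, float[] data)
    {
        if (lengths.Length != strides.Length)
            throw new ArgumentException("lengths and strides must have the same rank.");

        // BinaryWriter emits little-endian values; leaveOpen keeps the caller's stream usable.
        using var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true);
        writer.Write(lengths.Length);
        foreach (long length in lengths) writer.Write(length);
        foreach (long stride in strides) writer.Write(stride);
        writer.Write(data.LongLength);
        foreach (float value in data) writer.Write(value);
    }

    public static (long[] Lengths, long[] Strides, float[] Data) Read(Stream stream)
    {
        using var reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true);
        int rank = reader.ReadInt32();
        var lengths = new long[rank];
        var strides = new long[rank];
        for (int i = 0; i < rank; i++) lengths[i] = reader.ReadInt64();
        for (int i = 0; i < rank; i++) strides[i] = reader.ReadInt64();

        long count = reader.ReadInt64();
        var data = new float[count];
        for (long i = 0; i < count; i++) data[i] = reader.ReadSingle();
        return (lengths, strides, data);
    }
}
```

Rebuilding a Tensor<float> from the three arrays is deliberately left out, because it depends on which factory methods the System.Numerics.Tensors version in use exposes. Getting the data out of a strided or sparse tensor without a copy is the part I cannot solve from the outside.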
Risks
No response