
I have a System.Text.JsonElement with about 100 elements inside. I know of EnumerateArray(), but is there a way to do this backwards, meaning: Start at the last element of the list and enumerate until reaching the very first item?

I know that I could always copy the array into a List and enumerate over that in reverse. But is there a way to do this directly on the JsonElement? I suspect working on it directly would be better performance-wise. The following code expects JSON as well.

The data looks like this. I receive it from a third-party source, so I have no control over how it is ordered.

{
    "data": [
        {
            "subProperty": "",
            "moreData": { },
            "evenMoreData": { }
        },
        {
            "subProperty": "",
            "moreData": { },
            "evenMoreData": { }
        }
        /* Around 100 items */
    ]
}

The code I am currently using:

public void ParseDataPoint(JsonElement result)
{
    int i = 0;
    var data = result.GetProperty("data");
    foreach (var item in data.EnumerateArray())
    {
        var subProperty = item.GetProperty("subProperty");
        if (SubPropertyValueSatisfiesSpecificCondition(subProperty.GetString()))
        {
            File.WriteAllText($"{i}.json", item.ToString());
            i++;
        }
    }
}

Sadly, the data items arrive in exactly the opposite order from the one I need. And I cannot simply start with the largest file name and count down, since I do not know in advance how many of the elements satisfy SubPropertyValueSatisfiesSpecificCondition().

  • How about EnumerateArray().Reverse()? Commented Jun 4 at 16:24
  • You didn't post any code. JsonElement has an indexer that can be used to iterate from the last index (returned by GetArrayLength()-1) to 0. Commented Jun 4 at 16:25
  • @JayBuckman Enumerable.Reverse copies the source into an array internally before iterating from the end. While it works, it's slow. Commented Jun 4 at 16:28
  • Please provide enough code so others can better understand or reproduce the problem. Commented Jun 4 at 16:29
  • Why don't you deserialize to objects instead of first deserializing to JsonDocument and then manually to objects? You're doing double the runtime work (and a lot of manual work) when you could use e.g. theData = JsonSerializer.Deserialize<TheDTO>(json); instead. Unless someone tried to make the JSON "dynamic", not realizing that a JSON object is a dictionary? And instead of subProperty: "Name" you could have "Name": { "MoreData": {...} }. Commented Jun 5 at 6:40

1 Answer


Given your array is fairly small at 100 elements, I would simply use LINQ's Reverse() method:

foreach (var item in data.EnumerateArray().Reverse())
{
    // Remainder as before
}

Or if most of the elements are getting filtered out, you could do the Reverse() after the filter like so:

public void ParseDataPoint(JsonElement result) =>
    result.GetProperty("data").EnumerateArray()
    .Where(item => SubPropertyValueSatisfiesSpecificCondition(item.GetProperty("subProperty").GetString()))
    .Reverse()
    .Aggregate(0, (i, item) => 
               {
                   File.WriteAllText($"{i}.json", item.ToString());
                   return i + 1; // i++ would return the old value, so every file would be named 0.json
               });

That being said, if your JSON array were huge enough for performance to matter, be aware that the JsonElement API doesn't seem to provide a performant way to reverse an array:

  1. The LINQ Reverse() method unconditionally copies the incoming collection to a temporary array, then reverses that. (See here for confirmation)

    Thus reversing your array with LINQ is O(N) in memory, which you may not want.

  2. The indexer JsonElement.Item[Int32] is not necessarily O(1). It is O(i) in the requested index i when the array contains at least one "complex child", which seems to be defined as an item which is:

    • A JSON object, or
    • A JSON array, or
    • A JSON string that included escaping.

    For confirmation, see the method JsonDocument.GetArrayIndexElement(int currentIndex, int arrayIndex) which scans through the contents of the array unless row.HasComplexChildren is false.

    Thus reversing your array using the following method:

    public static IEnumerable<JsonElement> EnumerateArrayIndexReversed(this JsonElement element)
    {
        if (element.ValueKind != JsonValueKind.Array)
            throw new ArgumentException($"Element type {element.ValueKind} is not an array");
        var count = element.GetArrayLength(); // This is an O(1) operation
        for (int i = count - 1; i >= 0; i--) // Walk from the last index down to 0
            yield return element[i]; // This can be O(i)
    }
    

    will be O(N * N) in the array size, since your array does indeed contain complex elements. You definitely don't want that for very large arrays.

As an alternative, you might consider deserializing your JSON to a partial data model, where the relevant properties are captured using C# properties but any overflow properties are captured by some JsonElement. For instance, you could define the following root model and extension method:

public record DataListModel<T>(List<T> data);

public static partial class EnumerableExtensions
{
    // Enumerate a list backwards without snapshotting the list
    public static IEnumerable<T> ReverseList<T>(this IList<T> list)
    {
        static IEnumerable<T> ReverseList(IList<T> list)
        {
            for (var i = list.Count - 1; i >= 0; i--)
                yield return list[i];
        }
        ArgumentNullException.ThrowIfNull(list);
        return ReverseList(list);
    }
}

Then if you can deserialize your JSON directly to a DataListModel<JsonElement>, you can rewrite your ParseDataPoint() as follows:

public void ParseDataPoint(DataListModel<JsonElement> result)
{
    int i = 0;
    foreach (var item in result.data.ReverseList())
    {
        var subProperty = item.GetProperty("subProperty");
        if (SubPropertyValueSatisfiesSpecificCondition(subProperty.GetString()))
        {
            File.WriteAllText($"{i}.json", item.ToString());
            i++;
        }
    }
}

And you should get the benefits of O(1) memory use and O(N) time for your reverse enumeration.
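The deserialization step itself is not shown above, so here is a minimal sketch of how it might look. The class name DeserializeDemo, the method SubPropertiesReversed, and the sample payload are hypothetical and exist only for illustration. The sketch assumes .NET 5 or later, where System.Text.Json can deserialize positional records.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Root model from the answer; the lowercase parameter name matches the "data" JSON property.
public record DataListModel<T>(List<T> data);

public static class DeserializeDemo
{
    // Hypothetical sample payload with the same shape as the question's data.
    public const string SampleJson =
        @"{""data"":[{""subProperty"":""a""},{""subProperty"":""b""},{""subProperty"":""c""}]}";

    // Returns the subProperty values from the last array item to the first.
    public static List<string> SubPropertiesReversed(string json)
    {
        var model = JsonSerializer.Deserialize<DataListModel<JsonElement>>(json)
            ?? throw new JsonException("JSON root was null");

        var values = new List<string>();
        // List<T> indexing is O(1), so this loop is O(N) overall and makes no copy of the list.
        for (var i = model.data.Count - 1; i >= 0; i--)
            values.Add(model.data[i].GetProperty("subProperty").GetString());
        return values;
    }
}
```

Calling DeserializeDemo.SubPropertiesReversed(DeserializeDemo.SampleJson) yields the values in the order c, b, a. If you already hold a JsonElement rather than the raw JSON text, .NET 6 and later also offer a JsonSerializer.Deserialize<T>() extension method on JsonElement, though that adds a second pass over the data.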

To test this, using BenchmarkDotNet I benchmarked the ParseDataPoint(DataListModel<JsonElement> result) method above against two methods taking a JsonElement, one using EnumerateArray().Reverse() and one using index-based reversal:

public void ParseDataPointWithLinqReverse(JsonElement result)
{
    int i = 0;
    var data = result.GetProperty("data");
    foreach (var item in data.EnumerateArray().Reverse())
    {
        var subProperty = item.GetProperty("subProperty");
        if (SubPropertyValueSatisfiesSpecificCondition(subProperty.GetString()))
        {
            File.WriteAllText($"{i}.json", item.ToString());
            i++;
        }
    }
}

public void ParseDataPointWithIndexReverse(JsonElement result)
{
    int i = 0;
    var data = result.GetProperty("data");
    foreach (var item in data.EnumerateArrayIndexReversed())
    {
        var subProperty = item.GetProperty("subProperty");
        if (SubPropertyValueSatisfiesSpecificCondition(subProperty.GetString()))
        {
            File.WriteAllText($"{i}.json", item.ToString());
            i++;
        }
    }
}   

Here are the results for arrays with either 100 or 1000 elements:

| Method                                          | N    | Mean         | Error      | StdDev     |
|------------------------------------------------ |----- |-------------:|-----------:|-----------:|
| Test_ParseDataPoint_WithDataListModel           | 100  |     7.744 us |  0.1538 us |  0.1511 us |
| Test_ParseDataPoint_WithJsonElementIndexReverse | 100  |    23.597 us |  0.3517 us |  0.3290 us |
| Test_ParseDataPoint_WithJsonElementLinqReverse  | 100  |     8.871 us |  0.1715 us |  0.2041 us |
| Test_ParseDataPoint_WithDataListModel           | 1000 |    79.491 us |  1.5771 us |  1.4752 us |
| Test_ParseDataPoint_WithJsonElementIndexReverse | 1000 | 2,053.375 us | 20.5950 us | 17.1978 us |
| Test_ParseDataPoint_WithJsonElementLinqReverse  | 1000 |    91.698 us |  1.3967 us |  1.2382 us |

From which we can see:

  1. Enumerating in reverse through the data list model's list is fastest.

  2. Enumerating using LINQ Reverse() is a close second, around 15% slower (though it does require a temporary buffer).

  3. Reversing by using the indexer is almost shockingly slow: about 3x slower than the data list model for an array of 100 elements, and about 25x slower for an array with 1000 elements.

  4. Thus in performance-critical situations, enumerating through a JSON array using the JsonElement array indexer should be avoided. Deserialization to some data model should be preferred, and if that is not convenient, LINQ methods combined with the array enumerator may be the best alternative.

  5. You might want to test the performance of the JsonNode document object model. It wouldn't surprise me if the JsonArray indexer were performant.

Benchmark code here.
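For anyone who wants to try the JsonNode suggestion from the last point, a reverse walk over a JsonArray might be sketched as follows. This variant was not part of the benchmark above, so its performance is unmeasured. The class name JsonNodeDemo, the method SubPropertiesReversed, and the sample payload are hypothetical. The sketch assumes .NET 6 or later, where System.Text.Json.Nodes is available.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json.Nodes;

public static class JsonNodeDemo
{
    // Hypothetical sample payload with the same shape as the question's data.
    public const string SampleJson =
        @"{""data"":[{""subProperty"":""a""},{""subProperty"":""b""},{""subProperty"":""c""}]}";

    // Returns the subProperty values from the last array item to the first.
    public static List<string> SubPropertiesReversed(string json)
    {
        // AsArray() throws InvalidOperationException if "data" is not a JSON array.
        JsonArray data = JsonNode.Parse(json)?["data"]?.AsArray()
            ?? throw new InvalidOperationException("No \"data\" array found");

        var values = new List<string>();
        for (var i = data.Count - 1; i >= 0; i--)
        {
            // Items or properties may be null, so skip anything that is missing.
            var sub = data[i]?["subProperty"];
            if (sub is not null)
                values.Add(sub.GetValue<string>());
        }
        return values;
    }
}
```

Calling JsonNodeDemo.SubPropertiesReversed(JsonNodeDemo.SampleJson) yields c, b, a. Whether this beats the data list model would need to be measured with the same BenchmarkDotNet setup.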
