Given your array is fairly small at 100 elements, I would simply use LINQ's Reverse() method:
foreach (var item in data.EnumerateArray().Reverse())
{
    // Remainder as before
}
Or if most of the elements are getting filtered out, you could do the Reverse() after the filter like so:
public void ParseDataPoint(JsonElement result) =>
    result.GetProperty("data").EnumerateArray()
        .Where(item => SubPropertyValueSatisfiesSpecificCondition(item.GetProperty("subProperty").GetString()))
        .Reverse()
        .Aggregate(0, (i, item) =>
        {
            File.WriteAllText($"{i}.json", item.ToString());
            return i + 1; // Pass the next file index on to the following item
        });
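To make the ordering concrete, here is a minimal sketch of the same filter-then-reverse pipeline run against a small in-memory document. The sample JSON, the "id" property, and the `Satisfies` helper (a stand-in for SubPropertyValueSatisfiesSpecificCondition) are assumptions for illustration only. It uses the index-aware `Select` overload instead of `Aggregate` to number the surviving items, and collects the pairs rather than writing files:

```csharp
using System;
using System.Linq;
using System.Text.Json;

// Hypothetical sample payload; only "data" and "subProperty" come from the question.
var json = @"{""data"":[{""subProperty"":""keep"",""id"":1},{""subProperty"":""skip"",""id"":2},{""subProperty"":""keep"",""id"":3}]}";
using var doc = JsonDocument.Parse(json);

// Hypothetical stand-in for SubPropertyValueSatisfiesSpecificCondition.
static bool Satisfies(string? value) => value == "keep";

var numbered = doc.RootElement.GetProperty("data").EnumerateArray()
    .Where(item => Satisfies(item.GetProperty("subProperty").GetString())) // filter first, so fewer items get buffered
    .Reverse()                                                             // buffers only the filtered items
    .Select((item, index) => (index, id: item.GetProperty("id").GetInt32()))
    .ToList();

foreach (var (index, id) in numbered)
    Console.WriteLine($"{index}.json <- id {id}");
// 0.json <- id 3
// 1.json <- id 1
```

The file indexes are assigned after the reversal, so the last matching element of the array ends up in 0.json.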
That being said, if your JSON array were huge enough for performance to matter, be aware that the JsonElement API doesn't seem to provide a performant way to reverse an array:
The LINQ Reverse() method unconditionally copies the incoming collection to a temporary array, then reverses that. (See here for confirmation)
Thus reversing your array with LINQ is O(N) in memory, which you may not want.
The indexer JsonElement.Item[Int32] is not necessarily O(1). It is O(I) when the array contains at least one "complex child", which seems to be defined as an item which is:
- A JSON object, or
- A JSON array, or
- A JSON string that included escaping.
For confirmation, see the method JsonDocument.GetArrayIndexElement(int currentIndex, int arrayIndex) which scans through the contents of the array unless row.HasComplexChildren is false.
Thus reversing your array using the following method:
public static IEnumerable<JsonElement> EnumerateArrayIndexReversed(this JsonElement element)
{
    if (element.ValueKind != JsonValueKind.Array)
        throw new ArgumentException($"Element type {element.ValueKind} is not an array");
    var count = element.GetArrayLength(); // This is an O(1) operation
    for (int i = count - 1; i >= 0; i--)  // Walk from the last index down to the first
        yield return element[i]; // This can be O(i)
}
Will be O(N * N) in the array size, since your array does indeed contain complex elements. You definitely don't want that for very large arrays.
As an alternative, you might consider deserializing your JSON to a partial data model, where the relevant properties are captured using C# properties but any overflow properties are captured by some JsonElement. For instance, you could define the following root model and extension method:
public record DataListModel<T>(List<T> data);
public static partial class EnumerableExtensions
{
    // Enumerate a list backwards without snapshotting the list
    public static IEnumerable<T> ReverseList<T>(this IList<T> list)
    {
        static IEnumerable<T> ReverseList(IList<T> list)
        {
            for (var i = list.Count - 1; i >= 0; i--)
                yield return list[i];
        }
        ArgumentNullException.ThrowIfNull(list);
        return ReverseList(list);
    }
}
Then if you can deserialize your JSON directly to a DataListModel<JsonElement>, you can rewrite your ParseDataPoint() as follows:
public void ParseDataPoint(DataListModel<JsonElement> result)
{
    int i = 0;
    foreach (var item in result.data.ReverseList())
    {
        var subProperty = item.GetProperty("subProperty");
        if (SubPropertyValueSatisfiesSpecificCondition(subProperty.GetString()))
        {
            File.WriteAllText($"{i}.json", item.ToString());
            i++;
        }
    }
}
And you should get the benefits of O(1) memory use and O(N) time for your reverse enumeration.
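For completeness, here is a minimal sketch of how the JSON might be deserialized to the partial model in the first place, assuming the root object has a "data" array as in the question. The sample payload and its "other" properties are made up for illustration; `JsonSerializer.Deserialize<T>` is the standard System.Text.Json entry point. The loop indexes the list backwards directly, which is what ReverseList() does internally:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical sample payload; only "data" and "subProperty" come from the question.
var json = @"{""data"":[{""subProperty"":""a"",""other"":1},{""subProperty"":""b"",""other"":[2,3]},{""subProperty"":""c""}]}";

// Each array item is kept as a JsonElement, so properties the model does not know about survive.
var model = JsonSerializer.Deserialize<DataListModel<JsonElement>>(json)
    ?? throw new JsonException("JSON root was null");

var seen = new List<string?>();
for (var i = model.data.Count - 1; i >= 0; i--) // List<T> indexer is O(1)
    seen.Add(model.data[i].GetProperty("subProperty").GetString());

Console.WriteLine(string.Join(",", seen)); // c,b,a

// Same root model as defined in the answer.
public record DataListModel<T>(List<T> data);
```

The record's `data` property name matches the JSON property name exactly, so no naming policy or attribute is needed in this sketch.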
To test this, I used BenchmarkDotNet to benchmark the ParseDataPoint(DataListModel<JsonElement> result) method above against two methods taking a JsonElement, one using EnumerateArray().Reverse() and one using index-based reversal:
public void ParseDataPointWithLinqReverse(JsonElement result)
{
    int i = 0;
    var data = result.GetProperty("data");
    foreach (var item in data.EnumerateArray().Reverse())
    {
        var subProperty = item.GetProperty("subProperty");
        if (SubPropertyValueSatisfiesSpecificCondition(subProperty.GetString()))
        {
            File.WriteAllText($"{i}.json", item.ToString());
            i++;
        }
    }
}
public void ParseDataPointWithIndexReverse(JsonElement result)
{
    int i = 0;
    var data = result.GetProperty("data");
    foreach (var item in data.EnumerateArrayIndexReversed())
    {
        var subProperty = item.GetProperty("subProperty");
        if (SubPropertyValueSatisfiesSpecificCondition(subProperty.GetString()))
        {
            File.WriteAllText($"{i}.json", item.ToString());
            i++;
        }
    }
}
Here are the results for arrays with either 100 or 1000 elements:
| Method | N | Mean | Error | StdDev |
|------------------------------------------------ |----- |-------------:|-----------:|-----------:|
| Test_ParseDataPoint_WithDataListModel | 100 | 7.744 us | 0.1538 us | 0.1511 us |
| Test_ParseDataPoint_WithJsonElementIndexReverse | 100 | 23.597 us | 0.3517 us | 0.3290 us |
| Test_ParseDataPoint_WithJsonElementLinqReverse | 100 | 8.871 us | 0.1715 us | 0.2041 us |
| Test_ParseDataPoint_WithDataListModel | 1000 | 79.491 us | 1.5771 us | 1.4752 us |
| Test_ParseDataPoint_WithJsonElementIndexReverse | 1000 | 2,053.375 us | 20.5950 us | 17.1978 us |
| Test_ParseDataPoint_WithJsonElementLinqReverse | 1000 | 91.698 us | 1.3967 us | 1.2382 us |
From which we can see:
- Enumerating in reverse through the data list model's list is fastest.
- Enumerating using LINQ Reverse() is a close second, around 15% slower (though it does require a temporary buffer).
- Reversing by using the indexer is almost shockingly slow: about 3x slower for an array of 100 elements, and about 25x slower for an array of 1000 elements.
Thus in performance-critical situations, enumerating through a JSON array using the JsonElement array indexer should be avoided. Deserialization to some data model should be preferred, and if that is not convenient, LINQ methods combined with the array enumerator may be the best alternative.
You might want to test the performance of the JsonNode document object model; it wouldn't surprise me if the JsonArray indexer were performant.
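If you want to try that, a reverse loop over a JsonArray might look like the sketch below. The sample JSON is made up for illustration, and I have not benchmarked this variant, so treat it as a starting point for your own measurements rather than a recommendation:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json.Nodes;

// Hypothetical sample payload; only "data" and "subProperty" come from the question.
var json = @"{""data"":[{""subProperty"":""a""},{""subProperty"":""b""},{""subProperty"":""c""}]}";

// JsonNode.Parse returns a mutable DOM; AsArray() throws if "data" is not a JSON array.
JsonArray data = JsonNode.Parse(json)!["data"]!.AsArray();

var seen = new List<string>();
for (var i = data.Count - 1; i >= 0; i--)
{
    // Null-conditional access because array items and properties may be JSON null or missing.
    var value = data[i]?["subProperty"]?.GetValue<string>();
    if (value is not null)
        seen.Add(value);
}

Console.WriteLine(string.Join(",", seen)); // c,b,a
```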
Benchmark code here.
EnumerateArray().Reverse()? Enumerable.Reverse copies the source into an array internally before iterating from the end. While it works, it's slow
theData=JsonSerializer.DeserializeObject<TheDTO>(json);? Unless someone tried to make JSON "dynamic", not realizing that a JSON object is a dictionary? And instead of subProperty: "Name" you could have ` "Name": { "MoreData":{...} }`?