Before we go crazy, see if any of the following meet your performance requirements:
mylist.reverse(); json.dumps(mylist); mylist.reverse()
json.dumps(mylist[::-1])
json.dumps(tuple(reversed(mylist)))
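To make the trade-offs concrete, here is a minimal sketch of the three options side by side. `mylist` is a small stand-in list, and the `try`/`finally` around the in-place variant is my own addition so that the list is restored even if serialization raises.

```python
import json

mylist = list(range(10))  # stand-in data; substitute your real list

# Option 1: reverse in place, dump, reverse back (no copy of the list).
mylist.reverse()
try:
    in_place = json.dumps(mylist)
finally:
    mylist.reverse()  # restore the original order even if dumps() raised

# Option 2: slice copy (duplicates the list).
sliced = json.dumps(mylist[::-1])

# Option 3: tuple built from a reverse iterator (also a full copy).
as_tuple = json.dumps(tuple(reversed(mylist)))

print(in_place)
```

All three produce the same JSON text. Only the first avoids a second copy of the data, at the cost of mutating the list temporarily, which matters if another thread reads it in the meantime.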
You mentioned defining your own JSONEncoder default function, which is fairly simple to do (example at the very bottom*), but I don't think it helps here, since json.JSONEncoder requires the default function to convert the object into one of the following:
None, True, False, str, int, float, list, tuple, dict
Converting an iterator to a list or tuple would create a large object, which is what we're trying to avoid.
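A quick way to see the limitation: the sketch below passes a reverse iterator straight to `json.dumps` and captures the resulting error, then shows that materializing it works at the cost of building the whole list. The variable names are only illustrative.

```python
import json

rev = reversed([1, 2, 3])  # a list_reverseiterator, not a list

try:
    json.dumps(rev)
    error = None
except TypeError as exc:
    error = exc  # the encoder rejects types it does not know

print(error)

# Materializing the iterator works, but builds the full list in memory.
materialized = json.dumps(list(rev))
print(materialized)
```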
You'll either need to modify your json library or monkey-patch it.
Here's the CPython source code of json.encoder. PyPy, Jython, and other Python implementations are probably using the same code for the json module.
https://github.com/python/cpython/blob/master/Lib/json/encoder.py#L204
def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,
        _key_separator, _item_separator, _sort_keys, _skipkeys, _one_shot,
        ## HACK: hand-optimized bytecode; turn globals into locals
        ValueError=ValueError,
        dict=dict,
        float=float,
        id=id,
        int=int,
        isinstance=isinstance,
        list=list,
        str=str,
        tuple=tuple,
        _intstr=int.__str__,
        ...
    def _iterencode(o, _current_indent_level):
        if isinstance(o, str):
            yield _encoder(o)
        ...
        elif isinstance(o, (list, tuple)):
            yield from _iterencode_list(o, _current_indent_level)
        # Add support for processing iterators
        elif isinstance(o, iterator_types):
            # Side effect: this will consume the iterator.
            # This is probably why it's not included in the official json module.
            # We could use itertools.tee to be able to iterate over
            # the original iterator while still having an unconsumed iterator,
            # but this would require updating all references to the original
            # iterator with the new unconsumed iterator.
            # The side effect may be unavoidable.
            yield from _iterencode_list(o, _current_indent_level)
For performance reasons, you'll want to define the iterator types outside of the function and bring them in as a local (a default argument), the same trick the stdlib code above uses for its globals.
str_iterator = type(iter( str() ))
list_iterator = type(iter( list() ))
tuple_iterator = type(iter( tuple() ))
range_iterator = type(iter( range(0) ))
list_reverseiterator = type(reversed( list() ))
reverseiterator = type(reversed( tuple() )) # same as the built-in <class 'reversed'>
# Add any other iterator classes that you need here, plus any container data types that json doesn't support (set, frozenset, bytes, bytearray, array.array, numpy.ndarray)
iterator_types = (str_iterator, list_iterator, tuple_iterator, range_iterator,
                  list_reverseiterator, reverseiterator)
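As a sanity check on which objects such a tuple actually matches, here is a small sketch using a shortened version of the tuple. The `checks` dictionary is only for illustration.

```python
# Derive each iterator class from an empty instance of its container.
list_iterator = type(iter([]))
list_reverseiterator = type(reversed([]))
range_iterator = type(iter(range(0)))

# `reversed` is the class returned for sequences without their own
# __reversed__ method, such as tuples.
iterator_types = (list_iterator, list_reverseiterator, range_iterator, reversed)

checks = {
    "iter(list)": isinstance(iter([1, 2]), iterator_types),
    "reversed(list)": isinstance(reversed([1, 2]), iterator_types),
    "reversed(tuple)": isinstance(reversed((1, 2)), iterator_types),
    "reversed(range)": isinstance(reversed(range(3)), iterator_types),
    "generator": isinstance((x for x in [1, 2]), iterator_types),
}

for name, matched in checks.items():
    print(name, matched)
```

Generators are not matched because their class is not in the tuple. If you need them, add `types.GeneratorType`, or test against `collections.abc.Iterator` instead, which covers any iterator but is likely a slower check.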
If you want to go the monkey-patching route, you'll need to redefine the json.encoder._make_iterencode function, replacing all occurrences of isinstance(X, (list, tuple)) with isinstance(X, (list, tuple)+iterator_types)
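import json

# Copy of the stdlib function with the isinstance checks extended as described above.
def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,
        _key_separator, _item_separator, _sort_keys, _skipkeys, _one_shot,
        iterator_types=iterator_types,  # bind the tuple defined earlier as a local
        ...
    ):
    ...

json.encoder._make_iterencode = _make_iterencode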
These changes look something like this: https://github.com/python/cpython/pull/3034/files
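One caveat with the monkey-patch: as far as I can tell from the CPython source, a one-shot `json.dumps()` call uses the C encoder (`c_make_encoder`) when it is available and never reaches `_make_iterencode`, whereas `JSONEncoder.iterencode()` (and therefore `json.dump()` to a file) goes through the pure-Python path. The sketch below checks this on your interpreter. `recording_make_iterencode` is a hypothetical stand-in that only records calls and defers to the original, not a real patched encoder.

```python
import json
import json.encoder

original = json.encoder._make_iterencode
calls = []

def recording_make_iterencode(*args, **kwargs):
    # Hypothetical stand-in for a real patched copy: record the call,
    # then defer to the stdlib implementation.
    calls.append("called")
    return original(*args, **kwargs)

json.encoder._make_iterencode = recording_make_iterencode
try:
    # One-shot path: may bypass the patched function entirely.
    json.dumps([1, 2, 3])
    calls_after_dumps = len(calls)

    # Streaming path: goes through the pure-Python _make_iterencode.
    streamed = "".join(json.JSONEncoder().iterencode([1, 2, 3]))
    calls_after_iterencode = len(calls)
finally:
    json.encoder._make_iterencode = original  # always undo the patch

print("calls after dumps():", calls_after_dumps)
print("calls after iterencode():", calls_after_iterencode)
```

If the first count is zero on your interpreter, the patch only takes effect when you serialize via `iterencode()` or `json.dump()`, or when you also disable the C encoder.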
*As promised, here is how to define your own default function. It does not avoid the copy, because it has to convert the iterator into a list or tuple before the encoder can handle it.
class JSONEncoderThatSupportsIterators(json.JSONEncoder):
    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class default method raise the TypeError
        return json.JSONEncoder.default(self, o)
li = range(1000000) # or xrange if Python 2
dumped = JSONEncoderThatSupportsIterators().encode(reversed(li))
assert dumped.startswith('[999999, 999998, 999997, ')
assert dumped.endswith('6, 5, 4, 3, 2, 1, 0]')
Alternatively, rather than subclassing json.JSONEncoder, you can define a standalone default(o) function (no self parameter) and pass it as an argument: json.dumps(obj, default=default).
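A minimal sketch of that function-based variant follows. The error message text is my own wording, and like the subclass it still copies the iterator into a list.

```python
import json

def default(o):
    """Fallback for objects json cannot encode: expand any iterable to a list."""
    try:
        iterable = iter(o)
    except TypeError:
        # Not iterable either: report it the way the encoder would.
        raise TypeError(
            f"Object of type {type(o).__name__} is not JSON serializable"
        ) from None
    return list(iterable)  # copies the whole iterator into memory

top_level = json.dumps(reversed(range(5)), default=default)
nested = json.dumps({"test": reversed([1, 2, 3])}, default=default)

print(top_level)  # [4, 3, 2, 1, 0]
print(nested)     # {"test": [3, 2, 1]}
```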
mylist.reverse() (avoids the copy) - do your serialization, then reverse it again if needs be?
{test: range(10)} expanded... but not for the entire reverse of your data. It's further complicated by the fact that some levels are handled by the C implementation and other bits by _functions with nested _functions... For sheer simplicity I'm sticking with list.reverse :)
json.dumps(mylist[::-1]) is another way of doing this, but duplicates the list.