Python cbor2 encode float in preferred (efficient) format

Question

The CBOR docs state that the most efficient encoding (the one using the fewest bytes) should be preferred.

Floats can be encoded as 64-bit, 32-bit, or 16-bit IEEE 754 values, or, using tags, as bigfloats (tag 5) or decimal fractions (tag 4).

Standard 64-bit encoding uses 9 bytes. Some floating-point values take much less space in an alternative format (e.g. the values 0.0, 1.0, and 1.5 can each be represented in 4 bytes as decimal fractions, CBOR tag 4).

Other values are better represented as standard floats (e.g. 0.123456789 takes 9 bytes as a 64-bit float but 29 bytes as a decimal fraction).

The cbor2 Python library emits a decimal fraction (tag 4) when given a Decimal, and a CBOR float when given a float.

How can I get cbor2 to automatically emit the most efficient type depending on the actual value?

I have tried various arbitrary values using cbor2.dumps(). Floats are always encoded as 64-bit CBOR floats, and Decimal values are always encoded as CBOR decimal fractions (tag 4).

>>> from decimal import Decimal
>>> from cbor2 import dumps
>>> x=0.0 ; x ; d1 = dumps(x) ; d1 ; len(d1) ; dx = Decimal(x) ; d2 = dumps(dx) ; d2 ; len(d2)
0.0
b'\xfb\x00\x00\x00\x00\x00\x00\x00\x00'
9
b'\xc4\x82\x00\x00'
4

>>> x=1.0 ; x ; d1 = dumps(x) ; d1 ; len(d1) ; dx = Decimal(x) ; d2 = dumps(dx) ; d2 ; len(d2)
1.0
b'\xfb?\xf0\x00\x00\x00\x00\x00\x00'
9
b'\xc4\x82\x00\x01'
4

>>> x=1.5 ; x ; d1 = dumps(x) ; d1 ; len(d1) ; dx = Decimal(x) ; d2 = dumps(dx) ; d2 ; len(d2)
1.5
b'\xfb?\xf8\x00\x00\x00\x00\x00\x00'
9
b'\xc4\x82 \x0f'
4

>>> x=0.123456789 ; x ; d1 = dumps(x) ; d1 ; len(d1) ; dx = Decimal(x) ; d2 = dumps(dx) ; d2 ; len(d2)
0.123456789
b'\xfb?\xbf\x9a\xdd79c_'
9
b'\xc4\x8287\xc2W\x80\xe5\x18Js\xc0\xe4\x8f-\xf1\xc9\xf0\x90\xf4u%+\x93\xa7\n\x88\xa2?'
29

user19007114 · Accepted Answer · 2023-01-21 11:36:06Z

So I found the answer is a combination of two things: passing canonical=True to dumps(), which makes cbor2 use the shortest float encoding that still preserves the value exactly, and casting floats to lower-precision floats (using numpy) where suitable, if some loss of precision is acceptable.

NOTE: you have to cast back to a Python float, as cbor2 can't encode numpy classes at the moment.

>>> import numpy as np
>>> from cbor2 import dumps
>>> x=0.123456789 ; x ; d1=dumps(x, canonical=True) ; d1 ; len(d1)
0.123456789
b'\xfb?\xbf\x9a\xdd79c_'
9
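
The value above stays at 9 bytes because no 16-bit or 32-bit float represents it exactly. For illustration, that kind of shortest-lossless selection can be approximated with the standard library alone. This is a minimal sketch, not cbor2's actual code: shortest_float_encoding is a hypothetical helper name, and it builds the CBOR float bytes by hand (initial byte 0xf9, 0xfa or 0xfb followed by the big-endian IEEE 754 value).

```python
import math
import struct


def shortest_float_encoding(value: float) -> bytes:
    """Hypothetical helper: CBOR float bytes using the narrowest
    IEEE 754 width that round-trips `value` exactly."""
    if math.isnan(value):
        # NaN never compares equal to itself, so emit a 16-bit quiet NaN directly.
        return b"\xf9\x7e\x00"
    # Try half precision (0xf9), then single precision (0xfa).
    for initial_byte, fmt in ((0xF9, ">e"), (0xFA, ">f")):
        try:
            packed = struct.pack(fmt, value)
        except OverflowError:
            continue  # magnitude too large for this width
        if struct.unpack(fmt, packed)[0] == value:
            return bytes([initial_byte]) + packed
    # Nothing narrower is exact: fall back to double precision (0xfb).
    return b"\xfb" + struct.pack(">d", value)


print(len(shortest_float_encoding(1.0)))          # 3
print(len(shortest_float_encoding(0.123456789)))  # 9
```

Values such as 0.0, 1.0 and 1.5 fit exactly in 16 bits and come out at 3 bytes, which is shorter than the 4-byte Decimal encodings shown in the question.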

>>> x=float( np.float32( 0.123456789 ) ) ; x ; d1=dumps(x, canonical=True) ; d1 ; len(d1)
0.12345679104328156
b'\xfa=\xfc\xd6\xea'
5

>>> x=float( np.float16( 0.123456789 ) ) ; x ; d1=dumps(x, canonical=True) ; d1 ; len(d1)
0.12347412109375
b'\xf9/\xe7'
3
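
If numpy is not otherwise needed, the same lossy narrowing can be sketched with the standard struct module, together with a tolerance check that decides how far to narrow. This is only an illustration under stated assumptions: narrow_float and narrowest_within are hypothetical names, and the relative tolerance is whatever precision loss the application can accept.

```python
import math
import struct


def narrow_float(value: float, fmt: str) -> float:
    """Round `value` to a 16-bit ('e') or 32-bit ('f') float and
    return the result as an ordinary Python float."""
    return struct.unpack(">" + fmt, struct.pack(">" + fmt, value))[0]


def narrowest_within(value: float, rel_tol: float) -> float:
    """Hypothetical helper: the narrowest rounding of `value` whose
    relative error stays within `rel_tol`, else `value` unchanged."""
    for fmt in ("e", "f"):
        try:
            candidate = narrow_float(value, fmt)
        except OverflowError:
            continue  # magnitude too large for this width
        if math.isclose(candidate, value, rel_tol=rel_tol):
            return candidate
    return value


x = 0.123456789
print(narrowest_within(x, 1e-3))  # 0.12347412109375 (fits in 16 bits)
print(narrowest_within(x, 1e-6))  # 0.12345679104328156 (fits in 32 bits)
print(narrowest_within(x, 0.0))   # 0.123456789 (unchanged)
```

The returned value is a plain Python float, so it can be passed straight to dumps(..., canonical=True) as in the examples above.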
