Assume I read some content from a socket in Python and have to decode it as UTF-8 on the fly.
I cannot afford to keep all the content in memory, so I must decode it as I receive it and save it to a file.
It can happen that I receive only some of the bytes of a character (for example, the € sign is encoded as three bytes, which Python shows as '\xe2\x82\xac').
Assume I have received only the first two bytes (\xe2\x82); if I try to decode them, I get a 'UnicodeDecodeError', as expected.
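
To illustrate, here is a minimal reproduction, assuming Python 3 (where the received data is a bytes object). The variable names are only illustrative:

```python
euro = b'\xe2\x82\xac'   # UTF-8 encoding of the euro sign (three bytes)
partial = euro[:2]       # pretend only the first two bytes have arrived

try:
    partial.decode('utf-8')
    failed = False
except UnicodeDecodeError:
    # The sequence is cut off in the middle of a character.
    failed = True

# Once all three bytes are present, decoding succeeds.
full_text = euro.decode('utf-8')
```
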
I could always try to decode the current content and check whether it raises an exception.
- But how reliable is this approach?
- How can I determine whether the current content can be decoded?
- What is the correct way to do this?
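
The try-and-catch idea could be sketched roughly as follows. This is only a rough sketch assuming Python 3; `try_decode` is a hypothetical helper, and the list of chunks stands in for data arriving from the socket:

```python
def try_decode(pending: bytes):
    """Return (text, leftover).

    If the buffer decodes cleanly, return the text and an empty leftover.
    Otherwise return no text and keep the whole buffer for the next attempt.
    """
    try:
        return pending.decode('utf-8'), b''
    except UnicodeDecodeError:
        return '', pending


pending = b''
pieces = []
# Simulated socket reads: the euro sign is split across two chunks.
for chunk in [b'price: \xe2\x82', b'\xac 5']:
    pending += chunk
    text, pending = try_decode(pending)
    pieces.append(text)  # in the real program this would be written to the file

result = ''.join(pieces)
```

This sketch holds back the entire undecoded buffer whenever decoding fails, and it cannot tell an incomplete character from genuinely invalid bytes, which is part of what I am unsure about.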
Thanks