Hello and thanks for reading. I am running into some issues decoding files that were previously encoded with base64. For example, suppose I want to encode a PDF file using base64. The result is a nice series of newline-delimited lines of 76 characters each. The code that does the encoding (cribbed from this board) is nice and easy:

    def encode_file_base64(bin_input):
      flag = 0
      try:
        with open(bin_input, 'rb') as fin, open('tmp.bin_hex', 'w') as fout:
        base64.encode(fin, fout)
      except:
        traceback.print_exc()
        flag = -1
      return flag 

Now the decoding function:

    def decode_file_base64(bin_output):
      flag = 0
      try:
        with open('tmp.bin_hex', 'rb') as fin, open(bin_output, 'w') as fout:
          base64.decode(fin, fout)
      except:
          traceback.print_exc()
          flag = -1
      return flag

It does the job, but when I try to open the output file, I am not able to and the file appears to be 'corrupt'. I have been struggling with this more than a fair amount and I'm about to give up. I suppose I could use other types of encodings but the BOSS insists on base64 (he must have heard that it's the best...).

  • What does "I am not able to" mean? Or "appears to be 'corrupt'"? Do you get an error from your code? If so, what error, and what's the traceback? Or do you get an error from some other code that tries to open the b64 file? Or does it just not look right to you? Or…? Commented Apr 9, 2013 at 22:10
  • Aside from the binary/text mode problem noted by @abarnert, note that this is not idiomatic Python. In Python you don't catch every single exception to convert it to a C-style "error return". Instead, you let exceptions propagate, and leave them to be handled by code that actually knows how to handle them. Returning -1 is not helpful to the caller, where a propagated exception can allow the caller to log the error message. Commented Apr 9, 2013 at 22:15
  • On top of @user4815162342's point: Even if the silly 0/-1 API is an outside requirement, going into contortions to avoid early return is also not pythonic; just return 0 and return -1 and get rid of flag. And the inconsistent indentation (anywhere from 1 to 4 characters in different places) is not pythonic, and seems to have led you to a bug that I didn't notice at first: the base64.encode call is not indented under the with, so you won't even be able to run this. Commented Apr 9, 2013 at 22:22
  • OK I was aware of the silly exception handling and meant to clean it up. The issue is not in the code, and I think that abarnert hit the nail on the head. The decoded file is slightly bigger than the original PDF. I will check later tonight. Thank you very much! Commented Apr 9, 2013 at 22:34
  • BTW, with respect to your boss's requirement, base64 is usually a very good choice for passing binary data through a printable-ASCII channel. Hexlify is only a little bit simpler, and means 100% waste instead of 33%; base-96 is a lot more complicated, and only cuts the waste from 33% to 14%. In other words, your boss is probably right here. Commented Apr 9, 2013 at 22:49
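
The 100% versus 33% figures in the last comment can be checked with a few lines of Python. This is only a rough sketch: the sample size and the `overhead` helper are made up for illustration, and it measures the raw encodings, without the line breaks that `base64.encode` adds when writing to a file.

```python
import base64
import binascii
import os


def overhead(encoded_len, raw_len):
    """Extra size of the encoded form, as a fraction of the raw size."""
    return (encoded_len - raw_len) / raw_len


# Arbitrary sample data; the length is a multiple of 3 so base64 needs no padding.
data = os.urandom(3000)

hex_len = len(binascii.hexlify(data))   # two hex digits per input byte
b64_len = len(base64.b64encode(data))   # four characters per three input bytes

hex_overhead = overhead(hex_len, len(data))
b64_overhead = overhead(b64_len, len(data))

print(f"hexlify: {hex_overhead:.0%} larger, base64: {b64_overhead:.0%} larger")
```

With newlines every 76 characters, the base64 file on disk comes out slightly larger than this, but the comparison with hex still holds.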

1 Answer

I don't know if this is your problem (I don't even know what your problem is), but if you're on a platform/version/implementation where binary-mode makes a difference, you're doing it wrong:

    with open('tmp.bin_hex', 'rb') as fin, open(bin_output, 'w') as fout:

You're opening the text file (the b64 file you wrote in text mode) in binary mode, and the binary file in text mode. Try this:

    with open('tmp.bin_hex', 'r') as fin, open(bin_output, 'wb') as fout:
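
For reference, here is a minimal sketch of both functions with the modes sorted out and the exceptions left to propagate, as suggested in the comments. It assumes Python 3, where `base64.encode` writes bytes, so the base64 file is opened in binary mode on both sides; the important part is that the decoded output is never opened in text mode. The extra path parameters and the `roundtrip` helper are hypothetical additions, not part of the original code.

```python
import base64
import os
import tempfile


def encode_file_base64(bin_input, b64_output):
    # Python 3: base64.encode reads bytes and writes ASCII bytes,
    # so both files are opened in binary mode.
    with open(bin_input, 'rb') as fin, open(b64_output, 'wb') as fout:
        base64.encode(fin, fout)


def decode_file_base64(b64_input, bin_output):
    # The decoded payload must go to a binary-mode file ('wb') so that
    # no newline translation is applied to it.
    with open(b64_input, 'rb') as fin, open(bin_output, 'wb') as fout:
        base64.decode(fin, fout)


def roundtrip(data):
    """Hypothetical helper: encode and decode data via temp files, return the result."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, 'original.bin')
        b64 = os.path.join(tmp, 'tmp.bin_hex')
        dst = os.path.join(tmp, 'decoded.bin')
        with open(src, 'wb') as f:
            f.write(data)
        encode_file_base64(src, b64)
        decode_file_base64(b64, dst)
        with open(dst, 'rb') as f:
            return f.read()
```

Any I/O or decoding error simply raises, so the caller can decide whether to log it, retry, or map it to a 0/-1 return code if that API really is required.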

Meanwhile, for debugging purposes, you might want to try comparing a file to the result of encoding and decoding it. If, for example, you see that the new file is a little longer, and a hexdump shows that this is because every 0x0A byte has been replaced by two 0x0D 0x0A bytes, you know that the problem is that you're translating newlines, which in turn means that you're in text mode.
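
That comparison can be automated. The sketch below uses a hypothetical `diagnose` helper that takes the original and decoded contents as bytes and checks specifically for the LF to CRLF pattern described above; anything else is just reported as a different kind of mismatch.

```python
def diagnose(original, decoded):
    """Classify how decoded differs from original (both are bytes)."""
    if decoded == original:
        return 'identical'
    # Text-mode output on Windows turns every 0x0A into 0x0D 0x0A.
    if decoded == original.replace(b'\n', b'\r\n'):
        return 'newline translation'
    return 'other difference'


def diagnose_files(original_path, decoded_path):
    # Read both files in binary mode so the comparison sees the raw bytes.
    with open(original_path, 'rb') as f1, open(decoded_path, 'rb') as f2:
        return diagnose(f1.read(), f2.read())
```

If this reports 'newline translation', the decoded file will be longer than the original by exactly the number of 0x0A bytes in the original, which matches the "slightly bigger" symptom.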
