If your source is ASCII, should you specify coding?

Question

We have a large project that is entirely coded in ASCII. Is there any reason to put a coding declaration at the beginning of each source file (e.g. #coding=utf-8) if the source doesn't contain any non-ASCII characters?

Thanks, --Peter

I'd certainly do it... A Python script would do it pretty quickly ;-)
– Savir, Commented Apr 14, 2014 at 17:25
No, I see no reason to specify the encoding unless it is required to run the script. Furthermore, I think adding #coding=utf8 at the top would be confusing if there are no UTF-8 characters (not overly so, but meh).
– Joran Beasley, Commented Apr 14, 2014 at 17:26
Depends what you want to happen if somebody pastes some non-ASCII UTF-8 data into a string in one of the files. Do you want it to work, or to complain that they aren't supposed to do that? What do you want if they paste in some non-ASCII data in an encoding other than UTF-8? That is to say, are your files ASCII by policy, or are they UTF-8 by policy that happens only to contain ASCII characters?
– Steve Jessop, Commented Apr 14, 2014 at 17:26
In our case, there is no policy on the encoding, but the files happen to be almost universally ASCII. It seems to me that the best course of action is to explicitly label the encoding as ASCII (except for any files that actually need UTF-8), since Python 3 will (I hope) be in our future at some point, and since we get a very small benefit from the ASCII declaration: should non-ASCII text accidentally be committed to a file, the interpreter will raise an exception.
– pbanka, Commented Apr 14, 2014 at 21:02

Burhan Khalid · Accepted Answer · 2014-04-14 17:28:51Z

2

For portability I would explicitly declare it, especially as the default source encoding changes in Python 3 (see PEP 3120):

This PEP proposes to change the default source encoding from ASCII to UTF-8. Support for alternative source encodings continues to exist; an explicit encoding declaration takes precedence over the default.

Although the change doesn't affect ASCII-only files, explicit is better than implicit, so I would recommend adding the declaration to the top of each file.
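To see how a declaration takes precedence over the default, here is a minimal sketch for Python 3 using the standard library's `tokenize.detect_encoding`. The helper name `declared_encoding` is made up for illustration; it reports the encoding Python 3 would pick for a given chunk of source bytes.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the encoding Python 3 would use to decode these source bytes."""
    # detect_encoding reads at most two lines looking for a BOM or a
    # PEP 263 coding comment, and falls back to UTF-8 when it finds neither.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# No declaration: the Python 3 default applies.
print(declared_encoding(b"x = 1\n"))
# An explicit declaration overrides the default.
print(declared_encoding(b"# coding=ascii\nx = 1\n"))
```

The first call should report the UTF-8 default and the second the declared ASCII encoding.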


1 Comment

jfs Over a year ago

The source code encoding declaration is redundant for ASCII-only files.

jfs · Accepted Answer · 2014-04-14 17:38:54Z

1

ASCII is the default in Python 2. UTF-8 is the default in Python 3.

If your files are ASCII-only, you don't need to declare the source code encoding in either version (ASCII is a subset of UTF-8).

With no encoding declared, a non-ASCII character leads to a SyntaxError in Python 2, so an accidental non-ASCII character won't go unnoticed and won't corrupt any data. There is no reason to declare the source code encoding for ASCII-only files.
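The subset relationship can be checked directly at the codec level. This is only a sketch of the byte-level behaviour, not of the interpreter's own SyntaxError; the helper `decodes_as_ascii` and the sample strings are made up for illustration.

```python
def decodes_as_ascii(data):
    """Return True if the bytes are valid ASCII, False otherwise."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


ascii_source = b"greeting = 'hello'\n"
# Pure ASCII bytes decode to the same text under either codec,
# which is why the default (ASCII or UTF-8) makes no difference here.
same_text = ascii_source.decode("ascii") == ascii_source.decode("utf-8")

# One accented character encoded as UTF-8 is no longer valid ASCII.
utf8_source = "greeting = 'h\u00e9llo'\n".encode("utf-8")

print(same_text, decodes_as_ascii(ascii_source), decodes_as_ascii(utf8_source))
```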


Deduplicator · Accepted Answer · 2014-04-14 17:40:10Z

1

You should do one of two things (at least):

Add a hook to your repository that verifies on check-in that all Python files are still pure ASCII.
Put an explicit ASCII encoding declaration in the files.

You might want to check whether startup is significantly faster when the explicit tag is UTF-8, though. If it is, I would consider that a bug in the interpreter.

This way, if anyone slips and mistakenly adds some non-ASCII characters, you won't have to chase that (potential) bug. Explicitly restricting to ASCII has one advantage: you can reliably see what each string contains, and there are no distinct names that look identical.
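The check such a hook would run could look roughly like the following Python 3 sketch. The function names `find_non_ascii` and `check_files` are hypothetical, and how the script gets wired into a particular version control system is left out.

```python
def find_non_ascii(data):
    """Return (line, column, byte) for each non-ASCII byte, 1-based."""
    hits = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        # Iterating over bytes yields integers in Python 3.
        for col_no, byte in enumerate(line, start=1):
            if byte > 0x7F:
                hits.append((line_no, col_no, byte))
    return hits


def check_files(paths):
    """Print every offending byte and return 1 if any was found, else 0."""
    status = 0
    for path in paths:
        with open(path, "rb") as handle:
            for line_no, col_no, byte in find_non_ascii(handle.read()):
                print("%s:%d:%d: non-ASCII byte 0x%02x"
                      % (path, line_no, col_no, byte))
                status = 1
    return status
```

A hook script would pass the list of changed Python files to `check_files` and exit with its return value, so that a non-zero status rejects the check-in.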
