Python 3: How to check if a string can be a valid variable?

Question

I have a string and want to check whether it can be used as a valid variable name without causing a syntax error. For example:

def variableName(string):
    #if string is valid variable name:
        #return True
    #else:
        #return False

input >>> variableName("validVariable")
output >>> True
input >>> variableName("992variable")
output >>> False

I would rather not use str.isidentifier(). I want to write a function of my own.

DYZ · Accepted Answer · 2018-03-17 06:11:49Z

7

The regular expression at the end of this answer is accurate only for "old-style" Python-2.7 (ASCII) identifiers. The built-in method handles Python 3 identifiers:

"validVariable".isidentifier()
#True
"992variable".isidentifier()
#False

Since you changed your question after I posted the answer, consider writing a regular expression:

re.match(r"[_a-z]\w*$", yourstring, flags=re.I)
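
A runnable sketch of the regular-expression approach follows. The function name is_ascii_identifier is hypothetical, and the sketch assumes that only ASCII names should pass. It swaps re.match with a trailing $ for re.fullmatch, because $ also matches just before a final newline.

```python
import re


def is_ascii_identifier(candidate):
    """Return True if candidate is an ASCII letter or underscore
    followed by ASCII letters, digits, or underscores."""
    # re.ASCII limits \w to [A-Za-z0-9_]; re.I lets [_a-z] cover A-Z too.
    pattern = r"[_a-z]\w*"
    return re.fullmatch(pattern, candidate, flags=re.I | re.ASCII) is not None


print(is_ascii_identifier("validVariable"))  # True
print(is_ascii_identifier("992variable"))    # False
print(is_ascii_identifier("valid\n"))        # False

# For comparison, match() with "$" accepts the trailing newline.
print(re.match(r"[_a-z]\w*$", "valid\n", flags=re.I) is not None)  # True
```

This version does not reject keywords such as def, and it rejects the non-ASCII names that Python 3 allows.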

edited Mar 17, 2018 at 6:11

answered Mar 17, 2018 at 2:00

DYZ


9 Comments

user8177336 Over a year ago

OP says: "I would not like to use the .isidentifier(). I want to make a function of my own." So your solution isn't answering the question, I think. Forgive me if I'm wrong.

Radhe Krishna Over a year ago

Yes, @BOi is correct. Your answer does not answer my question. I do not want to use .isidentifier(); I want to create my own function.

DYZ Over a year ago

@RadheKrishna You changed your question after I posted my answer. Consider using regular expressions, then. (I modified the answer.)

ohmu Over a year ago

You can simplify the second part of your regular expression to \w, which matches letters, digits, and _.

anthony sottile Over a year ago

@DyZ Python 3 extends identifiers to a bunch of non-ASCII characters:
>>> Ä = 1
>>> print(Ä)
1


Community · Accepted Answer · 2020-06-20 09:12:55Z

In Python 3, a valid identifier can contain characters outside the ASCII range. Since you don't want to use str.isidentifier, you can write your own version of it in Python.

Its specification can be found here: https://www.python.org/dev/peps/pep-3131/#specification-of-language-changes

Implementation:

import keyword
import re
import unicodedata


def is_other_id_start(char):
    """
    Item belongs to Other_ID_Start in
    http://unicode.org/Public/UNIDATA/PropList.txt
    """
    return bool(re.match(r'[\u1885-\u1886\u2118\u212E\u309B-\u309C]', char))


def is_other_id_continue(char):
    """
    Item belongs to Other_ID_Continue in
    http://unicode.org/Public/UNIDATA/PropList.txt
    """
    return bool(re.match(r'[\u00B7\u0387\u1369-\u1371\u19DA]', char))


def is_xid_start(char):

    # ID_Start is defined as all characters having one of
    # the general categories uppercase letters(Lu), lowercase
    # letters(Ll), titlecase letters(Lt), modifier letters(Lm),
    # other letters(Lo), letter numbers(Nl), the underscore, and
    # characters carrying the Other_ID_Start property. XID_Start
    # then closes this set under normalization, by removing all
    # characters whose NFKC normalization is not of the form
    # ID_Start ID_Continue * anymore.

    category = unicodedata.category(char)
    return (
        category in {'Lu', 'Ll', 'Lt', 'Lm', 'Lo', 'Nl'} or
        is_other_id_start(char)
    )


def is_xid_continue(char):
    # ID_Continue is defined as all characters in ID_Start, plus
    # nonspacing marks (Mn), spacing combining marks (Mc), decimal
    # number (Nd), connector punctuations (Pc), and characters
    # carrying the Other_ID_Continue property. Again, XID_Continue
    # closes this set under NFKC-normalization; it also adds U+00B7
    # to support Catalan.

    category = unicodedata.category(char)
    return (
        is_xid_start(char) or
        category in {'Mn', 'Mc', 'Nd', 'Pc'} or
        is_other_id_continue(char)
    )


def is_valid_identifier(name):
    # An empty string is never a valid identifier, and indexing
    # name[0] below would raise IndexError on it.
    if not name:
        return False

    # All identifiers are converted into the normal form NFKC
    # while parsing; comparison of identifiers is based on NFKC.
    name = unicodedata.normalize(
        'NFKC', name
    )

    # check if it's a keyword
    if keyword.iskeyword(name):
        return False

    # The identifier syntax is <XID_Start> <XID_Continue>*.
    if not (is_xid_start(name[0]) or name[0] == '_'):
        return False

    return all(is_xid_continue(char) for char in name[1:])

if __name__ == '__main__':
    # From goo.gl/pvpYg6
    assert is_valid_identifier("a") is True
    assert is_valid_identifier("Z") is True
    assert is_valid_identifier("_") is True
    assert is_valid_identifier("b0") is True
    assert is_valid_identifier("bc") is True
    assert is_valid_identifier("b_") is True
    assert is_valid_identifier("µ") is True
    assert is_valid_identifier("𝔘𝔫𝔦𝔠𝔬𝔡𝔢") is True

    assert is_valid_identifier(" ") is False
    assert is_valid_identifier("[") is False
    assert is_valid_identifier("©") is False
    assert is_valid_identifier("0") is False

You can check CPython's and PyPy's implementations here and here, respectively.
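
To see why the NFKC step in is_valid_identifier matters, the following minimal sketch shows two compatibility characters being folded into their plain forms, and then the interpreter binding the folded name. The helper name nfkc_name is hypothetical and exists only for illustration. The exec-based check assumes the identifier normalization described in PEP 3131.

```python
import unicodedata


def nfkc_name(name):
    """Return the NFKC-normalized form that identifiers are compared by."""
    return unicodedata.normalize("NFKC", name)


# U+FB01 (LATIN SMALL LIGATURE FI) folds to the two letters "fi".
ligature = "\ufb01le"
print(nfkc_name(ligature) == "file")      # True

# U+00B5 (MICRO SIGN) folds to U+03BC (GREEK SMALL LETTER MU).
print(nfkc_name("\u00b5") == "\u03bc")    # True

# Assigning through the ligature spelling binds the normalized name.
namespace = {}
exec(ligature + " = 1", namespace)
print("file" in namespace)                # True
```

This is why the keyword and character checks are applied to the normalized string and not to the raw input.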

Alain T. · Accepted Answer · 2018-03-17 02:26:10Z

0

You could use a regular expression.

For example:

isValidIdentifier = re.match("[A-Za-z_](0-9A-Za-z_)*",identifier)

Note that this only checks for ASCII letters, digits, and underscores. The actual standard supports other characters. See here: https://www.python.org/dev/peps/pep-3131/

You may also need to exclude reserved words such as def, True, False, and so on. See here: https://www.programiz.com/python-programming/keywords-identifier
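
Putting the two notes above together, a minimal ASCII-only sketch might pair a character-class pattern with the standard library's keyword.iskeyword. The function name is_ascii_variable_name is hypothetical, and the pattern deliberately ignores the non-ASCII identifiers that PEP 3131 allows.

```python
import keyword
import re

# One ASCII letter or underscore, then any number of ASCII letters,
# digits, or underscores. Square brackets make these character classes.
_ASCII_NAME = re.compile(r"[A-Za-z_][0-9A-Za-z_]*")


def is_ascii_variable_name(candidate):
    """Return True for ASCII-only names that are not reserved words."""
    if keyword.iskeyword(candidate):
        return False
    # fullmatch requires the whole string to fit the pattern.
    return _ASCII_NAME.fullmatch(candidate) is not None


print(is_ascii_variable_name("validVariable"))  # True
print(is_ascii_variable_name("992variable"))    # False
print(is_ascii_variable_name("def"))            # False
```

Using keyword.iskeyword avoids maintaining a hand-written list of reserved words, which differs between Python versions.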

answered Mar 17, 2018 at 2:26

Alain T.


2 Comments

ohmu Over a year ago

Your regular expression is malformed. I think you meant [0-9A-Za-z_] instead of (0-9A-Za-z_).

Alain T. Over a year ago

Strangely enough it works in IDLE (MacOS). Anyhow DyZ had already provided that answer (with a proper regexp). I didn't notice it before posting.
