According to PEP 3131, the first character of an identifier must belong to XID_Start and every following character to XID_Continue. These build on ID_Start and ID_Continue, which the PEP defines as follows:
ID_Start is defined as all characters having one of the general
categories uppercase letters (Lu), lowercase letters (Ll), titlecase
letters (Lt), modifier letters (Lm), other letters (Lo), letter
numbers (Nl), the underscore, and characters carrying the
Other_ID_Start property. XID_Start then closes this set under
normalization, by removing all characters whose NFKC normalization is
not of the form ID_Start ID_Continue* anymore.
ID_Continue is defined as all characters in ID_Start, plus
nonspacing marks (Mn), spacing combining marks (Mc), decimal number
(Nd), connector punctuations (Pc), and characters carrying the
Other_ID_Continue property. Again, XID_Continue closes this set
under NFKC-normalization; it also adds U+00B7 to support Catalan.
That's a long list (currently around 120,000 characters). Fortunately, there is a helpful project on GitHub that contains the list and a script to generate it.
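If you would rather not depend on an external list, you can also ask the interpreter directly. The following is only a minimal sketch, not the script from the project mentioned above: it relies on the built-in `str.isidentifier()`, and the helper names `is_identifier_start`, `is_identifier_continue` and `identifier_chars` are made up for illustration. The result depends on the Unicode version bundled with your Python build, so the totals can differ between versions.

```python
import sys


def is_identifier_start(ch: str) -> bool:
    # A one-character string is a valid identifier exactly when
    # that character is allowed in first position.
    return ch.isidentifier()


def is_identifier_continue(ch: str) -> bool:
    # Prefix a known-good start character, so the check only
    # depends on whether ch may appear after the first position.
    return ("a" + ch).isidentifier()


def identifier_chars(limit: int = sys.maxunicode + 1):
    """Collect start and continue characters below the code point `limit`."""
    starts, continues = [], []
    for code_point in range(limit):
        ch = chr(code_point)
        if is_identifier_start(ch):
            starts.append(ch)
        if is_identifier_continue(ch):
            continues.append(ch)
    return starts, continues


if __name__ == "__main__":
    starts, continues = identifier_chars()
    print(len(starts), "start characters")
    print(len(continues), "continue characters")
```

Restricting the scan to ASCII with `identifier_chars(128)` gives the familiar 53 start characters (`A-Z`, `a-z`, `_`) and 63 continue characters (the same plus `0-9`), which is a quick sanity check that the helpers behave as expected.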