I want to pull out all of the Python functions within a Python script. Is there a single regex I can use to do this, e.g.:
import re
all_functions = re.findall(regex, python_script)
I have implemented a very cumbersome way of doing this involving many if statements, but I feel there is a more elegant solution with regexes.
I think the regex should be something like this:
'def.*?\n\S'
because:
- Functions start with def
- Followed by anything (but we want to be non-greedy)
- A function ends when, after a newline character \n, the first character of the next line is not whitespace (\S)
However, I can't seem to get this to work over multiple lines.
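For what it's worth, the pattern stops at the first line break because . does not match \n unless re.DOTALL is set. Also, the trailing \S consumes the first character of the next line, so with findall a function that immediately follows another one would be skipped. A minimal sketch of a variant that avoids both problems, assuming only top-level functions are wanted (find_top_level_functions and the sample script are made up for illustration):

```python
import re

# Illustrative pattern, not a real parser:
#   ^def        a "def" at the start of a line (re.MULTILINE)
#   .*?         anything, non-greedy, including newlines (re.DOTALL)
#   (?=^\S|\Z)  stop before the next line that starts with a
#               non-whitespace character, or at the end of the text
FUNC_PATTERN = re.compile(r"^def .*?(?=^\S|\Z)", re.DOTALL | re.MULTILINE)


def find_top_level_functions(python_script):
    """Return the text of each top-level 'def' block (hypothetical helper)."""
    return FUNC_PATTERN.findall(python_script)


python_script = (
    "import os\n"
    "\n"
    "def a(x):\n"
    "    return x\n"
    "\n"
    "def b():\n"
    "    y = 1\n"
    "\n"
    "    return y\n"
    "\n"
    "print(a(1))\n"
)

all_functions = find_top_level_functions(python_script)
for func in all_functions:
    print(func.rstrip())
    print("-----")
```

This still breaks easily: a def at column 0 inside a triple-quoted string is matched, and a comment or a closing parenthesis at column 0 inside a function cuts the match short. Methods and nested functions are ignored because they are indented.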
Edit: Python functions may be contained in files that don't have a .py extension; e.g. they can be in IPython notebooks with the .ipynb extension, so I can't always import the code and use dir().
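For the notebook case, a .ipynb file is JSON, so the code can be pulled out as plain text before any searching is done. A rough sketch, assuming the version 4 notebook layout (a top-level "cells" list whose entries have "cell_type" and "source"); notebook_code is a hypothetical helper name:

```python
import json


def notebook_code(notebook_json):
    """Join the source of all code cells in a notebook given as a JSON string."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # "source" may be stored as one string or as a list of lines
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source)
    return "\n\n".join(chunks)


# Small in-memory notebook standing in for the contents of a real file
example = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["def f():\n", "    return 1"]},
        {"cell_type": "code", "source": "x = f()"},
    ]
})

print(notebook_code(example))
```

Cells can contain IPython magics such as lines starting with %, which are not valid Python, so the result may need filtering before further processing.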
dir and check their types? If the code is to be trusted, of course.

def can't be followed by "anything"; there's a definition for identifiers. Perhaps you should look into Python's grammar? Also, if you want . to include line breaks, you need re.DOTALL.

'def [A-Za-z_][A-Za-z_0-9]*?\(.*?\)', assuming only well-written Python functions are contained in the file. I think that should detect the beginning of functions.

lambda statements, assigned to keys in the globals() dictionary, or appear as if by magic wherever eval rears its ugly head... and I'm sure I've missed some. In general, you can't use regular expressions to parse a non-regular language like Python. Can't be done, except in ridiculously limited circumstances... and even then, only when you already know exactly what you're looking for.
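Along the lines of the "look into Python's grammar" suggestion, the standard library ast module parses source text without importing or running it. A minimal sketch, assuming Python 3.8+ (for ast.get_source_segment) and source that is syntactically valid; extract_functions is a hypothetical name:

```python
import ast


def extract_functions(source):
    """Return (name, source_text) pairs for every def / async def in source."""
    tree = ast.parse(source)  # parses only; nothing is executed
    found = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            found.append((node.name, ast.get_source_segment(source, node)))
    return found


source = (
    "def top(x):\n"
    "    return x + 1\n"
    "\n"
    "class C:\n"
    "    def method(self):\n"
    "        return 2\n"
    "\n"
    "square = lambda n: n * n\n"
)

for name, text in extract_functions(source):
    print(name)
    print(text)
    print("-----")
```

This finds methods and nested functions as well as top-level ones. It does not report lambdas (those are ast.Lambda nodes) or functions created at runtime through globals() or eval, which no static approach can see.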