4

You might want the docstring not to affect the hash, for example as in joblib's Memory.

Is there a good way to strip the docstring? inspect.getsource and inspect.getdoc work against each other: inspect.getdoc returns a "cleaned" docstring, so its text no longer matches what appears in the source.

1
  • 2
    Do you want readable code, or just something whose hash will not change if the code does not change? Commented Mar 9, 2020 at 15:05

4 Answers

2

In case anyone is still looking for a solution, this is how I built it:

from ast import Constant, Expr, FunctionDef, Module, parse
from inspect import getsource
from textwrap import dedent
from types import FunctionType
from typing import cast


def get_source_without_docstring(obj: FunctionType) -> str:
    # Get cleanly indented source code of the function
    obj_source = dedent(getsource(obj))

    # Parse the source code into an Abstract Syntax Tree.
    # The root of this tree is a Module node.
    module: Module = parse(obj_source)

    # The first child of the Module node is the FunctionDef node that represents
    # the function definition. The cast only informs the type checker; it does
    # not verify the node type at runtime.
    function_def = cast(FunctionDef, module.body[0])

    # The first statement of a function could be a docstring, which in AST
    # is represented as an Expr node. To remove the docstring, we need to find
    # this Expr node.
    first_stmt = function_def.body[0]

    # Check if the first statement is a docstring (a constant str expression)
    if (
        isinstance(first_stmt, Expr)
        and isinstance(first_stmt.value, Constant)
        and isinstance(first_stmt.value.value, str)
    ):
        # Split the original source code by lines
        code_lines: list[str] = obj_source.splitlines()

        # Delete the lines corresponding to the docstring from the list.
        # Note: We are using 0-based list index, but the line numbers in the
        # parsed AST nodes are 1-based. So, we need to subtract 1 from the
        # 'lineno' property of the node.
        del code_lines[first_stmt.lineno - 1 : first_stmt.end_lineno]

        # Join the remaining lines back into a single string
        obj_source = "\n".join(code_lines)

    # Return the source code of function without docstrings
    return obj_source

Note: the code is mine; the comments were written by OpenAI's GPT.
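
If only a stable hash is needed rather than readable source, a related sketch is to drop the docstring node from the AST and hash the dumped tree. This is not the code above: `hash_without_docstring` is a hypothetical helper name, SHA-256 is an arbitrary choice, and the function takes a source string (for example the result of `inspect.getsource`). Because it hashes the AST, it also ignores comments and formatting, which may or may not be what you want.

```python
import ast
import hashlib
from textwrap import dedent


def hash_without_docstring(source: str) -> str:
    """Return a SHA-256 hex digest of `source` that ignores the docstring.

    Hypothetical helper: expects the source of one function or class.
    """
    tree = ast.parse(dedent(source))
    node = tree.body[0]
    if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
        raise TypeError("expected a function or class definition")

    # ast.get_docstring returns None when the first statement is not a docstring.
    if ast.get_docstring(node, clean=False) is not None:
        # Keep the body non-empty when the docstring was the only statement.
        node.body = node.body[1:] or [ast.Pass()]

    # ast.dump leaves out line and column attributes by default, so removing
    # the docstring does not shift anything that feeds into the hash.
    return hashlib.sha256(ast.dump(tree).encode("utf-8")).hexdigest()
```

With a function object `f`, this would be called as `hash_without_docstring(inspect.getsource(f))`.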


2

If you just want to hash the body of a function, regardless of the docstring, you can use the function.__code__ attribute.

It gives access to a code object which is not affected by the docstring.

Unfortunately, with this approach you cannot get a readable version of the source.

def foo():
    """Prints 'foo'"""
    print('foo')


print(foo.__doc__)  # Prints 'foo'
print(foo.__code__.co_code)  # b't\x00d\x01\x83\x01\x01\x00d\x02S\x00'
foo.__doc__ += 'pouet'
print(foo.__doc__)  # Prints 'foo'pouet
print(foo.__code__.co_code)  # b't\x00d\x01\x83\x01\x01\x00d\x02S\x00'
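
One caveat worth checking before hashing `co_code` on its own: the bytecode refers to constants by index, so two functions that differ only in a literal can share the same `co_code`. The sketch below illustrates this on CPython; `greet_foo` and `greet_bar` are hypothetical names used only for the demonstration.

```python
def greet_foo():
    print('foo')


def greet_bar():
    print('bar')


# The instruction streams are compared separately from the constants they use.
same_bytecode = greet_foo.__code__.co_code == greet_bar.__code__.co_code
same_consts = greet_foo.__code__.co_consts == greet_bar.__code__.co_consts

print(same_bytecode)
print(same_consts)
```

If the bytecode matches while the constants differ, a hash built from `co_code` alone would treat the two functions as identical, so a hash based on the code object would need to take more than `co_code` into account.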

1 Comment

This is not stable across runs, as far as I know, so it is not suitable for hashing.
1

One approach is to delete the docstring from the source using regex:

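import re
# Raw strings avoid invalid-escape warnings; DOTALL lets "." span multi-line docstrings.
nodoc = re.sub(r":\s*'''.*?'''", ":", source, flags=re.DOTALL)
nodoc = re.sub(r':\s*""".*?"""', ":", nodoc, flags=re.DOTALL)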

This currently works for functions and classes only; maybe someone can find a pattern for modules too.
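
A colon-anchored pattern like the one above cannot tell a docstring from any other triple-quoted string that follows a colon. The sketch below shows this on a function with no docstring at all; `PATTERNS`, `strip_with_regex`, and `SOURCE` are hypothetical names, and the patterns are an assumed variant using raw strings and `re.DOTALL`.

```python
import ast
import re

# Assumed patterns for illustration: a colon, optional whitespace, then a
# triple-quoted string in either quote style.
PATTERNS = (r":\s*'''.*?'''", r':\s*""".*?"""')


def strip_with_regex(source: str) -> str:
    """Remove every triple-quoted string that directly follows a colon."""
    for pattern in PATTERNS:
        source = re.sub(pattern, ":", source, flags=re.DOTALL)
    return source


SOURCE = '''def load():
    table = {"greeting": """not a docstring"""}
    return table
'''

stripped = strip_with_regex(SOURCE)

# The AST confirms that the function has no docstring to begin with.
has_docstring = ast.get_docstring(ast.parse(SOURCE).body[0]) is not None

print(has_docstring)
print("not a docstring" in stripped)
```

The dictionary value is removed even though it is not a docstring, and the stripped text is no longer valid Python, which is why an AST-based check is the safer way to locate the docstring.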

1 Comment

This could also delete multiline strings that are not docstrings, if the function doesn't have a docstring or if it has a multiline string with a different quote character to the docstring.
0

There is a simple solution:

def fun(a,b):
    '''hahah'''
    return a+b
# we simply delete the docstring
fun.__doc__ = ''
print(help(fun))

This code yields:

Help on function fun in module __main__:

fun(a, b)

1 Comment

That seems to only print the function signature, not the whole source as required. Also, help() prints the help, it doesn't return it as a string. (Your example would also print "None" at the end.)
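
For returning the source as a string with the docstring removed, one possible sketch is to drop the docstring node and regenerate code with `ast.unparse` (available from Python 3.9). `source_without_docstring` is a hypothetical name, and the regenerated code loses comments and original formatting, so this is only an approximation of the original source.

```python
import ast
from textwrap import dedent


def source_without_docstring(source: str) -> str:
    """Return regenerated source for one function or class, minus its docstring."""
    tree = ast.parse(dedent(source))
    node = tree.body[0]

    # ast.get_docstring returns None when the first statement is not a docstring.
    if ast.get_docstring(node, clean=False) is not None:
        # Keep the body non-empty when the docstring was the only statement.
        node.body = node.body[1:] or [ast.Pass()]

    # ast.unparse rebuilds source text from the tree (comments are not kept).
    return ast.unparse(tree)


result = source_without_docstring("def fun(a, b):\n    '''hahah'''\n    return a + b\n")
print(result)
```

Unlike `help()`, this returns a string, so the result can be hashed or compared directly.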
