tokenize.COMMENT only matches comments (text beginning with #), not multiline string literals or docstrings.
However, you can use this regex to extract the multiline strings from your example:
import re
file = "file.py"
with open(file, "r") as f:
    content = f.read()
p = re.compile('(?:""")(.*?)(?:""")', re.DOTALL)
result = p.findall(content)
print(result)
Output:
['\nmultiline comment above class read me\nread me too\n', '\n multiline inside class\n ']
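If you want to keep the """, use capturing groups instead of non-capturing groups: (""") instead of (?:"""). Note that with three capturing groups, findall returns a tuple of three strings for each match instead of a single string.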
Using re.DOTALL is important: it allows the dot . to match any character, including a newline.
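To make these two notes concrete, here is a minimal sketch run on an inline string instead of a file. The `sample` text and the variable names are made up for illustration only.

```python
import re

# Hypothetical sample text, used only for illustration.
sample = 'x = 1\n"""\nfirst line\nsecond line\n"""\ny = 2\n'

pattern = '(?:""")(.*?)(?:""")'

# Without re.DOTALL the dot stops at newlines, so a string that
# spans several lines is not matched at all.
without_dotall = re.findall(pattern, sample)

# With re.DOTALL the dot also matches newlines.
with_dotall = re.findall(pattern, sample, re.DOTALL)

# With three capturing groups, findall returns one tuple per match:
# (opening quotes, body, closing quotes).
with_quotes = re.findall('(""")(.*?)(""")', sample, re.DOTALL)

print(without_dotall)
print(with_dotall)
print(with_quotes)
```

If you need the quotes and the body as one string, you can join each tuple with "".join(...).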
A little warning:
Please note that, as @edusanketd said in a comment, this regex will also match triple quotes used inside regular strings or single-line comments. So this regex is not a panacea: if all your Python files are structured as in your example (""" is used ONLY for multiline strings), it will be fine, but if some files use """ for other purposes (like triple quotes inside regular strings), there will be some "errors".
Example code showing the limits of this regex:
"""
multiline comment above class read me
read me too
"""
# dont read mee
class TestComment:
    """
    multiline inside class
    """
    def aFunc(self):
        pass
a_string = '"""THIS IS NOT A COMMENT"""'
# """dont read me too"""
Output:
['\nmultiline comment above class read me\nread me too\n', '\n multiline inside class\n ', 'THIS IS NOT A COMMENT', 'dont read me too']
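If those false positives are a problem, one possible alternative is to let the tokenize module decide what is a string and what is a comment. This is only a sketch, not a drop-in replacement for the regex: the helper names `triple_quoted_strings` and `comments` are made up here, the sample is inlined instead of read from a file, and prefixed strings such as r"""...""" are deliberately ignored to keep it short.

```python
import io
import tokenize

# Inline copy of the example above (assumed to use 4-space indentation).
source = '''"""
multiline comment above class read me
read me too
"""
# dont read mee
class TestComment:
    """
    multiline inside class
    """
    def aFunc(self):
        pass

a_string = '"""THIS IS NOT A COMMENT"""'
# """dont read me too"""
'''


def triple_quoted_strings(code):
    """Return the body of every STRING token that starts with three double quotes."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(code).readline):
        # tok.string is the literal as written in the source, quotes included.
        if tok.type == tokenize.STRING and tok.string.startswith('"""'):
            found.append(tok.string[3:-3])
    return found


def comments(code):
    """Return the text of every COMMENT token (lines or trailers starting with #)."""
    return [
        tok.string
        for tok in tokenize.generate_tokens(io.StringIO(code).readline)
        if tok.type == tokenize.COMMENT
    ]


print(triple_quoted_strings(source))
print(comments(source))
```

Here the triple quotes inside a_string and inside the # comment are not reported as multiline strings, because the tokenizer sees them as part of a single-quoted STRING token and a COMMENT token. Keep in mind that this still returns every triple-quoted string, including ones assigned to a variable, so it does not tell docstrings apart from other multiline strings.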
Some information about multi-line strings as multi-line comments:
A tweet from Guido van Rossum:
https://twitter.com/gvanrossum/status/112670605505077248?lang=en
Python tip: You can use multi-line strings as multi-line comments.
Unless used as docstrings, they generate no code! :-)
And here is an interesting post from Sean Gillies on this subject:
https://sgillies.net/2017/05/30/python-multi-line-comments-and-triple-quoted-strings.html
"""it throws an exception. Interesting, they did handle the exception.
    459 if contstr:  # continued string
    460     if not line:
--> 461         raise TokenError("EOF in multi-line string", strstart)
    462     endmatch = endprog.match(line)
    463     if endmatch:
TokenError: ('EOF in multi-line string', (13, 0))