I have a problem using regular expressions in Python 3, so I would be grateful if someone could help me. I have a text file like the one below:
Header A
text text
text text
Header B
text text
text text
Header C
text text
here is the end
What I would like is a list of the text blocks between the headers, with each block including its own header. I am using this regular expression:
re.findall(r'(?=(Header.*?Header|Header.*?end))', data, re.DOTALL)
The result is:
['Header A\ntext text\n text text\n Header', 'Header B\ntext text\n text text\n Header', 'Header C\n text text here is the end']
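For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the file contents have already been read into a string named data; the sample string is typed in by hand from the file shown above, so the exact whitespace may differ from my real file.

```python
import re

# Sample file contents, typed in by hand (assumption: no leading spaces on any line).
data = (
    "Header A\ntext text\ntext text\n"
    "Header B\ntext text\ntext text\n"
    "Header C\ntext text\nhere is the end\n"
)

# The outer lookahead lets matches start at every header, but the capture
# group itself still runs up to and including the next "Header".
result = re.findall(r'(?=(Header.*?Header|Header.*?end))', data, re.DOTALL)

for item in result:
    print(repr(item))
```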
The problem is that every item except the last ends with the next header. As you can see, each section ends where the next header begins, but the last section does not end in a specific way.
Is there a way to get a list (not tuples) of sections, each consisting of a header and its own text, as substrings using regular expressions?
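To make the desired output concrete, here is a sketch of one direction that might work: match from each header lazily, and use a lookahead to stop just before the next header or the end of the string, so the next header is never consumed. The helper name split_sections and the sample string are only illustrative, and the sketch assumes the word "Header" never appears inside the body text.

```python
import re

def split_sections(data):
    # Hypothetical helper: each match starts at "Header" and stops right
    # before the next "Header" or at the end of the string (\Z).
    # With no capture groups, findall returns a plain list of strings.
    return re.findall(r'Header.*?(?=Header|\Z)', data, re.DOTALL)

# Sample contents, typed in by hand from the file shown above.
data = (
    "Header A\ntext text\ntext text\n"
    "Header B\ntext text\ntext text\n"
    "Header C\ntext text\nhere is the end\n"
)

sections = split_sections(data)
for section in sections:
    print(repr(section))
```

Each item keeps its trailing newline; calling strip() on the items would remove it if that is not wanted.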