I've got a string:
s = ".,-2gg,,,-2gg,-2gg,,,-2gg,,,,,,,,t,-2gg,,,,,,-2gg,t,,-1gtt,,,,,,,,,-1gt,-3ggg"
and a regular expression I'm using:
import re
delre = re.compile('-[0-9]+[ACGTNacgtn]+')  # this is almost correct
print(delre.findall(s))
This returns:
['-2gg', '-2gg', '-2gg', '-2gg', '-2gg', '-2gg', '-1gtt', '-1gt', '-3ggg']
But '-1gtt' and '-1gt' are not the desired matches. The integer defines how many of the following characters belong to the match, so the desired output for those two would be '-1g' and '-1g', respectively.
Is there a way to grab the integer after the dash and dynamically define the regex so that it matches that many and only that many subsequent characters?
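One possible approach, sketched under the assumption that a single static pattern cannot use a captured number as a repeat count in Python's `re`: capture the integer and the run of bases in separate groups, then trim the bases to the captured length in ordinary Python code. The helper name `find_deletions` is hypothetical and only for illustration.

```python
import re

s = ".,-2gg,,,-2gg,-2gg,,,-2gg,,,,,,,,t,-2gg,,,,,,-2gg,t,,-1gtt,,,,,,,,,-1gt,-3ggg"

# Group 1 captures the count, group 2 the whole run of bases after it.
delre = re.compile('-([0-9]+)([ACGTNacgtn]+)')

def find_deletions(text):
    """Hypothetical helper: keep only as many bases as the integer specifies."""
    results = []
    for m in delre.finditer(text):
        count = int(m.group(1))
        # Slice the greedy run of bases down to the declared length.
        bases = m.group(2)[:count]
        results.append('-' + m.group(1) + bases)
    return results

print(find_deletions(s))
# ['-2gg', '-2gg', '-2gg', '-2gg', '-2gg', '-2gg', '-1g', '-1g', '-3ggg']
```

Note that the bases trimmed off (the `tt` in `-1gtt`, for example) are still consumed by the regex match and are simply discarded. If the run of bases is shorter than the integer, the slice returns what is available without raising an error, so a length check would be needed if that case should be rejected.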