I am trying to do something which I thought would be simple (and probably is), however I am hitting a wall. I have a string that contains document numbers. In most cases the format is ######-#-### however in some cases, where the single digit should be, there are multiple single digits separated by a comma (i.e. ######-#,#,#-###). The number of single digits separated by a comma is variable. Below is an example:
For the string below:
('030421-1,2-001 & 030421-1-002,030421-1,2,3-002, 030421-1-003')
I need to return:
['030421-1-001', '030421-2-001' '030421-1-002', '030421-1-002', '030421-2-002', '030421-3-002' '030421-1-003']
I have only gotten as far as returning the strings that match the ######-#-### pattern:
import re
p = re.compile('\d{6}-\d{1}-\d{3}')
m = p.findall('030421-1,2-001 & 030421-1-002,030421-1,2,3-002, 030421-1-003')
print m
Thanks in advance for any help!
Matt
findallfunc would modify your code.