Given data similar to the following:
['blah_12_1_bbc_services_cbbc',
'blah_12_1_high-profile_and_a',
'blah_12_1_iplayer,_known',
'blah_12_1_sport,_as_co-branded',
'er_ds_such_it',
'er_ds_websites_bbc_video',
'er_ds_bbc',
'er_ds_service._sport,',
'th_ss_13_a',
"th_ss_13_iplayer,_large_bbc's",
'th_ss_13_the_a_co-branded',
"th_ss_13_the_bbc's_bbc's"]
I'd like to create a list containing, for each item, only the prefix it shares with the other items in its group:
['blah_12_1_',
'blah_12_1_',
'blah_12_1_',
'blah_12_1_',
'er_ds_',
'er_ds_',
'er_ds_',
'er_ds_',
'th_ss_13_',
'th_ss_13_',
'th_ss_13_',
'th_ss_13_']
Since the prefixes to extract differ in length and structure (some contain numbers, some have two parts, some three), I'm not sure how to go about this.
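For reference, here is a rough sketch of the kind of thing I have in mind. It rests on an assumption that may not hold for real data: that items belonging to the same group always share their first `_`-separated token. It groups on that token, takes the longest common prefix of each group with `os.path.commonprefix`, and trims the result back to the last underscore so the prefix ends on a token boundary. The names `first_token` and `group_prefixes` are made up for illustration.

```python
from os.path import commonprefix

data = ['blah_12_1_bbc_services_cbbc',
        'blah_12_1_high-profile_and_a',
        'blah_12_1_iplayer,_known',
        'blah_12_1_sport,_as_co-branded',
        'er_ds_such_it',
        'er_ds_websites_bbc_video',
        'er_ds_bbc',
        'er_ds_service._sport,',
        'th_ss_13_a',
        "th_ss_13_iplayer,_large_bbc's",
        'th_ss_13_the_a_co-branded',
        "th_ss_13_the_bbc's_bbc's"]


def first_token(s):
    # Hypothetical grouping key: everything before the first underscore.
    return s.split('_', 1)[0]


def group_prefixes(items):
    # ASSUMPTION: items in the same group share their first '_'-separated token.
    groups = {}
    for s in items:
        groups.setdefault(first_token(s), []).append(s)

    prefixes = {}
    for key, members in groups.items():
        # Character-wise longest common prefix of all strings in the group.
        p = commonprefix(members)
        # Cut back to the last underscore so the prefix ends on a token boundary.
        prefixes[key] = p[:p.rfind('_') + 1]

    # One prefix per input item, in the original order.
    return [prefixes[first_token(s)] for s in items]


result = group_prefixes(data)
```

On the sample data above this produces the list I'm after, but a group with only one member would keep everything up to its last underscore, and two groups that happen to start with the same token would be merged, so it is not a general solution.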