This produces the desired ordering. It's not technically a sort, though: it's just a regular expression search of sort_str.
>>> import re
>>>
>>> sort_str = "deplete mineral resources , from 123 in x 123 in x " \
... "19 ft , on 24 ft t shaped hole"
>>>
>>> str_list = ['123', '123', '19', '24', 'in', 'in', 'ft', 'ft',
... 'deplete mineral', 't', 'resources', 'shaped hole']
>>>
>>> re.findall('|'.join(str_list), sort_str)
['deplete mineral', 'resources', '123', 'in', '123', 'in', '19',
'ft', '24', 'ft', 't', 'shaped hole']
>>>
>>>
>>> desired = ['deplete mineral', 'resources', '123', 'in' , '123',
... 'in' , '19', 'ft', '24', 'ft', 't' , 'shaped hole']
>>> desired == re.findall('|'.join(str_list), sort_str)
True
The regular expression is simple. It's of the form "alt_1|alt_2|alt_3". What that OR-like expression produces is a pattern matcher that scans a string looking for the substrings "alt_1", "alt_2", or "alt_3".
str_list is joined together to form this OR-like expression in this simple fashion:
>>> '|'.join(str_list)
'123|123|19|24|in|in|ft|ft|deplete mineral|t|resources|shaped hole'
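For this input the order of the alternatives isn't important, and the duplicates are harmless. In general the order can matter: at each position the regex engine tries the alternatives left to right and takes the first one that matches, so if one alternative is a prefix of another, the longer one should be listed first. Entries containing regex metacharacters (such as "." or "+") would also need to be passed through re.escape().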
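A small illustration of a case where order does matter. The alternatives "in" and "inch" and the text are made up for this sketch and are not from the question:

```python
import re

text = "12 inch in"

# "in" is tried first at every position, so "inch" can never match.
short_first = re.findall("in|inch", text)

# With the longer alternative first, it matches wherever it applies.
long_first = re.findall("inch|in", text)

print(short_first)  # ['in', 'in']
print(long_first)   # ['inch', 'in']
```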
This string is compiled into a regular expression when it is passed as the first argument to re.findall(), which then finds all matching substrings in sort_str:
>>> re.findall('|'.join(str_list), sort_str)
re.findall() scans sort_str from left to right looking for non-overlapping substrings that match one of the entries in str_list. Each match is appended to the list it returns.
So the substrings matched will be in the same order as the words in sort_str.
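If the list might contain overlapping or special-character entries, the same idea can be wrapped in a small function. This is only a sketch: the name order_by_appearance is hypothetical, and it assumes every entry should be matched literally.

```python
import re

def order_by_appearance(items, text):
    """Return items in the order they occur in text (hypothetical helper)."""
    # Drop duplicates, then list longer alternatives first so a short
    # entry cannot shadow a longer one that starts with it.
    unique = sorted(set(items), key=len, reverse=True)
    # re.escape makes characters such as '.' or '+' match literally.
    pattern = re.compile("|".join(re.escape(item) for item in unique))
    return pattern.findall(text)

sort_str = ("deplete mineral resources , from 123 in x 123 in x "
            "19 ft , on 24 ft t shaped hole")
str_list = ['123', '123', '19', '24', 'in', 'in', 'ft', 'ft',
            'deplete mineral', 't', 'resources', 'shaped hole']

result = order_by_appearance(str_list, sort_str)
print(result)
```

For the data in the question this returns the same list as the one-liner above; the extra steps only matter for less tidy input.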
pandas tag? I did not find anything regarding pandas in your question! Please remove it.