I have a string

deplete mineral resources , from 123 in x 123 in x 19 ft , on 24 ft t shaped hole

and a list of strings

['123', '123', '19', '24', 'in', 'in', 'ft', 'ft', 'deplete mineral', 't', 'resources', 'shaped hole']

I want to sort this list based on the given string. When I use sorted(l, key=s.index), where s is the string and l is the list, I get:

['deplete mineral', 't', 'in', 'in', 'resources', '123', '123', '19', 'ft', 'ft', '24', 'shaped hole']

But my desired output is:

['deplete mineral', 'resources', '123', 'in', '123', 'in', '19', 'ft', '24', 'ft', 't', 'shaped hole']

The list items should appear in exactly the order in which they occur in the given string. Is there an efficient way to achieve this?
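For reference, a small sketch of why the sorted(l, key=s.index) attempt gives the output shown above: str.index returns only the position of the first occurrence, so duplicates get the same key, and it also finds items inside longer words ('t' inside 'deplete', 'in' inside 'mineral'). The names s and l are the ones from the question; first_positions and attempt are illustrative names only.

```python
s = "deplete mineral resources , from 123 in x 123 in x 19 ft , on 24 ft t shaped hole"
l = ['123', '123', '19', '24', 'in', 'in', 'ft', 'ft',
     'deplete mineral', 't', 'resources', 'shaped hole']

# str.index reports the FIRST position at which each item occurs in s.
first_positions = {item: s.index(item) for item in l}

print(first_positions['t'])    # 5 -> the 't' inside "deplete"
print(first_positions['in'])   # 9 -> the 'in' inside "mineral"

# Both '123' entries (and both 'in', both 'ft') share one key, so the
# stable sort simply keeps each pair side by side.
attempt = sorted(l, key=s.index)
print(attempt)
```

So the key function never sees the second '123', 'in' or 'ft' in the string, which is why the pairs end up adjacent instead of interleaved.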

  • Why are you using the pandas tag? I couldn't find anything about pandas in your question. Please remove it. Commented Jul 19, 2021 at 6:13
  • Why do you want to do this? Is it a single operation or something that should happen repeatedly? Are you always comparing to the same string? Commented Jul 19, 2021 at 6:17
  • @Kshitiz Sorry, I will change it asap Commented Jul 19, 2021 at 6:29
  • @JohanL Yes, it has to be done repeatedly, and each time the string and the list of values change Commented Jul 19, 2021 at 6:29

1 Answer

This produces the desired order. It's not technically a sort though - just a regular expression search over the sort string.

>>> import re
>>>
>>> sort_str = "deplete mineral resources , from 123 in x 123 in x " \
...            "19 ft , on 24 ft t shaped hole"
>>> 
>>> str_list = ['123', '123', '19', '24', 'in', 'in', 'ft', 'ft', 
...             'deplete mineral', 't', 'resources', 'shaped hole']
>>> 
>>> re.findall('|'.join(str_list), sort_str)
['deplete mineral', 'resources', '123', 'in', '123', 'in', '19', 
 'ft', '24', 'ft', 't', 'shaped hole']
>>>
>>>
>>> desired = ['deplete mineral', 'resources', '123', 'in' , '123', 
...            'in' , '19', 'ft', '24', 'ft', 't' , 'shaped hole']
>>> desired == re.findall('|'.join(str_list), sort_str)
True

The regular expression is simple. It's of the form "alt_1|alt_2|alt_3". That OR-like expression (an alternation) builds a pattern that scans a string looking for the substrings "alt_1", "alt_2", or "alt_3".

str_list is joined together to form this OR-like expression in this simple fashion:

>>> '|'.join(str_list)
'123|123|19|24|in|in|ft|ft|deplete mineral|t|resources|shaped hole'

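For this input the ordering of the alternatives isn't important - they could be in any order. In general, though, alternatives are tried from left to right and the first one that matches wins, so if one item is a prefix of another, the longer item has to come first.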
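A minimal sketch of a case where the order of the alternatives does change the result. The text and the items 'in' and 'inch' are made up for illustration and are not from the question.

```python
import re

text = "12 inch x 3 in"   # hypothetical input

# 'in' is listed first, so it wins at the start of "inch" and the
# remaining "ch" matches nothing.
short_first = re.findall('in|inch', text)
print(short_first)   # ['in', 'in']

# With the longer alternative first, "inch" is matched as a whole.
long_first = re.findall('inch|in', text)
print(long_first)    # ['inch', 'in']
```

In the str_list used here no item is a prefix of another item, so the joined expression above works as it is.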

The joined string is compiled into a regular expression internally when it is passed as the first argument to re.findall(), which then uses it to find all matching substrings in sort_str:

>>> re.findall('|'.join(str_list), sort_str)

re.findall() scans sort_str from beginning to end looking for non-overlapping substrings that match one of the items in str_list. Each match is appended to the list it returns.

So the substrings matched will be in the same order as the words in sort_str.
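Since the string and the list change on every call, a slightly more defensive variant may be worth having. The following is only a sketch: arrange_by_occurrence is a hypothetical helper name, and the whole-token rule (an item must not touch a word character on either side) is an assumption about what is wanted, not something stated in the question.

```python
import re

def arrange_by_occurrence(items, text):
    """Return items in the order they occur in text (hypothetical helper)."""
    # Deduplicate, then put longer items first so a short item never
    # shadows a longer item that starts with it.
    unique = sorted(set(items), key=len, reverse=True)
    if not unique:
        return []
    # re.escape makes characters such as '.', '+' or '(' match literally.
    alternation = '|'.join(re.escape(item) for item in unique)
    # Assumption: items only count as whole tokens, so reject matches
    # that have a word character directly before or after them.
    pattern = re.compile(rf'(?<!\w)(?:{alternation})(?!\w)')
    return pattern.findall(text)

sort_str = ("deplete mineral resources , from 123 in x 123 in x "
            "19 ft , on 24 ft t shaped hole")
str_list = ['123', '123', '19', '24', 'in', 'in', 'ft', 'ft',
            'deplete mineral', 't', 'resources', 'shaped hole']

print(arrange_by_occurrence(str_list, sort_str))
```

Note that this returns every occurrence found in the string, just like the plain re.findall() call, so an item that appears more often in the string than in the list would also appear more often in the result.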

5 Comments

Thank you for the response! Can you explain what you have done here? What is the purpose of joining the strings with '|'?
Sure, I'll add some explanation.
Thank you so much for the explanation! It's more clear now!
You're welcome =) I hope this satisfies the requirements for the type of sorting operation you need. As I said, it's not technically a sorting operation, but it will match words in the order scanned.
Yes, maybe arrange is a better word than sort in this scenario. Once again, thank you for the explanation, it satisfies the other cases as well :)
