0

Given the following list of sub-strings:

sub = ['ABC', 'VC', 'KI']

Is there a way to get the indices of these sub-strings in the following string, if they exist?

s = 'ABDDDABCTYYYYVCIIII'


So far I have tried:

import re

for i in re.finditer('VC', s):
  print(i.start(), i.end())
  

However, re.finditer does not accept multiple patterns as separate arguments.

thanks

  • Python strings have a built-in find() method. Use that. Commented Jan 17, 2023 at 10:18
  • @Pingu find() returns only the index of the first occurrence; they seem to need all of the matches. Commented Jan 17, 2023 at 10:37
  • @bereal So? If find returns >= 0, just try again at an appropriate offset. Commented Jan 17, 2023 at 10:44

6 Answers

2

You can join those patterns together using |:

import re
sub = ['ABC', 'VC', 'KI']
s = 'ABDDDABCTYYYYVCIIII'

r = '|'.join(re.escape(p) for p in sub)
for i in re.finditer(r, s):
    print(i.start(), i.end())
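
One caveat: finditer reports non-overlapping matches only, so an occurrence that begins inside a previous match is skipped. A minimal sketch of one way around this, assuming overlapping hits are wanted, wraps the alternation in a lookahead. The helper name find_overlapping is made up for illustration:

```python
import re

def find_overlapping(s, subs):
    """Yield (substring, start) for every occurrence, including overlapping ones."""
    # Hypothetical helper, not part of the answer above.
    # The lookahead (?=...) consumes no characters, so the scan advances one
    # position at a time and the capture group holds the substring found there.
    alternation = '|'.join(re.escape(p) for p in subs)
    pattern = re.compile(f'(?=({alternation}))')
    for m in pattern.finditer(s):
        yield m.group(1), m.start()

print(list(find_overlapping('ABDDDABCTYYYYVCIIII', ['ABC', 'VC', 'KI'])))
print(list(find_overlapping('AAAA', ['AA'])))
```

If two substrings begin at the same position, only the first alternative that matches there is reported.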

1

A substring may occur more than once in the main string (although it doesn't in the sample data). One could use a generator based around a string's built-in find() function like this:

Note that the source string has been modified to demonstrate repetition:

sub = ['ABC', 'VC', 'KI']
s = 'ABCDDABCTYYYYVCIIII'

def find(s, sub):
    for _sub in sub:
        offset = 0
        while (idx := s[offset:].find(_sub)) >= 0:
            yield _sub, idx + offset
            offset += idx + 1

for ss, start in find(s, sub):
    print(ss, start)

Output:

ABC 0
ABC 5
VC 13
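
The same idea can be written without slicing, since str.find accepts an optional start position. A sketch, assuming the same one-character step so that overlapping occurrences are still reported; find_all is an illustrative name:

```python
def find_all(s, subs):
    """Yield (substring, start) for every occurrence of each substring in s."""
    for sub in subs:
        # str.find(sub, start) searches from index `start` and returns an
        # index relative to the whole string, so no slice copy is needed.
        idx = s.find(sub)
        while idx != -1:
            yield sub, idx
            idx = s.find(sub, idx + 1)

for sub, start in find_all('ABCDDABCTYYYYVCIIII', ['ABC', 'VC', 'KI']):
    print(sub, start)
```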


0

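You could map the string's find method over the substrings. Note that this gives only the first occurrence of each one, and -1 for any that are absent.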

s = 'ABDDDABCTYYYYVCIIII'
sub = ['ABC', 'VC', 'KI']

print(*map(s.find, sub))
# Output: 5 13 -1

1 Comment

Using dict(zip(sub, map(s.find, sub))) creates a dict with the substrings as keys and the indices as values.
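
A small sketch of that comment, assuming substrings that are absent should be left out rather than stored as -1. The names first_index and found are illustrative:

```python
s = 'ABDDDABCTYYYYVCIIII'
sub = ['ABC', 'VC', 'KI']

# First-occurrence index for every substring; -1 marks "not found".
first_index = dict(zip(sub, map(s.find, sub)))
print(first_index)

# Keep only the substrings that actually occur in s.
found = {p: i for p, i in first_index.items() if i != -1}
print(found)
```
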
0

How about using a list comprehension with str.find?

s = 'ABDDDABCTYYYYVCIIII'
sub = ['ABC', 'VC', 'KI']
results = [s.find(pattern) for pattern in sub]

print(*results) # 5 13 -1


0

Another approach with re. If a substring can occur at multiple indices, this might be better, as a list of indices is saved for each key. When a substring is not found, it won't be in the dict.

import re
s = 'ABDDDABCTYYYYVCIIII'
sub = ['ABC', 'VC', 'KI']

# precompile regex pattern, escaping each substring so it is matched literally
subpat = '|'.join(map(re.escape, sub))
pat = re.compile(rf'({subpat})')

matches = dict()
for m in pat.finditer(s):
    # append starting index of found substring to value of matched substring
    matches.setdefault(m.group(0),[]).append(m.start()) 

print(f"{matches=}")
print(f"{'KI' in matches=}")
print(f"{matches['ABC']=}")

Outputs:

matches={'ABC': [5], 'VC': [13]}
'KI' in matches=False
matches['ABC']=[5]
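
Escaping matters once the substrings contain characters that are special in regular expressions. A short sketch under that assumption; the input strings and the name index_map are made up for illustration:

```python
import re

def index_map(s, subs):
    """Map each substring that occurs in s to the list of its start indices."""
    # re.escape makes '+', '.', etc. match literally instead of acting as
    # regex operators.
    pat = re.compile('|'.join(re.escape(p) for p in subs))
    matches = {}
    for m in pat.finditer(s):
        matches.setdefault(m.group(0), []).append(m.start())
    return matches

print(index_map('C++ and a.b, then C++ again', ['C++', 'a.b', 'KI']))
```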


0

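Just use the string index method. It returns the index of the first occurrence and raises ValueError if the substring is absent, hence the membership check: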

list_ = ['ABC', 'VC', 'KI']

s = 'ABDDDABCTYYYYVCIIII'


for i in list_:
    if i in s:
        print(s.index(i))
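
The membership test followed by index scans the string twice for each substring. A sketch of a single-scan variant, assuming only the first occurrence is needed, catches the ValueError that str.index raises instead; first_indices is an illustrative name:

```python
def first_indices(s, subs):
    """Return {substring: first index} for the substrings present in s."""
    result = {}
    for sub in subs:
        try:
            # str.index raises ValueError when sub is not in s.
            result[sub] = s.index(sub)
        except ValueError:
            continue
    return result

print(first_indices('ABDDDABCTYYYYVCIIII', ['ABC', 'VC', 'KI']))
```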
