3

I want to write a function that takes 2 inputs: a string and a substring, then the function will remove that part of the substring from the string.

def remove_substring(s, substr):
   """(str, str) -> NoneType
Returns string without string substr

remove_substring_from_string("Im in cs", "cs")
Im in
    """
    other_s = ''
for substr in s:
    if substr in s:
        continue

How do I continue on from here? Assuming my logic is sound.

3
  • Your first task is to find the position of the substring in s. Your for loop iterates over the characters from s. Since s is always a character from s, substr in s is always True. So it's not what you want. What you need to do first is to find the position of substr in s, which you can do by iterating over the possible positions where the substring might be found, take a slice of the string at that position (using the length of the substring to calculate the end of the slice), and check to see whether the slice equals the substring. Commented Oct 10, 2021 at 21:32
  • Sounds great, can you please show a code example Commented Oct 10, 2021 at 21:44
  • 1
    Is the expected output for remove_substring_from_string("aaa", "aa") only an "a" or an empty string? I.e., how are overlapping parts meant to be handled? Commented Oct 10, 2021 at 22:37

4 Answers 4

2

Avoiding the use of Python functions.

Method 1

def remove_substring_from_string(s, substr):
    '''
    find start index in s of substring
    remove it by skipping over it
    '''
    i = 0
    while i < len(s) - len(substr) + 1:
        # Check if substring starts at i
        if s[i:i+len(substr)] == substr:
            break   
        i += 1
    else:
        # break not hit, so substr not found
        return s
    
    # break hit
    return s[:i] + s[i+len(substr):]

Method 2

If the range function can be used, the above can be written more compactly as follows.

def remove_substring_from_string(s, substr):
    '''
    find start index in s of substring
    remove it by skipping over it
    '''
    for i in range(len(s) - len(substr) + 1):
        if s[i:i+len(substr)] == substr:
            break
    else:
        # break not hit, so substr not found
        return s
    
    return s[:i] + s[i+len(substr):]

Test

print(remove_substring_from_string("I have nothing to declare except my genuis", " except my genuis"))
# Output: I have nothing to declare'
Sign up to request clarification or add additional context in comments.

Comments

1

This approach is based on the KMP algorithm:

def KMP(s):
    n = len(s)
    pi = [0 for _ in range(n)]

    for i in range(1, n):
        j = pi[i - 1]
        while j > 0 and s[i] != s[j]:
            j = pi[j - 1]
        
        if s[i] == s[j]:
            j += 1
        
        pi[i] = j
    
    return pi

# Removes all occurences of t in s
def remove_substring_from_string(s, t):
    n = len(s)
    m = len(t)
    
    # Calculate the prefix function using KMP
    pi = KMP(t + '\x00' + s)[m + 1:]
    r = ""

    i = 0
    while i + m - 1 < n: # Before the remaining string is smaller than the substring
        if pi[i + m - 1] == m: # If the substring is here, skip it
            i += m
        else: # Otherwise, add the current character and move to the next
            r += s[i]
            i += 1
    
    # Add the remaining string
    r += s[i:]
    return r

It runs in O(|s| + |t|), but it has a few downsides:

  • The code is long and unintuitive.
  • It requires that there is no null (\x00) in the input strings.
  • Its constant factors are pretty bad for short s and t.
  • It doesn't handle overlapping strings how you might want it: remove_substring_from_string("aaa", "aa") will return "a". The only guarantee made is that t in remove_substring_from_string(s, t) is False for any two strings s and t.

A C++ example and further explanation for the KMP algorithm can be found here. The remove_substring_from_string function then only checks if the entire substring is matched at each position and if so, skips over the substring.

Comments

0

I would do this using re.

import re


def remove_substring(s, substr):
    # type: (str, str) -> str
    return re.subn(substr, '', s)[0]
remove_substring('I am in cs', 'cs')
# 'I am in '

remove_substring('This also removes multiple substr that are found. Even if that substr is repeated like substrsubstrsubstr', 'substr')
# 'This also removes multiple  that are found. Even if that  is repeated like '

Comments

-1
def remove_substring(s, substr):
while s != "":
    if substr in s:
        s = s.replace(substr, "")
    else:
        return s

if s == "":
    return "Empty String"

The idea here is that we replace all occurrences of substr within s, by replacing the first instance of substr and then looping until we are done.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.