Removing substring from string in Python

Question

I want to write a function that takes 2 inputs: a string and a substring, then the function will remove that part of the substring from the string.

def remove_substring(s, substr):
   """(str, str) -> NoneType
Returns string without string substr

remove_substring_from_string("Im in cs", "cs")
Im in
    """
    other_s = ''
for substr in s:
    if substr in s:
        continue

How do I continue on from here? Assuming my logic is sound.

Your first task is to find the position of the substring in s. Your for loop iterates over the characters from s. Since s is always a character from s, substr in s is always True. So it's not what you want. What you need to do first is to find the position of substr in s, which you can do by iterating over the possible positions where the substring might be found, take a slice of the string at that position (using the length of the substring to calculate the end of the slice), and check to see whether the slice equals the substring. — kindall
– kindall, Commented Oct 10, 2021 at 21:32
Is the expected output for remove_substring_from_string("aaa", "aa") only an "a" or an empty string? I.e., how are overlapping parts meant to be handled? — dranjohn
– dranjohn, Commented Oct 10, 2021 at 22:37

DarrylG · Accepted Answer · 2021-10-10 22:49:47Z

Avoiding the use of Python functions.

Method 1

def remove_substring_from_string(s, substr):
    '''
    find start index in s of substring
    remove it by skipping over it
    '''
    i = 0
    while i < len(s) - len(substr) + 1:
        # Check if substring starts at i
        if s[i:i+len(substr)] == substr:
            break   
        i += 1
    else:
        # break not hit, so substr not found
        return s
    
    # break hit
    return s[:i] + s[i+len(substr):]

Method 2

If the range function can be used, the above can be written more compactly as follows.

def remove_substring_from_string(s, substr):
    '''
    find start index in s of substring
    remove it by skipping over it
    '''
    for i in range(len(s) - len(substr) + 1):
        if s[i:i+len(substr)] == substr:
            break
    else:
        # break not hit, so substr not found
        return s
    
    return s[:i] + s[i+len(substr):]

Test

print(remove_substring_from_string("I have nothing to declare except my genuis", " except my genuis"))
# Output: I have nothing to declare'

dranjohn · Accepted Answer · 2021-10-10 23:07:46Z

This approach is based on the KMP algorithm:

def KMP(s):
    n = len(s)
    pi = [0 for _ in range(n)]

    for i in range(1, n):
        j = pi[i - 1]
        while j > 0 and s[i] != s[j]:
            j = pi[j - 1]
        
        if s[i] == s[j]:
            j += 1
        
        pi[i] = j
    
    return pi

# Removes all occurences of t in s
def remove_substring_from_string(s, t):
    n = len(s)
    m = len(t)
    
    # Calculate the prefix function using KMP
    pi = KMP(t + '\x00' + s)[m + 1:]
    r = ""

    i = 0
    while i + m - 1 < n: # Before the remaining string is smaller than the substring
        if pi[i + m - 1] == m: # If the substring is here, skip it
            i += m
        else: # Otherwise, add the current character and move to the next
            r += s[i]
            i += 1
    
    # Add the remaining string
    r += s[i:]
    return r

It runs in O(|s| + |t|), but it has a few downsides:

The code is long and unintuitive.
It requires that there is no null (\x00) in the input strings.
Its constant factors are pretty bad for short s and t.
It doesn't handle overlapping strings how you might want it: remove_substring_from_string("aaa", "aa") will return "a". The only guarantee made is that t in remove_substring_from_string(s, t) is False for any two strings s and t.

A C++ example and further explanation for the KMP algorithm can be found here. The remove_substring_from_string function then only checks if the entire substring is matched at each position and if so, skips over the substring.

Sam Bourne · Accepted Answer · 2021-10-15 00:27:57Z

0

I would do this using re.

import re


def remove_substring(s, substr):
    # type: (str, str) -> str
    return re.subn(substr, '', s)[0]

remove_substring('I am in cs', 'cs')
# 'I am in '

remove_substring('This also removes multiple substr that are found. Even if that substr is repeated like substrsubstrsubstr', 'substr')
# 'This also removes multiple  that are found. Even if that  is repeated like '

answered Oct 15, 2021 at 0:27

Sam Bourne

6506 silver badges8 bronze badges

Comments

kutschkem · Accepted Answer · 2022-10-31 11:41:31Z

-1

def remove_substring(s, substr):
while s != "":
    if substr in s:
        s = s.replace(substr, "")
    else:
        return s

if s == "":
    return "Empty String"

The idea here is that we replace all occurrences of substr within s, by replacing the first instance of substr and then looping until we are done.

edited Oct 31, 2022 at 11:41

kutschkem

8,2683 gold badges26 silver badges66 bronze badges

answered Oct 31, 2022 at 8:08

Farooq

488 bronze badges

Collectives™ on Stack Overflow

Removing substring from string in Python

4 Answers 4

Comments

Comments

Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

Comments

Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related