Remove all whitespace in a string

Question

I want to eliminate all the whitespace from a string, on both ends, and in between words.

I have this Python code:

def my_handle(self):
    sentence = ' hello  apple  '
    sentence.strip()

But that only eliminates the whitespace on both sides of the string. How do I remove all whitespace?

What should your result look like? hello apple? helloapple? — Mark Byers
– Mark Byers, Commented Nov 25, 2011 at 13:57
@JoachimPileborg, not exactly I think, because it's also about reducing whitespace between the words. — wal-o-mat
– wal-o-mat, Commented Nov 25, 2011 at 13:59
Correct me if wrong, but "whitespace" is not synonymous with "space characters". The current answer marked as correct does not remove all whitespace. But, since it's marked as correct it must have answered the intended question? So we should edit the question to reflect the accepted answer? @Kalanamith Did, or do, you want to remove all whitespace or only spaces? — AnnanFay
– AnnanFay, Commented Dec 6, 2016 at 17:23

James · Accepted Answer · 2024-09-25 02:42:06Z

2412

If you want to remove leading and trailing whitespace, use str.strip():

>>> "  hello  apple  ".strip()
'hello  apple'

If you want to remove all space characters, use str.replace() (NB this only removes the “normal” ASCII space character ' ' U+0020 but not any other whitespace):

>>> "  hello  apple  ".replace(" ", "")
'helloapple'

If you want to remove all whitespace and then leave a single space character between words, use str.split() followed by str.join():

>>> " ".join("  hello  apple  ".split())
'hello apple'

If you want to remove all whitespace, change the leading " " above to "":

>>> "".join("  hello  apple  ".split())
'helloapple'
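To make the difference between the four options concrete, here is a small sketch that runs them on one input. The sample string (with a tab and a newline added) is an assumption chosen for illustration, not the string from the question.

```python
# Sample containing a tab and a newline as well as ordinary spaces.
sample = "  hello \t apple\n "

stripped = sample.strip()              # only the ends are trimmed
no_spaces = sample.replace(" ", "")    # tab and newline survive
collapsed = " ".join(sample.split())   # one space between words
removed = "".join(sample.split())      # every whitespace run removed

print(repr(stripped))    # 'hello \t apple'
print(repr(no_spaces))   # 'hello\tapple\n'
print(repr(collapsed))   # 'hello apple'
print(repr(removed))     # 'helloapple'
```

Only the split/join forms touch the tab and newline in the middle of the string.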

edited Sep 25, 2024 at 2:42

James

1075 bronze badges

answered Nov 25, 2011 at 13:56

Cédric Julien

81.2k16 gold badges131 silver badges134 bronze badges


8 Comments

lsheng Over a year ago

The greatness of this function is that it also removes the '\r\n' from the html file I received from Beautiful Soup.

don Over a year ago

I like "".join(sentence.split()), this removes all whitespace (spaces, tabs, newlines) from anywhere in sentence.

Yannis Dran Over a year ago

Beginner here. Can someone explain to me why print(sentence.join(sentence.split())) results in 'hello hello appleapple'? Just want to understand how the code is processed here.

Cédric Julien Over a year ago

@YannisDran check the str.join() documentation: when you call sentence.join(str_list) you ask Python to join the items from str_list with sentence as the separator.

Cecil Curry Over a year ago

"".join(sentence.split()) is indeed the canonical solution, efficiently removing all whitespace rather than merely spaces. Mark Byers' excellent answer should probably have been accepted in lieu of this less applicable answer.


Randall Cook · Accepted Answer · 2014-01-20 23:45:37Z

452

To remove only spaces use str.replace:

sentence = sentence.replace(' ', '')

To remove all whitespace characters (space, tab, newline, and so on) you can use split then join:

sentence = ''.join(sentence.split())

or a regular expression:

import re
pattern = re.compile(r'\s+')
sentence = re.sub(pattern, '', sentence)

If you want to only remove whitespace from the beginning and end you can use strip:

sentence = sentence.strip()

You can also use lstrip to remove whitespace only from the beginning of the string, and rstrip to remove whitespace from the end of the string.
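A short sketch of the one-sided variants next to the regex form. The sample string is an assumption; the pattern is passed to re.sub directly, without a separate compile step.

```python
import re

sentence = "\t hello  apple \n"

left = sentence.lstrip()    # leading whitespace removed only
right = sentence.rstrip()   # trailing whitespace removed only

# Passing the pattern string straight to re.sub also works.
no_ws = re.sub(r"\s+", "", sentence)

print(repr(left))    # 'hello  apple \n'
print(repr(right))   # '\t hello  apple'
print(repr(no_ws))   # 'helloapple'
```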

edited Jan 20, 2014 at 23:45

Randall Cook

6,8066 gold badges36 silver badges68 bronze badges

answered Nov 25, 2011 at 13:54

Mark Byers

843k202 gold badges1.6k silver badges1.5k bronze badges

2 Comments

Andy Hayden Over a year ago

Note: You don't need the compile step; re.sub (and friends) cache the compiled pattern. See also Emil's answer.

deed02392 Over a year ago

python3: yourstr.translate(str.maketrans('', '', ' \n\t\r'))

President James K. Polk · Accepted Answer · 2024-09-24 13:08:49Z

167

An alternative is to use regular expressions and match these strange white-space characters too. Here are some examples:

Remove ALL whitespace in a string, even between words:

import re
sentence = re.sub(r"\s+", "", sentence, flags=re.UNICODE)

Remove whitespace in the BEGINNING of a string:

import re
sentence = re.sub(r"^\s+", "", sentence, flags=re.UNICODE)

Remove whitespace in the END of a string:

import re
sentence = re.sub(r"\s+$", "", sentence, flags=re.UNICODE)

Remove whitespace both at the BEGINNING and at the END of a string:

import re
sentence = re.sub(r"^\s+|\s+$", "", sentence, flags=re.UNICODE)

Remove ONLY DUPLICATE whitespace:

import re
sentence = " ".join(re.split(r"\s+", sentence, flags=re.UNICODE))

(All examples work in both Python 2 and Python 3)
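As a rough check of which characters count as whitespace here, the sketch below runs the first pattern on a few non-ASCII spaces in Python 3, where Unicode matching is already the default for str patterns. The sample strings are assumptions picked for illustration.

```python
import re

# NO-BREAK SPACE (U+00A0), EM SPACE (U+2003), IDEOGRAPHIC SPACE (U+3000)
text = "\u00a0hello\u2003apple\u3000"
cleaned = re.sub(r"\s+", "", text)
print(cleaned)  # helloapple

# U+202A and U+202C are directional formatting characters, not whitespace,
# so the pattern leaves them alone.
marked = "\u202a1234\u202c"
unchanged = re.sub(r"\s+", "", marked)
print(unchanged == marked)  # True
```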

edited Sep 24, 2024 at 13:08

President James K. Polk

42.2k34 gold badges113 silver badges149 bronze badges

answered Feb 19, 2015 at 13:05

Emil Stenström

14.2k8 gold badges57 silver badges77 bronze badges

3 Comments

Sarang Over a year ago

Did not work for "\u202a1234\u202c". Gives the same output: u'\u202a1234\u202c'

Emil Stenström Over a year ago

@Sarang: Those are not whitespace characters (google them and you'll see) but "General Punctuation". My answer only deals with removing characters classified as whitespace.

CapnShanty Over a year ago

This is the only solution I see here that removes those damn pesky unicode whitespace characters, thanks fam

I.B. · Accepted Answer · 2020-10-01 06:48:23Z

67

"Whitespace" includes space, tabs, and CRLF. So an elegant and one-liner string function we can use is str.translate:

Python 3

' hello  apple '.translate(str.maketrans('', '', ' \n\t\r'))

OR if you want to be thorough:

import string
' hello  apple'.translate(str.maketrans('', '', string.whitespace))

Python 2

' hello  apple'.translate(None, ' \n\t\r')

OR if you want to be thorough:

import string
' hello  apple'.translate(None, string.whitespace)
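If the Python 3 form is used repeatedly, the translation table can be built once and reused. A minimal sketch follows; the names WS_TABLE and drop_ascii_whitespace are made up for this example. It also shows that string.whitespace covers only ASCII whitespace.

```python
import string

# Build the table once; str.maketrans' third argument lists characters to delete.
WS_TABLE = str.maketrans("", "", string.whitespace)

def drop_ascii_whitespace(text: str) -> str:
    """Remove the characters in string.whitespace from text (hypothetical helper)."""
    return text.translate(WS_TABLE)

ascii_result = drop_ascii_whitespace(" hello \t\r\n apple ")
nbsp_result = drop_ascii_whitespace("hello\u00a0apple")

print(ascii_result)        # helloapple
# U+00A0 (no-break space) is not in string.whitespace, so it stays.
print(repr(nbsp_result))   # 'hello\xa0apple'
```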

edited Oct 1, 2020 at 6:48

I.B.

29.2k13 gold badges87 silver badges108 bronze badges

answered Nov 28, 2015 at 3:36

MaK

1,7381 gold badge16 silver badges6 bronze badges

3 Comments

Suzana Over a year ago

This won't help with Unicode whitespace like \xc2\xa0

user405 Over a year ago

ans.translate( None, string.whitespace ) produces only builtins.TypeError: translate() takes exactly one argument (2 given) for me. Docs says that argument is a translate table, see string.maketrans(). But see comment by Amnon Harel, below.

Shogan Aversa-Druesne Over a year ago

' hello apple'.translate(str.maketrans('', '', string.whitespace)) Note: its better to make a variable to store the trans-table if you intend to do this multiple times.

wal-o-mat · Accepted Answer · 2011-11-25 13:56:22Z

19

For removing whitespace from beginning and end, use strip.

>>> "  foo bar   ".strip()
'foo bar'

answered Nov 25, 2011 at 13:56

wal-o-mat

7,3748 gold badges35 silver badges42 bronze badges

2 Comments

Shayan Shafiq Over a year ago

The question specifically asks for removing all of the whitespace and not just at the ends. Please take notice.

Scott Over a year ago

This answer is irrelevant to this question

Asclepius · Accepted Answer · 2019-12-25 01:34:36Z

13

' hello  \n\tapple'.translate({ord(c):None for c in ' \n\t\r'})

MaK already pointed out the "translate" method above. And this variation works with Python 3 (see this Q&A).

edited Dec 25, 2019 at 1:34

Asclepius

64.6k20 gold badges188 silver badges164 bronze badges

answered Sep 26, 2016 at 9:54

Amnon Harel

1711 silver badge6 bronze badges

1 Comment

user405 Over a year ago

Thanks! Or, xxx.translate( { ord(c) :None for c in string.whitespace } ) for thoroughness.

cacti5 · Accepted Answer · 2018-04-06 20:51:47Z

11

In addition, strip has some variations:

Remove spaces in the BEGINNING and END of a string:

sentence = sentence.strip()

Remove spaces in the BEGINNING of a string:

sentence = sentence.lstrip()

Remove spaces in the END of a string:

sentence = sentence.rstrip()

All three string functions, strip, lstrip, and rstrip, can take a parameter naming the characters to strip, with the default being all whitespace. This can be helpful when you are working with something particular. For example, you could remove only spaces but not newlines:

" 1. Step 1\n".strip(" ")

Or you could remove extra commas when reading in a string list:

"1,2,3,".strip(",")

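A quick sketch of what those two calls return, plus one extra case (the "xyhelloyx" string is an assumption added here) showing that the argument is treated as a set of characters rather than as a prefix or suffix.

```python
step = " 1. Step 1\n"

spaces_only = step.strip(" ")   # trailing newline is kept
default = step.strip()          # newline is removed too

print(repr(spaces_only))   # '1. Step 1\n'
print(repr(default))       # '1. Step 1'

commas = "1,2,3,".strip(",")
print(commas)              # 1,2,3

# Every leading/trailing character found in "xy" is removed, in any order.
char_set = "xyhelloyx".strip("xy")
print(char_set)            # hello
```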
answered Apr 6, 2018 at 20:51

cacti5

2,1262 gold badges29 silver badges35 bronze badges

Comments

Peter Mortensen · Accepted Answer · 2018-03-14 10:19:01Z

7

Be careful:

strip does a rstrip and lstrip (removes leading and trailing spaces, tabs, returns and form feeds, but it does not remove them in the middle of the string).

If you only replace spaces and tabs you can end up with hidden CRLFs that appear to match what you are looking for, but are not the same.
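The pitfall can be reproduced with a small sketch. The two strings and the helper name drop_spaces_and_tabs are assumptions for illustration; the second string stands in for a line read from a file with Windows-style line endings.

```python
a = "hello apple"
b = "hello apple\r\n"   # same text, but with a trailing CRLF

def drop_spaces_and_tabs(text):
    """Remove only ' ' and '\\t' (hypothetical helper)."""
    return text.replace(" ", "").replace("\t", "")

# The CRLF survives, so the two results differ even though they print alike.
print(drop_spaces_and_tabs(a) == drop_spaces_and_tabs(b))   # False
print(repr(drop_spaces_and_tabs(b)))                         # 'helloapple\r\n'

# Splitting on all whitespace removes the CRLF as well.
print("".join(a.split()) == "".join(b.split()))              # True
```

Printing with repr() is one way to spot the hidden characters.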

edited Mar 14, 2018 at 10:19

Peter Mortensen

31.4k22 gold badges110 silver badges134 bronze badges

answered Nov 12, 2014 at 19:30

yan bellavance

4,87020 gold badges64 silver badges95 bronze badges

1 Comment

Dpedrinha Over a year ago

Although this is a good point, this isn't really an answer and should be a comment unless you provide a solution. Would you care to provide a solution for this is exactly what I'm looking for? Cheers

handle · Accepted Answer · 2020-03-13 16:02:14Z

6

eliminate all the whitespace from a string, on both ends, and in between words.

>>> import re
>>> re.sub(r"\s+",  # one or more whitespace characters
...        '',      # replace with the empty string (i.e. remove)
...        ''' hello
...    apple
... ''')
'helloapple'

https://en.wikipedia.org/wiki/Whitespace_character

Python docs:

edited Mar 13, 2020 at 16:02

answered Mar 13, 2020 at 15:51

handle

6,5004 gold badges63 silver badges93 bronze badges

1 Comment

handle Over a year ago

I know re has been suggested before, but I found that the actual answer to the question title was a bit hidden amongst all the other options.

naoki fujita · Accepted Answer · 2021-07-29 14:33:40Z

6

I use split() to ignore all whitespaces and use join() to concatenate strings.

sentence = ''.join(' hello  apple  '.split())
print(sentence) #=> 'helloapple'

I prefer this approach because it is only an expression (not a statement).
It is easy to use and it can be used without binding to a variable.

print(''.join(' hello  apple  '.split())) # no need to bind to a variable

answered Jul 29, 2021 at 14:33

naoki fujita

7191 gold badge9 silver badges13 bronze badges

1 Comment

James Over a year ago

I prefer this too. You can use it permanently (as you did assigning it to a variable) or temporarily (as you did with the print statement). It's simple and it removes all whitespace, not just spaces.

PrabhuPrakash · Accepted Answer · 2016-10-24 12:46:29Z

3

import re    
sentence = ' hello  apple'
re.sub(' ', '', sentence)   # 'helloapple' (removes all spaces)
re.sub('  ', ' ', sentence)  # ' hello apple' (replaces double spaces with one)

answered Oct 24, 2016 at 12:46

PrabhuPrakash

2612 silver badges7 bronze badges

1 Comment

Maximilian Peters Over a year ago

The question was to remove all whitespace, which includes tabs and newline characters; this snippet will only remove regular spaces.

Jane Kathambi · Accepted Answer · 2022-02-28 08:04:44Z

3

In the following script we import the regular expression module which we use to substitute one space or more with a single space. This ensures that the inner extra spaces are removed. Then we use strip() function to remove leading and trailing spaces.

# Import regular expression module
import re

# Initialize string
a = "     foo      bar   "

# First replace any number of spaces with a single space
a = re.sub(' +', ' ', a)

# Then strip any leading and trailing spaces.
a = a.strip()

# Show results
print(a)
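Note that the ' +' pattern matches only the space character. The sketch below compares it with r'\s+', which is a swapped-in pattern and not part of the script above, on an input that also contains a tab and a newline (an assumed sample).

```python
import re

a = "   foo \t\n  bar   "

# Collapses runs of ' ' only; the tab and newline stay in place.
spaces_only = re.sub(" +", " ", a).strip()

# Collapses runs of any whitespace into a single space.
any_whitespace = re.sub(r"\s+", " ", a).strip()

print(repr(spaces_only))      # 'foo \t\n bar'
print(repr(any_whitespace))   # 'foo bar'
```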

edited Feb 28, 2022 at 8:04

answered Feb 19, 2022 at 10:59

Jane Kathambi

96510 silver badges10 bronze badges

2 Comments

the Tin Man Over a year ago

It helps more if you supply an explanation why this is the preferred solution and explain how it works. We want to educate, not just provide code.

Jane Kathambi Over a year ago

@theTinMan thanks for the recommendation I just added the explanations.

cottontail · Accepted Answer · 2023-08-31 04:46:45Z

All string characters are unicode literal in Python 3; as a consequence, since str.split() splits on all white space characters, that means it splits on unicode white space characters. So split + join syntax (as in 1, 2, 3) will produce the same output as re.sub with the UNICODE flag (as in 4); in fact, the UNICODE flag is redundant here (as in 2, 5, 6, 7).

import re
import sys

# all unicode characters
sentence = ''.join(map(chr, range(sys.maxunicode+1)))

# remove all white space characters
x = ''.join(sentence.split())
y = re.sub(r"\s+", "", sentence, flags=re.UNICODE)
z = re.sub(r"\s+", "", sentence)

x == y == z      # True

In terms of performance, since Python's string methods are optimized, they are much faster than regex. As the following timeit test shows, when removing all whitespace characters from the string in the OP, the string methods are over 7 times faster than the re option.

import timeit

setup = """
import re
s = ' hello  \t apple  '
"""

t1 = min(timeit.repeat("''.join(s.split())", setup))
t2 = min(timeit.repeat(r"re.sub(r'\s+', '', s, flags=re.UNICODE)", setup))


t2 / t1  # 7.868004799367726

user856387 · Accepted Answer · 2022-07-25 10:49:09Z

1

I found that this works the best for me:

test_string = '  test   a   s   test '
string_list = [s.strip() for s in str(test_string).split()]
final_string = ' '.join(string_list)
# final_string: 'test a s test'

It removes any whitespaces, tabs, etc.

edited Jul 25, 2022 at 10:49

answered Jul 25, 2022 at 10:08

user856387

913 bronze badges

Comments

James Bond · Accepted Answer · 2024-08-07 12:16:52Z

0

Just an addition to Emil Stenström's answer.

This code removes all white spaces and you could also remove your own extra utf-8 characters.

import re

def utf8trim(s: str) -> str:
    # Join without "|": inside a character class "|" is a literal and would be stripped too.
    spaces = "".join([r"\s", "\u2800", "\u3164", "\u1160", "\uFFA0", "\u202c"])
    return re.sub(f"^[{spaces}]+|[{spaces}]+$", "", s, flags=re.UNICODE)

answered Aug 7, 2024 at 12:16

James Bond

3,0651 gold badge27 silver badges36 bronze badges

Comments

Assad Ali · Accepted Answer · 2020-10-10 19:36:56Z

-2

Try this. Instead of using re, I think using split with strip is much better.

def my_handle(self):
    sentence = ' hello  apple  '
    ' '.join(x.strip() for x in sentence.split())  # 'hello apple'
    ''.join(x.strip() for x in sentence.split())   # 'helloapple'

answered Oct 10, 2020 at 19:36

Assad Ali

2881 silver badge12 bronze badges
