0

How do I define a function that takes a string (a sentence) and inserts an extra space after a period when the period is directly followed by a letter?

sent = "This is a test.Start testing!"
def normal(sent):
    list_of_words = sent.split()
    ...

This should print out

"This is a test. Start testing!"

I suppose I should use split() to break the string into a list, but what next?

P.S. The solution has to be as simple as possible.

5 Answers

8

Use re.sub. The regular expression matches a period (\.) followed by a letter ([a-zA-Z]). The replacement string contains a reference to the first group (\1), which is the letter matched by the regular expression.

>>> import re
>>> re.sub(r'\.([a-zA-Z])', r'. \1', 'This is a test.This is a test. 4.5 balloons.')
'This is a test. This is a test. 4.5 balloons.'

Note the choice of [a-zA-Z] for the regular expression. This matches just letters. We do not use \w because it would insert spaces into a decimal number.
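Wrapped into the function the question asks for, the substitution above might look like the sketch below. The name normal comes from the question; the constant name is only my choice for illustration, and [a-zA-Z] covers ASCII letters only, so accented letters would not trigger the extra space.

```python
import re

# Hypothetical constant name; the pattern itself is the one used in this answer.
_PERIOD_THEN_LETTER = re.compile(r'\.([a-zA-Z])')

def normal(sent):
    """Insert a space after each period that is directly followed by an ASCII letter."""
    return _PERIOD_THEN_LETTER.sub(r'. \1', sent)

print(normal("This is a test.Start testing!"))  # This is a test. Start testing!
```

Periods inside decimal numbers and a period at the very end of the string are left alone, because neither is followed by a letter.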

3

One-liner non-regex answer:

def normal(sent):
    return ".".join(" " + s if i > 0 and s[:1].isalpha() else s for i, s in enumerate(sent.split(".")))

Here is a multi-line version using a similar approach. You may find it more readable.

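def normal(sent):
    parts = sent.split(".")
    result = parts[:1]  # the text before the first period is kept as-is
    for item in parts[1:]:
        # item[:1] is "" for an empty piece (e.g. after a trailing period),
        # so this check never raises IndexError
        if item[:1].isalpha():
            item = " " + item
        result.append(item)
    return ".".join(result)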

Using a regex is probably the better way, though.

3 Comments

Very clever. I love that it avoids regular expressions. I don't, however, like its level of complexity. I wish I could give a 1/2 vote.
Yeah, that would probably be much more readable as a traditional for loop. :-) In fact I will just go ahead and add such a version in an edit.
Wow, can't believe you came back to a question 5 days later! Upvote for you.
1

Brute force without any checks:

>>> sent = "This is a test.Start testing!"
>>> k = sent.split('.')
>>> ". ".join(k)
'This is a test. Start testing!'
>>> 

To avoid doubling a space that is already there, strip the leading spaces before joining:

>>> sent = "This is a test. Start testing!"
>>> k = sent.split('.')
>>> l = [x.lstrip(' ') for x in k]
>>> ". ".join(l)
'This is a test. Start testing!'
>>> 

1 Comment

This doesn't take into account the requirement that the period be followed by a letter.
1

Another regex-based solution, which might be a tiny bit faster than Steven's (only one pattern match, and a blacklist instead of a whitelist):

import re
re.sub(r'\.([^\s])', r'. \1', some_string)

1 Comment

Good catch. I didn't need to make two groups, just one. But [^\s] would split on more than letters. Most importantly, decimal numbers would have spaces inserted, so I would stick with [a-zA-Z].
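To make the difference concrete, here is a small comparison of the two patterns on a string that contains a decimal number. The sample text is made up for illustration.

```python
import re

# Made-up sample: a decimal number followed by a sentence with no space after the period.
text = "Pi is about 3.14.Next sentence."

letters_only = re.sub(r'\.([a-zA-Z])', r'. \1', text)
any_non_space = re.sub(r'\.([^\s])', r'. \1', text)

print(letters_only)   # Pi is about 3.14. Next sentence.
print(any_non_space)  # Pi is about 3. 14. Next sentence.
```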
0

Improving pyfunc's answer:

sent="This is a test.Start testing!"

k=sent.split('.')

k='. '.join(k)

k.replace('.  ','. ')

'This is a test. Start testing!'
