I have a list of strings and I want to remove specific characters from each string in it. Here is what I have so far:

s = [ "Four score and seven years ago, our fathers brought forth on",
      "this continent a new nation, conceived in liberty and dedicated"]

result = []
for item in s:
    words = item.split()
    for item in words:
        result.append(item)

print(result,'\n')

for item in result:
    g = item.find(',.:;')
    item.replace(item[g],'')
print(result)

The output is:

['Four', 'score', 'and', 'seven', 'years', 'ago,', 'our', 'fathers', 'brought', 'forth', 'on', 'this', 'continent', 'a', 'new', 'nation,', 'conceived', 'in', 'liberty', 'and', 'dedicated']

In this case I wanted the new list to contain all the words, but it should not include any punctuation marks except for quotes and apostrophes.

 ['Four', 'score', 'and', 'seven', 'years', 'ago', 'our', 'fathers', 'brought', 'forth', 'on', 'this', 'continent', 'a', 'new', 'nation', 'conceived', 'in', 'liberty', 'and', 'dedicated']

Even though I am using the find function, the result seems to be the same. How can I correct it so that it prints without the punctuation marks? How can I improve the code?

2
  • What is the exact output that you're expecting for the above list? Commented Jul 24, 2015 at 19:52
  • 1
    @CristiFati I believe the second output he presents is his desired output. I've tested his code, and both of his print calls produce the first output. Commented Jul 24, 2015 at 20:00

4 Answers

2

You could strip all the characters that you want to get rid of after you split the string:

for item in s:
    words = item.split()
    for item in words:
        result.append(item.strip(",."))  # note the addition of .strip(...)

You can add whatever characters you want to get rid of to the string argument of .strip(), all in one string. The example above strips leading and trailing commas and periods from each word.
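As a small illustration of the note above, here is a sketch of the same idea wrapped in a function. The name strip_words and its chars parameter are made up for this example. It also shows that .strip() treats its argument as a set of characters and only trims the ends of a word.

```python
def strip_words(lines, chars=",.:;"):
    """Split each line on whitespace and trim `chars` from both ends of every word."""
    result = []
    for line in lines:
        for word in line.split():
            result.append(word.strip(chars))
    return result


s = ["Four score and seven years ago, our fathers brought forth on",
     "this continent a new nation, conceived in liberty and dedicated"]

words = strip_words(s)
print(words[5], words[15])  # ago nation

# strip() only touches the ends of the string, so inner punctuation survives
print("nation,conceived".strip(",."))  # nation,conceived
```

Apostrophes and quotes are left alone simply because they are not in the chars string.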


2 Comments

But beware of cases like "new nation,conceived", i.e. punctuation without whitespace. May or may not be a problem.
Excellent point. This was my Occam's razor approach. Regex would probably give you the most robust solution.
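For the "nation,conceived" case mentioned in these comments, one possible regex sketch is to match the words themselves instead of splitting on whitespace. This assumes a word is a run of ASCII letters, digits, apostrophes, or double quotes. The names WORD_RE and extract_words are hypothetical.

```python
import re

# Assumption: a "word" is a run of ASCII letters, digits, apostrophes or double quotes.
WORD_RE = re.compile("[A-Za-z0-9'\"]+")


def extract_words(lines):
    """Collect every match of WORD_RE from each line into one flat list."""
    result = []
    for line in lines:
        result.extend(WORD_RE.findall(line))
    return result


sample = ["a new nation,conceived in liberty", "our fathers' \"score\""]
print(extract_words(sample))
```

Because the pattern lists what to keep, any other character (including a comma with no space after it) acts as a separator.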
2

You can do this by using re.split to specify a regular expression to split on, in this case everything that is not a letter or digit.

import re
result = []
for item in s:
    words = re.split("[^A-Za-z0-9]", item)  # split the current string, not the whole list s
    result.extend(x for x in words if x) # Include nonempty elements

2 Comments

Testing this out, I get a bunch of repeats of my words if I use the for loop, but it behaves exactly the way we want if we omit the for loop entirely.
Also, you can add a "+" to the end of the regex (to look for one or more matching characters per group) to get rid of inter-word spaces in your matched list. I still get a blank string at the end of my matched word list when I have a period at the end of my sentence, though. Not sure why.
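On the trailing blank string: re.split returns the text on both sides of every delimiter, so when the string ends with a delimiter, the piece after it is an empty string. A short sketch, using a made-up sentence:

```python
import re

sentence = "conceived in liberty, and dedicated."

# "+" makes a run such as ", " count as a single delimiter
parts = re.split(r"[^A-Za-z0-9]+", sentence)
print(parts)  # ['conceived', 'in', 'liberty', 'and', 'dedicated', '']

# The final "." is a delimiter with nothing after it, hence the trailing ''.
# Filtering out empty strings removes it.
words = [p for p in parts if p]
print(words)  # ['conceived', 'in', 'liberty', 'and', 'dedicated']
```

The same thing happens at the start of the list if the string begins with punctuation.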
1
s = [ "Four score and seven years ago, our fathers brought forth on", "this continent a new nation, conceived in liberty and dedicated"]

# Replace characters and split into words
# (Python 3 form; in Python 2 this was x.translate(None, ',.:;'))
result = [x.translate(str.maketrans('', '', ',.:;')).split() for x in s]

# Make a list of words instead of a list of lists of words (see http://stackoverflow.com/a/716761/1477364)
result = [inner for outer in result for inner in outer]

print(result)

Output:

['Four', 'score', 'and', 'seven', 'years', 'ago', 'our', 'fathers', 'brought', 'forth', 'on', 'this', 'continent', 'a', 'new', 'nation', 'conceived', 'in', 'liberty', 'and', 'dedicated']
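Since the question asks to keep quotes and apostrophes, a possible variation is to build the set of characters to delete from string.punctuation. This is only a sketch for Python 3; the names to_delete and table are made up here.

```python
import string

# Every ASCII punctuation mark except the apostrophe and the double quote
to_delete = "".join(c for c in string.punctuation if c not in "'\"")

# Translation table that maps each of those characters to "delete"
table = str.maketrans("", "", to_delete)

s = ["Four score and seven years ago, our fathers brought forth on",
     "this continent a new nation, conceived in liberty and dedicated"]

# Delete the punctuation, split on whitespace, and flatten in one pass
result = [word for line in s for word in line.translate(table).split()]
print(result)
```

Note that this deletes characters without inserting a space, so "nation,conceived" becomes "nationconceived" and a hyphenated word loses its hyphen.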

1

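Alternatively, you could keep your second loop

for item in result:
    g = item.find(',.:;')
    item.replace(item[g],'')

but it needs two fixes. find(',.:;') searches for that whole four-character substring, so split up ,.:; into a list of punctuation marks like

punc = [',','.',':',';']

Also, replace returns a new string instead of modifying item, so the cleaned word has to be stored back into the list. Iterate through punc inside for item in result: like

for p in punc:
    item = item.replace(p, '')

so the full loop is

punc = [',','.',':',';']
for i, item in enumerate(result):
    for p in punc:
        item = item.replace(p, '')  # replace returns a new string
    result[i] = item  # write the cleaned word back into the list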

I've tested this, it works.
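To see why the loop from the question leaves the list unchanged, here is a small sketch using one word from the example. The variable damaged is just for illustration.

```python
item = "score"

g = item.find(",.:;")                # -1: the four-character substring is absent
damaged = item.replace(item[g], "")  # item[-1] is "e", so every "e" would be removed

print(g, damaged, item)  # -1 scor score
```

The original item is untouched because strings are immutable, and the value returned by replace was never stored anywhere.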
