
I have the following problem. I have a file of text lines that all contain commas. For every line in the file, I want to keep the text to the left of the first comma and delete everything after it.

Here is a sample line from the file:

1780375 "004956 , down , 943794 , 22634 , ET , 2115 ,

I'd like to delete the characters after the first comma.

I tried to write the program but am having some trouble. Here is what I have so far:

datafile = open('C:\\middlelist3.txt', 'r')

smallerdataset = open('C:\\nocommas.txt', 'w')

counter = 1

for line in datafile:
    print counter
    counter +=1
    datafile.rstrip(s[,])
    smallerdataset.write(line)

1 Answer

You can use split for this. It splits the string on a given separator. As you only need the part before the first comma, I pass 1 as the second parameter (maxsplit) so that it splits only at the first comma; element [0] of the result is then the text you want to keep.
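To make the effect of the second parameter concrete, here is a minimal sketch that applies split to the sample line from the question. The trailing newline is an assumption, added to mimic a line read from a file; the variable names are only illustrative.

```python
# Sample line from the question, with a trailing newline as it would have when read from a file.
line = '1780375 "004956 , down , 943794 , 22634 , ET , 2115 ,\n'

# With maxsplit=1 the string is cut only at the first comma,
# so the result has at most two pieces.
parts = line.split(',', 1)
first = parts[0]

print(repr(first))   # '1780375 "004956 '
print(len(parts))    # 2
```

Note that the kept part still has the space that preceded the comma, and that the newline ends up in the discarded second piece.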

Instead of using a counter, you could use enumerate, like this:

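datafile = open('C:\\middlelist3.txt', 'r')

smallerdataset = open('C:\\nocommas.txt', 'w')

# enumerate supplies the counter; the second argument starts it at 1, like the original script
for counter, line in enumerate(datafile, 1):
    print(counter)
    # split discards the newline along with the rest of the line, so strip it first and add it back
    smallerdataset.write(line.rstrip('\n').split(',', 1)[0] + '\n')

datafile.close()
smallerdataset.close()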

This is how you could improve your script using the with statement, which closes both files automatically, and a generator expression:

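with open('C:\\middlelist3.txt') as datafile:
    # Avoid naming the generator "list", which would shadow the built-in type
    first_fields = (line.rstrip('\n').split(',', 1)[0] + '\n' for line in datafile)
    with open('C:\\nocommas.txt', 'w') as smallfile:
        # writelines does not add line endings, hence the '\n' appended above
        smallfile.writelines(first_fields)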
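One way to check this approach without touching files on disk is to run it over in-memory files. The sketch below is only an illustration: io.StringIO stands in for the two files, the helper name keep_before_first_comma and the sample data are hypothetical, and the newline is stripped and re-added explicitly because split would otherwise discard it together with the rest of the line.

```python
import io


def keep_before_first_comma(lines):
    """Yield the text before the first comma of each line, newline-terminated."""
    for line in lines:
        # Strip the line ending first so lines without a comma are handled the same way.
        yield line.rstrip('\n').split(',', 1)[0] + '\n'


# In-memory stand-ins for the input and output files (hypothetical data).
source = io.StringIO('a , b , c\nno comma here\nx,y\n')
target = io.StringIO()
target.writelines(keep_before_first_comma(source))

print(repr(target.getvalue()))  # 'a \nno comma here\nx\n'
```

The same generator can be passed to writelines on a real file object opened with open(..., 'w').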