
I have the following code, which works well, but I am not able to trim the data and store it in a data file:

import nltk

tweets = [
    (['love', 'this', 'car']),
    (['this', 'view', 'amazing']),
    (['not', 'looking', 'forward', 'the', 'concert'])
    ]

def get_words_in_tweets(tweets):
    all_words = []
    for (words) in tweets:
      all_words.extend(words)
    return all_words

def get_word_features(wordlist):
    wordlist = nltk.FreqDist(wordlist)
    word_features = wordlist.keys()
    return word_features

output = open('wordFeatures.csv','w')

word_features = get_word_features(get_words_in_tweets(tweets))

print (word_features)
output.write(word_features)
#print (wordlist)
output.close()

It checks whether a word occurs two, three, or more times and adds it to the list only once. The output looks like this:

['this', 'amazing', 'car', 'concert', 'forward', 'looking', 'love', 'not', 'the', 'view']

As you can see, I tried to save this data in a text file, but I get a

TypeError: expected a character buffer object

I want the data from the list written to a text file in the following format:

1:this
2:amazing
3:car 
4:concert
5:forward
...

That is, one row for every word, prefixed with an increasing integer.

Does anyone have an idea how to save my data this way?

  • Why are car and concert on the same line? Commented Sep 15, 2013 at 14:32
  • So 'car' and 'concert' will be on the same line? Commented Sep 15, 2013 at 14:33
  • They are on the same line because it is one vector that contains all the feature words. It does not matter which tweet they come from, because I want them printed as a list in the format I indicated. Commented Sep 15, 2013 at 15:47
  • I just saw what you meant, and you were right: it was a mistake and they should not be on the same line! I fixed it. Commented Sep 15, 2013 at 17:10

3 Answers


The reason for the error is that output.write accepts a string, not a list. word_features is a list.

To write a list to a file, you will need to iterate over it:

for feature in word_features: 
    output.write("{0}\n".format(feature))

I don't understand the format you need, since car and concert appear together on the same line. I am assuming that is a typo and that you actually need them on separate lines. In that case, you can do this to obtain that output:

for n, feature in enumerate(word_features, 1):
    output.write("{0}:{1}\n".format(n, feature))
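
For reference, the numbered loop can be wrapped into a small self-contained sketch. The helper name write_numbered is hypothetical, and collections.Counter is used as a stand-in for nltk.FreqDist so the example runs without NLTK; the word list is placeholder data, and the sort is only there to make the order predictable.

```python
import os
import tempfile
from collections import Counter


def write_numbered(features, path):
    """Write one 'index:word' line per feature, numbering from 1."""
    with open(path, "w") as output:
        for index, feature in enumerate(features, 1):
            output.write("{0}:{1}\n".format(index, feature))


# Assumption: Counter stands in for nltk.FreqDist; both count word occurrences.
words = ["love", "this", "car", "this", "view", "amazing"]
features = sorted(Counter(words))  # unique words, sorted for a stable order

# Write to a temporary directory so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "wordFeatures.csv")
write_numbered(features, path)

# Read the file back to inspect what was written.
with open(path) as check:
    lines = check.read().splitlines()
```

Using a with block closes the file even if a write fails, so the explicit output.close() from the question is not needed.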



You're trying to write a list object to a file, but write expects a string. You can use enumerate here:

word_features = get_word_features(get_words_in_tweets(tweets))
with open('wordFeatures.csv', 'w') as output:
    for ind, item in enumerate(word_features, 1):
        output.write("{}:{}\n".format(ind, item))

Or, using the csv module:

import csv
word_features = get_word_features(get_words_in_tweets(tweets))
with open('wordFeatures.csv', 'w') as output:
    writer = csv.writer(output, delimiter=':')
    writer.writerows(enumerate(word_features, 1))

Output:

1:this
2:amazing
3:car
4:concert
5:forward
6:looking
7:love
8:not
9:the
10:view
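
If you are on Python 3, the csv documentation recommends opening the file with newline='' so that the csv module controls the line endings itself. A minimal sketch of that variant follows, assuming Python 3; the features list and the temporary path are placeholders, not part of the original code.

```python
import csv
import os
import tempfile

features = ["this", "amazing", "car"]  # placeholder data for illustration
path = os.path.join(tempfile.mkdtemp(), "wordFeatures.csv")

# newline="" hands line-ending handling over to the csv module (Python 3).
with open(path, "w", newline="") as output:
    writer = csv.writer(output, delimiter=":")
    writer.writerows(enumerate(features, 1))

# Read the rows back with the same delimiter to check the round trip.
with open(path, newline="") as source:
    rows = [(int(index), word)
            for index, word in csv.reader(source, delimiter=":")]
```

Reading the file back with the same delimiter is a quick way to confirm that each row holds the index and the word.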



In Python, I save data into a csv file, but in a rather hacky way:

First I save my data into a text file. In each row, I separate each "column element" with a comma.

Then, when I'm done with that row [currently just a line in a text file], I write a newline and begin writing the next row of data. Repeat as desired.

Then, when I'm all done, I rename the text file to a csv file.

For you, to add the increasing integer, you could keep an increment counter. If you were to do as I have, you could increment your counter, write the value into the text file, write the comma, write your data, and then write a newline, then repeat. Just remember to rename the file to a csv file when you're all done.

Like I said, it's a hacky way of doing it, but it works. I'm open to hearing better methods.
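
A rough sketch of the counter approach described above, with placeholder data and a temporary path (both are assumptions for illustration). Since a csv file is plain text, this version writes straight to a .csv name and skips the rename step.

```python
import os
import tempfile

words = ["this", "amazing", "car"]  # placeholder data for illustration
path = os.path.join(tempfile.mkdtemp(), "wordFeatures.csv")

counter = 0
with open(path, "w") as output:
    for word in words:
        counter += 1                # increment the counter
        output.write(str(counter))  # write the value
        output.write(",")           # write the comma
        output.write(word)          # write the data
        output.write("\n")          # write the newline, then repeat

# Read the file back to inspect what was written.
with open(path) as check:
    content = check.read()
```

One caveat with plain concatenation: a value that itself contains a comma is not quoted, which the csv module would handle for you.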

