
I have a large text file full of notes that I would like to split and separate into individual rows using Python. I've gotten it to work somewhat, but it is adding one letter per cell in a .csv file, not the entire section. I've inserted the @@@ characters to denote where each section needs to be split. For example, here's what my .txt file looks like:

@@@ jlkdlkjdlkjdalkjdalk @@@ 78278947298729874298742 @@@ llkdlaklkdalkdsa
@@@ nmczxmnczxmncz

I eventually want it exported into .csv so it would look like this:

ID | Reporttext

1  | jlkdlkjdlkjdalkjdalk 
2  | 78278947298729874298742 
3  | llkdlaklkdalkdsa
4  | nmczxmnczxmncz

Right now it's being exported like this: j l k d l k (and so on).

Here's my code:

import re, csv

with open("thetext.txt") as f:
    for line in f:
        for word in line.split("@@@"):
            with open(r'theoutput.csv', 'a') as g:
                writer = csv.writer(g)
                writer.writerow(word)
            print(word)

So just to reiterate, my problem is avoiding the spacing (e.g., t h i s ) when it exports.

Thanks!

1
  • If I understand correctly, your separator is '@@@', right? In any case, did you try using pandas to load your file and then export it to CSV? data = pd.read_csv('my_file.txt', sep="@@@ ", header=None) followed by data.to_csv('my_new_file.csv'). Commented Mar 14, 2019 at 13:53

3 Answers

3

You could strip the text and split it on the @ character, like this:

$ cat txt2csv.py 
import csv

with open('some.txt') as file_, open('some_new.csv', 'w') as csvfile:
    lines = [x for x in file_.read().strip().split('@') if x]
    writer = csv.writer(csvfile, delimiter='|')
    writer.writerow(('ID', 'Reporttext'))
    for idx, line in enumerate(lines, 1):
        writer.writerow((idx, line.strip('@')))

And the input file,

$ cat some.txt 
@@@ jlkdlkjdlkjdalkjdalk @@@ 78278947298729874298742 @ llkdlaklkdalkdsa @@@ nmczxmnczxmncz

And the output file,

$ cat some_new.csv 
ID|Reporttext
1| jlkdlkjdlkjdalkjdalk 
2| 78278947298729874298742 
3| llkdlaklkdalkdsa 
4| nmczxmnczxmncz
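
Splitting on a single @ works for the sample above, but it would also cut any section that happens to contain an @ of its own (an e-mail address, for instance). A minimal sketch of splitting on the full marker instead, assuming every section is separated by the complete @@@ sequence as in the question; the helper name split_sections and the sample text are made up for illustration:

```python
def split_sections(text, separator="@@@"):
    """Return the non-empty, whitespace-trimmed sections of text."""
    return [part.strip() for part in text.split(separator) if part.strip()]

# Hypothetical sample: the second section contains an '@' of its own.
sample = "@@@ first note @@@ mail bob@example.com\n@@@ third note\n"

# Splitting on the single character cuts the e-mail address in two.
by_char = [x for x in sample.strip().split("@") if x]

# Splitting on the whole marker keeps each section intact.
by_marker = split_sections(sample)

print(len(by_char), by_char)
print(len(by_marker), by_marker)
```

The resulting list can be passed to enumerate(..., 1) and written with csv.writer exactly as in the script above.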

1

First, open both files in a single with statement, so the output file is not reopened for every word:

import csv
with open("thetext.txt") as f, open('theoutput.csv', 'a') as g:
    lines = [x for x in f.read().strip().split('@') if x]
    writer = csv.writer(g, delimiter='|')
    writer.writerow(('ID', 'Reporttext'))
    for lineNumber, line in enumerate(lines, 1):
        writer.writerow((lineNumber, line.strip('@')))

Also, you have to use

lines = f.readlines()

because what is happening now is that Python treats the text file like one large string.
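
For reference, the one-letter-per-cell output can be reproduced in isolation. csv.writer.writerow() iterates over whatever it is given, so a bare string is written one character per field, while a list or tuple containing that string is written as a single field. A small sketch using an in-memory buffer instead of a real file; the helper name write_rows is made up for illustration:

```python
import csv
import io

def write_rows(values):
    """Write each value with writerow() and return the resulting CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for value in values:
        writer.writerow(value)
    return buffer.getvalue()

# A bare string is iterated character by character: one field per letter.
as_string = write_rows(["abc"])

# Wrapping the string in a list makes it a single field.
as_list = write_rows([["abc"]])

print(repr(as_string))  # 'a,b,c\n'
print(repr(as_list))    # 'abc\n'
```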

2 Comments

Using for line in f: instead of f.readlines() is totally fine and even more memory-friendly, since it reads line by line instead of loading every line into a list beforehand.
That's very helpful, I didn't know that.
1

Similar to the answer from han solo, you could do the line reading and splitting like this:

import csv

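with open("thetext.txt") as txt, open('theoutput.csv', 'a') as csvfile:
  writer = csv.writer(csvfile, delimiter=';')

  writer.writerow(('ID', 'Reporttext'))
  row_id = 1  # renamed from id to avoid shadowing the built-in id()
  for line in txt:
    # Split each line into its @@@-separated sections
    words = line.strip().split("@@@")

    for word in words:
      # Pass a tuple so the ID and the text land in separate columns
      writer.writerow((row_id, word.strip()))
      row_id += 1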

This way you read your txt file line by line, then split each line at the @@@ before writing the pieces one by one to your CSV file. You can even remove the leading @@@ in your input file.
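
If editing the input file is not an option, the empty piece in front of a leading @@@ can be skipped in code instead. The following is only a sketch under a few assumptions: the function name convert and the temporary demo files are made up for illustration, the output is opened in 'w' mode so the header is written once per run, and newline='' is passed as the csv module documentation recommends for files handed to csv.writer.

```python
import csv
import os
import tempfile

def convert(txt_path, csv_path, separator="@@@"):
    """Stream txt_path line by line and write one CSV row per section."""
    row_id = 0
    with open(txt_path) as txt, open(csv_path, "w", newline="") as out:
        writer = csv.writer(out)  # default delimiter is a comma
        writer.writerow(("ID", "Reporttext"))
        for line in txt:
            for section in line.split(separator):
                section = section.strip()
                if not section:  # skip the empty piece before a leading @@@
                    continue
                row_id += 1
                writer.writerow((row_id, section))
    return row_id  # number of data rows written

# Demo on temporary files so nothing in the working directory is touched.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "thetext.txt")
    dst = os.path.join(tmp, "theoutput.csv")
    with open(src, "w") as f:
        f.write("@@@ aaa @@@ 123 @@@ bbb\n@@@ ccc\n")
    written = convert(src, dst)
    with open(dst, newline="") as f:
        rows = list(csv.reader(f))

print(written, rows)
```

Reading the result back with csv.reader is a quick way to check that the ID and the text really end up in separate columns before importing the file anywhere else.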

2 Comments

The code works, but the only problem is that the text is misaligned. For example, "1" is showing up under ID, but so is some of the text. I'd like them to be in separate columns so I can import this into a database. Do I need to use Pandas or something to do this? Sorry I wasn't clear.
If you want a normal CSV file then you have to use a semicolon as delimiter. I will edit my answer accordingly. And you should remove the leading @@@ in your file so the code above won't write empty cells.
