I am trying to use Python to auto-parse a set of text files and turn them into XML files.

A lot of people ask how to loop through a text file and read its lines into an array. The trouble is that this won't quite work for me.

I need to loop through the first three lines individually, then drop the rest of the text file (the body) into one array entry.

The text file is formatted as follows.

Headline

Subhead

by A Person

text file body content. Multiple paragraphs

How would I go about setting up an array to do this in Python?

2 Answers

Something like this:

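with open("data1.txt") as f:
    # The first three lines are the headline, subhead and byline.
    head, sub, auth = [f.readline().strip() for _ in range(3)]
    # Everything that remains is the body.
    data = f.read()
    print(head, sub, auth, data)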

If there are blank lines between the header lines, read six lines instead and let filter() remove the empty ones:

with open("data1.txt") as f:
    # Six lines cover the three header lines plus the blank line after each.
    head, sub, auth = filter(None, (f.readline().strip() for _ in range(6)))
    data = f.read()
    print(head, sub, auth, data)
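
The fixed count of six only works when exactly one blank line follows each header line. As a rough sketch of a more tolerant variant (the function name read_article is hypothetical, and UTF-8 input is an assumption), you could keep reading until three non-empty lines have been collected and then take the rest as the body:

```python
def read_article(path):
    """Return [headline, subhead, author, body] read from a text file.

    Blank lines before and between the three header lines are skipped,
    however many there are. The encoding is an assumption.
    """
    header = []
    with open(path, encoding="utf-8") as f:
        # Collect header fields until three non-empty lines are found.
        while len(header) < 3:
            line = f.readline()
            if not line:  # an empty string means end of file
                raise ValueError("expected three header lines in " + path)
            stripped = line.strip()
            if stripped:
                header.append(stripped)
        # Everything left in the file is the body, kept as one string.
        body = f.read().strip()
    return header + [body]
```

Calling read_article("data1.txt") would then give a four-item list with the whole body in the last entry.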

3 Comments

@enrico.bacis just strip() is fine.
Wonderful, this did it. One more question: how would I get the current path (where the text file lives) and append that to the bottom of the list?
@user1789778 if you're talking about the current working directory, then try os.getcwd().
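
Note that os.getcwd() returns the directory the script was started from, which is only the text file's directory if the two happen to coincide. A small sketch of deriving the directory from the file's own path instead (containing_dir is a hypothetical helper name):

```python
import os


def containing_dir(path):
    """Return the absolute path of the directory that holds `path`."""
    # abspath resolves a relative path against the current working directory,
    # dirname then drops the file name and keeps the directory part.
    return os.path.dirname(os.path.abspath(path))
```

With a list such as the one built above, something like fields.append(containing_dir("data1.txt")) would put the directory at the bottom of the list (fields is an assumed variable name).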

If I understood your question correctly, you wish to put all the text except for the first 3 lines into an array (list). Here's how to do that:

content_list = []  # the list that collects the results
with open("/path/to/your/file.txt") as f:
    all_lines = f.readlines()  # each line keeps its trailing "\n"
content_lines = all_lines[3:]
content_text = ''.join(content_lines)
content_list.append(content_text)

Explanation: You first open the file and put all of its lines into a list. Then you take all the lines after the first three and put those into a second list. Because readlines() keeps each line's trailing newline, you join this second list with an empty string to turn it back into a single block of text. Finally, you append that text to content_list, a list you created beforehand.


If you want to put the first three lines into your list as well, then do the following before appending to content_list:

for line in all_lines[:3]:
    content_list.append(line.strip())  # strip() drops the trailing newline
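
Since the question's end goal is XML, here is a minimal sketch of that last step using the standard library's xml.etree.ElementTree. It assumes the list holds the headline, subhead, author and body in that order. The element names and the function names article_to_xml and write_article_xml are made up for illustration, not taken from the question:

```python
import xml.etree.ElementTree as ET


def article_to_xml(fields):
    """Build an <article> element from [headline, subhead, author, body]."""
    headline, subhead, author, body = fields
    article = ET.Element("article")
    # One child element per field; the tag names are assumptions.
    for tag, text in (("headline", headline), ("subhead", subhead),
                      ("author", author), ("body", body)):
        ET.SubElement(article, tag).text = text
    return article


def write_article_xml(fields, xml_path):
    """Write the article to xml_path as UTF-8 with an XML declaration."""
    tree = ET.ElementTree(article_to_xml(fields))
    tree.write(xml_path, encoding="utf-8", xml_declaration=True)
```

ElementTree escapes characters such as & and < in the text for you, which is the main reason to prefer it over building the XML with string concatenation.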
