
I have data like the example below in a text file. I would like to search through the file and return everything between "SpecialStuff" and the next ";", as shown in the example output. I'm pretty new to Python, so any tips are greatly appreciated. Would something like .split() work?

Example Data:

stuff:
    1
    1
    1
    23

];

otherstuff:
    do something
    23
    4
    1

];

SpecialStuff
    select
        numbers
        ,othernumbers
        words
;

MoreOtherStuff
randomstuff
@#123


Example Output:

select
        numbers
        ,othernumbers
        words

3 Answers

1

You can try this:

seenSpecialStuff = 0  # Tracks whether the 'SpecialStuff' line has been seen.
# Open the original file for reading and a new file to write to.
# The "with" block closes both files once the loop is done.
with open("filename.txt", "r") as file, open("result.txt", "w") as output:
    for line in file:
        if ";" in line:
            seenSpecialStuff = 0  # Set tracker to 0 if it sees a semicolon.
        if seenSpecialStuff == 1:
            output.write(line)  # Write the line if the tracker is active.
        if "SpecialStuff" in line:
            seenSpecialStuff = 1  # Set tracker to 1 when SpecialStuff is seen.

This returns a file named result.txt that contains:

  select
    numbers
    ,othernumbers
    words

This code can be improved! Since this is likely a homework assignment, you'll probably want to do more research about how to make this more efficient. Hopefully it can be a useful starting point for you!

Cheers!

EDIT

If you wanted the code to specifically read the line "SpecialStuff" (instead of lines containing "SpecialStuff"), you could easily change the "if" statements to make them more specific:

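seenSpecialStuff = 0
# The "with" block closes both files once the loop is done.
with open("my.txt", "r") as file, open("result.txt", "w") as output:
    for line in file:
        if line.replace("\n", "") == ";":  # the line is exactly ";"
            seenSpecialStuff = 0
        if seenSpecialStuff == 1:
            output.write(line)
        if line.replace("\n", "") == "SpecialStuff":  # the line is exactly "SpecialStuff"
            seenSpecialStuff = 1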
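
To see how the two kinds of check differ, here is a minimal sketch that runs both on a list of lines held in memory instead of a file. The helper name extract_block, its exact flag, and the sample lines are made up for illustration and are not part of the code above.

```python
def extract_block(lines, exact):
    """Collect the lines after a 'SpecialStuff' marker, up to the next ';'.

    Hypothetical helper: exact=False uses the substring test ("in"),
    exact=True compares the whole line (minus its newline) to the marker.
    """
    collected = []
    seen = False
    for line in lines:
        text = line.replace("\n", "")
        is_end = (text == ";") if exact else (";" in line)
        is_start = (text == "SpecialStuff") if exact else ("SpecialStuff" in line)
        if is_end:
            seen = False
        if seen:
            collected.append(line)
        if is_start:
            seen = True
    return collected


# Illustrative data: the first marker is only a substring of a longer word.
sample_lines = [
    "abcSpecialStuffpdq\n",
    "unwanted\n",
    ";\n",
    "SpecialStuff\n",
    "    select\n",
    "        numbers\n",
    ";\n",
]

loose = extract_block(sample_lines, exact=False)
strict = extract_block(sample_lines, exact=True)
print(loose)   # also contains "unwanted\n"
print(strict)  # only the lines after the exact "SpecialStuff" line
```

Note that the whole-line comparison only strips the newline, so a marker line with leading or trailing spaces would not match; comparing against line.strip() is one way to relax that.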

3 Comments

Thank you, this is really close to what I was looking for. The only problem is that some parts of the code have strings like "abcSpecialStuffpdq", so it's grabbing everything that follows those too. How could I change the code so it only grabs stuff following the string "SpecialStuff"?
You can try making the "if" statement something like if line.replace("\n", "") == "SpecialStuff":, which would make it so that only the line that has exactly SpecialStuff in it would trigger making the tracker "1"! That can be done for the other lines too, if you want it to only find specific occurrences!
I edited the answer to reflect that! If you needed to later also grab the information contained in "abcSpecialStuffpdq" you would have to add a separate "if" statement so that the code would recognize it.
0
with open('path/to/input') as infile, open('path/to/output', 'w') as outfile:  # open the input and output files
    wanted = False  # do we want the current line in the output?
    for line in infile:
        if line.strip() == "SpecialStuff":  # marks the beginning of a wanted block
            wanted = True
            continue
        if line.strip() == ";" and wanted:  # marks the end of a wanted block
            wanted = False
            continue

        if wanted:
            outfile.write(line)
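
The same loop works on any file-like object, which makes it easy to try without touching the disk. The following is a rough sketch under that assumption: the function name copy_block and its start/end parameters are hypothetical, and io.StringIO stands in for the real input and output files, using a shortened copy of the question's example data.

```python
import io


def copy_block(infile, outfile, start="SpecialStuff", end=";"):
    """Write the lines between a `start` line and the next `end` line."""
    wanted = False  # are we currently inside a wanted block?
    for line in infile:
        stripped = line.strip()
        if stripped == start:  # marker line itself is not copied
            wanted = True
            continue
        if stripped == end and wanted:  # closing line is not copied either
            wanted = False
            continue
        if wanted:
            outfile.write(line)


# In-memory stand-ins for the input and output files.
sample_file = io.StringIO(
    "otherstuff:\n"
    "    do something\n"
    "];\n"
    "\n"
    "SpecialStuff\n"
    "    select\n"
    "        numbers\n"
    "        ,othernumbers\n"
    "        words\n"
    ";\n"
    "\n"
    "MoreOtherStuff\n"
)
result = io.StringIO()
copy_block(sample_file, result)
print(result.getvalue(), end="")
```

Because the markers are compared against the stripped line, "];" and "abcSpecialStuffpdq" are ignored, while an indented "SpecialStuff" or ";" line would still count.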


0

Don't use str.split() for that - str.find() is more than enough:

marker = "SpecialStuff"
parsed = None
with open("example.dat", "r") as f:
    data = f.read()  # load the whole file into memory for convenience
start_index = data.find(marker)  # find the beginning of your block
if start_index != -1:
    end_index = data.find(";", start_index)  # find the end of the block
    if end_index != -1:
        # skip past the marker itself and grab everything in between
        parsed = data[start_index + len(marker):end_index]
if parsed is None:
    print("`SpecialStuff` block not found")
else:
    print(parsed)

Keep in mind that this will capture everything between those two, including newlines and other whitespace. You can additionally call parsed.strip() to remove leading and trailing whitespace if you don't want it.
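
As a quick illustration of that whitespace, here is a small sketch that wraps the same str.find() logic in a function and runs it on a string instead of a file. The function name find_block, its parameters, and the shortened sample text are assumptions made for this example only.

```python
def find_block(data, start="SpecialStuff", end=";"):
    """Return the text between the first `start` and the next `end`, or None."""
    start_index = data.find(start)
    if start_index == -1:
        return None
    end_index = data.find(end, start_index)
    if end_index == -1:
        return None
    return data[start_index + len(start):end_index]


# Shortened, illustrative version of the example data.
sample_text = (
    "stuff:\n"
    "    1\n"
    "];\n"
    "\n"
    "SpecialStuff\n"
    "    select\n"
    "        numbers\n"
    ";\n"
    "\n"
    "MoreOtherStuff\n"
)

raw = find_block(sample_text)
print(repr(raw))          # keeps the newline after the marker and the indentation
print(repr(raw.strip()))  # leading and trailing whitespace removed
```

One caveat with this approach: str.find() matches substrings, so a string such as "abcSpecialStuffpdq" appearing earlier in the data would be picked up as the start of the block.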
