I am trying to extract the following fields from a text file: srcintf, dstintf, srcaddr, dstaddr, action, schedule, service, logtraffic. I want to save the values to a CSV file, with one row for each edit ... next block.

The input file looks like this:

edit 258
    set srcintf "Untrust"
    set dstintf "Trust"
    set srcaddr "all"
    set dstaddr "10.2.22.1/32"
    set action accept
    set schedule "always"
    set service "selling_soft_01"
    set logtraffic all
next
edit 184
    set srcintf "Untrust"
    set dstintf "Trust"
    set srcaddr "Any"
    set dstaddr "10.1.1.1/32"
    set schedule "always"
    set service "HTTPS"
    set logtraffic all
next
edit 124
    set srcintf "Untrust"
    set dstintf "Trust"
    set srcaddr "Any"
    set dstaddr "172.16.77.1/32"
    set schedule "always"
    set service "ping"
    set logtraffic all
    set nat enable
next

This is my first time programming (as you can see from my code) but maybe you can understand more about what I am trying to do. See code below.

import csv

text_file = open("fwpolicy.txt", "r")

lines = text_file.readlines()

mycsv = csv.writer(open('output.csv', 'w'))

mycsv.writerow(['srcintf', 'dstintf', 'srcaddr', 'dstaddr', 'schedule', 'service', 'logtraffic', 'nat'])

n = 0
for line in lines: 
    n = n + 1
n = 0
for line in lines: 
    n = n + 1
    if "set srcintf" in line:
            srcintf = line
    else    srcintf = 'not set'
    if "set dstintf" in line:            
        dstintf = line
    else    dstintf  = 'not set'
    if "set srcaddr" in line:           
        srcaddr = line
    else    srcaddr = 'not set'
    if "set dstaddr" in line:
            dstaddr = line
    else    dstaddr = 'not set'
    if "set action" in line:            
        action = line
    else    action = 'not set'
    if "set schedule" in line:
            schedule = line
    else    schedule = 'not set'
    if "set service" in line:
            service = line
    else    service = 'not set'
    if "set logtraffic" in line:
            logtraffic = line
    else    logtraffic = 'not set'
    if "set nat" in line:
            nat = line
    else    nat = 'not set'            

        mycsv.writerow([srcintf, dstintf, srcaddr, dstaddr, schedule, service, logtraffic, nat])

Expected results(CSV file):

srcintf,dstintf,srcaddr,dstaddr,schedule,service,logtraffic,nat
"Untrust","Trust","all","10.2.22.1/32","always","selling_soft_01",all,,

Actual results:

Traceback (most recent call last):
  File "parse.py", line 45, in <module>
    mycsv.writerow([srcintf, dstintf, srcaddr, dstaddr, schedule, service, logtraffic, nat])
NameError: name 'srcintf' is not defined
    Please show real and correctly indented code. This does not even contain the colons after else statements! (use copy/paste and Ctrl-K for code formatting...) Commented Jun 12, 2019 at 8:32

4 Answers

1

You are attempting to write a row to the csv for every line in your file. You should only write the row when you see the word next, so check for that before the write to collect the terms fully for each row.

Once that works, you will notice that each variable holds the whole line rather than just the value that follows the keyword. For example, with the line

 set srcintf "Untrust"

your code

 if "set srcintf" in line: srcintf = line
 else srcintf = 'not set' 

will give srcintf the whole line, set srcintf "Untrust", as its value. Split the string to pick out the actual value.

... something like this:

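import csv

with open("fwpolicy.txt", "r") as text_file, \
        open('output.csv', 'w', newline='') as out_file:
    mycsv = csv.writer(out_file)
    mycsv.writerow(['srcintf', 'dstintf', 'srcaddr', 'dstaddr', 'schedule',
                    'service', 'logtraffic', 'nat'])
    for line in text_file:
        line = line.strip()
        if line.startswith("edit"):
            # new block: reset every field to its default
            [srcintf, dstintf, srcaddr, dstaddr, schedule,
             service, logtraffic, nat] = ['not set'] * 8
        elif line.startswith("next"):
            # end of block: the row is complete, so write it
            mycsv.writerow([srcintf, dstintf, srcaddr, dstaddr, schedule, service, logtraffic, nat])
        # split()[2] is the value; strip('"') drops the surrounding quotes,
        # since csv.writer adds its own quoting where it is needed
        elif line.startswith("set srcintf"):
            srcintf = line.split()[2].strip('"')
        elif line.startswith("set dstintf"):
            dstintf = line.split()[2].strip('"')
        elif line.startswith("set srcaddr"):
            srcaddr = line.split()[2].strip('"')
        elif line.startswith("set dstaddr"):
            dstaddr = line.split()[2].strip('"')
        elif line.startswith("set action"):
            # parsed but not written: the header row has no action column
            action = line.split()[2].strip('"')
        elif line.startswith("set schedule"):
            schedule = line.split()[2].strip('"')
        elif line.startswith("set service"):
            service = line.split()[2].strip('"')
        elif line.startswith("set logtraffic"):
            logtraffic = line.split()[2].strip('"')
        elif line.startswith("set nat"):
            nat = line.split()[2].strip('"')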

The important thing is to fill in all the values for a row, and to write the row only once you have them. The repetition can be made neater, but hopefully this helps with the idea of a state machine: look at where you are in the file to decide whether to collect a value, start a new row, or write the current row.
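
One way the repetition could be made neater, as a rough sketch rather than a definitive version: keep the current row in a dictionary keyed by field name, so a single branch handles every set line. The function name parse_policies and the FIELDS list are made up for this illustration; the edit / set / next layout is taken from the question, and the sample text is a shortened, invented block.

```python
import csv
import io

# Columns from the question's header row.
FIELDS = ['srcintf', 'dstintf', 'srcaddr', 'dstaddr',
          'schedule', 'service', 'logtraffic', 'nat']


def parse_policies(lines):
    """Yield one dict per edit ... next block (hypothetical helper)."""
    row = None
    for line in lines:
        words = line.split(maxsplit=2)
        if not words:
            continue                      # blank line
        if words[0] == 'edit':
            # state: start a new row with every field defaulted
            row = dict.fromkeys(FIELDS, 'not set')
        elif words[0] == 'next' and row is not None:
            yield row                     # state: block finished, emit it
            row = None
        elif words[0] == 'set' and len(words) == 3 and row is not None:
            # state: inside a block, collect the value if it is a wanted field
            if words[1] in FIELDS:
                row[words[1]] = words[2].strip().strip('"')


sample = '''edit 184
    set srcintf "Untrust"
    set dstaddr "10.1.1.1/32"
    set action accept
next
'''
rows = list(parse_policies(sample.splitlines()))

# Write to an in-memory buffer here; a real script would open output.csv.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Because the field names live in one list, adding a column such as action would only mean adding it to FIELDS.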

2 Comments

Thank you. I will look into "split". Hopefully I will get it right.
I've added a version that's similar to your code, but with the split in place. The dictionary version above from Thomas is much cleaner though
1

Here is how to do that with a DictWriter

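import csv

with open("fwpolicy.txt", "r") as text_file, open('output.csv', 'w', newline='') as out_file:

    fieldnames = ['srcintf', 'dstintf', 'srcaddr', 'dstaddr', 'schedule',
                  'service', 'logtraffic', 'nat']

    # extrasaction='ignore' skips keys such as "action" that are not columns.
    # QUOTE_NONE writes the values exactly as they appear in the input,
    # which assumes that no value contains a comma.
    mycsv = csv.DictWriter(out_file, fieldnames=fieldnames, extrasaction='ignore',
                           quotechar=None, quoting=csv.QUOTE_NONE)
    mycsv.writeheader()

    row = {}
    for line in text_file:
        words = line.strip().split(maxsplit=2)
        if not words:
            continue  # skip blank lines
        if words[0] == 'set' and len(words) == 3:
            row[words[1]] = words[2]
        elif words[0] == 'next':
            print(row)
            mycsv.writerow(row)
            row = {}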
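
To see how the writer copes with fields that are missing or unwanted, here is a small self-contained check that writes to an in-memory buffer instead of a file. The three-column header and the sample row are made up for the illustration; extrasaction and restval are standard csv.DictWriter parameters.

```python
import csv
import io

fieldnames = ['srcintf', 'dstintf', 'nat']

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                        extrasaction='ignore',  # drop keys not in fieldnames
                        restval='not set')      # fill keys missing from the row
writer.writeheader()

# 'action' is not a column, and 'nat' is absent from this row
writer.writerow({'srcintf': 'Untrust', 'dstintf': 'Trust', 'action': 'accept'})

result = buffer.getvalue().splitlines()
print(result)  # ['srcintf,dstintf,nat', 'Untrust,Trust,not set']
```

Without restval, a missing field is written as an empty string, which is what the code above relies on.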


0

Here's how I'd approach this:

import csv

with open("structured_content.txt", "r") as text_file:
    text = text_file.read()

fieldnames = ['srcintf', 'dstintf', 'srcaddr', 'dstaddr', 'schedule', 'service', 'logtraffic', 'nat']

defaults = {'srcintf' : "not set", 'dstintf': "not set", 'srcaddr': "not set", 
            'dstaddr': "not set", 'schedule': "not set", 'service': "not set", 
            'logtraffic': "not set", 'nat': "not set"}

with open('output.csv', 'w', newline='') as out_file:
    mycsv = csv.DictWriter(out_file, fieldnames)
    mycsv.writeheader()
    # n.b. splitting on "next" and "set" assumes neither word occurs inside a value
    for block in text.split("next"):
        csv_row = {}
        for p in [s.strip() for s in block.replace("\n", " ").split("set")]:
            s = p.split()
            if len(s) == 2:
                csv_row[s[0]] = s[1]  # n.b. this includes "action" and "edit" fields, which need stripping out
        if not csv_row:
            continue  # skip the empty chunk after the last "next"
        csv_write_row = {}
        for k, v in csv_row.items():
            print("key=", k, "value=", v)
            if k in fieldnames:  # a filter to only include fields in the "fieldnames" list
                print(k, " is in the list - attach its value to the output dictionary")
                csv_write_row[k] = v
        for k, v in defaults.items():
            if k not in csv_write_row:  # pad out the output row with any default values not lifted from the file
                print(k, " is not in the list - write a default out")
                csv_write_row[k] = v
        mycsv.writerow(csv_write_row)

What I'm aiming to do here is take advantage of the structure of your file, using split to break the text into repeating chunks. Converting your file to csv is then just a matter of aligning your chunks (and nested chunks) to the csv format. csv.DictWriter provides a useful interface for writing your content out row by row.

If you want to set defaults for values that aren't there, I'd do that with a dictionary containing fieldname keys, and default (missing) values. You could then "wash" your prepared csv_write_row with these defaults in the case they're not present.
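
That "wash" step can also be written as a single dictionary merge. A minimal sketch, assuming Python 3.7 or later so that dictionaries keep their insertion order; the parsed row below is invented for illustration:

```python
fieldnames = ['srcintf', 'dstintf', 'srcaddr', 'dstaddr',
              'schedule', 'service', 'logtraffic', 'nat']
defaults = {name: 'not set' for name in fieldnames}

# what one block might yield after parsing (made-up values)
parsed = {'edit': '124', 'srcintf': '"Untrust"', 'nat': 'enable'}

# keep only the wanted keys, then let parsed values override the defaults
wanted = {k: v for k, v in parsed.items() if k in fieldnames}
csv_write_row = {**defaults, **wanted}
print(csv_write_row)
```

Changing a default later is then only an edit to the defaults dictionary, with no change to the merging code.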

6 Comments

Thank you very much Thomas! I mean, I understand maybe 30% of whats going on in that code but it works. Now I am going to study what you did... thank you again.
Just making an edit here to enact the edit I made about making the defaults using a dictionary of default values. The advantage being in the future, if you want to edit these, it's just a dict edit, and not code. Also, you're welcome!
Hi Thomas. I am still in the process to understand your code and I have a question..hopefully you can help me. Is there a way to know what's going on inside the csv_write_row and debug from there? I mean, I tried changing the "k, v" variables and run the script again to see what changes.. but I was wondering if there is a way to debug in real time step by step.. are there tools that can help me with that?(maybe an IDE can do that for me? do you use one?) Thank you and the others that helped me.
Sure, it's not an IDE as such, but I code in a Jupyter Notebook when I'm trying things out. I think some IDEs will give you steps and breakpoints, but I tend to just insert judiciously placed print statements (see edit), run my code, and then inspect whether what's printed matches what I expected. Each of those k,v blocks is a loop that pulls the key (k) and value (v) pairs out of the referenced dictionary. In the first, this is used to construct the output based on the fieldnames list, and in the second, to pad out anything that might be missing.
Thank you, Thomas. Even though I don't use the proper terms(I apologize), you can understand what I am trying to ask.
0

Here is a way to do it:

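import pandas as pd

keys = ['srcintf', 'dstintf', 'srcaddr', 'dstaddr', 'schedule', 'service', 'logtraffic', 'nat']

# file name taken from the question
with open("fwpolicy.txt") as text_file:
    lines = text_file.readlines()

records = []
record = {}
for line in lines:
    words = line.replace('"', '').split()

    # a "set <key> <value>" line for one of the wanted keys
    if len(words) >= 3 and words[0] == 'set' and words[1] in keys:
        record[words[1]] = words[2]
    elif words and words[0] == 'next':
        # end of block: store the record and start a fresh one
        records.append(record)
        record = {}

# columns=keys fixes the column order; missing values are written as empty cells
pd.DataFrame(records, columns=keys).to_csv('output.csv', index=False)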
