Hi, I am creating a pandas DataFrame with this piece of code:

for odfslogp_obj in odfslogs_plist:
    with zipfile.ZipFile(odfslogp_obj, mode='r') as z:
        for name in z.namelist():
            with z.open(name) as etest_zip:
                tdict = {}
                # decode the log file lines from bytes to text
                etestlines = [line.decode() for line in etest_zip]
                # (start, end) line indices of each bin value section
                regval_range_tup_list = list(zip(
                    [i for i, x in enumerate(etestlines) if 'Head' in x],
                    [i for i, x in enumerate(etestlines) if 'BINS' in x]))
                # extract head and site:param values from the bin value sections
                head_siteparam_tup_list = list(zip(
                    [x.split("=")[1].replace("(", '').replace(")", '').rstrip() for x in etestlines if 'Head' in x],
                    [x.split(":")[2].rstrip() for x in etestlines if 'SITE:PARAM_SITE:' in x]))

                print(head_siteparam_tup_list)
                # slice by (start, end) without shadowing the built-in range
                linesineed = [etestlines[start:end - 1] for start, end in regval_range_tup_list]
                reglinecount = []
                regvals = []
                for head_site, loclist in zip(head_siteparam_tup_list, linesineed):
                    regvals_ext = [x for x in loclist if pattern.search(x)]
                    regvaltups_list = [tuple(x.split(":")[0:2]) for x in regvals_ext]
                    regvaldict = dict(regvaltups_list)
                    df = pd.DataFrame(data=regvaldict)
                    print(df)

A sample of the dictionary being used looks like this when printed:

{'1000': '1669.15', '10012': '-0.674219', '10013': '-0.260156', '1003': '9.5792', '1007': '11.9812', '1011': '27.888', '1012': '14.8333', '1014': '19.1812', '1015': '19.0396', '1024': '1352.66', '1025': '3247.63', '1026': '33.7434', '1027': '38.7566', '1030': '19.7548', '1031': '30.2201'}

As you can see, the values are all strings, so why is it giving me this error? And how do I fix it?

2 Answers

Check the orient parameter of pd.DataFrame.from_dict():

pd.DataFrame.from_dict(dic, orient='index')

Another option:

pd.DataFrame(list(dic.values()), index=list(dic.keys()))

Output:

        0
1000    1669.15
10012   -0.674219
10013   -0.260156
...

Alternatively, if you do not want to have the keys as index:

pd.DataFrame(dic.items())

Output:

    0       1
0   1000    1669.15
1   10012   -0.674219
2   10013   -0.260156
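
For reference, here is a self-contained sketch of the options above. It assumes a shortened, three-entry copy of the dictionary from the question, and the name dic is only illustrative. Passing a dict whose values are all scalars straight to pd.DataFrame raises a ValueError, because pandas has nothing to build an index from. The three calls after the try block avoid that.

```python
import pandas as pd

# Shortened sample of the dictionary from the question (illustrative only)
dic = {'1000': '1669.15', '10012': '-0.674219', '10013': '-0.260156'}

# A dict of plain scalars gives pandas no index to work with
try:
    pd.DataFrame(dic)
    raised = False
except ValueError:
    raised = True

# Option 1: keys become the index, values go into a single column labelled 0
by_index = pd.DataFrame.from_dict(dic, orient='index')

# Option 2: the same layout without from_dict
by_index_manual = pd.DataFrame(list(dic.values()), index=list(dic.keys()))

# Option 3: keys and values as two ordinary columns labelled 0 and 1
as_columns = pd.DataFrame(list(dic.items()))

print(by_index)
print(as_columns)
```

Which layout is preferable depends on whether the keys should act as row labels or as regular data.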

3 Comments

Hmm, but I am not using the .from_dict() function. I try to use as few functions as possible to minimize any potential overhead, as I will be processing a large number of files.
Check the second option in case it is better for your case.
Unfortunately, none of those worked. I found the solution by using a list of dictionaries instead. But thank you for your efforts! I have copied down your methods in case a different use case calls for them.

I found what I was looking for by using a list of dictionaries instead of passing a dictionary.

This not only let me collect the dictionaries from the different sections, it also streamlined the code and made it easy to build a DataFrame from the aggregated list of dictionaries that looks exactly how I wanted:

                # created before the loop (the original snippet does not show this line)
                regvaltupaggr_list = []
                for head_site, loclist in zip(head_siteparam_tup_list, linesineed):
                    regvals_ext = [x for x in loclist if pattern.search(x)]
                    # print(regvals_ext)
                    regvaltups_list = [tuple(x.split(":")[0:2]) for x in regvals_ext]
                    regvaldict = dict(regvaltups_list)
                    # collect one dictionary per section
                    regvaltupaggr_list.append(regvaldict)

                # each dictionary becomes one row; its keys become the columns
                regvalfile_df = pd.DataFrame(regvaltupaggr_list)
                # print(regvalfile_df)
                regvalfile_df.to_csv(r"C:\Users\sys_nsgprobeingestio\Documents\dozie\odfs\etest\filebinval.csv", index=False)

Output:

       1000      10012      10013     1003     1007    1011     1012     1014     1015     1024     1025     1026     1027  ...   9717   9718     9722     9723     9724     9725     9726     9727     9728     9729    9730       9912       9913
0   1665.67  -0.678906  -0.267969  9.66017  12.0638  27.728  15.2347  19.9796   19.634  1352.33  3618.55   32.843  38.1179  ...  81.58  89.88  106.239  117.136  132.556  132.944   141.92   132.76  161.551  68.6192  67.325   -0.68125  -0.27031
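
The same idea in a small self-contained form: each dictionary in the list becomes one row, the union of the keys becomes the columns, and a key missing from a dictionary shows up as NaN. The sample values and the names sections, df and df_numeric are made up for illustration. Because the parsed values are strings, the sketch also converts them with pd.to_numeric, a step the code above does not include.

```python
import pandas as pd

# Hypothetical parsed sections: one dict of {parameter: value-as-string} each
sections = [
    {'1000': '1669.15', '10012': '-0.674219', '1003': '9.5792'},
    {'1000': '1665.67', '10012': '-0.678906'},  # '1003' is missing here
]

# One row per dict; the columns are the union of the keys
df = pd.DataFrame(sections)

# Optional: convert the string values to floats, column by column
df_numeric = df.apply(pd.to_numeric)

print(df_numeric)
```

Building the list first and calling pd.DataFrame once per file avoids creating a separate DataFrame for every section.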
