Python Pandas, write DataFrame to fixed-width file (to_fwf?)

Question

I see that Pandas has read_fwf, but does it have something like DataFrame.to_fwf? I'm looking for support for field width, numerical precision, and string justification. It seems that DataFrame.to_csv doesn't do this. numpy.savetxt does, but I wouldn't want to do:

numpy.savetxt('myfile.txt', mydataframe.to_records(), fmt='some format')

That just seems wrong. Your ideas are much appreciated.

Take a look at the to_string method to see if you can do what you want.
– zach, Commented May 13, 2013 at 1:02
This looks close. It seems that I'd have to give a formatter function for each column if any two float or string columns had different formats. It would do the trick, it just looks a little unwieldy. I'd hoped I was missing something. Thanks!
– jkmacc, Commented May 13, 2013 at 18:35
pandas df.to_csv has a sep=" " parameter that changes the comma to anything else, in this case a space or an empty string. That, in conjunction with the method's formatter, should do it.
– Joop, Commented Jun 14, 2013 at 10:52
@Joop Actually, calling df.to_csv() with an empty string as the delimiter raises TypeError: delimiter must be set.
– pbreach, Commented Jan 16, 2015 at 21:48
True. Passing an empty string to the method would create a mess, so do ignore my reference to an empty string. Maybe trying the pandas to_string method would help. It has a formatters parameter that is pretty good.
– Joop, Commented Feb 2, 2015 at 11:00
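
The per-column formatters idea raised in the comments above can be sketched roughly as follows. This is only an illustration, not a recipe confirmed by the thread: the sample DataFrame, the column names, and the widths are assumptions, and the exact spacing that to_string places between columns may differ between pandas versions.

```python
import pandas as pd

# Hypothetical sample data; column names and widths are assumptions.
df = pd.DataFrame({"name": ["a", "bb"], "x": [1.5, 22.25]})

# One formatter per column: left-justify strings in 8 characters,
# right-justify floats in 8 characters with 2 decimals.
formatters = {
    "name": "{:<8}".format,
    "x": "{:8.2f}".format,
}

# to_string still inserts its own spacing between the columns.
text = df.to_string(formatters=formatters, index=False, header=False)
lines = text.splitlines()
print(text)
```

Each column needs its own entry in the dict, which is the unwieldiness jkmacc mentions.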

Matt Kramer · Accepted Answer · 2016-03-13 19:29:45Z

28

Until someone implements this in pandas, you can use the tabulate package:

import pandas as pd
from tabulate import tabulate

def to_fwf(df, fname):
    content = tabulate(df.values.tolist(), list(df.columns), tablefmt="plain")
    # use a context manager so the file is flushed and closed
    with open(fname, "w") as f:
        f.write(content)

pd.DataFrame.to_fwf = to_fwf

answered Mar 13, 2016 at 19:29

Matt Kramer


Brian Burns · Accepted Answer · 2017-10-03 13:14:53Z

10

To use a custom format for each column, set the format for the whole line. The fmt parameter of np.savetxt accepts a single format string that covers every column of a row:

import numpy as np

# open in write mode; the default read mode would fail here
with open('output.dat', 'w') as ofile:
     fmt = '%.0f %02.0f %4.1f %3.0f %4.0f %4.1f %4.0f %4.1f %4.0f'
     np.savetxt(ofile, df.values, fmt=fmt)
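
A small self-contained version of the same idea, written against an in-memory buffer so the result can be inspected. The two-column frame and the widths are assumptions made for illustration. Leaving out the spaces between the specifiers lets the field widths alone define the layout.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical all-numeric frame; names and widths are assumptions.
df = pd.DataFrame({"id": [1, 10], "value": [2.5, 13.25]})

buf = io.StringIO()  # stands in for an open text file
# One %-specifier per column. With no separators between them,
# each row is exactly 4 + 8 = 12 characters wide.
np.savetxt(buf, df.to_numpy(), fmt="%4.0f%8.2f")
text = buf.getvalue()
print(text, end="")
```

The number of % specifiers has to equal the number of columns, otherwise np.savetxt raises an error.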

edited Oct 3, 2017 at 13:14

Brian Burns

answered Sep 14, 2017 at 16:45

Amir Uteuov


Alexandre Huat · Accepted Answer · 2021-11-23 10:27:03Z

9

pandas.DataFrame.to_string() is all you need. The only trick is how to manage the index.

# Write
# df.reset_index(inplace=True)  # uncomment if the index matters
df.to_string(filepath, index=False)

# Read
df = pd.read_fwf(filepath)
# df.set_index(index_names, inplace=True)  # uncomment if the index matters

If the index is a pandas.Index that has no name, reset_index() should assign it to column "index". If it is a pandas.MultiIndex that has no names, it should be assigned to columns ["level_0", "level_1", ...].
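
The write and read steps above can be checked with a quick round trip through an in-memory string. This is a sketch under assumptions: the sample data is made up, and it relies on read_fwf inferring the column boundaries from whitespace, which may not hold when string values themselves contain spaces.

```python
import io

import pandas as pd

# Hypothetical sample data; values contain no embedded spaces.
df = pd.DataFrame({"name": ["a", "bb"], "x": [1.5, 22.25]})

# Write: render the frame as aligned text without the index.
text = df.to_string(index=False)

# Read: let read_fwf infer the column boundaries from the text.
back = pd.read_fwf(io.StringIO(text))
print(back)
```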

edited Nov 23, 2021 at 10:27

answered Sep 15, 2020 at 15:23

Alexandre Huat

7 Comments

Jaime M. Over a year ago

Note that DataFrame.to_string() has no option to remove the space between output columns.

Alexandre Huat Over a year ago

@JaimeM. I’m sorry but I don’t understand how it relates to the question.

Jaime M. Over a year ago

DataFrame.to_string() cannot write output without spaces between columns, so it is not valid for producing a fixed-width file row. For example, when a column is 1 character wide and left-aligned, to_string() will output x yyy... but you want xyyy....

Alexandre Huat Over a year ago

Oh, okay. I hadn't thought about that use case because it's very rare. When someone exports a table in CSV, TSV, Excel, fixed-width, or any other format, column separators are usually expected. They make the file simpler to read and process. Yet, if all columns are 1 character wide, we can remove the spaces with df.to_string().replace(" ", ""), and then write that string into a text file.

Jaime M. Over a year ago

Fixed-width format doesn't use a column separator. Because each column has a fixed width, you know where each column begins and ends. Padding with white space is used between the value and the column limit, so you can use str.replace().
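
For the separator-free layout discussed in these comments, one possible sketch pads every column to its width and joins the pieces directly. The frame, the column names, and the widths dict are hypothetical, and this is not a method proposed in the thread itself.

```python
import io

import pandas as pd

# Hypothetical data and layout: a 1-character flag and a 3-character code.
df = pd.DataFrame({"flag": ["x", "y"], "code": ["yyy", "zz"]})
widths = {"flag": 1, "code": 3}

# Truncate and left-justify each column to its width, as strings.
padded = [
    df[col].astype(str).str.slice(0, w).str.ljust(w)
    for col, w in widths.items()
]
# Join the fields of each row with no separator between them.
lines = ["".join(parts) for parts in zip(*padded)]
text = "\n".join(lines) + "\n"

# Reading back needs the widths, since whitespace no longer marks columns.
back = pd.read_fwf(
    io.StringIO(text),
    widths=list(widths.values()),
    header=None,
    names=list(widths),
)
```

Because nothing separates the fields, the reader has to be given the same widths that were used for writing.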

Community · Accepted Answer · 2017-05-23 12:00:32Z

7

Python, Pandas : write content of DataFrame into text File

The answer to the question above helped me. It is not the best, but until to_fwf exists this will do the trick for me...

np.savetxt(r'c:\data\np.txt', df.values, fmt='%d')

or

np.savetxt(r'c:\data\np.txt', df.values, fmt='%10.5f')
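
The two calls above apply one numeric format to every column. For a frame that mixes strings and numbers, a sketch along the following lines may work; the sample frame and the widths are assumptions, and the array is converted to object dtype so that each value keeps its own type.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical mixed-type frame; names and widths are assumptions.
df = pd.DataFrame({"name": ["a", "bb"], "x": [1.5, 22.25]})

buf = io.StringIO()  # stands in for a file path or open file
# %-8s left-justifies the string in 8 characters,
# %8.2f right-justifies the float in 8 characters.
np.savetxt(buf, df.to_numpy(dtype=object), fmt="%-8s%8.2f")
text = buf.getvalue()
print(text, end="")
```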

edited May 23, 2017 at 12:00

CommunityBot

answered Jun 26, 2016 at 7:51

brandog

1 Comment

maxymoo Over a year ago

IMO this is better than tabulate, since numpy is installed with pandas and so it doesn't require an additional library.

leon yin · Accepted Answer · 2015-07-30 18:14:58Z

4

I'm sure you found a workaround for this issue, but for anyone else who is curious... If you convert the DataFrame to a list, you can write it out to a file by building each line with 'format string'.format(values), e.g.:

df = df.fillna('')  # blanks suit the text columns; a '' in a numeric column would fail the float formats
outF = 'output.txt'
# one format spec per column, in column order
rowFmt = ('{:7.2f}{:>6.2f}{:>2.0f}{:>4.0f}{:>5.0f}{:6.2f}{:6.2f}{:6.2f}{:6.1f} '
          '{:>15}{:>60}')
with open(outF, 'w') as dbOut:
    # write one formatted line per DataFrame row (first 11 columns)
    for row in df.values.tolist():
        dbOut.write(rowFmt.format(*row[:11]) + '\n')

Just make sure to match up each index with the correct format :)
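
One way to guard against a mismatch between fields and formats is to keep the specs in a list and check the count before formatting. The helper name format_rows and the sample specs below are hypothetical, chosen only to illustrate the idea.

```python
def format_rows(rows, specs):
    """Render each row with one format spec per field, no separators."""
    row_fmt = "".join(specs)
    lines = []
    for row in rows:
        # Fail early instead of silently dropping or misaligning fields.
        if len(row) != len(specs):
            raise ValueError(f"expected {len(specs)} fields, got {len(row)}")
        lines.append(row_fmt.format(*row))
    return lines


# Hypothetical layout: two numeric fields and one right-justified text field.
specs = ["{:7.2f}", "{:>6.2f}", "{:>15}"]
rows = [(1.5, 22.25, "abc"), (10.0, 3.0, "")]
lines = format_rows(rows, specs)
```

Every line comes out 7 + 6 + 15 = 28 characters wide, as long as no value overflows its field.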

Hope that helps!

answered Jul 30, 2015 at 18:14

leon yin


zubin patel · Accepted Answer · 2019-02-26 20:13:49Z

Found a very simple solution (Python). In the code snippet I am writing a DataFrame to a positional file. finalDataFrame.values.tolist() returns a list in which each row of the DataFrame is turned into another list, like [['Camry',2019,'Toyota'],['Mustang','2016','Ford']]. After that, each field is padded or truncated to its fixed length inside a for loop. The rest is obvious!

with open(FilePath, 'w') as f:
    for row in finalDataFrame.values.tolist():
        # column 2: left-justified in 7 characters; the string 'nan' becomes blanks
        row[2] = ('' if row[2] == 'nan' else str(row[2])).ljust(7)
        # column 3 (already a string): left-justified in 25 characters
        row[3] = row[3].ljust(25)
        # column 4: truncated to at most 10 characters
        row[4] = str(row[4])[:10]
        # write the first seven fields back to back, with no separator
        f.write(''.join(str(v) for v in row[:7]) + '\n')

Chen Du · Accepted Answer · 2020-09-25 02:26:17Z

Based on the other answers, here is the snippet I wrote. It is not the best in coding style or performance:

import pandas as pd
import pickle
import numpy as np
from tabulate import tabulate


# Truncate str(value) to `length` characters, then pad it to exactly that width.
# Plain string methods replace the eval() calls, which broke on values containing quotes.
left_align_gen = lambda length, value: str(value)[:length].ljust(length)
right_align_gen = lambda length, value: str(value)[:length].rjust(length)

# df = pd.read_pickle("dummy.pkl")
with open("df.pkl", 'rb') as f:
    df = pickle.load(f)

# field width defines here, width of each field
widths=(22, 255, 14, 255, 14, 255, 255, 255, 255, 255, 255, 22, 255, 22, 255, 255, 255, 22, 14, 14, 255, 255, 255, 2, )

# format datetime
df['CREATED_DATE'] = df['CREATED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))
df['LAST_MODIFIED_DATE'] = df['LAST_MODIFIED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))
df['TERMS_ACCEPTED_DATE'] = df['TERMS_ACCEPTED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))
df['PRIVACY_ACCEPTED_DATE'] = df['PRIVACY_ACCEPTED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))


# print(type(df.iloc[0]['CREATED_DATE']))
# print(df.iloc[0])
record_line_list = []
# for row in df.iloc[:10].itertuples():
for row in [tuple(x) for x in df.to_records(index=False)]:
    record_line_list.append("".join(left_align_gen(length, value) for length, value in zip(widths, row)))

with open('output.txt', 'w') as f:
    f.write('\n'.join(record_line_list))
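
The datetime formatting step in this snippet can also be written with the vectorized .dt accessor instead of apply. A minimal sketch, assuming the column already has a datetime dtype; the single sample timestamp and the width of 22 are made up for illustration.

```python
import pandas as pd

# Hypothetical one-row frame with a datetime column.
df = pd.DataFrame({"CREATED_DATE": pd.to_datetime(["2020-09-25 02:26:17"])})

# Vectorized equivalent of apply(lambda x: x.to_pydatetime().strftime(...)).
df["CREATED_DATE"] = df["CREATED_DATE"].dt.strftime("%Y%m%d%H%M%S")

# Pad the formatted stamp to an assumed field width of 22 characters.
stamp = df["CREATED_DATE"].str.ljust(22)
```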

