26

I see that Pandas has read_fwf, but does it have something like DataFrame.to_fwf? I'm looking for support for field width, numerical precision, and string justification. It seems that DataFrame.to_csv doesn't do this. numpy.savetxt does, but I wouldn't want to do:

numpy.savetxt('myfile.txt', mydataframe.to_records(), fmt='some format')

That just seems wrong. Your ideas are much appreciated.

5
  • 5
    take a look at the to_string method so see if you can do what you want. Commented May 13, 2013 at 1:02
  • This looks close. It seems that I'd have to give a formatter function for each column if any two float or string columns had different formats. It would do the trick, it just looks a little unwieldy. I'd hoped I was missing something. Thanks! Commented May 13, 2013 at 18:35
  • 1
    pandas df.to_csv has a sep =" " parameter that changes the comma to anything else in this case a space or empty string. That in conjunction with the formatter of the method should do it. Commented Jun 14, 2013 at 10:52
  • 1
    @Joop Actually with the df.to_csv() method using empty string as delimiter gives the error TypeError: delimiter must be set. Commented Jan 16, 2015 at 21:48
  • 1
    true.. passing empty string to method would create a mess do ignore my reference to empty string. maybe trying pandas "to_string" method would help. if has formatters parameter that is pretty good Commented Feb 2, 2015 at 11:00

7 Answers 7

28

Until someone implements this in pandas, you can use the tabulate package:

import pandas as pd
from tabulate import tabulate

def to_fwf(df, fname):
    content = tabulate(df.values.tolist(), list(df.columns), tablefmt="plain")
    open(fname, "w").write(content)

pd.DataFrame.to_fwf = to_fwf
Sign up to request clarification or add additional context in comments.

Comments

10

For custom format for each column you can set format for whole line. fmt param provides formatting for each line

with open('output.dat') as ofile:
     fmt = '%.0f %02.0f %4.1f %3.0f %4.0f %4.1f %4.0f %4.1f %4.0f'
     np.savetxt(ofile, df.values, fmt=fmt)

Comments

9

pandas.DataFrame.to_string() is all you need. The only trick is how to manage the index.

# Write
# df.reset_index(inplace=True)  # uncomment if the index matters
df.to_string(filepath, index=False)

# Read
df = pd.read_fwf(filepath)
# df.set_index(index_names, inplace=True)  # uncomment if the index matters

If the index is a pandas.Index that has no name, reset_index() should assign it to column "index". If it is a pandas.MultiIndex that has no names, it should be assigned to columns ["level_0", "level_1", ...].

7 Comments

Note that Dataframe.to_string() has no option to remove space between output columns.
@JaimeM. I’m sorry but I don’t understand how it relates to the question.
Dataframe.to_string() is not able to write an string output without spaces between columns so is not valid to get a fixed-width file row. For example, when the column length is 1 character and left aligned, to_string() will output: x yyy... but you want xyyy....
Oh, okay. I haven’t thought about that use case because it’s very rare. When someone exports a table in CSV, TSV, Excel, fixed-width, or any other format, columns separators are usually expected. It makes the file simpler to read and process. Yet, if all columns have 1 character, we can remove the spaces with df.to_string().replace(" ", ""), and then write that string into a text file.
Fixed-width format doesn't use columns separator. Because each column has fixed width, so you know where is the begining and the end of each column. Padding with white spaces is used between the value and the limit column, so you can use str.replace().
|
7

Python, Pandas : write content of DataFrame into text File

The question aboves answer helped me. It is not the best, but until to_fwf exists this will do the trick for me...

np.savetxt(r'c:\data\np.txt', df.values, fmt='%d')

or

np.savetxt(r'c:\data\np.txt', df.values, fmt='%10.5f')

1 Comment

IMO this is better than tabulate since numpy is included with pandas so doesn't require an additional library
4

I'm sure you found a workaround for this issue but for anyone else who is curious... If you write the DF into a list, you can write it out to a file by giving the 'format as a string'.format(list indices) eg:

df=df.fillna('')
outF = 'output.txt'      
dbOut = open(temp, 'w')
v = df.values.T.tolist()        
for i in range(0,dfRows):       
    dbOut.write(( \
    '{:7.2f}{:>6.2f}{:>2.0f}{:>4.0f}{:>5.0f}{:6.2f}{:6.2f}{:6.2f}{:6.1f {:>15}{:>60}'\
    .format(v[0][i],v[1][i],v[2][i],v[3][i],v[4][i],v[5][i],v[6][i],v[7][i],v[8][i],\
    v[9][i],v[10][i]) ))
    dbOut.write("\n")
dbOut.close

Just make sure to match up each index with the correct format :)

Hope that helps!

Comments

1

found a very simple solution! (Python). In the code snapped I am trying to write a DataFrame to a positional File. "finalDataFrame.values.tolist()" will return u a list in which each row of the DataFrame is turn into an another list just a [['Camry',2019,'Toyota'],['Mustang','2016','Ford']]. after that with the help of for loop and if statement i am trying to set its fix length. rest is obvious!

 with open (FilePath,'w') as f:
    for i in finalDataFrame.values.tolist():
        widths=(0,0,0,0,0,0,0)
        if i[2] == 'nan':
            i[2]=''
            for h in range(7):
                i[2]= i[2] + ' '
        else:
            x=7-len(str(i[2]))
            a=''
            for k in range(x):
               a=a+' '
            i[2]=str(i[2])+a

        if i[3] == '':
            i[3]=''
            for h in range(25):
                i[3]=i[3]+' '
        else:
            x = 25 - len(i[3])
            print(x)
            a = ''
            for k in range(x):
                a = a + ' '
            print(a)
            i[3] = i[3] + a


        i[4] = str(i[4])[:10]

        q="".join("%*s" % i for i in zip(widths, i))
        f.write(q+'\n')

Comments

0

Based on others' answer, here is the snippet I wrote, not the best in coding and performance:

import pandas as pd
import pickle
import numpy as np
from tabulate import tabulate


left_align_gen = lambda length, value: eval(r"'{:<<<length>>}'.format('''<<value>>'''[0:<<length>>])".replace('<<length>>', str(length)).replace('<<value>>', str(value)))
right_align_gen = lambda length, value: eval(r"'{:><<length>>}'.format('''<<value>>'''[0:<<length>>])".replace('<<length>>', str(length)).replace('<<value>>', str(value)))

# df = pd.read_pickle("dummy.pkl")
with open("df.pkl", 'rb') as f:
    df = pickle.load(f)

# field width defines here, width of each field
widths=(22, 255, 14, 255, 14, 255, 255, 255, 255, 255, 255, 22, 255, 22, 255, 255, 255, 22, 14, 14, 255, 255, 255, 2, )

# format datetime
df['CREATED_DATE'] = df['CREATED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))
df['LAST_MODIFIED_DATE'] = df['LAST_MODIFIED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))
df['TERMS_ACCEPTED_DATE'] = df['TERMS_ACCEPTED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))
df['PRIVACY_ACCEPTED_DATE'] = df['PRIVACY_ACCEPTED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))


# print(type(df.iloc[0]['CREATED_DATE']))
# print(df.iloc[0])
record_line_list = []
# for row in df.iloc[:10].itertuples():
for row in [tuple(x) for x in df.to_records(index=False)]:
    record_line_list.append("".join(left_align_gen(length, value) for length, value in zip(widths, row)))

with open('output.txt', 'w') as f:
    f.write('\n'.join(record_line_list))

Github gist

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.