Python: read mixed float and string csv file

Question

I have a csv file with mixed floats, a string and an integer, the formatted output from a FORTRAN file. A typical line looks like:

 507.930    ,  24.4097    ,   1.0253E-04, O  III   ,    4

I want to read it while keeping the float decimal places unmodified, and check whether the first entry in each line is present in another list.

Using loadtxt and genfromtxt results in the number of decimal places changing from 3 (or 4) to 12.

How should I tackle this?

Tim Pietzcker · Accepted Answer · 2013-07-01 13:27:25Z

1

If you need to keep precision exactly, you need to use the decimal module. Otherwise, issues with floating point arithmetic limitations might trip you up.
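A minimal sketch of the decimal route, assuming every line has the five comma-separated fields shown in the question. The helper name parse_line and the variable names are hypothetical, chosen only for illustration:

```python
from decimal import Decimal

def parse_line(line):
    """Split one comma-separated line into (Decimal, Decimal, Decimal, str, int).

    Hypothetical helper: the five-field layout is assumed from the sample
    line in the question.
    """
    fields = [field.strip() for field in line.split(",")]
    return (
        Decimal(fields[0]),   # keeps the trailing zero of "507.930"
        Decimal(fields[1]),
        Decimal(fields[2]),
        fields[3],            # the string column, outer padding removed
        int(fields[4]),
    )

# The sample line from the question.
sample = " 507.930    ,  24.4097    ,   1.0253E-04, O  III   ,    4"
row = parse_line(sample)

print(row[0])             # 507.930
print(float("507.930"))   # 507.93 (a float drops the trailing zero)
```

Decimal keeps the digits exactly as written, but not necessarily the layout: the third field would print in plain notation rather than with the E-04 exponent. If the output must match the input text character for character, keeping the raw string is the simplest option.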

Chances are, though, that you don't really need that precision - just make sure you don't compare floats for equality exactly but always allow a fudge factor, and format the output to a limited number of significant digits:

import sys

# instead of if float1 == float2:, use this:
if abs(float1 - float2) <= sys.float_info.epsilon:
    print("equal")
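One caveat on the fudge factor: sys.float_info.epsilon is the gap between 1.0 and the next float, so as an absolute tolerance it only suits values of roughly that size. For numbers in the hundreds, like the first column here, a relative tolerance is probably the safer choice. A rough sketch, assuming Python 3.5 or later (where math.isclose is available); the values a and b are made up for illustration:

```python
import math
import sys

# Two values that differ only by a tiny rounding-sized error.
a = 507.93
b = 507.93 + 1e-12

# An absolute tolerance of machine epsilon is too strict at this magnitude.
strict = abs(a - b) <= sys.float_info.epsilon

# math.isclose uses a relative tolerance (default rel_tol=1e-09).
relaxed = math.isclose(a, b)

print(strict)    # False
print(relaxed)   # True
```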

edited Jul 1, 2013 at 13:27

answered Jul 1, 2013 at 13:21

Tim Pietzcker


stderr · Accepted Answer · 2013-07-01 13:24:19Z

1

loadtxt appears to take a converters argument so something like:

import numpy
from decimal import Decimal
numpy.loadtxt(..., converters={0: Decimal,
                               1: Decimal,
                               2: Decimal})

Should work.

Decimals should work with whatever precision you require, although if you're doing significant number crunching, Decimal will be considerably slower than float. However, I assume you're just looking to transform the data without losing any precision, so this should be fine.

answered Jul 1, 2013 at 13:24

stderr


1 Comment

stderr Over a year ago

Actually, after a closer look at numpy.loadtxt, this likely won't work, as apparently the converter functions need to return a float. Of course, you could write a function yourself that attempts to preserve the precision from the input, but I think the conversion-to-float requirement is going to make this hard to work around.

dcnicholls · Accepted Answer · 2013-07-02 02:30:54Z

0

I finished up writing some string processing code. Not elegant but it works:

from numpy import loadtxt

stuff = loadtxt(fname1, skiprows=35, dtype="f10,f10,e10,S10,i1", delimiter=',')
stuff2 = loadtxt('keylines.txt')  # a list of the reference values
...  # open file for writing etc
for i in range(len(stuff)):
    bb = round(float(stuff[i][0]), 3)  # gets number back to correct decimal format
    cc = round(float(stuff[i][1]), 5)  # ditto
    dd = float(stuff[i][2])
    ee = stuff[i][3].replace(" ", "")  # gets rid of extra FORTRAN spaces
    ff = int(stuff[i][4])
    for item in stuff2:
        if bb == item:
            fn.write(str(bb) + ',' + "%1.5f" % cc + ',' + "%1.4e" % dd + ',' + ee + ',' + str(ff) + '\n')
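For comparison, here is a sketch of the same filtering done without any rounding, by treating the first field as a Decimal and passing matching lines through untouched. This is not the code above, just one possible alternative; the function name matching_lines is hypothetical, and the second sample line is invented purely to show a non-match:

```python
from decimal import Decimal, InvalidOperation

def matching_lines(lines, reference_values):
    """Yield the lines whose first comma-separated field equals a reference value.

    Hypothetical helper: reference_values is assumed to be an iterable of
    numeric strings such as "507.93".
    """
    wanted = {Decimal(value) for value in reference_values}
    for line in lines:
        first = line.split(",", 1)[0].strip()
        try:
            key = Decimal(first)
        except InvalidOperation:
            continue  # skip headers or malformed lines
        if key in wanted:
            yield line.rstrip("\n")  # original text, digits untouched

sample = [
    " 507.930    ,  24.4097    ,   1.0253E-04, O  III   ,    4\n",  # from the question
    " 500.000    ,  10.0000    ,   2.0000E-03, X  I     ,    1\n",  # made-up line
]

for line in matching_lines(sample, ["507.93"]):
    print(line)
```

Decimal("507.93") and Decimal("507.930") compare and hash as equal, so the set lookup works even when the reference list omits trailing zeros.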

answered Jul 2, 2013 at 2:30

dcnicholls
