
I need to somehow make numpy load in both text and numbers.

I am getting this error:

Traceback (most recent call last):
  File "ip00ktest.py", line 13, in <module>
    File = np.loadtxt(str(z[1]))        #load spectrum file 
  File "/usr/lib64/python2.6/site-packages/numpy/lib/npyio.py", line 805, in loadtxt
    items = [conv(val) for (conv, val) in zip(converters, vals)]
ValueError: invalid literal for float(): EFF

This happens because the file I'm loading has text in it. I need each word stored in an array element, along with the data below it. How do I do that?

Edit: Sorry for not giving an example. Here is what my file looks like.

FF   3500.  GRAVITY 0.00000  SDSC GRID  [+0.0]   VTURB 2.0 KM/S    L/H 1.25                            
  wl(nm)    Inu(ergs/cm**2/s/hz/ster) for 17 mu in 1221 frequency intervals
            1.000   .900  .800  .700  .600  .500  .400  .300  .250  .200  .150  .125  .100  .075  .050  .025  .010
    9.09 0.000E+00     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
    9.35 0.000E+00     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
    9.61 0.000E+00     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
    9.77 0.000E+00     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
    9.96 0.000E+00     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0

There are thousands of numbers below the ones shown here. The file also contains several datasets: the header shown at the top repeats, followed by a new set of numbers.

Code that Fails:

import sys
import numpy as np
from math import *

print 'Number of arguments:', len(sys.argv), 'arguments.'
print 'Argument List:', str(sys.argv)

z = np.array(sys.argv)          #store all of the file names into array

i = len(sys.argv)           #the length of the filenames array

File = np.loadtxt(str(z[1]))        #load spectrum file 
  • you need to give more details about the structure of your file. Commented May 17, 2013 at 16:21
  • how big is the file? Can you load the whole thing in memory? Commented May 17, 2013 at 16:59
  • I think so. I just tried to use lists, and readlines() doesn't give an error. I assume that means it is not too big? Commented May 17, 2013 at 17:04
  • You could split the file into separate chunks by looking at the lines where the header starts, then load those into numpy to create separate arrays. Or something like that. . . Commented May 17, 2013 at 17:33

2 Answers

3

If the line that messes it up always begins with EFF, then you can ignore that line quite easily:

np.loadtxt(str(z[1]), comments='EFF')

This treats 'EFF' as the comment marker, so everything from 'EFF' to the end of the line is ignored; a line that begins with 'EFF' is skipped entirely.
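A minimal sketch of the `comments` argument in action, using a made-up three-line stand-in for the file (the contents and the name `sample` are assumptions, not the asker's data). In the excerpt shown in the question, the second and third header lines do not contain 'EFF', so they would likely still need separate handling.

```python
import io
import numpy as np

# Hypothetical miniature file: one header line starting with 'EFF',
# followed by purely numeric rows.
sample = io.StringIO(
    u"EFF   3500.  GRAVITY 0.00000\n"
    u"    9.09 0.000E+00     0\n"
    u"    9.35 0.000E+00     0\n"
)

# Everything from 'EFF' to the end of a line is discarded, so the
# header line becomes empty and is skipped.
data = np.loadtxt(sample, comments='EFF')
print(data.shape)
```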


1 Comment

I really wish I had seen this earlier. I ended up having to make a counter in the file (since the data sets within were uniform), and having certain arrays reset to 0 after it reached a new "EFF" line. You have just saved me a lot of time in the future though. Much appreciated.
1

To read the numbers, use the skiprows parameter of numpy.loadtxt to skip the header. Write custom code to read the header, because it seems to have an irregular format.
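As a rough illustration of `skiprows`, here is a sketch that assumes the file holds a single dataset with a three-line header; the shortened contents and the name `sample` are invented for the example.

```python
import io
import numpy as np

# Hypothetical stand-in for a file with one three-line header.
sample = io.StringIO(
    u"FF   3500.  GRAVITY 0.00000\n"
    u"  wl(nm)    Inu\n"
    u"            1.000   .900\n"
    u"    9.09 0.000E+00     0\n"
    u"    9.35 0.000E+00     0\n"
)

# Skip the first three lines, then parse the rest as floats.
data = np.loadtxt(sample, skiprows=3)
print(data.shape)
```

This only skips lines at the top of the file, so it does not help once a second header appears further down.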

NumPy is most useful with homogeneous numerical data, so don't try to put the strings in there.

2 Comments

This would be useful, except that the header repeats itself. It'll just screw up once it reaches the header again for a new set of data within the file.
@user2378781 Then read the file line by line, examine the string and decide what to do with each line.
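The line-by-line approach could look something like the sketch below. It rests on two assumptions the question does not confirm: every dataset starts with a line whose first token is not a number, and every header is exactly three lines long. The helper names `is_number` and `read_datasets`, and the shortened sample text, are made up for illustration.

```python
import numpy as np

def is_number(token):
    """Return True if token parses as a float."""
    try:
        float(token)
        return True
    except ValueError:
        return False

def read_datasets(lines, header_len=3):
    """Split an iterable of lines into (header, data) pairs.

    Assumes each dataset begins with a non-numeric first token and
    that each header is exactly header_len lines long.
    """
    lines = [ln for ln in lines if ln.strip()]   # ignore blank lines
    blocks = []
    i = 0
    while i < len(lines):
        if not is_number(lines[i].split()[0]):
            # Start of a new dataset: keep the raw header text.
            header = [ln.rstrip("\n") for ln in lines[i:i + header_len]]
            blocks.append((header, []))
            i += header_len
        else:
            if not blocks:
                raise ValueError("numeric row found before any header")
            blocks[-1][1].append([float(t) for t in lines[i].split()])
            i += 1
    # Convert each list of rows into a 2-D float array.
    return [(header, np.array(rows)) for header, rows in blocks]

# Shortened, made-up stand-in for the file in the question.
sample = """FF   3500.  GRAVITY 0.00000
  wl(nm)    Inu
            1.000   .900
    9.09 0.000E+00     0
    9.35 0.000E+00     0
FF   3750.  GRAVITY 0.00000
  wl(nm)    Inu
            1.000   .900
    9.09 1.000E+00     2
""".splitlines()

datasets = read_datasets(sample)
for header, data in datasets:
    print(header[0].split()[:2], data.shape)
```

For a real file, an open file object can be passed in place of the list of lines, since the function only iterates over it once. The header text stays in ordinary Python lists, while only the numeric rows go into NumPy arrays.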
