Question: is my method of converting a NumPy array of numbers to a NumPy array of strings, with a specific number of decimal places AND trailing zeros removed, the 'best' way?

import numpy as np
x = np.array([1.12345, 1.2, 0.1, 0, 1.230000])
# np.char.rstrip is the same function as np.core.defchararray.rstrip
print(np.char.rstrip(np.char.mod('%.4f', x), '0'))

outputs:

['1.1235' '1.2' '0.1' '0.' '1.23']

which is the desired result. (I am OK with the rounding issue)
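As a minimal sketch of what the one-liner does, the two passes can be written out separately. Each call returns a new string array, so two intermediate copies are made. The sample values are chosen for illustration only (they are not the values above), and np.char.rstrip is used as the public alias of np.core.defchararray.rstrip.

```python
import numpy as np

# Illustrative values (an assumption, not the array from the post).
x = np.array([1.5, 1.2, 0.1, 0.0, 1.23])

# Pass 1: format every element with exactly four decimal places.
fixed = np.char.mod('%.4f', x)

# Pass 2: strip trailing '0' characters from every formatted string.
stripped = np.char.rstrip(fixed, '0')

print(fixed)
print(stripped)
```

A whole number keeps its decimal point after stripping ('0.'), which matches the output shown above.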

Both 'rstrip' and 'mod' are NumPy functions, so this is fast, but is there a way to accomplish this with ONE built-in NumPy function (i.e., does 'mod' have an option that I couldn't find)? That would save the overhead of returning copies twice, which is slow-ish for very large arrays.

thanks!

  • why don't you just use print np.char.mod('%0.4f', x)? Commented Aug 14, 2014 at 19:32
  • @Dalek because that would not remove trailing zeros. I want to remove the zeros because it will make my files smaller. I am manually creating some ASCII GIS rasters and would prefer to keep the large files as small as possible. Speed-wise, the additional operation to remove trailing zeros is not a big deal, so I consider it worth it for the gain of having smaller files. It would be fine for a few files to be larger than needed, but I'm planning to do some quite large-scale stuff... it'll add up. So, I am OK with the speed of what I use, but I am curious if anyone has a slicker way. Commented Aug 14, 2014 at 20:14
  • If you are OK with 5 "significant digits" instead of 4 decimal places, you could use np.char.mod("%.5g", x). Commented Aug 14, 2014 at 20:28
  • What version of numpy are you using? In the latest version of numpy, savetxt accepts a file handle: docs.scipy.org/doc/numpy/reference/generated/numpy.savetxt.html Commented Aug 14, 2014 at 20:45
  • Related: stackoverflow.com/questions/24691755/… Commented Aug 14, 2014 at 20:53

1 Answer

Thanks to Warren Weckesser for providing valuable comments. Credit to him.

I converted my code to use:

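# Integers are written as-is; floats get a fixed number of decimal places.
formatter = '%d'
if num_type == 'float':
  formatter = '%%.%df' % decimals  # e.g. decimals=4 gives '%.4f'
np.savetxt(out, arr, fmt=formatter)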

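where out is a file handle to which I had already written my headers. Alternatively, I could use the header= argument of np.savetxt (by default each header line is prefixed with the comments string '# ', so pass comments='' if the header must be written verbatim). I have no clue how I didn't see those options in the documentation.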
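A minimal, self-contained sketch of that approach follows. The header text, the array values, and the settings for num_type and decimals are illustrative assumptions, and io.StringIO stands in for a real file handle (this assumes a NumPy version whose savetxt accepts text-mode handles).

```python
import io
import numpy as np

# Hypothetical inputs for illustration only.
arr = np.array([[1.5, 1.2], [0.1, 0.0]])
num_type = 'float'
decimals = 4

# Stand-in for an open file; the header lines here are made up.
out = io.StringIO()
out.write('ncols 2\nnrows 2\n')

# Same format selection as in the snippet above.
formatter = '%d'
if num_type == 'float':
    formatter = '%%.%df' % decimals

# savetxt appends the rows after whatever was already written to the handle.
np.savetxt(out, arr, fmt=formatter)

text = out.getvalue()
print(text)
```

Note that a fixed '%.4f' format writes trailing zeros; this sketch only shows the header-then-data pattern with a file handle.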

For a 1300 by 1300 numpy array, creating the output line by line as I did before (using np.core.defchararray.rstrip(np.char.mod('%.4f', x), '0')) took ~1.7 seconds, while np.savetxt took 0.48 seconds.

So np.savetxt is a cleaner, more readable, and faster solution.

Note: I did try:

np.savetxt(out, arr, fmt='%.4g')

in an effort to avoid a switch based on number type, but it did not work as I had hoped.
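The answer does not say what went wrong, but one likely reason is that '%g' counts significant digits rather than decimal places and switches to scientific notation for very small or very large values. A short sketch with made-up sample values, using plain Python % formatting (the same style of format string that fmt takes):

```python
# Illustrative values (an assumption, not data from the post).
values = [1234.5678, 12.34567, 1.23, 0.0, 0.00001236]

# '%.4f' keeps four decimal places; '%.4g' keeps four significant digits.
as_f = ['%.4f' % v for v in values]
as_g = ['%.4g' % v for v in values]

for v, f, g in zip(values, as_f, as_g):
    print(v, f, g)
```

Here 1234.5678 becomes '1235' under '%.4g', losing all decimals, while 1.23 becomes '1.23' with its trailing zeros already removed.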
