I have an array of tuples loaded from a CSV file with the np.genfromtxt() function.

import numpy as np
import re
def convert_string_to_bigint(x):
    p = re.compile(r'(\d{4})/(\d{1,2})/(\d{1,2}) (\d{1,2}):(\d{2}):\d{2}')
    m = p.findall(x)
    l = list(m[0])
    l[1] = ('0' + l[1])[-2:]
    l[2] = ('0' + l[2])[-2:]
    return long("".join(l))

#print convert_string_to_bigint("2012/7/2 14:07:00")
csv = np.genfromtxt ('sr00-1min.txt', delimiter=',', converters={0:convert_string_to_bigint})

A sample of the data in the CSV file:

2015/9/2 14:54:00,5169,5170,5167,5168
2015/9/2 14:55:00,5168,5169,5166,5166
2015/9/2 14:56:00,5167,5170,5165,5169
2015/9/2 14:57:00,5168,5173,5167,5172
2015/9/2 14:58:00,5172,5187,5171,5182
2015/9/2 14:59:00,5182,5183,5171,5176
2015/9/2 15:00:00,5176,5183,5174,5182

After loading, it looks like this:

[(201509021455L, 5168.0, 5169.0, 5166.0, 5166.0)
 (201509021456L, 5167.0, 5170.0, 5165.0, 5169.0)
 (201509021457L, 5168.0, 5173.0, 5167.0, 5172.0)
 (201509021458L, 5172.0, 5187.0, 5171.0, 5182.0)
 (201509021459L, 5182.0, 5183.0, 5171.0, 5176.0)
 (201509021500L, 5176.0, 5183.0, 5174.0, 5182.0)]

I want to convert it to a 2D NumPy array. It should look like this:

[[201509021455L, 5168.0, 5169.0, 5166.0, 5166.0]
 [201509021456L, 5167.0, 5170.0, 5165.0, 5169.0]
 [201509021457L, 5168.0, 5173.0, 5167.0, 5172.0]
 [201509021458L, 5172.0, 5187.0, 5171.0, 5182.0]
 [201509021459L, 5182.0, 5183.0, 5171.0, 5176.0]
 [201509021500L, 5176.0, 5183.0, 5174.0, 5182.0]]

I used the code below to solve the problem, but it looks extremely ugly. Could anyone tell me how to do the conversion in a more elegant way?

pool = np.asarray([x for x in csv if x[0] > 201508010000])
sj = np.asarray([x[0] for x in pool])
kpj = np.asarray([x[1] for x in pool])
zgj = np.asarray([x[2] for x in pool])
zdj = np.asarray([x[3] for x in pool])
spj = np.asarray([x[4] for x in pool])
output = np.column_stack((sj,kpj,zgj,zdj,spj))
print output.shape
  • What does the csv look like? Commented Sep 6, 2015 at 10:56
  • What do you mean by a 2D array? What output do you want for the input you have? Commented Sep 6, 2015 at 11:01
  • What's the expected output? Commented Sep 6, 2015 at 11:04
  • You can't get a 2D array with one column of longs (the L values) and the others floats. Instead, genfromtxt gave you a 1D structured array. You can get a 2D array of all floats. Commented Sep 6, 2015 at 15:52

1 Answer

In convert_string_to_bigint, change

return long("".join(l))

to

return float("".join(l))

Then genfromtxt will recognize all values as floats, and return a 2D array of float dtype:

In [23]: np.genfromtxt ('sr00-1min.txt', delimiter=',', converters={0:convert_string_to_bigint}).shape
Out[23]: (7, 5)

instead of a 1D structured array of mixed dtype.
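
For reference, here is a minimal sketch of that change, written for Python 3 (where long no longer exists) and run against an in-memory copy of a few sample rows rather than the real file. The names SAMPLE and convert_string_to_float are made up for this illustration, and unlike the original converter this one also zero-pads the hour. Once the result is a plain 2D float array, the filter from the question can probably be written as a single boolean mask:

```python
import io
import re

import numpy as np

# A few rows copied from the question; io.StringIO stands in for 'sr00-1min.txt'.
SAMPLE = """\
2015/9/2 14:54:00,5169,5170,5167,5168
2015/9/2 14:55:00,5168,5169,5166,5166
2015/9/2 15:00:00,5176,5183,5174,5182
"""

_PATTERN = re.compile(r'(\d{4})/(\d{1,2})/(\d{1,2}) (\d{1,2}):(\d{2}):\d{2}')


def convert_string_to_float(x):
    """Turn '2015/9/2 14:54:00' into 201509021454.0 (seconds are dropped)."""
    if isinstance(x, bytes):  # some NumPy versions hand bytes to converters
        x = x.decode()
    m = _PATTERN.match(x)
    if m is None:
        raise ValueError("unrecognised timestamp: %r" % (x,))
    year, month, day, hour, minute = m.groups()
    # Assumption: the hour is zero-padded too, which the original code did not do.
    return float(year + month.zfill(2) + day.zfill(2) + hour.zfill(2) + minute)


# Every column now converts to float, so the result is an ordinary 2D array.
data = np.genfromtxt(io.StringIO(SAMPLE), delimiter=',',
                     converters={0: convert_string_to_float},
                     encoding='utf-8')
print(data.shape)  # (3, 5)
print(data.dtype)  # float64

# One boolean mask on the first column replaces the per-column list comprehensions.
output = data[data[:, 0] > 201509021454]
print(output.shape)  # (2, 5)
```

The 12-digit timestamps stay exact as floats because they are far below 2**53, the limit up to which float64 represents every integer exactly.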
