Iterating numpy array to set variables

Question

I couldn't help wondering whether there is a way to do this in fewer lines:

import numpy as np

def load_data(symbol, time_frame, folder_name='candle_dfs'):
    data = np.loadtxt('{}/{}/{}-{}.csv'.format(folder_name, symbol, symbol, time_frame), delimiter=',', unpack=True, dtype=str, skiprows=1)
    date = data[0]
    openp = data[1]
    closep = data[2]
    highp = data[3]
    lowp = data[4]
    volume = data[5]
    return date, openp, closep, highp, lowp, volume

Basically, I have CSV files that I exported with pandas' DataFrame.to_csv(), and I now load them as NumPy arrays. The CSV file structure looks something like this:

DATE,OPEN,CLOSE,HIGH,LOW,VOLUME
07-01-2016 00:00:00,428.2,458.78,462.0,427.11,55448.62348451
14-01-2016 00:00:00,431.09,419.55,435.0,352.5,351431.25461113
21-01-2016 00:00:00,419.65,394.97,424.57,371.25,180450.95451554
28-01-2016 00:00:00,394.7,368.98,395.48,360.03,161054.42792964

When I load a file with numpy.loadtxt() using unpack=True, each column becomes its own array, which I can then assign to a variable and use later. The code above works. However, I'm wondering whether this part can be done in fewer lines:

date = data[0]
openp = data[1]
closep = data[2]
highp = data[3]
lowp = data[4]
volume = data[5]

Thank you very much for helping!

BENY · Accepted Answer · 2018-03-08 20:10:36Z

2

Read your data into a DataFrame with pd.read_csv().

Then

d=dict(zip(list(df),df.T.values))
d
Out[104]: 
{'CLOSE': array([458.78, 419.55, 394.97, 368.98], dtype=object),
 'DATE': array(['07-01-2016 00:00:00', '14-01-2016 00:00:00',
        '21-01-2016 00:00:00', '28-01-2016 00:00:00'], dtype=object),
 'HIGH': array([462.0, 435.0, 424.57, 395.48], dtype=object),
 'LOW': array([427.11, 352.5, 371.25, 360.03], dtype=object),
 'OPEN': array([428.2, 431.09, 419.65, 394.7], dtype=object),
 'VOLUME': array([55448.62348451, 351431.25461113, 180450.95451554, 161054.42792964],
       dtype=object)}
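If separate variables are still wanted from a dictionary like this, one possible approach is operator.itemgetter from the standard library. The sketch below uses a hypothetical stand-in dictionary (plain lists rather than arrays, only two rows) so that it runs on its own:

```python
from operator import itemgetter

# Hypothetical stand-in for the dictionary built above (lists instead of arrays).
d = {
    'DATE': ['07-01-2016 00:00:00', '14-01-2016 00:00:00'],
    'OPEN': [428.2, 431.09],
    'CLOSE': [458.78, 419.55],
    'HIGH': [462.0, 435.0],
    'LOW': [427.11, 352.5],
    'VOLUME': [55448.62348451, 351431.25461113],
}

# With several keys, itemgetter returns a tuple of the values in the order requested,
# so the result can be unpacked straight into variables.
date, openp, closep, highp, lowp, volume = itemgetter(
    'DATE', 'OPEN', 'CLOSE', 'HIGH', 'LOW', 'VOLUME')(d)
```

Naming the keys explicitly avoids relying on the dictionary's ordering.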

Update: to get separate variables directly, unpack the transposed values:

D,O,C,H,L,V=df.T.values
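Note that df.T.values mixes the date strings with the numbers, which is why the arrays shown earlier have dtype=object. A possible variant, sketched here with an in-memory copy of two rows from the question's sample (the csv_text variable is only for illustration), unpacks column by column so the numeric columns keep a float dtype:

```python
import io
import pandas as pd

# In-memory stand-in for one of the CSV files (assumption: same header layout).
csv_text = """DATE,OPEN,CLOSE,HIGH,LOW,VOLUME
07-01-2016 00:00:00,428.2,458.78,462.0,427.11,55448.62348451
14-01-2016 00:00:00,431.09,419.55,435.0,352.5,351431.25461113
"""
df = pd.read_csv(io.StringIO(csv_text))

# One array per column, in file order; each array keeps its own column's dtype.
D, O, C, H, L, V = (df[col].to_numpy() for col in df.columns)
```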

edited Mar 8, 2018 at 20:10

answered Mar 8, 2018 at 18:46


4 Comments

Man Nguyen Over a year ago

Thank you for your response! This is basically the same as loading it into numpy, except that yours is a dictionary and I can access the columns with d['DATE'] and so on. However, if I still want to assign each key to a variable, I would need something like date = d['DATE'], close = d['CLOSE'], ... I'm interested in knowing whether there is a way to loop through the dict and assign each value to a variable, maybe something like date, close, open, high, low, volume = d[k] for k,v in d, but this doesn't work.

Man Nguyen Over a year ago

Yes, thank you, that is exactly what I was looking for. I was close, haha, I just made it way more complicated than it should be :P. Thanks so much!

BENY Over a year ago

@MightAsWell If this is what you need, could you consider accepting it?

Man Nguyen Over a year ago

Ah yes, just did. I forgot to do that. Thank you again.

hpaulj · Answer · 2018-03-08 22:07:02Z

0

I was going to suggest the unpack parameter - but you are already using it. Why didn't you go all the way and do:

date, openp, closep, ... = np.loadtxt(....)

With unpack=True, loadtxt returns the array transposed, so iterating over it (or unpacking it) yields the columns.

You could split that line into:

def load_data(symbol, time_frame, folder_name='candle_dfs'):
    data = np.loadtxt('{}/{}/{}-{}.csv'.format(folder_name, symbol, symbol, time_frame), delimiter=',', unpack=True, dtype=str, skiprows=1)
    date, openp, closep, highp, lowp, volume = data
    return date, openp, closep, highp, lowp, volume

But the return from load_data is a tuple, without these names. You'd still have to do

date, openp, closep,... = load_data(...)

to get the variables in the main space. Why do all the extra assignment within the function? Just

return data
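As a rough sketch of that simplification: the helper below is hypothetical (it takes a path or file-like object instead of building the path from symbol and time_frame) and simply returns what loadtxt gives back, leaving the unpacking to the caller. The in-memory CSV is two rows from the question's sample.

```python
import io
import numpy as np

def load_columns(source):
    """Hypothetical simplified loader: `source` is a path or a file-like object."""
    # unpack=True transposes the result, so the first axis runs over the CSV columns.
    return np.loadtxt(source, delimiter=',', unpack=True, dtype=str, skiprows=1)

csv_text = (
    "DATE,OPEN,CLOSE,HIGH,LOW,VOLUME\n"
    "07-01-2016 00:00:00,428.2,458.78,462.0,427.11,55448.62348451\n"
    "14-01-2016 00:00:00,431.09,419.55,435.0,352.5,351431.25461113\n"
)

# The caller names the columns once, at the point of use.
date, openp, closep, highp, lowp, volume = load_columns(io.StringIO(csv_text))
```

Because dtype=str is used, every column comes back as strings; the numeric ones would still need converting, for example with closep.astype(float).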

Without unpack, data2d will be an (n,6) array of string dtype. The columns are accessible by indexing

date = data2d[:,0]
openp = data2d[:,1]
etc

Or do what the unpack does and transpose it so the first dimension is the file columns, and then unpack that.

date, openp, ... = data2d.T

The relevant line from the docs is:

If True, the returned array is transposed, so that arguments may be unpacked using x, y, z = loadtxt(...).... Default is False.
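To illustrate the equivalence the docs describe, here is a small check on two rows from the question's sample. It restricts the read to the numeric columns with usecols (an assumption made for this sketch, so the result is a float array) and compares the unpack=True result with the transpose of the default result:

```python
import io
import numpy as np

csv_text = (
    "DATE,OPEN,CLOSE,HIGH,LOW,VOLUME\n"
    "07-01-2016 00:00:00,428.2,458.78,462.0,427.11,55448.62348451\n"
    "14-01-2016 00:00:00,431.09,419.55,435.0,352.5,351431.25461113\n"
)
numeric_cols = (1, 2, 3, 4, 5)  # skip the DATE column so everything parses as float

# Default layout: one row per CSV line, shape (n_rows, n_cols).
data2d = np.loadtxt(io.StringIO(csv_text), delimiter=',', skiprows=1,
                    usecols=numeric_cols)

# unpack=True: same data, transposed to shape (n_cols, n_rows).
unpacked = np.loadtxt(io.StringIO(csv_text), delimiter=',', skiprows=1,
                      usecols=numeric_cols, unpack=True)

same = np.array_equal(unpacked, data2d.T)
openp, closep, highp, lowp, volume = unpacked
```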

edited Mar 8, 2018 at 22:07

answered Mar 8, 2018 at 21:15


2 Comments

Man Nguyen Over a year ago

Well, to be completely honest with you, I'm super new to Python, so there are a lot of things that I don't know, haha. But you are right, I did way too many extra assignments. What I have right now is df = pd.read_csv, and then I just return df.T.values; whenever I call the function I just do date, openp, closep, ... = load_data()

hpaulj Over a year ago

@MightAsWell, I tend to discourage unpack because having an array of rows and columns like the csv is usually more useful than splitting the columns into separate variables. Even with a structured array result, access by field name can be more convenient.
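One way to get such a structured array is np.genfromtxt with names=True, which takes the field names from the header row. This is a sketch on two rows from the question's sample rather than something used in the thread itself; dtype=None asks NumPy to infer a type per column.

```python
import io
import numpy as np

csv_text = (
    "DATE,OPEN,CLOSE,HIGH,LOW,VOLUME\n"
    "07-01-2016 00:00:00,428.2,458.78,462.0,427.11,55448.62348451\n"
    "14-01-2016 00:00:00,431.09,419.55,435.0,352.5,351431.25461113\n"
)

# names=True reads the header line as field names; dtype=None infers each column's type.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=',', names=True,
                     dtype=None, encoding='utf-8')

closep = data['CLOSE']   # float column, selected by field name
first_row = data[0]      # one record holding all six fields
```

The rows stay together in a single array, and columns are selected by name only where needed.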
