I can't help wondering whether there is a way to do this in fewer lines:

import numpy as np

def load_data(symbol, time_frame, folder_name='candle_dfs'):
    data = np.loadtxt('{}/{}/{}-{}.csv'.format(folder_name, symbol, symbol, time_frame), delimiter=',', unpack=True, dtype=str, skiprows=1)
    date = data[0]
    openp = data[1]
    closep = data[2]
    highp = data[3]
    lowp = data[4]
    volume = data[5]
    return date, openp, closep, highp, lowp, volume

Basically, I have CSV files that I exported with pandas' DataFrame.to_csv(), and I now load them as NumPy arrays. The CSV file structure looks something like this:

DATE,OPEN,CLOSE,HIGH,LOW,VOLUME
07-01-2016 00:00:00,428.2,458.78,462.0,427.11,55448.62348451
14-01-2016 00:00:00,431.09,419.55,435.0,352.5,351431.25461113
21-01-2016 00:00:00,419.65,394.97,424.57,371.25,180450.95451554
28-01-2016 00:00:00,394.7,368.98,395.48,360.03,161054.42792964

When I load a file with numpy.loadtxt() and unpack=True, each column of the CSV becomes its own array, which I can then assign to a variable and use later. The code above works. However, I'm wondering whether this part can be written in fewer lines:

date = data[0]
openp = data[1]
closep = data[2]
highp = data[3]
lowp = data[4]
volume = data[5]

Thank you very much for helping!

2 Answers

2

Read your data into a DataFrame with pd.read_csv(). Then:

d=dict(zip(list(df),df.T.values))
d
Out[104]: 
{'CLOSE': array([458.78, 419.55, 394.97, 368.98], dtype=object),
 'DATE': array(['07-01-2016 00:00:00', '14-01-2016 00:00:00',
        '21-01-2016 00:00:00', '28-01-2016 00:00:00'], dtype=object),
 'HIGH': array([462.0, 435.0, 424.57, 395.48], dtype=object),
 'LOW': array([427.11, 352.5, 371.25, 360.03], dtype=object),
 'OPEN': array([428.2, 431.09, 419.65, 394.7], dtype=object),
 'VOLUME': array([55448.62348451, 351431.25461113, 180450.95451554, 161054.42792964],
       dtype=object)}

Update: to unpack the columns straight into variables:

D,O,C,H,L,V=df.T.values
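
One thing to be aware of: as the output above shows, df.T.values gives arrays with dtype=object, because transposing mixes the string DATE column with the numeric ones. A minimal sketch of an alternative that unpacks column by column and so keeps each column's own dtype is below. The in-memory csv_text is only a stand-in for one of the real files, and the variable names are taken from the question.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for one of the exported CSV files.
csv_text = """DATE,OPEN,CLOSE,HIGH,LOW,VOLUME
07-01-2016 00:00:00,428.2,458.78,462.0,427.11,55448.62348451
14-01-2016 00:00:00,431.09,419.55,435.0,352.5,351431.25461113
"""

df = pd.read_csv(io.StringIO(csv_text))

# One array per column, in file order; numeric columns stay float64.
date, openp, closep, highp, lowp, volume = (df[col].to_numpy() for col in df.columns)

print(closep.dtype)  # float64
```

This relies on the column order in the file matching the order of the names on the left-hand side.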

4 Comments

Thank you for your response! This is basically the same as loading the data into NumPy, except that yours is a dictionary, so I can access the columns with d['DATE'] and so on. However, if I still want each key assigned to its own variable, I would still need something like date = d['DATE'], close = d['CLOSE'], ... I'm interested in knowing whether there is a way to loop through the dict and assign each value to a variable, maybe something like date, close, open, high, low, volume = d[k] for k,v in d, but this doesn't work.
Yes, thank you, that is exactly what I was looking for. I was close, haha, I just made it way more complicated than it should be :P. Thanks so much!
@MightAsWell If this is what you need, could you consider accepting it?
Ah yes, just did. I forgot to do that. Thank you again.
0

I was going to suggest the unpack parameter - but you are already using it. Why didn't you go all the way and do:

date, openp, closep, ... = np.loadtxt(....)

With unpack=True, loadtxt returns the array transposed, so each column of the file can be unpacked into its own variable.

You could split that line into:

def load_data(symbol, time_frame, folder_name='candle_dfs'):
    data = np.loadtxt('{}/{}/{}-{}.csv'.format(folder_name, symbol, symbol, time_frame), delimiter=',', unpack=True, dtype=str, skiprows=1)
    date, openp, closep, highp, lowp, volume = data
    return date, openp, closep, highp, lowp, volume

But the return from load_data is a tuple, without these names. You'd still have to do

date, openp, closep,... = load_data(...)

to get the variables in the main space. Why do all the extra assignment within the function? Just

return data
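
Put together, the trimmed-down function might look like the sketch below. It keeps the path layout and loadtxt arguments from the question; only the intermediate assignments are dropped.

```python
import numpy as np

def load_data(symbol, time_frame, folder_name='candle_dfs'):
    # Same path layout as in the question: <folder>/<symbol>/<symbol>-<time_frame>.csv
    path = '{}/{}/{}-{}.csv'.format(folder_name, symbol, symbol, time_frame)
    # unpack=True transposes the result, so iterating over it yields one CSV column at a time.
    return np.loadtxt(path, delimiter=',', unpack=True, dtype=str, skiprows=1)
```

The caller then does the unpacking, for example date, openp, closep, highp, lowp, volume = load_data('BTCUSD', '1W'), where the symbol and time frame are made-up values for illustration.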

Without unpack, the result (call it data2d) will be an (n, 6) array of string dtype. The columns are accessible by indexing:

date = data2d[:,0]
openp = data2d[:,1]
etc

Or do what the unpack does and transpose it so the first dimension is the file columns, and then unpack that.

date, openp, ... = data2d.T

The relevant line from the docs is:

If True, the returned array is transposed, so that arguments may be unpacked using x, y, z = loadtxt(...).... Default is False.
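
To see the two forms side by side, here is a small sketch that loads the same data with and without unpack. The in-memory csv_text stands in for a real file, and the conversion of closep to float at the end is an extra step not in the original code (with dtype=str everything comes back as strings).

```python
import io
import numpy as np

# Hypothetical in-memory stand-in for one of the CSV files.
csv_text = """DATE,OPEN,CLOSE,HIGH,LOW,VOLUME
07-01-2016 00:00:00,428.2,458.78,462.0,427.11,55448.62348451
14-01-2016 00:00:00,431.09,419.55,435.0,352.5,351431.25461113
"""

# Without unpack: one row per CSV line.
data2d = np.loadtxt(io.StringIO(csv_text), delimiter=',', dtype=str, skiprows=1)
# With unpack=True: the same array, transposed.
unpacked = np.loadtxt(io.StringIO(csv_text), delimiter=',', dtype=str, skiprows=1, unpack=True)

print(data2d.shape)    # (2, 6)
print(unpacked.shape)  # (6, 2)

# Transposing by hand gives the same columns as unpack=True.
date, openp, closep, highp, lowp, volume = data2d.T

# The values are still strings; convert the numeric columns when needed.
closep = closep.astype(float)
```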

2 Comments

Well, to be completely honest with you, I'm super new to Python, so there are a lot of things that I don't know, haha. But you are right, I did way too many extra assignments. What I have right now is df = pd.read_csv, and then I just return df.T.values; whenever I call the function, I do date, openp, closep, ... = load_data()
@MightAsWell, I tend to discourage unpack because having an array of rows and columns like the csv is usually more useful than splitting the columns into separate variables. Even with a structured array result, access by field name can be more convenient.
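
As a rough illustration of the structured-array idea, one option is np.genfromtxt with names=True, which takes the field names from the header row. This is a sketch rather than the code used in the thread, and the in-memory csv_text is a stand-in for a real file.

```python
import io
import numpy as np

# Hypothetical in-memory stand-in for one of the CSV files.
csv_text = """DATE,OPEN,CLOSE,HIGH,LOW,VOLUME
07-01-2016 00:00:00,428.2,458.78,462.0,427.11,55448.62348451
14-01-2016 00:00:00,431.09,419.55,435.0,352.5,351431.25461113
"""

# names=True reads the field names from the header line;
# dtype=None lets genfromtxt infer a type for each column.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=',', names=True,
                     dtype=None, encoding='utf-8')

print(data.dtype.names)  # ('DATE', 'OPEN', 'CLOSE', 'HIGH', 'LOW', 'VOLUME')

# Columns are accessed by field name, and each keeps its own dtype.
closep = data['CLOSE']
first_row = data[0]
```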
