0

I'm really sorry if this is a lame question, but I think this may potentially help others making the same transition from C to Python. I have a program that I started writing in C, but I think it's best if I did it in Python because it just makes my life a lot easier.

My program retrieves intraday stock data from Yahoo! Finance and stores it inside of a struct. Since I'm so used to programming in C I generally try to do things the hard way. What I want to know is what's the most "Pythonesque" way of storing the data into an organized fashion. I was thinking an array of tuples?

Here's a bit of my C program.

// Parses intraday stock quote data from a Yahoo! Finance .csv file. 
void parse_intraday_data(struct intraday_data *d, char *path)
{
    char cur_line[100];
    char *csv_value;
    int i;

    FILE *data_file = fopen(path, "r");

    if (data_file == NULL)
    {
        perror("Error opening file.");
        return;
    }

    // Ignore the first 15 lines.
    for (i = 0; i < 15; i++) 
        fgets(cur_line, 100, data_file);

    i = 0;

    while (fgets(cur_line, 100, data_file) != NULL) {
        csv_value = strtok(cur_line, ",");
        csv_value = strtok(NULL, ",");
        d->close[i] = atof(csv_value);

        csv_value = strtok(NULL, ",");
        d->high[i] = atof(csv_value);

        csv_value = strtok(NULL, ",");
        d->low[i] = atof(csv_value);

        csv_value = strtok(NULL, ",");
        d->open[i] = atof(csv_value);

        csv_value = strtok(NULL, "\n");
        d->volume[i] = atoi(csv_value);

        i++;
    }

    d->close[i] = 0;
    d->high[i] = 0;
    d->low[i] = 0;
    d->open[i] = 0;
    d->volume[i] = 0;
    d->count = i - 1;
    i = 0;

    fclose(data_file);
}

So far my Python program retrieves the data like this.

response = urllib2.urlopen('https://www.google.com/finance/getprices?i=' + interval +     '&p=' + period + 'd&f=d,o,h,l,c,v&df=cpct&q=' + ticker)

Question is, what's the best or most elegant way of storing this data in Python?

4
  • So youre using Yahoo's Finance API? In your code example you are using google's finance API, which no longer exists. Commented Jul 23, 2013 at 7:23
  • What does your struct look like? Commented Jul 23, 2013 at 7:23
  • @Alex Works just fine for me. ;-) Commented Jul 23, 2013 at 7:25
  • Yeah, the Google Finance method still works although the site says it no longer exists. Commented Jul 23, 2013 at 7:35

3 Answers 3

2

Keep it simple. Read the line, split it by commas, and store the values inside a (named)tuple. That’s pretty close to using a struct in C.

If your program gets more elaborate it might (!) make sense to replace the tuple by a class, but not immediately.

Here’s an outline:

from collections import namedtuple
IntradayData = namedtuple('IntradayData',
        ['close', 'high', 'low', 'open', 'volume', 'count'])

response = urllib2.urlopen('https://www.google.com/finance/getprices?q=AAPL')
result=response.read().split('\n')
result = result[15 :] # Your code does this, too. Not sure why.

all_data = []
for i, data in enumerate(x):
    if data == '': continue
    c, h, l, o, v, _ = map(float, data.split(','))
    all_data.append(IntradayData(c, h, l, o, v, i))
Sign up to request clarification or add additional context in comments.

1 Comment

Thanks for the suggestion. Yeah, in the first snippet of code I was extracting data from a Yahoo! Finance csv file which begins with 15 lines that aren't actual data.
0

I believe it depends on how much data manipulation you will want to do after retrieving data. If, for example, you plan to just print them on the screen then an array of tuple would do.

However, if you need to be able to sort, search, and other kind of data manipulation I believe a custom class could help: you would then work with a list (or even a home-brew container) of custom objects, allowing you for easily adding custom methods, based on your need.

Note that this is just my opinion, and I'm not an advanced Python developer.

Comments

0

Pandas (http://pandas.pydata.org/pandas-docs/stable/) is particularly well suited to this. Numpy is a little lower level, but may also suit your purposes. I really recommend going the pandas route, though. Either way you shouldn't lose too much of C's speed, so that's a plus.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.