0

Below is the data returned from a call to the Yahoo! Finance API. The data is all there, but it is one giant string. Any bright ideas on how to convert it to an array with Date, Open, High, Low, Close, Volume, Adj Close as the columns? I know I could convert it to a list and then convert that to an array using .reshape, since I know the order of the data, but I was wondering if there is a slicker way to do it. Thank you.

 Date,Open,High,Low,Close,Volume,Adj Close
    2011-01-31,603.60,604.47,595.55,600.36,2804900,600.36
    2011-01-28,619.07,620.36,599.76,600.99,4231100,600.99
    2011-01-27,617.89,619.70,613.25,616.79,2019200,616.79
    2011-01-26,620.33,622.49,615.28,616.50,2038100,616.50
    2011-01-25,608.20,620.69,606.52,619.91,3646800,619.91
    2011-01-24,607.57,612.49,601.23,611.08,4599200,611.08
    2011-01-21,639.58,641.73,611.36,611.83,8904400,611.83
    2011-01-20,632.21,634.08,623.29,626.77,5485800,626.77
    2011-01-19,642.12,642.96,629.66,631.75,3406100,631.75
    2011-01-18,626.06,641.99,625.27,639.63,3617000,639.63
    2011-01-14,617.40,624.27,617.08,624.18,2365600,624.18
    2011-01-13,616.97,619.67,614.16,616.69,1334000,616.69
    2011-01-12,619.35,619.35,614.77,616.87,1632700,616.87
    2011-01-11,617.71,618.80,614.50,616.01,1439300,616.01
    2011-01-10,614.80,615.39,608.56,614.21,1579200,614.21
    2011-01-07,615.91,618.25,610.13,616.44,2101200,616.44
    2011-01-06,610.68,618.43,610.05,613.50,2057800,613.50
    2011-01-05,600.07,610.33,600.05,609.07,2532300,609.07
    2011-01-04,605.62,606.18,600.12,602.12,1824500,602.12
    2011-01-03,596.48,605.59,596.48,604.35,2365200,604.35
    2010-12-31,596.74,598.42,592.03,593.97,1539300,593.97
    2010-12-30,598.00,601.33,597.39,598.86,989500,598.86
    2010-12-29,602.00,602.41,598.92,601.00,1019200,601.00
    2010-12-28,602.05,603.87,598.01,598.92,1064800,598.92
    2010-12-27,602.74,603.78,599.50,602.38,1208100,602.38
    2010-12-23,605.34,606.00,602.03,604.23,1110800,604.23
    2010-12-22,604.00,607.00,603.28,605.49,1207500,605.49
    2010-12-21,598.57,604.72,597.61,603.07,1879500,603.07
    2010-12-20,594.65,597.88,588.66,595.06,1973300,595.06
    2010-12-17,591.00,592.56,587.67,590.80,3087100,590.80
    2010-12-16,592.85,593.77,588.07,591.71,1596900,591.71
    2010-12-15,594.20,596.45,589.15,590.30,2167700,590.30
    2010-12-14,597.09,598.29,592.48,594.91,1643300,594.91
    2010-12-13,597.12,603.00,594.09,594.62,2398500,594.62
    2010-12-10,593.14,593.99,590.29,592.21,1704700,592.21
    2010-12-09,593.88,595.58,589.00,591.50,1868900,591.50
    2010-12-08,591.97,592.52,583.69,590.54,1756900,590.54
    2010-12-07,591.27,593.00,586.00,587.14,3042200,587.14
    2010-12-06,580.57,582.00,576.61,578.36,2093800,578.36
    2010-12-03,569.45,576.48,568.00,573.00,2631200,573.00
    2010-12-02,568.66,573.33,565.35,571.82,2547900,571.82
    2010-12-01,563.00,571.57,562.40,564.35,3754100,564.35
    2010-11-30,574.32,574.32,553.31,555.71,7117400,555.71
    2010-11-29,589.17,589.80,579.95,582.11,2859700,582.11
    2010-11-26,590.46,592.98,587.00,590.00,1311100,590.00
    2010-11-24,587.31,596.60,587.05,594.97,2396400,594.97
    2010-11-23,587.01,589.01,578.20,583.01,2162600,583.01
3
  • 1
    Python has built in CSV support. Check out: docs.python.org/2/library/csv.html Commented Mar 7, 2014 at 13:17
  • 2
    Does it have to be numpy? You can also use pandas for this. Commented Mar 7, 2014 at 13:18
  • import io; import pandas as pd; pd.read_csv(io.StringIO(giant_string)) Commented Mar 7, 2014 at 14:18

4 Answers

2

Basically, all you have to do is use the csv module:

import csv
with open(PathFile) as f:
    reader = csv.DictReader(f, skipinitialspace=True)
    for row in reader:
        # All the values within the cells will be strings;
        # you'll have to convert them to date, float, etc. (using numpy or not).
        print(row)
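
Since the API response is already a string rather than a file on disk, one option is to wrap it in io.StringIO before handing it to the reader. The following is only a sketch: giant_string is a shortened stand-in for the response in the question, and parse_quotes is a hypothetical helper name, not part of the csv module.

```python
import csv
import io
from datetime import datetime

# Assumption: a two-row stand-in for the full API response in the question.
giant_string = """Date,Open,High,Low,Close,Volume,Adj Close
    2011-01-31,603.60,604.47,595.55,600.36,2804900,600.36
    2011-01-28,619.07,620.36,599.76,600.99,4231100,600.99
"""

def parse_quotes(text):
    """Hypothetical helper: turn the CSV text into a list of typed dicts."""
    # StringIO makes the string behave like an open file.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    float_columns = ("Open", "High", "Low", "Close", "Adj Close")
    rows = []
    for row in reader:
        # csv yields strings only, so each column is converted explicitly.
        parsed = {"Date": datetime.strptime(row["Date"].strip(), "%Y-%m-%d").date()}
        for name in float_columns:
            parsed[name] = float(row[name])
        parsed["Volume"] = int(row["Volume"])
        rows.append(parsed)
    return rows

rows = parse_quotes(giant_string)
```

Each element of rows is then a dict keyed by the header names, which can be passed on to numpy or used directly.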

0

The csv module implements classes to read and write tabular data in CSV format. It allows programmers to say, “write this data in the format preferred by Excel,” or “read data from this file which was generated by Excel,” without knowing the precise details of the CSV format used by Excel. Programmers can also describe the CSV formats understood by other applications or define their own special-purpose CSV formats.

The csv module’s reader and writer objects read and write sequences. Programmers can also read and write data in dictionary form using the DictReader and DictWriter classes.

>>> import csv
>>> with open('eggs.csv', 'r') as csvfile:
...     spamreader = csv.reader(csvfile, delimiter=',')
...     for row in spamreader:
...         print(', '.join(row))

0

This worked for me and seems easier than using the csv methods.

import re
import numpy as np

string_list = re.split(',|\n', giant_string)
# Strip surrounding whitespace and take out blanks
string_list = [string.strip() for string in string_list if string.strip() != '']

# Integer division, because I know there are 7 headers
string_to_arr = np.array(string_list).reshape(len(string_list) // 7, 7)
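
A variant of the same idea splits on lines first and then on commas, so the column count does not have to be hard-coded and no regular expression is needed. This is a sketch only, with giant_string shortened to two rows as an assumption:

```python
import numpy as np

# Assumption: a two-row stand-in for the full API response in the question.
giant_string = """Date,Open,High,Low,Close,Volume,Adj Close
    2011-01-31,603.60,604.47,595.55,600.36,2804900,600.36
    2011-01-28,619.07,620.36,599.76,600.99,4231100,600.99
"""

# Drop empty lines and the indentation in front of each row.
lines = [line.strip() for line in giant_string.splitlines() if line.strip()]

header = lines[0].split(",")                               # column names
table = np.array([line.split(",") for line in lines[1:]])  # 2-D array of strings

# Everything after the Date column is numeric, so it can be cast in one step.
values = table[:, 1:].astype(float)
```

Here table keeps every cell as a string, while values holds the six numeric columns as floats.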

0

You could use numpy.genfromtxt, which gives you the array directly in the form you want (remove names=True if you don't want named columns, but then add a # to the start of your first line so the header is skipped as a comment):

np.genfromtxt('test', dtype=('object',float,float,float,float,float,float), delimiter=',', names = True)
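
genfromtxt also accepts a file-like object, so the string does not have to be written to disk first. The sketch below makes a few assumptions: giant_string is shortened to two rows, dtype=None is used so numpy infers each column type instead of the explicit dtype tuple above, and autostrip=True removes the indentation in front of each row.

```python
import io
import numpy as np

# Assumption: a two-row stand-in for the full API response in the question.
giant_string = """Date,Open,High,Low,Close,Volume,Adj Close
    2011-01-31,603.60,604.47,595.55,600.36,2804900,600.36
    2011-01-28,619.07,620.36,599.76,600.99,4231100,600.99
"""

data = np.genfromtxt(
    io.StringIO(giant_string),  # file-like wrapper around the string
    delimiter=",",
    names=True,        # take field names from the header line
    dtype=None,        # let numpy infer a type per column
    encoding="utf-8",  # read text columns as str rather than bytes
    autostrip=True,    # strip whitespace around each value
)

closes = data["Close"]  # columns are accessed by field name
```

Note that numpy replaces the space in the header name, so the last column is reached as data["Adj_Close"].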
