I would like to be able to read the data that I have into C++ and then start to manipulate it. I am quite new but have a tiny bit of basic knowledge. The most obvious way of doing this that strikes me (and maybe this comes from using Excel previously) would be to read the data into a 2D array. This is the code that I have so far.

#include <iostream>
#include <fstream>
#include <algorithm>
#include <string>
#include <sstream>

using namespace std;

string C_J;

int main()
{
    float data[1000000][10];

    ifstream C_J_input;
    C_J_input.open("/Users/RT/B/CJ.csv");

    if (!C_J_input) return -1;

    for(int row = 0; row <1000000; row++)
    {

        string line;
        getline(C_J_input, C_J, '?');
        if ( !C_J_input.good() )
            break;

        stringstream iss(line);

        for(int col = 0; col < 10; col++)
            {
            string val;
            getline(iss, val, ',');
            if (!iss.good() )
                break;

            stringstream converter(val);
            converter >> data[row][col];
        }
    }


    cout << data;

    return 0;
}

Once I have the data read in, I would like to be able to read through it line by line and then analyse it, looking for certain things. However, I think that could probably be the topic of another thread once I have the data read in.

Just let me know if this is a bad question in any way and I will try to add anything more that might make it better.

Thanks!

  • Well, I must admit that your question is mostly OK, but apparently you forgot to state the question itself. You have precisely stated what you have and what you want to have, but there's no word about what's currently wrong with it and what you are currently trying to fix. First glance at your code - you have an array, you open the filestream, read lines in a loop and parse them with stringstream, seems valid. So, what's actually wrong? What does not work in that current code? Commented Aug 18, 2014 at 13:16
  • I guess I just wrote that because I was naturally expecting it to be wrong! I created it as a merge of 2 snippets that I have. I am a little rusty and haven't been able to do anything yet other than create a successful build. How would I take the data in "data" and now work with it? (I might create a new thread for that, however.) Commented Aug 18, 2014 at 13:20
  • float data[1000000][10]; is about 38 MB of data and will almost certainly overflow the stack. Why don't you use std::vector<float> and its push_back function to allocate only the amount of memory required to represent the file? Commented Aug 18, 2014 at 13:20
  • It would make more sense to dump the whole file into a string, and then use a split method to first split by newlines, then split each line by the comma separator. Commented Aug 18, 2014 at 14:22
  • In this case, they are in a vector of vectors of strings, so for each line there is a vector of strings, each holding a cell of the CSV. Commented Aug 21, 2014 at 12:30

2 Answers

At the asker's request, this is how you would load the file into a string, split it into lines, and then split each line into elements:

#include <iostream>
#include <string>
#include <fstream>
#include <vector>
#include <sstream>
#include <iterator>


//This takes a string and splits it with a delimiter and returns a vector of strings
std::vector<std::string> &SplitString(const std::string &s, char delim, std::vector<std::string> &elems)
{
    std::stringstream ss(s);
    std::string item;
    while (std::getline(ss, item, delim))
    {
        elems.push_back(item);
    }
    return elems;
}


int main(int argc, char* argv[])
{

    //load the file with ifstream
    std::ifstream t("test.csv");
    if (!t)
    {
        std::cout << "Unknown File" << std::endl;
        return 1;
    }

    //this block loads the whole file into one string
    std::string str;

    //move the read position to the end so that tellg() reports the file size
    t.seekg(0, std::ios::end);
    std::streamoff size = t.tellg();
    if (size > 0)
        str.reserve(static_cast<std::string::size_type>(size));//pre-allocate so the string does not have to regrow while it is filled
    t.seekg(0, std::ios::beg);//move the read position back to the beginning before reading

    //copy everything in the stream (the file data) into the string.
    //istreambuf_iterator walks the stream (t) character by character; a default-constructed one marks the end.
    str.assign((std::istreambuf_iterator<char>(t)),
        std::istreambuf_iterator<char>());

    //if the file has size (is not empty) then analyze
    if (str.length() > 0)
    {
        //the file is loaded

        //split by delimiter (which is the newline character)
        std::vector<std::string> lines;//this holds a string for each line in the file
        SplitString(str, '\n', lines);

        //each element in this vector holds a vector of elements (the strings between commas)
        std::vector<std::vector<std::string> > LineElements;



        //for each line
        for (const auto &line : lines)
        {
            //this is a vector of elements in this line
            std::vector<std::string> elementsInLine;

            //split with the comma, this would separate "one,two,three" into {"one","two","three"}
            SplitString(line, ',', elementsInLine);

            //take the elements in this line, and add them to the line-element vector
            LineElements.push_back(elementsInLine);
        }

        //this displays each element in an organized fashion

        //for each line
        for (const auto &elements : LineElements)
        {
            //for each element IN that line
            for (std::size_t i = 0; i < elements.size(); ++i)
            {
                //put a comma before every element except the first, so there is no trailing comma
                //(comparing values against back() would drop commas whenever a cell equals the last cell)
                if (i != 0)
                    std::cout << ',';
                std::cout << elements[i];
            }
            //the end of the line
            std::cout << '\n';
        }
    }
    else
    {
        std::cout << "File Is empty" << std::endl;
        return 1;
    }

    return 0;
}
10 Comments

Sorry @Kevin, I don't really understand. As I understand it, you initially set up a vector that consists of strings .......(and then I lose the plot). Apologies but I think I would literally need a line by line description of what this is doing. I think it might be back to the textbooks for me for now :-(
Sorry to be late, but I added some comments to clarify.
That I took directly from here: stackoverflow.com/questions/2602013/… as a solution to reading a whole file into a string in one shot, but I can clarify what each line does.
I had a syntax error, because I changed elements to LineElements but not in every instance, it should work fine now.
@Taylr, the way I designed this, you pass the empty vector into SplitString, and it fills it for you. The & is used to pass the vector by reference, so it can be modified within the split method. Technically, I did not need to return std::vector<std::string>&, but I did that out of personal preference.

On second glance, I've noticed a few obvious issues which will slow your progress greatly, so I'll drop them here:

1) you are using two disconnected variables for reading the lines:

  • C_J - which receives data from getline function
  • line - which is used as the source of stringstream

I'm pretty sure that the C_J is completely unnecessary. I think you wanted to simply do

getline(C_J_input, line, ...)  // so that the textline read will fly to the LINE var
// ...and later
stringstream iss(line); // no change

or, alternatively:

getline(C_J_input, C_J, ...)  // no change
// ...and later
stringstream iss(C_J); // so that ISS will read the textline we've just read

Otherwise, the stringstream will never see what getline has read from the file: getline writes the data to a different place (C_J) than the one the stringstream looks at (line).

2) Another tiny bit is that you are feeding a '?' into getline() as the line separator. CSVs usually use a newline character to separate the data lines. Of course, your input file may use '?'; I don't know. But if you wanted to use a newline instead, then omit the parameter altogether: getline then uses '\n' as the delimiter by default, and this will probably be just fine.

3) Your array of floats is, um, huge. Consider using a list instead. It will grow nicely as you read rows. You can even nest them, so list<list<float>> is also very usable. Since the number of columns is constant, I'd actually probably use list<vector<float>>. Using a huge preallocated array is not a good idea: sooner or later there will be a file with one line too many, and ka-boom.

4) Your code contains a just-as-huge loop that iterates a constant number of times. A loop itself is OK, but the line count will vary. You don't actually need to count the lines, especially if you use list<> to store the values. Just like you've checked whether the file opened properly with if(!C_J_input), you can also check whether you have reached end-of-file:

if(C_J_input.eof())
    ; // will fire ONLY if you are at the end of the file.

see here for an example

Well, that's a start. Good luck!
