193

Suppose I have a .csv file with the following content:

 "text, with commas","another text",123,"text",5; 
 "some    without commas","another text",123,"text";
 "some text with  commas or no",,123,"text"; 

How can I parse this content with PHP?

3
  • You're basically asking if there is a better OOP way to deal w/ CSV parsing than the stock global function approach. I'd say reword the question, as this does not sound like an issue parsing a CSV really. Commented Feb 4, 2012 at 7:33
  • 1
    @quickshiftin sorry about that Commented Feb 4, 2012 at 7:36
  • It's fine, I'm just saying... If you want a class this one is OK (I've tweaked it a bit in my work tho..) Commented Feb 4, 2012 at 7:43

6 Answers

260

Just use PHP's built-in function for parsing a CSV file, fgetcsv():

http://php.net/manual/en/function.fgetcsv.php

$handle = fopen("test.csv", "r");
while (($row = fgetcsv($handle)) !== FALSE) {
    // do something with row values
    print_r($row);
}
fclose($handle);

Note that this method correctly handles values even if they contain quotes or line breaks.

If your CSV file uses a different delimiter, enclosure, or escape character, you can set them through the parameters of fgetcsv().
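
As a minimal sketch of that configuration, the snippet below reads a semicolon-delimited file. The temporary file and its contents are made up for illustration. The delimiter, enclosure, and escape arguments are passed explicitly so the call does not depend on the defaults.

```php
<?php
// Hypothetical sample data: semicolon-delimited, double-quote enclosure.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "\"text; with semicolon\";\"another text\";123\n\"plain\";;456\n");

$rows = [];
$handle = fopen($path, 'r');
if ($handle === false) {
    throw new RuntimeException("Cannot open $path");
}
// Arguments: handle, max line length (0 = no limit), delimiter, enclosure, escape.
while (($row = fgetcsv($handle, 0, ';', '"', '\\')) !== false) {
    $rows[] = $row;
}
fclose($handle);
unlink($path);

print_r($rows);
```

The semicolon inside the first quoted field stays part of that field, and the empty field in the second record comes back as an empty string.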


8 Comments

It should be noted that this function does not correctly deal with quotes in CSV. Specifically, it can't deal with this example as found on Wikipedia: en.wikipedia.org/wiki/Comma-separated_values#Example. There was an open bug, but it has been closed as "won't fix": bugs.php.net/bug.php?id=50686
It does not work correctly on columns with line breaks in their content either.
Why did you put 1000?
@amenthes what makes you think this function does not correctly deal with quotes in CSV? It always worked for me and works well as of now. The bug you are referring to is more related to inconsistency between default parameters in fputcsv() and fgetcsv() but unrelated to fgetcsv() itself. Can you please kindly remove your comment that appears to be incorrect?
nine and a half years later: I couldn't tell you off the top of my head, sorry. Additionally: totally rewriting the whole code sample, possibly invalidating all discussion around it, does not strike me as very useful. I would at the very least make a note of this substantial edit.
167

Community warning: this code would return incorrect results if CSV contains multiline values.

A slightly shorter answer, available since PHP 5.3.0:

    $csvFile = file('../somefile.csv');
    $data = [];
    foreach ($csvFile as $line) {
        $data[] = str_getcsv($line);
    }

6 Comments

Note that this doesn't work if you have any newlines in the actual values (not the line delimiters at the end of each csv line), because the file function splits on newlines and isn't aware of the quotation marks that CSV uses to contain field values.
@JordanLev so what do you recommend then?
@Julix use the accepted answer. This shorter version is nice if you know the imported data will never have line breaks within a single value, but otherwise the more robust solution is worth the extra lines of code.
I ended up encoding before saving to CSV and decoding while reading (php.net/rawurlencode). That ensures no line breaks, right? Or does it lose them entirely (can there be line breaks in URL encoding)?
This answer works perfectly for new lines and uses the magic of PHP's file function. Thanks
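
To make the difference concrete, here is a small sketch that parses the same data both ways. The sample file is invented for illustration, and its first record has a line break inside a quoted field.

```php
<?php
// Hypothetical sample: the second field of the first record contains a line break.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "1,\"first line\nsecond line\",x\n2,single,y\n");

// Line-based approach: file() splits on every newline, including the quoted one.
$byLine = array_map(
    function ($line) { return str_getcsv($line, ',', '"', '\\'); },
    file($path, FILE_IGNORE_NEW_LINES)
);

// Stream-based approach: fgetcsv() keeps reading until the quoted field is closed.
$byStream = [];
$handle = fopen($path, 'r');
while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
    $byStream[] = $row;
}
fclose($handle);
unlink($path);

echo count($byLine), " rows via file(), ", count($byStream), " rows via fgetcsv()\n";
```

The file holds two records, but the line-based version reports three rows because the quoted line break is treated as a record boundary.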
157

Community warning: this code would return incorrect results if CSV contains multiline values.

A handy one-liner to parse a CSV file into an array:

$csv = array_map('str_getcsv', file('data.csv'));

7 Comments

Note that this doesn't work if you have any newlines in the actual values (not the line delimiters at the end of each csv line), because the file function splits on newlines and isn't aware of the quotation marks that CSV uses to contain field values.
How do I use a different delimiter (; instead of ,)?
Use the following to fix the newline problem: array_map('str_getcsv', file('data.csv', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES));
You know, it is nice to give credits to respective authors: php.net/manual/en/function.str-getcsv.php#114764
@Maxim Kovalevsky, how do I skip the heading or first line?
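
For the two questions above (a different delimiter, and skipping the header row), one possible approach is to wrap str_getcsv() in a closure and shift the first row off the result. This is only a sketch with invented sample data, and it still has the multiline limitation from the community warning, because it relies on file().

```php
<?php
// Hypothetical sample: semicolon-delimited with a header row.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "name;qty\napple;3\npear;5\n");

$lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
unlink($path);

// A closure lets array_map() pass extra arguments to str_getcsv().
$csv = array_map(
    function ($line) { return str_getcsv($line, ';', '"', '\\'); },
    $lines
);

// Remove the first row and keep it as the list of column names.
$header = array_shift($csv);

// Optionally key every remaining row by column name.
$records = array_map(
    function ($row) use ($header) { return array_combine($header, $row); },
    $csv
);

print_r($records);
```

array_combine() expects every row to have as many fields as the header, so ragged files would need an extra check.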
22

Just discovered a handy way to get an index while parsing. My mind was blown.

$handle = fopen("test.csv", "r");
for ($i = 0; ($row = fgetcsv($handle)) !== false; ++$i) {
    // Do something with the $row array; $i is the zero-based row index
}
fclose($handle);

Source: link

3 Comments

Our server had PHP 5.2.9 and I am unable to access str_getcsv because of that. fgetcsv on the other hand dates all the way to PHP 4, so this was helpful. Thanks.
Just wondering why a for loop was used instead of a while loop?
@valerie I almost always need an index while parsing CSVs. The for loop provides the index without a separate declaration and incrementer.
8

I love this

$data = str_getcsv($CsvString, "\n"); // parse the rows
foreach ($data as &$row) {
    $row = str_getcsv($row, ";"); // parse the items in each row; use "," or whatever delimiter your data has
    $this->debug($row);
}
unset($row); // break the reference left over from the loop

In my case I receive the CSV through a web service, so this way I don't need to create a file. If you need to parse a file instead, just read its contents into a string and pass that in.

1 Comment

This didn't work correctly for me. For example, this: aaa,bbb,"ccc\nddd",eee was parsed into two lines instead of the desired one. It seems that " is not recognized as an enclosure when it appears inside the field (rather than at its beginning or end). So $data = str_getcsv(..) can be replaced with $data = explode(..), which I'm guessing is more efficient and conveys the intention better...
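
One way to keep the "no file needed" benefit while still handling a quoted line break is to copy the string into an in-memory stream and read it with fgetcsv(). The sketch below assumes a made-up payload modelled on the example in this comment.

```php
<?php
// Hypothetical CSV payload, e.g. the body of a web-service response.
$csvString = "aaa,bbb,\"ccc\nddd\",eee\nfff,ggg,hhh,iii\n";

// Copy the string into an in-memory stream so fgetcsv() can read it.
$stream = fopen('php://memory', 'r+');
fwrite($stream, $csvString);
rewind($stream);

$rows = [];
while (($row = fgetcsv($stream, 0, ',', '"', '\\')) !== false) {
    $rows[] = $row;
}
fclose($stream);

print_r($rows);
```

Here the first record keeps its four fields, with the line break preserved inside the third one.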
4

I have been seeking the same thing without using some unsupported PHP class. Excel CSV doesn't always enclose fields in quotes, and it escapes quotes using "" because the algorithm was probably made back in the '80s or something. After looking at several .csv parsers in the comments section on php.net, I have seen ones that even used callbacks or eval'd code, and they either didn't work as needed or simply didn't work at all. So I wrote my own routines for this, and they work in the most basic PHP configuration. The array keys can either be numeric or named after the fields given in the header row. Hope this helps.

    function SW_ImplodeCSV(array $rows, $headerrow=true, $mode='EXCEL', $fmt='2D_FIELDNAME_ARRAY')
    // SW_ImplodeCSV - returns 2D array as string of csv(MS Excel .CSV supported)
    // AUTHOR: [email protected]
    // RELEASED: 9/21/13 BETA
      { $r=1; $row=array(); $fields=array(); $csv="";
        $escapes=array('\r', '\n', '\t', '\\', '\"');  //two byte escape codes
        $escapes2=array("\r", "\n", "\t", "\\", "\""); //actual code

        if($mode=='EXCEL')// escape code = ""
         { $delim=','; $enclos='"'; $rowbr="\r\n"; }
        else //mode=STANDARD all fields enclosed
           { $delim=','; $enclos='"'; $rowbr="\r\n"; }

          $csv=""; $i=-1; $i2=0; $imax=count($rows);

          while( $i < $imax )
          {
            // get field names
            if($i == -1)
             { $row=$rows[0];
               if($fmt=='2D_FIELDNAME_ARRAY')
                { $i2=0; $i2max=count($row);
                  foreach($row as $k => $v) // each() was removed in PHP 8
                   { $fields[$i2]=$k;
                     $i2++;
                   }
                }
               else //if($fmt='2D_NUMBERED_ARRAY')
                { $i2=0; $i2max=(count($rows[0]));
                  while($i2<$i2max)
                   { $fields[$i2]=$i2;
                     $i2++;
                   }
                }

               if($headerrow==true) { $row=$fields; }
               else                 { $i=0; $row=$rows[0];}
             }
            else
             { $row=$rows[$i];
             }
    
            $i2=0;  $i2max=count($row); 
            while($i2 < $i2max)// numeric loop (order really matters here)
            //while( list($k, $v) = each($row) )
             { if($i2 != 0) $csv=$csv.$delim;

               $v=$row[$fields[$i2]];

               if($mode=='EXCEL') //EXCEL 2quote escapes
                    { $newv = '"'.(str_replace('"', '""', $v)).'"'; }
               else  //STANDARD
                    { $newv = '"'.(str_replace($escapes2, $escapes, $v)).'"'; }
               $csv=$csv.$newv;
               $i2++;
             }

            $csv=$csv."\r\n";

            $i++;
          }

         return $csv;
       }

    function SW_ExplodeCSV($csv, $headerrow=true, $mode='EXCEL', $fmt='2D_FIELDNAME_ARRAY')
     { // SW_ExplodeCSV - parses CSV into 2D array(MS Excel .CSV supported)
       // AUTHOR: [email protected]
       // RELEASED: 9/21/13 BETA
       //SWMessage("SW_ExplodeCSV() - CALLED HERE -");
       $rows=array(); $row=array(); $fields=array();// rows = array of arrays

       //escape code = '\'
       $escapes=array('\r', '\n', '\t', '\\', '\"');  //two byte escape codes
       $escapes2=array("\r", "\n", "\t", "\\", "\""); //actual code

       if($mode=='EXCEL')
        {// escape code = ""
          $delim=','; $enclos='"'; $esc_enclos='""'; $rowbr="\r\n";
        }
       else //mode=STANDARD 
        {// all fields enclosed
          $delim=','; $enclos='"'; $rowbr="\r\n";
        }

       $indxf=0; $indxl=0; $encindxf=0; $encindxl=0; $enc=0; $enc1=0; $enc2=0; $brk1=0; $rowindxf=0; $rowindxl=0; $encflg=0;
       $rowcnt=0; $colcnt=0; $rowflg=0; $colflg=0; $cell="";
       $headerflg=0; $quotedflg=0;
       $i=0; $i2=0; $imax=strlen($csv);   

       while($indxf < $imax)
         {
           //find first *possible* cell delimiters
           $indxl=strpos($csv, $delim, $indxf);  if($indxl===false) { $indxl=$imax; }
           $encindxf=strpos($csv, $enclos, $indxf); if($encindxf===false) { $encindxf=$imax; }//first open quote
           $rowindxl=strpos($csv, $rowbr, $indxf); if($rowindxl===false) { $rowindxl=$imax; }

           if(($encindxf>$indxl)||($encindxf>$rowindxl))
            { $quoteflg=0; $encindxf=$imax; $encindxl=$imax;
              if($rowindxl<$indxl) { $indxl=$rowindxl; $rowflg=1; }
            }
           else 
            { //find cell enclosure area (and real cell delimiter)
              $quoteflg=1;
              $enc=$encindxf; 
              while($enc<$indxl) //$enc = next open quote
               {// loop till unquoted delim. is found
                 $enc=strpos($csv, $enclos, $enc+1); if($enc===false) { $enc=$imax; }//close quote
                 $encindxl=$enc; //last close quote
                 $indxl=strpos($csv, $delim, $enc+1); if($indxl===false)  { $indxl=$imax; }//last delim.
                 $enc=strpos($csv, $enclos, $enc+1); if($enc===false) { $enc=$imax; }//open quote
                 if(($indxl==$imax)||($enc==$imax)) break;
               }
              $rowindxl=strpos($csv, $rowbr, $enc+1); if($rowindxl===false) { $rowindxl=$imax; }
              if($rowindxl<$indxl) { $indxl=$rowindxl; $rowflg=1; }
            }

           if($quoteflg==0)
            { //no enclosured content - take as is
              $colflg=1;
              //get cell 
             // $cell=substr($csv, $indxf, ($indxl-$indxf)-1);
              $cell=substr($csv, $indxf, ($indxl-$indxf));
            }
           else// if($rowindxl > $encindxf)
            { // cell enclosed
              $colflg=1;
     
             //get cell - decode cell content
              $cell=substr($csv, $encindxf+1, ($encindxl-$encindxf)-1);

              if($mode=='EXCEL') //remove EXCEL 2quote escapes
                { $cell=str_replace($esc_enclos, $enclos, $cell);
                }
              else //remove STANDARD esc. scheme
                { $cell=str_replace($escapes, $escapes2, $cell);
                }
            }

           if($colflg)
            {// read cell into array
              if( ($fmt=='2D_FIELDNAME_ARRAY') && ($headerflg==1) )
               { $row[$fields[$colcnt]]=$cell; }
              else if(($fmt=='2D_NUMBERED_ARRAY')||($headerflg==0))
               { $row[$colcnt]=$cell; } //$rows[$rowcnt][$colcnt] = $cell;

              $colcnt++; $colflg=0; $cell="";
              $indxf=$indxl+1;//strlen($delim);
            }
           if($rowflg)
            {// read row into big array
              if(($headerrow) && ($headerflg==0))
                {  $fields=$row;
                   $row=array();
                   $headerflg=1;
                }
              else
                { $rows[$rowcnt]=$row;
                  $row=array();
                  $rowcnt++; 
                }
               $colcnt=0; $rowflg=0; $cell="";
               $rowindxf=$rowindxl+2;//strlen($rowbr);
               $indxf=$rowindxf;
            }

           $i++;
           //SWMessage("SW_ExplodeCSV() - colcnt = ".$colcnt."   rowcnt = ".$rowcnt."   indxf = ".$indxf."   indxl = ".$indxl."   rowindxf = ".$rowindxf);
           //if($i>20) break;
         }

       return $rows;
     }

...bob can now go back to his spreadsheets

1 Comment

The only thing this answer is lacking is an actual example of a CSV file that allegedly cannot be parsed using the standard PHP functions and would require this piece of code.
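
For reference, here is a quick check of how the standard str_getcsv() treats an Excel-style line that mixes an unquoted field with a quoted field containing doubled quotes and a comma. The input line is invented, and this covers only one case; it is not a claim that every Excel export parses cleanly.

```php
<?php
// Hypothetical Excel-style line: an unquoted field, then a quoted field
// whose inner quotes are doubled and which contains a comma.
$line = 'plain,"He said ""hello"", then left",42';

$fields = str_getcsv($line, ',', '"', '\\');

print_r($fields);
```

The doubled quotes collapse to single quotes and the embedded comma stays inside the second field.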
