173

Can anyone recommend a simple API that will allow me to read a CSV input file, do some simple transformations, and then write it back out?

A quick Google search turned up http://flatpack.sourceforge.net/, which looks promising.

I just wanted to check what others are using before I couple myself to this API.

2

10 Answers

87

I've used OpenCSV in the past.

import java.io.FileReader;
import au.com.bytecode.opencsv.CSVReader;

String fileName = "data.csv";
CSVReader reader = new CSVReader(new FileReader(fileName));

// if the first line is the header
String[] header = reader.readNext();
// iterate over reader.readNext() until it returns null
String[] line;
while ((line = reader.readNext()) != null) { /* process line */ }

There were some other choices in the answers to another question.

5 Comments

Unfortunately, OpenCSV's latest download (v2.2 at time of comment) does not compile, and they don't provide a pre-built binary.
The package I downloaded from SourceForge had a binary in the deploy folder.
If you're using Maven, please note that the dependency snippet on the official website declares version "2.0", which has some bugs; an updated version 2.3 is available in the repositories.
This lib doesn't write the file in a separate thread, no?
According to github.com/uniVocity/csv-parsers-comparison, it is on average 73% slower than uniVocity.
38

Apache Commons CSV

Check out Apache Commons CSV.

This library reads and writes several variations of CSV, including the standard one, RFC 4180. It also reads and writes tab-delimited files. Predefined formats:

  • Excel
  • InformixUnload
  • InformixUnloadCsv
  • MySQL
  • Oracle
  • PostgreSQLCsv
  • PostgreSQLText
  • RFC4180
  • TDF

2 Comments

I've used the sandboxed Commons CSV for quite some time and never experienced a problem. I really hope they promote it to full standing and get it out of the sandbox.
@bmatthews68 the sandbox link is defunct - looks like it's moved to apache commons proper (I edited the link in the answer too)
34

Update: The code in this answer is for Super CSV 1.52. Updated code examples for Super CSV 2.4.0 can be found at the project website: http://super-csv.github.io/super-csv/index.html


The SuperCSV project directly supports the parsing and structured manipulation of CSV cells. From http://super-csv.github.io/super-csv/examples_reading.html you'll find e.g.

given a class

public class UserBean {
    String username, password, street, town;
    int zip;

    public String getPassword() { return password; }
    public String getStreet() { return street; }
    public String getTown() { return town; }
    public String getUsername() { return username; }
    public int getZip() { return zip; }
    public void setPassword(String password) { this.password = password; }
    public void setStreet(String street) { this.street = street; }
    public void setTown(String town) { this.town = town; }
    public void setUsername(String username) { this.username = username; }
    public void setZip(int zip) { this.zip = zip; }
}

and that you have a CSV file with a header. Let's assume the following content

username, password,   date,        zip,  town
Klaus,    qwexyKiks,  17/1/2007,   1111, New York
Oufu,     bobilop,    10/10/2007,  4555, New York

You can then create an instance of the UserBean and populate it with values from the second line of the file with the following code

class ReadingObjects {
  public static void main(String[] args) throws Exception{
    ICsvBeanReader inFile = new CsvBeanReader(new FileReader("foo.csv"), CsvPreference.EXCEL_PREFERENCE);
    try {
      final String[] header = inFile.getCSVHeader(true);
      UserBean user;
      while( (user = inFile.read(UserBean.class, header, processors)) != null) {
        System.out.println(user.getZip());
      }
    } finally {
      inFile.close();
    }
  }
}

using the following "manipulation specification"

final CellProcessor[] processors = new CellProcessor[] {
    new Unique(new StrMinMax(5, 20)),
    new StrMinMax(8, 35),
    new ParseDate("dd/MM/yyyy"),
    new Optional(new ParseInt()),
    null
};

3 Comments

Your code would not compile so I submitted some corrections. Also, ParseDate() does not work correctly so I replaced it to read a String. It can be parsed later.
Big limitation: SuperCSV is not threadsafe. I'm going to look into Jackson, although it may be more limited in features.
SuperCsv also doesn't allow using multimaps. Would be nice to see it work with MultiMaps.
18

Reading the CSV format description makes me feel that using a third-party library would be less of a headache than writing a parser myself.

Wikipedia lists ten or so known libraries.

I compared the listed libraries against a checklist. OpenCSV turned out to be the winner for me (YMMV), with the following results:

+ maven

+ maven - release version   // had some cryptic issues at _Hudson_ with snapshot references => prefer to be on a safe side

+ code examples

+ open source   // as in "can hack myself if needed"

+ understandable javadoc   // as opposed to eg javadocs of _genjava gj-csv_

+ compact API   // YAGNI (note *flatpack* seems to have much richer API than OpenCSV)

- reference to specification used   // I really like it when people can explain what they're doing

- reference to _RFC 4180_ support   // would qualify as simplest form of specification to me

- releases changelog   // absence is quite a pity, given how simple it'd be to get with maven-changes-plugin   // _flatpack_, for comparison, has quite helpful changelog

+ bug tracking

+ active   // as in "can submit a bug and expect a fixed release soon"

+ positive feedback   // Recommended By 51 users at sourceforge (as of now)

8

We use JavaCSV; it works pretty well.

1 Comment

The only issue with this library is that it won't allow you to output CSV files with Windows line terminators (\r\n) when not running on Windows. The author has not provided support for years. I had to fork it to allow that missing feature: JavaCSV 2.2
6

For the last enterprise application I worked on that needed to handle a notable amount of CSV -- a couple of months ago -- I used SuperCSV at sourceforge and found it simple, robust and problem-free.

5 Comments

+1 for SuperCSV, but it has some nasty bugs which aren't fixed yet, new bugs aren't handled currently, and the last release is almost two years old. But we are using a patched/modified version in production without any problems.
@MRalwasser Super CSV 2.0.0-beta-1 has recently been released. It includes many bug fixes and new features (including Maven support and a new Dozer extension for mapping nested properties and arrays/Collections)
@Hound-Dog Thank you for the update. I already noticed the new beta and I'm glad to see the project alive, although the frequency of commits still worries me a little (almost all commits fall on just a few days). But I'll take a look. Is there an estimated release date for the final 2.0?
@MRalwasser I'm the only dev at the moment and have full time work, so I tend to work on this whenever I get a free weekend - hence the sporadic commits :) Nearly 1000 SF downloads of the beta now, and no bugs, so looking on track for a final release early next month. If you have any ideas for future features please let us know.
SuperCSV is not threadsafe at this stage, that makes it not really robust imho
6

You can use the csvreader API, which can be downloaded from either of the following locations:

http://sourceforge.net/projects/javacsv/files/JavaCsv/JavaCsv%202.1/javacsv2.1.zip/download

or

http://sourceforge.net/projects/javacsv/

Use the following code:

/************ For Reading ***************/

import java.io.FileNotFoundException;
import java.io.IOException;

import com.csvreader.CsvReader;

public class CsvReaderExample {

    public static void main(String[] args) {
        try {

            CsvReader products = new CsvReader("products.csv");

            products.readHeaders();

            while (products.readRecord())
            {
                String productID = products.get("ProductID");
                String productName = products.get("ProductName");
                String supplierID = products.get("SupplierID");
                String categoryID = products.get("CategoryID");
                String quantityPerUnit = products.get("QuantityPerUnit");
                String unitPrice = products.get("UnitPrice");
                String unitsInStock = products.get("UnitsInStock");
                String unitsOnOrder = products.get("UnitsOnOrder");
                String reorderLevel = products.get("ReorderLevel");
                String discontinued = products.get("Discontinued");

                // perform program logic here
                System.out.println(productID + ":" + productName);
            }

            products.close();

        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }

    }

}

Write / Append to CSV file

Code:

/************* For Writing ***************************/

import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

import com.csvreader.CsvWriter;

public class CsvWriterAppendExample {

    public static void main(String[] args) {

        String outputFile = "users.csv";

        // before we open the file check to see if it already exists
        boolean alreadyExists = new File(outputFile).exists();

        try {
            // use FileWriter constructor that specifies open for appending
            CsvWriter csvOutput = new CsvWriter(new FileWriter(outputFile, true), ',');

            // if the file didn't already exist then we need to write out the header line
            if (!alreadyExists)
            {
                csvOutput.write("id");
                csvOutput.write("name");
                csvOutput.endRecord();
            }
            // else assume that the file already has the correct header line

            // write out a few records
            csvOutput.write("1");
            csvOutput.write("Bruce");
            csvOutput.endRecord();

            csvOutput.write("2");
            csvOutput.write("John");
            csvOutput.endRecord();

            csvOutput.close();
        } catch (IOException e) {
            e.printStackTrace();
        }

    }
}

3

There is also CSV/Excel Utility. It assumes all the data is table-like and delivers it through Iterators.

2

The CSV format sounds easy enough for StringTokenizer but it can become more complicated. Here in Germany a semicolon is used as a delimiter and cells containing delimiters need to be escaped. You're not going to handle that easily with StringTokenizer.

I would go for http://sourceforge.net/projects/javacsv
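To make the StringTokenizer problem concrete, here is a minimal sketch using only the JDK. The class and method names (SemicolonCsvSketch, naiveSplit, quoteAwareSplit) are made up for illustration and do not come from any of the libraries mentioned here. It assumes the common convention that fields are wrapped in double quotes and embedded quotes are doubled, and it ignores quoted fields that span several lines, which is one more reason to prefer a real library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class SemicolonCsvSketch {

    // Naive approach: StringTokenizer splits on every delimiter, even inside
    // quotes, and silently skips empty fields.
    static List<String> naiveSplit(String line, String delimiter) {
        List<String> fields = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line, delimiter);
        while (tokenizer.hasMoreTokens()) {
            fields.add(tokenizer.nextToken());
        }
        return fields;
    }

    // Hypothetical quote-aware split of a single line (illustration only).
    static List<String> quoteAwareSplit(String line, char delimiter) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote is an escaped quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // delimiters inside quotes are data
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == delimiter) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        String line = "1;\"Meier; Schmidt GmbH\";;\"say \"\"hi\"\"\"";
        System.out.println(naiveSplit(line, ";"));
        System.out.println(quoteAwareSplit(line, ';'));
    }
}
```

The naive version cuts the quoted company name in two and drops the empty third field, while the quote-aware version keeps both intact.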

0

If you intend to read CSV exported from Excel, there are some interesting corner cases. I can't remember them all, but Apache Commons CSV was not capable of handling them correctly (with, for example, URLs).

Be sure to test Excel output with quotes, commas, and slashes all over the place.
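One way to produce awkward test data for such checks is to generate it yourself. The sketch below uses only the JDK; CsvEscapeSketch, escapeField and toCsvLine are hypothetical names, not part of any library discussed here. It follows the RFC 4180 convention of quoting fields that contain the delimiter, a quote, or a line break, and doubling embedded quotes. Whether a particular Excel version round-trips such lines unchanged is an assumption you would need to verify.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class CsvEscapeSketch {

    // Quote a field if it contains the delimiter, a double quote, or a line
    // break, doubling any embedded quotes (RFC 4180 convention).
    static String escapeField(String field, char delimiter) {
        boolean needsQuotes = field.indexOf(delimiter) >= 0
                || field.indexOf('"') >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Join already-escaped fields into one CSV line (no line terminator added).
    static String toCsvLine(List<String> fields, char delimiter) {
        return fields.stream()
                .map(f -> escapeField(f, delimiter))
                .collect(Collectors.joining(String.valueOf(delimiter)));
    }

    public static void main(String[] args) {
        // Made-up sample values mixing commas, quotes, and backslashes.
        List<String> awkward = Arrays.asList(
                "http://example.com/a,b?q=\"x\"", "C:\\temp\\file", "plain");
        System.out.println(toCsvLine(awkward, ','));
    }
}
```

Feeding lines like these through whichever library you choose, and comparing the parsed fields with the originals, is a quick way to see how it copes with the corner cases.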

1 Comment

The Apache Commons CSV library does offer a specific variant for Microsoft Excel. I don’t know if that now handles the problems you mention or not.
