String.split(",") isn't likely to work.
It will split fields that have embedded commas ("Foo, Inc.") even though they are a single field in the CSV line.
What if the company name is:
Company, Inc.
or worse:
Joe's "Good, Fast, and Cheap" Food
According to Wikipedia (http://en.wikipedia.org/wiki/Comma-separated_values):
Fields with embedded commas must be enclosed within double-quote characters.
1997,Ford,E350,"Super, luxurious truck"
Fields with embedded double-quote characters must be enclosed within double-quote characters, and each of the embedded double-quote characters must be represented by a pair of double-quote characters.
1997,Ford,E350,"Super ""luxurious"" truck"
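Read from the writing side, the two quoting rules above can be sketched as a small helper. This is only an illustration: the class and method names (CsvEscape, escapeField) are made up for this answer and do not come from any library.

```java
// Hypothetical helper, not part of any library: quote one value for CSV output.
class CsvEscape {

    static String escapeField(String value) {
        // Commas and double quotes force quoting; so do line breaks.
        boolean needsQuotes = value.contains(",") || value.contains("\"")
                || value.contains("\n") || value.contains("\r");
        if (!needsQuotes) {
            return value;
        }
        // Double every embedded quote, then wrap the whole field in quotes.
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(escapeField("Super, luxurious truck"));
        System.out.println(escapeField("Super \"luxurious\" truck"));
        System.out.println(escapeField("E350"));
    }
}
```

The first two calls print the quoted forms used in the Wikipedia examples; the third value needs no quoting and is printed unchanged.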
Even worse, quoted fields may have embedded line breaks (newlines; "\n"):
Fields with embedded line breaks must be enclosed within double-quote characters.
1997,Ford,E350,"Go get one now
they are going fast"
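Because of this, reading a file line by line is not enough either: one record can span several physical lines. A rough way to stitch them back together is to keep appending lines while the number of double-quote characters seen so far is odd, since escaped quotes always come in pairs. The sketch below assumes well-formed input; CsvRecords and joinRecords are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: merge physical lines into logical CSV records.
class CsvRecords {

    static List<String> joinRecords(List<String> physicalLines) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int quotes = 0; // double quotes seen in the record being built
        for (String line : physicalLines) {
            if (quotes % 2 != 0) {
                current.append('\n'); // still inside a quoted field
            }
            current.append(line);
            quotes += countQuotes(line);
            if (quotes % 2 == 0) { // every opened quote has been closed
                records.add(current.toString());
                current.setLength(0);
                quotes = 0;
            }
        }
        if (quotes % 2 != 0) {
            records.add(current.toString()); // unterminated quote: keep what we have
        }
        return records;
    }

    private static int countQuotes(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == '"') {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "1997,Ford,E350,\"Go get one now",
                "they are going fast\"",
                "a,b,c");
        for (String record : joinRecords(lines)) {
            System.out.println("[" + record + "]");
        }
    }
}
```

The three physical lines in main come back as two records. The lines are rejoined with "\n", so an original "\r\n" inside a quoted field is not preserved.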
This demonstrates the problem with using String.split(",") on fields that contain commas:
The CSV line is:
a,b,c,"Company, Inc.", d, e,"Joe's ""Good, Fast, and Cheap"" Food", f, 10/11/2010,1/1/2011, g, h, i
//
// Test String.split(",") against CSV with
// embedded commas and embedded double-quotes in
// quoted text strings:
//
// Company names are:
//     Company, Inc.
//     Joe's "Good, Fast, and Cheap" Food
//
// Which should be formatted in a CSV file as:
//     "Company, Inc."
//     "Joe's ""Good, Fast, and Cheap"" Food"
//
public class TestSplit {

    // Split s on splitchar and print each resulting segment on its own line.
    public static void testSplit(String s, String splitchar) {
        String[] split_s = s.split(splitchar);
        for (String seg : split_s) {
            System.out.println(seg);
        }
    }

    public static void main(String[] args) {
        String csvLine = "a,b,c,\"Company, Inc.\", d,"
                + " e,\"Joe's \"\"Good, Fast,"
                + " and Cheap\"\" Food\", f,"
                + " 10/11/2010,1/1/2011, g, h, i";

        System.out.println("CSV line is:\n" + csvLine + "\n\n");
        testSplit(csvLine, ",");
    }
}
Produces the following:
D:\projects\TestSplit>javac TestSplit.java

D:\projects\TestSplit>java TestSplit
CSV line is:
a,b,c,"Company, Inc.", d, e,"Joe's ""Good, Fast, and Cheap"" Food", f, 10/11/2010,1/1/2011, g, h, i


a
b
c
"Company
 Inc."
 d
 e
"Joe's ""Good
 Fast
 and Cheap"" Food"
 f
 10/11/2010
1/1/2011
 g
 h
 i

D:\projects\TestSplit>
Where that CSV line should be parsed as:
a
b
c
"Company, Inc."
d
e
"Joe's ""Good, Fast, and Cheap"" Food"
f
10/11/2010
1/1/2011
g
h
i
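For comparison, here is a minimal sketch of a quote-aware split for a single CSV line. It walks the line one character at a time and only treats a comma as a separator when it is outside double quotes. The names CsvLineParser and parseLine are made up for illustration. It assumes the whole record is already in one string, and it makes no attempt to reject malformed input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: split one CSV record into fields, honoring quotes.
class CsvLineParser {

    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c != '"') {
                    field.append(c); // commas are ordinary text in here
                } else if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    field.append('"'); // "" inside quotes is one literal quote
                    i++;
                } else {
                    inQuotes = false; // closing quote
                }
            } else if (c == '"') {
                inQuotes = true; // opening quote
            } else if (c == ',') {
                fields.add(field.toString()); // separator outside quotes
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString()); // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        String csvLine = "a,b,c,\"Company, Inc.\", d,"
                + " e,\"Joe's \"\"Good, Fast,"
                + " and Cheap\"\" Food\", f,"
                + " 10/11/2010,1/1/2011, g, h, i";
        for (String f : parseLine(csvLine)) {
            System.out.println("[" + f + "]");
        }
    }
}
```

On the test line this yields 13 fields instead of the 16 pieces that String.split(",") produces. Unlike the listing above, the sketch returns the unquoted values (Company, Inc. and Joe's "Good, Fast, and Cheap" Food), and it keeps the leading spaces of unquoted fields such as " d"; call trim() on them if that is not wanted.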