
I used an InputStream, but when parsing, a "," inside a quoted column is treated as a column separator. For example, for the input abc, xyz, "m,n" the parsed output is abc, xyz, m, n. Here m and n are treated as separate columns.

  • Perhaps java.io.StreamTokenizer is a possibility. Or a scanner generator like JFlex. You'd have to know how to set them up for the grammar of a CSV file, though; they're not "out-of-the-box" solutions. Commented Sep 13, 2017 at 10:31
  • What is the data structure of your file and what should you do with the results after parsing? How much memory can the program consume? Commented Sep 13, 2017 at 10:37
  • You don't need much memory to parse CSV. What you need memory for is to store it all. Solution: don't. Process it a line at a time. Commented Sep 13, 2017 at 12:01
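The quoting problem from the question can also be handled without a library. Below is a minimal, hand-rolled sketch of a quote-aware field splitter (the class and method names are hypothetical, not from any library mentioned here): commas inside double quotes stay part of the field, and doubled quotes are treated as escaped quotes.

```java
import java.util.ArrayList;
import java.util.List;

public class CsvLineSplitter {
    // Splits one CSV line into fields. A ',' inside double quotes is kept
    // as part of the field instead of starting a new column.
    public static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                // a doubled quote inside a quoted field is an escaped quote
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"');
                    i++;
                } else {
                    inQuotes = !inQuotes;
                }
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }
}
```

With this, the line abc,xyz,"m,n" splits into three fields, with m,n staying together as one column. A real CSV file can contain embedded newlines inside quoted fields, which a per-line splitter like this does not handle; the libraries in the answers below do.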

2 Answers 2


There are many third-party CSV parsing libraries, such as:

  1. uniVocity-parsers

  2. Apache Commons CSV

  3. OpenCSV

  4. Super CSV

I am using the uniVocity CSV parser, which is very fast and can automatically detect the separator in rows. You can look through the CSV libraries listed above.



I really like the Apache Commons CSVParser. This is almost verbatim from their user guide:

Reader reader = new FileReader("input.csv");
final CSVParser parser = new CSVParser(reader, CSVFormat.DEFAULT);
try {
    for (final CSVRecord record : parser) {
        final String string = record.get("SomeColumn");
        ...
    }
} finally {
    parser.close();
    reader.close();
}

This is simple, configurable and line-oriented.

You could configure it like this:

final CSVParser parser = new CSVParser(reader, CSVFormat.DEFAULT.withHeader().withDelimiter(';')); 

For the record, this configuration is unnecessary, as CSVFormat.DEFAULT works exactly the way you want it to.

This would be my first attempt, to see whether it fits into memory. If it doesn't, could you be a little more specific about your low-memory requirements?

3 Comments

Thanks for replying. CSVParser loads the whole file into memory, and that is a problem. If the file size is 1 GB, then the memory consumption is already around 1 GB.
@somey CSVParser can do both: reading everything into memory, and reading record by record. See commons.apache.org/proper/commons-csv/apidocs/index.html
@somey how do you parse it? That part of the code could be reading things into memory too. Can you please show us how you do it? Also, you could connect jvisualvm and see what exactly consumes that much memory. Maybe a GC run is needed?
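As the comments suggest, the fix for the 1 GB problem is to process records one at a time instead of collecting them all. A minimal sketch of that pattern with plain java.io (class and file names are hypothetical, for illustration):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class StreamingCsvReader {
    // Reads a CSV file line by line. Only the current line is held in
    // memory, so heap usage stays roughly constant regardless of file size.
    public static long countRows(String path) throws IOException {
        long rows = 0;
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(path))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // process the record here instead of adding it to a list
                rows++;
            }
        }
        return rows;
    }
}
```

The same streaming approach works with Commons CSV, since iterating a CSVParser pulls records lazily from the underlying Reader; the memory blow-up only happens if you collect every record into a collection yourself.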
