I used an InputStream, and when parsing, a "," inside a column is treated as a column separator. For example, abc, xyz, "m,n" is parsed as abc, xyz, m, n. Here m and n are treated as separate columns.
- Perhaps java.io.StreamTokenizer is a possibility, or a scanner generator like JFlex. You'd have to know how to set them up for the grammar of a CSV file, though; they're not "out-of-the-box" solutions. – Kevin Anderson, Sep 13, 2017 at 10:31
- What is the data structure of your file, and what should you do with the results after parsing? How much memory can the program consume? – Mick Mnemonic, Sep 13, 2017 at 10:37
- You don't need much memory to parse CSV. What you need memory for is to store it all. Solution: don't. Process it a line at a time. – user207421, Sep 13, 2017 at 12:01
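To make the "a line at a time" suggestion concrete, here is a minimal sketch using only the JDK. The class name CsvLineSplitter and the method splitLine are hypothetical names made up for this illustration, not part of any library. It assumes that a record never spans several lines (no line breaks inside quoted fields), that the delimiter is a comma, and that a doubled quote ("") inside a quoted field stands for a literal quote. For anything beyond that, a dedicated CSV library is likely the safer choice.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not taken from any library.
final class CsvLineSplitter {

    // Splits one line on commas that are outside double quotes.
    // Assumption: a record never continues on the next line.
    static List<String> splitLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    field.append('"'); // "" inside a quoted field is a literal quote
                    i++;               // skip the second quote of the pair
                } else {
                    inQuotes = !inQuotes; // opening or closing quote, not part of the value
                }
            } else if (c == ',' && !inQuotes) {
                fields.add(field.toString()); // delimiter outside quotes ends the field
                field.setLength(0);
            } else {
                field.append(c); // ordinary character, including commas inside quotes
            }
        }
        fields.add(field.toString()); // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) throws IOException {
        // Only the current line and its fields are held in memory at any time.
        try (BufferedReader reader =
                 Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                List<String> fields = splitLine(line);
                System.out.println(fields.size() + " columns: " + fields);
            }
        }
    }
}
```

With this sketch, the line abc,xyz,"m,n" yields three fields, the last one being m,n. Spaces around a delimiter are kept as part of the field, so call trim() on each field if the file pads its columns.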
2 Answers
There are many third-party CSV parsing libraries.
I am using the UniVocity CSV parser, which is very fast and automatically detects the separator in rows. You can also look through the other CSV libraries.
I really like the Apache Commons CSVParser. This is almost verbatim from their user guide:
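    import java.io.FileReader;
    import java.io.Reader;
    import org.apache.commons.csv.CSVFormat;
    import org.apache.commons.csv.CSVParser;
    import org.apache.commons.csv.CSVRecord;

    Reader reader = new FileReader("input.csv");
    final CSVParser parser = new CSVParser(reader, CSVFormat.DEFAULT);
    try {
        // The parser is iterated record by record; nothing forces the whole file into memory.
        for (final CSVRecord record : parser) {
            // Looking a column up by name needs a header mapping, e.g. CSVFormat.DEFAULT.withHeader();
            // without one, use an index such as record.get(0).
            final String string = record.get("SomeColumn");
            ...
        }
    } finally {
        parser.close();
        reader.close();
    }

This is simple, configurable and line-oriented.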
You could configure it like this:
    final CSVParser parser = new CSVParser(reader, CSVFormat.DEFAULT.withHeader().withDelimiter(';'));

For the record, this configuration is unnecessary, as CSVFormat.DEFAULT works exactly the way you want it to.
This would be my first attempt, to see whether it fits into memory. If it doesn't, can you be a little more specific about what you mean by a low memory footprint?
3 Comments
jvisualvm and see what exactly consumes that much memory. Maybe a gc run is needed?