3

I have the below csv file which contain the following data which i want to convert into excel file into 2003 format with the help of POI jars so i am using poi 3.10 version for that .

while converting it into .xls file special care need to be taken for the column PAYMENT_TYPE as its value in between consist a comma but that should not be consider as delimiter and it value starts with double quotes and end with double quotes so that should be treated as single value so the position of column PAYMENT_TYPE is also fixed in the csv file that it will be always be at fifth position so below is the .csv file

REC_STATUS,TRADE_ID,SETTLEMENT_DATE,TRADE_EFFECTIVE_DATE,PAYMENT_TYPE,VERSION,BREAK_DOWN_BUCKET,CAUSE,NUM_CASHFLOWS_AFFECTED,PROFILE Found only in File :B,178942690,01-Feb-16,03-Dec-14,"Coupon",5,NOISY_BREAK_BUCKET,REC_TOOL_ISSUE_PAYMENT_DIRECTION_MISMATCH | REC_TOOL_ISSUE_NOTIONAL_MISMATCH | TRADE_VERSION,1,AVS Offshore Found only in File :A,197743320,Various,21-Dec-15,"Brokerage Estimated,Upfront Fee",1,ACTUAL DATA BREAK BUCKET,ACTUAL_DATA_BREAK,2,AVS Offshore 

now i have code to convert it into .xls as shown below the only issue with that .xls file is that when i open that .xls file the value of second row is not coming correct that is the value of fifth column in excel is Brokerage Estimated,Upfront Fee and then sixth column value is Upfront Fee which is not correct the value of sixth column is 1 , please advise how can i correct this in my below piece of code

package test; import java.io.DataInputStream; import java.io.FileInputStream; import java.io.FileOutputStream; import java.io.IOException; import java.util.ArrayList; import org.apache.poi.hssf.usermodel.HSSFCell; import org.apache.poi.hssf.usermodel.HSSFRow; import org.apache.poi.hssf.usermodel.HSSFSheet; import org.apache.poi.hssf.usermodel.HSSFWorkbook; import org.apache.poi.ss.usermodel.Cell; public class CSVToExcelConverter { public static void main(String args[]) throws IOException { ArrayList arList=null; ArrayList al=null; String fName = "C:\\Vabc.csv"; String thisLine; int count=0; FileInputStream fis = new FileInputStream(fName); DataInputStream myInput = new DataInputStream(fis); int i=0; arList = new ArrayList(); while ((thisLine = myInput.readLine()) != null) { al = new ArrayList(); String strar[] = thisLine.split(","); for(int j=0;j<strar.length;j++) { if(j == 4){ al.add(strar[j] + "," + strar[j+1]); j++; } al.add(strar[j]); } arList.add(al); System.out.println(); i++; } try { HSSFWorkbook hwb = new HSSFWorkbook(); HSSFSheet sheet = hwb.createSheet("new sheet"); for(int k=0;k<arList.size();k++) { ArrayList ardata = (ArrayList)arList.get(k); HSSFRow row = sheet.createRow((short) 0+k); for(int p=0;p<ardata.size();p++) { HSSFCell cell = row.createCell((short) p); String data = ardata.get(p).toString(); if(data.startsWith("=")){ cell.setCellType(Cell.CELL_TYPE_STRING); data=data.replaceAll("\"", ""); data=data.replaceAll("=", ""); cell.setCellValue(data); }else if(data.startsWith("\"")){ data=data.replaceAll("\"", ""); cell.setCellType(Cell.CELL_TYPE_STRING); cell.setCellValue(data); }else{ data=data.replaceAll("\"", ""); cell.setCellType(Cell.CELL_TYPE_NUMERIC); cell.setCellValue(data); } //*/ // cell.setCellValue(ardata.get(p).toString()); } System.out.println(); } FileOutputStream fileOut = new FileOutputStream("C:\\test.xls"); hwb.write(fileOut); fileOut.close(); System.out.println("Your excel file has been generated"); } catch ( Exception ex ) { ex.printStackTrace(); } //main method ends } } 

folks please advise for this

any early help would be much appreciated

2
  • You should take a look at some csv library (like opencsv for example) Commented Feb 10, 2016 at 16:13
  • please advise how can i correct my above logic please Commented Feb 10, 2016 at 16:22

1 Answer 1

1

Do not use split() to parse csv data, especially if some fields contain comma's.

 String strar[] = thisLine.split(","); for(int j=0;j<strar.length;j++) { if(j == 4){ al.add(strar[j] + "," + strar[j+1]); j++; } al.add(strar[j]); } 

In the above code the row is split on , and the length of strar will be different depending on a , being present in PAYMENT_TYPE.

Then you check for the column number you're in, and always add a , and the next value - even if PAYMENT_TYPE contains no ,.

The csv data itself is valid and properly quoted.

Here is some pseudo code to parse it correctly:

read a line bool bInsideQuotes = false loop over chars if character == '"' bInsideQuotes = !bInsideQuotes if character == ',' and !bInsideQuotes found a field separator 
Sign up to request clarification or add additional context in comments.

3 Comments

Well Thanks a lot it would be great if you can please add this in my above piece of code so that I can grasp more thanks in advance
@sss - All you have to do is keep a startpos of the field, loop to the next real separator, considering quoted fields (see pseudo code above), take the substring startpos->currentpos, set startpos 1 past currentpos and repeat until end of line.
Thanks again but java implementation as I have done in my program would have help a lot so request you if you can please do the same Thanks in advance.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.