I have a text file that can have X number of fields, each separated by a comma. In my script I read it line by line, check how many fields are populated on that line, and determine how many commas I need to append to the end of that line so it represents all the fields. For instance, a file that looks like this:
Address,nbItems,item1,item2,item3,item4,item5,item6,item7
2325988023,7,1,2,3,4,5,6,7
2327036284,5,1,2,3,4,5
2326168436,4,1,2,3,4

Should become this:
Address,nbItems,item1,item2,item3,item4,item5,item6,item7
2325988023,7,1,2,3,4,5,6,7
2327036284,5,1,2,3,4,5,,
2326168436,4,1,2,3,4,,,

My script below works, but it seems terribly inefficient. Is it the line-by-line reading that has a hard time with large files? Is it the sed that causes the slowdown? Is there a better way to do this?
#!/bin/bash
lineNum=0
# Commas in the header line = the number of separators every line should end up with
numFields=`head -1 File.txt | egrep -o "," | wc -l`
cat File.txt | while read LINE
do
    lineNum=`expr 1 + $lineNum`
    # Commas already present on this line
    num=`echo $LINE | egrep -o "," | wc -l`
    needed=$(( numFields - num ))
    # Append one comma at a time to the end of this line, rewriting the whole file each pass
    for (( i=0 ; i < $needed ; i++ ))
    do
        sed -i "${lineNum}s/$/,/" File.txt
    done
done
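I was wondering whether a single pass with awk would avoid rewriting the file over and over. This is just an untested sketch of what I had in mind (it pads each line up to the header's field count and writes to a new file, Padded.txt, which is a name I made up, instead of editing in place):

    awk -F, '
    NR == 1 { total = NF }        # the header defines how many fields each line should have
    {
        printf "%s", $0           # print the original line without a newline
        for (i = NF; i < total; i++)
            printf ","            # one comma per missing field
        print ""                  # terminate the line
    }' File.txt > Padded.txt

Would something like that be the right direction, or is there a cheaper fix to my loop above?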