2

I have a huge csv file with 500.000+ lines. I want to add an amount to the "Price" column via the terminal in Ubuntu. I tried using awk (best solution?) but I don't know how. (I also need to keep the header in the new file)

Here is an example of the file

"Productno.";"Description";"Price";"Stock";"Brand" "/5PL0006";"Drum Unit";"379,29";"10";"Kyocera" "00096103";"Main pcb HUK, OP6w";"882,00";"0";"OKI" "000J";"Drum, 7033/7040 200.000";"4306,00";"0";"Minolta" 

I want to for example, add 125 to the price so the output is:

"Productno.";"Description";"Price";"Stock";"Brand" "/5PL0006";"Drum Unit";"504,29";"10";"Kyocera" "00096103";"Main pcb HUK, OP6w";"1007,00";"0";"OKI" "000J";"Drum, 7033/7040 200.000";"4431,00";"0";"Minolta" 
1
  • Any particular reason for favoring Terminal commands? This may better be done with a dedicated program. It doesn't need to be complicated either (well, depending on the language). An obvious advantage would be it can accept the price change as a parameter, so you can re-use it at will. Commented Feb 5, 2016 at 11:13

4 Answers 4

2
$ awk 'BEGIN {FS=OFS="\";\""} NR>1 {$3 = sprintf("%.2f", $3+125)}1' p.txt "Productno.";"Description";"Price";"Stock";"Brand" "/5PL0006";"Drum Unit";"504,29";"10";"Kyocera" "00096103";"Main pcb HUK, OP6w";"1007,00";"0";"OKI" "000J";"Drum, 7033/7040 200.000";"4431,00";"0";"Minolta" 

Note that this requires a value of environment variable LC_NUMERIC that expects , as the decimal separator (I had mine set to LC_NUMERIC="de_DE", e.g.).

For more DRYness you can pass in the amount you want to add with -v:

$ awk -v n=125 'BEGIN {FS=OFS="\";\""} NR>1 {$3 = sprintf("%.2f", $3+n)}1' p.txt 

If you don't care so much about the formatting (that is, if "4431" instead of "4431,00" is acceptable), you can skip the sprintf:

$ awk -v n=125 'BEGIN {FS=OFS="\";\""} NR>1 {$3+=n}1' p.txt 

EDIT: Set FS and OFS in BEGIN block, instead of independently via -v, as suggested in the comments (to better ensure that they receive the same value, since it's important that they be the same).

Sign up to request clarification or add additional context in comments.

Comments

0

Perl to the rescue! Save as add-price, run as perl add-price input.csv 125.

#!/usr/bin/perl use warnings; use strict; use Text::CSV; my ($file, $add) = @ARGV; my $csv = 'Text::CSV'->new({ binary => 1, sep_char => ';', eol => "\n", always_quote => 1, }) or die 'Text::CSV'->error_diag; open my $IN, '<', $file or die $!; open my $OUT, '>', "$file.new" or die $!; while (my $row = $csv->getline($IN)) { if (1 != $csv->record_number) { my $value = $row->[2]; $value =~ s/,/./; $value = sprintf "%.2f", $value + $add; $value =~ s/\./,/; $row->[2] = $value; } $csv->print($OUT, $row); } close $OUT or die $!; 

Comments

0

You can also use php and this fantastic library :https://github.com/parsecsv/parsecsv-for-php :

Download first the library, add it to a new folder and add a copy of your CSV file to the folder (make sure to use a copy, the save method of this library can delete the data of your csv file if you do not use it properly) :

With this library you can parse and modify directly the values !

<?php // !!! Make a copy of your csv file before executing this // Require the Parse CSV library , that you can find there : https://github.com/parsecsv/parsecsv-for-php require_once 'parsecsv.lib.php'; // Instanciate it $csv = new parseCSV(); // Load your file $csv->auto('data.csv'); // Get the number of data rows $nb_data_rows=count($csv->data)-1; // Iterate through each data row. for ($i = 0; $i <= $nb_data_rows; $i++) { // Define the new Price $new_price=$csv->data[$i]["Price"]+125; // Format the price in order to keep two decimals $new_price=number_format($new_price, 2, '.', ''); // Modify the ith value of your csv data $csv->data[$i]=array( "Productno."=> $csv->data[$i]["Productno."], "Description."=> $csv->data[$i]["Description"], "price"=>$new_price, "Stock"=> $csv->data[$i]["Stock"], "Brand"=> $csv->data[$i]["Brand"] ); // save it ! $csv->save(); } 

1 Comment

What's wrong with PHP's fgetcsv? The parsecsv-for-php you mention seems not to be compatible with PHP-5 ...
0

If you aren't concerned about the possibility of ';' occurring within the first two fields, and if you don't want to be bothered with dependence on environment variables, then consider:

awk -F';' -v add=125 ' function sum(s, d) { # global: q, add gsub(q, "", s); split(s,d,","); return (add+d[1])","d[2]; } BEGIN {OFS=FS; q="\""; } NR>1 {$3 = q sum($3) q} {print} ' 

This preserves the double-quotes ("). Using your input, the above script produces:

"Productno.";"Description";"Price";"Stock";"Brand" "/5PL0006";"Drum Unit";"504,29";"10";"Kyocera" "00096103";"Main pcb HUK, OP6w";"1007,00";"0";"OKI" "000J";"Drum, 7033/7040 200.000";"4431,00";"0";"Minolta" 

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.