10

I have a CSV file with data presented as follows:

87540221|1356438283301|1356438284971|1356438292151697
87540258|1356438283301|1356438284971|1356438292151697
87549647|1356438283301|1356438284971|1356438292151697

I'm trying to save the first column to a new file (without the field separator), and then delete the first column from the main CSV file along with the first field separator.

Any ideas?

This is what I have tried so far

awk 'BEGIN{FS=OFS="|"}{$1="";sub("|,"")}1' 

but it doesn't work

1
  • 6
    What about cut? cut -d '|' -f 2- Commented May 8, 2013 at 18:58

5 Answers

19

This is simple with cut:

$ cut -d'|' -f1 infile
87540221
87540258
87549647

$ cut -d'|' -f2- infile
1356438283301|1356438284971|1356438292151697
1356438283301|1356438284971|1356438292151697
1356438283301|1356438284971|1356438292151697

Just redirect into the file you want:

$ cut -d'|' -f1 infile > outfile1
$ cut -d'|' -f2- infile > outfile2 && mv outfile2 file

7

Assuming your original CSV file is named "orig.csv":

awk -F'|' '{print $1 > "newfile"; sub(/^[^|]+\|/,"")}1' orig.csv > tmp && mv tmp orig.csv 
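As a minimal sketch of how the one-liner above behaves on the sample rows from the question, the following can be run in an empty scratch directory. The file names orig.csv, newfile and tmp follow the answer; the scratch directory is an assumption.

```shell
# Recreate the three sample rows from the question (scratch directory assumed).
printf '%s\n' \
  '87540221|1356438283301|1356438284971|1356438292151697' \
  '87540258|1356438283301|1356438284971|1356438292151697' \
  '87549647|1356438283301|1356438284971|1356438292151697' > orig.csv

# Single pass: write field 1 to "newfile", strip it (and its separator)
# from the record, then print what is left via the trailing pattern 1.
awk -F'|' '{print $1 > "newfile"; sub(/^[^|]+\|/,"")}1' orig.csv > tmp && mv tmp orig.csv

cat newfile    # first column only, one value per line
cat orig.csv   # remaining three columns, still pipe-separated
```

One thing to keep in mind: [^|]+ needs at least one character before the first pipe, so a row whose first field is empty would be left unchanged by the sub. Using [^|]* instead would probably cover that case.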

1 Comment

The awk solution ran about 20 times faster than cut when tested on a 15 GB CSV file.
3

GNU awk

awk '{$1="";$0=$0;$1=$1}1' FPAT='[^|]+' OFS='|' 

Output

1356438283301|1356438284971|1356438292151697
1356438283301|1356438284971|1356438292151697
1356438283301|1356438284971|1356438292151697

2 Comments

How does it work?
@darw It sets $1 to the empty string. $0=$0 then resplits the record, which now starts with a | (the OFS). FPAT doesn't match an empty field, so $1 is now the (former) second field. $1=$1 forces a rebuild of the output record, with OFS='|'. The 1 that follows the {...} block is shorthand for print: it is the pattern of a second pattern-action pair and is always true, and with no action awk defaults to {print}. gawk recognises variable settings after the program text without needing a -v.
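Two pieces of that explanation, the bare 1 pattern and the rebuild triggered by a field assignment, can be tried with any POSIX awk. FPAT itself is gawk-only and is not used in this sketch; sample.txt is a hypothetical scratch file.

```shell
# Hypothetical one-line scratch file.
printf '%s\n' 'a|b|c' > sample.txt

# A bare pattern of 1 is always true; with no action, awk falls back to {print}.
awk '1' sample.txt                           # prints a|b|c

# Assigning to any field makes awk rebuild $0, joining the fields with OFS.
# FS and OFS are given as operands after the program text, without -v.
awk '{$1=$1}1' FS='|' OFS=',' sample.txt     # prints a,b,c
```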
1

The pipe is a special regex symbol, and the sub function expects a regex as its first argument, so the pipe has to be escaped. The correct awk command should be this:

awk 'BEGIN {FS=OFS="|"} {$1=""; sub(/\|/, "")} 1' file 

OUTPUT:

1356438283301|1356438284971|1356438292151697
1356438283301|1356438284971|1356438292151697
1356438283301|1356438284971|1356438292151697
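To illustrate the point about the pipe being a regex metacharacter, here is a small sketch showing two ways to match a literal pipe in sub: the escaped form and a bracket expression. The file row.txt is a hypothetical scratch file holding the first sample row from the question.

```shell
# Hypothetical scratch file with one sample row.
printf '%s\n' '87540221|1356438283301|1356438284971|1356438292151697' > row.txt

# Escaped pipe inside a regex literal.
awk 'BEGIN {FS=OFS="|"} {$1=""; sub(/\|/, "")} 1' row.txt

# Bracket expression: inside [...] the pipe is an ordinary character.
awk 'BEGIN {FS=OFS="|"} {$1=""; sub(/[|]/, "")} 1' row.txt
```

Both commands print 1356438283301|1356438284971|1356438292151697, since emptying $1 leaves a leading pipe that the sub then removes.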

0

With sed:

sed 's/[^|]*|//' file.txt 
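The sed command above writes the remaining columns to standard output and leaves file.txt untouched. A minimal sketch of the whole task with sed might look like this, assuming the hypothetical output names first.txt and rest.txt and a scratch copy of file.txt:

```shell
# Scratch copy of the data (two sample rows from the question).
printf '%s\n' \
  '87540221|1356438283301|1356438284971|1356438292151697' \
  '87540258|1356438283301|1356438284971|1356438292151697' > file.txt

# Keep only the first column: delete everything from the first pipe onward.
sed 's/|.*//' file.txt > first.txt

# Drop the first column and its separator, then replace the original.
# In a basic regular expression an unescaped pipe is a literal character.
sed 's/[^|]*|//' file.txt > rest.txt && mv rest.txt file.txt
```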
