I have a file that contains several record types. The record type is in positions 1-2 of each line; in the sample below the record types are 1, 2, 1A, 2A, 3, and so on.

File1:

1 xxxx uuuu dfffgg cvbd
2 jjj oo dhjkkk ooo
2 9555 schghf kllls
1A chkds tddc ihg
2A hkkseadc
1 fdsff kljjgt uoohgf
1A ghyytd gkddgg tusab sg;dadug tdskd
1A gdjhjkh hdw ouiy axv kaksh ;ljqskl
3 gdhd tfyw ;lk;k; joo
1 gdhsgdhj uyutyu ljkgjg
2 hjkhclkshclk jhshcklj dhkjdh
2A hjkdhfsh jj okop oipo

I want to generate a sequence number that groups the records into transactions: everything from one record of type 1 up to, but not including, the next record of type 1 is one transaction.

In the file above, for example, the first 5 lines (from 1 through 2A) should form the 1st transaction file, lines 6 through 9 (from the next 1 through 3) form the 2nd transaction, the lines from the next 1 through 2A form the 3rd transaction, and so on.

I want to split the file accordingly. I used the code below to generate the sequence number:

awk ' BEGIN {SEQ=0 } {if ( substr($0,1,2) == "1 " ) {SEQ++;} print $0SEQ }' file1 > file2 

Now my file2 looks like this:

1 xxxx uuuu dfffgg cvbd1
2 jjj oo dhjkkk ooo1
2 9555 schghf kllls1
1A chkds tddc ihg1
2A hkkseadc1
1 fdsff kljjgt uoohgf2
1A ghyytd gkddgg tusab sg;dadug tdskd2
1A gdjhjkh hdw ouiy axv kaksh ;ljqskl2
3 gdhd tfyw ;lk;k; joo2
1 gdhsgdhj uyutyu ljkgjg3
2 hjkhclkshclk jhshcklj dhkjdh3
2A hjkdhfsh jj okop oipo3

The sequence number is appended to the end of each line. That causes validation issues for me, because the values are passed on by fixed position and length. Is there any way to put the sequence number at a desired fixed position, or at the beginning of the line, rather than at the end?

Is there a better way to group the records into transactions?

2 Answers

awk ' BEGIN {SEQ=0 } {if ( substr($0,1,2) == "1 " ) {SEQ++;} printf "%10d%s\n",SEQ,$0 }' file1 > file2 
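
The command above right-aligns the sequence number in a 10-character field at the start of each record. If the number has to sit at some other fixed column instead, one possible variation is to split the record with substr and print the number between the two parts. This is only a sketch: the variable name pos, the insert column 3, the zero-padded width of 5, and the shortened sample data are all assumptions, not part of the question.

```shell
# Shortened copy of the sample file from the question.
cat > file1 <<'EOF'
1 xxxx uuuu dfffgg cvbd
2 jjj oo dhjkkk ooo
1A chkds tddc ihg
1 fdsff kljjgt uoohgf
3 gdhd tfyw ;lk;k; joo
EOF

# pos is a hypothetical parameter: the 1-based column where the number starts.
# The number is zero-padded to 5 characters so every record grows by the same
# amount; both values are assumptions and should match the real layout.
awk -v pos=3 '
    substr($0, 1, 2) == "1 " { seq++ }    # a type 1 record starts a new transaction
    {
        # text before pos, then the padded number, then the rest of the record
        printf "%s%05d%s\n", substr($0, 1, pos - 1), seq, substr($0, pos)
    }' file1 > file2
```

With pos=3 the record type stays in positions 1-2 and the sequence number occupies positions 3-7 of every output line, so downstream fixed-position parsing only needs to account for a constant shift of 5 characters.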

Increment the sequence number whenever the first field is 1, then keep prefixing records with that number until the first field is 1 again. Rinse and repeat.

awk ' $1 == 1 {seq++} { print seq $0 } ' file 

Output:

11 xxxx uuuu dfffgg cvbd
12 jjj oo dhjkkk ooo
12 9555 schghf kllls
11A chkds tddc ihg
12A hkkseadc
21 fdsff kljjgt uoohgf
21A ghyytd gkddgg tusab sg;dadug tdskd
21A gdjhjkh hdw ouiy axv kaksh ;ljqskl
23 gdhd tfyw ;lk;k; joo
31 gdhsgdhj uyutyu ljkgjg
32 hjkhclkshclk jhshcklj dhkjdh
32A hjkdhfsh jj okop oipo
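
Since the question also mentions wanting one file per transaction, the same counter could be used to pick an output file instead of being printed into the record, which leaves the fixed positions untouched. The following is a rough sketch under that assumption; the txn_ file name pattern, the 5-digit padding, and the shortened sample data are made up for illustration.

```shell
# Shortened copy of the sample file from the question.
cat > file <<'EOF'
1 xxxx uuuu dfffgg cvbd
2 jjj oo dhjkkk ooo
2A hkkseadc
1 fdsff kljjgt uoohgf
3 gdhd tfyw ;lk;k; joo
EOF

awk '
    $1 == 1 {
        # A type 1 record starts a new transaction: close the previous
        # output file and switch to the next one.
        if (out != "") close(out)
        out = sprintf("txn_%05d.txt", ++seq)   # hypothetical naming scheme
    }
    # Records that appear before the first type 1 record are skipped.
    out != "" { print > out }
' file
```

Each record is written unchanged, so here txn_00001.txt receives the first three records and txn_00002.txt the remaining two. Closing each file before opening the next keeps the number of simultaneously open files at one, which matters when the input holds many transactions.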