Convert formatted dates to seconds since the epoch

Question

I have a file:

pablo tty8 Thu Nov 1 12:51:21 2012 still logged in
(unknown tty8 Thu Nov 1 12:50:57 2012 - Thu Nov 1 12:51:21 2012 (00:00)
pablo tty2 Thu Nov 1 12:50:39 2012 still logged in
pablo tty7 Thu Nov 1 12:49:45 2012 - Thu Nov 1 12:50:56 2012 (00:01)
(unknown tty7 Thu Nov 1 12:34:32 2012 - Thu Nov 1 12:49:45 2012 (00:15)

I want to replace each date in the file above with the corresponding number of seconds since the epoch. The desired output is:

pablo tty8 1351770681 still logged in
(unknown tty8 1351770657 - 1351770681 (00:00)
pablo tty2 1351770639 still logged in
pablo tty7 1351770585 - 1351770656 (00:01)
(unknown tty7 1351769672 - 1351770585 (00:15)

I tried this command:

gawk --posix 'function my() {"date -d \047"$0"\047 +%s" | getline b; gsub( /[A-Za-z]{3} [A-Za-z]{3} [0-9] ([0-9]{2}:){2}[0-9]{2} [0-9]{4}/,b );print} { my() }' file

The above command does not work:

$ gawk --posix 'function my()
> {"date -d \047"$0"\047 +%s" | getline b;
> gsub( /[A-Za-z]{3} [A-Za-z]{3} [0-9] ([0-9]{2}:){2}[0-9]{2} [0-9]{4}/,b ); print}
> { my() }' ta
date: invalid date `pablo tty8 Thu Nov 1 12:51:21 2012 still logged in '
pablo tty8 still logged in
(unknown tty8 1351897200 - 1351897200 (00:00)
date: invalid date `pablo tty2 Thu Nov 1 12:50:39 2012 still logged in '
pablo tty2 1351897200 still logged in
date: invalid date `pablo tty7 Thu Nov 1 12:49:45 2012 - Thu Nov 1 12:50:56 2012 (00:01) '
pablo tty7 1351897200 - 1351897200 (00:01)
(unknown tty7 1351897200 - 1351897200 (00:15)

How can I fix the above command?

@glenn jackman, sorry for the duplicate topic in another forum.
– nowy1, Commented Nov 4, 2012 at 9:10
Just wondering what the norms are on this. As an asker, the OP would definitely be tempted to post to multiple platforms in the hope of getting more attention and hence a better chance of resolving his/her question.
– Ketan, Commented Nov 12, 2012 at 15:03

JRFerguson · Accepted Answer · 2012-11-03 14:22:14Z

Here's an alternate approach (using mktime):

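#!/bin/awk -f
# Needs gawk: mktime() is a GNU extension.
BEGIN {
    # mktime() wants a numeric month, so map the abbreviations to 1..12.
    # (Formatting "Nov" with %d would silently yield 0.)
    split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", N, " ")
    for (i = 1; i <= 12; i++) M[N[i]] = i
}
{
    split($6,A,":")
    S1=sprintf("%d %d %d %d %d %d",$7,M[$4],$5,A[1],A[2],A[3])
    T1=mktime(S1)
    if ($8=="-") {
        # A second date follows the "-": convert it as well.
        split($12,A,":")
        S2=sprintf("%d %d %d %d %d %d",$13,M[$10],$11,A[1],A[2],A[3])
        T2=mktime(S2)
        print $1,$2,T1,$8,T2,$14
    } else {
        print $1,$2,T1,$8,$9,$10
    }
}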
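As a quick check of the input format mktime() expects, here is a minimal sketch that converts a single date from the sample file. It assumes gawk is installed (mktime() is a GNU extension, not POSIX awk). The helper name to_epoch and the month lookup string are illustrative and not part of the answer above. The month abbreviation is turned into a number first, because mktime() takes a string of the form "YYYY MM DD HH MM SS".

```shell
#!/bin/sh
# to_epoch (hypothetical helper): convert "Www Mmm D HH:MM:SS YYYY" to epoch seconds.
# Assumes gawk; mktime() interprets the fields in the local time zone ($TZ).
to_epoch() {
    printf '%s\n' "$1" | gawk '{
        # Month abbreviation -> 1..12 from its position in a lookup string.
        m = (index("JanFebMarAprMayJunJulAugSepOctNovDec", $2) + 2) / 3
        split($4, t, ":")
        print mktime(sprintf("%d %d %d %d %d %d", $5, m, $3, t[1], t[2], t[3]))
    }'
}

to_epoch 'Thu Nov 1 12:51:21 2012'
```

With TZ=UTC the call prints 1351774281; the value 1351770681 shown in the question corresponds to a time zone one hour ahead of UTC.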

Stéphane Chazelas · Accepted Answer · 2012-11-03 22:51:08Z

To do it your way, that would have to be something like:

POSIXLY_CORRECT=1 awk '
{
  n = ""; r = $0
  while (match(r, /[[:alpha:]]{3} [[:alpha:]]{3} +[0-9]+ ([0-9]{2}:){2}[0-9]{2} [0-9]{4}/)) {
    c = "date -d\"" substr(r,RSTART,RLENGTH) "\" +%s"
    c | getline b
    close(c)
    n = n substr(r,1,RSTART-1) b
    r = substr(r,RSTART+RLENGTH)
  }
  print n r
}'

I want to be able to awk like you :)
– delh, Commented Nov 4, 2012 at 17:39

Thor · Accepted Answer · 2012-11-03 20:16:48Z

You could do it like this with GNU sed:

convert_date.sed

: a
s/(([A-Za-z]{3} ){2}[0-9]{1,2} ([0-9]{2}:){2}[0-9]{2} [0-9]{4})(.*)/\n\4\n\1/
h
s/.*\n//
s/^/date -d "/
s/$/" +%s/e
G
s/([^\n]+)\n([^\n]+)\n([^\n]+)\n.*/\2\1\3/
/([A-Za-z]{3} ){2}[0-9]{1,2} ([0-9]{2}:){2}[0-9]{2} [0-9]{4}/ta

Run it like this:

sed -rf convert_date.sed infile

Output:

pablo tty8 1351770681 still logged in
(unknown tty8 1351770657 - 1351770681 (00:00)
pablo tty2 1351770639 still logged in
pablo tty7 1351770585 - 1351770656 (00:01)
(unknown tty7 1351769672 - 1351770585 (00:15)

Explanation

This may look a bit daunting at first, but the idea is not that complicated. The regular expression ([A-Za-z]{3} ){2}[0-9]{1,2} ([0-9]{2}:){2}[0-9]{2} [0-9]{4}, which occurs in the first substitution and in the conditional at the end, matches the date format used in the input. The first substitution captures the date and isolates it from the text around it. The surrounding bits are stored in hold space while date -d is run on the captured date. Finally, all the bits are collected in pattern space and rearranged into the correct order.

The conditional at the end repeats the process if any dates remain in pattern space.
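
The same isolate, convert, reassemble loop can be sketched in plain shell, which may make the control flow easier to follow. This is not the sed script itself but an illustrative equivalent: the function name convert_line is made up for the example, and it assumes GNU date (for -d) and a grep that supports -o and -E.

```shell
#!/bin/sh
# Sketch only: mirrors the loop of the sed script, one date per iteration.
# Assumes GNU date (-d) and a grep with -o/-E; convert_line is a hypothetical name.

# Same date pattern as in the sed script.
date_re='([A-Za-z]{3} ){2}[0-9]{1,2} ([0-9]{2}:){2}[0-9]{2} [0-9]{4}'

convert_line() {
    line=$1
    while :; do
        # Isolate the first remaining date; stop when none is left.
        d=$(printf '%s\n' "$line" | grep -oE "$date_re" | head -n 1)
        [ -n "$d" ] || break
        # Convert it; give up instead of looping forever if date rejects it.
        epoch=$(date -d "$d" +%s) || return 1
        # Reassemble: text before the date, the epoch, text after the date.
        line=${line%%"$d"*}$epoch${line#*"$d"}
    done
    printf '%s\n' "$line"
}

# To process a whole file (infile, as in the answer):
#   while IFS= read -r l; do convert_line "$l"; done < infile
convert_line 'pablo tty7 Thu Nov 1 12:49:45 2012 - Thu Nov 1 12:50:56 2012 (00:01)'
```

Like the sed version, this starts one date process per date found, so it is convenient for short listings rather than fast on large files. The result depends on the local time zone, because date -d interprets the timestamps as local time.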

Thank you for a good solution. I just have one question: can your solution also be used in awk?
– nowy1, Commented Nov 3, 2012 at 19:30
@nowy1: Not easily, no. Several useful awk solutions have been suggested; maybe those will do.
– Thor, Commented Nov 3, 2012 at 23:09

Stéphane Chazelas · Accepted Answer · 2012-11-03 23:05:47Z

With perl and its Date::Manip module:

perl -MDate::Manip -pe '
  s/\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d+/
    UnixDate ParseDate("$&"),"%s"/ge'

JRFerguson · Accepted Answer · 2012-11-04 16:23:18Z

The Perl solution provided by Stéphane requires a non-core Perl module. One could similarly use Time::Piece, which has been a core module since Perl 5.10:

#!/usr/bin/env perl
use strict;
use warnings;
use Time::Piece;
my $t = Time::Piece->new;
while (<>) {
    s{\w{3}\s(\w{3}\s\d{1,2}\s\d\d:\d\d:\d\d\s\d{4})}
     {$t=Time::Piece->strptime($1,"%b %d %H:%M:%S %Y"); sprintf "%s",$t->epoch}ge;
    print;
}

Or POSIX::mktime
– Stéphane Chazelas, Commented Nov 4, 2012 at 17:16
