
I have a sample CSV data file with 1M records, all carrying a single date (14/03/2017 00:11:17). I need to generate 6 months of data from this sample file. My Bash script takes 20 minutes to generate 1 day of data.

DATA SAMPLE

  • '12/01/2017 03:22:17,sampledata,1234,sample,123455,67546464'

EXPECTED RESULT

  • '01/01/2017 03:22:17,sampledata,1234,sample,123455,67546464'

  • '02/01/2017 03:22:17,sampledata,1234,sample,123455,67546464'

    to

  • '30/01/2017 03:22:17,sampledata,1234,sample,123455,67546464'

  • What does it mean to generate 6 months of data in your case? Should it be ordered? Should it be saved into separate files? Commented Aug 17, 2017 at 12:52
  • I have only 1 day of data, dated "12/01/2017 03:22:17". I want to generate data from "01/01/2017 03:22:17" to "01/07/2017 03:22:17" from a CSV file that has 1M rows. I tried Bash + sed, but it was too slow, hence I need the help of a Perl or Python script. Commented Aug 17, 2017 at 12:54
  • @R C, so you have 1M rows of the same 1 day of data? Commented Aug 17, 2017 at 12:55
  • @RomanPerekhrest, Yes Commented Aug 17, 2017 at 12:59
  • So, for each 1M rows of your existing data, you need to make about 180 copies of it (with incrementing dates)? Commented Aug 17, 2017 at 13:03

1 Answer

cat 6months.pl 
#!/usr/bin/env perl
use strict;
use warnings;
use Text::CSV;
use DateTime;
use DateTime::Format::Strptime;
use autodie qw/ open close /;

my $csv = Text::CSV->new({binary => 1, quote_space => 0});
my $dateparser = DateTime::Format::Strptime->new(pattern => "%d/%m/%Y %T", time_zone => "local");

for my $file (@ARGV) {
    open my $fh, '<', $file;
    while (my $row = $csv->getline($fh)) {
        # The first column holds the date; the rest of the row is copied unchanged.
        my $datestr = shift @$row;
        # Start at the first day of the row's month (this also resets the time to midnight).
        my $date = $dateparser->parse_datetime($datestr)->truncate(to => 'month');
        my $end = $date->clone->add(months => 6);
        # Emit one copy of the row per day, up to and including the end date.
        while ($date <= $end) {
            $csv->say(*STDOUT, [$dateparser->format_datetime($date), @$row]);
            $date->add(days => 1);
        }
    }
    close $fh;
}

Running it:

perl 6months.pl data.csv 
01/01/2017 00:00:00,sampledata,1234,sample,123455,67546464
02/01/2017 00:00:00,sampledata,1234,sample,123455,67546464
...
30/06/2017 00:00:00,sampledata,1234,sample,123455,67546464
01/07/2017 00:00:00,sampledata,1234,sample,123455,67546464

Just noticed this resets the time to midnight. If you want to keep the time, do this instead:

my $date = $dateparser->parse_datetime($datestr)->set(day => 1);
#                                                 ^^^^^^^^^^^^^
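
Since the question also mentions Python, here is a rough equivalent of the time-preserving variant, using only the standard library. It is a sketch under a few assumptions, not a tested drop-in: the date is always the first column in `%d/%m/%Y %H:%M:%S` format, dates are handled as naive values (no time zone, unlike `time_zone => "local"` above), and the names `expand_rows` and `add_months` are made up for this example.

```python
import csv
import sys
from datetime import datetime, timedelta

DATE_FORMAT = "%d/%m/%Y %H:%M:%S"  # assumed layout of the first column


def add_months(moment, months):
    """Shift `moment` by whole months. Only safe when the day is 1,
    because day 1 exists in every month."""
    month_index = moment.month - 1 + months
    return moment.replace(year=moment.year + month_index // 12,
                          month=month_index % 12 + 1)


def expand_rows(rows, months=6):
    """Yield one copy of each row per day, from the first day of the row's
    month up to and including the same day `months` months later.
    The time of day from the original row is kept."""
    for row in rows:
        if not row:  # csv.reader yields [] for blank lines
            continue
        start = datetime.strptime(row[0], DATE_FORMAT).replace(day=1)
        end = add_months(start, months)
        current = start
        while current <= end:
            yield [current.strftime(DATE_FORMAT)] + row[1:]
            current += timedelta(days=1)


def main(paths):
    # Stream rows straight to stdout so the expanded data is never held in memory.
    writer = csv.writer(sys.stdout, lineterminator="\n")
    for path in paths:
        with open(path, newline="") as handle:
            writer.writerows(expand_rows(csv.reader(handle), months=6))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Saved as, say, `six_months.py` (a hypothetical name), it would be run as `python six_months.py data.csv > out.csv`. As with the Perl version, the output is grouped by source row rather than sorted by date, and for a January row it produces 182 lines (01/01 through 01/07 inclusive).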
