
Okay, I think this is possible, but I can't quite figure it out. This is the situation.

A folder contains the log files of all the processes on my robot. The structure looks sort of like this:

$ ls -lrt
total 8
drwxrwxr-x 2 per per 4096 nov 3 12:46 launch01
-rw-rw-r-- 1 per per 0 nov 3 12:47 camera112.log
-rw-rw-r-- 1 per per 0 nov 3 12:47 motors121.log
-rw-rw-r-- 1 per per 0 nov 3 12:47 lidar111.log
drwxrwxr-x 2 per per 4096 nov 3 12:49 launch02
-rw-rw-r-- 1 per per 0 nov 3 12:49 motors122.log
-rw-rw-r-- 1 per per 0 nov 3 12:49 lidar211.log
-rw-rw-r-- 1 per per 0 nov 3 12:49 camera113.log

The files camera112.log, motors121.log and lidar111.log are associated with the logs in folder launch01. I would like to write a script that gets all the files that belong to a specific launch and tars them into one tarball. Since timestamps can differ slightly between files, and the numbers in the file names are only loosely related, I think the best way to gather all the relevant files is to take every file that appears below launch01 (inclusive) in the listing, up to the next directory (exclusive). The number of files can vary, as can the timestamps and names. What is consistent is the pattern: a folder, then a bunch of files, then the next folder, then its files, and so on. Ultimately, I would like to get the latest set of logs easily.

Unsure of the approach here. Any ideas how to go about this?

Clarifications:

  • Number of files can vary.
  • The exact timestamp is not reliable (as above, the folder launch01 has a different timestamp than camera112.log), but the relative order works fine. For instance, if I could tar all files from launch01 (inclusive) to launch02 (exclusive) in the list produced by ls -lrt, that would work great.
  • Welcome to the site. Please elaborate what you mean by "timestamps can change between slightly by files". Do you mean the timestamps are not a reliable means of associating the files that belong together? The sort order of the ls -lrt command uses the timestamps, so if you can't rely on them ... Commented Nov 3, 2021 at 13:00
  • "all files which are below launch01" presumably you mean "all files that are newer than launch01"? Above and below have only visual meaning. Commented Nov 3, 2021 at 14:32
  • Adding to @Theophrastus' comment, maybe there's another way of linking the files to the folders that doesn't rely on something as unreliable as the dates, if you can think of such an option. Commented Nov 3, 2021 at 15:17
  • @roaima - "all files which are below launch01" - I took that to be referring to the visual of the output of ls -lrt... so, below launch01/ and above launch02/ ("up to the next directory in the list") Commented Nov 3, 2021 at 15:25
  • @Greenonline oh yes, I completely missed that possibility; I was looking at the set of files shown in the question Commented Nov 3, 2021 at 15:31

1 Answer

Splitting the task into chunks, using your input of

drwxrwxr-x 2 per per 4096 nov 3 12:46 launch01
-rw-rw-r-- 1 per per 0 nov 3 12:47 camera112.log
-rw-rw-r-- 1 per per 0 nov 3 12:47 motors121.log
-rw-rw-r-- 1 per per 0 nov 3 12:47 lidar111.log
drwxrwxr-x 2 per per 4096 nov 3 12:49 launch02
-rw-rw-r-- 1 per per 0 nov 3 12:49 motors122.log
-rw-rw-r-- 1 per per 0 nov 3 12:49 lidar211.log
-rw-rw-r-- 1 per per 0 nov 3 12:49 camera113.log

Create the "ordered" list of the filenames only

Use either one of these:

ls -lrt | tr -s ' ' | cut -d' ' -f9
ls -lrt | awk '{print $9}'

gives (plus a leading blank line, since the "total 8" header has no ninth field):

launch01
camera112.log
motors121.log
lidar111.log
launch02
motors122.log
lidar211.log
camera113.log
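As a side note (not part of the original answer), awk's $NF means "the last field", so a guard on the field count drops the "total" header in the same pass; like the $9 version, this still breaks on file names containing spaces:

```shell
# Variant: print the last field of each line; NF > 2 skips the "total" header
ls -lrt | awk 'NF > 2 {print $NF}'
```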

Farm the list off into sections

Modifying this answer to Split one file into multiple files based on delimiter, create a file called awk_pattern containing the following:

BEGIN { fn = "part1.txt"; n = 1 }
{
    if (substr($0, 1, 6) == "launch") {
        close(fn)
        n++
        fn = "part" n ".txt"
    }
    print > fn
}

and then running

ls -lrt | awk '{print $9}' | awk -f awk_pattern 

gives the required output:

part1.txt

(a single blank line, from the "total 8" header)

and then

part2.txt

launch01
camera112.log
motors121.log
lidar111.log

part3.txt

launch02
motors122.log
lidar211.log
camera113.log

The first file (part1.txt) should be discarded, as it contains nothing useful:

rm part1.txt

tar the contents of each part

From 6.3 Reading Names from a File

tar -c -v -z -T part2.txt -f part2.tgz 
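As a sanity check (not part of the original answer), tar's -t flag lists an archive's contents without extracting it. A self-contained sketch, using a throwaway file in a scratch directory since part2.tgz may not exist yet:

```shell
# Demo fixture in a scratch directory: one log file and one name list
cd "$(mktemp -d)"
touch camera112.log
printf 'camera112.log\n' > part2.txt

# Create the archive from the name list, then list it back with -t
tar -c -z -T part2.txt -f part2.tgz
tar -t -z -f part2.tgz        # prints: camera112.log
```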

Looping through the part files

for part_file in part*.txt
do
    tar_file=${part_file%.*}    # or: tar_file=$(basename "${part_file}" .txt)
    tar -c -v -z -T "${part_file}" -f "${tar_file}.tgz"
done

This should give:

part1.tgz
part2.tgz
part3.tgz

Again, part1.tgz should be discarded:

rm part1.tgz 

Putting it all together

#!/bin/bash
ls -lrt | awk '{print $9}' | awk -f awk_pattern
for part_file in part*.txt
do
    tar_file=${part_file%.*}
    tar -c -v -z -T "${part_file}" -f "${tar_file}.tgz"
done
rm part1.txt
rm part1.tgz

As just one script (incorporating the awk pattern)

#!/bin/bash
ls -lrt | awk '{print $9}' | awk '
BEGIN { fn = "part1.txt"; n = 1 }
{
    if (substr($0, 1, 6) == "launch") {
        close(fn)
        n++
        fn = "part" n ".txt"
    }
    print > fn
}'
for part_file in part*.txt
do
    tar_file=${part_file%.*}
    tar -c -v -z -T "${part_file}" -f "${tar_file}.tgz"
done
rm part1.txt
rm part1.tgz

This should (hopefully) work, although I have only tested the first two steps, i.e. up to the tar part, as I don't have the log files to tar up.
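Since the part files are numbered in time order, the most recent launch's logs end up in the highest-numbered one, which covers the "get the latest set of logs" goal. A sketch, assuming sort supports -V (GNU coreutils and recent BSDs); the fixture files and the name latest.tgz are invented for the demo:

```shell
# Demo fixture in a scratch directory: two part files standing in for the awk output
cd "$(mktemp -d)"
touch camera112.log camera113.log
printf 'camera112.log\n' > part2.txt
printf 'camera113.log\n' > part3.txt

# Version sort so part10.txt sorts after part9.txt, then take the last one
latest=$(printf '%s\n' part*.txt | sort -V | tail -n 1)
tar -c -z -T "$latest" -f latest.tgz
```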


Possible improvements:

  1. Post-processing: Remove the part*.txt files (rm part*.txt)

  2. Post-processing: Remove the log files once tar'd up (rm *.log)

  3. Post-processing: Remove the directories once tar'd up (rm -R -- */)

    See this answer to How do I remove all sub-directories from within a directory?.

  4. Prevent awk from producing the useless part1.txt file

  5. Save the tar files elsewhere (... -f ${tar_path}/${tar_file}.tgz)

  6. Don't use intermediary part*.txt files.
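Improvements 4 and 6 can be tackled together by collecting the names in a bash array and tarring each group directly, so no part*.txt files (and no empty part1) are ever created. A sketch, untested on the real robot logs; tar_launch_groups is an invented helper name:

```shell
#!/bin/bash
# Walk the ls -lrt listing once; start a new group at each launch* entry
# and tar the previous group straight away. Here part1.tgz corresponds to
# the first launch directory (anything older than it would land in part0.tgz).
tar_launch_groups() {
    local n=0 name
    local -a files=()
    while IFS= read -r name; do
        [ -z "$name" ] && continue        # skip the "total ..." header line
        if [[ $name == launch* ]]; then
            if [ ${#files[@]} -gt 0 ]; then
                tar -c -z -f "part$n.tgz" "${files[@]}"
            fi
            n=$((n + 1))
            files=()
        fi
        files+=("$name")
    done < <(ls -lrt | awk '{print $9}')
    if [ ${#files[@]} -gt 0 ]; then       # flush the final group
        tar -c -z -f "part$n.tgz" "${files[@]}"
    fi
}
```

Running tar_launch_groups inside the log directory should then produce part1.tgz, part2.tgz, ... directly, with no intermediate text files to clean up.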

  • Tested on OS X. Commented Nov 4, 2021 at 11:16
