Timeline for Sort and backup log4j files

Current License: CC BY-SA 3.0

13 events

when	what		by	license	comment
Dec 4, 2014 at 12:02	vote	accept	Albert
Dec 3, 2014 at 17:30	answer	added	muru		timeline score: 1
Dec 3, 2014 at 17:06	comment	added	muru		@Albert the weird character you're looking for is `\0` (`NUL`). Try something like: `perl -0777 -pe 's/\n(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},)/\0$1/g' log.file` to replace every newline just before a date/time stamp with the null character (`-0777` makes perl read the whole file at once, so the pattern can match across line ends), and then use sort's `-z` option.
Dec 3, 2014 at 16:51	comment	added	Albert		OK, I get that now. I'm afraid I won't be allowed to do that. Besides, I would need a lot of processes running concurrently, one per file: I have 11 files per logger, and 5 loggers running 24/7. It's too much.
Dec 3, 2014 at 16:41	comment	added	jofel		You cannot guarantee it, but if new log lines are written immediately, you get the ordering automatically due to the different timings. Each `tail -f` waits until its log file changes and then writes the new lines to its output. Of course, the tail commands need to run the whole time. It does not work after the fact.
Dec 3, 2014 at 16:37	comment	added	Albert		@jofel, I'm not sure how to do that. How does that guarantee the output file is sorted?
Dec 3, 2014 at 16:35	comment	added	Albert		I uploaded two files with the conditions I described: each file is sorted, but they are not merged: file.log.1 and file.log.2. On the other hand, I've been thinking about detecting each EOL character (\n or \r\n) that is not the end of a record and replacing it with a weird character. Then merge, sort (e.g. as @muru said), and replace the weird characters with EOL characters again. What do you think?
Dec 3, 2014 at 16:19	comment	added	jofel		It probably mixes the logs together and may not work correctly, but you can try to run parallel `tail -f` for each logfile and append their output to the same output file.
Dec 3, 2014 at 14:17	comment	added	Albert		@jimmij: No, it is not fixed. It can contain a summary of some process, an XML, a response from a REST service, etc. There is no pattern.
Dec 3, 2014 at 14:16	comment	added	Albert		@muru, I will, but I have to hide some information (company policy). It'll take me some time after lunch ;)
Dec 3, 2014 at 14:08	comment	added	jimmij		Is the number of lines in each block of text fixed?
Dec 3, 2014 at 14:03	comment	added	muru		I'd like to point out that dates of the form `YYYYdMMdDD` can be sorted lexicographically (`d` being some delimiter), so you can sort directly on the first key. The same holds for `HHdMMdSS`, if you use 24-hour format. (so `-k1 -k2.1,2.8`). Can you post a few files (at some pastebin, or on Github) with scrubbed example data, so I can do some trials? Also, if you're on bash: `sort .... file.log.{10..1} file.log -o $FILE_SORTED`
Dec 3, 2014 at 13:51	history	asked	Albert	CC BY-SA 3.0
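
The approach discussed in the comments (treat everything from one timestamp to the next as a single record, then order records by their timestamp) can also be sketched without a placeholder character. The following is a minimal Python sketch, not the method from the accepted answer. It assumes that every record starts with a `YYYY-MM-DD HH:MM:SS,mmm` stamp (the comma-millisecond suffix is an assumption based on the regex in the comments) and that each input file is already sorted, as Albert describes. The names `records`, `stamp_key` and `merge_logs` are hypothetical helpers introduced for illustration.

```python
import heapq
import re
from typing import Iterable, Iterator, List

# Assumed record header: "YYYY-MM-DD HH:MM:SS," at the start of a line.
STAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},")


def records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Group lines into records; each timestamped line starts a new record."""
    current: List[str] = []
    for line in lines:
        if STAMP.match(line) and current:
            yield current
            current = []
        current.append(line)
    if current:
        yield current


def stamp_key(record: List[str]) -> str:
    """Sort key: first 23 characters, assumed to be 'YYYY-MM-DD HH:MM:SS,mmm'.

    Stamps in this layout compare correctly as plain strings.
    """
    return record[0][:23]


def merge_logs(*sources: Iterable[str]) -> Iterator[str]:
    """Merge already-sorted logs by timestamp, keeping multi-line records whole."""
    merged = heapq.merge(*(records(s) for s in sources), key=stamp_key)
    for record in merged:
        yield from record


if __name__ == "__main__":
    # Made-up sample data, only to show the call shape.
    log_a = [
        "2014-12-03 13:51:02,100 INFO start\n",
        "<xml>\n",
        "</xml>\n",
        "2014-12-03 13:51:04,000 INFO done\n",
    ]
    log_b = ["2014-12-03 13:51:03,500 WARN slow response\n"]
    print("".join(merge_logs(log_a, log_b)), end="")
```

Because `heapq.merge` reads its inputs lazily, passing open file objects instead of lists should keep memory use low even with many rotated files. If the inputs were not individually sorted, collecting the records into a list and calling `sorted(..., key=stamp_key)` would be the simpler choice.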