3

Following https://www.baeldung.com/linux/history-remove-avoid-duplicates, I added

    clean_history() {
        awk '!a[$0]++' $HOME/.bash_history > $HOME/.bash_history.tmp &&
            mv $HOME/.bash_history.tmp $HOME/.bash_history
    }
    trap clean_history DEBUG # execute after every command

to my ~/.bashrc file to ensure that duplicate entries are removed after every command is executed. This is important to me because Kile (the KDE LaTeX editor) leaves a ton of "clear" and "cd ..." commands in the history, effectively destroying the history within a single session of LaTeX writing. I have not found any way to prevent Kile from doing that.

Anyway... the above line of code effectively removes all subsequent entries of a command and leaves the first entry in the history. However, I want the last entry to remain in the history and all previous entries to be deleted. How can I do this?

By the way: Kile ignores

HISTCONTROL=ignoredups:erasedups 

so using this approach seems not to be an option.

I would also be happy if we could prevent Kile from writing to the history altogether.

Non-duplicates:

9
  • 1
    Dumb question: In a list of duplicate lines, what difference does choosing either the first or last occurrence make in the end result? Commented Jul 1, 2024 at 22:18
  • Do you even need history from Kile? If not, you could probably detect if Kile is the parent process in .bashrc and just disable history recording. Commented Jul 1, 2024 at 22:34
  • 2
    @spuck, say a certain command is issued regularly. Choosing the first instance means it's buried near the beginning of history. Choosing the last means it's recent. The difference is whether the first time or most recent time a command is used is preserved. If the purpose of history is to repeat recently used commands (usually the case), the latter is "better". Commented Jul 1, 2024 at 22:37
  • @Mathew: No, I do not need (nor want) the history from Kile, but I do not know how to disable Kile's history recording (and I searched). If you know how to do that, that would be great. Commented Jul 2, 2024 at 10:08
  • 1
    And another suggestion for keeping history clean ... I personally wouldn't use Ctrl-r for a cd .. and I use Ctrl-L for a clear, so having these in history ever seems unnecessary. I use HISTIGNORE= for things I see as pollutants, so your ~/.bashrc could have e.g. HISTIGNORE="cd:cd ..:pwd:exit:mc:clear:history *" ... Commented Jul 17, 2024 at 17:36

4 Answers

6

You can reverse the file, process it, and reverse it back. Most systems have either tail -r or GNU tac (or its busybox or toybox clone) for that:

    (
      umask 077 # make sure the new ~/.bash_history remains private
      tac ~/.bash_history | awk '!seen[$0]++' | tac > ~/.bash_history.tmp &&
        mv ~/.bash_history.tmp ~/.bash_history
    )
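As a quick check of the reverse, deduplicate, reverse idea, here is a minimal sketch that runs the same pipeline on a made-up history instead of the real file. The function name `keep_last` and the sample commands are illustrative only, and GNU tac is assumed to be available.

```shell
# Hypothetical helper: read lines on stdin and keep only the LAST
# occurrence of each distinct line, in the original order.
keep_last() {
    tac | awk '!seen[$0]++' | tac
}

# Made-up history in which "clear" and "cd .." are repeated.
printf '%s\n' 'clear' 'cd ..' 'make' 'clear' 'cd ..' 'ls' | keep_last
```

This prints make, clear, cd .. and ls, one per line: each repeated command survives only at its most recent position.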
0
2
    awk '{ delete line[lnum[$0]]; line[NR] = $0; lnum[$0] = NR }
         END { for (n = 1; n <= NR; n++) { if (n in line) print line[n] } }'

We maintain two associative arrays. line[i] gives us the content of line number i. As we scan through the input, lnum[str] gives us the most recent line number which contains str.

At each step, we first delete the currently seen line from the line[] array by looking up its line number in the reverse lnum index. We then enter the current line into the line array, and update the current line number in lnum as being the most recent occurrence of that line. Thus, if the current line has been seen before, it is erased from line and superseded by a new entry.

Then at the end we just iterate through the line numbers, and print line[i], provided it has not been deleted.
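The bookkeeping described above can be tried on a small file. The sketch below wraps the same awk program in a function that rewrites a file in place through a temporary file. The function name `dedupe_keep_last` and the sample file are assumptions made for illustration, not part of the original command.

```shell
# Hypothetical wrapper: rewrite FILE so that only the last occurrence of
# each line remains, in the order of those last occurrences.
dedupe_keep_last() {
    file=$1
    tmp=$(mktemp "$file.XXXXXX") || return 1   # mktemp creates it private (0600)
    awk '
        {
            delete line[lnum[$0]]   # forget the earlier occurrence, if any
            line[NR] = $0           # store this line under its line number
            lnum[$0] = NR           # record where this text was last seen
        }
        END {
            for (n = 1; n <= NR; n++)
                if (n in line) print line[n]
        }
    ' "$file" > "$tmp" && mv "$tmp" "$file" || { rm -f "$tmp"; return 1; }
}

# Try it on a throwaway file rather than the real ~/.bash_history.
sample=$(mktemp)
printf '%s\n' 'clear' 'cd ..' 'make' 'clear' 'cd ..' 'ls' > "$sample"
dedupe_keep_last "$sample"
cat "$sample"
rm -f "$sample"
```

With the sample above, the file ends up containing make, clear, cd .. and ls, in that order.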

While the tac approach simplifies the awk code considerably, the command is not as portable, and adds an extra pass. That solution still builds an associative table out of the lines that it sees, albeit only one.

Note: we would like, in the END block, to just do for (n in line) print line[n], but Awk does not guarantee the order of associative array indices. I think that will work in GNU Awk, but it is not portable.

5
  • A couple of questions: 1. Why is the tac approach less portable? 2. So, the tac approach requires half the memory, because it only needs to keep the lines in memory once, while you (approximately) need to have all lines in memory twice - once for line and once for lnum. Is that correct? 3. Does this not produce errors when you try to delete line[lnum[$0]], but it cannot find the lookup entry? Or is this different than in other programming languages? Commented Jul 8, 2024 at 8:39
  • After some research regarding 3: It seems to me that the entry for lnum[$0] is created if it does not exist when lnum[$0] is called. Likewise for line[lnum[$0]]. And afterwards it is deleted with delete. Is that correct? Commented Jul 8, 2024 at 9:26
  • 1
    @Make42 line[lnum[$0]] gets deleted. I don't know whether the delete operator creates the entry first if it doesn't exist and then deletes it, or whether it does nothing in the non-existence case. Certainly lnum[$0] gets created if it does not exist, with an undefined value, but we clobber it with a new value two statements later. Commented Jul 8, 2024 at 10:15
  • 1
    @Make42 Awk is very loose compared to many other scripting languages. You can delete things that don't exist, and such. If you simply mention a variable, that causes it to exist. Strings that look like numbers can be subject to arithmetic, ... Commented Jul 8, 2024 at 10:16
  • @Make42 Whether this solution takes more memory due to the parallel arrays depends on the Awk implementation. I can't think of any reason why the $0 string would not simply be reference counted so that line[NR] = $0 increments the refcount to store the value, and lnum[$0] increments the refcount to store the key. More memory will be used due to more assoc array entries, but not for a wholesale copy of the string data. Awk strings are not mutable, so they are shareable, similarly to strings in Javascript. Commented Jul 8, 2024 at 10:18
2

You could prevent kile from polluting your history by invoking it as unset HISTFILE;HISTSIZE=0 kile.
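One caveat when typing that line directly: unset HISTFILE also takes effect in the shell you type it in. A minimal sketch that confines both settings to a subshell follows. `run_without_history` is a hypothetical name, and the approach assumes that nothing in the child shell's startup files (for example ~/.bashrc) sets HISTSIZE again.

```shell
# Hypothetical wrapper: run a command with the history-related settings
# changed only for that command, leaving the calling shell untouched.
run_without_history() {
    (
        unset HISTFILE      # dropped inside this subshell only
        HISTSIZE=0 "$@"     # the command and its children inherit HISTSIZE=0
    )
}

# Intended use (assumes kile is on PATH):
#   run_without_history kile
# Stand-in command that just reports what it inherited:
run_without_history sh -c 'echo "HISTSIZE=$HISTSIZE"'
```

The stand-in prints HISTSIZE=0, while HISTFILE and HISTSIZE in the calling shell keep their previous values.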

Also, see my AskUbuntu answer: https://askubuntu.com/questions/80371/bash-history-handling-with-multiple-terminals/80882#80882

1
  • Thank you! I tried this out, works great! Not the answer to the question I asked but the answer I needed :-). Although, by now, I combine your approach and the one by Kaz. Commented Jul 17, 2024 at 9:04
2

Rather than answer the Question as written, I'm going to offer a suggestion for fixing the underlying issue.

First, if you haven't already, file a bug against Kile; it seems like there ought to at least be an option for Kile to turn off shell history. (Granted, since it may not be able to rely on a specific shell, this might be a little complicated, but e.g. running a specific user script could still be a solution, even if it requires more user effort.)

Second... since we're already talking about fiddling with .bashrc (which I presume is being run by Kile), a better solution would be to teach .bashrc to detect if the shell is being run by Kile and to disable history in that case:

    # in your .bashrc...
    has_ancestor() {
        # i[0] is the parent PID, i[1] is the first word of the command line
        i=( $(ps --no-headers -o ppid,cmd $1) )
        [ "${i[0]}" == 1 ] && return 1
        [ "$2" == "${i[1]##*/}" ] && return 0
        has_ancestor ${i[0]} "$2"
    }

    if has_ancestor $$ kile; then
        set +o history # ...or whatever
    fi

This walks the process tree from the shell's PID ($$) up to pid 1, checking if the command name (the ##*/ removes a leading path which may or may not be present) matches some specified name (in this case, kile). It's very likely that Kile will be the parent process of the shell it's running. (There are ways to disassociate from the spawning process, but it's very unlikely Kile would do this. At least Konsole does not.)

This might also be overkill; here's a simplified (but less robust) version:

    has_parent() {
        ppid=$(ps --no-headers -o ppid $1)
        # only cmd is requested here, so i[0] is the command itself
        i=( $(ps --no-headers -o cmd $ppid) )
        [ "$2" == "${i[0]##*/}" ]
    }

    if has_parent $$ kile; then
        set +o history # ...or whatever
    fi

In any case, the goal is simple: detect if bash's parent/ancestor is kile and "do something" accordingly. I've taken the guess that "something" is set +o history, which turns history recording off for that shell.
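For comparison, here is a hedged sketch of the same ancestor check written as a loop. It is a variant, not the code from this answer: it asks ps for the comm column (the bare executable name with procps on Linux, which is assumed here) instead of parsing cmd, and the function name `has_ancestor_comm` is made up for illustration.

```shell
# Hypothetical variant: succeed if any process from PID up towards PID 1
# has the executable name NAME. Assumes a ps that accepts -o ppid= -o comm= -p.
has_ancestor_comm() {
    local pid="$1" name="$2" info ppid comm
    while [ "${pid:-0}" -gt 1 ] 2>/dev/null; do
        # "ppid=" and "comm=" request the columns with empty headers
        info=$(ps -o ppid= -o comm= -p "$pid" 2>/dev/null) || return 1
        read -r ppid comm <<EOF
$info
EOF
        [ "$comm" = "$name" ] && return 0
        pid=$ppid   # move one level up the process tree
    done
    return 1
}

if has_ancestor_comm "$$" kile; then
    set +o history   # stop recording commands in this shell
fi
```

Because comm reports only the executable name, this sidesteps the path and argument handling. On Linux the name is truncated to 15 characters, which does not matter for a short name like kile.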

8
  • Love the idea! Follow up questions: 1. From a practical standpoint: Do you think there are advantages to what waltinator suggests? 2. Would the ps --no-headers -o cmd $(ps --no-headers -o ppid $$) replace everything in the above code, or only the code in has_ancestor(){...}? Commented Jul 17, 2024 at 9:09
  • 1
    @Make42, 1) waltinator's suggestion is to launch kile in an environment with history disabled. That probably works, but it's inconvenient, and will only help if you remember to use it every time. (That said, if you meant using the environment changes instead of set -o history, that might work also, but seems more round-about than set -o history.) 2) See edit. It would probably be beneficial to learn enough shell to understand how both functions actually work rather than just trusting what I wrote. 🙂 Commented Jul 17, 2024 at 17:13
  • @Make42, looking at the output of ps --no-headers -o cmd will only work if the command-line used to invoke Kile is exactly kile, not /usr/bin/kile, kile <args...>, or a combination thereof. The version in the edit retains robustness against these, only discarding the test against any ancestor process. If you were to write it inline, you would write something like if [ $(ps --no-headers -o cmd $(ps --no-headers -o ppid $$)) == kile ] instead of the if ... shown in the Answer, but mind the caveats in the beginning of this comment. Commented Jul 17, 2024 at 17:21
  • 1
    Shouldn't it be set +o history instead of set -o history? digi.ninja/blog/hiding_bash_history.php Commented Jul 19, 2024 at 9:57
  • 1
    Regarding "what is the difference to waltinator's suggestion": Your solution works independently of where (konsole, start menu, etc.) I call kile. Commented Jul 19, 2024 at 10:03
