I'm trying to mask some sensitive data in a log file.
I first need to select the lines in the file that match a given pattern, and then, on those lines only, replace any text that is inside double quotes while leaving text outside the quotes alone.
On every line that matches the pattern, anything inside double quotes needs to be replaced so that each A-Z becomes X, each a-z becomes x, and each digit 0-9 becomes 0.
A single line can contain multiple quoted strings. The quoted text can also contain special characters such as ',', '-', '.', and '@', and those should be preserved as-is.
Example file contents (the filter word in this case is 'KEYWORD'):
2020-04-18 15:01:12 [EVENT] :log-event-with-KEYWORD: {:entry1 {:entry2 {:value "Replace This"}}} -> {:entry1 {:entry2 {:value "Replace ALSO this."}}}
2020-04-18 15:01:13 [EVENT] :log-event-with-KEYWORD: {:entry1 {:entry2 {:value "REplace. THIS 12345"}}}
2020-04-18 15:01:15 [EVENT] :this_has--the-KEYWORD: {:entry1 {:entry2 {:value "[email protected]"}}} -> {:entry1 {:entry2 {:value "[email protected]"}}}
2020-04-18 15:01:18 [EVENT] :log-event-without-keyword: {:entry1 {:entry2 {:value "Do NOT replace this."}}} -> {:entry1 {:entry2 {:value "Do-NoT replace this either"}}}

That file as input would be processed into this output:
2020-04-18 15:01:12 [EVENT] :log-event-with-KEYWORD: {:entry1 {:entry2 {:value "Xxxxxxx Xxxx"}}} -> {:entry1 {:entry2 {:value "Xxxxxxx XXXX xxxx."}}}
2020-04-18 15:01:13 [EVENT] :log-event-with-KEYWORD: {:entry1 {:entry2 {:value "XXxxxxx. XXXX 00000"}}}
2020-04-18 15:01:15 [EVENT] :this_has--the-KEYWORD: {:entry1 {:entry2 {:value "[email protected]"}}} -> {:entry1 {:entry2 {:value "[email protected]"}}}
2020-04-18 15:01:18 [EVENT] :log-event-without-keyword: {:entry1 {:entry2 {:value "Do NOT replace this."}}} -> {:entry1 {:entry2 {:value "Do-NoT replace this either"}}}

Either the changed lines should be updated in the file, or the whole file with these modifications should be written to standard output. Lines that do not contain the keyword(s), the line order, and all other details should be preserved.
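
To pin down the intended behaviour, here is a rough Python sketch of the rule as I understand it. The names (KEYWORD, MASK_TABLE, mask_line, mask_stream) are made up for illustration, the keyword match is assumed to be a plain case-sensitive substring test, and quoted strings are assumed not to contain escaped double quotes. It is only meant as a reference for the expected result, not as the solution I'm after.

```python
import re

KEYWORD = "KEYWORD"  # assumed filter word, taken from the example above

# Map A-Z -> X, a-z -> x, 0-9 -> 0; every other character is left untouched.
MASK_TABLE = str.maketrans(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789",
    "X" * 26 + "x" * 26 + "0" * 10,
)


def mask_line(line, keyword=KEYWORD):
    """Mask text inside double quotes, but only on lines containing keyword."""
    if keyword not in line:  # case-sensitive substring test (an assumption)
        return line
    # "[^"]*" matches one quoted string at a time, so several quoted strings
    # on the same line are handled independently. Escaped quotes inside a
    # quoted string are not handled.
    return re.sub(r'"[^"]*"', lambda m: m.group(0).translate(MASK_TABLE), line)


def mask_stream(src, dst, keyword=KEYWORD):
    """Copy every line from src to dst in order, masking only matching lines."""
    for line in src:
        dst.write(mask_line(line, keyword))
```

Calling mask_stream(sys.stdin, sys.stdout) after importing sys would then print the whole file, with non-matching lines passed through unchanged.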
Is it possible to accomplish this using bash scripting or command-line tools like grep and/or sed?