Applying a set of consecutive regular expressions to multiple files (emacs)

Question

I am working on a project that converts a few dozen HTML files into text files, and have composed the replace-regexp expressions that do the job. The question is how to apply all six of them consecutively, and then how to do so for each of the dozens of files in the directory. I've appended my org explanation that includes the regexps, but keep in mind that those aren't the problem; they do their job (after translating the ^J, etc.). The question is just how to programmatically apply all six of them to each HTML file in the directory.

* 1. Delete all until >General Conference<
\(.*^J\)*.*?General Conference
* 2. Delete all <p class="copyright"> and after
^.*<p class="copy\(.*^J\)*
* 3. Strip all tags
\(<.*?>\)*
* 4. Remove whitespace lines
^\s-*^J
* 5. Remove ugly numeric identifier
^\s-*[0-9].*^J
* 6. Remove amp
&amp; -> &

Michael Hoffman · Accepted Answer · 2012-07-22 13:30:25Z

1. Open the directory with Dired: C-x d directory RET
2. Mark the files you want to change, either by pressing m (dired-mark) on each one individually, or through some other mechanism in the Mark menu in the menu bar, such as * . html RET (dired-mark-extension) to mark all files with an html extension.
3. Q regex RET RET (dired-do-query-replace-regexp) to replace every match of regex with nothing. You can use Ωmega's regex for this.
4. You can then either replace individual matches with SPC, or replace all remaining matches without further questions with !.

Precisely what I needed! The only downside was that even with "!" it still asks again when you get to the next file. So my one improvement after trying your answer is to use Ibuffer instead of Dired, which lets you run ibuffer-do-replace-regexp and hence bypass the querying (of course, you must be very confident in your regexp). That did it! Thanks!

event_jr · Accepted Answer · 2012-07-21 17:14:29Z

It wouldn't be hard to do this programmatically. But the idiomatic Emacs solution is to record 2 keyboard macros.

1. Perform each of your regexp replacements with replace-regexp in a single buffer.
2. In a dired buffer,
   1. move to the next html file (with C-s)
   2. open it in the other window
   3. run (1) in the other window and switch back to the dired buffer.

You would then run (2) with an absurdly large repeat count, C-u 1000 or something.


4 Comments

WorldsEndless Over a year ago

Great idea. Unfortunately, the regexps are complex enough that I need to yank them or use referential history commands, which screws up the macro since the history/pool are different each time I come around. I was also getting an error about the length of the macro. Perhaps I need to use a lisp script so I can save commands instead of keystrokes? Or am I missing something about making macros?

phils Over a year ago

Re: "You would then run (2) with an absurd number", Note that an argument of zero means repeat-until-failure.

event_jr Over a year ago

@WorldsEndless this is not an overly complicated macro scenario, you definitely don't need elisp. Have a look at registers.

Howard West Over a year ago

A way around the history reference problem would be to save the regular expressions into specific registers, where they can be reliably yanked out by a known command that won't change each time.
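The register suggestion can be sketched in a few lines of Emacs Lisp. This is only an illustration: the register names (?t, ?w, ?n) are arbitrary choices, and the strings are a transcription of steps 3 to 5 from the question with backslashes doubled and ^J written as \n, as Lisp string syntax requires.

```elisp
;; Store each regexp in its own register so a keyboard macro can
;; insert it with a fixed key sequence instead of relying on history.
;; Register names are arbitrary choices for this sketch.
(set-register ?t "\\(<.*?>\\)*")      ; step 3: strip all tags
(set-register ?w "^\\s-*\n")          ; step 4: whitespace-only lines
(set-register ?n "^\\s-*[0-9].*\n")   ; step 5: numeric identifier lines
```

While recording the macro, C-x r i t (insert-register) at the replace-regexp prompt should then insert the tag-stripping regexp, and the recorded keystrokes stay the same on every run.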

Raceimaztion · Accepted Answer · 2014-12-19 09:40:06Z

It seems you are only a step away from writing a function and applying it to a list of files.

Here's a draft to start from:

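(defun my-replacements ()
  "Apply each regexp replacement in turn to the whole buffer."
  (interactive "*")
  (save-restriction
    (widen)
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward "FIRST-REGEXP" nil t)
        (replace-match "FIRST-REPLACEMENT"))
      ;; Further goto-char / while / replace-match groups go here.
      )))

Repeat the three lines starting at (goto-char (point-min)) inside the save-excursion form until all the regexps are covered.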
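One possible way to finish the draft and apply it to a list of files is sketched below. It is not the answerer's code: the names my-replacement-pairs, my-apply-replacements and my-convert-directory are hypothetical, writing the result to a .txt file next to each .html file is an assumption about the desired output, and only steps 4 and 6 from the question are filled in. The other regexps would be added to the list the same way, with backslashes doubled and ^J written as \n.

```elisp
(defvar my-replacement-pairs
  '(("^\\s-*\n" . "")   ; step 4: remove whitespace-only lines
    ("&amp;" . "&"))    ; step 6: &amp; -> &
  "List of (REGEXP . REPLACEMENT) pairs, applied in order.
Hypothetical variable for this sketch.")

(defun my-apply-replacements (pairs)
  "Apply each (REGEXP . REPLACEMENT) in PAIRS to the whole current buffer."
  (save-excursion
    (save-restriction
      (widen)
      (dolist (pair pairs)
        (goto-char (point-min))
        (while (and (not (eobp))
                    (re-search-forward (car pair) nil t))
          (if (= (match-beginning 0) (match-end 0))
              ;; Empty match (possible with a pattern like \(<.*?>\)*):
              ;; skip it and step forward so the loop cannot spin in place.
              (unless (eobp) (forward-char 1))
            ;; Fixed case, literal replacement text.
            (replace-match (cdr pair) t t)))))))

(defun my-convert-directory (dir)
  "Apply `my-replacement-pairs' to every .html file in DIR.
The result is written to a .txt file with the same base name."
  (dolist (file (directory-files dir t "\\.html\\'"))
    (with-temp-buffer
      (insert-file-contents file)
      (my-apply-replacements my-replacement-pairs)
      (write-region (point-min) (point-max)
                    (concat (file-name-sans-extension file) ".txt")))))
```

Evaluating (my-convert-directory "directory") would then process every HTML file there without any prompting. The empty-match guard matters because a bare while/re-search-forward loop, unlike replace-regexp, does not advance past a match of length zero.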
