How would you approach querying org-mode files from command line? I first tried using awk/sed, but I've struggled to even print a single item:

awk '/^\* foo/,/^\* /' file.org 

It was recommended that I use this instead: awk '/^\* foo/,!/^\* foo/&&/^\* /' file.org, which made it clear to me that this approach won't work for more complex operations. I saw some Python and Clojure libraries, but they didn't seem to have command-line equivalents.

For example, I'd like to print all tasks called "** foo" as well as their top headers/parents and potentially some siblings matching a pattern. Are there already any solutions available?

UPDATE: I've since learned that this specific query can be performed more simply with sed -n '/^\* foo/,/^\* /p' file.org (sed conveniently starts searching for the end pattern on the next line).
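To make the difference between the three commands concrete, here is a small sketch that runs them against a throwaway sample file. The file contents and the variable names (demo, awk_naive, awk_fixed, sed_range) are made up for illustration; only the awk and sed expressions come from the question.

```shell
# Build a small sample Org file (hypothetical content, for illustration only).
demo=$(mktemp)
cat > "$demo" <<'EOF'
* foo
body of foo
** child
child text
* bar
body of bar
EOF

# awk tests the end pattern on the same line that opened the range.
# "* foo" also matches /^\* /, so the range closes immediately.
awk_naive=$(awk '/^\* foo/,/^\* /' "$demo")

# Excluding the start line from the end pattern keeps the range open
# until the next top-level heading.
awk_fixed=$(awk '/^\* foo/,!/^\* foo/&&/^\* /' "$demo")

# sed only starts looking for the end pattern on the following line.
sed_range=$(sed -n '/^\* foo/,/^\* /p' "$demo")

printf '%s\n---\n%s\n---\n%s\n' "$awk_naive" "$awk_fixed" "$sed_range"
rm -f "$demo"
```

With this input, the first awk command prints only the "* foo" line, while the other two print everything from "* foo" through "* bar" inclusive. Note that both ranges include the heading that terminates them, which is one more reason line-oriented tools get awkward for nested queries.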

  • Why not use emacs to parse Org mode files from the command line? Commented May 12, 2024 at 14:53
  • that would be ideal, but I couldn't find any example how to print all headers "** foo" in a file (from command line, e.g. emacs --batch -l org --eval '( ??? )')? Commented May 12, 2024 at 14:58
  • Please provide a working example of the org file you're trying to parse and an example of the output you would like. It's completely possible to do this but I need more details about the input and output to write a code example that will work from the command-line. Thanks! Commented May 13, 2024 at 17:55
  • @Melioratus sure, can we try this one: pastebin.com/kMNgTnDM ? Could there be a command line tool that will print all contents of headings named "inkscape wsl": pastebin.com/eMkN81pz Commented May 13, 2024 at 23:13
  • Thanks! I'll take a look and see what I can put together! Commented May 14, 2024 at 5:00

1 Answer

There are two considerations here: 1) write a function to extract the needed information and 2) run the function on an Org mode file from the command line. The two are pretty much orthogonal, so you should start with a simple function that extracts some useful information (but not necessarily all that you indicate above) and figure out how to run it from the command line. Once you have debugged that and can see your way through the forest, then write a more complicated function for step 1) that incrementally gets you to where you eventually want to go.

For example, suppose that we want to do the equivalent of wc -l: print the number of lines in the file. That's a much simpler function that doesn't even involve Org mode; nevertheless, it provides you an opportunity to figure out the mechanics of calling it on a file from the command line.

Here's a simple function that gives you the number of lines in the current buffer:

    (defun ndk/count-lines ()
      (save-excursion
        (goto-char (point-max))
        (1- (line-number-at-pos))))

You should run this function with M-: (ndk/count-lines) in various files to make sure it works as expected. Compare its output with that of wc -l.

Now to run this function on a file from the command line, store it in a file (let's call the file minimal-init.el and put it in the current directory for simplicity). Then to run the function on a file and print the result to stdout, run this command line:

emacs --batch --load ./minimal-init.el --file foo.org --eval '(print (ndk/count-lines))'

Explanation:

We run a batch session: that means that your init file is skipped, there is no interaction, there is no graphical frame opened and the standard functions print and message modify their behavior to produce output on stdout and stderr respectively (since there is no visible buffer or echo area for them to use). In addition, when all the arguments have been processed, emacs exits.

We then load a minimal init file that contains the definition of the function(s) that we eventually want to run (and any functions that they depend on that are user-defined).

Then we open the file foo.org which creates a buffer, but the buffer is not shown since there is no frame to show it. Nevertheless, that becomes the current buffer.

Finally we evaluate the expression: the function to count lines in the current buffer is called, it returns the number of lines as its result and then print prints that result (to stdout). After that, emacs exits.
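The stdout/stderr split described above is easy to check on its own, without any init file or Org file. The following is a minimal sketch, assuming only that an emacs binary is on your PATH; the strings are placeholders.

```shell
# Assumes an `emacs` binary on PATH.
# In batch mode, print writes to stdout and message writes to stderr.
emacs --batch --eval '(progn (print "to stdout") (message "to stderr"))'

# Discard stderr to keep only what print produced.
emacs --batch --eval '(progn (print "to stdout") (message "to stderr"))' 2>/dev/null

# princ omits the quotes and the surrounding newlines that print adds,
# which is usually more convenient in shell pipelines.
emacs --batch --eval '(princ (format "%d\n" (+ 1 2)))'
```

If the output is going to be consumed by another tool, princ with an explicit format string is probably the easier choice; print is handy when you want to see the Lisp value exactly as the reader would.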

That's step 1 - you can now use a different function to print some other information about the file. For example, try the following:

    (defun ndk/org-heading-list ()
      "Map the `ndk/heading' function on all entries of the current buffer."
      (org-map-entries #'ndk/heading t 'file))

    (defun ndk/heading ()
      "Return the text of the heading of the current entry."
      (org-get-heading t t t t))

Put these functions into your ./minimal-init.el file and run the same command line, except for replacing the ndk/count-lines function with ndk/org-heading-list:

emacs --batch --load ./minimal-init.el --file foo.org --eval '(print (ndk/org-heading-list))'

For your simple example file (from your linked question), I get:

    $ emacs --batch --load ./minimal-init.el --file foo.org --eval '(print (ndk/org-heading-list))'
    ("Example" "foo" "subheading" "bar" "subheading")
    $

Now you can write a more complicated function to act on the current buffer that does what you really want to do and add it to your ./minimal-init.el, but the invocation will be identical except for the name of that function.
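As a rough sketch of what such a function might look like, the script below prints the full subtree of every heading with a given title. The names my/org-print-subtrees, subtree-init.el and sample.org, as well as the sample contents, are hypothetical and not part of the setup above. It reuses org-map-entries and org-get-heading from the earlier functions and adds org-end-of-subtree to find where each subtree stops. Treat it as a starting point under those assumptions, not as a finished query tool.

```shell
# Sample input (hypothetical content).
cat > ./sample.org <<'EOF'
* Example
** foo
first foo body
*** subheading
nested text
** bar
bar body
* Other
** foo
second foo body
EOF

# Hypothetical init file holding the query function.
cat > ./subtree-init.el <<'EOF'
(defun my/org-print-subtrees (title)
  "Print the subtree of every heading whose text equals TITLE."
  (org-map-entries
   (lambda ()
     ;; org-map-entries calls us with point at the start of each heading.
     (when (string= (org-get-heading t t t t) title)
       (princ (buffer-substring-no-properties
               (point)
               ;; Move to the next heading at the same or a higher level
               ;; (or to the end of the buffer).
               (save-excursion (org-end-of-subtree t t) (point))))))
   t 'file))
EOF

# Same invocation pattern as before; only the function call changes.
emacs --batch --load ./subtree-init.el --file ./sample.org \
      --eval '(my/org-print-subtrees "foo")'
```

Replacing string= with string-match-p would turn the exact title into a regexp match. One caveat: if a matching heading sits inside another matching subtree, its text is printed twice, once as part of the parent and once on its own.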

And given that this answers the question in your title even though it does not answer the (more complicated) question in your comment, if you have problems writing that function, you should ask a different question on that.

  • thank you, that would work, but involves quite a few steps and I was hoping to find some simple one-liner that could be also modified in the future to different queries. The closest approach that I could come up with since posting the question was converting it to xml and then using xpath like this (it doesn't quite work yet though): pandoc -f org -t docbook input.org | xmllint --format --xpath '/<XPATH>'. That would allow one to freely modify xpath in an expressive and relatively easy way. I don't suppose that there is an "xpath for org-mode" equivalent? Sorry if this isn't emacs-specific. Commented May 15, 2024 at 12:33
  • "... some simple one-liner ..." - that's a dream we all have but it's only a dream. If you want to use Emacs for parsing Org mode documents you will need to learn the Org mode API for getting things out of files and enough Elisp so that you can string things together. There is no royal road to geometry or anything else for that matter. Converting to XML and using xpath might work in simple cases, but it's a lossy conversion so infinite extensibility is just not going to happen. You have to be realistic about these things: pandoc is not a panacea. Commented May 15, 2024 at 13:09
  • " ... quite a few steps ...": like most things, it looks complicated when it is described (and I tend to run on in the mouth sometimes), but it is easier done than said. If you spend half an hour with the answer and try it out, you will see that it is pretty simple - but it's never going to be a one-liner, not in awk, not in xpath, not in anything - not in the generality you are talking about. Commented May 15, 2024 at 13:20
  • true, I will give it a try Commented May 16, 2024 at 22:03
