
Motivated by the discussion in "How to convert docx to PDF in r?", I tried to convert a .docx file to PDF using the following code.

    pandoc <- "C:/Users/.../Pandoc/pandoc.exe"
    input  <- "C:/Users/.../abc.docx"
    output <- "C:/Users/.../abc.pdf"
    cmd <- sprintf('"%s" "%s" -o "%s"', pandoc, input, output)
    shell(cmd)

However, I am getting the "execution failed with error code 1" error. What's the solution? If there is some issue running this in R, how can I do this using other tools?
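To find out what "error code 1" actually refers to, it may help to capture pandoc's own messages rather than only its exit status. Below is a minimal sketch using base R's `system2()`; the helper name `run_and_capture` is made up for illustration, and the paths in the usage comments are the same elided placeholders as above. One thing worth checking in the captured message: pandoc does not write PDFs by itself but hands the job to a separate PDF engine (pdflatex by default), so a missing LaTeX installation is one possible cause, though not necessarily the one here.

```r
# Hypothetical helper: run an external program and keep its messages.
run_and_capture <- function(exe, args) {
  # With stdout = TRUE and stderr = TRUE, system2() returns the program's
  # output as a character vector; a non-zero exit code is attached to that
  # vector as the attribute "status" (and reported as a warning).
  out <- suppressWarnings(
    system2(exe, args = args, stdout = TRUE, stderr = TRUE)
  )
  status <- attr(out, "status")
  list(
    status = if (is.null(status)) 0L else status,
    output = as.character(out)
  )
}

# Usage (placeholder paths as in the question):
# res <- run_and_capture(pandoc, c(shQuote(input), "-o", shQuote(output)))
# res$status                   # exit code reported by pandoc
# cat(res$output, sep = "\n")  # the message explaining the failure
```

Unlike `shell(cmd)`, this keeps the text pandoc prints on failure, which is usually more informative than the exit code alone.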

1 Answer


I kept having the same problem with this method -- I just can't get it to work.

I did figure out a way to convert docx to PDF using RDCOMClient, however.

    library(RDCOMClient)

    file <- "C:/path/to your/doc.docx"

    wordApp <- COMCreate("Word.Application")     # create COM object
    wordApp[["Visible"]] <- TRUE                 # opens a Word application instance visibly
    wordApp[["Documents"]]$Add()                 # adds new blank docx in your application
    wordApp[["Documents"]]$Open(Filename = file) # opens your docx in wordApp

    # THIS IS THE MAGIC
    # FileFormat = 17 saves as .PDF
    wordApp[["ActiveDocument"]]$SaveAs("C:/path/to your/new.pdf", FileFormat = 17)

    wordApp$Quit()                               # quit wordApp

I found the FileFormat=17 bit here: https://learn.microsoft.com/en-us/office/vba/api/word.wdexportformat
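A more defensive variant of the RDCOMClient approach could look like the sketch below. It rests on some assumptions: the function names `as_windows_path` and `docx_to_pdf` are made up for illustration, and the idea that Word is happier with absolute, backslash-separated paths is a commonly reported observation rather than something confirmed in this thread. The sketch checks that the input file exists before Word is started, skips the extra blank document, and closes Word even if a step fails.

```r
# Hypothetical helper: absolute path to a file inside an existing directory,
# joined with a backslash (assumption: Word prefers this form on Windows).
as_windows_path <- function(path) {
  # normalizePath() stops with an error if the directory does not exist
  dir <- normalizePath(dirname(path), winslash = "\\", mustWork = TRUE)
  paste(dir, basename(path), sep = "\\")
}

# Hypothetical wrapper around the COM calls shown above (Windows + Word only).
docx_to_pdf <- function(input, output) {
  stopifnot(file.exists(input))              # fail early on a wrong file name
  word <- RDCOMClient::COMCreate("Word.Application")
  on.exit(word$Quit(), add = TRUE)           # quit Word even if a step below fails
  doc <- word[["Documents"]]$Open(as_windows_path(input))
  doc$SaveAs(as_windows_path(output), FileFormat = 17)  # 17 = PDF, as above
  invisible(output)
}

# Usage (same placeholder paths as above):
# docx_to_pdf("C:/path/to your/doc.docx", "C:/path/to your/new.pdf")
```

Calling `RDCOMClient::COMCreate` inside the function means the package is only needed when a conversion actually runs, and the early `file.exists()` check separates "wrong file name" from errors raised by Word itself.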

Edit: Alternative option: use Python from R via the reticulate package. This uses the pywin32 Python package. If you don't have it, you can install it using the instructions found here: https://rstudio.github.io/reticulate/articles/python_packages.html

I'm not as conversant in Python, but this works on my machine. See below:

    library(reticulate)

    com <- import("win32com.client")

    file <- "C:/path/to your/doc.docx"

    wordPy <- com$gencache$EnsureDispatch("Word.Application")
    wordPyOpen <- wordPy$Documents$Open(file)
    wordPyOpen$SaveAs("C:/path/to your/doc.pdf", FileFormat = 17)

    wordPy$Quit()

Hopefully this helps!


11 Comments

Thank you! I tried this, but I'm getting an error at wordApp[["Documents"]]$Open(Filename=file): <checkErrorInfo> 80020009 No support for InterfaceSupportsErrorInfo checkErrorInfo -2147352567 Error: Exception occurred.
I was having that issue too and realized that I couldn't even open my docx by clicking it. The docx itself was corrupted. That's probably not your issue, but make sure your file is in good order, and double-check the filename you provided for the file object is correct - that's the best advice I can give at this point.
I just tried opening that file and it did open. One thing though, it has images and lot of formatting stuff.
Even with lots of formatting and images, it seems strange that the open step would trip it up. I'm going to keep obsessing about this and I'll give an update if I find anything. Sorry!
By the way, what versions of RDCOMClient and R are you running?
