Basic PowerShell - batch convert Word Docx to PDF

Any ideas? Also, how would I need to change it to also convert .doc (not .docx) files, and to use local files (files in the same folder as the script)? Sorry, I have never done PowerShell scripting.

asked May 14, 2013 at 2:44 by takabanana

5 Answers

This will work for doc as well as docx files.

    $documents_path = 'c:\doc2pdf'

    $word_app = New-Object -ComObject Word.Application

    # This filter will find .doc as well as .docx documents
    Get-ChildItem -Path $documents_path -Filter *.doc? | ForEach-Object {

        $document = $word_app.Documents.Open($_.FullName)

        $pdf_filename = "$($_.DirectoryName)\$($_.BaseName).pdf"

        $document.SaveAs([ref] $pdf_filename, [ref] 17)

        $document.Close()
    }

    $word_app.Quit()

answered May 14, 2013 at 8:04

This worked, thanks. I changed $documents_path to be dynamic, based on the script location, by using $documents_path = Split-Path -parent $MyInvocation.MyCommand.Path

Commented May 14, 2013 at 22:16

Is OLE DB used in this conversion?

Commented Oct 28, 2013 at 14:37
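
To illustrate the script-location comment above, here is a minimal sketch of resolving the folder the script lives in. It assumes PowerShell 3.0 or later for the automatic variable $PSScriptRoot and falls back to the Split-Path expression from the comment; the final fallback to the current directory is my own assumption for interactive sessions, not something from the thread.

```powershell
# Resolve the folder that contains the running script.
if ($PSScriptRoot) {
    # Populated automatically for script files in PowerShell 3.0 and later.
    $documents_path = $PSScriptRoot
}
elseif ($MyInvocation.MyCommand.Path) {
    # The expression suggested in the comment; works in PowerShell 2.0 scripts.
    $documents_path = Split-Path -Parent $MyInvocation.MyCommand.Path
}
else {
    # Assumption: when pasted into an interactive session, use the current directory.
    $documents_path = (Get-Location).Path
}

# Same filter as the answer: picks up .doc as well as .docx files.
$word_files = Get-ChildItem -Path $documents_path -Filter *.doc?
```

With this in place, the rest of the answer's script can use $documents_path unchanged.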

You should also release the COM object: [System.Runtime.InteropServices.Marshal]::ReleaseComObject($word_app). Removing the entire variable is also suggested by The Scripting Guys: Remove-Variable word_app

Commented Nov 2, 2014 at 18:37
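
One way to combine the answer's loop with the cleanup suggested in this comment is to wrap the work in try/finally, so Word is closed and released even if a conversion throws. This is only a sketch: the function name Convert-WordFolderToPdf and its parameter are hypothetical, and it needs Word installed on Windows to actually run.

```powershell
# Hypothetical helper name; not part of the original answer.
function Convert-WordFolderToPdf {
    param(
        [Parameter(Mandatory = $true)]
        [string] $DocumentsPath
    )

    $wdFormatPDF = 17   # Word's WdSaveFormat value for PDF, as used in the answer
    $word_app = New-Object -ComObject Word.Application
    try {
        Get-ChildItem -Path $DocumentsPath -Filter *.doc? | ForEach-Object {
            $document = $word_app.Documents.Open($_.FullName)
            try {
                $pdf_filename = Join-Path $_.DirectoryName "$($_.BaseName).pdf"
                $document.SaveAs([ref] $pdf_filename, [ref] $wdFormatPDF)
            }
            finally {
                # Close the document even if SaveAs fails.
                $document.Close()
            }
        }
    }
    finally {
        # Runs whether or not a conversion threw, so Word is not left running.
        $word_app.Quit()
        [System.Runtime.InteropServices.Marshal]::ReleaseComObject($word_app) | Out-Null
        Remove-Variable word_app
    }
}
```

Usage would be something like Convert-WordFolderToPdf -DocumentsPath 'c:\doc2pdf'.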

This results in an error: a COMException on the SaveAs line. I have not found a solution that works on Windows Server 2016 yet.

Commented Mar 13, 2017 at 19:52

For anyone having problems with this version using [ref], make sure you have the latest Windows Management Framework 5.1 installed. Updating it worked for me, even when using this script on a Windows 7 machine.

Commented May 3, 2018 at 9:37

The above answers all fell short for me, as I was doing a batch job converting around 70,000 Word documents this way. As it turns out, doing this repeatedly eventually leads to Word crashing, presumably due to memory issues (the error was some COMException that I didn't know how to parse). So, my hack to get it to proceed was to kill and restart Word every 100 docs (an arbitrarily chosen number).

Additionally, when it did crash occasionally, it left behind malformed PDFs, each of which was generally 1-2 KB in size. So, when skipping already generated PDFs, I make sure they are at least 3 KB in size. If you don't want to skip already generated PDFs, you can delete that if statement.

Excuse me if my code doesn't look good; I don't generally use Windows and this was a one-off hack. So, here's the resulting code:

    $Files = Get-ChildItem -path '.\path\to\docs' -recurse -include "*.doc*"

    $counter = 0
    $filesProcessed = 0
    $Word = New-Object -ComObject Word.Application

    Foreach ($File in $Files) {
        $Name = "$(($File.FullName).substring(0, $File.FullName.lastIndexOf("."))).pdf"
        if ((Test-Path $Name) -And (Get-Item $Name).length -gt 3kb) {
            echo "skipping $($Name), already exists"
            continue
        }
        echo "$($filesProcessed): processing $($File.FullName)"
        $Doc = $Word.Documents.Open($File.FullName)
        $Doc.SaveAs($Name, 17)
        $Doc.Close()
        if ($counter -gt 100) {
            $counter = 0
            $Word.Quit()
            [System.Runtime.Interopservices.Marshal]::ReleaseComObject($Word)
            $Word = New-Object -ComObject Word.Application
        }
        $counter = $counter + 1
        $filesProcessed = $filesProcessed + 1
    }
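
The skip check in this answer (the PDF exists and is larger than 3kb) could also be pulled out into a small helper, which makes the threshold easy to change and lets you try the check without Word. A rough sketch; the function name Test-PdfLooksComplete and the MinimumBytes parameter are made up for illustration, and the 3kb default simply mirrors the value chosen above.

```powershell
# Hypothetical helper; the 3kb default mirrors the threshold chosen in the answer.
function Test-PdfLooksComplete {
    param(
        [Parameter(Mandatory = $true)]
        [string] $PdfPath,

        [long] $MinimumBytes = 3kb
    )

    # A missing file has obviously not been converted yet.
    if (-not (Test-Path -LiteralPath $PdfPath -PathType Leaf)) {
        return $false
    }

    # Files at or below the threshold are treated as leftovers from a crash.
    return (Get-Item -LiteralPath $PdfPath).Length -gt $MinimumBytes
}
```

Inside the loop, the existing condition would then read: if (Test-PdfLooksComplete $Name) { continue }.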