I had a shared folder into which users were continuously dropping MS Word documents, a mix of .doc and .docx files. I needed a way to automatically convert those documents to plain text (.txt) so that they could be imported into a specialized database. I found the perfect solution at
http://blogs.technet.com/b/heyscriptingguy/archive/2008/11/12/how-can-i-convert-word-files-to-pdf-files.aspx .
It was easy to adapt the PowerShell script from that post: I added a loop, and it did the job perfectly. However, after running for a day I started to get errors, and documents were no longer being converted. When I tried to open one of the problem documents manually in Word (specifically the ones with a .doc extension), I got a mswrd632.wpc converter error and had to click through it a couple of times before the document would open. I was able to manually save a problem document as .docx, and the script would then process it. After some research I found this link:
http://helpdeskgeek.com/office-tips/word-cannot-start-the-converter-mswrd632-wpc/
I chose to delete the registry key at
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Text Converters\Import\MSWord6.wpc
and that fixed my issue.
According to the author of that article, the only side effect is that Word 97 documents no longer open in WordPad.
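For anyone who would rather script the registry change than do it by hand in regedit, here is a minimal sketch. It assumes an elevated PowerShell prompt; the function name Remove-Word6ConverterKey and the backup file location are my own placeholders, not something from the linked article. It exports the key with reg.exe first so the change can be undone by re-importing the .reg file. On a 64-bit machine with 32-bit Office the key may sit under a Wow6432Node branch instead, so check the path on your system before running it.

```powershell
# Sketch only: back up, then delete, the MSWord6.wpc text converter key.
# Remove-Word6ConverterKey is a hypothetical helper name, and the default
# backup path is a placeholder; adjust both to your environment.
function Remove-Word6ConverterKey {
    [CmdletBinding(SupportsShouldProcess)]
    param(
        [string]$KeyPath = 'HKLM:\SOFTWARE\Microsoft\Shared Tools\Text Converters\Import\MSWord6.wpc',
        [string]$BackupFile = 'C:\temp\MSWord6-wpc-backup.reg'
    )

    if (-not (Test-Path -Path $KeyPath)) {
        Write-Warning "Key not found: $KeyPath"
        return
    }

    # reg.exe expects the long hive name rather than the HKLM: drive syntax.
    $regPath = $KeyPath -replace '^HKLM:\\', 'HKEY_LOCAL_MACHINE\'

    if ($PSCmdlet.ShouldProcess($KeyPath, 'Export and delete registry key')) {
        # Export the key first; /y overwrites an existing backup file.
        & reg.exe export $regPath $BackupFile /y | Out-Null
        if ($LASTEXITCODE -ne 0) { throw 'Backup failed; key left in place.' }
        Remove-Item -Path $KeyPath -Recurse
    }
}

# Preview what would happen, then run it for real:
# Remove-Word6ConverterKey -WhatIf
# Remove-Word6ConverterKey
```

Running with -WhatIf first only prints the action without touching the registry, which is a cheap sanity check on a production box.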
PowerShell script as I modified it for my project:
# PowerShell: convert .doc and .docx files to plain text (.txt)
# Infinite loop: run a conversion pass every 10 minutes
while ($true)
{
    $wdFormatText = 2    # WdSaveFormat value for plain text
    $word = New-Object -ComObject word.application
    $word.visible = $false
    $folderpath = "c:\fso"
    $fileTypes = "*.docx","*.doc"
    Get-ChildItem -Path "$folderpath\*" -Include $fileTypes |
    ForEach-Object {
        # Full path of the document without its extension
        $path = ($_.FullName).Substring(0, ($_.FullName).LastIndexOf("."))
        "Converting $path to txt ..."
        $doc = $word.documents.open($_.FullName)
        $doc.saveas([ref] $path, [ref] $wdFormatText)
        $doc.close()
        # Delete the source document once it has been converted
        Remove-Item $_.FullName
    }
    $word.Quit()
    # Move the converted text files to the import folder
    Move-Item "$folderpath\*.txt" d:\someFolder
    Start-Sleep -Seconds 600
}
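Since my trouble started with individual documents that Word could not open cleanly, a more defensive variant of the loop body may be worth considering. The following is only a sketch under my own assumptions: Get-TextPath and Convert-WordFolder are hypothetical helper names, and the idea is simply to wrap each file in try/catch so that one bad document logs a warning and is left in place instead of stopping the whole pass. It is not the script I actually ran.

```powershell
# Sketch only: Get-TextPath and Convert-WordFolder are hypothetical helpers.
function Get-TextPath {
    param([string]$FullName)
    # Strip the final extension, the same way the original script does.
    $FullName.Substring(0, $FullName.LastIndexOf('.'))
}

function Convert-WordFolder {
    param([string]$Folder = 'c:\fso')

    $wdFormatText = 2    # WdSaveFormat value for plain text
    $word = New-Object -ComObject word.application
    $word.Visible = $false
    try {
        Get-ChildItem -Path "$Folder\*" -Include '*.docx', '*.doc' |
        ForEach-Object {
            $file = $_
            try {
                $path = Get-TextPath $file.FullName
                $doc = $word.Documents.Open($file.FullName)
                $doc.SaveAs([ref] $path, [ref] $wdFormatText)
                $doc.Close()
                Remove-Item $file.FullName -ErrorAction Stop
            }
            catch {
                # Leave the source file where it is so it can be inspected.
                Write-Warning "Could not convert $($file.FullName): $_"
            }
        }
    }
    finally {
        # Always shut Word down, even if something above failed.
        $word.Quit()
    }
}

# One pass over the drop folder:
# Convert-WordFolder -Folder 'c:\fso'
```

Calling Convert-WordFolder from inside the while loop, followed by the Move-Item and Start-Sleep lines, would give the same overall behavior as the script above, with the failures showing up as warnings.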