Monday, 18 June 2018

Python - Use Process Monitor to diagnose subprocess.run

Python can shell out and run other executables, but sometimes the call fails and it is not always apparent why. In this post I show how Process Monitor (with some help from StackOverflowers) helped me get the syntax right.

Inkscape converts PDFs to SVG

As part of my ongoing struggles with PDF files I was looking for a better way to get content out of them. I had discovered that Inkscape can convert a single-page PDF into an SVG file (there is a set of instructions here).

I'd prefer SVG files as they are XML and I know how to traverse and navigate them. Indeed, I have blogged about using VBA XML to create an SVG file.

Following the instructions given I get a single dialog box (shown below) and then success. This is great but I wanted to automate the process.

Inkscape's Command Line Options

So I wanted a command-line way of getting Inkscape to do the PDF to SVG conversion. I found Inkscape installed at C:\PROGRA~1\Inkscape and queried it for its command-line options:

C:\PROGRA~1\Inkscape>inkscape --help
Usage: inkscape [OPTIONS...] [FILE...]

Available options:

  -z, --without-gui                          Do not use X server (only process
                                             files from console)
  -f, --file=FILENAME                        Open specified document(s)
                                             (option string may be excluded)
...
  -l, --export-plain-svg=FILENAME            Export document to plain SVG file
                                             (no sodipodi or inkscape
                                             namespaces)
...
Help options:
  -?, --help                                 Show this help message
      --usage                                Display brief usage message

So all the required options are there and we can construct the right command line. The following worked and exported a PDF to SVG...

c:\progra~1\Inkscape\inkscape -z -f "N:\pdf_skunkworks\inflation-report-may-2018-page0.pdf" -l "N:\pdf_skunkworks\inflation-report-may-2018-page0.svg"

Excel VBA code to shell Inkscape to Convert PDF to SVG

As part of my investigations I also wrote some VBA code to execute the above command line...

Sub TestShellToInkscape()
    '* Tools->References->Windows Script Host Object Model (IWshRuntimeLibrary)
    Dim sCmd As String
    sCmd = "c:\progra~1\Inkscape\inkscape -z -f ""N:\pdf_skunkworks\inflation-report-may-2018-page0.pdf"" -l ""N:\pdf_skunkworks\inflation-report-may-2018-page0.svg"""
    Debug.Print sCmd
    
    Dim oWshShell As IWshRuntimeLibrary.WshShell
    Set oWshShell = New IWshRuntimeLibrary.WshShell
    
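    '* Window style 0 hides the window; True waits for Inkscape to finish,
    '* so Run returns the exit code of the process
    Dim lExitCode As Long
    lExitCode = oWshShell.Run(sCmd, 0, True)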
    
End Sub

Python code to shell Inkscape to Convert PDF to SVG

And here is the final Python code which also shells out to Inkscape and converts pdf to svg.

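import subprocess

# Pass each argument as its own list item; no extra double quotes are needed.
completed = subprocess.run(['c:/Progra~1/Inkscape/Inkscape.exe',
        '-z',
        '-f', 'N:/pdf_skunkworks/inflation-report-may-2018-page0.pdf',
        '-l', 'N:/pdf_skunkworks/inflation-report-may-2018-page0.svg'],
        # Without these pipes, completed.stderr and completed.stdout are None.
        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
print("returncode:" + str(completed.returncode))
print("stderr:" + str(completed.stderr))
print("stdout:" + str(completed.stdout))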

This turned out to be the right answer: pass each argument as a separate list item (whereas the VBA code passes the whole command line as a single string).
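To see why the list form works, it can help to look at the single command line that Python builds from the list on Windows. The sketch below uses subprocess.list2cmdline, an undocumented helper in the subprocess module, so treat it as a diagnostic aid rather than a supported API. The paths are hypothetical and deliberately contain a space.

```python
import subprocess

# Hypothetical paths for illustration; the folder name contains a space.
args = [
    "c:/Progra~1/Inkscape/Inkscape.exe",
    "-z",
    "-f", "N:/pdf skunkworks/page0.pdf",
    "-l", "N:/pdf skunkworks/page0.svg",
]

# list2cmdline is an undocumented helper that joins an argument list into
# one Windows-style command line, quoting only the items that need it.
command_line = subprocess.list2cmdline(args)
print(command_line)
```

Only the two items containing a space are wrapped in double quotes, which suggests there is no need to add quotes by hand when passing a list.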

Diagnosing subprocess.run

The correct Python code is given above but this blog post is more about the journey to get there.

My early attempts did not work and I resorted to StackOverflow to get help. JacobIRR put me on the right track, saying that I could use forward slashes and Python could work out when to swap them for backslashes. I took this suggestion on board but it still didn't quite work.
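If a program insists on backslashes, the swap can also be made explicitly. This is a minimal sketch using pathlib.PureWindowsPath from the standard library; it is not what the working code in this post does, and the variable names are my own.

```python
from pathlib import PureWindowsPath

# The path is written with forward slashes, as in the working code.
forward = "N:/pdf_skunkworks/inflation-report-may-2018-page0.pdf"

# PureWindowsPath only manipulates the text of the path (no file access),
# and its string form uses backslashes as separators.
as_windows = str(PureWindowsPath(forward))
print(as_windows)
```

Because PureWindowsPath never touches the file system, this runs on any platform and the N: drive does not need to exist.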

Another StackOverflower asked if I knew that Inkscape was actually running. I thought it unlikely that it wasn't, but sought to provide a screenshot showing that it was indeed running. Task Manager was insufficient for this, so instead I turned to Process Monitor to grab the screenshot.

Using Process Monitor to diagnose subprocess.run

Capturing the process tree was quite tricky; it required running the Python script and then, quick as a flash, switching (Alt+Tab) to Process Monitor and pressing Ctrl+T. Here is the first snap, which shows a malfunctioning Python program with its arguments being passed to Inkscape with overzealous slashes!

This second snap is of the correctly working code (see above). You can see that the triple slashes have gone, thankfully. Not wrapping the arguments in double quotes also helped.
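One plausible source of such slashes is double quotes embedded in a list item: when Python builds the Windows command line it escapes each embedded quote with a backslash. The sketch below illustrates that mechanism with the undocumented subprocess.list2cmdline helper. The paths are hypothetical and I am not claiming these were the exact arguments of my failing attempt.

```python
import subprocess

def as_command_line(arg):
    """Return how a single list item would appear on a Windows command line."""
    return subprocess.list2cmdline([arg])

# No embedded quotes: the argument is passed through unchanged.
plain = as_command_line("N:/pdf_skunkworks/page0.pdf")

# Embedded double quotes: each one gains a backslash escape.
quoted = as_command_line('"N:/pdf_skunkworks/page0.pdf"')

# A backslash directly before an embedded quote is doubled and the quote
# is escaped as well, giving three backslashes in a row.
trailing = as_command_line('N:\\pdf_skunkworks\\"')

for label, value in [("plain", plain), ("quoted", quoted), ("trailing", trailing)]:
    print(label + ": " + value)
```

The third case produces a run of three backslashes before the quote, which resembles the kind of argument visible in the malfunctioning snap.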

Final Thoughts

So, if you are having difficulty with subprocess.run, do please consider using Process Monitor to help diagnose what actually gets passed as arguments.
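A complementary check that needs no extra tools is to point the same argument list at a tiny child Python process that reports what it received. This is a rough sketch: echo_child_argv is a name I made up, the file name is hypothetical, and it shows what a Python child sees, which may differ from how another program parses its own command line.

```python
import json
import subprocess
import sys

def echo_child_argv(args):
    """Run a child Python that reports the arguments it actually received."""
    child_code = "import json, sys; print(json.dumps(sys.argv[1:]))"
    completed = subprocess.run(
        # Everything after "-c <code>" is handed to the child as arguments.
        [sys.executable, "-c", child_code] + list(args),
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode the captured output as text
        check=True,               # raise if the child exits with an error
    )
    return json.loads(completed.stdout)

# Hypothetical file name containing a space.
received = echo_child_argv(["-z", "-f", "N:/pdf_skunkworks/example page.pdf"])
print(received)
```

If the list that comes back matches the list that went in, the quoting is probably not the problem and attention can move to the target program itself.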
