Tuesday, 28 April 2020

VBA, Named Pipes (& JavaScript) - binary serialization revisited

In this post I revisit the matter of the serialization of VBA Variant arrays to bytes and back again. This can be used VBA to VBA or interoperating with other technologies that can be sequence bytes, in this post I demonstrate JavaScript to VBA. I incorporate a better serialization technique code that uses Windows OS Named Pipes as written by VBForums.com contributor Olaf Schmidt.

Background

This is a revisit of an earlier article where I saved and loaded a Variant array to disk using VBA’s Open File For Binary, Put and Get statements. Whilst the code did what I wanted, I had complained that my code required a disk operation which carries a performance penalty.

I am indebted to a commenter (who gives no name) who tipped me off as to some code on VBForums written by Olaf Schmidt; Olaf’s code serializes using Windows OS Named Pipes and not a disk write. The Named Pipes are purely in memory and so this makes his code faster.

Moreover, the Named Pipes serialization yields a leading sequence of bytes that describes the number of dimensions in the array and the size and bounds of each dimension. This was something formally missing from disk write version my code and which I had had to implement manually, something of a nuisance.

I am thus doubly indebted to Olaf Schmidt’s code and to the commenter who tipped me off. Thank you. Do please keep the comments coming.

VBA Class Module - cPipedVariants

So with Olaf Schmidt's code as a starting point I have modified it to handle the use case of VBA variant arrays, i.e. a two dimensional array which is ready for pasting onto a worksheet. Olaf's original code demonstrated the serialization of user-defined types and these data structures are more prevalent in Visual Basic 6 (VB6) whereas Excel developers (I would argue) are more likely to deal with grids drawn from a worksheet or grids to be pasted onto a worksheet.

If you want the original version that deals with the serialization of UDTs it is on this link here to vb6forums.com.

So what follows in cPipedVariants, a modification on Olaf's original class cPipedUDTs. Much of the code is easy to follow but I will comment on the ‘secret sauce’ of the InitializePipe function.

The two key lines of code are the call to CreateNamedPipeW and then the Open "\\.\pipe\foo" For Binary statement. If I switch the order of these two around then the code fails. Internally, in its implementation the Open For Binary Statement must have a special case where it identifies the "\\.\pipe\ " prefix and then looks up in the list of created named pipes. This is not documented in any Microsoft documentation, or indeed StackOverflow. Only the VB6Forums.com users and specifically Olaf Schmidt understand this lore, it must be a throw back to the era of VB6. Anyway, it does work and I am grateful.

Add a class to your VBA project, call it cPipedVariants and then paste in the following code

Option Explicit

'* Pipe-based helper to serialize/deserialize VB-Variants InMemory ... [based on original code by Olaf Schmidt 2015]
'* Based on code by Olaf Schmidt 2015, http://www.vbforums.com/showthread.php?807205-VB6-pipe-based-UDT-serializing-deserializing-InMemory


'* https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-createnamedpipea
Private Declare Function CreateNamedPipeW& Lib "kernel32" (ByVal lpName As Long, ByVal dwOpenMode&, ByVal dwPipeMode&, _
            ByVal nMaxInstances&, ByVal nOutBufferSize&, ByVal nInBufferSize&, _
            ByVal nDefaultTimeOut&, ByVal lpSecurityAttributes&)

'* https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-writefile
Private Declare Function WriteFile& Lib "kernel32" (ByVal hFile&, lpBuffer As Any, _
            ByVal nNumberOfBytesToWrite&, lpNumberOfBytesWritten&, ByVal lpOverlapped&)

'* https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-readfile
Private Declare Function ReadFile& Lib "kernel32" (ByVal hFile&, lpBuffer As Any, _
            ByVal nNumberOfBytesToRead&, lpNumberOfBytesRead&, ByVal lpOverlapped&)

'* https://docs.microsoft.com/en-us/windows/win32/api/namedpipeapi/nf-namedpipeapi-peeknamedpipe
Private Declare Function PeekNamedPipe& Lib "kernel32" (ByVal hNamedPipe&, lpBuffer As Any, _
            ByVal nBufferSize&, lpBytesRead&, lpTotalBytesAvail&, lpBytesLeftThisMessage&)

'* https://docs.microsoft.com/en-us/windows/win32/api/namedpipeapi/nf-namedpipeapi-disconnectnamedpipe
Private Declare Function DisconnectNamedPipe& Lib "kernel32" (ByVal hPipe&)

Private Declare Function CloseHandle& Lib "kernel32" (ByVal hObject&)

Private mhPipe As Long
Private mlFileNumber As Long
Private mabytSerialized() As Byte

Private Enum eOpenMode
    PIPE_ACCESS_INBOUND = 1
    PIPE_ACCESS_OUTBOUND = 2
    PIPE_ACCESS_DUPLEX = 3
End Enum

Private Enum ePipeMode
    PIPE_TYPE_BYTE = 0
    PIPE_TYPE_MESSAGE = 4

    PIPE_READMODE_BYTE = 0
    PIPE_READMODE_MESSAGE = 2
   
    PIPE_WAIT = 0
    PIPE_NOWAIT = 1
End Enum

Private Enum ePipeInstances
    PIPE_UNLIMITED_INSTANCES = 255
End Enum

Public Function InitializePipe(Optional sPipeNameSuffix As String = "vbaPipedVariantArrays") As Boolean
    Const csPipeNamePrefix As String = "\\.\pipe\"
    CleanUp
   
    Dim sPipeName As String
    sPipeName = csPipeNamePrefix & sPipeNameSuffix
   
    '* Must call CreateNamedPipe first before calling Open <<pathname>> For Binary otherwise you get bad file number
    mhPipe = CreateNamedPipeW(StrPtr(sPipeName), PIPE_ACCESS_DUPLEX, PIPE_TYPE_BYTE + PIPE_READMODE_BYTE + PIPE_WAIT, _
            PIPE_UNLIMITED_INSTANCES, -1, -1, 0, 0)
           
    If mhPipe = -1 Then mhPipe = 0 'reset from InvalidHandleValue to "no Handle"
   
    If mhPipe Then
        '* only try to find a free VB-FileNumber when mhPipe is valid (i.e. pipe has been created)
        mlFileNumber = FreeFile
        If mlFileNumber Then
            Open sPipeName For Binary As mlFileNumber  'open only, when we got an mlFileNumber
        End If
    End If
   
    InitializePipe = mhPipe <> 0 And mlFileNumber <> 0
End Function

Public Function SerializeToBytes(ByRef vSrc As Variant, ByRef pabytSerialized() As Byte) As Long

    Dim lBytesAvail As Long

    Debug.Assert IsArray(vSrc)

    If mlFileNumber <> 0 Then
   
        '* this next line writes the Variant array to the pipe
        Put mlFileNumber, 1, vSrc
       
        '* we should now have some bytes to read out of the pipe, use PeekNamedPipe to verify there are bytes available
        PeekNamedPipe mhPipe, ByVal 0&, 0, ByVal 0&, lBytesAvail, 0
       
        If lBytesAvail > 0 Then
           
            '* so now we can dimension the byte array
            ReDim Preserve pabytSerialized(0 To lBytesAvail - 1)
           
            '* and now we can read the bytes out of the pipe and into the byte array
            ReadFile mhPipe, pabytSerialized(0), lBytesAvail, lBytesAvail, ByVal 0&
           
            '* return number of bytes as a courtesy (not actually required)
            SerializeToBytes = lBytesAvail
        End If
    End If
End Function

Public Function DeserializeFromBytes(ByRef abytSerialized() As Byte, ByRef pvDest As Variant) As Long
   
    Dim lBytesWritten As Long
   
    If mhPipe <> 0 And mlFileNumber <> 0 Then

        '* write the byte array to the pipe
        WriteFile mhPipe, abytSerialized(0), UBound(abytSerialized) + 1, lBytesWritten, 0
       
        If lBytesWritten = UBound(abytSerialized) + 1 Then
            '* the pipe contains a byte array serialization of a variant array
            '* we can use VBA's Get statement to read it directly into a variant array variable
            Get mlFileNumber, 1, pvDest
           
            '* report the amount of deserialized Bytes as a courtesy (not actually required)
            DeserializeFromBytes = Loc(mlFileNumber)
        End If
    End If
End Function

Private Sub CleanUp()
    If mlFileNumber Then Close mlFileNumber: mlFileNumber = 0
    If mhPipe Then DisconnectNamedPipe mhPipe
    If mhPipe Then CloseHandle mhPipe: mhPipe = 0
End Sub

Private Sub Class_Terminate()
    CleanUp
End Sub

VBA Standard Module - tstPipedVariants

So now we need some client code. Add a standard module to your VBA project and paste in the following code. I called this module tstPipedVariants.

Sub SamePipeForSerializeAndDeserialize()
    Dim oPipe As cPipedVariants
    Set oPipe = New cPipedVariants
   
    If oPipe.InitializePipe Then
        Dim vSource As Variant
        vSource = TestData

        Dim abytSerialized() As Byte

        Call oPipe.SerializeToBytes(vSource, abytSerialized)

        Stop '* at this point vSource is populated but vDestination is empty

        Dim vDestination As Variant
        oPipe.DeserializeFromBytes abytSerialized, vDestination
   
        Stop
    End If
End Sub

Function TestData() As Variant
    Dim vSource(1 To 2, 1 To 4) As Variant
    vSource(1, 1) = "Hello World"
    vSource(1, 2) = True
    vSource(1, 3) = False
    vSource(1, 4) = Null
    vSource(2, 1) = 65535
    vSource(2, 2) = 7.5
    vSource(2, 3) = CDate("12:00:00 16-Sep-1989") 'now()
    vSource(2, 4) = CVErr(xlErrNA)
    TestData = vSource
End Function

In the module tstPipedVariants run the test code subroutine SamePipeForSerializeAndDeserialize() by navigating and pressing F5 to reach the first Stop statement. On the first Stop statement vSource is populated but vDestination isn’t.

However, the byte array abytSerialized() is populated and we can go inspect this. The first twenty bytes are similar to SafeArray and SafeArrayBounds structures. The first two bytes represent a vbVarType of vbArray+vbVariant in low byte, high byte order. Next, two bytes gives the count of dimensions. Then for each dimension there are 8 bytes, giving a 4 byte dimension size and a 4 byte lower bound. This abridged SafeArray descriptor is most welcome. When VBA code writes a variant array to disk it omits such a descriptor which meant I had to augment the code to manually write in the dimensions. I am very much pleased that the Named Pipes implementation does this automatically for me.

After the first twenty bytes of abridged SafeArray descriptor the rest of the data follows. I wrote this up in the original blog post so I’ll refer you to that and skip the rest.

Returning to the test code, press F5 again to get the second Stop statement and behold in the Locals window the vDestination variable is now populated exactly the same as the vSource variable. Note how we did not need to dimension the vDestination variable before populating it, excellent!

This completes the VBA to VBA demo. We can move onto the JavaScript to VBA use case.

Revisiting the Javascript to VBA use case

JavaScript Changes

In the original article I gave some Javascript code to serialize a Javascript array to a byte array that can then be deserialized to a VBA variant array. This JavaScript code needs modifying to interoperate with the new Named Pipes VBA code given above. The change required is to give a correctly formatted abridged safe array descriptor. This is a simple change found at the top of the JavaScriptToVBAVariantArray.prototype.persistGrid function. All other code remains the same, so no further explanation is required. The JavaScript module remains something loadable into both browser and server JavaScript environments. The Node.js project given in the original blog post can still be used.

I have only included the function that has changed, JavaScriptToVBAVariantArray.prototype.persistGrid; for rest of the JavaScript module listing see the original blog post.

JavaScriptToVBAVariantArray.prototype.persistGrid = function persistGrid(grid, rows, columns) {
	try {
		/* Opening sequence of bytes is a reduced form of SAFEARRAY and SAFEARRAYBOUND
		 * SAFEARRAY       https://docs.microsoft.com/en-gb/windows/win32/api/oaidl/ns-oaidl-safearray
		 * SAFEARRAYBOUND  https://docs.microsoft.com/en-gb/windows/win32/api/oaidl/ns-oaidl-safearraybound
		 */

		var payloadEncoded = new Uint8Array(20);

		// vbArray + vbVariant, lo byte, hi byte
		payloadEncoded[0] = 12; payloadEncoded[1] = 32;

		// number of dimensions, lo byte, hi byte
		payloadEncoded[2] = 2; payloadEncoded[3] = 0;

		// number of columns, 4 bytes, least significant byte first
		payloadEncoded[4] = columns % 256; payloadEncoded[5] = Math.floor(columns / 256);
		payloadEncoded[6] = 0; payloadEncoded[7] = 0;

		// columns lower bound (safearray)
		payloadEncoded[8] = 1; payloadEncoded[9] = 0;
		payloadEncoded[10] = 0; payloadEncoded[11] = 0;

		// number of rows, 4 bytes, least significant byte first
		payloadEncoded[12] = rows % 256; payloadEncoded[13] = Math.floor(rows / 256);
		payloadEncoded[14] = 0; payloadEncoded[15] = 0;

		// rows lower bound (safearray)
		payloadEncoded[16] = 1; payloadEncoded[17] = 0;
		payloadEncoded[18] = 0; payloadEncoded[19] = 0;

		var elementBytes;
		for (var colIdx = 0; colIdx < columns; colIdx++) {
			for (var rowIdx = 0; rowIdx < rows; rowIdx++) {
				elementBytes = this.persistVar(grid[rowIdx][colIdx]);
				var arr = [payloadEncoded, elementBytes];

				payloadEncoded = this.concatArrays(arr); // Browser
			}
		}
		return payloadEncoded;
	}
	catch (err) {
		console.log(err.message);
	}
};

VBA web client code

Turning to the client VBA code we can greatly simplify the code now that the dimensioning is done for us. The resulting code is now trivial, here it is below. Add the following code to the tstPipedVariants module you added earlier. This code below requires you to add a Tools->Reference to Microsoft WinHTTP Services, version 5.1...
Sub TestByWinHTTP()
    '* this calls the Node.js project with the new JavaScript serialization
    Dim oWinHttp As WinHttp.WinHttpRequest '* Tools->References->Microsoft WinHTTP Services, version 5.1
    Set oWinHttp = New WinHttp.WinHttpRequest
    oWinHttp.Open "GET", "http://localhost:1337/", False
    oWinHttp.send
   
    If oWinHttp.Status = 200 Then
        If IsEmpty(oWinHttp.ResponseBody) Then Err.Raise vbObjectError, , "No bytes returned!"
        If UBound(oWinHttp.ResponseBody) = 0 Then Err.Raise vbObjectError, , "No bytes returned!"
       
        Dim oPipedVariants As cPipedVariants
        Set oPipedVariants = New cPipedVariants
        If oPipedVariants.InitializePipe Then
       
            Dim vGrid As Variant
            oPipedVariants.DeserializeFromBytes oWinHttp.ResponseBody, vGrid
           
            Stop '* Observe vGrid in the Locals window
           
            '* vGrid is now ready to paste on a worksheet
        End If
    End If
End Sub

So run this code with the Node.js project running and the vGrid should be populated. That's all folks. Enjoy!

Thanks again

Thanks once again to Olaf Schmidt and the anonymous tipper! Getting help to fix my code is very welcome.

Wednesday, 22 April 2020

VBA calling Python calling back into VBA

In this post I again show a Python COM gateway class callable from VBA; this time it demonstrates calling back into the VBA using Application.Run, ThisWorkbook or a VBA-defined class. It does this by ensuring a win32com.client.Dispatch wrapper. It also demonstrates code to report all the local variables for debugging purposes. Further, it demonstrates running the COM server in a separate process.

Calling back using COM

So, I need some code to callback into VBA from a Python COM component, the mechanism will clearly be another COM call. Let's list the COM ways we can do this...

  1. Pass in the Excel.Application object and call Application.Run with the name of a macro to callback on.
  2. Write callback code in the ThisWorkbook module and pass in the Workbook object and have Python call a method in the workbook's ThisWorkbook module.
  3. Write callback code in a VBA defined class and pass in instance of such a class and have Python call a method on that instance.
  4. Write code to implement COM events, i.e. a source interface and declare a variable in VBA using the WithEvents keyword to sink the events.

I have implemented the top three in the code below. If you want COM events you'll need a Type Library, but Python does not supply one automatically; I did write some code to use Python reflection to generate a type library but this is not out of the box functionality from Python. Never mind, a callback routine in Excel could itself raise events if you really wants events; after all, COM events are simply callbacks to multiple listeners instead of a single listener.

For some very odd reason the third case, the VBA defined case had a bug/glitch. It would report not just the return value but instead a tuple of the return value and all the input parameters. This is a pitfall but easily solved and not a problem once one is aware that the problem exists.

Ensure Dispatch

The key tip in the code with regard to calling back is to wrap whatever is passed in from VBA in a win32com.client.Dispatch wrapper. In testing however, sometimes I use an object which I have acquired from the Running Object Table which already comes wrapped. So I wrote a small class called the DispatchEnsurer to inspect what it has been given and wrap where necessary.

Inability to Attach the Microsoft Visual Studio debugger

So I am embarrassed to admit I cannot get the Microsoft Visual Studio debugger to attach to the Excel session and specifically hit breakpoints in my Python code. This means stepping through the code as initiated by a call from Excel VBA is impossible. If you know how to do this please leave a comment. In the meantime I had to write a class to report all the local variables for a method into a nicely formatted string and pass that back to the calling VBA so I can see what is going on (see below). I would point out that it is still possible to acquire a workbook object from the running object table using win32com.client.GetObject(<<workbook full file path>>) and that is primarily how one can get around this problem.

Perhaps I should report this breakpoint problem as a bug to Microsoft's Visual Studio team. I do not believe it is Python's fault.

Declarative Python COM server registration

So the breakpoint problem above was quite painful. Another painful problem was updating the Python script and have Excel VBA pick up the changes. In some instances I had to change the COM's class's CLSID to force the Excel VBA client code to pick up the script changes. Other tactics involved shutting down Excel and reopening but this didn't always work as I believe there is some sort of session recycling (I'll investigate this at some point). Anyway, I found the best way for rapid iterative development without re-registering and without rebooting Excel was to make the Python COM server sit in a separate process and then kill that process to pick up script changes.

Getting a COM server to sit in its own process requires adding the following line of code to your Python COM gateway class

_reg_clsctx_ = pythoncom.CLSCTX_LOCAL_SERVER

Adding one single line of source to effect this change is quite impressive. To get C# to do this requires a degree of fiddling around in the registry plus some extra code, so Python deserves praise for this declarative simplicity.

With this extra line of code the Python executables will not be loaded into the Excel.exe process but instead loaded into a pythonw.exe process. So this means we can kill the pythonw.exe process and force Excel VBA/COM runtime to spawn a new one that will effect any new script changes. Here is some VBA code to kill the pythonw.exe process...

Sub PickupNewPythonScript()
    '# for development only to help pick up script changes we kill the python process
    Call CreateObject("WScript.Shell").Run("taskkill /f /im pythonw.exe", 0, True)
End Sub

So with the ability to kill the Python process and force it to pick up new changes without rebooting Excel and without incrementing CLSIDs and re-registering meant I could start to get some work done!

Reporting all the local variables

As mentioned above without the ability to step through the code one loses the ability to see the values of the local variables. Luckily and to Python's credit there is a locals() dictionary which contains all the local variables; we can print this although I added some extra line breaking and formatting. I give a formatting class in the code below. I use this to return all the local variables when an error occurs. This allows me to debug in the absence of proper/formal debugger support.

VBA Class use case returns tuple of return value and input arguments

I can only assume that this is a bug or a glitch but in the case of the VBA defined class I found that the return value was returned in a tuple that also consisted on the input arguments. Not a real problem as we simply access the tuple's first element with [0].

Python Code

The Python code is one single listing. Run this within Microsoft Visual Studio or from the command line. One needs to run it at least once to register the COM class registration. If you run without Administrator privileges it will request to elevate.

import win32com.server.register
from win32com.client import Dispatch
import pythoncom

class CallingBackIntoVBA(object):
    _reg_clsid_ = "{352B1FE3-8F8E-478B-93D0-A5AAC612D09A}"
    _reg_progid_ = 'PythonInVBA.CallingBackIntoVBA'
    _reg_desc_ = "Demonstrates a Python COM server calling back into calling VBA code"
    _public_methods_ = ['DemoAppRunCallback','DemoThisWorkbookCallback', 'DemoClassCallback']
    _reg_clsctx_ = pythoncom.CLSCTX_LOCAL_SERVER

    def DemoClassCallback(self, classInstance, arg0=None, arg1=None, arg2=None):
        try:
            dispCb = DispatchEnsurer.EnsureDispatch(classInstance)

            # for some reason a tuple of the return value and arguments is
            # returned, very odd!!
            return dispCb.ClassCallback2(arg0,arg1,arg2)[0]  # [0] gets first element of the tuple

        except Exception as ex:
            return LocalsEnhancedErrorMessager.Enhance(ex,str(locals()))

    def DemoAppRunCallback(self,excelApplication, appRunMacro: str, arg0=None, arg1=None, arg2=None):
        try:
            dispApp = DispatchEnsurer.EnsureDispatch(excelApplication)
            return dispApp.Run(appRunMacro, arg0,arg1,arg2)

        except Exception as ex:
            return LocalsEnhancedErrorMessager.Enhance(ex,str(locals()))

    def DemoThisWorkbookCallback(self, workbook, arg0=None, arg1=None, arg2=None):
        try:
            dispWb = DispatchEnsurer.EnsureDispatch(workbook)
            #dispWb.madeUpName()
            return dispWb.ThisWorkbookCallback2(arg0,arg1,arg2)

        except Exception as ex:
            return LocalsEnhancedErrorMessager.Enhance(ex,str(locals()))

class LocalsEnhancedErrorMessager(object):
    @staticmethod
    def Enhance(ex, localsString):
        locals2 = "n Locals:{ " + (",n".join(localsString[1:-1].split(",")) ) + " }"
        if hasattr(ex,"message"):
            return "Error:" + ex.message + locals2
        else:
            return "Error:" + str(ex) + locals2

class DispatchEnsurer(object):
    @staticmethod
    def EnsureDispatch(comObj):
        """ Sometimes we get a PyIDispatch so we'll need to wrap it, this class takes care of that contingency"""
        try:
            dispApp = None
            apptypename = str(type(comObj))
            if apptypename == "<class 'win32com.client.CDispatch'>":
                # this call from GetObject so no need to Dispatch()
                dispApp = comObj
            elif apptypename == "<class 'PyIDispatch'>":
                # this was passed in from VBA so wrap in Dispatch
                dispApp = Dispatch(comObj)
            else:
                # other cases just attempt to wrap
                dispApp = Dispatch(comObj)
            return dispApp 
        except Exception as ex:
            if hasattr(ex,"message"):
                return "Error:" + ex.message 
            else:
                return "Error:" + str(ex)

def RegisterCOMServers():
    print("Registering COM servers...")
    import win32com.server.register
    win32com.server.register.UseCommandLine(CallingBackIntoVBA)

def TestCallingBackIntoVBA():
    import win32com
    cb = CallingBackIntoVBA()
    wb = win32com.client.GetObject(r"C:\Users\Simon\Downloads\PythonVBACallbackDemo.xlsm")
    app = wb.Parent
    sb = app.StatusBar

    ret = cb.DemoAppRunCallback(app,"PythonVBACallbackDemo.xlsm!Module1_Proc1",0,1,2)
    print(ret)

if __name__ == '__main__':
    pass
    RegisterCOMServers()
    #TestCallingBackIntoVBA()

VBA Code Listings

Whereas the Python code is a single listing the VBA code is in contrast dispersed across several modules, please place into one workbook. I called mine PythonVBACallbackDemo.xlsm.

ThisWorkbook module

Place the following code in the ThisWorkbook module

Public Function ThisWorkbookCallback1()
    Debug.Print "ThisWorkbookCallback1"
End Function

Public Function ThisWorkbookCallback2(arg0, arg1, arg2)
    ThisWorkbookCallback2 = CStr(arg0) & " " & CStr(arg1) & " " & CStr(arg2)
    Debug.Print "ThisWorkbook.ThisWorkbookCallback2"
End Function

Callback Class

Place the following code in a VBA class named Callback with its Instancing type set to 2 - PublicNotCreatable

Public Function ClassCallback1()
    Debug.Print "ClassCallback1"
End Function

Public Function ClassCallback2(arg0, arg1, arg2) As String
    ClassCallback2 = CStr(arg0) & " " & CStr(arg1) & " " & CStr(arg2)
    Debug.Print "Callback.ClassCallback2"
End Function

Standard module code

Place the following code in a standard (non-class module), it doesn't matter what you name this module but I called mine modCallbackDemos . The procedures to run (by placing cursor and pressing F5) are ThisWorkbookCallbackDemo(), AppRunMacroDemo() and ClassCallbackDemo().

Option Explicit

Sub PickupNewPythonScript()
    '# for development only to help pick up script changes we kill the python process
    Call CreateObject("WScript.Shell").Run("taskkill /f /im pythonw.exe", 0, True)
End Sub

Sub ThisWorkbookCallbackDemo()
    Dim obj As Object
    Set obj = VBA.CreateObject("PythonInVBA.CallingBackIntoVBA")
    
    Debug.Print obj.DemoThisWorkbookCallback(ThisWorkbook, 0, 1, 2)
    
    Set obj = Nothing
End Sub

Sub AppRunMacroDemo()
    'Stop
    Dim obj As Object
    Set obj = VBA.CreateObject("PythonInVBA.CallingBackIntoVBA")
    
    Dim sAppRunMacro As String
    sAppRunMacro = ThisWorkbook.Name & "!Module1_Proc1"
    
    Dim vRet As Variant
    vRet = obj.DemoAppRunCallback(Application, sAppRunMacro, 0, 1, 2)
    Debug.Print vRet
    
    Set obj = Nothing
End Sub

Function Module1_Proc1(arg0, arg1, arg2)
    Dim sRet
    sRet = "Module1_Proc1 running"
    Debug.Print sRet
    Module1_Proc1 = sRet & " foobar"
End Function

Sub ClassCallbackDemo()
    Dim obj As Object
    Set obj = VBA.CreateObject("PythonInVBA.CallingBackIntoVBA")
    
    Dim oCallback As Callback
    Set oCallback = New Callback
    
    Debug.Print obj.DemoClassCallback(oCallback, 0, 1, 2)
    
    Set obj = Nothing
End Sub

So if I run ThisWorkbookCallbackDemo() then the expected output is

ThisWorkbook.ThisWorkbookCallback2
0 1 2

And if I run AppRunMacroDemo() then the expected output is

Module1_Proc1 running
Module1_Proc1 running foobar

And finally if I run ClassCallbackDemo() then the expected output is

Callback.ClassCallback2
0 1 2

Amending the script and picking up changes

So in the above code listing there is a routine PickupNewPythonScript() which I use to avoid having to reboot Excel. You might like to experiment with it, so make a change in the Python script, save your Python script changes then run PickupNewPythonScript() and see your changes effected. I have added a commented out line of code in the Python script

#dispWb.madeUpName()

which if uncommented and effected will mean the expected output of running ThisWorkbookCallbackDemo() changes to

Error:<unknown>.madeUpName
 Locals:{ 'self': <CallingBackIntoVBA.CallingBackIntoVBA object at 0x030224D0>,
 'workbook': <PyIDispatch at 0x009EDF70 with obj at 0x009301BC>,
 'arg0': 0,
 'arg1': 1,
 'arg2': 2,
 'dispWb': <COMObject <unknown>>,
 'ex': AttributeError('<unknown>.madeUpName') }

thus demonstrating the local variables report.

Enjoy!

Final thoughts

If I could express a preference for which pattern to use I would pass in the Excel Application and use Application.Run. This is because the Application object is most durable. After all, workbooks can be unloaded meaning the callback target could go missing and you are calling on a null (stale) pointer. Moreover, an instance of a VBA class is even more temporary in that class instances can be torn down during a state loss.

I am pleased about running the Python COM server in a separate process and then killing it to spawn a refreshed and updated new instance picking up code changes. This is a major productivity boon.

Wednesday, 8 April 2020

Call for CoronaVirus Contact Tracing Software

Just heard on the radio that the French government is developing a mobile phone app for contact tracing.  My first thought is they should open source the code.  Then I thought why doesn't the programmer community write its own open source app and put it on GitHub.

Then I'm thinking, well perhaps it already exists?  If you know of such an app and can point to open source then please do write a comment below.   I like to see the code and perhaps help out.

Cheers.

Sunday, 5 April 2020

Python, VBA - Beautiful Soup for webscraping

In this post I use Python's Beautiful Soup library to webscrape data from a web page, hopefully other VBA developers will realise that this is a much better library to use than VBA. But I do give a COM wrapper class so that the Python can be called from VBA.

Use case background

I'm worried about Corona Virus and its impact on stock markets, I very much need to query the London Stock Exchange database of securities to look for some safe government bonds to buy into. I have a list of government bonds from the branch of government that manages the issuance of such bonds but their unique identifier is an ISIN which is unique across the globe. Unfortunately, stockbrokers do not use ISIN numbers and I need to find alternative IDs and ticker codes. The London Stock Exchange (I am UK resident/citizen) has this information so I need to cross-reference the issuance authority's ISINs code with the LSE's database to get codes to present to my stock broker. For this I choose to write a web scraping program for I do not know of an official REST service.

Say No to VBA Webscraping

There was a time on this blog when I shared code that allowed a VBA developer to create an instance of instance of Internet Explorer, navigate to a web page and then traverse the HTML DOM to extract the necessary information. Those days have past now for a number of reasons. Firstly, Internet Explorer is legacy as Microsoft have the Edge browser so it would be foolish to write code against Internet Explorer. Secondly, since I branched out into alternatives to VBA I have discovered the quality of other libaries such as the .NET ecosystem and the Python ecosystem which both have many-fold better libraries.
I'd urge no new development in VBA for webscraping as there are three or four better options depending on your architecture and programming language preferences:
  • Chrome Extension, embedded JavaScript is best placed to traverse an HTML Dom and even sink events (needs a web server to relay info to though)
  • .NET Html Agility Pack, so C# developers would recommends this.
  • Python's Beautiful Soup
  • Webdriver interface, formerly known as Selenium
In this post I'm using Python's Beautiful Soup library but I am giving a COM wrapper so that it is callable from VBA because this is an Excel development blog after all.

I do not wish to replicate the Beautiful Soup documentation so I will be brief. Initially, I felt frustrated because I like to use a CSS selector path such as the following but this didn't not work for me which is a shame as it is a nice one shot declarative way to access an element.
#contentIndex > div.search_results_list > table > tbody > tr > td:nth-child(1)
But, I then read the documentation and saw how easily it was to script against a DOM (and actually a DOM fragment as well!) that the resulting code was perfectly easy to write.
All the web-scraping is packed into a class called LSEWebServices. But I also promised a COM wrapper class and that is LSEWebServicesCOMWrapper which simply delegates to an instance of LSEWebServices; I guess I could have inherited perhaps.

Important note: this code has been updated because London Stock Exchange changes their website

import urllib.request as urllib2
from bs4 import BeautifulSoup
import pythoncom
import logging

class LSEWebServices:

    def ExtractLSESearchResults(self,searchTerm: str):
        try:
            req = urllib2.urlopen("https://www.londonstockexchange.com/search?searchtype=all&q=" + searchTerm)
            html = req.read().decode("utf-8")

            try:
                soup = BeautifulSoup(html, 'html.parser')
                searchResultList = soup.find_all('div','item-separator')
                if (len(searchResultList)==1):
                    subText = str(searchResultList[0])
                    soup2 = BeautifulSoup(subText, 'html.parser')
                    instrumentTidm = soup2.find('span','instrument-tidm').text 
                    instrumentDesc = soup2.find_all('span','instrument-uppercase') [1].text
                    instrumentLink = soup2.find('a','tidm-and-description').attrs['href']
                    return (instrumentTidm, instrumentDesc,instrumentLink)
                pass


            except Exception as ex:
                print(ex)


        except Exception as ex:
            print(ex)

    def ExtractLSESecurityInformation(self, link:str):
        try:
            req = urllib2.urlopen(link)
            html = req.read().decode("utf-8")

            soup = BeautifulSoup(html, 'html.parser')
            instrumentInformation = soup.find_all('div','chart-table-instrument-information')
            if (len(instrumentInformation)==1):
                subText = str(instrumentInformation[0])
                soup2 = BeautifulSoup(subText, 'html.parser')
                spanMarketSegmentParent = soup2.find('span',string=' Market segment ').parent()
                marketSegment  = spanMarketSegmentParent[1].text

                spanSEDOLParent = soup2.find('span',string=' SEDOL ').parent()
                SEDOL  = spanSEDOLParent[1].text

                spanMICParent = soup2.find('span',string=' Market identifier code (MIC) ').parent()
                MIC  = spanMICParent[1].text

                spanISINParent = soup2.find('span',string=' ISIN ').parent()
                ISIN  = spanISINParent[1].text

                return (marketSegment, MIC,  SEDOL, ISIN)

        except Exception as ex:
            print(ex)

def TestLSEWebServices():
    lse = LSEWebServices()
    results = lse.ExtractLSESearchResults("gb0031790826")
    print(results)
    sec_info = lse.ExtractLSESecurityInformation(results[2])
    print(sec_info)
    dummy = 1 # a line of code upon which I can place a breakpoint
    pass

class LSEWebServicesCOMWrapper(object):
    _reg_clsid_ = "{81F3D23E-83E5-42DF-96E8-5042933379CF}"
    _reg_progid_ = 'PythonInVBA.LSEWebServicesCOMWrapper'
    _reg_clsctx_ = pythoncom.CLSCTX_LOCAL_SERVER 
    _public_methods_ = ['ExtractLSESearchResults','ExtractLSESecurityInformation']

    def ExtractLSESearchResults(self,searchTerm: str):
        try:
            lse = LSEWebServices()
            results = lse.ExtractLSESearchResults(searchTerm)
            
            if (results is not None):
                return list(results) 
            else:
                return None

        except Exception as ex:
            print(ex)

    def ExtractLSESecurityInformation(self, link:str):
        try:
            lse = LSEWebServices()
            results = lse.ExtractLSESecurityInformation(link)

            if (results is not None):
                return list(results) #.tolist()
            else:
                return None

        except Exception as ex:
            print(ex)

def TestLSEWebServicesCOMWrapper():
    lse = LSEWebServicesCOMWrapper()
    results = lse.ExtractLSESearchResults("gb0031790826")
    print(results)
    sec_info = lse.ExtractLSESecurityInformation(results[2])
    print(sec_info)
    dummy = 1
    pass

def RegisterThis():
    print("Registering COM servers...")
    import win32com.server.register
    win32com.server.register.UseCommandLine(LSEWebServicesCOMWrapper)

if __name__ == '__main__':
    #TestLSEWebServicesCOMWrapper()
    RegisterThis()

Client VBA Code

So the Python code needs to be run once to register the COM servers and for this you need Administrator rights; once run the following VBA code can run.
All the hard work has been done for us in the Python code. In VBA, we create the COM wrapper class and call a couple of methods, the Python returns Python lists which are converted into COM/VBA variant arrays and we just pull out the relevant item. Simple.
Option Explicit

Sub TestLSEWebServicesCOMWrapper()
    
    Dim obj As Object
    Set obj = VBA.CreateObject("PythonInVBA.LSEWebServicesCOMWrapper")
    
    Dim vResult As Variant
    vResult = obj.ExtractLSESearchResults("gb0031790826")
    
    'Stop
    Dim vSecInfo As Variant
    vSecInfo = obj.ExtractLSESecurityInformation(vResult(2))

    Debug.Print "ISIN:" & vSecInfo(3), vResult(1), "TIDM:" & vResult(0), "SEDOL:" & vSecInfo(2), ""
    
    'Stop

End Sub
And the above code gives the output
ISIN:GB0031790826           UNITED KINGDOM 2% IL TREASURY 35          TIDM:T2IL     SEDOL:3179082 
So now referring back to the use case I can verify that my stock broker recognizes T2IL and 3179082 as identifiers. So we have converted the ISIN into something useful.