Monday 14 May 2018

VBA - MSXML2 uncompromising interface based programming can throw newbies

So on SO I have just written an answer for a self-professed VBA newbie concerning XML in VBA. Whilst writing the answer one has to judge how much information to pass on, too little does not help solve the problem, too much seems patronising and also one ought not recreate the documentation. But actually there is something about the MSXML2 library specifically which throws VBA newbies and a good blog post should explain what is happening.

COM and interface based programming

VBA is a COM (Microsoft Component Object Model) technology. COM is based on interface programming, every COM call executes via an interface and not directly on an object. Interface based programming advocates encourage polymorphism and breaking an object's identity into separate interfaces.

So for example one could have a banking application with business domain objects such as overdraft, mortgage, credit card. Interface based programming advocates would encourage the identification of a common interface for these objects and a separate interface to handle each object's idiosyncrasies.

Excel's Object Model hides default interfaces

But interface based programming can be confusing to newbies. Excel's object model is COM based and so uses interfaces but each Excel object typically has one default interface and VBA pulls a trick to make the calling the default interface look the same as calling the object. This sleight of hand by VBA is useful for productivity and helps quick scripting of the Excel object model but at some point the newbie might need to access Xml and use MSXML2 library when they may well be flummoxed at having to deal with interfaces.

So using the VBA Object Browser one can see many Excel classes such as Workbook, Worksheet, Application and Range and their methods so it looks like one is calling the objects. But if one uses OLEView.exe one can see that the underlying default interfaces are _Workbook, _Worksheet, _Application and IRange. Range does not exists as its own separate class but this is faked (along with others) some VBA programmers think they are dealing with an object.

Note how some interfaces begin with I, e.g. IRange.

What is a sideways cast?

If an object's functionality is broken down into more than one interface (as sometimes advocated) it becomes necessary to give a mechanism to hop from one interface to another on the same object. How is this interface hopping done in VBA? Answer, with a 'sideways cast'.

The code below is illustrative only, it won't run. It illustrates a sideways cast. The sideways cast is when one uses the Set keyword to equate one 'object variable' to another. But actually they are not 'object variables' but in fact interface variables.

Sub SidewaysCastIllustration()
    '* this is illustrative only, it won't run
    Dim oShape As IShape
    Set oShape = getShape("Square1")
    
    Dim oSquare As ISquare
    Set oSquare = oShape  '* <-- sideways cast take the object oShape and queries for interface ISquare 

End Sub

[For real geeks interested in what happens under the hood a sideways cast calls the QueryInterface method on the the canonical COM interface IUnknown but newbies need not concern themselves with this.]

MSXML2 is uncompromising interface based programming

When using MSXML2 to handle Xml in VBA one declares variables of interface types such as IXMLDOMNode, IXMLDOMElement and IXMLDOMAttribute. Note how they begin with I. Time for code to illustrate. The code at the bottom parses a mini Xml document and two parts, one element and one attribute, are retrieved from the document. Because the selection method SelectSingleNode can return elements or attributes it is defined to return a unifying interface IXMLDOMNode. To then access the details of the element one sideways casts from IXMLDOMNode to IXMLDOMElement.

This program shows Xml parsing with interface based programming and sideways casts to hop between interfaces on object.

Option Explicit

Sub Test()
    '* requires Tools->References->Microsoft XML, v6.0

    Dim sXml As String
    sXml = "<root><foo id='fooey'/><bar><baz>hi</baz></bar></root>"
    
    Dim oDom As MSXML2.DOMDocument60
    Set oDom = New MSXML2.DOMDocument60
    
    oDom.LoadXML sXml
    Debug.Assert oDom.parseError.ErrorCode = 0
    
    Dim oNode As MSXML2.IXMLDOMNode
    Set oNode = oDom.SelectSingleNode("root/foo/@id")

    '*
    '* to access the methods for interacting with an attribute
    '* one needs to "sideways cast" which queries for another interface
    '*
    Dim oAttrSidewaysCast As MSXML2.IXMLDOMAttribute
    Set oAttrSidewaysCast = oNode  '* <-- sideways cast from IXMLDOMNode to IXMLDOMAttribute 
    Debug.Print oAttrSidewaysCast.Value
    
    Set oNode = Nothing  '* reset the variable
    Set oNode = oDom.SelectSingleNode("root/bar/baz")

    '*
    '* to access the methods for interacting with an element
    '* one needs to "sideways cast" which queries for another interface
    '*
    Dim oElementSidewaysCast As MSXML2.IXMLDOMElement
    Set oElementSidewaysCast = oNode  '* <-- sideways cast from IXMLDOMNode to IXMLDOMElement 
    Debug.Print oElementSidewaysCast.nodeTypedValue

End Sub

No comments:

Post a Comment