Thursday 25 January 2018

VBA - XMLHttp Request (XHR) does not parse response as XML

HTML resembles XML in that every opening tag is meant to have a closing tag and attribute values are meant to be enclosed in quotes. In practice, real-world HTML routinely breaks these rules (unclosed tags, unquoted attributes) and can rarely be loaded by an XML parser. In short, XML is strict and HTML is forgiving.

However, take a look at the following code. It uses the XMLHttpRequest (XHR) object, but note that reading responseText never parses the response as XML; that only happens if you write the code to do it (an example is given in a separate function). I think this is a nice use of XHR. The code goes on to insert the response text as HTML into an MSHTML.HTMLDocument, and from there you can scrape whatever you need.

Sub DoNotParseXml()

    Dim oXHR As MSXML2.XMLHTTP60
    Set oXHR = New MSXML2.XMLHTTP60
    
    Dim oHtmlDoc As MSHTML.HTMLDocument
    Set oHtmlDoc = New MSHTML.HTMLDocument
    
    oXHR.Open "GET", "https://coinmarketcap.com/all/views/all/" & "?Random=" & Rnd() * 100, False
    oXHR.setRequestHeader "Content-Type", "text/XML"
    oXHR.send

    If oXHR.Status = 200 Then
        
        '* no XML parsing takes place here, so markup that is not well-formed XML is accepted
        oHtmlDoc.body.innerHTML = oXHR.responseText
    
        '** do some web scraping with MSHTML.HTMLDocument
    
        '... oHtmlDoc.getElementsByClassName("price")

        
        '* but if we try to parse the response text as XML, the parse fails
        ParseXml oXHR.responseText
    End If
End Sub

Private Function ParseXml(ByVal sText As String) As MSXML2.DOMDocument60
    Dim oDom As MSXML2.DOMDocument60
    Set oDom = New MSXML2.DOMDocument60
    oDom.LoadXML sText
    
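    '* LoadXML does not raise a run-time error on malformed input; it returns False
    '* and populates parseError, so for typical HTML this assertion fails
    Debug.Assert oDom.parseError.errorCode = 0

    Set ParseXml = oDom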
End Function
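
If you would rather not stop in the debugger, the same check can be written as a function that simply reports whether the text is well-formed XML. This is only a sketch: the name IsWellFormedXml is my own invention and not part of the code above, and it assumes the same reference to Microsoft XML, v6.0.

```vba
'* Hypothetical helper (not part of the original listing): returns True when
'* sText is well-formed XML, otherwise prints the parser's diagnostics.
'* Assumes a reference to Microsoft XML, v6.0.
Public Function IsWellFormedXml(ByVal sText As String) As Boolean
    Dim oDom As MSXML2.DOMDocument60
    Set oDom = New MSXML2.DOMDocument60
    oDom.async = False

    '* LoadXML returns False on failure rather than raising a run-time error
    IsWellFormedXml = oDom.LoadXML(sText)

    If Not IsWellFormedXml Then
        With oDom.parseError
            Debug.Print "XML parse error " & .errorCode & _
                " at line " & .Line & ", position " & .linepos & ": " & .reason
        End With
    End If
End Function
```

In DoNotParseXml you could then call IsWellFormedXml(oXHR.responseText) and only attempt XML processing when it returns True, falling back to the MSHTML.HTMLDocument route otherwise.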
