Thursday 8 February 2018

VBA - Multi-threaded Asynchronous Google Sheet Downloader

Summary: Google Sheets is web/cloud based, this means web calls to get data. Here we use multi-threading and asynchronous callbacks to make the a download lightning fast and fully responsive.

Just to clarify, VBA is inherently single threaded but it can call components that are in turn multi-threaded (or appear so) to make the VBA also appear multi-threaded. In this instance the HTTP Request components are (or appear to be) multi-threaded. So we can fire off a multitude of HTTP requests. But we need a way to handle the results as they come in, thankfully we have the function delegates from the prior article.

Also in this program we use the ScriptControl to host some JavasScript function because I believe that is better suited to parsing JSON which essentially is what this task is. But we go further and actually write the values onto a worksheet with Javascript, this illustrates the inter-operability of the ScriptControl.

The AsynchronousGoogleSheet class module


Option Explicit

'* Tools->References
'*     WinHttp    Microsoft WinHTTP Services, version 5.1     C:\WINDOWS\system32\winhttpcom.dll


Private WithEvents moXHR As WinHttp.WinHttpRequest

Private mfnFunctionDelegateOnError As FunctionDelegate
Private mfnFunctionDelegateOnResponseFinished As FunctionDelegate
Private msSheetTitle As String

'---------------------------------------------------------------------------------------
' Procedure : RunAsynchronous
' DateTime  : 06/02/2018 16:20
' Author    : Simon
' Purpose   :
'---------------------------------------------------------------------------------------
' Arguments :
'    sHttpMethod                  : one of {GET, PUT, POST, DELETE}
'    sURL                         : the web address
'    fnOnErrorDelegate            : the delegate of the function we want to call on an error
'    fnOnResponseFinishedDelegate : the delegate of the function we want to call on completion
'
Public Sub RunAsynchronous(ByVal sSheetTitle As String, ByVal sHttpMethod As String, ByVal sURL As String, _
        ByVal fnOnErrorDelegate As FunctionDelegate, ByVal fnOnResponseFinishedDelegate As FunctionDelegate)

    Set mfnFunctionDelegateOnError = fnOnErrorDelegate
    Set mfnFunctionDelegateOnResponseFinished = fnOnResponseFinishedDelegate

    msSheetTitle = sSheetTitle


    Set moXHR = New WinHttp.WinHttpRequest
    moXHR.Open sHttpMethod, sURL, True
    moXHR.Send
    

End Sub



'---------------------------------------------------------------------------------------
' Procedure : moXHR_OnError
' DateTime  : 06/02/2018 16:16
' Author    : Simon
' Purpose   : see https://msdn.microsoft.com/en-us/library/windows/desktop/aa383929(v=vs.85).aspx
'---------------------------------------------------------------------------------------
'
Private Sub moXHR_OnError(ByVal ErrorNumber As Long, ByVal ErrorDescription As String)
    'Debug.Print "moXHR_OnError"
    If Not mfnFunctionDelegateOnError Is Nothing Then
        mfnFunctionDelegateOnError.Run ErrorNumber, ErrorDescription
    End If
    
End Sub

'---------------------------------------------------------------------------------------
' Procedure : moXHR_OnResponseDataAvailable
' DateTime  : 06/02/2018 16:16
' Author    : Simon
' Purpose   : see https://msdn.microsoft.com/en-us/library/windows/desktop/aa383941(v=vs.85).aspx
'---------------------------------------------------------------------------------------
'
Private Sub moXHR_OnResponseDataAvailable(Data() As Byte)
    'Debug.Print "moXHR_OnResponseDataAvailable"
    '* not interested
End Sub

'---------------------------------------------------------------------------------------
' Procedure : moXHR_OnResponseFinished
' DateTime  : 06/02/2018 16:17
' Author    : Simon
' Purpose   : see https://msdn.microsoft.com/en-us/library/windows/desktop/aa383946(v=vs.85).aspx
'---------------------------------------------------------------------------------------
'
Private Sub moXHR_OnResponseFinished()
    'Debug.Print "moXHR_OnResponseFinished"
    If Not mfnFunctionDelegateOnResponseFinished Is Nothing Then
        If Len(msSheetTitle) = 0 Then
            '* it's the master detail page
            mfnFunctionDelegateOnResponseFinished.Run moXHR.ResponseText
        Else
            '* it's an individual sheet
            mfnFunctionDelegateOnResponseFinished.Run msSheetTitle, moXHR.ResponseText
        End If
    End If
    
End Sub

'---------------------------------------------------------------------------------------
' Procedure : moXHR_OnResponseStart
' DateTime  : 06/02/2018 16:17
' Author    : Simon
' Purpose   : see https://msdn.microsoft.com/en-us/library/windows/desktop/aa383954(v=vs.85).aspx
'---------------------------------------------------------------------------------------
'
Private Sub moXHR_OnResponseStart(ByVal Status As Long, ByVal ContentType As String)
    'Debug.Print "moXHR_OnResponseStart"
    '* not interested
End Sub

The modGoogleSheetsImporter standard module


Option Explicit
Option Private Module

'* Tools->References
'*   WinHttp               Microsoft WinHTTP Services, version 5.1     C:\WINDOWS\system32\winhttpcom.dll
'*   MSScriptControl       Microsoft Script Control 1.0                C:\Windows\SysWOW64\msscript.ocx
'*   Scripting             Microsoft Scripting Runtime                 C:\Windows\SysWOW64\scrrun.dll
'*   MSXML2                Microsoft XML, v6.0                         C:\Windows\SysWOW64\msxml6.dll



'---------------------------------------------------------------------------------------
' Module    : modGoogleSheetsImporter
' DateTime  : 08/02/2018 22:15
' Author    : Simon
' Purpose   : Asynchronously downloads individual Google Sheets
'---------------------------------------------------------------------------------------


Private moAsyncXHRMaster As AsynchronousGoogleSheet
Private marrAsyncXHR() As AsynchronousGoogleSheet

Private msWorkbookID As String
Private msAPIKey As String

'---------------------------------------------------------------------------------------
' Procedure : SC
' DateTime  : 08/02/2018 22:06
' Author    : Simon
' Purpose   : Sets up, caches and returns an instance of ScriptControl loaded with our
'             Javascript functions that include
'               deleteValueByKey()  which removes a property from an object
'               setValueByKey() adds or replaces a property on an object
'               enumKeysToMsDict() enumerates an object's property names to a Dictionary
'               JSON_parse() takes a string and parses to a JScriptTypeInfo object
'               JSON_stringify() takes a JScriptTypeInfo object and converts to a string
'---------------------------------------------------------------------------------------
' Arguments :
'    [out,retval]   : A ScriptControl instance loaded with our Javascript functions
'
Public Function SC() As ScriptControl
    Static soSC As ScriptControl
    If soSC Is Nothing Then


        Set soSC = New ScriptControl
        soSC.Language = "JScript"

        soSC.AddCode "function deleteValueByKey(obj,keyName) { delete obj[keyName]; } "
        soSC.AddCode "function setValueByKey(obj,keyName, newValue) { obj[keyName]=newValue; } "
        soSC.AddCode "function enumKeysToMsDict(jsonObj,msDict) { for (var i in jsonObj) { msDict.Add(i,0); }  } "
        soSC.AddCode GetJavaScriptLibrary("https://raw.githubusercontent.com/douglascrockford/JSON-js/master/json2.js")
        soSC.AddCode "function JSON_stringify(value, replacer,spacer) { return JSON.stringify(value, replacer,spacer); } "
        soSC.AddCode "function JSON_parse(sJson) { return JSON.parse(sJson); } "

        Dim sFnExtractTitleToMsDict As String
        sFnExtractTitleToMsDict = "function extractTitleToMsDict(text,dict) {" & _
                                "    try {" & _
                                "        var doc = JSON.parse(text);" & _
                                "        for (var i = 0; i < doc.sheets.length; i++) { " & _
                                "            var sheetLoop = doc.sheets[i]; " & _
                                "            var title = sheetLoop.properties.title; " & _
                                "            dict.Add(title, i); " & _
                                "        } " & _
                                "        return dict.Count; " & _
                                "    } " & _
                                "    catch (ex) { " & _
                                "        return ('#error in extractTitleToMsDict!'); " & _
                                "    } " & _
                                "}"
        soSC.AddCode sFnExtractTitleToMsDict

        Dim sFnWriteGoogleSheetToExcelWorksheet As String
        sFnWriteGoogleSheetToExcelWorksheet = "function writeGoogleSheetToExcelWorksheet(text,ws,app) {" & _
                                "    try {" & _
                                "        var doc = JSON.parse(text);" & _
                                "        var bByRows=(doc.majorDimension === 'ROWS'); " & _
                                "        var rngLoop;" & _
                                "        var docValues = doc['values'];" & _
                                "        for (var i = 0; i < docValues.length; i++) { " & _
                                "            var major = doc.values[i] ;" & _
                                "            for (var j = 0; j < major.length; j++) { " & _
                                "                if (bByRows) {" & _
                                "                    rngLoop = ws.cells(i+1,j+1); " & _
                                "                } else {" & _
                                "                    rngLoop = ws.cells(j+1,i+1); " & _
                                "                } " & _
                                "                rngLoop.value=major[j]; " & _
                                "            } " & _
                                "        } " & _
                                "        return true; " & _
                                "    } " & _
                                "    catch (ex) { return ('#error in writeGoogleSheetToExcelWorksheet!'); } " & _
                                "}"
'                                "                app.Run('log','i:' + i + '  j:'+j + '  ' + rngLoop.Address + '  ' + major[j]); " & _

        soSC.AddCode sFnWriteGoogleSheetToExcelWorksheet



    End If
    Set SC = soSC
End Function

'---------------------------------------------------------------------------------------
' Procedure : GetJavaScriptLibrary
' DateTime  : 08/02/2018 22:05
' Author    : Simon
' Purpose   : Download core Javascript libraries such as Douglas Crockford's JSON parser
'             we use MSXML2.XMLHTTP60 here because it caches and the content is static
'---------------------------------------------------------------------------------------
' Arguments :
'  [in]sURL      : The URL of the Javascript library
'  [out,retval]  : The javascript source
'
Private Function GetJavaScriptLibrary(ByVal sURL As String) As String

    Dim xHTTPRequest As MSXML2.XMLHTTP60
    Set xHTTPRequest = New MSXML2.XMLHTTP60
    xHTTPRequest.Open "GET", sURL, False
    xHTTPRequest.Send
    GetJavaScriptLibrary = xHTTPRequest.ResponseText

End Function

'----------------------------------------------------------------------------------------------
' Procedure : AddOrClearSheets
' DateTime  : 08/02/2018 22:03
' Author    : Simon
' Purpose   : We can add the sheets in advance, synchronously
'----------------------------------------------------------------------------------------------
' Arguments :
'  [in]dicTitles    : a dictionary contained the titles extracted from the master detail page
'  [in]lSheetCount  : used to test on a smaller subset
'
Private Sub AddOrClearSheets(ByVal dicTitles As Scripting.Dictionary, ByVal lSheetCount As Long)

    Dim dicCurrentSheets As Scripting.Dictionary
    Set dicCurrentSheets = GetCurrentSheets

    '* add the sheets synchronously
    Dim lLoop As Long
    For lLoop = 0 To lSheetCount - 1


        Dim sTitleLoop As String
        sTitleLoop = dicTitles.Keys()(lLoop)

        If Not dicCurrentSheets.Exists(sTitleLoop) Then
            Dim wsAdded As Excel.Worksheet
            Set wsAdded = ThisWorkbook.Worksheets.Add
            wsAdded.Name = sTitleLoop
            dicCurrentSheets.Add sTitleLoop, 0
        Else
            Dim wsLoop As Excel.Worksheet
            Set wsLoop = ThisWorkbook.Worksheets.Item(sTitleLoop)
            wsLoop.Cells.Clear
            
        End If

    Next

End Sub

'---------------------------------------------------------------------------------------
' Procedure : GetIndividualSheetTitles
' DateTime  : 08/02/2018 21:55
' Author    : Simon
' Purpose   : Calls into the Javascript function extractTitleToMsDict because Javascript
'             is better suited to working with JSON
'---------------------------------------------------------------------------------------
' Arguments :
'   [in]sMasterPageJSON : This is JSON of the master page
'   [out,retval]        : a dictionary containing the sheet titles in the Google Sheets workbook
'
Private Function GetIndividualSheetTitles(ByVal sMasterPageJSON As String) As Scripting.Dictionary
    Dim dicTitles As Scripting.Dictionary
    Set dicTitles = New Scripting.Dictionary
    

    Dim vCount As Variant
    vCount = SC.Run("extractTitleToMsDict", sMasterPageJSON, dicTitles)

    If Not IsNumeric(vCount) Then Err.Raise vbError, , "#Error whilst running extractTitleToMsDict!"

    Set GetIndividualSheetTitles = dicTitles


End Function

'---------------------------------------------------------------------------------------
' Procedure : GetCurrentSheets
' DateTime  : 08/02/2018 21:54
' Author    : Simon
' Purpose   : Instead of calling Worksheet.Item on a sheet that doesn't exist, I'd prefer
'             to get a dictionary of the sheets and call Dictionary.Exists()
'---------------------------------------------------------------------------------------
' Arguments :
'   [out,retval]    : a dictionary containing the sheet names in this workbook
'
Private Function GetCurrentSheets() As Scripting.Dictionary

    Dim dicCurrentSheets As Scripting.Dictionary
    Set dicCurrentSheets = New Scripting.Dictionary


    Dim wsLoop As Excel.Worksheet
    For Each wsLoop In ThisWorkbook.Worksheets
        dicCurrentSheets.Add wsLoop.Name, 0
    Next wsLoop

    Set GetCurrentSheets = dicCurrentSheets

End Function

'---------------------------------------------------------------------------------------
' Procedure : TestDownloadGoogleSheetToExcel
' DateTime  : 08/02/2018 21:52
' Author    : Simon
' Purpose   : Test stub
'---------------------------------------------------------------------------------------
'
Sub TestDownloadGoogleSheetToExcel()
    
    Call DownloadGoogleSheetToExcel("1nDSd38lIQj_aTWDRPJ-aPybR0NwFdQ8GLSDJM0-QaR4", gscKey)
End Sub

'---------------------------------------------------------------------------------------
' Procedure : DownloadGoogleSheetToExcel
' DateTime  : 08/02/2018 21:45
' Author    : Simon
' Purpose   : This sets up the asynchronous master detail page web call
'
' Notes     : You can find out the workbookid by looking at the url in the browser when editing
' the sheet in Google Sheets so my url is
'
'  https://docs.google.com/spreadsheets/d/1nDSd38lIQj_aTWDRPJ-aPybR0NwFdQ8GLSDJM0-QaR4/edit#gid=1544561606
'
' So the format is
'
'  https://docs.google.com/spreadsheets/d/[sWorkbookID]/edit#gid=1544561606
'
' As for sAPIKey acquire your own at Google Developer Console
'
'  https://console.developers.google.com/iam-admin/quotas
'
'---------------------------------------------------------------------------------------
' Arguments :
'    [in]sWorkbookID    : the Google Sheet ID for the whole workbook
'    [in]sAPIKey        : you need to get your own APIKey
'
Sub DownloadGoogleSheetToExcel(ByVal sWorkbookID As String, ByVal sAPIKey As String)
    
    msWorkbookID = sWorkbookID
    msAPIKey = sAPIKey


    Dim sMasterTemplate As String
    sMasterTemplate = "https://sheets.googleapis.com/v4/spreadsheets/[workbook_id]?key=[apikey]"

    Dim sMasterPageURL As String
    sMasterPageURL = VBA.Replace(VBA.Replace(sMasterTemplate, "[workbook_id]", sWorkbookID), "[apikey]", sAPIKey)
    
    Set moAsyncXHRMaster = New AsynchronousGoogleSheet
    
    Call moAsyncXHRMaster.RunAsynchronous("", "GET", sMasterPageURL, _
                    FnFactory.FnAppRun("Async_OnError"), _
                    FnFactory.FnAppRun("ContinueAndDownloadIndividualSheets"))
    
    
End Sub
    
'---------------------------------------------------------------------------------------
' Procedure : ContinueAndDownloadIndividualSheets
' DateTime  : 08/02/2018 22:23
' Author    : Simon
' Purpose   : Called by a function delegate when the asynchronous Master Detail Page web
'             call has completed
'---------------------------------------------------------------------------------------
' Arguments :
'    [in]sMasterPageJSON    : the JSON of the master detail page
'
Sub ContinueAndDownloadIndividualSheets(ByVal sMasterPageJSON As String)
    
    Dim dicTitles As Scripting.Dictionary
    Set dicTitles = GetIndividualSheetTitles(sMasterPageJSON)

    Dim sDetailTemplate As String
    sDetailTemplate = "https://sheets.googleapis.com/v4/spreadsheets/[workbook_id]/values/[sheet_title]?key=[apikey]"


    If dicTitles.Count > 0 Then

        Dim lSheetCount As Long
        lSheetCount = 3 'dicTitles.Count '* separate variable so I can test on small number

        ReDim marrAsyncXHR(0 To lSheetCount - 1) As AsynchronousGoogleSheet
        

        AddOrClearSheets dicTitles, lSheetCount

        Dim lLoop As Long
        For lLoop = 0 To lSheetCount - 1

            Dim sTitleLoop As String
            sTitleLoop = dicTitles.Keys()(lLoop)

            Dim sDetailPageURL As String
            sDetailPageURL = VBA.Replace(VBA.Replace(VBA.Replace(sDetailTemplate, "[workbook_id]", msWorkbookID), "[apikey]", msAPIKey), "[sheet_title]", sTitleLoop)

            Set marrAsyncXHR(lLoop) = New AsynchronousGoogleSheet
            Call marrAsyncXHR(lLoop).RunAsynchronous(sTitleLoop, "GET", sDetailPageURL, _
                        FnFactory.FnAppRun("Async_OnError"), _
                        FnFactory.FnAppRun("SheetAvailable"))


        Next
        
    End If
End Sub

'---------------------------------------------------------------------------------------
' Procedure : Async_OnError
' DateTime  : 08/02/2018 21:44
' Author    : Simon
' Purpose   : reports error (not seen this executed once yet)
'---------------------------------------------------------------------------------------
' Arguments :
'    [in]ErrorNumber        : passed on from WinHttp.WinHttpRequest OnError event
'    [in]ErrorDescription   : passed on from WinHttp.WinHttpRequest OnError event
'
Public Sub Async_OnError(ByVal ErrorNumber As Long, ByVal ErrorDescription As String)
    Debug.Print
    Debug.Print ErrorNumber, ErrorDescription
    Stop
    
End Sub


'---------------------------------------------------------------------------------------
' Procedure : SheetAvailable
' DateTime  : 08/02/2018 21:42
' Author    : Simon
' Purpose   :
'---------------------------------------------------------------------------------------
' Arguments :
'    [in]sSheetTitle    : the name of sheet pass to Worksheets.Item()
'    [in]sSheetJSON     : the JSON string representation of a Google Sheet worksheet
'
Public Sub SheetAvailable(ByVal sSheetTitle As String, ByVal sSheetJSON As String)

    Dim ws As Excel.Worksheet
    Set ws = ThisWorkbook.Worksheets.Item(sSheetTitle) 'should work because we added all of them earlier
    ws.Cells.Clear
    ws.Activate

    '* call into our Javascript writer
    Dim vWrite As Variant
    vWrite = SC.Run("writeGoogleSheetToExcelWorksheet", sSheetJSON, ws, Application)

End Sub

'---------------------------------------------------------------------------------------
' Procedure : Log
' DateTime  : 08/02/2018 21:52
' Author    : Simon
' Purpose   : This is callable from JavaScript using Application.Run('Log','hello world');
'---------------------------------------------------------------------------------------
' Arguments :
'    [in]vMsg   : the message to log
'
Function Log(vMsg)
    Debug.Print vMsg
End Function

No comments:

Post a Comment