Friday 20 July 2018

Python - HTML - pytidylib does not install HTML Tidy

So last post I wrote Python class to decompile a *.chm compiled help file. Found within is what looks like HTML 3.2 that be should upgraded to either XHTML or HTML5. I had written some VBA code to do this but since Python month on this blog I am keen to find out what a Python developer would do. They would (I should imagine) use the library https://pypi.org/project/pytidylib/ which wraps the venerable HTML Tidy.

pip install pytidylib does not install HTML Tidy

So one installs pytidylib from a command window with admin rights using

pip install pytidylib
C:\Users\Simon\source\repos\foo\bar>pip install pytidylib
Collecting pytidylib
  Downloading https://files.pythonhosted.org/packages/2d/5e/4d2b5e2d443d56f444e2a3618eb6d044c97d14bf47cab0028872c0a468e0/pytidylib-0.3.2.tar.gz (87kB)
    100% |████████████████████████████████| 92kB 1.4MB/s
Installing collected packages: pytidylib
  Running setup.py install for pytidylib ... done
Successfully installed pytidylib-0.3.2

And Using Visual Studio I run a small example program to test the install

from tidylib import tidy_document
document, errors = tidy_document('''<p>fõo <img src="bar.jpg">''',
options={'numeric-entities':1})
print (document)
print (errors)

But unfortunately it complains of not being able to find libtidy which indicates HTML Tidy is not installed for you.

Here is the stack trace

OSError
  Message=Could not load libtidy using any of these names: libtidy,libtidy.so,libtidy-0.99.so.0,cygtidy-0-99-0,tidylib,libtidy.dylib,tidy
  StackTrace:
C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tidylib\tidy.py:99 in Tidy.__init__
C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tidylib\tidy.py:234 in get_module_tidy
C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\tidylib\tidy.py:222 in tidy_document
C:\Users\Simon\source\repos\CompiledHelpToEbookPythonApp\CompiledHelpToEbookPythonApp\HtmlTidy.py:3 in 

Install HTML Tidy Binaries

It is required to install the HTML Tidy Binaries separately. I got mine from http://binaries.html-tidy.org/. Initially, I took the 32-bit edition which was a mistake and the error persisted. So I took the 64-bit edition, I downloaded tidy-5.6.0-vc14-64b.zip, extracted it and then added the extracted bin folder to my path. Don't forget to restart processes for the environment variables changes to be picked up.

After Successful Install

After successful install this is what is output from the sample program above.

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
  <head>
    <title></title>
  </head>
  <body>
    <p>fõo <img src="bar.jpg">
  </body>
</html>

line 1 column 1 - Warning: missing <!DOCTYPE> declaration
line 1 column 1 - Warning: plain text isn't allowed in <head> elements
line 1 column 1 - Info: <head> previously mentioned
line 1 column 1 - Warning: inserting implicit <body>
line 1 column 1 - Warning: inserting missing 'title' element

Press any key to continue . . .

Python - VBA - HTML Help - COM Callable Python class runs *.chm decompiler.

I have many old *.chm compiled help files on my computer and for some reason the help viewer is bust. I am considering writing (Python) code which will upgrade these *.chm files into ebooks. The first module is given here, it will run the Microsoft HTML Help executable (hh.exe) to decompile a *.chm file into its constituent html files all within working subdirectories in the %temp% folder.

Submitting to Code Review

I might very well complete a full application and place in github. This means I need to raise my Python standards and so I have submitted this module to codereview.stackexchange.com

The HelpFileDecompiler Python class

Here is the Python code

class HelpFileDecompiler(object):
    _reg_clsid_ = "{4B388A08-8CAB-4568-AF78-47032744A368}"
    _reg_progid_ = 'PythonInVBA.HelpFileDecompiler'
    _public_methods_ = ['DecompileHelpFileMain']

    def DecompileHelpFileMain(self, sHelpFile):
        import os.path
        eg = ", e.g. 'c:\\foo.chm'"
        if not isinstance(sHelpFile, str):
            raise Exception("sHelpFile needs to be a string" + eg)
        if len(sHelpFile) == 0:
            raise Exception("sHelpFile needs to be a non-null string" + eg)
        if sHelpFile.lower()[-4:] != ".chm":
            raise Exception("sHelpFile needs to end with '.chm' " + eg)
        if not os.path.isfile(sHelpFile):
            raise Exception("sHelpFile " + sHelpFile + " not found")

        self.BuildTmpDirectory()
        sCopiedFile = self.CopyChmFileToTempAppPath(sHelpFile)
        self.DecompileHelpFile(sCopiedFile)

    def BuildTmpDirectory(self):
        import os
        import os.path
        sTempAppPath = os.path.join(os.environ['tmp'], 'HelpFileDecompiler')

        if not os.path.isdir(sTempAppPath):
            os.mkdir(sTempAppPath)

    def CopyChmFileToTempAppPath(self, sHelpFile):
        import shutil
        import os.path
        sDestFile = os.path.join(os.environ['tmp'], 'HelpFileDecompiler',
                                    os.path.basename(sHelpFile))
        shutil.copyfile(sHelpFile, sDestFile)
        return sDestFile

    def DecompileHelpFile(self, sHelpFile):
        import win32api
        import subprocess
        import os.path

        if not os.path.isfile(sHelpFile):
            raise Exception("sHelpFile " + sHelpFile + " not found")

        sDecompiledFolder = os.path.join(
            os.environ['tmp'],
            'HelpFileDecompiler',
            os.path.basename(sHelpFile).split(".")[0])

        if not os.path.isdir(sDecompiledFolder):
            os.mkdir(sDecompiledFolder)

        sHELPEXE = "C:\Windows\hh.exe"
        if not os.path.isfile(sHELPEXE):
            raise Exception("sHELPEXE " + sHELPEXE + " not found")

        # Not allowed to quote arguments to HH.EXE
        # so we take the short path to eliminate spaces
        sHelpFileShort = win32api.GetShortPathName(sHelpFile)
        sDecompiledFolderShort = win32api.GetShortPathName(sDecompiledFolder)

        # so now we can run the decompiler
        subprocess.run([sHELPEXE, '-decompile',
                        sDecompiledFolderShort, sHelpFileShort])


if __name__ == '__main__':
    print ("Registering COM servers...")
    import win32com.server.register
    win32com.server.register.UseCommandLine(HelpFileDecompiler)

    helpFile=("C:\\Program Files\\Microsoft Office 15\\root\\vfs\\"
                "ProgramFilesCommonX86\\Microsoft Shared\\VBA\\VBA7.1\\"
                "1033\\VBLR6.chm")
    test = HelpFileDecompiler()
    test.DecompileHelpFileMain(helpFile)

Here is some test VBA client code

Option Explicit

Sub Test()

    Dim obj As Object
    Set obj = VBA.CreateObject("PythonInVBA.HelpFileDecompiler")
    
    obj.DecompileHelpFileMain "C:\Program Files\Microsoft Office 15\root\vfs\" & _
              "ProgramFilesCommonX86\Microsoft Shared\VBA\VBA7.1\" & _
              "1033\VBLR6.chm"

End Sub

Friday 6 July 2018

OLEDB - C++ - Solved: ATL's OLEDB Provider sample crashes Excel (uncaught exception from msado15.dll)

Happy Conclusion

This post initially did not reach a happy conclusion, I investigated compiling the ATL OLEDB Provider samples but reached an impasse as the ADO runtime throws an exception. I wrote up the problem as a blog post nevertheless because sometimes defeats can be as revealing as victories. I asked on StackOverflow and they came through. Especially Simon Mourier on StackOverflow solved it and I am thankful. I have not changed the main body of the original post, instead I have posted solution at the bottom here.

ADO Recordsets as a marshalling vessel

So I was pondering how to get data from Python to VBA without using COM's SAFEARRAY which can often but not always be acquired with tolist(). One alternative is to create an ADO recordset, it is possible to create an ADO recordset by creating the an Xml representation, i.e. to concatenate a correctly formatted string.

In the era of Visual Basic 5 (upon which VBA is based) code marshalled tabular data from one execution process to another (perhaps even on a different computer as part of an N-tiered scalable architecture, DNA) by transmitting a two dimensional array (SafeArray) housed in OLE Variant. In VB6, the recommended vessel for marshalling tabular became the disconnected ADO recordset. Marshalling means serializing in one process and then deserializing in another. For recordsets, the ADO run-time would do the marshalling where as for OLE Variants it was the COM run-time. They have similar speeds but an ADO recordset packs so much more functionality that one should opt for it.

ADO Recordsets come from OLE DB Providers

The vast majority of ADO recordsets are created by an OLE DB Provider such as the Microsoft SQL Server OLE DB Provider. Can we create our own OLE DB Provider? In theory yes, ATL has had for almost two decades the ability to create OLE DB Providers. I never tried until now. My attempt to get the sample working have got stuck. I'll try not to replicate the documentation, instead I will give links plus some extra info where pertinent.

The ATL Sample OLEDB Provider that Finds Files from a directory

ADO and OLEDB were meant to be an advance on the previous generation of data access technology by allowing a single object model to access not just database tables but also non-tabular data sources such as email stores.

In the case of the ATL Sample OLEDB Provider I have read enough code to say that it (when if works, if ever) actually creates a recordset where each row is a file in a given directory. This is useful as a sample because everyone has a file system and this obviates the need to install a database. It is also illustrative of a non-database source.

The sample reads files by calling API functions such as FindFirstFile.

When running the ATL OLEDB Provider wizard one can customise the name, I will choose "FindFiles".

Running Visual Studio 2017 ATL Wizards to create Find File OLEDB Provider Sample

I will try not to recreate the documentation but will give some screenshots as the official documentation does not (I suspect it is not being maintained).

Because we are creating a COM component you will need to run Visual Studio with admin rights

New Project -> Visual C++ -> ATL -> ATL Project

Click OK on the next screen without entering anything

In the solution explorer you'll find two projects, focus on the top project, FindFiles. On the FindFiles project icon or or any folder icon in the FindFiles project right-click mouse button and then select Add -> New Item . You'll get the following screen from which you should select ATL -> ATL OLEDB Provider and then click Add

The next screen is the wizard, enter FindFiles into the ShortName and (nearly) all the other fields are updated to reflect the short name. Also, add FindFiles to the ProgID, whilst not strictly necessary this will help debugging later. Click finish to commit and the wizard will write the code for you.

Save all the files and compile (don't forget you'll need admin rights).

Enumerating OLE DB Providers

I forget where I found this code (somewhere on Microsoft.com) but it enumerates all the OLE DB Providers installed, after a successful compilation FindFiles should appear

using System;
using System.Data;
using System.Data.OleDb;

namespace OldDbEnumerator
{

    class Program
    {
        static void Main()
        {
            OleDbDataReader reader = OleDbEnumerator.GetRootEnumerator();

            DisplayData(reader);

            Console.WriteLine("Press any key to continue.");
            Console.ReadKey();
        }

        static void DisplayData(OleDbDataReader reader)
        {
            while (reader.Read())
            {
                for (int i = 0; i < reader.FieldCount; i++)
                {
                    Console.WriteLine("{0} = {1}",
                     reader.GetName(i), reader.GetValue(i));
                }
                Console.WriteLine("==================================");
            }
        }
    }
}

And in the output of that is the new Provider, your GUID will be different (it is randomly selected)

SOURCES_NAME = FindFiles Class
SOURCES_PARSENAME = {E387836C-6248-4319-92E8-BCD070844D86}
SOURCES_DESCRIPTION = FindFiles Class
SOURCES_TYPE = 1
SOURCES_ISPARENT = False
SOURCES_CLSID = {E387836C-6248-4319-92E8-BCD070844D86}

Excel VBA client code CRASHES EXCEL

Sorry for caps shouting but it is important to save your Excel session before you play with the next chunk of code. Here is some client VBA code. It crashes Excel VBA.

We'll use for debugging later so give your self a new workbook called TestClient.xlsm, in a new standard module paste the following code

Sub TestOleDbProvider()
    
    On Error GoTo ErrHand
    
    
    Dim cn As ADODB.Connection
    Set cn = New ADODB.Connection
    
    cn.Open "Provider=FindFiles;Server=foo;Database=bar" '* this works
    
    Dim cmd As ADODB.Command
    Set cmd = New ADODB.Command
    
    Set cmd.ActiveConnection = cn   '* this works
    
    cmd.CommandText = "*.*"   '* this works
    
    Stop
    Dim rs As ADODB.Recordset
    Set rs = cmd.Execute  '* crashes here
    
    
    
    Exit Sub
ErrHand:
    Debug.Print Err.Description & " (" & Err.Number & ")"
    'Stop

End Sub

Before you run this code, goto ThisWorkbook and enter the following helpful code which will always navigate to the above code upon workbook opening...

Private Sub Workbook_Open()
    Application.GoTo "TestOleDbProvider"
End Sub

Then save the workbook before you run TestOleDbProvider because it will crash and you will not have an opportunity to save it!

So running the above code crashes Excel, later I will show that an uncaught exception is being thrown by msado15.dll. Now this maybe to do with the fact I do not know how to yet supply a query. Peaking ahead I can show some of the C++ code (CFindFilesRowset::Execute) where it is assumes *.* if supplied an empty string

  CW2TEX<_MAX_PATH> szDir(m_strCommandText == L"" ? L"*.*" : m_strCommandText);

Time to investigate.

Debugging the OLEDB Provider with breakpoints

So we must investigate. I set the project's properties to start Excel and load a workbook called TestClient.xlsm as part of the debug properties. So on the FileFind project icon right-click and select properties (last entry on the menu) to display FindFiles Property Pages within which select the Debugging entry on the left hand side.

Running the code I get an unhandled exception and the call stack is firmly in msado15.dll..

 msado15.dll!CQuery::SetSQL(unsigned short *) Unknown Non-user code. Symbols loaded.
  msado15.dll!CQuery::SetCommandText(long,unsigned long,unsigned char,unsigned char) Unknown Non-user code. Symbols loaded.
  msado15.dll!CQuery::Execute(enum ExecuteTypeEnum,char,unsigned long,bool,unsigned long,unsigned long,long,struct tagVARIANT *,unsigned long,void *,long *,struct _ADORecordset * *) Unknown Non-user code. Symbols loaded.
  msado15.dll!CCommand::_Execute(enum ExecuteTypeEnum,char,unsigned long,bool,unsigned long,unsigned long,long,long,struct tagVARIANT *,unsigned long,void *,long *,struct _ADORecordset * *) Unknown Non-user code. Symbols loaded.
  msado15.dll!CCommand::ExecuteWithModeFlag(struct tagVARIANT *,struct tagVARIANT *,long,struct _ADORecordset * *,int) Unknown Non-user code. Symbols loaded.
  msado15.dll!CCommand::Execute(struct tagVARIANT *,struct tagVARIANT *,long,struct _ADORecordset * *) Unknown Non-user code. Symbols loaded.
  VBE7.DLL!1e813579() Unknown No symbols loaded.
  [Frames below may be incorrect and/or missing, no symbols loaded for VBE7.DLL]  Annotated Frame
  VBE7.DLL!1e7cff4b() Unknown No symbols loaded.
  VBE7.DLL!1e829d13() Unknown No symbols loaded.
  VBE7.DLL!1e82fea2() Unknown No symbols loaded.
  VBE7.DLL!1e82bcb5() Unknown No symbols loaded.
  [External Code]  Annotated Frame

And with that I am stuck. I guess I could ask Stack Overflow. I've never bothered Microsoft for support before, might do so on this occasion. I'll try a SO bounty first I think.

Update: Solution given by Simon Mourier

I am delighted to say that Simon Mourier at StackOverflow solved this by adding another interface ICommandText to the interface map.

BEGIN_COM_MAP(CFindFilesCommand)
    ...
    COM_INTERFACE_ENTRY(ICommandText) 
    ...
END_COM_MAP()

Apparently, the ADO runtime was querying for this interface and not finding it and thus calling on a null pointer. I'm surprised the ADO runtime doesn't check for null pointers but never mind. The cased is solved.

Thursday 5 July 2018

VBA - Python - OCR - Optical Character Recognition

It was Python month on this blog last month but still plenty of ideas of how to leverage the huge Python ecosystem and bring functionality to the feet of VBA Developers. In this blog I play with Optical Character Recognition (OCR) and get it callable from VBA using a COM gateway class.

Tesseract

The OCR Python library I use here is Tesseract which has a long pedigree and happily has Python bindings. But it needs some care to install properly.

Tesseract Installation

I found that using  pip install pytesseract  falsely reported success. Instead, what was necessary was the following steps...

  1. Find a site with a Tesseract Windows binary installer. I found Tesseract at UB Mannheim
  2. Run the Tesseract Windows binary installer. I ran https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-setup-3.05.02-20180621.exe
  3. Add the Tessearcht directory (for me 'C:\Program Files (x86)\Tesseract-OCR') to PATH environment variable
  4. Close down and restart all potential client processes, Visual Studio, Excel, Command Windows (cmd.exe), Powershell, any process.

We'll also use an Image library, Pillow, to load the image file, so use  pip install pillow 

OCRBatch Com Gateway Class

I have given this pattern many times over the last month here is a Python class with enough extra code to allow it to be registered as a COM class and thus callable from VBA. Here is the code, which must be run at least once under admin privileges to enable registration (thereafter not required).

## pip install pillow      ## succeeded

## https://github.com/UB-Mannheim/tesseract/wiki
## https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-setup-3.05.02-20180621.exe
## add C:\Program Files (x86)\Tesseract-OCR to PATH
## restart Visual Studio so it picks up changes to PATH



class OCRBatch(object):
    _reg_clsid_ = "{A7C1275F-7ABD-4AA2-90E5-462392D821DF}"
    _reg_progid_ = 'PythonInVBA.OCRBatch'
    _public_methods_ = ['RunBatch', 'RunOCR']

    def RunBatch(self, rootDir):
        import os
        for subdir, dirs, files in os.walk(rootDir):
            for file in files:
                #print os.path.join(subdir, file)
                filepath = subdir + os.sep + file

                if file.endswith(".jpeg") and file.startswith("File_"):
                    ocrFile = filepath + ".txt"
                    self.RunOCR (filepath,ocrFile)

    def RunOCR(self,imageFile, ocrFile):
        import io
        from PIL import Image
        import pytesseract

        img = Image.open(imageFile )
        text = pytesseract.image_to_string(img)

        with io.open(ocrFile,'w', encoding="utf-8") as f:
            f.write(text)
            f.close()

if __name__ == '__main__':
    print ("Registering COM servers...")
    import win32com.server.register
    win32com.server.register.UseCommandLine(OCRBatch)

    rootdir="C:\\Users\\Simon\\Downloads\\ocr"
    test = OCRBatch()
    test.RunBatch(rootdir)

The code has two methods RunBatch and RunOCR. The latter operates on a single file whilst the former operates on a folder and its subfolders. The folder is scanned for filenames that start with "File_" and end with ".jpeg" because that is the naming convention of the input device, you should change it to suit your input device.

When run on each file Tesseract will scan the Image and generate some text, the text is then saved to a file in the same directory with very similar name (simply suffixed with ".txt"). For me, the output can be quite jumbled and typically needs cleaning up in a text editor but it is better than typing in from scratch.

Test Script

Just quickly point out that I've added some test script code to the tail of the script, this is where the __main__ procedure runs the COM registration. I find this a useful place to write some test code and will probably continue this convention.

OS Walk

Again, quickly point another (smaller) Python nicety in that os.walk recursively walks a directory structure and its subfolders. In VBA one would need to write some code to do that. Here the Python ecosystem delivers another time saving.

Client VBA Code

So thanks to COM registration the client VBA is trivial...

Option Explicit

Sub Test()

    Dim obj As Object
    Set obj = VBA.CreateObject("PythonInVBA.OCRBatch")
    
    obj.RunBatch "C:\Users\Simon\Downloads\ocr"

End Sub

Node.js - Using HTTPS + CORS

So, on this blog I've given a Node.js simple web service example before but it was plain vanilla and didn't handle either secure connections with HTTPS or handle this curious protocol called CORS (see below). I remedy that here by given some working code which allows POSTing of json payloads.

So to make your Node.js web service handle https you need to use

var https = require('https');
var fs = require('fs');

var options = {
    // create the following two files beforehand with openssl
    // https://stackoverflow.com/questions/12871565/how-to-create-pem-files-for-https-web-server#answer-12907165
    key: fs.readFileSync('key.pem'),
    cert: fs.readFileSync('cert.pem')
};

where you are specifying the private key and the SSL certificate which you must have created before. Switch to your node.js source files directory so openssl.exe creates the files in the right place. A good stack overflow Q&A here shows how to create these files.

CORS Cross-Origin Resource Sharing

Ideally all files and content are delivered from one single web domain (+port). Sometimes a use case is not that simple. Sometimes a web resource needs to be accessed from a different origin. CORS allows this cross origin request. As you can imagine you are opening up a security hole; so in production think carefully about where cross-origin requests might originate and tie down as much as possible. In production, don't use wildcards like in sample code given below! A naughty person could launch a denial of service attack if you allow them! A good CORS guide is here.

Pre-flight request

Today I learnt that some client will send an HTTP OPTIONS request before sending a HTTP POST request; this is known as a "pre-flight request". If you are not using a web server framework then you will need to handle this manually by setting HTTP response headers like the following...

            // IN PRODUCTION DO NOT USE WILDCARDS!!!
            response.setHeader("Access-Control-Allow-Origin", "*");
            response.setHeader('Access-Control-Allow-Methods', 'POST, OPTIONS');
            response.setHeader("Access-Control-Allow-Headers", "Origin, X-Requested-With, content-type, Accept");

So in the above code I have not tied down Origin but you dear reader must do in production! But this is a blog sample. I have tied down the HTTP methods, only allowing POST and OPTIONS. I have also tied down the headers.

Whilst one should should tie down options as much as possible if you tie down too much your client will assume it has been refused permission and not send the follow-on POST request (in my use case). I had to debug CORS today because my POSTs were not coming through. If the preflight OPTIONS request is not implemented then yes any POST request will not follow on.

Anyway, here is a full working example. The following code runs an HTTPS service at 127.0.0.1:8000 and only allows OPTIONS and GET. Don't forget in production to further restrict the origin and not use wildcards...

'use strict';

var https = require('https');
var fs = require('fs');

var options = {
    // create the following two files beforehand with openssl
    // https://stackoverflow.com/questions/12871565/how-to-create-pem-files-for-https-web-server#answer-12907165
    key: fs.readFileSync('key.pem'),
    cert: fs.readFileSync('cert.pem')
};

const port = 8000;

console.log('nversion 0.001n');

//https://stackoverflow.com/questions/5998694/how-to-create-an-https-server-in-node-js#answer-21809393
https.createServer(options, function (request, response) {

    switch (request.method) {
        case "OPTIONS":
            // IN PRODUCTION DO NOT USE WILDCARDS!!!
            response.setHeader("Access-Control-Allow-Origin", "*");
            response.setHeader('Access-Control-Allow-Methods', 'POST, OPTIONS');
            response.setHeader("Access-Control-Allow-Headers", "Origin, X-Requested-With, content-type, Accept");
            response.end();
            break;
        case "GET":
            console.log(request.url);
            response.end('Hello Node.js Server!');
            break;
        case "POST":
            // IN PRODUCTION DO NOT USE WILDCARDS!!!
            response.setHeader("Access-Control-Allow-Origin", "*");
            response.setHeader('Access-Control-Allow-Methods', 'POST, OPTIONS');
            response.setHeader("Access-Control-Allow-Headers", "Origin, X-Requested-With, content-type, Accept");
            
            //https://nodejs.org/en/docs/guides/anatomy-of-an-http-transaction/

            let body = [];
            request.on('data', (chunk) => {
                body.push(chunk);
            }).on('end', () => {
                body = Buffer.concat(body).toString();
                // at this point, `body` has the entire request body stored in it as a string
                console.log('nbody received:nn' + body);

            });

            response.end();
            break;
    }
}).listen(port);

Excel - VBA - Parsing International Currency Amounts in VBA

VBA Currency Parsing is limited to one's own currency

So parsing currencies in ordinary VBA can work so long as one sticks to one's own currency, so the following, CCur can handle commas and minus signs and on my machine Sterling Pound symbol but trips up on dollars.

Sub Test()

    Debug.Assert VBAParseCCur("1000") = 1000
    Debug.Assert VBAParseCCur("-1000") = -1000
    Debug.Assert VBAParseCCur("-10,000") = -10000
    
    Debug.Assert VBAParseCCur("£-10,000") = -10000 '* works on my (UK) machine
    Debug.Assert VBAParseCCur("$-10,000") = -10000  '* fails on my (UK) machine

End Sub

Function VBAParseCCur(v)
    On Error Resume Next
    VBAParseCCur = CCur(v)
End Function

COM System Parsers

So typically C++ programmers have greater access to the system functions than VBA programmers. A C++ programmer can call VarParseNumFromStr to parse a number out of a string.

HRESULT VarParseNumFromStr(
  LPCOLESTR strIn,
  LCID      lcid,
  ULONG     dwFlags,
  NUMPARSE  *pnumprs,
  BYTE      *rgbDig
);

One can see that one can specify the locale id as second parameter. If you know what you want, in this case I want a currency from a string then you can chose a more specific parsing function, VarCyFromStr

HRESULT VarCyFromStr(
  LPCOLESTR strIn,
  LCID      lcid,
  ULONG     dwFlags,
  CY        *pcyOut
);

With the help of VBFormus.com expert Olaf Schmidt we can call VarCyFromStr from VBA. And now we have the ability to parse multi-currency strings, see this code.

Option Explicit


'* http://www.vbforums.com/showthread.php?762443-VB6-Tabulator-Crosstab-Class
'* https://docs.microsoft.com/en-gb/previous-versions/windows/desktop/api/oleauto/nf-oleauto-varcyfromstr
Private Declare Function VarCyFromStr& Lib "oleaut32" (ByVal sDate&, ByVal LCID&, ByVal Flags&, C As Currency)

Private Enum lcidLocaleId
    lcidEN_US = 1033    '* US
    lcidEN_EN = 2057    '* UK
    lcidFR_FR = 1036    '* Eurozone
    lcidRU = 1049       '* Russia
    lcidJA = 1041       '* Japan
End Enum

Private Enum chrwCurrencies
    chrwEuro = 8364     '* for euros
    chrwRouble = 8381   '* for roubles
    chrwYen = 165       '* for yen
End Enum

Private Function CCurLA(ByVal sAmount As String, ByVal LCID As lcidLocaleId) As Currency
    Dim HRes As Long
    
    HRes = VarCyFromStr(StrPtr(sAmount), LCID, 0, CCurLA)
    If HRes Then Err.Raise HRes
End Function

Sub TestVarCyFromStr_Yen()

    Dim sYen As String
    sYen = ChrW(chrwYen) & "-1000"
    Dim curYen As Currency
    
    curYen = CCurLA(sYen, lcidJA)
    Debug.Assert curYen = -1000
End Sub

Sub TestVarCyFromStr_Roubles()

    Dim sRoubles As String
    sRoubles = ChrW(chrwRouble) & "-1000"
    Dim curRoubles As Currency
    
    curRoubles = CCurLA(sRoubles, lcidRU)
    Debug.Assert curRoubles = -1000
End Sub

Sub TestVarCyFromStr_Euros()

    Dim sEuros As String
    sEuros = ChrW(chrwEuro) & "-1000"
    Dim curEuros As Currency
    
    curEuros = CCurLA(sEuros, lcidFR_FR)
    Debug.Assert curEuros = -1000
End Sub

Sub TestVarCyFromStr_Dollars()

    Dim sDollars As String
    sDollars = "$-10,00"
    Dim curDollars As Currency
    
    curDollars = CCurLA(sDollars, lcidEN_US)
    Debug.Assert curDollars = -1000
End Sub

Sub TestVarCyFromStr_Pounds()

    Dim sPounds As String
    sPounds = "£-10,00"
    Dim curPounds As Currency
    
    curPounds = CCurLA(sPounds, lcidEN_EN)
    Debug.Assert curPounds = -1000
End Sub

Python - VBA - Com Interop - Factory Method Pattern

Python can interop with VBA via COM. One can create a Python class and expose it to VBA for scripting, this allows VBA developers to access the massive Python ecosystem. Sometimes, if you want to build an object hierarchy then you want some classes to be only creatable via another class, this is sometimes called the Factory Method Pattern. It can frequently occur in parent-child object relationships. In this post I show how to do this in Python.

Here is the Python code which must be run at least once with admin privilege to ensure COM registration (thereafter not required).

import win32com.client


class MyParent(object):
    _reg_clsid_ = "{C61A7C6E-B657-4D55-AD36-8850B2E501AC}"
    _reg_progid_ = 'PythonInVBA.MyParent'
    _public_methods_ = ['Greet', 'GetChild']

    def __init__(self):  # Rules of Com say parameterless constructors
        self.child = win32com.client.Dispatch("PythonInVBA.MyChild")
        self.child.SetName("foo")

    def Greet(self):
        return "Hello world"

    def GetChild(self):
        return self.child


class MyChild(object):
    _reg_clsid_ = "{15DAAEE2-3A37-4DE1-9973-CCD011DF4888}"
    _reg_progid_ = 'PythonInVBA.MyChild'
    _public_methods_ = ['Initialize', 'GetName', 'SetName']

    def __init__(self):  # Rules of Com say paramerless constructors
        pass

    def GetName(self):
        return self.name

    def SetName(self, sName):
        self.name = sName

if __name__ == '__main__':
    print ("Registering COM servers...")
    import win32com.server.register
    win32com.server.register.UseCommandLine(MyParent)
    win32com.server.register.UseCommandLine(MyChild)

The trick is to create the child class with win32com.client.Dispatch("PythonInVBA.MyChild") and not the standard constructor MyChild() . My thanks to Stackoverflow user Kunif for solving this.

And so the VBA client code looks like this

Sub Test_MyParent_Returning_MyChild()
    On Error GoTo ErrHand:

    Dim objMyParent As Object
    Set objMyParent = VBA.CreateObject("PythonInVBA.MyParent")

    Dim objMyChild As Object

    Set objMyChild = objMyParent.GetChild()
    Debug.Print objMyChild.GetName  '* prints foo

    Exit Sub
ErrHand:
    Debug.Print Err.Description
End Sub