Tuesday 9 October 2018

VBA - Microsoft.ACE.OLEDB.12.0 details

I investigated Microsoft.ACE.OLEDB.12.0 and have plenty of artefacts and findings. In case you haven't met this component it allows data to be read and written to Excel worksheets using SQL technology.

Nomenclature

As far as I can see ACE stands for Access Connectivity Engine. This wikipedia article is a good web page which highlights the history of the name.

COM Registry entries

Some time back (with some help from StackOverflow) I got the ATL C++ Sample OLEDB Provider compiled and working. From that experience I can tell you that every provider string is in fact a COM Prog ID. This means we call write code like this to test the installation...

Private Sub TestInstallation()
    Dim iunkOleDb As IUnknown
    Set iunkOleDb = VBA.CreateObject("Microsoft.ACE.OLEDB.12.0")  '<--- this would error if not installed
End Sub

It also means if we scan the registry for the Prog ID "Microsoft.ACE.OLEDB.12.0" then we can find other details. I have placed a registry export of the COM registry entries in Appendix A.

From the details it can be seen that the ProgId is 'Microsoft.ACE.OLEDB.12.0' whilst the fuller name is 'Microsoft Office 12.0 Access Database Engine OLE DB Provider' and it is implemented in the executable ACEOLEDB.DLL. This gives us some search terms to google on.

Installation

If you need to install this then you must download the Microsoft Access Database Engine 2010 Redistributable . The accompanying explanatory text says this is not a replacement for Jet saying one should use SQL Server Express Edition but to be honest I think many of us do see Microsoft.ACE.OLEDB.12.0 as a Jet replacement.

Also on that download page there some help about how to use an Extended Property in the connection string to specify the correct file format version.

File Type (extension)                               Extended Properties
---------------------------------------------------------------------------------------------
Excel 97-2003 Workbook (.xls)                       "Excel 8.0"
Excel 2007-2010 Workbook (.xlsx)                    "Excel 12.0 Xml"
Excel 2007-2010 Macro-enabled workbook (.xlsm)      "Excel 12.0 Macro"
Excel 2007-2010 Non-XML binary workbook (.xlsb)     "Excel 12.0"

As it turns out, one can supply a wider range of values than those shown above, i.e. non-Excel file formats. Appendix D shows a screenshot of the registry which I believe shows all the valid values, they are all ISAM Formats.

Pitfall - Workbooks needs to be saved

I suspect the code in the provider is contingent on the workbook's file extension and it will complain if it has no file extension. When you create a workbook, it is just "Book1" ; it has no file extension until it has been saved at least once. This pitfall is easily countered with a line of defensive code to inspect the workbook's file extension ...

    Debug.Assert UBound(Split(ThisWorkbook.Name, ".")) > 0  '* Workbook needs a file extension, i.e. saved at least once!

... or ...

    If UBound(Split(ThisWorkbook.Name, ".")) = 0 Then Err.Raise vbObjectError, , "#Workbook needs a file extension, i.e. saved at least once!"

Connection Strings Resources

An excellent resource for how to build a connection string for any data provider is www.connectionstrings.com and on that link one can see connection strings for historic versions of Excel. Also on that page are details of extended properties.

Jet Extended Properties

I'd like to compile a list of extended properties that relate to Microsoft.ACE.OLEDB.12.0 . I suspect many of them are inherited from the Jet. So here is a list of Jet extended properties courtesy of Working with MS Excel(xls / xlsx) Using MDAC and Oledb - CodeProject, a great article that I won't try and replicate.

Looks like Extended Properties needs enclosing double quotes (in some cases at least).

  • HDR - Short for Header, if YES then the top row are like column headers and interpreted as field names.
  • ReadOnly
  • FirstRowHasNames - different way to do same as HDR
  • MaxScanRows - data types are inferred from n rows, this sets n
  • IMEX - I'm guessing this is short for Import/Export and is also used in column type inference

Related to IMEX is ImportMixedTypes which I have seen in an Microsoft.ACE.OLEDB.12.0 connection string but not in a Jet connection string. For Jet and Microsoft.ACE.OLEDB.12.0 ImportMixedTypes is a registry entry but it also looks like supplying ImportMixedTypes in the Microsoft.ACE.OLEDB.12.0 connection string allows an override. For explanation of ImportMixedTypes here is another great article, this time at dailydoseofexcel.com, Daily Dose of Excel - External Data – Mixed Data Types .

Pitfall - The Problem of Type Inference

So the OLEDB Provider infers a column's data type from its contents, sampling the data. I don't much like this, I'd prefer a way to specify the data type but I have yet to find a way to do this. Perhaps it is best to ensure the data in the cells is consistent, we can lock sheets and control access to ensure a user does not corrupt the data but then that creates a need to show a separate data entry form. I will mull this. In the meantime I'd advise you are very disciplined that any data you write is type consistent for that column.

Access Connectivity Engine

So I have discovered another bunch of registry entries which I placed in Appendix B. So there is another dll at work here, ACEEXCL.DLL. I will try to investigate how ACEEXCL.DLL interacts with ACEOLEDB.DLL. UPDATE: I solved this in Appendix D!

Pitfall - Pass CursorTypeEnum.adOpenKeyset or CursorTypeEnum.adOpenStatic When Opening a Recordset

Even after correctly forming a connection strings I have still had some issues using this OLEDB provider. So in in my use case when calling the Recordset.Open method it is critical to pass the right enumeration value. CursorTypeEnum.adOpenDynamic and CursorTypeEnum.adOpenForwardOnly did not throw errors they simply returned an empty recordset! This matters because I believe one of them is the assumed default. I needed to pass either CursorTypeEnum.adOpenKeyset or CursorTypeEnum.adOpenStatic to get any rows back.

Pitfall - Better To Specify an Exact Range Than a Whole Sheet

Even after sorting a connection string and CursorTypeEnum parameter one can still get bugs. If a whole sheet is specified then it will infer data from the whole Worksheet.UsedRange. This means if you dirty your cells on the sheet (by entering anything and deleting them) then that cell and all those between it and $A$1 will be implied to belong to the table. So it is better to find the range with [A1].CurrentRegion.Address and either (1) define a name over that range and pass range name into the SQL or (2) used the explicit address of the range, e.g. $A$1:$B$3

Sample Code to Open a Recordset

So now we know where the pitfalls lie we can write some defensive sample code. This code opens a recordset and prints out its contents. PLEASE USE A FRESH NEW WORKBOOK! There is some setup code to write some data to a sheet in SetUpSomeData() so best to use a new workbook but remember to save the workbook at least once.

The code demonstrates the following points ...

  • It defends against the pitfalls of unsaved workbooks;
  • it supplies a working CursorTypeEnum;
  • it restricts the cells to select, by two different methods (1) by name and (2) by cell address

As a bonus I have added some code in ReadExcelCatalog which demonstrates using the ADOX library to read schema information so one can tell exactly what the OLEDB provider is inferring for a column type. Enjoy!.

Option Explicit
Option Private Module

'* Tools -> References
'* ADODB  Microsoft ActiveX Data Objects 6.1 Library  C:\Program Files (x86)\Common Files\System\ado\msado15.dll
'* ADOX   Microsoft ADO Ext. 6.0 for DDL and Security C:\Program Files (x86)\Common Files\System\ado\msadox.dll

Private Sub SetUpSomeData()
    '* WARNING this will wipe data!
    Dim sht As Excel.Worksheet
    Set sht = ThisWorkbook.Worksheets.Item("Sheet1")
    sht.Cells.Clear
    
    '*
    '* use our array literal trick, for more tricks tips and 'blue sky thinking'
    '* see http://exceldevelopmentplatform.blogspot.com
    '*
    Dim vData As Variant
    vData = [{"Color","RGB";"Red","FF0000";"Green","00FF00"}]

    sht.Range("A1:B3").Value2 = vData

End Sub

Private Sub TestInstallation()

    Dim iunkOleDb As IUnknown
    Set iunkOleDb = VBA.CreateObject("Microsoft.ACE.OLEDB.12.0")  '<--- this would error if not installed

End Sub


Private Sub ReadData()
    '* Code here assumes there is data in top left of Sheet1, call SetUpSomeData() first if you have no data
    'Call SetUpSomeData

    Dim oConn As ADODB.Connection
    Set oConn = New ADODB.Connection
    
    If UBound(Split(ThisWorkbook.Name, ".")) = 0 Then Err.Raise vbObjectError, , "#Workbook needs a file extension, i.e. saved at least once!"
    
    'Debug.Assert UBound(Split(ThisWorkbook.Name, ".")) > 0  '* Workbook needs a file extension, i.e. saved at least once!
    
    oConn.Open "Provider=Microsoft.ACE.OLEDB.12.0;" & _
           "Data Source=" & ThisWorkbook.FullName & ";" & _
           "Extended Properties='Excel 12.0 Macro;HDR=YES'"
           
    
    Dim sht As Excel.Worksheet
    Set sht = ThisWorkbook.Worksheets.Item("Sheet1")
    
    
    '*
    '* Limit the range to the block of cells contigous with A1
    '*
    Dim rngTable As Excel.Range
    Set rngTable = sht.Cells(1, 1).CurrentRegion
    
    
    '*
    '* Case 1 - using Named Range
    '* (use separate recordset)
    '*
    Dim rsByName As ADODB.Recordset
    Set rsByName = New ADODB.Recordset
    
    ThisWorkbook.Names.Add "MyTable", rngTable
    
    Dim sCmdTextUsingName As String
    sCmdTextUsingName = "Select * From MyTable"
    
    rsByName.Open sCmdTextUsingName, oConn, CursorTypeEnum.adOpenStatic '* can use CursorTypeEnum.adOpenKeyset
    Debug.Assert rsByName.RecordCount > 0
    
    
    
    '*
    '* Case 2 - using Cell Addresses
    '* (use separate recordset)
    '*
    Dim rsByCellAddress As ADODB.Recordset
    Set rsByCellAddress = New ADODB.Recordset
    
    Dim sCmdTextUsingCellAddress As String
    sCmdTextUsingCellAddress = "Select * From [" & sht.Name & "$" & rngTable.Address(False, False, xlA1) & "]"
    
    rsByCellAddress.Open sCmdTextUsingCellAddress, oConn, CursorTypeEnum.adOpenStatic '* can use CursorTypeEnum.adOpenKeyset
    Debug.Assert rsByCellAddress.RecordCount > 0
    

    '*
    '* output one of the recordsets (they should be identical)
    '*
    DumpRecordset rsByCellAddress

End Sub

Private Sub DumpRecordset(ByVal rs As ADODB.Recordset)
    '*
    '* Some code to iterate over the recordset
    '*
    rs.MoveFirst
    
    Dim lFieldCount As Long
    lFieldCount = rs.Fields.Count
    
    While Not rs.EOF
            
        Dim sOutputLine As String
        sOutputLine = ""
            
        Dim sFieldAndValue As String
        Dim lFieldLoop As Long
        For lFieldLoop = 0 To lFieldCount - 1
            sFieldAndValue = rs.Fields.Item(lFieldLoop).Name & ":" & rs.Fields.Item(lFieldLoop).Value
            
            sOutputLine = sOutputLine & VBA.IIf(Len(sOutputLine) > 0, vbTab, "") & sFieldAndValue
        Next
        Debug.Print sOutputLine
        rs.MoveNext
    Wend

End Sub


Private Sub ReadExcelCatalog()
    '*
    '* Some code to give the schema details such as columns names and columns types (what is inferred rathe than what is defined)
    '*
    Dim oConn As ADODB.Connection
    Set oConn = New ADODB.Connection
    
    If UBound(Split(ThisWorkbook.Name, ".")) = 0 Then Err.Raise vbObjectError, , "#Workbook needs a file extension, i.e. saved at least once!"
    
    'Debug.Assert UBound(Split(ThisWorkbook.Name, ".")) > 0  '* Workbook needs a file extension, i.e. saved at least once!
    
    oConn.Open "Provider=Microsoft.ACE.OLEDB.12.0;" & _
           "Data Source=" & ThisWorkbook.FullName & ";" & _
           "Extended Properties='Excel 12.0 Macro;HDR=YES'"
    
    Dim catDB As ADOX.Catalog
    Set catDB = New ADOX.Catalog
    Set catDB.ActiveConnection = oConn
    
    Dim adoxTableLoop As ADOX.Table
    For Each adoxTableLoop In catDB.Tables
        If adoxTableLoop.Name = "MyTable" Then
            Dim adoxColumnLoop As ADOX.Column
            For Each adoxColumnLoop In adoxTableLoop.Columns
                Debug.Print adoxColumnLoop.Name & vbTab & Switch(adoxColumnLoop.Type = adVarWChar, "String", adoxColumnLoop.Type = adDouble, "Double")
            Next
        End If
    Next

End Sub

Links

Appendix A - COM Registry entries

It always useful to poke around in the registry to see what makes something tick, here is a registry export of the related keys. It turns out there is a second bunch of registry keys to tune the behaviour (page down). The following set of registry keys fulfil the COM registration requirements for OLEDB providers.

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\15.0\ClickToRun\REGISTRY\MACHINE\Software\Classes\Wow6432Node\
CLSID\{3BE786A0-0366-4F5C-9434-25CF162E475E}]
"OLEDB_SERVICES"=dword:fffffffe
@="Microsoft.ACE.OLEDB.12.0"

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\15.0\ClickToRun\REGISTRY\MACHINE\Software\Classes\Wow6432Node\
CLSID\{3BE786A0-0366-4F5C-9434-25CF162E475E}\ExtendedErrors]
@="Microsoft.ACE.OLEDBErrors.12.0"

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\15.0\ClickToRun\REGISTRY\MACHINE\Software\Classes\Wow6432Node\
CLSID\{3BE786A0-0366-4F5C-9434-25CF162E475E}\ExtendedErrors\{3BE786A0-0366-4F5C-9434-25CF162E475F}]
@="Microsoft.ACE.OLEDBErrors.12.0"

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\15.0\ClickToRun\REGISTRY\MACHINE\Software\Classes\Wow6432Node\
CLSID\{3BE786A0-0366-4F5C-9434-25CF162E475E}\InprocServer32]
@="C:\\Program Files (x86)\\Common Files\\Microsoft Shared\\OFFICE15\\ACEOLEDB.DLL"
"ThreadingModel"="Both"

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\15.0\ClickToRun\REGISTRY\MACHINE\Software\Classes\Wow6432Node\
CLSID\{3BE786A0-0366-4F5C-9434-25CF162E475E}\OLE DB Provider]
@="Microsoft Office 12.0 Access Database Engine OLE DB Provider"

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\15.0\ClickToRun\REGISTRY\MACHINE\Software\Classes\Wow6432Node\
CLSID\{3BE786A0-0366-4F5C-9434-25CF162E475E}\ProgID]
@="Microsoft.ACE.OLEDB.12.0"

Appendix B - Access Connectivity Engine Registry entries

So I have discovered another bunch of registry entries which I found after discovering this page Initializing the Microsoft Excel Driver -MSDN.

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\15.0\ClickToRun\REGISTRY\MACHINE\Software\Wow6432Node\
Microsoft\Office\15.0\Access Connectivity Engine\Engines\Excel]
"DisabledExtensions"="!xls"
"ImportMixedTypes"="Text"
"FirstRowHasNames"=hex:01
"AppendBlankRows"=dword:00000001
"TypeGuessRows"=dword:00000008
"win32"="C:\\Program Files (x86)\\Common Files\\Microsoft Shared\\OFFICE15\\ACEEXCL.DLL"

Appendix C - Access Connectivity Engine Files and Dependencies

So there is a whole bunch of files prefixed with ACE*.DLL which look related to Access Connectivity Engine, for me they are located in

C:\Program Files\Microsoft Office 15\root\vfs\ProgramFilesCommonX86\Microsoft Shared\OFFICE15

which looks like some sort of virtualised file system (is that what the vfs stands for?). Anyway, here is the list

 Directory of C:\Program Files\Microsoft Office 15\root\vfs\ProgramFilesCommonX86\Microsoft Shared\OFFICE15

1,680,128 ACECORE.DLL   'depends on OS files OLE32.DLL, ADVAPI32.DLL, KERNEL32.DLL, OLEAUT32.DLL and C++ Files MSVCR100.DLL and MSVCP100.DLL
  432,384 ACEDAO.DLL    'depends on OS files OLE32.DLL, ADVAPI32.DLL, KERNEL32.DLL, OLEAUT32.DLL and C++ Files MSVCR100.DLL 
   35,032 ACEERR.DLL    'depends on OS files OLE32.DLL, ADVAPI32.DLL, KERNEL32.DLL               and C++ Files MSVCR100.DLL 
  633,688 ACEES.DLL     'depends on OS files OLE32.. ADVAPI32.. KERNEL32.. OLEAUT32.. VERSION.DLL and MSVCR100.. MSVCP100..
  186,600 ACEEXCH.DLL   'depends on ACECORE.DLL ; OS files OLE32.. ADVAPI32.. KERNEL32.. OLEAUT32..  and MSVCR100.. 
  400,184 ACEEXCL.DLL   'depends on ACECORE.DLL ; OS files OLE32.. ADVAPI32.. KERNEL32.. OLEAUT32..  and MSVCR100..  MSVCP100..
  278,256 ACEODBC.DLL   'depends OS files GDI32.DLL OLE32.. ADVAPI32.. KERNEL32.. COMDLG32.DLL  and MSVCR100..  
   15,000 ACEODEXL.DLL  'depends on ACEODBC.DLL ; OS file KERNEL32..  and MSVCR100..  
   15,016 ACEODTXT.DLL  'depends on ACEODBC.DLL ; OS file KERNEL32..  and MSVCR100..  
  329,552 ACEOLEDB.DLL  'depends on OS files OLE32.DLL, ADVAPI32.DLL, KERNEL32.DLL, OLEAUT32.DLL and C++ Files MSVCR100.DLL 
  161,400 ACETXT.DLL    'depends on ACECORE.DLL ; OS files OLE32.. ADVAPI32.. KERNEL32.. OLEAUT32..  and C++ MSVCR100..  MSVCP100..
3,049,184 ACEWDAT.DLL   'depends on OS file KERNEL32.DLL and C++ File MSVCR100.DLL 

I do not know what ACEES.DLL or ACEWDAT.DLL are but all the other files we can guess at their purpose.

ACECORE.DLL   'The core library for Access Connectivity Engine (ACE)
ACEDAO.DLL    'The DAO (Data Access Objects) companion file for ACE
ACEERR.DLL    '?A repository of error messages?
ACEEXCH.DLL   'The Microsoft Exchange companion/driver file for ACE
ACEEXCL.DLL   'The Microsoft Excel companion/driver file for ACE via OLEDB
ACEODBC.DLL   'The ODBC (Open Database Connectivity) companion file for ACE
ACEODEXL.DLL  'The Microsoft Excel companion/driver file for ACE via ODBC
ACEODTXT.DLL  'The Textfile companion/driver file for ACE via ODBC
ACEOLEDB.DLL  'The core OLEDB ACE file
ACETXT.DLL    'The Textfile companion/driver file for ACE via OLEDB

We know the route into the code starts with COM and ACEOLEDB.DLL (see Appendix A), looking at the entry points for ACEOLEDB.DLL we see the classic COM entry points

DllCanUnloadNow
DllGetClassObject
DllMain

If we look at the entry point for ACEEXCL.DLL we see the classic COM entry points

DllGetClassObject

so very much a COM DLL. I wonder what classes are created and passed out by these DLLs. OLEVIEW sheds no light on this or ACEOLEDB.DLL

Appendix D - ISAM Formats (first term of Extended Properties) Map to Engines

Below is a screenshot of the registry which I believe shows all the valid values for the first term of the Extended Properties. They are all ISAM Formats.

Looking down the list of Value Data pairs for the given key, 'Excel 12.0 Macro', we can see one entry 'Engine' with 'Excel' as the string data. The 'HTML Export' and 'HTML Import' keys also have 'Engine' Values with 'Text' as the string data. In another screenshot we can see that they must be mapping to the keys under the Engines key. I have drawn some mapping lines (sorry no arrow heads).

Let's look at what is the Excel engine key. Voila, it tells which DLL to load to handle requests for Excel in the win32 value ...ACEEXCL.DLL . The Value-Data pairs shown below in the next screenshot have already been detailed in Appendix B but it is only now that I have pieced together the logic sequence to the load the right 'engine' file.

1 comment:

  1. Wow, nice homework.
    Latest security updates late in 2019 or early 2020 may have messed up the ability to use embedded read-only passwords. One alternative was to manually install a latest MSOLEDB or similar file, which isn't very flexible, to allow VBA passwords. Any idea about that? Thanks.

    ReplyDelete