Previously, I have given a VBA routine to pretty print an Xml Dom to a file which gives line breaks. In this post, I give code to write the Xml file as report to the worksheet. I even given code to color in the document to highlight the Xml syntax.
Warning: The coloring code is not very fast, I think the Excel.Range.Characters() method which allows the subselection and formatting of portions of text within a cell is the performance bottlenext (this is Microsoft code and not my code, I promise). So, I have shipped an extra Boolean parameter, bCancelColoring, to switch off the colorisation.
The code requires two library references so go to menu and take Tools->References to get the References dialog box and select (1) Microsoft Scripting Runtime and (2) Microsoft Xml, v6.0
It is intended that the code below be hosted in one single module called modXmlReports. I will be publishing code later that calls this module.
The colorisation was an interesting problem, it required walking the Dom and associating an element with a row number. This meant when parsing the Xml, I could query the Dom for the element's name. This meant I could skip any regular expression parsing code which was welcome.
The screenshot below is (an edited version) of a Microsoft sample xml file, Books.xml. Inevitably, I save my test files in a different location to you, dear reader, so please amend the file path in the test routine. For workspace, the code also uses a subfolder of the system's temp folder called 'modXmlReports'.
I had previously writetn the code as opening the xml text file as a workbook because Excel nicely parses the tabs and prints the xml indents onto the worksheet for us. I have switched to reading the xml text file manually and counting the tabs myself. This led to more stable code; I prefer not to keep opening and closing workbooks.
That's about it. This is meant for developers rather than end users. I have been working with mpeg files and I rewrote my mpeg file parser but I wanted a file to view, so I choose to parse the Mpeg file into an Xml file. So I am a client of the module below. I'll publish that Mpeg code soon.
So here is the code
modXmlReports
Option Explicit
'***********************************************************************************
'* Module Name: modXmlReports
'* Copyright exceldevelopmentplatform.blogspot.com 2nd November 2019
'*
'* Requires Tools Reference
'* Microsoft Scripting Runtime
'* Microsoft XML, v6.0
'*
'* Description:
'* This module will pretty print an Xml file i.e. it will place each element on at least a separate line
'* and will tab indent to show the level of recursive depth. Also it will read the prettified Xml file
'* and write onto a worksheet. It will even color the Xml syntax to make the contents clearer though this
'* colorisation code can be slow and so is cancellable
'*
'***********************************************************************************
'************************************************************
'* test routine
'************************************************************
Private Sub TestPrettyReportXml()
Dim sXmlInputFile As String
sXmlInputFile = "N:\Dash\Books.xml" '*<----- this will differ for you, dear reader!
PrettyReportXml ThisWorkbook, sXmlInputFile, False, ""
End Sub
'*********************************************************************************************************
'* Name: PrettyReportXml
'* Description: Sole entry point for this module
'*********************************************************************************************************
Public Function PrettyReportXml(ByVal wbReports As Excel.Workbook, _
ByVal sXmlInputFile As String, _
ByVal bCancelColoring As Boolean, _
ByVal sWorkFolder As String) As Boolean
Debug.Assert (Not wbReports Is Nothing) And (Len(sXmlInputFile) > 0)
'* this is a three step process:
'* (1) first load the Xml.txt file into a workshete and copy in
'* (2) associate each line with an element of the dom, taking into account open and close on different lines
'* (3) color in the text
Dim xmlDoc As MSXML2.DOMDocument60
Set xmlDoc = New MSXML2.DOMDocument60
If Not xmlDoc.Load(sXmlInputFile) Then
Debug.Print "Xml file '" & sXmlInputFile & "' does not parse! Aborting."
Else
Dim sXmlPrettifiedFileName As String
sXmlPrettifiedFileName = ReportFileName(sXmlInputFile, sWorkFolder)
Dim wsReturn As Excel.Worksheet
Set wsReturn = OpenAndCopyXmlReportIntoWorkbook(wbReports, xmlDoc, sXmlPrettifiedFileName)
Dim dicLinesAndElements As Scripting.Dictionary, dicColors As Scripting.Dictionary
If Not bCancelColoring Then
Set dicLinesAndElements = AssociateLineNumbersWithElements(wsReturn, xmlDoc)
Set dicColors = New Scripting.Dictionary
dicColors.Add "Punctuation", VBA.Information.RGB(0, 0, 255) '* feel free to change these colors
dicColors.Add "XmlToken", VBA.Information.RGB(153, 0, 0)
dicColors.Add "Content", VBA.Information.RGB(0, 0, 0)
ColorInText wsReturn, dicLinesAndElements, dicColors
CleanUpLinesAndElements dicLinesAndElements
End If
'* force tidy up
Set dicLinesAndElements = Nothing
Set dicColors = Nothing
Set wsReturn = Nothing
Set xmlDoc = Nothing
Set wbReports = Nothing
sXmlInputFile = vbNullString
sXmlPrettifiedFileName = vbNullString
'* end of tidy up
PrettyReportXml = True
End If
End Function
'************************************************************
'* function which govern the working file location
'************************************************************
Private Function LeafName(ByVal sInputFile As String) As String
Dim fso As Scripting.FileSystemObject
Set fso = New Scripting.FileSystemObject
Dim filMpeg As Scripting.File
Set filMpeg = fso.GetFile(sInputFile)
LeafName = filMpeg.name
Set filMpeg = Nothing
Set fso = Nothing
End Function
Private Function ReportFileName(ByVal sInputFile As String, ByVal sWorkFolder As String) As String
Dim fso As Scripting.FileSystemObject
Set fso = New Scripting.FileSystemObject
'* we want to work with our file which we will place in same directory as ThisWorkbook
If sWorkFolder = "" Then sWorkFolder = TempFolder
ReportFileName = fso.BuildPath(sWorkFolder, LeafName(sInputFile) & ".txt")
Set fso = Nothing
End Function
Private Function TempFolder() As String
Dim fso As Scripting.FileSystemObject
Set fso = New Scripting.FileSystemObject
Const TemporaryFolder As Long = 2
TempFolder = fso.GetSpecialFolder(TemporaryFolder)
If Not fso.FolderExists(fso.BuildPath(TempFolder, "modXmlReports")) Then
Dim fldTemp As Scripting.Folder
Set fldTemp = fso.GetFolder(TempFolder)
fldTemp.SubFolders.Add "modXmlReports"
Debug.Assert fso.FolderExists(fso.BuildPath(TempFolder, "modXmlReports"))
End If
TempFolder = fso.BuildPath(TempFolder, "modXmlReports")
Set fso = Nothing
End Function
'Private Function ThisWorkbookHomeFolder() As String
' Dim fso As Scripting.FileSystemObject
' Set fso = New Scripting.FileSystemObject
'
' 'Debug.Assert ThisWorkbook.Saved = True
' ThisWorkbookHomeFolder = fso.GetFile(ThisWorkbook.FullName).ParentFolder.ShortPath
' Set fso = Nothing
'
' '* alternative
' 'Dim vSplit
' 'vSplit = VBA.Split(ThisWorkbook.FullName, "\")
' 'ReDim Preserve vSplit(0 To UBound(vSplit) - 1)
' 'ThisWorkbookHomeFolder = VBA.Join(vSplit, "\")
'End Function
'************************************************************
'* prettify Xml and load into workbook routines
'************************************************************
Private Function OpenAndCopyXmlReportIntoWorkbook(ByVal wbReports As Excel.Workbook, _
ByVal xmlDoc As MSXML2.DOMDocument60, ByVal sXmlPrettifiedFileName As String) As Excel.Worksheet
Debug.Assert (Not wbReports Is Nothing) And (Not xmlDoc Is Nothing) And (Len(sXmlPrettifiedFileName) > 0)
If StrComp(Right$(sXmlPrettifiedFileName, 4), ".txt", vbTextCompare) <> 0 Then
Debug.Print "File '" & sXmlPrettifiedFileName & "' must end in .txt so as to bypass Excel's Xml file opening logic. Aborting."
GoTo SingleExit
Else
Dim vSrc As Variant, lLineCount As Long, lColCount As Long
vSrc = WriteAndReadPrettifiedReport(xmlDoc, sXmlPrettifiedFileName, lLineCount, lColCount)
Dim wsDest As Excel.Worksheet
Set wsDest = wbReports.Worksheets.Add(After:=wbReports.Worksheets.Item(wbReports.Worksheets.count))
Dim sSheetName As String
sSheetName = VBA.Split(LeafName(sXmlPrettifiedFileName), ".")(0)
If SheetExists(wbReports, sSheetName) Then
Application.DisplayAlerts = False
wbReports.Worksheets.Item(sSheetName).Delete
Application.DisplayAlerts = True
End If
Set wbReports = Nothing
wsDest.name = sSheetName
wsDest.Cells(1, 1).Resize(lLineCount, lColCount).Value2 = vSrc
vSrc = Empty
Set OpenAndCopyXmlReportIntoWorkbook = wsDest
Set wsDest = Nothing
Set wbReports = Nothing
DoEvents '* allow repaint
End If
SingleExit:
End Function
Private Function WriteAndReadPrettifiedReport(ByVal xmlDoc As MSXML2.DOMDocument60, _
ByVal sXmlPrettifiedFileName As String, _
ByRef plLineCount As Long, ByRef plColCount As Long)
Dim fso As Scripting.FileSystemObject
Set fso = New Scripting.FileSystemObject
Dim txtOut As Scripting.TextStream
Set txtOut = fso.CreateTextFile(sXmlPrettifiedFileName)
txtOut.Write PrettyPrintXml(xmlDoc)
txtOut.Close
Set txtOut = Nothing
Set xmlDoc = Nothing
Dim lMaxTabCount As Long
lMaxTabCount = 0
Dim dicLines As Scripting.Dictionary
Set dicLines = New Scripting.Dictionary
Dim txtIn As Scripting.TextStream
Set txtIn = fso.OpenTextFile(sXmlPrettifiedFileName)
Dim sLine As String
sLine = txtIn.ReadLine
While Not txtIn.AtEndOfStream
Dim lTabCount As Long
lTabCount = CountChars(sLine, vbTab)
If lTabCount > lMaxTabCount Then lMaxTabCount = lTabCount
DoEvents
dicLines.Add dicLines.count, sLine
sLine = txtIn.ReadLine
Wend
dicLines.Add dicLines.count, sLine '* add final line
txtIn.Close
Set txtIn = Nothing
Set fso = Nothing
plLineCount = dicLines.count
plColCount = lMaxTabCount + 1
ReDim vRet(1 To dicLines.count, 1 To plColCount) As Variant
Dim vLine As Variant, lRow As Long
For Each vLine In dicLines.Items
lRow = lRow + 1
lTabCount = CountChars(vLine, vbTab)
vRet(lRow, lTabCount + 1) = Replace(vLine, vbTab, "")
Next vLine
Set dicLines = Nothing
WriteAndReadPrettifiedReport = vRet
End Function
Private Function CountChars(ByVal s As String, ByVal C As String)
CountChars = Len(s) - Len(Replace(s, C, ""))
End Function
Private Function SheetExists(ByVal wb As Excel.Workbook, ByVal sSheetName As String) As Boolean
SheetExists = SheetNames(wb).Exists(sSheetName)
End Function
Private Function SheetNames(ByVal wb As Excel.Workbook) As Scripting.Dictionary
Dim dic As Scripting.Dictionary
Set dic = New Scripting.Dictionary
Dim ws As Excel.Worksheet
For Each ws In wb.Worksheets
dic.Add ws.name, ""
Next
Set SheetNames = dic
End Function
Private Function PrettyPrintXml(ByVal dom As MSXML2.DOMDocument60) As String
Dim reader As MSXML2.SAXXMLReader60
Set reader = New MSXML2.SAXXMLReader60
Dim writer As MSXML2.MXXMLWriter60
Set writer = New MSXML2.MXXMLWriter60
writer.omitXMLDeclaration = True
writer.indent = True
reader.putProperty "http://xml.org/sax/properties/lexical-handler", writer
Set reader.contentHandler = writer
reader.Parse dom.XML
PrettyPrintXml = writer.output
Set reader = Nothing
Set writer = Nothing
End Function
'************************************************************
'* walking the dom to associate elements with lines routines
'************************************************************
Private Function AssociateLineNumbersWithElements(ByVal wsReturn As Excel.Worksheet, _
ByVal xmlDoc As MSXML2.DOMDocument60) As Scripting.Dictionary
'* thankfully each line will have its own element
'* so we can now assocaite an element with a line and
'* leverage the Xml parser instead of doing regular expressions
Dim dicLinesAndElements As Scripting.Dictionary
Set dicLinesAndElements = New Scripting.Dictionary
WalkDomDoc xmlDoc.DocumentElement, dicLinesAndElements
Set AssociateLineNumbersWithElements = dicLinesAndElements
End Function
Private Sub CleanUpLinesAndElements(ByVal dicLinesAndElements As Scripting.Dictionary)
Dim vKey
For Each vKey In dicLinesAndElements
Set dicLinesAndElements.Item(vKey) = Nothing
dicLinesAndElements.Remove vKey
Next
End Sub
Private Sub WalkDomDoc(ByVal xmlElem As MSXML2.IXMLDOMElement, ByVal dicLinesAndElements As Scripting.Dictionary)
Dim bHasChildren As Boolean
bHasChildren = False
Dim lMyLine As Long
lMyLine = dicLinesAndElements.count + 1
dicLinesAndElements.Add lMyLine, Array("Both", xmlElem)
Dim xmlNodeLoop As MSXML2.IXMLDOMNode
For Each xmlNodeLoop In xmlElem.ChildNodes
If xmlNodeLoop.NodeType = NODE_ELEMENT Then
If Not bHasChildren Then
bHasChildren = True
'* replace
dicLinesAndElements.Item(lMyLine) = Array("Open", xmlElem)
End If
WalkDomDoc xmlNodeLoop, dicLinesAndElements
End If
Next xmlNodeLoop
If bHasChildren Then
dicLinesAndElements.Add dicLinesAndElements.count + 1, Array("Close", xmlElem)
End If
End Sub
'**************************************
'* the color routines
'**************************************
Private Sub ColorInText(ByVal wsReturn As Excel.Worksheet, ByVal dicLinesAndElements As Scripting.Dictionary, ByVal dicColors As Scripting.Dictionary)
'* ASSUMPTION: thanks to the PrettyPrintXml() function beneath each element node is on its own line.
Debug.Assert Not wsReturn Is Nothing
Dim lRowLoop As Long
For lRowLoop = 1 To wsReturn.Cells(1, 1).CurrentRegion.Rows.count
Dim bFoundXmlCell As Boolean
bFoundXmlCell = False
Dim rngXmlCell As Excel.Range
Dim lColumnLoop As Long
For lColumnLoop = 1 To wsReturn.Cells(1, 1).CurrentRegion.Columns.count
Set rngXmlCell = wsReturn.Cells(lRowLoop, lColumnLoop)
If Len(rngXmlCell) > 0 Then
bFoundXmlCell = True
Exit For
End If
Next
If bFoundXmlCell Then
ColorXml rngXmlCell, dicLinesAndElements.Item(lRowLoop), dicColors
End If
Next
End Sub
Private Sub ColorXml(ByVal rngXmlCell As Excel.Range, ByVal vTagDetails, ByVal dicColors As Scripting.Dictionary)
Dim vOpenOrClose
vOpenOrClose = vTagDetails(0)
Dim xmlElem As MSXML2.IXMLDOMElement
Set xmlElem = vTagDetails(1)
Dim lNodeNameLen As Long
lNodeNameLen = Len(xmlElem.nodeName)
If vOpenOrClose = "Close" Or vOpenOrClose = "Both" Then
ColorCloseTagToken rngXmlCell, lNodeNameLen, dicColors, xmlElem, vOpenOrClose
End If
If vOpenOrClose = "Open" Or vOpenOrClose = "Both" Then
ColorOpenTagToken rngXmlCell, lNodeNameLen, dicColors
ColorAttributes rngXmlCell, lNodeNameLen, dicColors, xmlElem
End If
End Sub
Private Sub ColorAttributes(ByVal rngXmlCell As Excel.Range, ByVal lNodeNameLen As Long, _
ByVal dicColors As Scripting.Dictionary, ByVal xmlElem As MSXML2.IXMLDOMElement)
If xmlElem.Attributes.Length > 0 Then
Dim sXml As String
sXml = rngXmlCell.Value2
Dim lQuotes As Long
lQuotes = 1
Dim xmlAttrLoop As MSXML2.IXMLDOMAttribute
For Each xmlAttrLoop In xmlElem.Attributes
Dim lFindId As Long
lFindId = InStr(lQuotes, sXml, xmlAttrLoop.name, vbTextCompare)
rngXmlCell.Characters(Start:=lFindId, Length:=Len(xmlAttrLoop.name)).Font.Color = dicColors.Item("XmlToken")
Dim lEquals As Long
lEquals = InStr(lFindId + 1, sXml, "=", vbTextCompare)
rngXmlCell.Characters(Start:=lEquals, Length:=1).Font.Color = dicColors.Item("Punctuation")
lQuotes = InStr(lEquals + 1, sXml, """", vbTextCompare) '* assume double quotes, TODO make single quote aware
rngXmlCell.Characters(Start:=lQuotes, Length:=1).Font.Color = dicColors.Item("Punctuation")
lQuotes = InStr(lQuotes + 1, sXml, """", vbTextCompare) '* assume double quotes, TODO make single quote aware
rngXmlCell.Characters(Start:=lQuotes, Length:=1).Font.Color = dicColors.Item("Punctuation")
Next
End If
End Sub
Private Sub ColorCloseTagToken(ByVal rngXmlCell As Excel.Range, ByVal lNodeNameLen As Long, _
ByVal dicColors As Scripting.Dictionary, ByVal xmlElem As MSXML2.IXMLDOMElement, ByVal vOpenOrClose)
Dim lStart As Long
lStart = VBA.IIf(vOpenOrClose = "Both", InStr(1, rngXmlCell, Chr$(60) & "/" & xmlElem.nodeName, vbTextCompare), 1)
rngXmlCell.Characters(Start:=lStart, Length:=2).Font.Color = dicColors.Item("Punctuation")
rngXmlCell.Characters(Start:=lStart + 2, Length:=lNodeNameLen).Font.Color = dicColors.Item("XmlToken")
rngXmlCell.Characters(Start:=lStart + 2 + lNodeNameLen, Length:=1).Font.Color = dicColors.Item("Punctuation")
End Sub
Private Sub ColorOpenTagToken(ByVal rngXmlCell As Excel.Range, ByVal lNodeNameLen As Long, ByVal dicColors As Scripting.Dictionary)
rngXmlCell.Characters(Start:=1, Length:=1).Font.Color = dicColors.Item("Punctuation")
rngXmlCell.Characters(Start:=2, Length:=lNodeNameLen).Font.Color = dicColors.Item("XmlToken")
Dim lRightAngleBracket As Long
lRightAngleBracket = InStr(1, rngXmlCell.Value2, Chr$(62), vbBinaryCompare)
Debug.Assert lRightAngleBracket > 0
rngXmlCell.Characters(Start:=lRightAngleBracket, Length:=1).Font.Color = dicColors.Item("Punctuation")
End Sub
I was here to comment because I think the original code you posted had a function missing - looks like you've already updated it. Thank you for updating it - and thank you for creating this truly interesting and useful blog...
ReplyDelete