SQL is the most popular Databases language. It is robust, proven and works with extremely large tables stored in any common relational Database software. If you want to learn more about Databases and SQL – read my last week’s Blog post here.
Excel is often used to store data in a table-like fashion: columns with headers and many rows below the columns’ headers.
If we have Excel tables (or ranges of cells arranged like tables) and we have SQL that is designed to query tables – why not use them both for better results?
Did I just say better results? Well, that was a polite way of saying: new possibilities are now open for you.
Well, to be accurate, I must say that technically speaking, you can do anything you want with VBA over Excel tables. However, going about quickly aggregating a 4-table join, processing thousands or tens of thousands of records – is not something I recommend doing without SQL (yes, even if using smart data structures and programming techniques, such as arrays).
What can be achieved in a single, well crafted SQL statement, may require hundreds of lines of VBA code, and the SQL statement will most probably perform much better.
Excel inherited the Jet engine developed by Microsoft for MS-Access. The Jet is a fair Databases connectivity and SQL engine, albeit not fully compliant with the SQL ANSI standards. The Jet is no longer developed and its latest version is 4.0.
Since MS-Office 2007 (.xlsx file extension), the Ace engine replaced the Jet engine (latest version: 16.0). The main benefits of the Ace over the Jet engine, are support for MS-Office 64bit, support some complex data types, encryption and security, and SharePoint and MS-Outlook integration.
However the ACE engine is backward-compatible, some inconsistencies may arise. I always suggest to start off with the Ace engine version 16, revert down to Ace version 12, and then (if not on MS-Office 64bit) default to the Jet engine.
The Jet engine is part of the Windows installation package. If Ace isn’t, download it here: https://www.microsoft.com/en-us/download/details.aspx?id=54920 (chose either the 32bit or the 64bit version to match your MS-Office installation (I said MS-Office – not MS-Windows!).
Although I cover in great detail VBA, relational Databases and SQL in my online courses, I will assume here that you know SQL and that you know VBA.
The rules you need to know about using SQL over Excel tables through the Ace/Jet engines, are:
Refer to a Worksheet (requires that the “UsedRange” property resolve to cells arranged as a table) named “Past Invoices” using this syntax:
Refer to a range of cells within the “Past Invoices” Worksheet like that:
I always leave no room for interpretations and always specify the explicit range.
FROM ((… JOIN … ON… ) JOIN … ON … )
SELECT TOP 5 …
ORDER BY 2, 4
As with any data source, we first need to establish a connection using either the Jet engine or the Ace engine.
I typically use the Ace engine and establish an early binding to the ADODB library by adding a reference to the Microsoft ActiveX Data Objects 6.1 Library (in the VBA editor, open: Tools -> References and tick this entry).
Next, we create an instance of the ADODB Connection object in a local variable and call the Open method in order to establish the connection. We need to provide the correct connection string to the Open method – more about that shortly.
Here’s a full example of a Subroutine that establishes a connection to the current Excel Workbook (“ThisWorkbook”):
Public ConnDB as ADODB.Connection
'Open the connection to the DB Provider, if not already open
Dim strProvider As String
'strProvider = "Provider=Microsoft.Jet.OLEDB.4.0;" 'If using the old JET engine
strProvider = "Provider=Microsoft.ACE.OLEDB.16.0;"
On Error GoTo FailedConnection
If (ConnDB Is Nothing) Then
Set ConnDB = New ADODB.Connection
If Not (ConnDB.State = 1) Then
ConnDB.Open strProvider & _
"Data Source=" & ThisWorkbook.FullName & ";" & _
"Extended Properties=""Excel 12.0;HDR=Yes;"";" 'The JET engine only works with "Excel 8.0"
MsgBox "Failed connecting to Excel as a DB. Please contact support.", vbOKOnly + vbCritical, "Database Error"
Set ConnDB = Nothing
Couple of comments on that code:
As you can see, the Connection String has a Provider part, a Data Source part, and an Extended Properties part.
The Provider part is the engine used.
The Data Source part points to the full path and file name of an accessible Workbook. In this example I’m just taking ThisWorkbook’s FullName property
The Extended Properties allows us some control over the behavior of the engine. Note how it is in itself a string enclosed within double quotes, therefore they are escaped (“”) as they are within an already constructed string (the Connection String itself).
Here are the important parameters for the Extended Properties, delimited by semicolon (;):
Yes, most SQL statements will work, including INSERT, UPDATE, DELETE.
However, the best dish, in my mind, is given to us in the SELECT command.
Here’s an example of constructing an SQL query that find the top 5 selling Body Care products.
We see a join between the Products table and the Order Lines table, with a SUM aggregate on the Total column in the Order Lines table.
The result is presented in the range with a top-left cell named Top_Five_Products_Anchor.
As the Ace engine does not know what is a Smart Table (ObjectList), I’m translating the tables’ range address into the proper notation valid for the Ace engine. Mainly, we need to strip off the “$” character that is reserved as the delimiter between the Worksheet name and the range itself, and enclose within square brackets.
Dim strTable1 As String
Dim strTable2 As String
Dim strSQL As String
Dim rs As New ADODB.Recordset
strTable1 = "[DB Data$" & Replace(ThisWorkbook.Sheets("DB Data").ListObjects("ExcelarateProductsTable").Range.Address, "$", "") & "]"
strTable2 = "[DB Data$" & Replace(ThisWorkbook.Sheets("DB Data").ListObjects("ExcelarateOrdersLinesTable").Range.Address, "$", "") & "]"
strSQL = "SELECT TOP 5 PRD.[Product Code], PRD.[Product Description], SUM(ORL.[Total])"
strSQL = strSQL & " FROM " & strTable1 & " AS PRD"
strSQL = strSQL & " INNER JOIN " & strTable2 & " AS ORL ON PRD.[Product Code]=ORL.[Product]"
strSQL = strSQL & " WHERE PRD.[Product Group]='Body Care'"
strSQL = strSQL & " GROUP BY PRD.[Product Code], PRD.[Product Description]"
strSQL = strSQL & " ORDER BY SUM(ORL.[Total]) DESC"
rs.Open strSQL, ConnDB
For simplification, I haven’t implemented error handling here (although we have that in the called Subs).
Finally, here’s my version of the DisconnectDB Sub:
'Close the connection to the DB if open
On Error GoTo FailedConnection
If Not (ConnDB Is Nothing) Then
If Not (ConnDB.State = 0) Then
MsgBox "Failed closing the DB connection.", vbOKOnly + vbCritical, "Database Error"
Set ConnDB = Nothing
The above code examples are taken from my online course: Beyond Excel Boundaries with Databases and SQL: High-Preforming, Scalable Business Applications. More details in this link.
The in-depth course, Beyond Excel Boundaries with Databases & SQL, is now open.If you are serious about your career, or about understanding Databases, SQL and how to develop Excel VBA business-grade programs with Databases - this course is definitely for you.