The Excel Dashboards Guide: Data Staging
May 06, 2020
How should a dashboard strategy be conceptualized? How do you translate the accumulated data into an effective business compass for the decision maker? Start your journey here
After we have finished all the preparations for our Dashboard in the previous Blog post of this Dashboards’ Guide series, we’re now ready to start implementing our Dashboard in Excel.
Preparing the Data for the Dashboards
A good data staging serving our Dashboards must be very efficient, to quickly refresh the Dashboards. This implies that we need to carefully strike a balance across all resources required to calculate the data.
Multiple options and techniques are at our disposal: are we writing complex Worksheet formulas? are we splitting the calculation across multiple intermediary tables? should we add a helper column(s) to a data table to replace costly functions with faster ones? should we use a Pivot Table for staging? are we writing SQL queries over our Worksheet data tables for high performance? are we staging the Dashboards’ data feed in a helper (hidden) Worksheet, or are we populating the dashboards’ elements directly with the calculated data? can a single staging table serve more than one Dashboard element (by adding more columns to it)?
These are some of the considerations and trade-offs we need to account for. They are affected by several factors, including:
- What is the size of the source data tables? Is it spread across multiple tables?
- Where is the data sourced from (this Workbook, other Workbooks, text (csv) files, Database server, Web-site)?
- How often are the Dashboards expected to be refreshed? How many elements are being refreshed concurrently?
- What are the triggers that call for a refresh?
- Do we know what to expect from the target machine in terms of Excel version, database type, network speed and reliability?
- What’s the usual load on the Workbook? Is it “loaded” with functions? How fast does it calculate?
- Can we leverage the inherent calculation speed of Pivot Tables, with timed refresh triggers?
- Are we anticipating future changes that profoundly impact the business requirements or the underlying architecture/design of the solution?
Separating the Data Retrieval from the Dashboards Presentation
Consider the following situations:
- We have many Dashboards’ elements to refresh, and it takes time to fetch the data and stage it, as well as to refresh the visual elements of the Dashboards. Taken together, the wait time until the user enjoys a refreshed Dashboard is too high.
- We source our data from an external provider that has a significant latency. Think about accessing the Internet with a request, or accessing another application (or even another Excel Workbook), with long latency.
First, you want to make sure your architecture and overall solution design are the best possible to serve the requirements. Be sure to remain with only delays that are beyond your control.
In such cases, we should consider separating the data sourcing (into “cold” staging tables) from the visual painting of the Dashboards’ elements. The “cold” staging tables are refreshed from the source in the appropriate frequency, either by a timer cue or by another trigger identified by the application. Since these tables are “cold” (Not attached to the visual Dashboards elements), their refresh has no further impact on any other element in the Workbook.
When the Dashboards are called for re-painting, we quickly copy the data from the “cold” staging tables into the “plugged” staging tables – for Dashboards calculation only.
The first part, refreshing the “cold” staging tables, can even be performed outside of the Dashboards’ Workbook. Another program can be tasked with refreshing the data from the source every so and so hours, storing the data in a text (csv) file that is very quickly read into the Dashboards’ Workbook when needed.
While considering all of the specific circumstances you are facing, I offer the following guidelines:
- Assign a hidden Worksheet for staging. Keep the Dashboard Worksheet clean and free of unnecessary burden.
- SQL queries over Excel tables are very (very!) fast. They also allow for complex joining of multiple Excel tables just like you’re used to with Database tables.
- Worksheet functions are typically calculated frequently, especially on a Workbook that is not dedicated only for Dashboards but also serves as a working Workbook. Use formulas scarcely for serving Dashboards off large data-sets as much as possible.
- Even if not using SQL for fast queries, use efficient VBA programming for preparing the data. Use arrays for fast in-memory processing – look here to see how fast it is.
- Importing a text (csv) file is very fast. Can you prepare the data sourced outside of the Workbook into text files and stage your tables from these text files?
- Can you keep your staging tables up-to-date at all times instead of waging an overall refresh when called for presentation? Maybe each table can be updated as soon as the operational transaction is committed? Here’s an example: when saving User Form’s data into the data table, the staging tables dependent on this data table can be quickly updated as well.
- Calculate a single staging table to serve multiple Dashboards’ elements if possible, instead of calculating a separate staging table per each element. Example: multiple elements may present calculation on your employees. A single Employees staging table is enough to calculate many columns to serve all such elements.
In closing this post, keep in mind that the main goal of the data staging phase is performance. The whole design, architecture and procedures need to be geared towards servicing the Dashboards view fast!
As hidden from the eye as data staging is, the visual effect of Dashboards will be the focus of our next (and last) post on Dashboard – next week.
Yes, I want to be among the first to join
NOW OPEN FOR FOUNDING MEMBERS!
Join today to the Excel VBA Inner Circle with Mor Sagmon.
The first to enroll as founding members will enjoy exceptional, life-long benefits and prices.
Click the button below to learn more and join.