The official PASS Blog is where you’ll find the latest blog posts from PASS community members and the PASS Board. Contributors share their thoughts and discuss a wide variety of topics spanning PASS and the data community.

Analytical Databases in the Microsoft Data Platform: A Single Source of Truth

Microsoft’s documentation calls Analysis Services an analytical data engine. This documentation references the Tabular Model relying on Vertipaq technology. The marketing involves Azure analytical databases and the growing presence of the refined Big Data tools. No matter what the message, analytical reporting relies on data. The source, structure, and retrieval are important yesterday and today.

The term “database” can be added to the analytical engine expression when one set of data is deployed for consumption. The analytical database encapsulates a business area or areas. A visualization tool using an analytical database like Power BI works best with a star schema structure. The star schema is usually a fact table with relationships to dimension tables called a data mart. Many data marts with conformed dimensions will become a data warehouse. Other data warehouse structures, like data vault or third normal form, can be used as an analytical database source just like a star schema.

Why Use an Analytical Database?

Analytical analysis has morphed from reporting to various levels of statistics. All can be explained as a single source of truth. The business needs data to slice and dice, finding good or bad patterns. Further, the data can be used to estimate the future business path through statistical modeling.

Why not use the source database?

Relational database applications are written to meet the needs of a business. At the time of development, some analytical reporting is not known because requirements were not gathered to support this analysis. The end users, during interviews, have issues speaking the IT language and the IT analyst does not understand all the business terms. The analytical and reporting requirements usually do not get hammered out until further development has been implemented, thus enters agile development. What this leaves is the transactional database structure lacking certain reporting abilities.

History

Analysis Services became a durable way to present calculations in a single database. Having these measures created in an Analysis Services database prevented different reports from running the same calculation in T-SQL. Now, an analyst or report writer does not have to remember where the calculation details are located to re-create it. It eliminates the copy and paste of a calculation.

The Online Analytical Processing (OLAP) term first referred to a multi-dimensional cube in Analysis Services. Even though there are multi-dimensional cubes still around, the vast majority of new SQL Server Analysis Services (SSAS) databases are created in the Tabular Model. This semantic layer on top of the star schema structures the measure's relationship to dimension attributes.

First marketed as a semantic model using the vertipaq engine, this database started in Excel as PowerPivot. PowerPivot eliminated Excel’s large data set limit and used a new technology for compressed column stored data. This was ported to Analysis Services and eventually to Power BI. The Tabular Model can be deployed to Azure, On-Premises, or the Power BI Service to be shared and reused. The features of each depend on the release.

Star Schema

What’s interesting is that the data structure that works best behind analytical databases is referred to as a star schema. Not to get too far off track, but a star schema is really a Kimball method data mart. Kimball even admits that they did not invent the star schema. This includes the snowflake schema which has nothing to do with the cloud database product. The idea behind star schema data marts is getting a business area of analytical data created faster than a full data warehouse. Conformed dimensions are what links data marts together to form a data warehouse.

Future

Now, with Azure, there are a multitude of rapidly changing options for an analytical database. The big data area has really grown up with many tools to remove the complexity of retrieving data and analyzing it. This whole set of big data would be a database, but its lack of consistent structure leaves it out of the analytical database realm. This set of big data is for the data scientist or analyst. The analyst might collaborate with reporting teams to add calculations for an existing analytical database producing a single source of truth. The single source of truth can be shared with others.

Analysis Services is the rock in all this new technology. It has been integrated into the Power BI Premium service. Recently, the XMLA endpoint has been opened for read (existing) and write (preview). Open source tools can now bring value to the analytical database. In addition, DevOps for analytical databases can proceed. This is a new and exciting time for people in the data warehouse realm. Since analytical database development can be deployed on premise, in Azure, and integrated into Power BI, there is no worry that it will be around for a long time.

Thomas LeBlanc
About the author

Thomas LeBlanc (Microsoft Data Platform MVP) is a Product Manager at Idera working with Wherescape data warehouse automation tools 3D and Streaming. WhereScape tools focus on the central goal of extracting the data in operational systems, normalizing it to the enterprise's standards, and sharing it with the various business stakeholders. As a PASS volunteer, he is past chair of Excel BI and Data Arch VCs and is active in the Baton Rouge Analytics and SQL Server User groups, as well as chair of the 1st SQLSaturday BI Edition in Baton Rouge.

1 comments on article "Analytical Databases in the Microsoft Data Platform: A Single Source of Truth"

4/25/2020 8:55 AM
Linda Larkan

Thanks for the excellent post!!


Please login or register to post comments.

Theme picker

Back to Top
cage-aids
cage-aids
cage-aids
cage-aids