In this series we're going to be looking at Tabular model development in Analysis Services, which encompasses a few different technologies. One of the fundamental underlying architectures we need to understand first is the BI Semantic Model: what it is and what it's for.
Fundamentally, the purpose of the BI Semantic Model (BISM) is to create one semantic model for all end-user experiences, rather than having every application and every reporting technology read the underlying data and assemble its own data model. (By data model we mean the way tables relate and the calculations we use: gross margins, percentages, and so on.)
If we build all of that intelligence into the model, then every one of our end-user applications can reuse that intelligence and modeling effort. The BISM is a structure for creating that kind of consumable model.
The BISM has three basic components:
The first is the data model itself. You can think of this as the data organized for end users. It isn't the raw data coming out of a transactional system; it's probably not even the data as it's stored in a data warehouse. It's something more intelligent than that: it's put together so that end-user applications like Excel can read it, understand it, and present it to users in a logical way.
The second piece is the business logic. Once we've organized the data elements into a logical, easy-to-understand format, we can add logic on top of it. That might be calculations like ratios and percentages, or time calculations like YTD and MTD numbers, year-over-year comparisons, and so on (there's a short sketch of what this looks like right after these three components). There's a lot of intelligence we can put into the model that goes beyond what we'd have with a regular database.
The third component is data access: pulling data out of its original source, bringing it into our data model, and applying the business logic to it.
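To make the business-logic component concrete, here's a purely illustrative sketch of two measures defined in DAX, the expression language we'll meet later in this article. The Sales table, its columns, and the Date table are hypothetical names, not anything prescribed by the BISM:

    -- A simple base measure, defined once in the model
    Total Sales := SUM ( Sales[SalesAmount] )

    -- A year-to-date version of the same measure, driven by a Date table
    Sales YTD := TOTALYTD ( [Total Sales], 'Date'[Date] )

Once measures like these live in the model, every report and every pivot table gets the same numbers without re-implementing the logic.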
We have these three basic elements of the semantic model, and next we’ll look at how those elements fit into the overall architecture.
At the bottom of this architecture we have the source of the data. That's really what we're calling our "data access". BISM is designed so we can pull data out of relational databases, transaction files, and web services, and bring it all together into one model for our users to use.
That’s a powerful thing if you think about it, because typically we give users access to databases, but they have to worry about going in and creating the intelligence that’s not in the database itself. BISM lets us do that for them.
On the output side of that we have the presentation of our data model with all the business logic to end-user applications. These might be Excel, Reporting Services, custom applications, PowerPivot and so on.
The BI Semantic Model is really designed to bring in data, serve applications, and add a lot of intelligence in the middle.
The way we actually deploy this is using Analysis Services, and in SQL Server 2012 there are a few different ways to deploy Analysis Services. We have the traditional Multidimensional cube environment, which uses MDX as the language to define business logic and queries. In this mode the data access is MOLAP (Multidimensional Online Analytical Processing). MOLAP pulls the data out of the source database and makes a cached copy of it, which yields very high performance for users because the copy has been indexed and optimized.
ROLAP is Relational OLAP, which refers to issuing relational queries at the time the data model requests the data. This has some real-time aspects that can be very useful, but it typically doesn't perform as well in the end-user application as MOLAP, though it can be made to perform well with significant tuning effort.
The new way to deploy this semantic model is what we call "Tabular". Tabular refers to tables: as we get deeper into tabular design, we'll find that the way you interact with the model is more like related tables and less like multidimensional cubes.
Tabular has a query and business-logic expression language as well, called DAX (Data Analysis Expressions). It's different from MDX; about the only thing the two have in common is the "X" at the end. There are many differences between the two languages, but each has its strengths, which we'll talk about later.
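To give a flavor of DAX, here's a small illustrative query that reuses the hypothetical Total Sales measure sketched earlier and returns sales by product category (again, the table and column names are made up):

    -- A DAX query: group Sales by category and evaluate the measure
    EVALUATE
    SUMMARIZE ( Sales, Product[Category], "Category Sales", [Total Sales] )

Notice how this reads like working with related tables and columns, rather than the axes and tuples of an MDX query.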
VertiPaq (since renamed xVelocity) more or less corresponds to MOLAP: both are caching data access mechanisms. DirectQuery more or less corresponds to ROLAP: both are ways to access data from a relational store at the time the query is executed.
That’s our overall architecture. What is special about the “Model”? Why do we bother doing this? Why not just put data into the relational database and let the users figure this stuff out on their own?
I think the most important aspect of this is that the model helps us hide complexity from end users. The more we can hide the underlying complexity of the data in the source systems, the more successful users will be in using the information and extracting value from it.
Secondly, tools like Excel can read the model and know how to display the data. If we only surface raw tables, there's significantly more work in Excel: extracting the data, storing it in worksheets, trying to relate it with VLOOKUPs, and so on. The BISM allows Excel to simply read the model and use it.
We also have various structures we want to show users, like hierarchies, and the BISM is a place where we can define them. A relational database doesn't really know about hierarchies; it has foreign keys, but those are only a rudimentary way to express a hierarchy. A semantic model has explicitly defined hierarchies that translate directly into visual elements within end-user applications.
We can also incorporate attributes such as images, fonts, font sizes, and so on: presentation details you won't find in a relational database but can build into a semantic model.
Flexible naming of measures and dimensions goes along with hiding data complexity. The model lets us rename things for users in ways that make more sense to them. For example, rather than exposing a cryptic six-character column name, we can give the column a model name like "Customer Identifier".
We can also build complex formulas into the model. Gross Margin, for example, isn't really a complex formula. But think about how it's typically implemented in a Business Intelligence environment without a model: it's implemented in the reporting layer, over and over and over. We hope it's implemented correctly in every report where it's used, but the reality is that it might not be. The model lets us embed calculations like this once, so every application sees the same definition.
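Continuing the illustrative example from earlier (the Sales columns are still hypothetical), embedding Gross Margin in the model could be as simple as one DAX measure:

    -- Defined once in the model; every report reuses this same definition
    Gross Margin := SUM ( Sales[SalesAmount] ) - SUM ( Sales[TotalProductCost] )

Every report, pivot table, and dashboard that references Gross Margin now computes it the same way.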
Finally, automatic aggregation is an important feature. Aggregating data is sometimes the whole reason we're implementing a BI solution: we want the Gross Margin % for a region, sliced by product. The BISM performs these aggregations automatically, so users don't have to work out the correct way to aggregate a ratio. We remove that responsibility from the user by putting the logic into the semantic model.
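A ratio measure shows why this matters. In the sketch below (still using the hypothetical names from earlier), the numerator and denominator are each aggregated at whatever level the user is slicing (region, product, or both) before the division happens, so the ratio comes out right at every level instead of being a misleading sum or average of row-level percentages:

    -- Numerator and denominator each aggregate at the user's chosen
    -- grain before the division, so the ratio is correct at any level
    Gross Margin % :=
        IF (
            SUM ( Sales[SalesAmount] ) = 0,
            BLANK (),
            [Gross Margin] / SUM ( Sales[SalesAmount] )
        )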