Thursday, February 24, 2011

Creating a Cube with SQL Server 2005 Analysis Services

Before we dive into building a cube, we need to review some basic definitions. This will help to ensure that at the end you not only have a working cube but an idea of what the cube represents, as well as how to use it. The first two items we will define are Facts and Dimensions.

In the AdventureWorksDW database you will find tables prefixed with the letters ‘Dim’ and ‘Fact’. Ideally, any time you build something to be called a data warehouse you would be building your tables with these two concepts in mind. But what are they? Well, the simplest explanation is to think of them as nouns and verbs, and try not to reflect back to those frustrating hours spent in grammar class. Instead, try to forge ahead with a “Schoolhouse Rock” version of grammar by looking at an example of each.

A dimension is commonly thought of as your noun or subject. It represents an object or a thing that either does something or has something done to it. It may help to also think of a dimension as something that can exist independently of events. An employee, a customer, and a product are all examples of dimensions. When you see a dimension built in the project below, you will find that it has things called attributes, which are simply columns from the underlying data source.

A fact, however, is your verb. It is the event that takes place against a dimension. An example of a fact would be the sale of a product. The sale is the fact, and the product is the dimension. Going forward, you should be able to understand the fundamental difference between facts and dimensions. When outlining what is needed to build a cube, you should be able to answer basic questions regarding dimensions and facts. And for those of you who did well in grammar class, you could probably diagram a few sentences and build yourself a well-structured data warehouse.

Later, we will be able to create a visual representation of facts and dimensions, and how they come together to form what is called a schema. For now, let us press on with two more important definitions: measures and hierarchies.

Your fact table should contain numerical data only, and not contain the descriptive information that would be found in a dimension table. This numerical data is then defined as a measure. Think of a fact table as an entity that contains rows based upon events, such that you would perform aggregate calculations on those events. For example, a fact table could contain one row for each sale made. Later on you will browse through some measures, and this will help you to understand how they were built and what they are to be used for.
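
If it helps to see the idea outside of BIDS, here is a minimal Python sketch of a fact table and a measure. The table contents and the names (dim_product, fact_sales, total_by_product) are made up for illustration and are not taken from AdventureWorksDW. The point is only that fact rows carry keys and numbers, and a measure is what you get when you aggregate those numbers.

```python
from collections import defaultdict

# Hypothetical dimension table: one row per product, keyed by a surrogate key.
dim_product = {
    1: {"name": "Road Bike", "product_line": "R"},
    2: {"name": "Mountain Bike", "product_line": "M"},
    3: {"name": "Touring Bike", "product_line": "T"},
}

# Hypothetical fact table: one row per sale, holding only keys and numbers.
fact_sales = [
    {"product_key": 1, "sales_amount": 1200.0, "order_quantity": 1},
    {"product_key": 2, "sales_amount": 800.0, "order_quantity": 1},
    {"product_key": 1, "sales_amount": 2400.0, "order_quantity": 2},
    {"product_key": 3, "sales_amount": 950.0, "order_quantity": 1},
]


def total_by_product(facts, products, measure):
    """Sum one numeric column of the fact rows, grouped by product name."""
    totals = defaultdict(float)
    for row in facts:
        # The descriptive text lives in the dimension, not in the fact row.
        name = products[row["product_key"]]["name"]
        totals[name] += row[measure]
    return dict(totals)


sales_by_product = total_by_product(fact_sales, dim_product, "sales_amount")
print(sales_by_product)
```

Notice that the product name never appears in the fact rows. The fact table only points at the dimension, which is the same separation you will see between the ‘Fact’ and ‘Dim’ tables later on.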

Hierarchies are defined inside of dimensions, and are created based upon existing primary-foreign key relationships. Hierarchies allow you to “drill down” through a cube. Think of an example involving customers, addresses, countries, and so on. You could start browsing a cube based upon the countries where your customers reside, looking at total sales by country. Then, you could drill down to view specific states within the United States, then specific addresses, then even a specific customer name. Each of those steps is a level within the hierarchy.
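
The drill-down idea can be sketched in a few lines of Python. Everything here is an assumption made for the example: the three levels (country, state, city), the sample rows, and the drill_down helper are all hypothetical, and a real cube would resolve this through its dimension definitions rather than by scanning rows.

```python
from collections import defaultdict

# Hypothetical hierarchy levels, ordered from most general to most specific.
GEOGRAPHY_LEVELS = ["country", "state", "city"]

# Hypothetical sales rows, already joined to their geography attributes.
sales = [
    {"country": "United States", "state": "California", "city": "San Diego", "amount": 100.0},
    {"country": "United States", "state": "California", "city": "Oakland", "amount": 50.0},
    {"country": "United States", "state": "Oregon", "city": "Portland", "amount": 75.0},
    {"country": "Canada", "state": "Alberta", "city": "Calgary", "amount": 40.0},
]


def drill_down(rows, path):
    """Total the rows under `path`, grouped by the next level down.

    `path` holds one chosen member per level already drilled into,
    e.g. ["United States"] returns totals by state within that country.
    """
    if len(path) >= len(GEOGRAPHY_LEVELS):
        raise ValueError("already at the lowest level of the hierarchy")
    next_level = GEOGRAPHY_LEVELS[len(path)]
    totals = defaultdict(float)
    for row in rows:
        # Keep only rows that match every member chosen so far.
        if all(row[GEOGRAPHY_LEVELS[i]] == member for i, member in enumerate(path)):
            totals[row[next_level]] += row["amount"]
    return dict(totals)


by_country = drill_down(sales, [])
by_state = drill_down(sales, ["United States"])
by_city = drill_down(sales, ["United States", "California"])
print(by_country, by_state, by_city, sep="\n")
```

Each call moves one level further down the same hierarchy, and the totals at any level add up to the total of the member above it.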

Getting Started
The first thing you will need to do is create a new Analysis Services project in BIDS (Business Intelligence Development Studio).

Figure 1
Once that is done, you will need to define a data source. Inside the Solution Explorer you will see the project you have just created, and directly under that project you should see a folder named ‘Data Sources’ (Figure 2).

Figure 2

Right-click and select ‘New Data Source’. You will be prompted with the Data Source Wizard. Select ‘Next’ and you should see the following (Figure 3):

Figure 3

Select ‘New’, and define a connection. Here, I will connect to an existing AdventureWorksDW database (screen print deliberately not shown), then click ‘Next’.

Figure 4

Select ‘Default’ for the Impersonation Information settings. If you are not able to connect with default permissions, get in touch with whoever administers your SSAS installation and agree on which option you should select on this screen. Click ‘Next’ again, name your Data Source, and then select ‘Finish’.

You should now see the following in your Solution Explorer (Figure 5):

Figure 5

Now, you may have noticed that we are using the sample AdventureWorksDW database, as opposed to the AdventureWorks database. Why? Well, because the AdventureWorksDW database has already been built with nicely defined dimension and fact tables, complete with pristine data. Such entities are crucial to the building of cubes, and the AdventureWorksDW database makes things easier for this walkthrough.

However, in real world scenarios it is unlikely you will be handed a pristine data warehouse to work with when building cubes. The options you have at that point are outside the scope of this article, but could include the building of views against existing source data. I recommend any book by Ralph Kimball on the subject of data warehousing to help you along in this area. But for now, we will just continue with our building of a cube against a nice data source.

Configure Data Source View
The next step is to configure your data source view. In the Solution Explorer, right-click on the ‘Data Source Views’ Folder, then select ‘New Data Source View’ to launch the wizard. Click ‘Next’, and you will be prompted to select a data source. Select the one we just created and you will see the following (Figure 6):

Figure 6

At this point, you should already have an idea of what data you are trying to analyze. For example, perhaps someone is asking to analyze the internet sales data. So, let’s continue with our demo by selecting the DimCustomer, DimGeography, DimProduct, DimTime, and FactInternetSales tables, then click ‘Next’. We will name our view ‘Adventure Works DSV’ and click ‘Finish’. BIDS will then display the design view of the data source view we just created.
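
To get a feel for how these tables relate before handing them to the cube wizard, here is a rough Python sketch that uses an in-memory SQLite database as a stand-in for SQL Server. The table and column names follow AdventureWorksDW, but the tables are cut down to a few columns and the rows are invented, so treat it as an illustration of the star-shaped join rather than the real schema.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here; this is an assumption
# made so the example can run anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimProduct (ProductKey INTEGER PRIMARY KEY, EnglishProductName TEXT);
CREATE TABLE DimTime (TimeKey INTEGER PRIMARY KEY, CalendarYear INTEGER);
CREATE TABLE FactInternetSales (
    ProductKey   INTEGER REFERENCES DimProduct(ProductKey),
    OrderDateKey INTEGER REFERENCES DimTime(TimeKey),
    SalesAmount  REAL
);
-- Invented sample rows, not real AdventureWorksDW data.
INSERT INTO DimProduct VALUES (1, 'Road Bike'), (2, 'Helmet');
INSERT INTO DimTime VALUES (10, 2003), (11, 2004);
INSERT INTO FactInternetSales VALUES (1, 10, 1500.0), (2, 10, 35.0), (1, 11, 1700.0);
""")

# The fact table sits in the middle and joins outward to each dimension.
query = """
SELECT t.CalendarYear, p.EnglishProductName, SUM(f.SalesAmount)
FROM FactInternetSales AS f
JOIN DimProduct AS p ON p.ProductKey = f.ProductKey
JOIN DimTime AS t ON t.TimeKey = f.OrderDateKey
GROUP BY t.CalendarYear, p.EnglishProductName
ORDER BY t.CalendarYear, p.EnglishProductName
"""
result = conn.execute(query).fetchall()
for year, product, amount in result:
    print(year, product, amount)
```

The foreign keys from the fact table to each dimension are what you will see drawn as lines between the tables in the data source view designer.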

The Cube
The next step is to build the actual cube. In Solution Explorer, right-click on the ‘Cubes’ folder and select ‘New Cube’ to launch the wizard, then click ‘Next’ and you will see the following screen (Figure 7):

Figure 7

For our demo we will accept the defaults and just click ‘Next’. However, you may want to experiment with different cubes by selecting different options, such as creating attributes only or not using the auto build feature. You can even build the cube without a data source and select a template to use. Some experimentation here would give you a greater sense of what actions the wizard will perform for you if you compare the finished products side-by-side.

You should now see the following (Figure 8):

Figure 8

Select the data source view we have just created and click ‘Next’. The wizard will now detect the dimension and fact tables and analyze the relationships between them to offer some suggestions. Click ‘Next’ to review the suggestions (Figure 9):

Figure 9

There is one extra piece of information not readily apparent to the wizard, and that is the identity of the Time dimension table. The Time dimension is quite important, as it will be used to represent time periods that are useful for analyzing and reporting on data.

We will manually select the DimTime table as our Time dimension, and select ‘Next’. You should now see the following (Figure 10):

Figure 10

Next we need to define some time periods. In this screen you will match the name of a column in the source table to the name of the property in the Time dimension. We will make the following selections (Figure 11):

Figure 11

Then click ‘Next’ to review the available Measures for the cube (Figure 12).

Figure 12

Take a minute to examine the names of these measures. You will see names that are tagged as “Amount”, “Cost”, and “Key”. Look back at how we defined measures, and it should be clearer how the wizard arrived at these suggestions. We will accept the defaults and click ‘Next’. The wizard will now detect the available hierarchies, and when it is complete you should see the following screen (Figure 13):

Figure 13

Now you can review the dimensions that were created and make adjustments if desired. Notice the wizard says “All relationships detected”. Refer back to how we defined a hierarchy and this screen should be clearer to you. Of course we are going to take the defaults and simply click ‘Next’, choose a name for our cube (Adventure Works Cube), and then select ‘Finish’. You should be taken to the design view of the cube inside of BIDS.

The Cube is Built, Now What?
Now we can start looking through some of the basic features inside the cube designer. The first screen you would be viewing after the cube is built is the cube designer. Here you will be able to explore the Measures and Dimensions on the left hand side of the GUI. You should recall that measures tend to be items that are, well, measurable and also note how the Fact table is being displayed here.

The Hierarchies and Attributes are displayed just below the Measures. Browsing through the Hierarchies you will see that most of the links simply bring you to a design view of a dimension. If you go to “Ship Date”, you will find an example of two hierarchies that have been defined by the Auto-Build process. Click on the “Edit Dim Time” link to go to the design view for the Time Dimension, and note that there are two hierarchies defined for this dimension. Close this design view and go back to the design view for the cube. Now, click on the Attributes tab and you will see the attributes associated with the dimensions. Again, this information is also displayed in the design view for the Dimensions. You may be wondering at this point about the “Ship Date” that is listed. After all, it is not listed on the right hand side under the Dimensions section of the Solution Explorer. So, what is “Ship Date”? Is it a dimension? Is it a fact? Where did it come from?

The short answer is that it was created by the auto-build process. But why? Well, the process saw a column named ShipDateKey in the FactInternetSales table inside the AdventureWorksDW database, which is what we used to build our data source and view. This column is defined as a foreign key column to the DimTime table. So, the process decided to build an extra dimension for our use. Click on the very next tab named “Dimension Usage” and you should see a list of all available dimensions, including one named “Dim Time (Ship Date)”. Scroll to the right and you will see the connection made to the FactInternetSales table.
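
This pattern, where one dimension table is reached through more than one foreign key in the fact table, is commonly called a role-playing dimension. Here is a small Python sketch of the idea. The column names OrderDateKey, ShipDateKey, SalesAmount, TimeKey, and CalendarYear follow AdventureWorksDW, but the rows and the sales_by_year helper are invented for the example.

```python
from collections import defaultdict

# Simplified stand-in for DimTime: TimeKey -> CalendarYear (values are made up).
dim_time = {1: 2003, 2: 2003, 3: 2004}

# Simplified stand-in for FactInternetSales; each row points at DimTime twice.
fact_internet_sales = [
    {"OrderDateKey": 1, "ShipDateKey": 2, "SalesAmount": 100.0},
    {"OrderDateKey": 2, "ShipDateKey": 3, "SalesAmount": 250.0},
    {"OrderDateKey": 3, "ShipDateKey": 3, "SalesAmount": 75.0},
]


def sales_by_year(facts, time_table, date_key_column):
    """Total SalesAmount by calendar year, using the chosen date key column."""
    totals = defaultdict(float)
    for row in facts:
        # The same time table is used whichever key column we follow.
        year = time_table[row[date_key_column]]
        totals[year] += row["SalesAmount"]
    return dict(totals)


by_order_year = sales_by_year(fact_internet_sales, dim_time, "OrderDateKey")
by_ship_year = sales_by_year(fact_internet_sales, dim_time, "ShipDateKey")
print("By order date:", by_order_year)
print("By ship date: ", by_ship_year)
```

The same sales produce different yearly totals depending on which date you follow, which is why it is useful to have a separate Ship Date dimension alongside the Order Date one.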

With the exception of the “Browser” tab at the far right, the remaining tabs are for functionality that is outside the scope of this introductory document. The Browser tab will let us start to examine the data, so let’s get started.

Before we are allowed to browse the cube, we need to process and deploy the project to the SSAS instance. Right-click on the project name in the Solution Explorer and select ‘Properties’. Make certain you are deploying to an instance of SSAS:

Figure 14

Right-click again on the project name and select ‘Deploy’. The cube will now be deployed to the instance of SSAS, and after it is complete we will be able to use the Browser tab. When the deploy is complete, go to the Browser tab, and on the left expand “Measures”, followed by “Fact Internet Sales”, then drag and drop “Sales Amount” into the section that says “Drag Totals or Detail Fields Here”. You should see the following (Figure 15):

Figure 15

Next, we will start to slice and dice this number by adding some dimensions. Find the Calendar Year attribute in the Order Date dimension and drag it to the columns section as follows (Figure 16):

Figure 16

Next, drag and drop the “Product Line” into the columns section to see the following (Figure 17):

Figure 17
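
If you are curious what the Browser tab is doing with that single number, the following Python sketch mimics it on a handful of invented rows. The attribute names, the sample values, and the slice_measure helper are hypothetical; the real browser asks the cube for these totals rather than adding up rows itself.

```python
from collections import defaultdict

# Invented rows, as if Sales Amount had already been joined to its dimensions.
sales = [
    {"calendar_year": 2003, "product_line": "Road", "sales_amount": 1000.0},
    {"calendar_year": 2003, "product_line": "Mountain", "sales_amount": 600.0},
    {"calendar_year": 2004, "product_line": "Road", "sales_amount": 1500.0},
    {"calendar_year": 2004, "product_line": "Road", "sales_amount": 500.0},
]


def slice_measure(rows, first_attr, second_attr, measure):
    """Total `measure` for every (first_attr, second_attr) combination."""
    cells = defaultdict(float)
    for row in rows:
        cells[(row[first_attr], row[second_attr])] += row[measure]
    return dict(cells)


# With no dimensions applied, the measure is a single grand total.
grand_total = sum(row["sales_amount"] for row in sales)

# Adding two attributes splits that same total into one cell per combination.
cells = slice_measure(sales, "calendar_year", "product_line", "sales_amount")

print("Grand total:", grand_total)
for (year, line), amount in sorted(cells.items()):
    print(year, line, amount)
```

However you slice it, the cells always add back up to the grand total you saw when Sales Amount was the only thing on the screen.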

Finally, go get yourself a snack. You have built your first cube and are well on your way to a career in Business Intelligence. Keep experimenting with the steps I have outlined above. Over time you will become more familiar with the terminology and concepts, and before long you will be offering to build cubes that would make yourself proud.
