MS BI TOOLS

MICROSOFT BUSINESS INTELLIGENCE

Saturday, February 26, 2011

Pivots with Dynamic Columns in SQL Server 2005


The PIVOT operator in SQL Server 2005 rotates a table: it turns rows into columns. PIVOTs are frequently used in reports, and they are reasonably simple to work with. However, I've seen quite a few questions about this operator, most of them about the column list in the PIVOT statement. This list is fixed, but often the new columns are only determined by the report at a later stage. This problem is easily solved by combining PIVOT with dynamic SQL, so here is a very simple example of how to generate the pivot statement dynamically.
For example, suppose you have a table like this (I use only three months here for simplicity):
CREATE TABLE Sales ([Month] VARCHAR(20) ,SaleAmount INT)

INSERT INTO Sales VALUES ('January', 100)
INSERT INTO Sales VALUES ('February', 200)
INSERT INTO Sales VALUES ('March', 300)

SELECT * FROM Sales
 

Month             SaleAmount
----------------  -----------
January           100
February          200
March             300 

Suppose we wanted to convert the above into this:
 
January     February    March
----------- ----------  ----------
100         200         300

We can do this using the PIVOT operator, as follows:
SELECT  [January]
      , [February]
      , [March]
FROM    ( SELECT    [Month]
                  , SaleAmount
          FROM      Sales
        ) p PIVOT ( SUM(SaleAmount)
                    FOR [Month]
                      IN ([January],[February],[March])
                  ) AS pvt
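
To make the rotation concrete outside SQL Server, here is a minimal Python sketch of the same result. It is an illustration under assumptions, not the PIVOT operator itself: it uses an in-memory SQLite database, which has no PIVOT, so it swaps in one SUM(CASE ...) expression per output column. The function name pivot_sales is hypothetical.

```python
import sqlite3

def pivot_sales(rows, months):
    """Rotate (month, amount) rows into a single row with one column per month."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Sales (Month TEXT, SaleAmount INTEGER)")
    conn.executemany("INSERT INTO Sales VALUES (?, ?)", rows)
    # One SUM(CASE ...) per output column stands in for
    # PIVOT (SUM(SaleAmount) FOR [Month] IN (...)), which SQLite lacks.
    select_list = ", ".join(
        "SUM(CASE WHEN Month = ? THEN SaleAmount END)" for _ in months
    )
    result = conn.execute(f"SELECT {select_list} FROM Sales", months).fetchone()
    conn.close()
    return result

sales = [("January", 100), ("February", 200), ("March", 300)]
print(pivot_sales(sales, ["January", "February", "March"]))  # (100, 200, 300)
```

Note that the list of months still has to be known before the query text is built, which is exactly the limitation the rest of this post works around.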

However, in the above example, I have the column names fixed as the first three months. If I want to create a result in which the columns are dynamic (for example, they are read from a table), then I need to construct the above query dynamically. To demonstrate this let’s look at the following example:
In the first table I have the column names I want to use:
CREATE TABLE Table1 (ColId INT,ColName VARCHAR(10))
INSERT INTO Table1 VALUES(1, 'Country')
INSERT INTO Table1 VALUES(2, 'Month')
INSERT INTO Table1 VALUES(3, 'Day')

In the second table I have the data. This consists of a row identifier (tID), a column ID (ColID) that refers to the column type in Table1, and a value:
CREATE TABLE Table2 (tID INT,ColID INT,Txt VARCHAR(10))

INSERT INTO Table2 VALUES (1,1, 'US')
INSERT INTO Table2 VALUES (1,2, 'July')
INSERT INTO Table2 VALUES (1,3, '4')
INSERT INTO Table2 VALUES (2,1, 'US')
INSERT INTO Table2 VALUES (2,2, 'Sep')
INSERT INTO Table2 VALUES (2,3, '11')
INSERT INTO Table2 VALUES (3,1, 'US')
INSERT INTO Table2 VALUES (3,2, 'Dec')
INSERT INTO Table2 VALUES (3,3, '25')

Now I would like to retrieve data from these two tables, in the following format:

tID         Country    Day        Month
----------- ---------- ---------- ----------
1           US         4          July
2           US         11         Sep
3           US         25         Dec 

In other words I want to turn the data rows in Table2 into columns. If I had a fixed set of columns for the result, i.e. the columns Country, Day, and Month were fixed, I could use SQL Server 2005’s PIVOT operator in a query like:
SELECT  tID
      , [Country]
      , [Day]
      , [Month]
FROM    ( SELECT    t2.tID
                  , t1.ColName
                  , t2.Txt
          FROM      Table1 AS t1
                    JOIN Table2
                       AS t2 ON t1.ColId = t2.ColID
        ) p PIVOT ( MAX([Txt])
                    FOR ColName IN ( [Country], [Day],
                                     [Month] ) ) AS pvt
ORDER BY tID ;

However I need to construct this query dynamically, because the column names Country, Day, and Month are specified in a table, and can be changed independently from my query. In our case these columns are given in Table1.
In the first step to generate the final pivot query I need to create the list of columns, in this case [Country], [Day], [Month].
Since there is no string concatenation aggregator in SQL (a concatenation aggregator would not be deterministic without some order restriction), and since the column names are stored in rows of a table, I need to flatten these columns into a single row or variable. There are various solutions to achieve this. One solution would be to use a query like:
DECLARE @cols NVARCHAR(2000)
SELECT  @cols = COALESCE(@cols + ',[' + colName + ']',
                         '[' + colName + ']')
FROM    Table1
ORDER BY colName

This query works both on SQL Server 2000 and 2005. It is efficient, but some may not like it because it uses the same variable (@cols) on both sides of an assignment. Another solution that works on SQL Server 2005 only is to use XML PATH.
DECLARE @cols NVARCHAR(2000)
SELECT  @cols = STUFF(( SELECT DISTINCT TOP 100 PERCENT
                                '],[' + t2.ColName
                        FROM    Table1 AS t2
                        ORDER BY '],[' + t2.ColName
                        FOR XML PATH('')
                      ), 1, 2, '') + ']'

This second query (I’ve seen this posted by Peter Larsson) has the advantage that it does not use the @cols variable on the right hand side. I like this solution more, since this can be extended as a general string concatenation aggregate in more complex queries.
Both of the above queries generate, from Table1, the string '[Country],[Day],[Month]'. This column list is used twice in the pivot query that we aim to construct: once in the list of columns that we want to retrieve, and once as the list of values that should become columns. Having constructed this list of columns above, we can just concatenate it with the missing parts of the pivot query like:
DECLARE @query NVARCHAR(4000)
SET @query = N'SELECT tID, '+
@cols +'
FROM
(SELECT  t2.tID
      , t1.ColName
      , t2.Txt
FROM    Table1 AS t1
        JOIN Table2 AS t2 ON t1.ColId = t2.ColID) p
PIVOT
(
MAX([Txt])
FOR ColName IN
( '+
@cols +' )
) AS pvt
ORDER BY tID;'
Executing this with
EXECUTE(@query)
will give us the expected result: a table that is pivoted and shows columns that were specified in a table:
tID         Country    Day        Month
----------- ---------- ---------- ----------
1           US         4          July
2           US         11         Sep
3           US         25         Dec
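
The string assembly above can also be sketched outside T-SQL. The following Python snippet is only an illustration of the same two steps (build the bracketed column list, then splice it into the query text twice); the helper names quote_name and build_pivot_query are hypothetical. One assumption goes beyond the original: closing brackets inside a name are doubled, which is what T-SQL's QUOTENAME does and which matters when the column names come from user-editable data.

```python
def quote_name(name):
    """Bracket-quote an identifier, doubling any closing bracket inside it."""
    return "[" + name.replace("]", "]]") + "]"

def build_pivot_query(col_names):
    """Return the T-SQL pivot statement for the given column names."""
    # Distinct, sorted names, mirroring DISTINCT ... ORDER BY in the post.
    cols = ",".join(quote_name(c) for c in sorted(set(col_names)))
    return (
        f"SELECT tID, {cols}\n"
        "FROM (SELECT t2.tID, t1.ColName, t2.Txt\n"
        "      FROM Table1 AS t1\n"
        "      JOIN Table2 AS t2 ON t1.ColId = t2.ColID) p\n"
        f"PIVOT (MAX([Txt]) FOR ColName IN ({cols})) AS pvt\n"
        "ORDER BY tID;"
    )

query = build_pivot_query(["Country", "Month", "Day"])
print(query)
```

The column list appears once in the SELECT list and once in the IN clause, just as in the @query string above.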

Dynamic column names and fields in SSRS (Custom Matrix)


Today I had no choice but to build a report with dynamic column headings and to determine dynamically which field in my dataset goes where. Since this is the second time I’ve gone through this exercise, and knowing how little information there is out there on how to really do it, I thought it deserved a blog entry.
So here is the basis of the requirements. You have a query that uses PIVOT, but thrown into the mix is the need for dynamic columns in the PIVOT. This usually comes up when you are going after things like the current week plus the last 52 weeks. That was the case in this situation: I needed to bring in a dynamic set of columns to be used in PIVOT. The matrix in Reporting Services 2005 did not give me what I needed in the end result, so this is the path I took.
The first task is to write the procedure that uses PIVOT with dynamic column headers. I’m not going to go into that method, since it’s well documented elsewhere and out of scope here. I will point you to Pivots with Dynamic Columns in SQL Server 2005, as it explains well how to accomplish this.
I wrote something against AdventureWorks for this example, so if you have AdventureWorks floating around you should be able to read this and run through it step by step with success.
Here is our procedure. I’m sure my methods will draw plenty of attention from my local T-SQL friends :) The dynamic SQL more so than anything...
CREATE PROCEDURE GetSalesPerWeek
AS
DECLARE @weeks_ordered TABLE (num VARCHAR(3))
DECLARE @weeks TABLE (wk INT)
DECLARE @DATE DATETIME
DECLARE @cols NVARCHAR(3000)
DECLARE @INT INT
DECLARE @col_pv VARCHAR(2000)
DECLARE @query VARCHAR(3000)

SET @INT = 1
SET @DATE = GETDATE()

-- Fill @weeks with the week numbers 1 through 52
WHILE @INT <= 52
BEGIN
    INSERT INTO @weeks VALUES (@INT)
    SET @INT = @INT + 1
END

-- Order the weeks so that the current week comes last
INSERT INTO @weeks_ordered
SELECT
wk
FROM @weeks
ORDER BY
CASE WHEN DATEPART(wk,@DATE) - wk < 0
THEN DATEPART(wk,@DATE) - wk + 53
ELSE DATEPART(wk,@DATE) - wk
END DESC

-- Column list for the PIVOT ... IN (...) clause: [n],[n+1],...
SELECT @col_pv = STUFF(( SELECT
                                '],[' + w.num
                        FROM  @weeks_ordered AS w
                        FOR XML PATH('')
                      ), 1, 2, '') + ']'

-- Select list: isnull([n],0) as Wn for each week
SELECT  @cols = STUFF(( SELECT
                                '],0) as W' + CASE WHEN CAST(w.num - 1 AS VARCHAR(2)) = 0 THEN '52' ELSE
                                                        CAST(w.num - 1 AS VARCHAR(2)) END + ',isnull([' + w.num
                        FROM  @weeks_ordered AS w
                        FOR XML PATH('')
                      ), 1, 2, '') + '],0) as W' + CAST(DATEPART(wk,GETDATE()) AS VARCHAR(2))

IF OBJECT_ID('tempdb.dbo.#detail') IS not null
        DROP TABLE #detail

CREATE TABLE #detail
(
AccountNumber VARCHAR(10)
,ProductNumber VARCHAR(25)
,OrderQty INT
,WeekNumber SMALLINT
)

INSERT INTO #detail
SELECT
    cust.AccountNumber
    ,items.ProductNumber
    ,det.OrderQty
    ,DATEPART(wk,hdr.ShipDate) WeekNumber
FROM
Sales.SalesOrderHeader hdr
INNER Join Sales.SalesOrderDetail det ON hdr.SalesOrderID = det.SalesOrderID
INNER Join Production.Product items ON det.ProductID = items.ProductID
INNER Join Sales.Customer cust ON hdr.CustomerID = cust.CustomerID
INNER Join @weeks ord ON DATEPART(wk,ShipDate) = wk
WHERE ShipDate >= DATEADD(wk,-52,'2004-06-01')
GROUP BY
    cust.AccountNumber
    ,items.ProductNumber
    ,det.OrderQty
    ,DATEPART(wk,hdr.ShipDate)
    ,wk
ORDER BY
CASE WHEN DATEPART(wk,ShipDate) - wk < 0
THEN DATEPART(wk,ShipDate) - wk + 53
ELSE DATEPART(wk,ShipDate) - wk
END

-- RIGHT(...) trims the leftover alias fragment at the start of @cols
SET @query =
'
Select 
    AccountNumber
    ,ProductNumber
    , '     + RIGHT(@cols,LEN(@cols)-10) + '
From
    #detail as sales
PIVOT (sum(OrderQty) FOR WeekNumber IN (' + @col_pv + ')) as pv
Order By AccountNumber
'
EXEC(@query)

The result shows the pivoted sales for each account number, with one column per week of the year.

The problem with all of this is the dynamic nature of the column names. In Reporting Services we’re used to handling column names as static entities. So here is how we’ll build our report, given that these column names can and will change over time.
So create a new report in your solution and add a new DataSet. Make it a text command with the following statement:
EXEC GetSalesPerWeek
Run the DataSet to verify everything comes in ok.

Now add a new table to your empty report. Add the account number and product number as you would normally. Next we need to figure out which week is actually first. To do this we’re going to write a function in the code section of SSRS.
In the layout tab go to Report and select Report Properties. This will give you the properties for the entire report. Select the Code tab. Copy and paste the following code into the window:
Public Function GetColumnHeading(ByVal x As Integer)
    Dim WeeksArr As New System.Collections.ArrayList()
    Dim i As Long
    Dim CurrentWeek As Long

    CurrentWeek = DatePart(DateInterval.WeekOfYear, System.DateTime.Now)

    ' Build the week numbers oldest first, ending with the current week
    For i = 1 To 52
        WeeksArr.Add(1 + (i + CurrentWeek - 1) Mod 52)
    Next
    Return WeeksArr(x)
End Function
This code was written by our own gmmastros. Thanks to him for this and the help it gave me when I needed it. Gets the job done and it does it quickly.
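
To see what order the function produces, here is the same arithmetic as a small Python sketch. It is only a port for illustration (the name column_headings is hypothetical), and it assumes the current week number is between 1 and 52; a week 53 at year end would need separate handling.

```python
def column_headings(current_week):
    """Return the 52 week numbers in report order for the given current week."""
    # Same formula as the VB loop: 1 + (i + CurrentWeek - 1) Mod 52, i = 1..52
    return [1 + (i + current_week - 1) % 52 for i in range(1, 53)]

weeks = column_headings(10)
print(weeks[0])    # 11, the leftmost column (51 weeks back)
print(weeks[51])   # 10, the current week, rightmost
```

With week 10 as the current week, the list runs 11, 12, ..., 52, 1, ..., 10, so index 0 is the oldest week and index 51 is the current one.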
Final results should look like this

Hit OK to save.
Now, in the header cell next to the Product Number, go ahead and enter an expression for the heading like this:
="W" & Code.GetColumnHeading(0)
Recall that our procedure returns each week as Wnn, where nn is the week number. In our code we created an ArrayList filled in the order we want, the same order the procedure is based on. Now, by using the index into the ArrayList, we can simply ask for the heading that should be all the way to the left (-51 weeks from the current week) with an index of 0. In the details textbox we can then do the following, given the same guidelines:
=Fields("W" & Code.GetColumnHeading(0)).Value
Most developers don’t know they can reference fields by name in this manner, usually because it just isn’t required. It is useful to know, though, that you can fill in the field name dynamically and get the same results as Fields!name.Value.
I went ahead and put a few more columns and increased the index requested from the ArrayList. Running that results in the following.

Now you have your custom matrix by means of dynamic column and field referencing. You also have a nice example of PIVOT with dynamic column names.

Adding columns to SSRS 2005 report Dynamically


  • I have a stored procedure that generates a list of columns and the associated data. 
  • The number of columns (and their names) may change from time to time, and the remote machine is inaccessible. Currently there are over 500 columns in the dataset.
  • Ultimately, I am comparing the min value of an item in each column to the max value of the item in the same column. If the two are the same, the item will not be displayed on the report; if they are different, we will display it on the report.
I cannot drag and drop 500-plus fields onto the report and do an
                          IIF(Min(Fields!Value) <> Max(Fields!Value), "Show Min and Max Values", "") 
on each one, because the field names may change the next time I run the report. I need to enumerate the columns (fields) returned by the stored procedure and add them to the report dynamically. 

In other words, I'll have three columns on the report: Field Name, Old Value and New Value.

Does anyone know how to generate such a report dynamically?

Please help me with this.
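
The comparison described above can be sketched in a few lines of Python. This is not an SSRS solution, only an illustration of the logic under assumptions: the rows are taken to be dictionaries keyed by whatever column names the procedure returns, the poster's "old" and "new" values are taken to be the column's min and max, and the name changed_fields is hypothetical.

```python
def changed_fields(rows):
    """Return (field, old, new) for every column whose min and max differ."""
    if not rows:
        return []
    report = []
    for field in rows[0]:  # column names are discovered at run time
        values = [row[field] for row in rows]
        low, high = min(values), max(values)
        if low != high:    # identical values are left off the report
            report.append((field, low, high))
    return report

sample = [
    {"Price": 10, "Qty": 5, "Color": "red"},
    {"Price": 12, "Qty": 5, "Color": "red"},
]
print(changed_fields(sample))  # [('Price', 10, 12)]
```

The output has a fixed three-column shape regardless of how many columns come in, which is the shape a report with static fields could bind to.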

ThanQ

Friday, February 25, 2011

Execute SQL Server 2005 Integration Services package from C#


Introduction

There are many ways to execute a SQL Server 2005 Integration Services (SSIS) package. You can use the command line utility DTExec.exe or its Windows GUI equivalent, DTExecUI.exe. A package can be executed within SQL Server Business Intelligence Development Studio (Visual Studio) or from a SQL Server Agent Job Step. A package can also be executed from .NET code!

Example SSIS Package

For the purposes of demonstration, I created a simple package that takes some data out of a SQL Server AdventureWorks database and dumps it into a flat file. The package also has one variable.
In any kind of "real world" scenario, your package will usually be driven by a configuration file. An SSIS configuration file is an XML file with a .dtsConfig extension that contains settings you can apply to a package (without actually changing or editing the package).
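
As a rough illustration of the command line route mentioned in the introduction, the following Python sketch assembles a DTExec call that loads a package file together with a configuration file. It is a sketch under assumptions, not the .NET approach this article is about: the file paths are hypothetical, and only the /F (package file) and /ConfigFile options of dtexec are used.

```python
import subprocess

def build_dtexec_command(package_path, config_path=None):
    """Build the dtexec argument list for a file-based package."""
    cmd = ["dtexec", "/F", package_path]
    if config_path:
        # Apply settings from a .dtsConfig file without editing the package
        cmd += ["/ConfigFile", config_path]
    return cmd

def run_package(package_path, config_path=None):
    """Run the package; dtexec exits with 0 when execution succeeds."""
    return subprocess.run(build_dtexec_command(package_path, config_path)).returncode

# Hypothetical paths, for illustration only
command = build_dtexec_command(r"C:\ssis\Export.dtsx", r"C:\ssis\Export.dtsConfig")
print(command)
```

Calling run_package requires a machine with SSIS installed; building the command list does not.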

Common SQL Server Reporting Services Challenges


In every business there are several different groups of report users, from chief executives, to business analysts, to operational staff, who all need access to reliable and current data in order to track overall business performance, investigate the effectiveness of individual business functions, or simply for ad-hoc day-to-day reporting.
In most organizations, at least some attempt has been made to meet these reporting needs. Historically, however, the problem has been that the available reports have not always been up-to-date, or even accurate. Furthermore, individual departments have tended to adopt a "silo" approach, using different tools/systems to create reports that are useful within their silo, but not necessarily consistent or compatible with those produced by other departments. In many cases, there doesn't even exist a shared understanding of the business data that underpin these reports.

Thursday, February 24, 2011

SSAS::::MDX Optimization Techniques: Segregating DISTINCT COUNT

Initial Approach via MDX

Let’s initialize the MDX Sample Application, the platform from which we perform many practice exercises within the articles of our series. (We choose it because any organization that has installed MSAS has access to the Sample Application). We will create our initial query by taking the following steps:

1. Start the MDX Sample Application.

We are initially greeted by the Connect dialog, shown in Figure 1.


Figure 1: The Connect Dialog for the MDX Sample Application

Creating Cube with SQL Server 2005 Analysis Services

Definitions::::
Before we dive into building a cube, we need to review some basic definitions. This will help to ensure that at the end you not only have a working cube but an idea of what the cube represents, as well as how to use it. The first two items we will define are Facts and Dimensions.

In the AdventureWorksDW database you will find tables prefixed with the letters ‘Dim’ and ‘Fact’. Ideally, any time you build something to be called a data warehouse you would be building your tables with these two concepts in mind. But what are they? Well, the simplest explanation is to think of them as nouns and verbs, and try not to reflect back to those frustrating hours spent in grammar class. Instead, try to forge ahead with a “Schoolhouse Rock” version of grammar by looking at an example of each.

Wednesday, February 23, 2011

MS SQL Server interview questions


This one always gets asked. For a while the database interview questions were limited to Oracle and generic database design questions. This is a set of more than a hundred Microsoft SQL Server interview questions. Some questions are open-ended, and some do not have answers.
1. What is normalization? - A relational database is basically composed of tables that contain related data. The process of organizing this data into tables is referred to as normalization.
2. What is a Stored Procedure? - It's nothing but a set of T-SQL statements combined to perform a single task or several tasks. It's basically like a macro, so when you invoke the stored procedure, you actually run a set of statements.
3. Can you give an example of a Stored Procedure? - sp_helpdb, sp_who2 and sp_renamedb are system-defined stored procedures. We can also have user-defined stored procedures, which can be called in a similar way.
4. What is a trigger? - Triggers are basically used to implement business rules. A trigger is similar to a stored procedure. The difference is that it is activated when data is added, edited or deleted in a table in a database.

SSIS common Interview Questions


1) What is the control flow?
2) What is a data flow?
3) How do you do error handling in SSIS?
4) How do you do logging in SSIS?
5) How do you deploy SSIS packages?
6) How do you schedule SSIS packages to run on the fly?
7) How do you run a stored procedure and get data?
8) Give a scenario: you want to insert a text file into a database table, but during the upload you want to change a column called months (January, Feb, etc.) to a code (1, 2, 3, ...). This code can be read from another database table called months. After converting the data, upload the file. If there are any errors, write them to an error table. Then, for all errors, read the errors from the database, create a file, and mail it to the supervisor. How would you accomplish this task in SSIS?
9) What are variables, and what is variable scope?