MS BI TOOLS

MICROSOFT BUSINESS INTELLIGENCE

Thursday, December 23, 2010

SSIS and SSRS Interview questions with Answers

 1) What is the control flow

ANS: In SSIS a workflow is called a control-flow. A control-flow links together our modular data-flows as a series of operations in order to achieve a desired result.

A control flow consists of one or more tasks and containers that execute when the package runs. To control order or define the conditions for running the next task or container in the package control flow, you use precedence constraints to connect the tasks and containers in a package. A subset of tasks and containers can also be grouped and run repeatedly as a unit within the package control flow.
SQL Server 2005 Integration Services (SSIS) provides three different types of control flow elements: containers that provide structures in packages, tasks that provide functionality, and precedence constraints that connect the executables, containers, and tasks into an ordered control flow.

2) what is a data flow

ANS: A data flow consists of the sources and destinations that extract and load data, the transformations that modify and extend data, and the paths that link sources, transformations, and destinations. Before you can add a data flow to a package, the package control flow must include a Data Flow task. The Data Flow task is the executable within the SSIS package that creates, orders, and runs the data flow. A separate instance of the data flow engine is opened for each Data Flow task in a package.

SQL Server 2005 Integration Services (SSIS) provides three different types of data flow components: sources, transformations, and destinations. Sources extract data from data stores such as tables and views in relational databases, files, and Analysis Services databases. Transformations modify, summarize, and clean data. Destinations load data into data stores or create in-memory datasets.


3) how do you do error handling in SSIS

ANS: When a data flow component applies a transformation to column data, extracts data from sources, or loads data into destinations, errors can occur. Errors frequently occur because of unexpected data values.

For example, a data conversion fails because a column contains a string instead of a number, an insertion into a database column fails because the data is a date and the column has a numeric data type, or an expression fails to evaluate because a column value is zero, resulting in a mathematical operation that is not valid.

Errors typically fall into one the following categories:

-Data conversion errors, which occur if a conversion results in loss of significant digits, the loss of insignificant digits, and the truncation of strings. Data conversion errors also occur if the requested conversion is not supported.
-Expression evaluation errors, which occur if expressions that are evaluated at run time perform invalid operations or become syntactically incorrect because of missing or incorrect data values.
-Lookup errors, which occur if a lookup operation fails to locate a match in the lookup table.

Many data flow components support error outputs, which let you control how the component handles row-level errors in both incoming and outgoing data. You specify how the component behaves when truncation or an error occurs by setting options on individual columns in the input or output.

For example, you can specify that the component should fail if customer name data is truncated, but ignore errors on another column that contains less important data.


4) How do you do logging in ssis.

ANS: SSIS includes logging features that write log entries when run-time events occur and can also write custom messages.

Integration Services supports a diverse set of log providers, and gives you the ability to create custom log providers. The Integration Services log providers can write log entries to text files, SQL Server Profiler, SQL Server, Windows Event Log, or XML files.

Logs are associated with packages and are configured at the package level. Each task or container in a package can log information to any package log. The tasks and containers in a package can be enabled for logging even if the package itself is not.

To customize the logging of an event or custom message, Integration Services provides a schema of commonly logged information to include in log entries. The Integration Services log schema defines the information that you can log. You can select elements from the log schema for each log entry.

To enable logging in a package
1. In Business Intelligence Development Studio, open the Integration Services project that contains the package you want.
2. On the SSIS menu, click Logging.
3. Select a log provider in the Provider type list, and then click Add.


5)what are variables and what is variable scope ?
ANS: Variables store values that a SSIS package and its containers, tasks, and event handlers can use at run time. The scripts in the Script task and the Script component can also use variables. The precedence constraints that sequence tasks and containers into a workflow can use variables when their constraint definitions include expressions.

Integration Services supports two types of variables: user-defined variables and system variables. User-defined variables are defined by package developers, and system variables are defined by Integration Services. You can create as many user-defined variables as a package requires, but you cannot create additional system variables.

Scope : A variable is created within the scope of a package or within the scope of a container, task, or event handler in the package. Because the package container is at the top of the container hierarchy, variables with package scope function like global variables and can be used by all containers in the package. Similarly, variables defined within the scope of a container such as a For Loop container can be used by all tasks or containers within the For Loop container.


6) True or False - Using a checkpoint file in SSIS is just like issuing the CHECKPOINT command against the relational engine. It commits all of the data to the database.
ANS: False. SSIS provides a Checkpoint capability which allows a package to restart at the point of failure.

7) True or False: SSIS has a default means to log all records updated, deleted or inserted on a per table basis.
ANS: False, but a custom solution can be built to meet these needs.

8) What is a breakpoint in SSIS? How is it setup? How do you disable it?

ANS: A breakpoint is a stopping point in the code. The breakpoint can give the Developer\DBA an opportunity to review the status of the data, variables and the overall status of the SSIS package.
10 unique conditions exist for each breakpoint.
Breakpoints are setup in BIDS. In BIDS, navigate to the control flow interface. Right click on the object where you want to set the breakpoint and select the 'Edit Breakpoints...' option.

9) How do you eliminate quotes from being uploaded from a flat file to SQL Server?

ANS: In the SSIS package on the Flat File Connection Manager Editor, enter quotes into the Text qualifier field then preview the data to ensure the quotes are not included.
Additional information: How to strip out double quotes from an import file in SQL Server Integration Services


10) Can you explain how to setup a checkpoint file in SSIS?
ANS: The following items need to be configured on the properties tab for SSIS package:
CheckpointFileName - Specify the full path to the Checkpoint file that the package uses to save the value of package variables and log completed tasks. Rather than using a hard-coded path as shown above, it's a good idea to use an expression that concatenates a path defined in a package variable and the package name.
CheckpointUsage - Determines if/how checkpoints are used. Choose from these options: Never (default), IfExists, or Always. Never indicates that you are not using Checkpoints. IfExists is the typical setting and implements the restart at the point of failure behavior. If a Checkpoint file is found it is used to restore package variable values and restart at the point of failure. If a Checkpoint file is not found the package starts execution with the first task. The Always choice raises an error if the Checkpoint file does not exist.
SaveCheckpoints - Choose from these options: True or False (default). You must select True to implement the Checkpoint behavior.

11) How do you upgrade an SSIS Package?

ANS: Depending on the complexity of the package, one or two techniques are typically used:
Recode the package based on the functionality in SQL Server DTS
Use the Migrate DTS 2000 Package wizard in BIDS then recode any portion of the package that is not accurate.

4 comments:

  1. good one, but need few more q&A

    how about these.....




    Q1 Explain architecture of SSIS?
    SSIS architecture consists of four key parts:
    a) Integration Services service: monitors running Integration Services packages and manages the storage of packages.
    b) Integration Services object model: includes managed API for accessing Integration Services tools, command-line utilities, and custom applications.
    c) Integration Services runtime and run-time executables: it saves the layout of packages, runs packages, and provides support for logging, breakpoints, configuration, connections, and transactions. The Integration Services run-time executables are the package, containers, tasks, and event handlers that Integration Services includes, and custom tasks.
    d) Data flow engine: provides the in-memory buffers that move data from source to destination.



    Q2 How would you do Logging in SSIS?
    Logging Configuration provides an inbuilt feature which can log the detail of various events like onError, onWarning etc to the various options say a flat file, SqlServer table, XML or SQL Profiler.

    ReplyDelete
  2. Q3 How would you do Error Handling?
    A SSIS package could mainly have two types of errors
    a) Procedure Error: Can be handled in Control flow through the precedence control and redirecting the execution flow.
    b) Data Error: is handled in DATA FLOW TASK buy redirecting the data flow using Error Output of a component.

    Q4 How to pass property value at Run time? How do you implement Package Configuration?
    A property value like connection string for a Connection Manager can be passed to the pkg using package configurations.Package Configuration provides different options like XML File, Environment Variables, SQL Server Table, Registry Value or Parent package variable.

    Q5 How would you deploy a SSIS Package on production?
    A) Through Manifest
    1. Create deployment utility by setting its propery as true .
    2. It will be created in the bin folder of the solution as soon as package is build.
    3. Copy all the files in the utility and use manifest file to deply it on the Prod.
    B) Using DtsExec.exe utility
    C)Import Package directly in MSDB from SSMS by logging in Integration Services.

    Q6 Difference between DTS and SSIS?
    Every thing except both are product of Microsoft :-).

    Q7 What are new features in SSIS 2008?
    explained in other post
    http://sqlserversolutions.blogspot.com/2009/01/new-improvementfeatures-in-ssis-2008.html

    Q8 How would you pass a variable value to Child Package?
    too big to fit here so had a write other post
    http://sqlserversolutions.blogspot.com/2009/02/passing-variable-to-child-package-from.html


    Q9 What is Execution Tree?
    Execution trees demonstrate how package uses buffers and threads. At run time, the data flow engine breaks down Data Flow task operations into execution trees. These execution trees specify how buffers and threads are allocated in the package. Each tree creates a new buffer and may execute on a different thread. When a new buffer is created such as when a partially blocking or blocking transformation is added to the pipeline, additional memory is required to handle the data transformation and each new tree may also give you an additional worker thread.

    Q10 What are the points to keep in mind for performance improvement of the package?
    http://technet.microsoft.com/en-us/library/cc966529.aspx

    Q11 You may get a question stating a scenario and then asking you how would you create a package for that e.g. How would you configure a data flow task so that it can transfer data to different table based on the city name in a source table column?

    ReplyDelete
  3. Q13 Difference between Unionall and Merge Join?
    a) Merge transformation can accept only two inputs whereas Union all can take more than two inputs

    b) Data has to be sorted before Merge Transformation whereas Union all doesn't have any condition like that.

    Q14 May get question regarding what X transformation do?Lookup, fuzzy lookup, fuzzy grouping transformation are my favorites.
    For you.

    Q15 How would you restart package from previous failure point?What are Checkpoints and how can we implement in SSIS?
    When a package is configured to use checkpoints, information about package execution is written to a checkpoint file. When the failed package is rerun, the checkpoint file is used to restart the package from the point of failure. If the package runs successfully, the checkpoint file is deleted, and then re-created the next time that the package is run.

    Q16 Where are SSIS package stored in the SQL Server?
    MSDB.sysdtspackages90 stores the actual content and ssydtscategories, sysdtslog90, sysdtspackagefolders90, sysdtspackagelog, sysdtssteplog, and sysdtstasklog do the supporting roles.

    Q17 How would you schedule a SSIS packages?
    Using SQL Server Agent. Read about Scheduling a job on Sql server Agent

    Q18 Difference between asynchronous and synchronous transformations?
    Asynchronous transformation have different Input and Output buffers and it is up to the component designer in an Async component to provide a column structure to the output buffer and hook up the data from the input.

    Q19 How to achieve parallelism in SSIS?
    Parallelism is achieved using MaxConcurrentExecutable property of the package. Its default is -1 and is calculated as number of processors + 2.

    -More questions added-Sept 2011
    Q20 How do you do incremental load?
    Fastest way to do incremental load is by using Timestamp column in source table and then storing last ETL timestamp, In ETL process pick all the rows having Timestamp greater than the stored Timestamp so as to pick only new and updated records

    Q21 How to handle Late Arriving Dimension or Early Arriving Facts.


    Late arriving dimensions sometime get unavoidable 'coz delay or error in Dimension ETL or may be due to logic of ETL. To handle Late Arriving facts, we can create dummy Dimension with natural/business key and keep rest of the attributes as null or default. And as soon as Actual dimension arrives, the dummy dimension is updated with Type 1 change. These are also known as Inferred Dimensions.

    ReplyDelete
  4. I really appreciate information shared above. It’s of great help. If someone want to learn Online (Virtual) instructor lead live training in MSBI, kindly contact us http://www.maxmunus.com/contact
    MaxMunus Offer World Class Virtual Instructor led training on in MSBI. We have industry expert trainer. We provide Training Material and Software Support. MaxMunus has successfully conducted 100000+ trainings in India, USA, UK, Australlia, Switzerland, Qatar, Saudi Arabia, Bangladesh, Bahrain and UAE etc.
    For Demo Contact us.
    Nitesh Kumar
    MaxMunus
    E-mail: nitesh@maxmunus.com
    Skype id: nitesh_maxmunus
    Ph:(+91) 8553912023
    http://www.maxmunus.com/


    ReplyDelete