Microsoft says that SQL Server Integration Services (SSIS) “is a platform for building high performance data integration solutions, including extraction, transformation, and load (ETL) packages for data warehousing.” A simpler way to think of SSIS is that it's the solution for automating data movements. SSIS provides a way to build packages made up of tasks that can move data around from place to place and alter it on the way. There are visual designers (hosted within Business Intelligence Development Studio) to help you build these packages as well as an API for programming SSIS objects from other applications.
In this chapter, you'll see how to build and use SSIS packages. First, though, we'll look at a simpler facet of SSIS: The SQL Server Import and Export Wizard.
If you choose to use the supplied solution files rather than building your own, you may need to edit the properties of the OLE DB Connection Managers within the projects to point to your own test server. You'll learn more about Connection Managers in the “Working with Connection Managers” section later in this chapter.
SSIS 2008 Tutorial: The Import and Export Wizard
Though SSIS is almost infinitely customizable, Microsoft has produced a simple wizard to handle some of the most common ETL tasks: importing data to or exporting data from a SQL Server database. The Import and Export Wizard protects you from the complexity of SSIS while allowing you to move data between any of these data sources:
You can launch the Import and Export Wizard from the Tasks entry on the shortcut menu of any database in the Object Explorer window of SQL Server Management Studio.
Try It!
To import some data using the Import and Export Wizard, follow these steps:
In addition to executing its operations immediately, the Import and Export Wizard can also save a package for later execution. You'll learn more about packages in the remainder of this chapter.
SSIS 2008 Tutorial: Creating a Package
The Import and Export Wizard is easy to use, but it only taps a small part of the functionality of SSIS. To really appreciate the full power of SSIS, you'll need to use BIDS to build an SSIS package. A package is a collection of SSIS objects including:
You'll see how to build each of these components of a package in later sections of the chapter, but first, let's fire up BIDS and create a new SSIS package.
Try It!
To create a new SSIS package, follow these steps:
Figure 16-3 shows the new, empty package.
Figure 16-3: Empty SSIS package
SSIS 2008 Tutorial: Working with Connection Managers
SSIS uses connection managers to integrate different data sources into packages. SSIS includes a wide variety of different connection managers that allow you to move data around from place to place. Table 16-1 lists the available connection managers.
Connection Manager | Handles
ADO | Connecting to ADO objects such as a Recordset.
ADO.NET | Connecting to data sources through an ADO.NET provider.
CACHE | Connecting to a cache, either in memory or in a file.
MSOLAP100 | Connecting to an Analysis Services database or cube.
EXCEL | Connecting to an Excel worksheet.
FILE | Connecting to a file or folder.
FLATFILE | Connecting to delimited or fixed width flat files.
FTP | Connecting to an FTP data source.
HTTP | Connecting to an HTTP data source.
MSMQ | Connecting to a Microsoft Message Queue.
MULTIFILE | Connecting to a set of files, such as all text files on a particular hard drive.
MULTIFLATFILE | Connecting to a set of flat files.
ODBC | Connecting to an ODBC data source.
OLEDB | Connecting to an OLE DB data source.
SMOServer | Connecting to a server via SMO.
SMTP | Connecting to a Simple Mail Transfer Protocol server.
SQLMobile | Connecting to a SQL Server Mobile database.
WMI | Connecting to Windows Management Instrumentation data.
Table 16-1: Available Connection Managers
To create a Connection Manager, you right-click anywhere in the Connection Managers area of a package in BIDS and choose the appropriate shortcut from the shortcut menu. Each Connection Manager has its own custom configuration dialog box with specific options that you need to fill out.
Try It!
To add some connection managers to your package, follow these steps:
Figure 16-5 shows the SSIS package with the three Connection Managers defined.
Figure 16-5: An SSIS package with three Connection Managers
SSIS 2008 Tutorial: Building Control Flows
The Control Flow tab of the Package Designer is where you tell SSIS what the package will do. You create your control flow by dragging and dropping items from the toolbox to the surface, and then dragging and dropping connections between the objects. The objects you can drop here break up into four different groups:
Task | Purpose
ActiveX Script | Execute an ActiveX Script
Analysis Services Execute DDL | Execute DDL query statements against an Analysis Services server
Analysis Services Processing | Process an Analysis Services cube
Bulk Insert | Insert data from a file into a database
Data Mining Query | Execute a data mining query
Data Profiling Task | Generate a profile of sample data, determining distribution of values or percentage of NULLs, etc.
Execute DTS 2000 Package | Execute a Data Transformation Services Package (DTS was the SQL Server 2000 version of SSIS)
Execute Package | Execute an SSIS package
Execute Process | Shell out to a Windows application
Execute SQL | Run a SQL query
File System | Perform file system operations such as copy or delete
FTP | Perform FTP operations
Message Queue | Send or receive messages via MSMQ
Script | Execute a custom task
Send Mail | Send e-mail
Transfer Database | Transfer an entire database between two SQL Servers
Transfer Error Messages | Transfer custom error messages between two SQL Servers
Transfer Jobs | Transfer jobs between two SQL Servers
Transfer Logins | Transfer logins between two SQL Servers
Transfer Master Stored Procedures | Transfer stored procedures from the master database on one SQL Server to the master database on another SQL Server
Transfer SQL Server Objects | Transfer objects between two SQL Servers
Web Service | Execute a SOAP Web method
WMI Data Reader | Read data via WMI
WMI Event Watcher | Wait for a WMI event
XML | Perform operations on XML data
Table 16-2: SSIS control flow tasks
Task | Purpose
Back Up Database | Back up an entire database to file or tape
Check Database Integrity | Perform database consistency checks
Execute SQL Server Agent Job | Run a job
Execute T-SQL Statement | Run any T-SQL script
History Cleanup | Clean out history tables for other maintenance tasks
Maintenance Cleanup | Clean up files left by other maintenance tasks
Notify Operator | Send e-mail to SQL Server operators
Rebuild Index | Rebuild a SQL Server index
Reorganize Index | Compact and defragment an index
Shrink Database | Shrink a database
Update Statistics | Update statistics used to calculate query plans
Table 16-3: SSIS maintenance plan tasks
Container | Purpose
For Loop | Repeat a task a fixed number of times
Foreach Loop | Repeat a task by enumerating over a group of objects
Sequence | Group multiple tasks into a single unit for easier management
Table 16-4: SSIS containers
Try It!
To add control flows to the package you've been building, follow these steps:
Figure 16-6 shows the completed set of control flows.
Figure 16-6: Adding control flows
As it stands, this package uses the file system task to copy the file specified by the DepartmentList connection to the file specified by the DepartmentListBackup connection, overwriting any target file that already exists. It then executes the data flow task. In the next section, you'll see how to configure the data flow task.
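To make the order of operations concrete, here is a minimal Python sketch of the same logic. It is an analogy rather than SSIS code: `run_control_flow` and the `data_flow` callable are hypothetical names, and `shutil.copyfile` merely stands in for a File System task configured to overwrite its destination.

```python
import shutil
import tempfile
from pathlib import Path


def run_control_flow(source: Path, backup: Path, data_flow) -> None:
    """Copy source over backup, then run the data flow step."""
    # File System task stand-in: copyfile replaces an existing target file.
    shutil.copyfile(source, backup)
    # Data Flow task stand-in: reached only if the copy did not raise,
    # which mirrors a precedence constraint that requires success.
    data_flow()


with tempfile.TemporaryDirectory() as folder:
    source = Path(folder) / "DepartmentList.txt"
    backup = Path(folder) / "DepartmentListBackup.txt"
    source.write_text("Engineering\nSales\n")
    backup.write_text("stale contents\n")  # will be overwritten
    run_control_flow(source, backup, lambda: print("data flow ran"))
    print(backup.read_text())
```

The file names reuse the connection names from the package; the file contents are made up for the demonstration.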
SSIS 2008 Tutorial: Building Data Flows
The Data Flow tab of the Package Designer is where you specify the details of any Data Flow tasks that you've added on the Control Flow tab. Data Flows are made up of various objects that you drag and drop from the Toolbox:
Source | Use
ADO NET | Extracts data from a database using a .NET data provider
Excel | Extracts data from an Excel workbook
Flat File | Extracts data from a flat file
OLE DB | Extracts data from a database using an OLE DB provider
Raw File | Extracts data from a raw file (proprietary Microsoft format)
XML | Extracts data from an XML file
Table 16-5: Data flow sources
Transformation | Effect
Aggregate | Aggregates and groups values in a dataset
Audit | Adds audit information to a dataset
Cache Transform | Populates a CACHE connection manager
Character Map | Applies string operations to character data
Conditional Split | Evaluates and splits up rows in a dataset
Copy Column | Copies a column of data
Data Conversion | Converts data to a different datatype
Data Mining Query | Runs a data mining query
Derived Column | Calculates a new column from existing data
Export Column | Exports data from a column to a file
Fuzzy Grouping | Groups rows that contain similar values
Fuzzy Lookup | Looks up values using fuzzy matching
Import Column | Imports data from a file to a column
Lookup | Looks up values in a reference dataset
Merge | Merges two sorted datasets
Merge Join | Merges data from two datasets by using a join
Multicast | Creates copies of a dataset
OLE DB Command | Executes a SQL command on each row in a dataset
Percentage Sampling | Extracts a subset of rows from a dataset
Pivot | Builds a pivot table from a dataset
Row Count | Counts the rows of a dataset
Row Sampling | Extracts a sample of rows from a dataset
Script Component | Executes a custom script
Slowly Changing Dimension | Updates a slowly changing dimension table
Sort | Sorts data
Term Extraction | Extracts terms from the text in a column
Term Lookup | Looks up the frequency of a term in a column
Union All | Merges multiple datasets
Unpivot | Normalizes a pivot table
Table 16-6: Data Flow Transformations
Destination | Use
ADO NET | Sends data to a .NET data provider
Data Mining Model Training | Sends data to an Analysis Services data mining model
DataReader | Sends data to an in-memory ADO.NET DataReader
Dimension Processing | Processes a cube dimension
Excel | Sends data to an Excel worksheet
Flat File | Sends data to a flat file
OLE DB | Sends data to an OLE DB database
Partition Processing | Processes an Analysis Services partition
Raw File | Sends data to a raw file
Recordset | Sends data to an in-memory ADO Recordset
SQL Server Compact | Sends data to a SQL Server CE database
SQL Server | Sends data to a SQL Server database
Table 16-7: Data Flow Destinations
If you are running SQL Server Integration Services on a 64-bit machine, the Excel source and destination will throw an exception. During development, you can select Project > Project_name Properties, select the Debugging page and change the Run64BitRuntime property to false. When deploying the package, you'll need to shell out to the 32-bit SSIS runtime when scheduling the package.
Try It!
To customize the data flow task in the package you're building, follow these steps:
Figure 16-10 shows the completed set of data flows.
Figure 16-10: Adding data flows
The data flows in this package take a table from the Chapter16 database, transform one of the columns in that table to all uppercase characters, and then write that transformed column out to a flat file.
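The source, transformation, and destination stages can be pictured with a short Python sketch. This is only an illustration of the pipeline shape, not how SSIS runs it: an in-memory SQLite database stands in for the Chapter16 database, and the `Department` table, its `Name` column, and the function name `export_uppercased` are assumptions made for the example.

```python
import csv
import io
import sqlite3


def export_uppercased(conn, out_file) -> int:
    """Stream names from the table, uppercase them, write one per line."""
    writer = csv.writer(out_file, lineterminator="\n")
    rows_written = 0
    # Source stand-in: read rows from the table one at a time.
    for (name,) in conn.execute("SELECT Name FROM Department ORDER BY Name"):
        # Transformation stand-in: convert the column value to uppercase.
        writer.writerow([name.upper()])
        rows_written += 1
    # Destination stand-in: out_file now holds the flat-file contents.
    return rows_written


# Demonstration with made-up rows in a throwaway database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Department (Name TEXT)")
conn.executemany("INSERT INTO Department VALUES (?)",
                 [("Sales",), ("Engineering",)])
buffer = io.StringIO()
count = export_uppercased(conn, buffer)
print(count, buffer.getvalue().splitlines())
```

Like an SSIS data flow, the sketch processes rows as a stream rather than loading the whole table before writing.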
SSIS 2008 Tutorial: Creating Event Handlers
SSIS packages also support a complete event system. You can attach event handlers to a variety of events for the package itself or for the individual tasks within a package. Events within a package "bubble up." That is, suppose an error occurs within a task inside of a package. If you've defined an OnError event handler for the task, then that event handler is called. Otherwise, an OnError event handler for the package itself is called. If no event handler is defined for the package either, the event is ignored.
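The lookup order described above can be modeled in a few lines of Python. This is a simplified model of the description in this section, not the SSIS object model: the `Executable` class and its `raise_event` method are hypothetical names invented for the sketch.

```python
from typing import Callable, Dict, Optional

Handler = Callable[[str], None]


class Executable:
    """A package or task that may define handlers and may have a parent."""

    def __init__(self, name: str, parent: Optional["Executable"] = None) -> None:
        self.name = name
        self.parent = parent
        self.handlers: Dict[str, Handler] = {}

    def raise_event(self, event: str, detail: str) -> Optional[str]:
        """Run the nearest handler for the event; return who handled it."""
        node: Optional[Executable] = self
        while node is not None:
            handler = node.handlers.get(event)
            if handler is not None:
                handler(detail)
                return node.name
            node = node.parent  # bubble up to the enclosing container
        return None  # no handler at any level: the event is ignored


package = Executable("Package")
task = Executable("Data Flow Task", parent=package)
log = []
package.handlers["OnError"] = log.append

print(task.raise_event("OnError", "row failed"))       # handled by the package
print(task.raise_event("OnWarning", "unused column"))  # nobody handles it
```

Because the task defines no OnError handler of its own, the first event is handled at the package level; the second finds no handler and is dropped.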
Event handlers are defined on the Event Handlers tab of the Package Designer. When you create an event handler, you handle the event by building an entire secondary SSIS package, and you have access to the full complement of data flows, control flows, and event handlers to deal with the original event.
By adding event handlers to the OnError event that call the Send Mail task, you can notify operators by e-mail if anything goes wrong in the course of running an SSIS package.
Try It!
To add an event handler to the package we've been building, follow these steps:
This event handler will be called when the Data Flow Task finishes executing, and will insert one new row into the tracking table when it is called.
SSIS 2008 Tutorial: Saving and Running Packages
Now that you've created an entire SSIS package, you're probably ready to run it and see what it does. But first, let's look at the options for saving SSIS packages. When you work in BIDS, your SSIS package is saved as an XML file (with the extension dtsx) directly in the normal Windows file system. But that's not the only option. Packages can also be saved in the msdb database in SQL Server itself, or in a special area of the file system called the Package Store.
Storing SSIS packages in the Package Store or the msdb database makes it easier to access and manage them from SQL Server's administrative and command-line tools without needing to have any knowledge of the physical layout of the server's hard drive.
Saving Packages to Alternate Locations
To save a package to the msdb database or the Package Store, you use the File > Save Package As menu item within BIDS.
Try It!
To store copies of the package you've developed, follow these steps.
Running a Package
You can run the final package from either BIDS or SQL Server Management Studio. When you're developing a package, it's convenient to run it directly from BIDS. When the package has been deployed to a production server (and saved to the msdb database or the Package Store) you'll probably want to run it from SQL Server Management Studio.
SQL Server also includes a command-line utility, dtexec, that lets you run packages from batch files.
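As a rough sketch of scripting the command-line utility, the Python below builds a dtexec command line for each of the three storage locations discussed earlier. It assumes dtexec is on the PATH; the helper names, package paths, and server name are hypothetical, while /File, /SQL, /DTS, and /Server are dtexec's switches for a file, the msdb database, the Package Store, and the server to load from.

```python
import subprocess
from typing import List, Optional


def dtexec_args(package: str, location: str = "file",
                server: Optional[str] = None) -> List[str]:
    """Build a dtexec argument list for a package in one of three locations."""
    switches = {"file": "/File", "msdb": "/SQL", "store": "/DTS"}
    if location not in switches:
        raise ValueError(f"unknown location: {location}")
    args = ["dtexec", switches[location], package]
    # A server name only applies to packages stored in msdb or the Package Store.
    if server is not None and location != "file":
        args += ["/Server", server]
    return args


def run_package(args: List[str]) -> int:
    """Run dtexec and return its exit code (0 means the package succeeded)."""
    return subprocess.run(args).returncode


# Hypothetical package paths and server name, shown without executing anything.
print(dtexec_args(r"C:\Packages\Departments.dtsx"))
print(dtexec_args("Departments", location="msdb", server="MYSERVER"))
```

Building the arguments as a list, rather than one string, avoids quoting problems when a package path contains spaces.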
Running a Package from BIDS
With the package open in BIDS, you can run it using the standard Visual Studio tools for running a project. Choose any of these options:
Try It!
To run the package that you have loaded in BIDS, follow these steps:
All of the events you see in the Execution Results pane are things that you can create event handlers to react to within the package. As you can see, SSIS issues quite a number of events, from progress events to warnings about extra columns of data that we retrieved but never used.
Running a Package from SQL Server Management Studio
To run a package from SQL Server Management Studio, you need to connect Object Explorer to SSIS.
Try It!
SELECT * FROM DepartmentExports
SSIS 2008 Tutorial: Exercises
One common use of SSIS is in data warehousing - collecting data from a variety of different sources into a single database that can be used for unified reporting. In this exercise you'll use SSIS to perform a simple data warehousing task.
Use SSIS to create a text file, c:\EmployeeDept.txt, containing the last names, department names, start and end dates of the AdventureWorks2008 employees. Retrieve the last names from the Person.Person table and the department start and end dates from the HumanResources.EmployeeDepartmentHistory table in the AdventureWorks2008 database, and the department names from the Chapter16 database.
You can use the Merge Join data flow transformation to join data from two sources. One tip: the inputs to this transformation need to be sorted on the joining column.
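To see why sorted inputs matter, consider a minimal Python sketch of a sort-merge inner join, the general technique that a merge-style join relies on. It is not the SSIS implementation; the function name, the `(key, value)` row shape, and the sample rows are all made up for illustration.

```python
from typing import Iterator, List, Tuple

Row = Tuple[int, str]


def merge_join(left: List[Row], right: List[Row]) -> Iterator[Tuple[int, str, str]]:
    """Inner-join two lists of (key, value) rows that are sorted by key."""
    i = j = 0
    while i < len(left) and j < len(right):
        lkey, rkey = left[i][0], right[j][0]
        if lkey < rkey:
            i += 1  # left key has no partner yet; advance the left side
        elif lkey > rkey:
            j += 1  # right key has no partner yet; advance the right side
        else:
            # Find the run of right rows that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lkey:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left[i][0] == lkey:
                for k in range(j, j_end):
                    yield (lkey, left[i][1], right[k][1])
                i += 1
            j = j_end


# Made-up rows keyed by a department ID, both sorted on that key.
history = [(1, "Smith"), (1, "Jones"), (3, "Lee")]
departments = [(1, "Engineering"), (2, "Tool Design"), (3, "Sales")]
print(list(merge_join(history, departments)))
```

Each input is read once, front to back, and a row is skipped as soon as its key falls behind the other side. That single forward pass is only correct when both inputs arrive in key order, which is why the inputs must be sorted on the joining column first.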
Solutions to Exercises