Importing/Exporting Data  «Prev  Next»

Lesson 1

Importing and Exporting Data in SQL-Server

Data Loading and Extracting using SQL Server

Because concurrency is not as big an issue, denormalized data models are common. Data loading is handled by ETL (extract, transform, and load) processes that typically run on time intervals, and those processes will frequently store data in a redundant fashion that allows fast reads and minimal joins. These denormalized models are often highly indexed for fast retrieval, and index maintenance can be handled at ETL time.
The indexing on an OLAP system is done with very high fi ll-factors resulting in compressed indexes for very fast reads. As a SQL Server system administrator, one of the tasks you will have to perform is loading and extracting of data. SQL Server provides multiple methods to do this, each with its own benefits.

Learning Objectives

  1. Load data with the insert statement
  2. Load data with the select into statement
  3. Use the bulk insert statement
  4. Use the bcp program
  5. Identify key features of Data Transformation Services
  6. Use DTS Designer to create a package.
  7. Create, schedule, and run DTS packages
  8. Identify key features of DTS Wizard
  9. Run the DTS Wizard
The next lesson will cover loading data with the insert statement.

ETL Process

The first part of an ETL process involves extracting the data from the source systems. Most data warehousing projects consolidate data from different source systems. Each separate system may also use a different data organization/format. Common data source formats are relational databases and flat files, but may include non-relational database structures such as
  1. Information Management System (IMS) or other data structures such as
  2. Virtual Storage Access Method (VSAM) or
  3. Indexed Sequential Access Method (ISAM),
or fetching from outside sources such as through web spidering or screen-scraping. The streaming of extracted data source and load on-the-fly to the destination database is another way of performing ETL when no intermediate data storage is required. In general, the goal of the extraction phase is to convert the data into a single format which is appropriate for transformation processing. An intrinsic part of the extraction involves the parsing of extracted data, resulting in a check if the data meets an expected pattern or structure. If not, the data may be rejected entirely or in part.

Save Time Writing

Data Loading is handled by ETL

Because concurrency is not as big an issue, denormalized data models are common. Data loading is handled by ETL (extract, transform, and load) processes that typically run on time intervals, and those processes will frequently store data in a redundant fashion that allows fast reads and minimal joins. These denormalized models are often highly indexed for fast retrieval, and index maintenance can be handled at ETL time. The indexing on an OLAP system is done with very high fill-factors resulting in compressed indexes for very fast reads. OLAP systems are often light on business rules, sometimes even ignoring such simple structures as basic relational integrity constraints. It is important to remember that your data warehouse may source data from many systems, each of which may have a completely different set of business rules, so the data warehouse itself will want to be as business rule agnostic as you can manage. Data cleansing can be done by the ETLs, so the rules do not need to be enforced again here. The job of the OLAP design is to take as much time as needed (within limits) to store data in such a 1way that, at data retrieval time, the database returns information to the user quickly. OLAP data models need not even be strictly relational data models. ROLAP (relational OLAP) is where you will start here, as in any case it is likely to be where you start. Once you have created a ROLAP database (whether you do so from a Kimball or an Inmon perspective), you can use that to build a MOLAP system in SSAS.
As you might expect, when you create a new database, it does not contain any data. In most cases, your application will require some data to be in the database before you can even begin to use it. One of the jobs of a SQL Server Database Administrator is to load data into a database and extract data from the database. Often times a system you will also need to perform a periodic data load to maintain the data that is needed for it to run. Alternately, you will need to perform data extracts, so you can archive old data out of the system to maintain the system’s performance.
The first module in this course will show you multiple ways of
  1. loading data into a newly created database, as well as
  2. extracting data from existing databases.
By the time you complete this module, you should be able to: