Extract Transform Load
Do you need to manage extract transform load and want practical standards and best practices for business intelligence governance and accountability?
What is information management?
Information management is a corporate management process that governs accountability for the structure and design, storage, movement, security, quality, delivery and usage of information required for management and business intelligence purposes.
What is data movement?
The extract transform load data movement discipline focuses on the movement of data between systems. In this context, “systems” include external data sources, operational systems, and analytic data stores. Data movement encompasses the extract transform load (ETL) facilities used for bulk data movement. It also includes mechanisms for the continuous movement of discrete records, rows, or messages between systems.
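The bulk extract transform load pattern described above can be sketched in a few lines. This is a minimal illustration, not any particular tool's API: the CSV source, the cleansing rules, and the SQLite target table are all hypothetical examples chosen for the sketch.

```python
import csv
import io
import sqlite3

def extract(csv_text):
    """Extract: read raw rows from a CSV source (here an in-memory string)."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def transform(rows):
    """Transform: apply simple cleansing rules (trim names, uppercase country)."""
    return [
        {"name": r["name"].strip(), "country": r["country"].upper()}
        for r in rows
    ]

def load(rows, conn):
    """Load: bulk-insert the transformed rows into the target store."""
    conn.execute("CREATE TABLE IF NOT EXISTS customer (name TEXT, country TEXT)")
    conn.executemany(
        "INSERT INTO customer (name, country) VALUES (:name, :country)", rows
    )
    conn.commit()

# Usage: move two customer records from a CSV "source system" to SQLite.
source = "name,country\n Ada Lovelace ,uk\n Alan Turing ,uk\n"
conn = sqlite3.connect(":memory:")
load(transform(extract(source)), conn)
print(conn.execute("SELECT name, country FROM customer").fetchall())
# → [('Ada Lovelace', 'UK'), ('Alan Turing', 'UK')]
```

In a continuous-movement design, the same extract/transform/load stages would instead run per record or per message rather than over a whole batch.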
What are data movement best practices?
Data movement (ETL) best practices provide a guide for the analysis, design, and development of data movement processes that are consistent, usable and of high quality.
Why have data movement best practices?
Common development best practices are important for the following reasons.
Best practice goals
- They lead to greater consistency, which subsequently leads to greater productivity;
- They reduce ongoing maintenance; and
- They improve readability of software, making it easier for developers to understand new code more quickly.
What are some data movement model best practices?
- Introduce common, consistent data movement analysis, design, and coding patterns;
- Develop enterprise-wide analysis, design, and construction components through data movement modeling processes using data movement tools, to ensure an acceptable level of data quality per business specifications;
- Introduce best practices and consistency in coding and naming standards;
- Reduce costs to develop and maintain analysis, design and source code deliverables; and
- Integrate controls into the data movement process to ensure data quality.
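One way to integrate quality controls into the movement process is to validate each record before it reaches the target and divert failures to a reject set. The sketch below is a minimal illustration; the two rules (non-blank name, two-letter country code) are assumed examples, not rules from any specific standard.

```python
def quality_checks(row):
    """Return a list of rule violations for one record; empty means clean."""
    violations = []
    if not row.get("name", "").strip():
        violations.append("name is blank")
    if len(row.get("country", "")) != 2:
        violations.append("country is not a 2-letter code")
    return violations

def move_with_controls(rows):
    """Split records into accepted rows and rejected (row, problems) pairs
    instead of loading bad data into the target."""
    accepted, rejected = [], []
    for row in rows:
        problems = quality_checks(row)
        if problems:
            rejected.append((row, problems))
        else:
            accepted.append(row)
    return accepted, rejected

# Usage: one clean record passes; one dirty record is diverted with reasons.
good, bad = move_with_controls([
    {"name": "Ada Lovelace", "country": "UK"},
    {"name": "", "country": "United Kingdom"},
])
print(len(good), len(bad))  # → 1 1
```

Routing rejects to a side table with their violation reasons, rather than silently dropping them, keeps the control auditable.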
What is a conceptual data movement model?
An enterprise conceptual data movement model should be created as part of the information management strategy. This model is part of the business model and shows what data flows into, within, and out of the enterprise. It is a “high-level” model that shows movement of data from one application to another. For example, an order entry system may capture customer name and address information and later send it to the billing application, which handles billing and accounting so that bills can be sent out and paid.
Think of the conceptual data movement model as an architect’s conceptual drawing of a house. It provides a good idea of what is required with very little additional detail.
What is a logical data movement model?
The logical data movement model should show data movement at the dataset (entity/table) level. It should detail the transformation rules and target logical datasets (entities/tables). This model is still considered technology independent.
An interface is a data feed between systems. The order entry system, which sends customer name and address information to the billing application, may actually send three or four files. Each file is called an interface. Each interface should be shown in the logical data movement model.
The focus at the logical level is on the capture of actual source tables and proposed target stores. The logical data movement model should be supported by a source-to-target mapping document, which should include business rules and transformation rules.
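A source-to-target mapping document can itself be captured as structured data, which makes the mapping both readable and executable. The sketch below assumes hypothetical order-entry and billing field names invented for illustration.

```python
# Each entry pairs a source field with a target field, a human-readable
# business rule, and an executable transformation. All names are examples.
MAPPING = [
    {
        "source": "ORDER_ENTRY.CUST_NM",
        "target": "BILLING.CUSTOMER.NAME",
        "rule": "trim leading/trailing spaces",
        "transform": str.strip,
    },
    {
        "source": "ORDER_ENTRY.CNTRY_CD",
        "target": "BILLING.CUSTOMER.COUNTRY",
        "rule": "uppercase the country code",
        "transform": str.upper,
    },
]

def apply_mapping(source_row):
    """Produce a target row by applying each mapping entry's transformation."""
    return {
        m["target"]: m["transform"](source_row[m["source"]])
        for m in MAPPING
    }

# Usage: one source record from the order entry interface.
row = apply_mapping({"ORDER_ENTRY.CUST_NM": " Ada ", "ORDER_ENTRY.CNTRY_CD": "uk"})
print(row)
# → {'BILLING.CUSTOMER.NAME': 'Ada', 'BILLING.CUSTOMER.COUNTRY': 'UK'}
```

Keeping the business rule text alongside the executable transform means the mapping document and the code cannot drift apart.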
What is a physical data movement model?
The physical data movement model is a detailed representation of the data movement requirements at the dataset (table) level that details the transformation rules and target physical datasets (tables). This model is considered technology dependent. Best practice dictates that there may be one-to-many physical models for each logical model.
Data movement model tools
Many of the extract transform load tools have the ability to create data flow diagrams. Best practices suggest that these tools be used to create and maintain the logical and physical data movement models.
What data movement best practices are recommended?
The following topics should be covered by data movement best practices:
- Configuration management; and
- Data movement software and technologies.
Organizations should standardize on a common data movement tool and ensure that this is included in the information management technology standards list. Some common tools include:
- Ab Initio; and
- Ascential DataStage.
“Open source” ETL tools should also be considered.
Organizations should also standardize on a job-scheduling tool such as:
- Control-M for job scheduling.
Roles and responsibilities should be defined and communicated to all stakeholders for each key role in data movement.
What is data management?
Data management is a sub-set of information management that governs the organization and control of the structure and design, storage, movement, security and quality of information.
Enterprise business intelligence requires
extract transform load data movement standards to ensure rapid project delivery and optimal
return on information management investment.