logo for information-management-architect.com
Home
Strategy
Framework
Business Case Analysis
Project Planning
Requirements Analysis
Architecture & Design
Build Phase
Quality Assurance
Transition to Production
Management Information
Business Intelligence
Data Warehouse
Tools
Jobs
Contact David Bowman
leftimage for information-management-architect.com

Data Aggregation

Need to understand information management data aggregation and want an overview of key architectural components?

What is data aggregation?

Summary data is a key part of data warehouse design. It provides a costData Aggregation effective means of improving query performance without the cost of investing in additional hardware and storage devices.

Creating summary table structure allows the data architect to design a solution that will rapidly satisfy common queries that access a sub-set of data.

What is a typical summary data requirement?

Suppose that a marketing manager wants to promote the sale of porterhouse steaks within a specific geographic area. This might involve looking at customer profiles to determine which type of customer might buy porterhouse steaks and by offering some form of incentive to switch from round steaks, perhaps fish or pork products.

A typical query might look at:
  • Customer income;
  • Customer spending habits over a reasonable period of time (Perhaps two or three years); and
  • Customer buying habits for meat; within
  • A specific geographic area e.g. all postal codes within a specific state or province.
If we look at the steps involved in executing the above query, we can see that is might be simpler to have a series of pre-aggregated summary tables. This might summarize sales of customers who purchase meat by:
  • Customer;
  • Location;
  • Income; and
  • Spending habits.
How are summary tables created?

Aggregations can be generated by:
  • Developing ETL software, to summarize data from data warehouse source tables and reload the summary tables each time a new data warehouse load is completed;
  • Creating views to summarize data from data warehouse source tables each time the view is called; and
  • Creating materialized views, to summarize from data warehouse source tables each time a new data warehouse load is completed.
What is involved with designing aggregation tables?

A data architect has to:
  • Determine which dimensions need aggregation;
  • Determine the aggregation of multiple values;
  • Aggregate multiple facts into summary tables;
  • Determine the level of aggregation;
  • Determine the extent of embedding dimension data into summary tables;
  • Add time into the summary table; and
  • Determine the best indexing strategy for the aggregation.
Summary...

Data aggregation is a key part of data warehouse design. It provides a cost effective means of improving query performance without the cost of investing in additional hardware and storage devices.

Creating summary table structure allows the data architect to design a solution that will rapidly satisfy common queries that access a sub-set of data.