type
Post
Created date
Aug 23, 2022 03:21 AM
category
Data Science
tags
Data Warehousing
status
Published
Language
English
From
School
summary
slug
password
Author
Jason Ching Yuen Siu
Priority
Featured
Featured
Cover
Origin
Type
URL
Youtube
Youtube
icon
Level of Aggregations and Data Warehousing Architecture
Summary
- The higher the level of aggregation, the more aggregation in the fact measure.
- A data warehouse is built primarily used for drilling down some interesting data for business decision.
- It is common when we design a data warehouse, we start from a high level of aggregation, where fact measures contain aggregated values. Lowering down the level of aggregation can be done by changing the granularity of the dimension, or by simply adding new dimensions.
- Determining whether a star schema is in Level-0 or not can be tricky. Not having an aggregated fact measure does not always mean that the star schema is on Level-0. Hence, it is important to understand the concept of transaction recorded in the E/R diagram of the operational database
What does levels of granularity imply?
Aggregate values have different levels of granularity.
- The lower the level of granularity:
- The higher the level of aggregation (e.g., level 0,1,2)
- The more specific information we can get. (越低越細)Level 0 is the highest level (Most detail).
In another words, the granularity (or details) of dim is low, so the level of agg is high.
Example:
- Total Sales per Year has lower granularity than Total Sales per Quarter.
- Number of Logins in the lab per Semester has different level of focus (or granularity) than Number of Logins in the lab per Month
How to lower down the level of aggregations
- Add a new dimension. When we add a new dimension, each value in the fact measure will literally be broken down more details on each record of the new dimension.
- Replace an existing dimension with a higher granularity dimension. The values of the fact measures will also be broken down more details because the fact measure has a lower detail dimension
How do we know if the schema is in its lowest level (i.e., Lv 0)?
- Op = Schema. Attributes are the same as the ones in the operation database.
- Unaggregated fact measures. Lv 0 has no aggregation so values of fact measures are 1.
What are the case when level of aggregation CANNOT be compared?
Case Studies
Case study 1: The number of logins per hour at night?
Star schema
Case study 2 (Lab) : Clothing Company
Case study 3 (Lab) : Toll Way
Star Schemas with No Aggregation (i.e., Level 0 )
Although dims cannot be broken down, having lowest dims does NOT gurantee that the values of FM is Unaggregated. So, these are NOT in lv 0.
This is not Level-0 because the fact measures do not capture both sides of the m-m relationship between Purchase Order and Item entities in the E/R diagram; rather, it captures one side only, which is from the Purchase Order side and not from the Item side.
There are three possible lv0 schemas
If we need to include all possible dimensions in the star schema, we can add Product Dimension to the star schema.
And Level-0 star schema must have the Purchase Order Dimension and Item Dimension as its core.
Since the star schemas below come with these two core dimensions, it is at lv 0.
The most complete star schema with Customer Dimension and Product Dimension
or with either Customer Dimension (Fig. 14.14) or Product Dimension only
Conclusion
The Level-0 star schema should focus on the transaction level of the E/R diagram, which is denoted by the m-m relationship.
Understanding the Relationship between Transactions and Fact Measures
- Author:Jason Siu
- URL:https://jason-siu.com/article/851e0123-12a5-4628-bda2-45f9f260dd29
- Copyright:All articles in this blog, except for special statements, adopt BY-NC-SA agreement. Please indicate the source!
Relate Posts