Welcome to dbFreaks.com!
FAQFAQ    SearchSearch      ProfileProfile    Private MessagesPrivate Messages   Log inLog in

Alternative Type 2 Slowly Changing Dimension

 
   Database Help (Home) -> OLAP RSS
Next:  Alternative Type 2 Slowly Changing Dimension  
Author Message
TP

External


Since: Jun 04, 2004
Posts: 2



(Msg. 1) Posted: Fri Jun 04, 2004 2:48 pm
Post subject: Alternative Type 2 Slowly Changing Dimension
Archived from groups: comp>databases>olap (more info?)

How Should We Implement A Slowly Changing Dimension?

Currently, our data warehouse has only Type 1 Slowly Changing
Dimensions (SCD). That is to say we overwrite the dimension record
with every update. The problem with that is when data changes, it
changes for all history while this is valid for data entry
corrections, it may not be valid for all data. An acceptable example
could be Customer Date of Birth. If the date of birth was changed,
chances are the reason was that their data was incorrect.

However, if the Customer address were changed, this may and probably
does mean the customer moved. If we simply overwrite the address then
all sales for that customer will belong to the new address. Suppose
the customer moved from Florida to Ohio. If we were trying to track
sales patterns by region, all of the customer's purchase that were
made in Florida would now appear to have been made in Ohio.


Type 1 Slowly Changing Dimension
Customer Dimension
ID CustKey Name DOB City State
1001 BS001 Bob Smith 6/8/1961 Jacksonville FL
1002 LJ004 Lisa Jones 10/15/1954 St. Augustine FL
   
   
Customer Dimension After Edits
ID CustKey Name DOB City State
1001 BS001 Bob Smith 6/8/1961 Dayton OH
1002 LJ004 Lisa Jones 10/15/1954 St. Augustine FL


In the example above, the DOB change doesn't affect any dimensional
reporting facts. However, the City, State change would have an
affect. Now all sales for Bob Smith would appear to come from Dayton,
Ohio rather than from Jacksonville, Florida.

  The solution we have chosen for solving this problem is to implement
a Type 2 slowly changing dimension. A Type 2 SCD records a separate
row each time a value is changed in the dimension. In our case, we
are declaring that we will only create a new dimension record when
certain columns are changed. In the example above, we would not
record a new record for the DOB change but we would for the address
change.


Type 2 Slowly Changing Dimension
Customer Dimension
ID CustKey Name DOB City State Current Effective Date
1001 BS001 Bob Smith 6/8/1961 Jacksonville FL Y 5/1/2004
1002 LJ004 Lisa Jones 10/15/1954 St. Augustine FL Y 5/2/2004
   
Customer Dimension After Edits
ID CustKey Name DOB City State Current Effective Date
1001 BS001 Bob Smith 6/8/1961 Jacksonville FL N 5/1/2004
1002 LJ004 Lisa Jones 10/15/1954 St. Augustine FL Y 5/2/2004
1003 BS001 Bob Smith 6/8/1961 Dayton OH Y 5/27/2004


As you can see, there are two dimension records for Bob Smith now.
They both have the same CustKey values, but the have different ID
values. All future fact table rows will use the new ID to link to the
Customer dimension. This is accomplished by the use of the Current
Flag. The ETL process looks only at the current flag when recording
new orders. However, in the case of an update to an order the
Effective Date must be used to determine which customer the update
applies to.

The primary issue with Type 2 SCD is the volume of data grows
exponentially as more changes are tracked. This can impact
performance in a star schema. The principle behind the star schema
design is that while facts are few columns, they have many rows but
they only have to perform single level joins to resolve their
dimensions. The assumption is that the dimensions have lots of
columns but relatively few rows. This allows for very fast joining of
data.
  Conforming Dimensions

For the purposes of this discussion conforming dimensions only need a
brief definition. Conforming dimensions are a feature of star schemas
that allow facts to share dimensional data. A conforming dimension
occurs when two dimensions share the same keys. Often they have
different attributes. The goal is to ensure that any fact table can
link to the conforming dimension and consume its data so long as the
dimension is relevant.


Conforming Dimension
Customer Dimension
ID CustKey Name DOB City State
1001 BS001 Bob Smith 6/8/1961 Jacksonville FL
1002 LJ004 Lisa Jones 10/15/1954 St. Augustine FL
   
   
Billing Dimension
ID Bill2Ky Name Account Type Credit Limit CustKey
1001 9211 Bob Smith Credit $ 10,000 BS001
1002 23421 Lisa Jones Cash $ 100 LJ004


In the example above, we could use the ID from the Customer dimension
in a fact and in the future a link to the Billing dimension could be
established without having to reload the data.

We are considering a slight modification to the standard Type 2 SCD.
The idea is to maintain two dimensions one as a Type 1 and one as a
Type 2. The problem with this is we lose the ability to use
conforming dimensions.


Type 2 and Type 1 Slowly Changing Dimension
Customer Dimension Type 1
ID CustKey Name DOB City State Current Effective Date
1001 BS001 Bob Smith 6/8/1961 Dayton OH Y 5/1/2004
1002 LJ004 Lisa Jones 10/15/1957 St. Augustine FL Y 5/2/2004
   
Customer Dimension Type 2
ID CustKey Name DOB City State Current Effective Date
1001 BS001 Bob Smith 6/8/1961 Jacksonville FL N 5/1/2004
1002 LJ004 Lisa Jones 10/15/1957 St. Augustine FL Y 5/2/2004
1003 BS001 Bob Smith 6/8/1961 Dayton OH Y 5/27/2004

As you can see, the current ID for Bob Smith in the Type 1 SCD is
1001, while it is 1003 in the Type 2 SCD. This is not conforming.

Our solution is to create a composite key for the Type 2 SCD.


Type 2 and Type 1 Slowly Changing Dimension
Customer Dimension Type 1
ID CustKey Name DOB City State
1001 BS001 Bob Smith 6/8/1961 Dayton OH
1002 LJ004 Lisa Jones 10/15/1957 St. Augustine FL
   
Customer Dimension Type 2
ID SubKey CustKey Name DOB City State Current Effective Date
1001 001 BS001 Bob Smith 6/8/1961 Jacksonville FL N 5/1/2004
1002 001 LJ004 Lisa Jones 10/15/1957 St. Augustine FL Y 5/2/2004
1001 002 BS001 Bob Smith 6/8/1961 Dayton OH Y 5/27/2004

In the example above, the Type 1 and the Type 2 dimensions conform on
the ID level. If a fact needs the historical data it will consume
both the ID and the SubKey.

 >> Stay informed about: Alternative Type 2 Slowly Changing Dimension 
Back to top
Login to vote
Display posts from previous:   
Related Topics:
MicroStrategy Report grouping by two different dates - Hello: I need to do a MicroStrategy report like that: Year Items by DATE-1 Items by DATE-2 --- --------------- --------------- 2005 43 30 2004 124 96 2003 112 ...

Reducing Processing time of large catalog/cubes over the n.. - I have a catalog that we are do a full process on which consists of 7 cubes with about 75 millions rows of data(total). This catalog takes about 6 hours when processed locally. Recently we moved our datamart to another site(than the analysis server),..

Documeting Cognos Reportnet solution - Have been trying to find some best practices for documenting a reportnet based solution. We have recently upgraded to Cognos Reportnet and intend to rationalise and move over several reports which are running on impromptu to Reportnet. The problem I am....

MDX - Returning a fixed value - Hi , I need to create an MDX query that will return the value 0, is it possible to do this without the use of measures/dimensions etc? As an example what I want to recreate is the SQL below using MDX: select sum(sales), 0, 0 ,0 ,0 from SalesTable My....

Average function and MSSQL Analysis services - Hi All This should be pretty simple I think, but I can't make it work: Using MSSql 2000 with Analysis Services I have created a cube. In this cube I want to have a measure or calculated member that gives me the average instead of the sum. My data..
   Database Help (Home) -> OLAP All times are: Pacific Time (US & Canada)
Page 1 of 1

 
You can post new topics in this forum
You can reply to topics in this forum
You can edit your posts in this forum
You can delete your posts in this forum
You can vote in polls in this forum



[ Contact us | Terms of Service/Privacy Policy ]