Sunday, December 18, 2011

Multiple Calendars

Fielded a question the other day about needing multiple calendars in a dimensional model, and wanted to pass along some thoughts here.  The key requirement was to be able to analyze business events (fact data) by filtering and slicing by multiple calendars.  Pretty common requirement in many organizations.  Particularly ones that operate in multiple regions of the world and/or have business units that behave as separate entities.  Sometimes the subs are acquisitions that continue to operate independently.  Sometimes the subs operate in different markets that observe different holidays.  Whatever the case, if you are working on a dimensional modeling assignment and you have a requirement to consolidate fact data across multiple entities with different calendars like this, hopefully this post will be helpful.



Multiple calendar hierarchies embedded in main date dimension
("Alternate#" hierarchy name just a place holder for proper name)

There are several options for supporting analysis against a common date dimension, as well as multiple independent date dimensions.  One approach is to add the additional hierarchies to the main Date dimension directly.  If you only have one alternate calendar that you need to slice by, this is probably the simplest route to go and would work well for you.  Unfortunately, many businesses need more than one, and adding 2, 3 or 4 alternate hierarchies to your main date dimension will likely mean some maintenance headaches.  When the business needs another alternate calendar, you'll have to add the new hierarchy attributes to your relational structure, as well as expose them in your cubes and/or reporting layers.  There are more elegant and maintenance-friendly ways to handle these requirements.

One very common approach that requires no relational structure changes when a new alternate calendar is needed involves using a sub-dimension (snowflake) off the main date dimension.  This additional table would have a granularity of one row per date per unique calendar (or unique market or client or school or however you need it).  Then to be able to use the alternate calendar dimension's attributes in a query you must join to this dimension and constrain on a single calendar version.  The Kimball Group has some helpful writings in this area. Take a look at the Kimball Group Reader (p.376 - Multiple National Calendars; p.339 - Multi-Enterprise Calendar).  Both use the same approach as described above.  And this approach is relatively easy to query with SQL.
Sub-Dimension (snowflake) for additional calendars

Another approach that can solve this modeling challenge involves using a bridge table from the date dimension to a separate Calendar dimension. This approach would effectively map each date to as many versions of calendars as you need.  And the calendar dimension can be at a higher level of granularity to only include the groupings that the dates need to roll up to.  The bridge table itself would have the date-level granularity.  In essence, the sub-dimension approach described above is accomplishing the exact same thing, but it's simply mashing the bridge structures all into the a single snowflake table.  With the bridge approach, to reference a specific version of the Calendar dimension, simply constrain on the single version of the calendar dimension you need, and then drill across through the date dimension to your main fact.  This would enable slicing your fact data by any version of the Calendar you might need.
Bridge to capture multiple calendars for each date in DimDate
Either approach (sub-dimension or bridge) will give you the flexibility you need to handle multiple alternate calendars without having to continue to add columns to your base date dimension.  However, if you need to deliver your data mart to your users in an Analysis Services cube, I believe going with the bridge would make things cleaner in terms of relationships between structures and from a visual perspective in the DSV model.  In AS, the bridge design would be implemented via a many-to-many relationship between the Calendar dimension and the main fact table, with the bridge table acting as a fact table and as the intermediate measure group. 

Hope that is helpful. 

No comments:

Post a Comment