BI & Analytics: Classify before you analyze
According to Wikipedia, classification is the process in which ideas and objects are recognized, differentiated, and understood. In business intelligence, when performing reporting and analytics, companies need to go through a classification exercise before they start analyzing numbers. In other words, they must build hierarchies.
But what are hierarchies?
Hierarchies are often described as a system or organization in which elements are ranked, one above the other, according to status or authority. Hierarchy management is described as the process of keeping all those levels and members (or nodes) accurate and in the right place. Hierarchy management tools help organizations create and maintain hierarchies.
To be charitable, those definitions are a tad fuzzy. So, to begin a clarifying conversation on hierarchies, I chose to gather up a few observations and develop the following hierarchy of hierarchies!
The goal is to introduce you to the wide array of hierarchy types and hierarchy management requirements so that you can better describe your own use cases and understand the claims, counter-claims, customer references and qualifications of your solution providers.
First off, what kinds of hierarchies should a solution support?
One thing I've noticed is that many tool discussions center on the morphology of the hierarchies. For example, can the tools support hierarchies that are:
Balanced - each branch descends to the same level,
Unbalanced - branches can have varying numbers of levels, and
Level skipping, or ragged - the parent of at least one member of the hierarchy is not in the level immediately above the member?
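To make those three shapes concrete, here is a minimal Python sketch that labels a hierarchy as balanced, unbalanced, or ragged. It assumes every member carries an explicit level number and a single parent; the Member class, the classify_shape function, and the sample geography are hypothetical names made up for illustration, not part of any particular tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Member:
    name: str
    level: int              # 0 = top level, 1 = the level below it, and so on
    parent: Optional[str]   # None for the root member


def classify_shape(members: List[Member]) -> str:
    """Label a single-parent hierarchy as balanced, unbalanced, or ragged."""
    by_name = {m.name: m for m in members}
    parent_names = {m.parent for m in members if m.parent is not None}

    # Ragged: at least one member's parent is not in the level immediately above.
    for m in members:
        if m.parent is not None and by_name[m.parent].level != m.level - 1:
            return "ragged"

    # Leaves are members that are nobody's parent.
    leaf_levels = {m.level for m in members if m.name not in parent_names}

    # Balanced: every branch bottoms out at the same level.
    return "balanced" if len(leaf_levels) == 1 else "unbalanced"


geography = [
    Member("World", 0, None),
    Member("Europe", 1, "World"),
    Member("Asia", 1, "World"),
    Member("France", 2, "Europe"),
    Member("Japan", 2, "Asia"),
]
print(classify_shape(geography))  # balanced
```

A hierarchy can be both unbalanced and ragged; this sketch simply reports ragged first, since level skipping is usually the harder case for a tool to handle.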
The only issue I have with focusing on structural requirements is that it ignores the purpose of the hierarchy.
Thinking about the purpose, or use case, is important because embedded in the use case are business rules and constraints that need to be addressed by the solution. This is why for our taxonomy we offer this use case-centric view.
Our Hierarchy of Hierarchies
Control - Describes ownership & accountability
Examples include: Legal entity hierarchies, organization charts, sales territories
In these hierarchies, the children of a specific member show its span of control, or scope of authority. Higher ranks (or more senior generations) have more control. Members in these kinds of hierarchies often have multiple parents. For example, in most legal entity hierarchies, joint ventures have at least two parents, and many subsidiaries (especially those domiciled outside your home country) will have multiple parents.
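Because members can have several parents, a control hierarchy behaves more like a graph than a simple tree. The sketch below assumes the hierarchy is stored as (parent, child) pairs and walks it to find a member's span of control; the entity names and both function names are hypothetical.

```python
from collections import defaultdict


def build_children(edges):
    """Map each parent to the set of its direct children."""
    children = defaultdict(set)
    for parent, child in edges:
        children[parent].add(child)
    return children


def span_of_control(member, children):
    """Every member below `member`, counted once even if reachable via several parents."""
    seen = set()
    stack = [member]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


# Hypothetical legal entity hierarchy: the joint venture has two parents.
edges = [
    ("HoldCo", "OpCo A"),
    ("HoldCo", "OpCo B"),
    ("OpCo A", "Joint Venture"),
    ("OpCo B", "Joint Venture"),
]
children = build_children(edges)
print(sorted(span_of_control("HoldCo", children)))
# ['Joint Venture', 'OpCo A', 'OpCo B']
```

The `seen` set is what keeps the joint venture from being counted twice under HoldCo, which is exactly the situation a single-parent tree structure cannot represent.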
Reporting - Describes rollups
Examples include: Chart of accounts, risk reports, reports by market segment
In these hierarchies, the children of a specific member are that member's sub-accounts. The parent's line item amount is allocated, or broken out, across all of its children. Often these hierarchies have exclusivity constraints: members can only have one parent, and violating that constraint means you're double booking an amount. Also, alternate versions of these hierarchies have completeness constraints (as in, all members must be used in every alternate version); otherwise you're comparing apples to oranges.
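Both constraints are easy to check mechanically. Here is a rough sketch, again assuming each hierarchy version is a list of (parent, child) pairs. It also assumes that "members" in the completeness rule means the leaf-level accounts, since rollup nodes usually differ between alternate versions; that reading, the account names, and the function names are my own assumptions for illustration.

```python
from collections import defaultdict


def exclusivity_violations(edges):
    """Return {member: sorted parents} for every member with more than one parent."""
    parents = defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
    return {c: sorted(p) for c, p in parents.items() if len(p) > 1}


def leaf_members(edges):
    """Members that appear as a child but never as a parent."""
    parent_names = {p for p, _ in edges}
    child_names = {c for _, c in edges}
    return child_names - parent_names


def completeness_gaps(primary_edges, alternate_edges):
    """Leaf members used in one version of the hierarchy but not the other."""
    primary = leaf_members(primary_edges)
    alternate = leaf_members(alternate_edges)
    return {
        "missing_from_alternate": sorted(primary - alternate),
        "missing_from_primary": sorted(alternate - primary),
    }


# Hypothetical chart of accounts and an alternate fixed/variable view of it.
by_type = [
    ("Revenue", "Product Sales"),
    ("Revenue", "Services"),
    ("Expenses", "Salaries"),
    ("Expenses", "Rent"),
]
by_behavior = [
    ("Fixed", "Rent"),
    ("Fixed", "Salaries"),
    ("Variable", "Product Sales"),
]
print(exclusivity_violations(by_type))          # {}
print(completeness_gaps(by_type, by_behavior))
# {'missing_from_alternate': ['Services'], 'missing_from_primary': []}
```

In this example the alternate view silently drops Services, so totals from the two versions would not reconcile.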
Classification - Groups like things together
Examples include: Classifying accounts by region, classifying assets by risk level, organizing products by brand
In these hierarchies, the children are subdivisions, or further refinements, of a specific category. These hierarchies usually require that members of a group or subgroup inherit attributes that are unique to that group. For example, in an employee hierarchy that classifies US-based employees by exempt vs. nonexempt, we would expect to see an overtime rate for the nonexempt employee nodes.
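Attribute inheritance can be sketched as a walk from a member up to the root, collecting each group's attributes along the way. The example below assumes a single-parent, cycle-free hierarchy stored as a child-to-parent dictionary; the employee names, the 1.5 overtime multiplier, and the resolve_attributes function are hypothetical.

```python
def resolve_attributes(member, parent_of, own_attributes):
    """Merge attributes from the root down to `member`; nearer groups override farther ones."""
    chain = []
    node = member
    while node is not None:          # assumes the hierarchy has no cycles
        chain.append(node)
        node = parent_of.get(node)

    resolved = {}
    for node in reversed(chain):     # root first, the member itself last
        resolved.update(own_attributes.get(node, {}))
    return resolved


# Hypothetical employee classification hierarchy (child -> parent).
parent_of = {
    "Exempt": "US Employees",
    "Nonexempt": "US Employees",
    "Alice": "Exempt",
    "Bob": "Nonexempt",
}
# Attributes defined directly on a group; the values are illustrative only.
own_attributes = {
    "US Employees": {"country": "US"},
    "Nonexempt": {"overtime_rate": 1.5},
}

print(resolve_attributes("Bob", parent_of, own_attributes))
# {'country': 'US', 'overtime_rate': 1.5}
print(resolve_attributes("Alice", parent_of, own_attributes))
# {'country': 'US'}
```

Only the nonexempt branch picks up the overtime rate, while both employees inherit the attribute defined at the top of the hierarchy.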
Next, how are hierarchies managed?
When discussing hierarchy management, I often hear that there are two main ways to create the hierarchy:
Automatically - the hierarchy is derived from the underlying data relationships between the items you'd like to include, or
Manually - the hierarchy is explicitly created by a subject matter expert
Again, let's take a step back and remember that the purpose of hierarchy management is making sure that members and levels are in the right place. What this means is that, whether node placement is derived from a data relationship or from an explicit decision by a user, at some point an accountable party will need to determine whether the hierarchy is accurate. This simple requirement has several big ramifications. Some examples:
Who has the rights to create? review? approve?
How do you support workflows for review and approval?
Can your accountable parties use the tool to create, review, and compare hierarchies?
What do you do with the retired versions?
These issues around governance, workflow, user-friendliness, and versioning are essential to sort out when considering your hierarchy management strategy.
Finally, how do you ensure that your hierarchy members are accurate?
Well, ensuring that your hierarchy members and their attributes are accurate is actually part and parcel of keeping your dimensions conformed, which is a topic for another blog post.