How do we define data controls that adequately mitigate risk?
Data Governance & Control
In my previous article, I mentioned the importance of examining the data controls when assessing the maturity of an organisation’s data governance capabilities.
How controls are drafted and executed can reveal a lot about the robustness of an organisation's data governance implementation.
In this two-part article, we will therefore explore what makes a good control.
Part 1 focuses on why a control framework is important and how you can determine whether you have the right controls in place. Part 2 will look at how to write a good control narrative, together with some examples to get you started.
Why you need Data Controls
Many believe that IT is responsible for ensuring that the data in the warehouse and in reports is of good quality. This is what I call the first myth in the world of data quality!
Whilst IT may control large segments of the architecture on which data resides, it is responsible for neither the data's input nor much of its manipulation.
Let’s take a look at a couple of examples to illustrate this.
On one engagement, I was asked to look at the reasons behind the prevalent use of an internal business code on reports. Investigation revealed that it was used as a default when processing data upstream to short-cut the process.
In another organisation, transaction amounts were double-counted in the warehouse because they were replicated on two separate systems. There was a manual process to prevent this, but no clear controls to ensure it was followed.
Example 1 is clearly outside the control of IT.
Whilst the organisation in example 2 did need to rationalise its systems, it nevertheless needed an exception-monitoring control in the business to ensure that the manual process was followed.
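To make example 2 concrete, here is a minimal sketch of what such an exception-monitoring check might look like. It is only illustrative: the pandas approach and the column names (transaction_id, source_system, amount) are assumptions about the warehouse schema, not a prescription.

```python
import pandas as pd

def find_duplicate_transactions(warehouse: pd.DataFrame) -> pd.DataFrame:
    """Flag transactions that appear under more than one source system.

    Assumes (hypothetically) that the warehouse extract carries
    'transaction_id', 'source_system' and 'amount' columns.
    """
    # Count how many distinct source systems report each transaction
    systems_per_txn = warehouse.groupby("transaction_id")["source_system"].nunique()
    suspect_ids = systems_per_txn[systems_per_txn > 1].index

    # Return the affected rows so the business (not IT) can review them
    return warehouse[warehouse["transaction_id"].isin(suspect_ids)]
```

The point is not the code itself, but that the exceptions land with a named business owner who is accountable for clearing them.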
Controls vs Data Quality Metrics
There is also the view that once an organisation has defined and set up DQ metrics at the point of consumption, and monitors them regularly, the control "box" is ticked. This is a close second to the first myth!
Let's be clear: identifying measurable data quality criteria and monitoring them on an ongoing basis is an important quality control that we should continue to rely on.
I am not disputing this.
But… there is much more to a robust and comprehensive control framework than merely a set of DQ metrics.
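For context, a DQ metric at the point of consumption often boils down to something like the sketch below. The column names (policy_ref, premium_amount) and the thresholds are purely illustrative assumptions.

```python
import pandas as pd

def completeness_pct(df: pd.DataFrame, column: str) -> float:
    """Percentage of rows where the given column is populated."""
    return 100.0 * df[column].notna().mean()

def check_dq_metrics(report_data: pd.DataFrame) -> dict:
    """Evaluate a couple of illustrative metrics against agreed thresholds."""
    return {
        # Completeness: every premium record should carry a policy reference
        "policy_ref_complete": completeness_pct(report_data, "policy_ref") >= 99.5,
        # Validity: premium amounts should never be negative
        "premium_non_negative": bool((report_data["premium_amount"] >= 0).all()),
    }
```

Useful, yes; sufficient on its own, no, as the next section shows.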
The Data Supply Chain
In most organisations, data flows from upstream input systems to its point of consumption in downstream processes, being augmented and manipulated along the way.
Let’s illustrate this with an example from the insurance industry.
Picture a simplified flow: premium and claims data move from the upstream booking systems into the data warehouse, then into the Actuarial Reserving system and through some manual calculation spreadsheets, before finally arriving at the Capital Calculation Engine.
Along its journey, the following operations occur:
1. Manual input of the data into booking systems
2. Migration from one system to another
3. Joining of distinct data domains
4. Expert judgement augmentation
5. Use of End-User Applications
Every time data is touched, moved, or augmented in some way, a distinct risk to that data is introduced.
The precise nature of the risk depends on the end use. However, a set of DQ metrics positioned at the end of the process simply cannot account for all the inherent risks that apply along the way.
What if, along the way, a number was incorrectly transposed on one of the Actuarial spreadsheets?
What if the feed on to the warehouse lost some of the data?
What if the Actuarial estimates were materially incorrect?
Would a set of metrics measuring Completeness or Accuracy be able to capture these instances?
The answer is, of course, a resounding no. Rather, a range of controls is required across the lineage, from QA-type checks and system reconciliations to management reviews.
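As an illustration of one such control, here is a minimal sketch of a system reconciliation between a source feed and the warehouse load. The column name and tolerance are illustrative assumptions; in practice the comparison would be agreed with the data owner.

```python
import pandas as pd

def reconcile_feed(source: pd.DataFrame, warehouse: pd.DataFrame,
                   amount_col: str = "amount",
                   tolerance: float = 0.01) -> dict:
    """Compare record counts and monetary totals between a source feed
    and the corresponding warehouse load, within a rounding tolerance."""
    result = {
        "source_rows": len(source),
        "warehouse_rows": len(warehouse),
        "source_total": float(source[amount_col].sum()),
        "warehouse_total": float(warehouse[amount_col].sum()),
    }
    result["rows_match"] = result["source_rows"] == result["warehouse_rows"]
    result["totals_match"] = (
        abs(result["source_total"] - result["warehouse_total"]) <= tolerance
    )
    return result

# Any break would be raised as an exception for the data owner to investigate
# before the downstream Actuarial process consumes the data.
```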
A Robust Data Control Framework
To manage all the risks in the data flow, you need to do the following (a simple sketch of how this might be recorded follows the list):
· Data Flow - understand the flow of data and its principal operations at a sufficiently granular level
· Inherent Risk - identify the key inherent risks applicable to each operation in the flow
· Residual Risk - assess the controls in place to determine the residual risk
· Risk Appetite - assess the residual risk against your risk appetite
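One way to make this framework tangible is to hold it as a simple register against the lineage. The sketch below is an assumption about how you might structure such a register; the field names and example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Operation:
    """One step in the data flow, e.g. manual input into a booking system."""
    name: str
    manual: bool                       # manual steps usually carry higher inherent risk
    inherent_risks: List[str] = field(default_factory=list)
    controls: List[str] = field(default_factory=list)
    residual_risk: str = "unassessed"  # e.g. "low", "medium", "high"
    within_appetite: bool = False

# The register is simply the ordered list of operations along the lineage
premium_flow = [
    Operation(
        name="Manual input of premium into the booking system",
        manual=True,
        inherent_risks=["keying errors", "use of default business codes"],
        controls=["four-eyes check on bookings above a threshold"],
        residual_risk="medium",
        within_appetite=False,  # would trigger remediation or an additional control
    ),
]
```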
Lineage
To be able to undertake this type of analysis, you need to have a robustly documented lineage. Here again, I’ve observed many organisations where lineage is defined as simply the flow of data from one system to another.
This is not enough!
You need to understand the business processes involved, their nature and whether they are manual or automated.
Only then can you start to review the unique risks posed and assess whether controls are in the right place.
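If it helps, lineage documented at this level can be captured as simply as the sketch below. The structure and the example hops are illustrative assumptions, not a standard.

```python
from dataclasses import dataclass

@dataclass
class LineageHop:
    """One hop in the lineage: not just 'system A to system B', but also
    the business process that moves the data and whether it is automated."""
    source: str
    target: str
    business_process: str   # e.g. the nightly premium feed
    process_owner: str      # the accountable business role, not just IT
    automated: bool         # manual hops warrant closer control scrutiny

lineage = [
    LineageHop("Booking system", "Data warehouse",
               business_process="Nightly premium feed",
               process_owner="Underwriting Operations", automated=True),
    LineageHop("Data warehouse", "Reserving spreadsheets",
               business_process="Quarterly reserving extract",
               process_owner="Actuarial", automated=False),
]
```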
Effort vs Reward
Going through this process for every data flow would clearly be a hugely time-consuming exercise.
A policy decision will therefore need to be made about which data to carry this exercise out against.
At a minimum, you will need to cover the data identified as critical to the core processes and functions within your organisation if you want to ensure its robustness.
Look out for Part 2 where we’ll look at what makes a good control narrative.
Subscribe to get future articles in this series.
--
Are you not seeing the expected change after initiating a data governance programme? Are you struggling to ensure that you have the right controls in place?
Book a call to discover how we can help you implement a robust data governance framework and mature your implementation.