Data governance plays an important role in how you handle the data that your organization collects. You must keep an eye on data quality, security, and more.
As data takes center stage in the push for organizations to adopt digital transformation, the need for data governance has become more crucial than ever. For organizations to truly treat data as the asset that it is, a documented data governance framework is necessary. Without it, organizations will continue to struggle to glean trustworthy information for informed business decisions. This is especially true for organizations looking to adopt data lakes and advanced analytics in machine learning and AI.
Data governance helps define processes, policies, tools, and rules for the collection and management of data while ensuring consistency, accountability, security, and scalability. In short, it helps bring order to the chaos that many define as their data platform. With so much chaos, it can be difficult to find a place to start. Here are five key areas of data governance, with simple examples to get organizations thinking about how to adopt it.
Data Quality
Without data quality, business users are less likely to leverage data for strategic initiatives and operational efficiency improvements. Because data governance should be a companywide initiative, all business users should do their part in ensuring quality data. This is paramount to building a data-driven culture and to keeping your data lake from becoming a data swamp. Without a clear understanding of the quality of the organization’s data, there really is no point in exploring machine learning or AI.
Data quality can be defined by five main categories:
The long-time adage of ‘garbage in, garbage out’ (GIGO) speaks to many of the data quality issues that organizations face. For many, the lack of clearly defined business processes is the source of GIGO. As a common example, many organizations have loosely defined processes for creation of source documents like expense reports or timesheets. But many don’t clearly define all fields that users should populate for consistency and completeness across the organization. This presents many challenges in reporting and analytics. For many, correcting this can be as simple as updating historical transactions to include the necessary data, as well as clearly (re)defining the business process for all relevant users.
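As a rough illustration of the expense report example, the sketch below counts how often each required field is left blank. The field names and sample records are hypothetical; a real list of required fields would come from the organization's documented business process.

```python
# Hypothetical required fields for an expense report (an assumption for
# illustration; the real list comes from the documented business process).
REQUIRED_FIELDS = ("employee_id", "department", "expense_date", "amount", "category")


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or blank in one record."""
    return [
        field
        for field in REQUIRED_FIELDS
        if record.get(field) is None or str(record.get(field)).strip() == ""
    ]


def completeness_report(records: list) -> dict:
    """Count, per required field, how many records are missing that field."""
    counts = {field: 0 for field in REQUIRED_FIELDS}
    for record in records:
        for field in missing_fields(record):
            counts[field] += 1
    return counts


# Illustrative sample data only.
sample_reports = [
    {"employee_id": "E100", "department": "Sales", "expense_date": "2021-03-01",
     "amount": 120.50, "category": "Travel"},
    {"employee_id": "E101", "department": "", "expense_date": "2021-03-02",
     "amount": 40.00, "category": None},
]

print(completeness_report(sample_reports))
```

A report like this can show which fields the business process needs to define more clearly before any historical transactions are corrected.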
Data Security
Data security is not only tied to risk management; it also ensures business users have access to the information they need to make informed business decisions. Organizations are adopting self-service and big data tools now more than ever. Without a clear definition of who should have access to what data across the organization, security can become a tangled web that is harder to unravel by the day. The adoption of data tools across organizations is growing rapidly, and without a clear security model it can become impossible to support.
One basic example is the use, or lack thereof, of security groups. Many organizations are not leveraging these groups to ensure resources with similar roles are given the same level of access. This not only complicates the onboarding process and overall security management but also creates the potential for resources to gain access to data they should not be privy to. The classic IT request of “can you match Suzy’s permissions” needs to die a hard death and be replaced with a more thought-out role-based approach.
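A minimal sketch of the role-based idea, assuming hypothetical group names, dataset names, and users: access is granted to groups, and users only ever receive access through group membership.

```python
# Hypothetical security groups mapped to the datasets each role may read.
GROUP_PERMISSIONS = {
    "finance_analyst": {"general_ledger", "expense_reports"},
    "sales_manager": {"crm_opportunities", "sales_orders"},
}

# Users are assigned to groups and are never granted dataset access directly.
USER_GROUPS = {
    "suzy": {"finance_analyst"},
    "new_hire": {"finance_analyst"},
    "sam": {"sales_manager"},
}


def datasets_for(user: str) -> set:
    """Union of the datasets granted by every group the user belongs to."""
    allowed = set()
    for group in USER_GROUPS.get(user, set()):
        allowed |= GROUP_PERMISSIONS.get(group, set())
    return allowed


def can_read(user: str, dataset: str) -> bool:
    """True if any of the user's groups grants access to the dataset."""
    return dataset in datasets_for(user)


print(can_read("new_hire", "general_ledger"))  # True: granted via finance_analyst
print(can_read("sam", "general_ledger"))       # False: sales_manager has no ledger access
```

With this structure, onboarding a new finance analyst means adding them to the group rather than copying another person's individual permissions.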
Metadata Management
Metadata management is the practice of documenting data. Business users need to understand what makes up any piece of data they consume. Where was it sourced? How has it been manipulated or transformed? When was it last updated? What assumptions should I apply? Anyone leveraging data for strategic advantage needs answers to these questions. Without some way to answer them, organizations risk relying on incomplete or misinformed metrics.
Documenting the key performance indicators (KPIs) that drive an organization’s decision making is a starting point. Their definitions should be available to everyone who uses or consumes them so that their meaning is well understood. For example, if a dashboard is reporting revenue, there should be documentation that clearly defines how that revenue is calculated. There are many types of revenue, and in many cases some of them, such as intercompany revenue, should be excluded when making certain business decisions. Without a data catalog or data discovery tool, like Azure Purview, it becomes difficult for consumers to understand what exactly they are analyzing.
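To make the revenue example concrete, here is a small sketch of a documented KPI definition that also drives the calculation. The class, field names, source name, and sample transactions are all hypothetical, and this is not how any particular data catalog stores definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KpiDefinition:
    """A documented KPI: what it means, where it comes from, what it excludes."""
    name: str
    description: str
    source: str
    excluded_revenue_types: frozenset


# Hypothetical definition; the source name is illustrative only.
EXTERNAL_REVENUE = KpiDefinition(
    name="External Revenue",
    description="Sum of invoiced revenue, excluding intercompany transactions.",
    source="erp.invoices",
    excluded_revenue_types=frozenset({"intercompany"}),
)


def calculate(kpi: KpiDefinition, transactions: list) -> float:
    """Sum transaction amounts, skipping revenue types the KPI excludes."""
    return sum(
        t["amount"]
        for t in transactions
        if t["revenue_type"] not in kpi.excluded_revenue_types
    )


# Illustrative sample data only.
transactions = [
    {"amount": 1000.0, "revenue_type": "product"},
    {"amount": 250.0, "revenue_type": "service"},
    {"amount": 400.0, "revenue_type": "intercompany"},
]

print(EXTERNAL_REVENUE.description)
print(calculate(EXTERNAL_REVENUE, transactions))
```

Keeping the exclusion rule next to the written description means a dashboard consumer can see both what the number means and how it was derived.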
Master Data Management
Master data consists of the identifiers and attributes that define an organization’s customers, vendors, products, resources, legal entities, chart of accounts, etc. Identifying this data can be straightforward, but managing it can present challenges, especially when master data is used across many applications in the business.
A common example is maintaining two customer masters across customer relationship management (CRM) and enterprise resource planning (ERP) applications. If customer identifiers do not line up between the two systems, producing the complete customer view becomes unnecessarily challenging. In many cases, this proves the need for integration to improve efficiency and eliminate data duplication.
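One way to see how far apart two customer masters are is to compare their identifiers directly. The sketch below assumes both systems expose a list of customer IDs; the function name and sample IDs are hypothetical.

```python
def reconcile_customers(crm_ids, erp_ids):
    """Compare customer identifiers from two systems and report the overlap."""
    crm, erp = set(crm_ids), set(erp_ids)
    return {
        "matched": sorted(crm & erp),   # present in both systems
        "crm_only": sorted(crm - erp),  # missing from the ERP
        "erp_only": sorted(erp - crm),  # missing from the CRM
    }


# Illustrative sample identifiers only.
crm_customers = ["C001", "C002", "C003"]
erp_customers = ["C002", "C003", "C004"]

print(reconcile_customers(crm_customers, erp_customers))
# {'matched': ['C002', 'C003'], 'crm_only': ['C001'], 'erp_only': ['C004']}
```

The unmatched lists are a natural starting point for deciding which system should own the customer record and where integration is needed.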
Data Stewards
A data-driven culture is impossible without people. Data stewards are the resources within the organization that help drive data literacy. For most organizations, these are not dedicated resources but rather resources with varying titles who lead or assist in data governance initiatives. These resources are imperative in all the areas of data governance discussed above.
There are many ways to set up data stewards across an organization. Some common examples are stewards by subject area or department, by business process, by system, by function, or even by project.
A starting point for many organizations is to use the departmental data stewards approach and have them regularly meet to start on activities around data quality improvement, master data management, and reviewing data security. Regardless of how data stewardship is established or which initiatives they start with, their role in the adoption of data governance will be key in defining the success of building a data-driven culture.
Regardless of an organization’s desire to take on digital transformation, there are many benefits to adopting data governance. Having confidence in the business data being produced across the organization is imperative to making data-driven business decisions. For organizations taking on a digital transformation, it is the perfect opportunity to start the data governance journey and build the data-driven culture that everyone desires.
So, how does this relate to Fantasy Football? Let us take a few examples of how Fantasy Football platforms have implemented data governance based on the same five key areas.
1. Data Quality
Most fantasy platforms have made decisions around the accuracy and timing of data delivered to their users. In most cases, they have decided that the timing of data being delivered to the user is more important than the accuracy. This is why most, if not all, fantasy platforms have adjustments built into them. We have seen nearly a dozen fantasy matches lost on Tuesday after the conclusion of all games because of adjustments to a player’s catches, yards, fumbles, etc. Some of these adjustments come from changes in official scoring from the NFL, but many are simply because of data accuracy issues from the platform provider.
2. Data Security
Security is key to ensuring that your opponents cannot see players you are watching or outstanding waiver wire moves. Could you imagine the competitive advantage one player, even the league manager, would have with that visibility?
3. Metadata Management
The simplest example of this is the scoring information made available in every league. On many platforms, the scoring will be broken down into how the total is accumulated across catches, yards, touchdowns, sacks, fumbles, etc. On those that don’t, the scoring definitions for each type of occurrence still make clear how those values are calculated. Without this, there would be no way for fantasy managers to verify point totals or confirm whether adjustments have been applied.
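As a sketch of how published scoring definitions let a manager verify a total, the example below applies a hypothetical set of point values to a player's stat line. The point values and stat names are assumptions for illustration and do not reflect any specific platform's rules.

```python
# Hypothetical scoring definitions: points awarded per occurrence of each stat.
SCORING_RULES = {
    "reception": 1.0,
    "receiving_yard": 0.1,
    "touchdown": 6.0,
    "fumble_lost": -2.0,
}


def score_breakdown(stats: dict) -> dict:
    """Points contributed by each stat, using the published scoring rules."""
    return {event: count * SCORING_RULES[event] for event, count in stats.items()}


def total_points(stats: dict) -> float:
    """Total fantasy points for one player's stat line."""
    return sum(score_breakdown(stats).values())


# Illustrative stat line, before and after a hypothetical yardage correction.
original = {"reception": 5, "receiving_yard": 80, "touchdown": 1, "fumble_lost": 1}
adjusted = {**original, "receiving_yard": 76}

print(score_breakdown(original))
print(total_points(original), total_points(adjusted))
```

Comparing the two totals shows exactly how much a stat correction changed the score, which is only possible because the definitions are documented.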
4. Master Data Management
Ensuring accurate master data is crucial to help find players based on their current team, position, etc. This is especially true during a draft, when you have limited time to find players to add to your team. In most cases, we use position and team filters to find the best available player. If Tom Brady is still listed with the Patriots, or better yet, in the wide receiver position, he may not have made my fantasy team.
5. Data Stewards
This one is not nearly as visible as the others. Still, it is hard to believe that data stewardship does not exist in the Fantasy Football world, as the entire game is based on data and statistics. A simple search for your favorite fantasy platform and “data steward” should, in theory, return an active or filled job listing.