Metadata management is a cross-organizational agreement on how to define informational assets so that data can be turned into an enterprise asset. As data volumes and diversity grow, metadata management becomes even more critical for deriving business value from that data.
What is Metadata?
Managing metadata begins with defining metadata correctly. Some choose to call metadata “data about data,” but that is only part of the picture. Some consider metadata to be “what identifies data.” Again, a partial definition. In practical terms, metadata is much more than just the description of data. As the complexity of data increases, metadata assumes new roles every day. In some cases, metadata may be about the business perspective of quarterly sales. In other cases, it may describe source-to-target mappings for the data warehouse. It all comes down to context.
Most metadata initiatives now build on a simple, crisp definition: metadata is data in the context of “who, what, where, why, when, and how.”
| Who | What | Where | Why | When | How |
|---|---|---|---|---|---|
| Who created this data? | What is the business definition of this data element? | Where is this data stored? | Why are we storing this data? | When was this data created? | How is this data formatted? (character, numeric, etc.) |
| Who is the Steward of this data? | What are the business rules for this data? | Where did this data come from? | What is its usage & purpose? | When was this data last updated? | How many databases or data sources store this data? |
| Who is using this data? | What is the security level or privacy level of this data? | Where is this data used & shared? | What are the business drivers for using this data? | | How long should it be stored? |
| Who “owns” this data? | What is the abbreviation or acronym for this data element? | Where is the backup for this data? | | When does it need to be purged/deleted? | |
| Who is regulating or adding this data? | What are the technical naming standards for database implementation? | Are there regional privacy or security policies that regulate this data? | | | |
If you think about metadata as the writing on the front of an envelope, the definition becomes clearer. The writing doesn’t tell you exactly what is inside the envelope, but it gives you a lot of context. It tells you where it is from, who is expected to receive it, how heavy it is, when it was sent, which courier carried it here, and maybe some sort of reference number for you to track the envelope. With this information, you can start to identify the content, what it may mean to you and how you can use it. Most digital metadata contains extra nuggets: the software you use, the top line of content, the IP address you can match with other online activity and more.
The context that metadata conveys is both business and technical, which helps you understand the data and use it appropriately.
- Technical Metadata: Provides information on the format and structure of the data, such as data models, data lineage or access permissions.
- Business Metadata: Defines everyday business terms, such as table and column definitions, business rules, data sharing rules and data quality rules.
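As a rough illustration of how the six questions and the business/technical split might fit together, here is a minimal Python sketch. The `MetadataRecord` class, its field names and the sample values are all hypothetical and not taken from any particular tool; a real catalog would track many more attributes.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class MetadataRecord:
    """Context for one data element. All field names are illustrative."""
    element: str
    # Who
    owner: str
    steward: str
    # What
    business_definition: str
    security_level: str
    # Where
    stored_in: str
    # Why
    purpose: str
    # When
    created_on: date
    purge_after: Optional[date]
    # How
    data_format: str

    def business_view(self) -> dict:
        """Business metadata: definition, purpose and stewardship."""
        return {
            "definition": self.business_definition,
            "purpose": self.purpose,
            "steward": self.steward,
            "security_level": self.security_level,
        }

    def technical_view(self) -> dict:
        """Technical metadata: location and format of the data."""
        return {
            "stored_in": self.stored_in,
            "format": self.data_format,
            "created_on": self.created_on.isoformat(),
        }


# Hypothetical sample values, for illustration only.
record = MetadataRecord(
    element="regional_sales",
    owner="Sales Operations",
    steward="Finance Data Steward",
    business_definition="Net sales booked per sales region per quarter",
    security_level="internal",
    stored_in="warehouse.fact_sales.regional_sales",
    purpose="Quarterly sales reporting",
    created_on=date(2020, 1, 15),
    purge_after=None,
    data_format="numeric",
)

print(record.business_view())
print(record.technical_view())
```

Keeping both views on one record is a design choice for this sketch: it lets a business user and a data engineer look up the same element and each see the context relevant to them.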
Why is metadata management important?
Metadata management is important because you can leverage metadata to understand, aggregate, group and sort data for use. You can also trace many data quality problems back to metadata. As the CEO of an energy company puts it succinctly, “quality metadata makes everything we do easier, from internal communications to planning new applications to making better decisions.”
Metadata users and metadata creators collaborate to improve data quality and data usability
The current need for metadata management is driven by the speed at which businesses are embracing the data culture. They generate enormous amounts of data and also consume it in massive quantities. Metadata management gives clear, rich context to both scenarios, clarifying what data to produce and what data to consume, so that data becomes a valuable enterprise asset.
“Metadata management continues to be a prevalent trend with 84% of respondents indicating that they currently had a metadata initiative in place or had plans for future efforts.”
2020 Trends in the Data Management,
Organizations need metadata management in their data management practice for several reasons:
- Increasing need for data governance, regulatory and compliance requirements and data enablement
- Increasing importance of higher data quality and trusted analytics driving business value from data
- Growing complexity of data, with new sources augmenting the traditional sources
- More business users actively interacting with data
- Increasing need to accelerate transformation efforts, such as digitization, omnichannel enablement and enterprise-resource-planning modernization
What is a metadata management tool?
A metadata management tool provides the common lexicon by which diverse groups of data citizens can converse and efficiently manage their work. For example:
- Business users work with organizational metrics and policies. At almost 80%, they are the largest group of metadata consumers, asking questions such as how regional sales is defined or what the current billing cycle is.
- Business data stewards manage the business metadata, including glossary terms and business rules. They provide support for generating and consuming metadata.
- System data stewards manage the technical metadata and the alignment of systems with business rules.
- Data architects connect business metadata with technical metadata via conceptual and logical models. They are the second-largest group of metadata consumers at 70%, asking questions such as which data structure is approved for storing customer data or where report tables are sourced from.
- Data engineers work with technical metadata and data type standards. They provide support for generating and consuming metadata.
Critical capabilities for metadata management tools
Business metadata such as glossary terms are often not formally documented and available only as tribal knowledge. Technical metadata is automatically loaded most of the time, and manually entered if necessary. At some level, both need to be aligned and made available uniformly. A metadata management tool handles this need efficiently, linking business and technical metadata in one place and providing the required context to all data citizens. It also enables metadata enrichment, which is essential to maximize the value of data.
Gartner notes that metadata management has shifted from focusing on reports, inventories and static impact analysis, toward intelligent optimization, data discovery and use-case analysis.
Some of the critical capabilities you need for a modern metadata management solution are:
- Data inventory with the capability to automatically identify similar attributes, resolve ambiguities and detect relationships with other data assets
- Data enrichment through automatic discovery and user tagging and rating (for example, tagging personally identifiable information or PII)
- Data lineage for identifying data provenance (a historical record of the data and its origins) and impact analysis to identify the downstream impact of a change on any metadata element
- Active metadata management, which includes the extensive use of metadata leading to significant automation through AI/ML to support broader data management activities
- Intuitive user experience along with collaborative workflows to address a wide range of users such as data architects, data engineers, data stewards and data analysts
- Collaborative workflows to help metadata creators understand how the metadata will be consumed so that both metadata creators and consumers can contribute towards improving data quality
- Business semantics to support semantic variations in glossaries and terminology
- Business rules with visibility and the ability to identify exceptions
- Metadata exchange with third-party tools
- Support for security and privacy through rules and policy management
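To make the data lineage capability from the list above more concrete, the following Python sketch treats lineage as a directed graph and walks it in both directions: downstream for impact analysis and upstream for provenance. The asset names, the `LINEAGE` mapping and the helper functions are hypothetical; commercial tools typically harvest these edges automatically rather than from a hand-written dictionary.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the assets listed as its value.
LINEAGE = {
    "crm.customers": ["staging.customers"],
    "erp.orders": ["staging.orders"],
    "staging.customers": ["warehouse.dim_customer"],
    "staging.orders": ["warehouse.fact_sales"],
    "warehouse.dim_customer": ["report.regional_sales"],
    "warehouse.fact_sales": ["report.regional_sales", "report.billing_cycle"],
}


def downstream(graph, start):
    """Impact analysis: every asset reachable from `start` (breadth-first)."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def invert(graph):
    """Reverse every edge so the graph can be walked upstream."""
    reverse = {}
    for parent, children in graph.items():
        for child in children:
            reverse.setdefault(child, []).append(parent)
    return reverse


def provenance(graph, asset):
    """Provenance: every upstream asset that contributes to `asset`."""
    return downstream(invert(graph), asset)


# Which assets are affected if staging.orders changes?
print(sorted(downstream(LINEAGE, "staging.orders")))
# Where does the billing cycle report get its data from?
print(sorted(provenance(LINEAGE, "report.billing_cycle")))
```

The same traversal serves both questions; only the direction of the edges changes, which is why impact analysis and provenance are usually delivered by the same lineage feature.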
Implementing the right metadata management solution will help you move towards Data Intelligence, creating the inventory of enterprise data assets with pillars of data governance, data privacy, data catalog and data lineage.
Implementing metadata management
Implementing metadata management can be easy or complex depending on how you approach your use cases. Data governance and data analysis are identified as the most important use cases for metadata management solutions. Considering that metadata users and sources are very diverse, you will need to align metadata management with your data governance and data analysis strategies.
Data governance is a critical enabler for creating and managing metadata. Data governance goes hand in hand with metadata management to ensure access to trusted data that is correctly understood throughout the lifecycle and used in the right context. Automation and self-service can work only when high quality trusted data is available with a shared understanding of metadata.
The complexity of metadata management implementation varies according to the size and diversity of sources, use cases, users and their roles. It is also impacted by technology, which generates new sources and use cases and also creates opportunities to manage them better.
Metadata management works on three levels; it is important to focus on the linkage between all three.
- Terms: Common business language and definitions. Sources are industry standards, policy manuals, contracts, reference guides and handbooks.
- Attributes: Business resource-specific, such as systems or reports. Sources include data dictionaries, system documentation, and enterprise, conceptual and logical data models.
- Elements: Data resource-specific, such as database tables or reports. Sources include database catalogs, spreadsheets and physical data models.
Implementing metadata management requires that metadata is captured, stored and governed consistently at all three levels. Metadata linkage between levels, and with the top-level domains (such as customers, vendors or products), supports search, navigation and drill-down. Governance processes at every level control how terms are changed, reviewed, validated and certified. You also need enterprise-level policy management to assure quality metadata at all levels.
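The linkage between the three levels can be sketched as a small data model. The Python below is only an illustration under assumed names: the `Term`, `Attribute` and `Element` classes, the `certified` flag and the sample glossary entries are hypothetical and do not describe any specific product.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Element:
    """Data resource level, e.g. a physical table column."""
    name: str


@dataclass
class Attribute:
    """Business resource level, e.g. a field in a logical model."""
    name: str
    elements: List[Element] = field(default_factory=list)


@dataclass
class Term:
    """Business language level, tied to a top-level domain."""
    name: str
    definition: str
    domain: str
    certified: bool = False
    attributes: List[Attribute] = field(default_factory=list)


def drill_down(term: Term) -> List[str]:
    """Follow term -> attribute -> element links to the physical level."""
    return [e.name for a in term.attributes for e in a.elements]


def search(glossary: List[Term], domain: str) -> List[str]:
    """Return the names of certified terms in a top-level domain."""
    return [t.name for t in glossary if t.domain == domain and t.certified]


# Hypothetical glossary entries, for illustration only.
customer_id = Term(
    name="Customer ID",
    definition="Unique identifier assigned to each customer",
    domain="customers",
    certified=True,
    attributes=[
        Attribute(
            name="Customer.identifier",
            elements=[Element("crm.customers.cust_id"),
                      Element("warehouse.dim_customer.customer_key")],
        )
    ],
)
draft_term = Term(
    name="Customer Segment",
    definition="Marketing grouping of customers",
    domain="customers",
)
glossary = [customer_id, draft_term]

print(search(glossary, "customers"))
print(drill_down(customer_id))
```

In this sketch, certification is a simple boolean on the term. In practice the change, review and validation steps would more likely be modeled as a workflow with its own history.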
Metadata management drives business value, improves innovation and collaboration, and helps mitigate risk. It also enables data citizens to access high-quality and trusted data, thus ensuring that they work with the right data to deliver accurate insights.
In the next blog post, we’ll focus on the best practices of metadata management to ensure continued value to your organization.