MindMap Gallery DAMA-CDGA Data Governance Engineer-12. Metadata Management
Metadata management helps organizations understand their own data, systems, and processes, and helps users evaluate data quality. It is indispensable for managing databases and other applications, and it helps process, maintain, integrate, protect, and govern other data.
Edited at 2024-03-05 20:29:49
12. Metadata Management
introduction
Overview
1. Technical perspective: metadata
2. Business perspective: data resource directory
3. Data resource directory ≠ data asset directory
4. Metadata management principle: everything that should be registered is registered and everything that should be collected is collected; that is, the resource directory must be complete
definition
Metadata is "data about data"
Describes
The data itself
Databases, data elements, data models
The concepts the data represents
Business processes, application systems, software code, technical infrastructure
The connections between data and concepts
Relationships
significance
Metadata can help organizations understand their own data, systems, and processes, while helping users evaluate data quality. It is indispensable for the management of databases and other applications.
It helps process, maintain, integrate, protect and govern other data
Without reliable metadata, an organization does not know what data it has, what the data represents, where it comes from, how it moves through systems, who has access to it, or what it means for the data to be of high quality.
Without metadata, organizations cannot manage their data as an asset
In fact, without metadata, organizations may not be able to manage their data at all
business drivers
Data management requires metadata, and metadata itself also needs to be managed
Good management of metadata helps
Improve data trustworthiness by providing context and performing data quality checks
Increase the value of strategic information, such as master data, by extending its use
Improve operational efficiency by identifying redundant data and processes
Prevent the use of outdated or incorrect data
Reduce data research time
Improve communication between data users and IT professionals
Create accurate impact analysis to reduce the risk of project failure
Reduce time to market by shortening system development life cycle time
Reduce training costs and the impact of employee turnover by comprehensively documenting data context, history and provenance
Meet regulatory compliance
Improper metadata management can easily lead to the following problems
Redundant data and data management processes
Duplicate and redundant dictionaries, repositories, and other metadata stores
Inconsistent data element definitions and risk of data misuse
Different versions of metadata are contradictory and conflicting, reducing the confidence of data users
Doubts about the reliability of metadata and data
Good metadata management can ensure a consistent understanding of data resources and more efficient development and use across organizations.
goals and principles
Ultimate Goal: Query and Analysis
Goals
Document and manage the body of knowledge of data-related business terms to ensure that people understand and use data content consistently
Collect and integrate metadata from different sources to ensure people understand the similarities and differences between data from different parts of the organization
Ensure metadata quality, consistency, timeliness and security
Provide a standard way for metadata consumers to access metadata
Promote or enforce the use of technical metadata standards to enable data exchange
Principles
Organizational commitment
strategy
Metadata strategy must be aligned with business priorities
Enterprise perspective
Ensure future scalability from an enterprise perspective, achieved through iterative and incremental delivery
Socialization
Communicating the value of metadata encourages the business to use it and, just as importantly, to contribute business expertise.
access
Ensure employees understand how to access and use metadata
quality
Metadata is usually generated through existing processes (data modeling, SDLC, business process definition), so the process owner is responsible for the quality of the metadata
audit
Develop, implement and review metadata standards to simplify the integration and use of metadata
Improvement
Create a feedback mechanism so that data users can report incorrect or outdated metadata to the metadata management team
basic concept
Metadata vs. Data
Metadata is also a kind of data and should be managed using data management methods.
Type of metadata
business metadata
Focuses primarily on the content and conditions of the data, but also includes details related to data governance
technical metadata
Provides information about the technical details of the data, the systems in which the data is stored, and the processes by which data flows within and between systems
Operational metadata
Describes the details of processing and accessing data
ISO/IEC 11179 metadata registry standard
Provides a framework for defining a metadata registry
Metadata for unstructured data
Essentially, all data has a certain structure, but not all data is recorded in the form of rows and columns in the familiar relational database.
Any data that is not in a database or data file is considered unstructured data
Includes
Descriptive metadata
Structural metadata
Administrative metadata
Bibliographic metadata
Record-keeping metadata
Preservation metadata
Organizations looking to leverage data lakes and use big data platforms such as Hadoop are finding that they must catalog the data they collect so that it can be accessed later.
In most cases, collecting metadata as part of the data collection process means capturing a minimal set of metadata attributes (such as name, format, source, version, date received, etc.), which produces a catalog of the data lake's contents
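The minimal attribute set above can be sketched as a catalog entry registered at ingestion time. This is an illustrative sketch only: `CatalogEntry`, `LakeCatalog`, and the sample data set names are hypothetical, and the in-memory dictionary stands in for whatever catalog store a real data lake would use.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CatalogEntry:
    """Minimal metadata captured when a data set lands in the lake."""
    name: str
    format: str
    source: str
    version: str
    date_received: date


class LakeCatalog:
    """In-memory catalog keyed by (name, version); a stand-in for a real store."""

    def __init__(self) -> None:
        self._entries: dict[tuple[str, str], CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        key = (entry.name, entry.version)
        if key in self._entries:
            raise ValueError(f"{entry.name} v{entry.version} is already cataloged")
        self._entries[key] = entry

    def find_by_source(self, source: str) -> list[CatalogEntry]:
        return [e for e in self._entries.values() if e.source == source]


# Hypothetical data sets registered as they are received.
catalog = LakeCatalog()
catalog.register(CatalogEntry("web_clicks", "parquet", "clickstream", "1", date(2024, 3, 5)))
catalog.register(CatalogEntry("orders", "csv", "erp", "1", date(2024, 3, 5)))
erp_entries = catalog.find_by_source("erp")
```

Registering the entry in the same step as ingestion is what keeps the lake searchable later; data that arrives without an entry cannot be found through the catalog.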
Source of metadata
In-application metadata repository
Metadata repository refers to the physical tables that store metadata, often built into modeling tools, BI tools, and other applications
business glossary
The purpose of a business glossary is to record and store an organization's business concepts, terms, definitions, and the relationships between these terms
As with all data-oriented systems, designing a business glossary should consider hardware, software, databases, processes, and human resources with different roles and responsibilities
The Business Glossary application needs to be built to meet the functional needs of three core users
business user
Data analysts, research analysts, managers, and others who use business glossaries to understand terminology and data
technical user
Technical users use the business glossary to make architecture, systems design, and development decisions, and to conduct impact analysis
Data Management Specialist
Data Management Specialists use business glossaries to manage and define the life cycle of terms and enhance enterprise knowledge by linking data assets to the glossary
business intelligence tools
Business intelligence tools generate various metadata related to business intelligence design
Configuration management tools
Configuration management tools or databases (CMDBs) provide the ability to manage and maintain metadata related to IT assets, the relationships between them, and the contract details of the assets.
Data Dictionary
A data dictionary defines the structure and content of a data set, typically for a single database, application, or data warehouse
Each database has its own data dictionary; a dictionary built for one database does not apply to others.
Data integration tools
Many data integration tools use executables to move data from one system to another, or between different modules within the same system
Database management and system catalog
Database catalogs are an important source of metadata. They describe the contents of the database, along with sizing information, software versions, and other operational metadata attributes.
The most common form of database is relational, which manages data as a set of tables and columns
Metadata solutions should be able to connect to various databases and datasets and read all metadata exposed by the database
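As an illustration of reading metadata exposed by a database, the sketch below queries SQLite's own catalog through Python's standard `sqlite3` module. SQLite is chosen only because it needs no server; other databases expose their catalogs differently. The `customer` table and the `read_table_metadata` helper are hypothetical.

```python
import sqlite3


def read_table_metadata(conn: sqlite3.Connection) -> dict[str, list[dict]]:
    """Return {table_name: [column descriptions]} from SQLite's own catalog."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    metadata = {}
    for table in tables:
        # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        metadata[table] = [
            {
                "column": c[1],
                "type": c[2],
                "nullable": not c[3],
                "primary_key": bool(c[5]),
            }
            for c in columns
        ]
    return metadata


# Hypothetical source database with a single table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT)"
)
technical_metadata = read_table_metadata(conn)
```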
Data mapping management tool
Mapping management tools are used during the analysis and design phases of projects, converting requirements into mapping specifications, which are then used directly by data integration tools or used by developers to generate data integration code
Data quality tools
Data quality tools assess data quality through validation rules
Directories and catalogs
Data dictionaries and glossaries contain detailed information about terms, tables, and fields, whereas directories and catalogs contain information about the systems, sources, and locations of data within an organization.
event messaging tool
Event messaging tools move data between different systems, require extensive metadata, and generate metadata that describes the movement
Modeling tools and repositories
Data modeling tools are used to build various types of data models: conceptual, logical, and physical models
Reference data repositories
Reference data records the business values and descriptions of the various types of enumerated data, for use within the context of a system
Service registries
Other metadata stores
Types of metadata architecture
Centralized metadata architecture
A centralized metadata architecture consists of a single metadata repository that includes copies of metadata from separate sources
Organizations with limited IT resources, or those seeking to automate as much as possible, may choose to avoid this architectural option
Organizations seeking a high degree of consistency within a common metadata repository can benefit from a centralized metadata architecture
Advantages
Highly available because it is independent of the source system
Fast metadata retrieval because repository and query functions are together
Resolved database structures that are not affected by the proprietary nature of third-party or commercial systems
When extracting metadata, you can convert, customize, or supplement it with metadata from other source systems to improve metadata quality.
Disadvantages
Complex processes must be used to ensure that changes in the metadata source are quickly synchronized to the repository
Maintaining a centralized repository can be costly
Extraction of metadata may require custom modules or middleware
process
Centralized repository exposes a portal for end users to submit queries
The metadata portal passes the request to the centralized metadata repository, which will fulfill the request with the collected metadata
Since various metadata is collected in a centralized repository, metadata collected by various tools can be globally searched
Distributed metadata architecture
A fully distributed architecture maintains a single access point
Metadata search engines respond to user requests by retrieving data from source systems in real time
A distributed metadata architecture has no persistent repository
Advantages
Metadata is always as up-to-date and valid as possible because it is retrieved directly from its data source
Querying is distributed, potentially improving response and processing efficiency
Metadata requests from proprietary systems are limited to query processing without requiring detailed knowledge of proprietary data structures, thus minimizing the effort required to implement and maintain
Automated metadata query processing may be simpler to develop, requiring only minimal human intervention
Reduced batch processing, no metadata copying and synchronization process
Disadvantages
User-defined or manually inserted metadata items cannot be supported because there is no repository to place these additions
Metadata from different systems must be presented to users in a uniform, standardized way
Query functionality is affected by source system availability
Governance of metadata depends entirely on the source system
process
There is no centralized metadata repository, the portal passes user requests to the appropriate tool for execution
Since metadata is not collected from various tools for centralized storage and each request must be delegated to the source system, there is no capability for global search across various metadata sources.
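The trade-off between the centralized and distributed processes can be sketched in a few lines. All class names, source names, and metadata values here are hypothetical; the sketch only shows that a centralized portal answers from synchronized copies (so it supports global search and survives a source outage), while a distributed portal delegates each request to a single source in real time.

```python
class SourceSystem:
    """Hypothetical metadata source, e.g. a modeling tool or a BI tool."""

    def __init__(self, name: str, metadata: dict[str, str]) -> None:
        self.name = name
        self.metadata = metadata
        self.available = True

    def query(self, term: str) -> dict[str, str]:
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        return {k: v for k, v in self.metadata.items() if term in k}


class CentralizedPortal:
    """Answers from a repository that holds copies of source metadata."""

    def __init__(self, sources: list[SourceSystem]) -> None:
        self.sources = sources
        self.repository: dict[str, str] = {}

    def synchronize(self) -> None:
        # Copies must be refreshed whenever a source changes.
        for source in self.sources:
            for key, value in source.metadata.items():
                self.repository[f"{source.name}.{key}"] = value

    def search(self, term: str) -> dict[str, str]:
        # Global search: one query covers metadata from every source.
        return {k: v for k, v in self.repository.items() if term in k}


class DistributedPortal:
    """Keeps no copy; delegates each request to one source in real time."""

    def __init__(self, sources: list[SourceSystem]) -> None:
        self.sources = {s.name: s for s in sources}

    def query(self, source_name: str, term: str) -> dict[str, str]:
        return self.sources[source_name].query(term)


crm = SourceSystem("crm", {"customer_id": "Unique customer key"})
bi = SourceSystem("bi", {"customer_count": "Distinct customers per month"})
central = CentralizedPortal([crm, bi])
central.synchronize()
distributed = DistributedPortal([crm, bi])

crm.available = False  # simulate a source outage
central_hits = central.search("customer")  # still answered from the copies
try:
    distributed_hits = distributed.query("crm", "customer")
except ConnectionError:
    distributed_hits = None  # the distributed query depends on the source
```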
Hybrid metadata architecture
A hybrid architecture combines characteristics of the centralized and distributed architectures. Metadata still moves directly from the source systems into a centralized repository, but the repository design covers only user-added metadata, important standardized metadata, and metadata added from manual sources
The architecture benefits from near real-time retrieval of metadata from its source, together with enhanced metadata that serves user needs when required.
Metadata is as current and valid as possible at the time of use, based on user priorities and requirements
Hybrid architecture does not improve system availability
Beneficial for organizations with rapidly changing operational metadata that require a consistent, unified metadata organization, and where metadata and metadata sources are growing substantially
For organizations with mostly static metadata or small metadata increments, their potential may not be realized.
Bi-directional metadata architecture
It allows metadata to be changed in any part of the architecture (source, data integration, user interface), and the repository (broker) then coordinates feeding the change back to its original source
Activities
Define a metadata strategy
Understand metadata requirements
Define metadata architecture
Create metamodel
Creating a data model for the metadata repository, also called a metamodel, is the first design step after defining a metadata strategy and understanding business requirements.
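A metamodel can be very small to start with. The sketch below is one possible shape, not a prescribed design: the three entity types (`BusinessTerm`, `Column`, `Table`) and the sample values are assumptions chosen to show how a metamodel links business metadata to technical metadata.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BusinessTerm:
    """Business metadata: a glossary term and its definition."""
    name: str
    definition: str


@dataclass
class Column:
    """Technical metadata for one column, optionally linked to a term."""
    name: str
    data_type: str
    term: Optional[BusinessTerm] = None


@dataclass
class Table:
    """Technical metadata for one table in a named system."""
    name: str
    system: str
    columns: list[Column] = field(default_factory=list)

    def columns_for(self, term_name: str) -> list[str]:
        """Names of the columns that implement the given business term."""
        return [c.name for c in self.columns if c.term and c.term.name == term_name]


# Hypothetical content loaded into the metamodel.
customer_term = BusinessTerm("Customer", "A party that has purchased at least once")
customer_table = Table(
    name="dim_customer",
    system="warehouse",
    columns=[
        Column("customer_key", "INTEGER", term=customer_term),
        Column("load_ts", "TIMESTAMP"),
    ],
)
linked_columns = customer_table.columns_for("Customer")
```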
Apply metadata standards
Manage metadata storage
Create and maintain metadata
Integrate metadata
Repository scanning can be done in two different ways
Proprietary interface
Single step approach
The scanner collects metadata from the source system and directly calls the format-specific loader to load the metadata into the metadata store.
In this process, there is no need to output any intermediate metadata files, and the collection and loading of metadata is also completed in one step.
Semi-proprietary interface
Use a two-step approach
The scanner collects metadata from the source system and outputs it to a data file in a specific format
The scanner only produces data files that the target repository can read and load correctly
Data files can be read in multiple ways, so the architecture of this interface is more open
A non-persistent metadata staging area can be used to store temporary and backup files. The staging area should support rollback and recovery processing and provide an interim audit trail to help repository administrators investigate metadata source or quality issues
The temporary storage area can be in the form of a file directory or database
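The two-step approach can be sketched as a scanner that writes an intermediate file and a separate loader that reads it. The JSON layout, the file name `scan_output.json`, and the dictionary used as a repository are all assumptions for illustration; SQLite stands in for the source system, and a temporary directory plays the role of the non-persistent staging area.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def scan_to_file(source: sqlite3.Connection, staging_dir: Path) -> Path:
    """Step 1: collect metadata from the source and write an exchange file."""
    rows = source.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    payload = [{"table": name, "ddl": sql} for name, sql in rows]
    out = staging_dir / "scan_output.json"
    out.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return out


def load_from_file(scan_file: Path, repository: dict[str, str]) -> int:
    """Step 2: read the exchange file and load it into the repository."""
    records = json.loads(scan_file.read_text(encoding="utf-8"))
    for record in records:
        repository[record["table"]] = record["ddl"]
    return len(records)


# Hypothetical source system with one table.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")

repository: dict[str, str] = {}
with tempfile.TemporaryDirectory() as staging:  # non-persistent staging area
    scan_file = scan_to_file(source, Path(staging))
    loaded = load_from_file(scan_file, repository)
```

Because the scanner and loader share only a file format, either side can be replaced independently, which is what makes this interface more open than the single-step one.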
Distribute and deliver metadata
Query, report, and analyze metadata
tool
The primary tool for managing metadata is the metadata repository
Metadata management tools provide the ability to manage metadata in a centralized repository location
Metadata can be entered manually or extracted from various other sources through specialized connectors
The metadata repository also provides functionality for exchanging metadata with other systems
method
Data lineage and impact analysis
An important aspect of discovering and recording metadata for data assets is that it provides information about how data moves between systems.
The limitation of data lineage creation lies in the coverage of the metadata management system
Function-specific metadata repositories or data visualization tools provide data lineage information within their management scope, and will not be able to provide relevant information beyond their management scope.
The metadata management system imports the "as implemented" lineage from tools that can provide lineage detail, and supplements it with the "as designed" lineage from documentation where implementation details cannot be extracted automatically.
The process of joining the various parts of the data lineage is called "stitching", and the result of the stitching is a panoramic view that represents the movement of the data from its original location to its final location.
To discover data lineage successfully, take both a business focus and a technical focus into account.
business focus
Find the lineage of data elements based on business priorities
Trace back from the target location to the source system where the specific data originated
technology focus
Start with the source system to identify directly related data users, and then identify indirect data users until all systems are identified.
Data lineage
from bottom to top
Impact Analysis
from top to bottom
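Stitching, lineage tracing, and impact analysis can be sketched as operations on a directed graph. The lineage fragments and system names below are hypothetical; the sketch assumes each tool reports lineage as (upstream, downstream) pairs. Lineage follows the edges backward from a target, and impact analysis follows them forward from a source.

```python
from collections import defaultdict, deque

# Hypothetical lineage fragments, each a list of (upstream, downstream) edges.
etl_tool_lineage = [
    ("crm.customer", "staging.customer"),
    ("staging.customer", "dw.dim_customer"),
]
bi_tool_lineage = [
    ("dw.dim_customer", "report.churn"),
    ("dw.fact_sales", "report.churn"),
]


def stitch(*fragments):
    """Join lineage fragments into downstream and upstream adjacency maps."""
    downstream, upstream = defaultdict(set), defaultdict(set)
    for fragment in fragments:
        for src, dst in fragment:
            downstream[src].add(dst)
            upstream[dst].add(src)
    return downstream, upstream


def reachable(start, graph):
    """Breadth-first traversal returning every node reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


downstream, upstream = stitch(etl_tool_lineage, bi_tool_lineage)

# Lineage: trace back from a target to the systems its data came from.
lineage_of_report = reachable("report.churn", upstream)

# Impact analysis: start at a source and find everything that consumes it.
impact_of_crm = reachable("crm.customer", downstream)
```

Neither fragment alone connects `crm.customer` to `report.churn`; the path only appears after stitching, which reflects the coverage limitation described above.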
Metadata applied to big data collection
Whether internal or external, there is no need to move data to the same physical location
With newer technologies, the program is brought to the data rather than the data being moved to the program, which reduces large-scale data movement and speeds up program execution.
Implementation Guide
Readiness Assessment/Risk Assessment
Organizational and cultural change
Metadata governance
process control
Documentation for Metadata Solutions
Metadata Standards and Guidelines
Metrics
Metadata repository completeness
Metadata management maturity
Steward representation
Metadata usage
Business glossary activity
Master Data service data compliance
Metadata documentation quality
Metadata repository availability