MindMap Gallery DAMA-CDGA Data Governance Engineer-10. Reference Data and Master Data
Reference data and master data management ensures that the organization has complete, consistent, up-to-date and authoritative reference data and master data across all processes, and reduces the cost and complexity of data usage and data integration by adopting standard, common data models and integration patterns.
Edited at 2024-03-05 20:27:15
10. Reference data and master data
Introduction
Overview
1. Master data: For shared data, improve data quality by establishing data standards.
2. Difficulty of master data: how to identify master data
3. How to identify
Whether the entity is shared
Whether its attributes are important and relatively stable
Background
In any organization, there is data that needs to be used across business areas, processes and systems
If this data is shared, the entire organization and its customers benefit
Data-driven organizational activities often focus on transactional data (increase sales or market share, reduce costs, demonstrate compliance, etc.), but the ability to leverage such transactional data is highly dependent on the availability and quality of reference and master data
Master Data Management Drivers
Meet organizational data needs
Multiple business areas in the organization need access to the same data sets and they trust that these data sets are complete, up-to-date, and consistent
Master data is the basis for these data sets
Manage data quality
Master data management defines entities that are critical to the organization through the use of a unified identity
Manage data integration costs
Integrating new data sources into an already complex environment is more costly without master data
Master data reduces the additional costs that arise from variations in how key entities are defined and identified
Reduce risk
Master data simplifies data sharing architecture, thereby reducing risk
Reference Data Management Drivers
Meet the data needs of multiple projects and reduce the risk and cost of data integration by using consistent reference data
Improve data quality
Goals
Ensure the organization has complete, consistent, up-to-date and authoritative reference and master data across all processes
Encourage enterprises to share reference data and master data among various business units and application systems
Reduce the cost and complexity of data usage and data integration by adopting standard, common data models and integration patterns
Principles
Shared data
In order for reference data and master data to be shared across an organization, these data must be managed
Ownership
Reference data and master data ownership should belong to the organization, not to a system or department
Because it needs to be widely shared, global organizational management is needed
Quality
Reference data and master data require continuous data quality monitoring and governance
Stewardship
Business data stewards are responsible for controlling and ensuring the quality of reference data
Controlled change
At a given point in time, master data values should represent the organization's best understanding of what is accurate and up-to-date
Matching rules that change data values should be applied with caution and under appropriate oversight
Any operation that merges or splits master and reference data should be traceable
Changes to reference data should follow a clear process: changes should be communicated and approved before they are implemented
Authority
Master data values should only be replicated from the system of record
In order to share master data across the organization, a system of reference may need to be established
Basic concepts
The difference between master data and reference data
Master data
Master data requires identifying or developing a trusted version of the truth for each instance of a conceptual entity and maintaining the currency of that version
The challenge with master data is entity resolution, which is the process of identifying and managing associations between data from disparate systems and processes
The entity instance represented by each row of master data is expressed differently in different systems.
Master data management works to resolve these differences so that individual entity instances can be consistently identified in different environments.
It should be noted that this process must be continuously managed to keep the identities of these master data entities and instances consistent.
Similarities
Both provide important contextual information for the creation and use of transactional data (reference data also provides context for master data) so that the meaning of the data can be understood
Both are shared resources managed at the enterprise level
Having multiple instances of the same reference data will reduce efficiency and inevitably lead to inconsistencies between instances. Inconsistency will lead to ambiguity, and ambiguity will bring risks to the organization.
Differences
Reference data does not change easily, and its data set is usually smaller, less complex, and has fewer columns and rows than a transaction data set or a master data set.
Reference data management does not include the challenges of entity resolution
Different management priorities
Reference data management
Need to control the defined domain values and their definitions
The goal is to ensure that organizations have access to a complete set of accurate and up-to-date values for each concept
Master data management
The values and identifiers of master data need to be controlled so that the most accurate and timely data from core business entities can be used consistently across systems
Goals include ensuring the accuracy and usability of current values while mitigating the risks associated with ambiguous identifiers
One challenge in reference data management is who leads or is responsible for the definition and maintenance of reference data
Some reference data originates outside the organization that uses it; some crosses boundaries within the organization and is not owned by a single department
Other reference data may be created and maintained within a department, but has potential value in other parts of the organization
Identifying responsibilities for acquiring data and managing updates is part of reference data management
Lack of maintenance accountability creates risks as discrepancies in reference data may lead to misunderstanding of data context
Because master and reference data provide contextual information for transactions, they shape the transactional data that enters the organization as it operates, and they frame the analysis performed on that transactional data.
Reference data
Any data that can be used to describe or classify other data, or to connect data to information outside the organization
Reference data management requires the control and maintenance of defined domain values, their definitions, and the relationships within and across domain values
The goal of reference data management is to ensure that referenced values across different functions are consistent, up-to-date, and accessible within the organization
Like other data, reference data requires metadata
An important metadata attribute of reference data is its source, such as the governing body for industry standard reference data.
Reference data structure
list
The simplest reference data is a list of code values and code descriptions
Cross-reference data list
Different applications can use different code sets to represent the same concept
Cross-reference data sets can convert between code values
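As an illustration of how a cross-reference data set converts between code values, the following is a minimal sketch rather than a prescribed design. The system names `crm` and `erp`, the `CROSSWALK` table, and both function names are assumptions made for the example; the country codes themselves are the ISO 3166-1 alpha-2, alpha-3, and numeric codes for the United States and Germany.

```python
# Hypothetical cross-reference list: (system, local code) -> canonical code.
# Here the "crm" system is assumed to use ISO 3166-1 alpha-2 codes and the
# "erp" system numeric codes; the canonical form is the alpha-3 code.
CROSSWALK = {
    ("crm", "US"): "USA",
    ("crm", "DE"): "DEU",
    ("erp", "840"): "USA",
    ("erp", "276"): "DEU",
}


def to_canonical(system: str, code: str) -> str:
    """Translate a system-specific code into the canonical code."""
    try:
        return CROSSWALK[(system, code)]
    except KeyError:
        raise KeyError(f"no mapping for code {code!r} in system {system!r}") from None


def translate(code: str, source: str, target: str) -> str:
    """Translate a code from one system's code set into another's."""
    canonical = to_canonical(source, code)
    for (system, local_code), canon in CROSSWALK.items():
        if system == target and canon == canonical:
            return local_code
    raise KeyError(f"system {target!r} has no code for {canonical!r}")


print(translate("US", "crm", "erp"))
```

Routing every translation through one canonical code means each system needs only a single mapping, instead of one mapping per pair of systems.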
taxonomy
Taxonomic reference data structures capture information at different levels of specificity.
Taxonomic reference data can be stored in a recursive relationship
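A recursive relationship here means that each code points to its parent code in the same list. The sketch below is one simple way to picture that, assuming a hypothetical `TAXONOMY` table of region codes; the table contents and function names are illustrative only.

```python
# Hypothetical taxonomy: code -> (description, parent code).
# The parent code refers back to another row of the same table,
# which is the recursive relationship; the root has no parent.
TAXONOMY = {
    "GEO": ("All regions", None),
    "EU": ("Europe", "GEO"),
    "DE": ("Germany", "EU"),
    "FR": ("France", "EU"),
}


def path_to_root(code: str) -> list[str]:
    """Return the code followed by its ancestors, most specific first."""
    path, seen = [], set()
    current = code
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at {current!r}")
        seen.add(current)
        path.append(current)
        current = TAXONOMY[current][1]
    return path


def descendants(code: str) -> list[str]:
    """Return every code below the given code, at any depth."""
    result = []
    for child, (_, parent) in TAXONOMY.items():
        if parent == code:
            result.append(child)
            result.extend(descendants(child))
    return result


print(path_to_root("DE"))
print(sorted(descendants("GEO")))
```

Walking upward supports roll-up reporting at a coarser level, while walking downward supports drilling into more specific values.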
Ontology
Some organizations use the ontologies used to manage website content as part of their reference data. This is because ontology models are also used to describe other data or to connect organizational data with information outside the organizational boundaries.
An ontology model can be understood as a form of metadata
Best practices for maintaining ontologies are similar to best practices for reference data management
One of the main use cases for ontologies is content management
Proprietary or internal reference data
Many organizations create reference data to support internal processes and applications
Industry reference data
Used to describe a data set created and maintained by an industry association or government agency rather than by an organization to provide a common standard for encoding important concepts
For example, the International Classification of Diseases codes (ICD) provide a common way to classify health conditions and treatments.
Geographic or geostatistical reference data
Can be classified or analyzed based on geographic information
For example, census bureau reports on population density, or historical weather information mapped to strict geographic categories
Computational reference data
Many commercial activities rely on the use of some common, continuously calculated data
For example, foreign exchange calculations rely on well-managed, up-to-date exchange rate tables
The main difference between computational reference data and other types of reference data is how often it changes
Metadata for standard reference data sets
Like other data, reference data changes over time
Because reference data is used throughout an organization, it is important to maintain key metadata about reference data sets to ensure that their lineage and currency are understood and maintained.
Master data
Master data is data about business entities that provides contextual information for business transactions and analysis
An entity is a real-world object (a person, organization, place, or thing)
Entities are represented by entity instances, in the form of data and records.
Master data should represent the authoritative, most accurate data related to key business entities
system of record, system of reference
When different versions of a "truth" may exist, it is necessary to distinguish between them
In order to do this, it is necessary to know where the data comes from or is accessed, and the specific use and purpose for which it was prepared.
A system of record is an authoritative system that creates, captures, and maintains data using a defined set of rules and expectations
A system of reference is also an authoritative system. Data consumers can retrieve reliable data from the system of reference to support transactions and analysis, even if the information does not originate from the system of reference.
Master data management (MDM) applications, data sharing hubs (DSH) and data warehouses (DW) are often used as systems of reference
Trusted Source, Golden Record
Trusted sources are considered the “best version of the truth”
Among trusted sources, the record that represents the most accurate data for an entity or instance can be called a golden record
Master data management
Master data management controls master data values and identifiers so that the most accurate and timely data about core business entities can be used consistently across systems.
Goals include ensuring the availability of accurate, up-to-date values while mitigating the risk of ambiguous identifiers
Steps
Identify candidate data sources that provide a comprehensive view of master data entities
Develop rules for accurately matching and merging entity instances
Establish methods to identify and restore data that was inappropriately matched or merged
Establish a way to distribute trusted data to systems across the enterprise
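The matching, merging, and restoring steps above can be pictured with a deliberately simplified sketch. Everything in it is an assumption made for illustration: the `SourceRecord` fields, the choice of a normalized email address as the only match rule, and the "most recent non-empty value wins" survivorship rule. Real matching rules are usually richer and often probabilistic.

```python
from dataclasses import dataclass


@dataclass
class SourceRecord:
    system: str      # contributing source system (hypothetical names)
    record_id: str   # identifier local to that system
    name: str
    email: str
    phone: str
    updated: str     # ISO date, so string order equals date order


def match_key(rec: SourceRecord) -> str:
    """Assumed match rule: same email after trimming and lowercasing."""
    return rec.email.strip().lower()


def survive(newest_first: list[SourceRecord], attr: str) -> str:
    """Assumed survivorship rule: most recent non-empty value wins."""
    return next((getattr(r, attr) for r in newest_first if getattr(r, attr)), "")


def merge(records: list[SourceRecord]) -> list[dict]:
    """Group matching records and build one merged record per group."""
    groups: dict[str, list[SourceRecord]] = {}
    for rec in records:
        groups.setdefault(match_key(rec), []).append(rec)

    merged = []
    for key, group in groups.items():
        newest_first = sorted(group, key=lambda r: r.updated, reverse=True)
        merged.append({
            "email": key,
            "name": survive(newest_first, "name"),
            "phone": survive(newest_first, "phone"),
            # Keeping the contributing records makes the merge traceable,
            # so an incorrect match can be identified and split again.
            "sources": [(r.system, r.record_id) for r in group],
        })
    return merged
```

The `sources` list is the part that supports the restore step: without a record of which source rows fed each merged record, a bad merge cannot be reliably undone.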
Key Processing Steps in Master Data Management
Data model management
Data acquisition
Data validation, standardization and data enrichment
Entity resolution and identifier management
Party master data
Data about individuals, organizations and the roles they play in business relationships
Financial master data
Includes data about business units, cost centers, profit centers, general ledger accounts, budgets, plans, and projects
Legal master data
Includes data regarding contracts, regulations and other legal matters
Product master data
Product Lifecycle Management (PLM)
Product Data Management (PDM)
Enterprise Resource Planning (ERP)
Manufacturing Execution System (MES)
Customer Relationship Management (CRM)
Location master data
Provides the ability to track and share geographic information and create hierarchies or maps based on geographic information
Industry master data: reference directories
A reference directory is an authoritative list of master data entities (companies, people, products, etc.) that organizations can purchase and use as the basis for transactions
Data sharing architecture
Registry
The registry is an index to master data records in various systems of record
The systems of record manage the master data local to their applications, and master data is accessed through the master index
A registry is relatively easy to implement because it requires few changes to the system of record
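To make the registry idea concrete, the sketch below keeps only an index centrally and leaves the data in the systems of record. The system names, identifiers, and field values are all invented for the example, and the in-memory dictionaries stand in for what would be separate applications in practice.

```python
# Hypothetical systems of record: each keeps its own master data locally.
SYSTEMS = {
    "crm": {"c1": {"name": "Ann Lee", "email": "ann.lee@example.com"}},
    "billing": {"b9": {"name": "Ann Lee", "address": "1 Main St"}},
}

# The registry holds no attribute values, only a master index:
# global identifier -> where the matching records live.
MASTER_INDEX = {
    "P-001": [("crm", "c1"), ("billing", "b9")],
}


def lookup(master_id: str) -> list[dict]:
    """Resolve a global identifier by fetching each record from its system of record."""
    return [
        {"system": system, "local_id": local_id, **SYSTEMS[system][local_id]}
        for system, local_id in MASTER_INDEX[master_id]
    ]


for record in lookup("P-001"):
    print(record)
```

Because the index only points at records, the systems of record need few changes, but every lookup depends on those systems being reachable and nothing reconciles conflicting values between them.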
Transaction Hub
In this approach, each application interacts with a central hub to access and update master data.
Master data exists within the Transaction Hub and not in any other application
The Transaction Hub is the system of record for master data
Hybrid (consolidated) approach
The hybrid approach combines the Registry and Transaction Hub approaches
The systems of record manage the master data local to their applications
Master data is consolidated in a common repository and shared through a data sharing hub, eliminating the need for direct access to the systems of record
Activity
Master data management activities
Identify drivers and needs
Evaluate and assess data sources
Define architectural approach
Model master data
Define stewardship responsibilities and maintenance processes
Establish governance policies to promote the use of master data
Reference data management activities
Define drivers and needs
Evaluate data sources
Define architectural approach
Model reference data
Define stewardship responsibilities and maintenance processes
Establish reference data governance policies
Tools and methods
Master data management can be implemented through data integration tools, data remediation tools, operational data stores (ODS), data sharing hubs (DSH) or specialized master data management applications
Implementation Guide
Follow the master data architecture
Establishing and following an appropriate reference architecture is critical to managing and sharing master data across the organization
Monitor data movement
As data flows across reference and master data shared environments, the associated data flows should be monitored for the following purposes
Show how data is shared and used across the organization
Identify lineage of data in management systems and applications
Assist in root cause analysis of problems
Demonstrate the effectiveness of data ingestion and consumption integration techniques
Show the latency of data values from source systems through to consumption
Determine the validity of business rules and transformations performed in integration components
Manage reference data changes
Requests for changes to reference data should follow an established process
Data sharing agreement
To ensure appropriate access and use, a sharing agreement should be established that stipulates which data can be shared and under what conditions
Organizational and cultural change
Reference data and master data governance
Governance process decisions
Metrics
Data quality and compliance
Data change activity
Auditing the lineage of trusted data is necessary to improve data governance in data sharing environments
Data acquisition and consumption
These metrics should show and track which systems are contributing data and which business areas are subscribing to data in a shared environment
Service level agreement
SLAs should be established and communicated to contributors and subscribers to ensure usage and adoption across the data sharing environment
Data steward coverage
These metrics should identify the person or team responsible for the data content and show how often coverage is evaluated
Total cost of ownership
Costs can include environment infrastructure, software licenses, support staff, consulting fees, training, etc.
Data sharing volume and usage
Data ingestion and consumption volumes need to be tracked to determine the effectiveness of the data sharing environment