MindMap Gallery Data quality
In the DAMA knowledge system, data quality is the extent to which data serves the purposes of data consumers and meets the specific needs of business scenarios in a business environment.
Edited at 2024-04-07 10:12:18
Data quality
introduction
Factors leading to low-quality data
1. Organizations lack understanding of the impact of low-quality data
2. Lack of planning
3. Siloed (island) systems
4. Inconsistent development process
5. Incomplete documentation
6. Lack of standards or lack of governance, etc.
Data quality management is not a project, but an ongoing effort
context diagram
definition
The planning, implementation, and control of activities that apply quality management techniques to data, in order to ensure that data is fit for consumption and meets the needs of data consumers
Goals
Develop a management approach that makes data fit for purpose based on the needs of data users
Define standards, requirements and specifications for data quality control as part of the data lifecycle
Define and implement processes for measuring, monitoring and reporting data quality levels
Identify and advocate for opportunities to improve data quality through process and system improvements
Inputs
Data policies and standards
Data quality expectations
Business needs
Business Rules
Data requirements
business metadata
technical metadata
Data sources and data storage
Data lineage
Activity
1. Define high-quality data (P)
2. Define data quality strategy (P)
3. Identify key data and business rules (P)
1) Identify key data
2) Identify existing rules and patterns
4. Perform an initial data quality assessment (P)
1) Identify and prioritize problems
2) Perform root cause analysis of the problems
5. Identify and prioritize improvements
1) Prioritize actions based on business impact
2) Develop preventive and corrective measures
3) Confirm planned actions
6. Define data quality operations (P)
7. Develop and deploy data quality operations (D)
1) Develop data quality operating rules
2) Correct data quality defects
3) Measure and monitor data quality
4) Report data quality levels and findings
Deliverables
Data quality strategy and framework
Data Quality Planning Organization
Data profiling
Recommendations based on problem root cause analysis
Data quality management procedures
Data quality report
Data Quality Governance Report
Data Quality Service Level Agreement
Data policies and guidelines
method
Perform multiple cross-checks of the data
Label and annotate data issues
Root Cause Analysis
statistical process control
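As a rough illustration of statistical process control applied to data quality, the following Python sketch establishes mean and 3-sigma control limits from a baseline period of daily error rates, then flags new measurements that fall outside those limits. The error-rate figures, dates, and the choice of 3-sigma limits are illustrative assumptions, not values from the source.

```python
import statistics

# Baseline period used to establish control limits (assumed to be stable).
baseline_error_rates = [0.021, 0.018, 0.025, 0.019, 0.022, 0.020, 0.023, 0.017]
# New daily measurements to monitor against those limits (hypothetical).
new_error_rates = {"2024-04-01": 0.022, "2024-04-02": 0.074, "2024-04-03": 0.019}

mean = statistics.mean(baseline_error_rates)
sigma = statistics.pstdev(baseline_error_rates)
ucl = mean + 3 * sigma            # upper control limit
lcl = max(mean - 3 * sigma, 0.0)  # lower control limit; an error rate cannot be negative

for day, rate in new_error_rates.items():
    in_control = lcl <= rate <= ucl
    print(f"{day}: error rate {rate:.3f} -> {'in control' if in_control else 'OUT OF CONTROL'}")
```

Points flagged as out of control indicate special-cause variation and are candidates for root cause analysis rather than routine correction.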
tool
Data analysis and query tools
Data quality rules template
QA and audit code module
metadata repository
Metrics
Governance Consistency Metrics
Data quality measurements
Data quality trends
Data Issue Management Metrics
Context diagram: Data quality
business drivers
1. Opportunities to increase organizational data value and data utilization
2. Reduce risks caused by low-quality data
3. Improve organizational efficiency and productivity
4. Protect and enhance the organization’s reputation
Goals and principles
management objectives
1) Based on the needs of data consumers, develop a governed approach to make data fit for purpose.
2) Define standards and specifications for data quality control and make them part of the entire data life cycle.
3) Define and implement processes for measuring, monitoring, and reporting data quality levels.
Guiding principles for management
1) Importance.
Data quality management should focus on the data that is most important to the business and its customers, and improvements should be prioritized based on the importance of the data and the level of risk if the data is incorrect.
2) Full life cycle management.
Data quality management should cover the entire data life cycle, from creation or acquisition through to disposal, including data as it flows within and between systems (each link in the data chain should ensure high-quality data output).
3) Prevention.
The focus of a data quality program should be on preventing data errors and situations that reduce data availability, not on simply correcting records.
4) Root cause correction.
Improving data quality is about more than correcting errors. Because data quality problems are often rooted in process or system design, improving data quality usually requires changes to the processes and the systems that support them, not just one-off fixes to individual records.
5) Governance.
Data governance activities must support the development of high-quality data, and data quality planning activities must support and sustain a governed data environment.
6) Standards-driven.
All stakeholders in the data lifecycle will have data quality requirements. Where possible, quantifiable data quality requirements should be defined in the form of measurable standards and expectations.
7) Objective measurement and transparency.
Data quality levels need to be measured objectively and consistently. Measurement processes and measurement methods should be discussed and shared with stakeholders as they are the arbiters of quality.
8) Embed business processes.
Business process owners are responsible for the quality of the data generated through their processes, and they must implement data quality standards into their processes.
9) System enforcement.
System owners must enforce data quality requirements on the system.
10) Related to service level.
Data quality reporting and issue management should be integrated into service level agreements (SLAs)
basic concept
Data quality
The term data quality refers both to the relevant characteristics of high-quality data and to the processes used to measure or improve data quality.
key data
Most organizations have large amounts of data, but not all data is equally important. A principle of data quality management is to focus improvements on the data that is most important to the organization and its customers
Assess critical data requirements
1. Regulatory reporting
2. Financial reporting
3. Business policy
4. Ongoing operations
5. Business strategy, especially differentiated competitive strategy
Data quality dimensions
A data quality dimension is a measurable characteristic of the data
The three most influential frameworks
The Strong-Wang framework focuses on data consumers’ perspectives on data
In his book "Data Quality in the Information Age", Thomas Redman developed a set of data quality dimensions based on data structure. Redman defines a data item as a "representable triple": a value from the domain of an attribute within an entity.
In his book "Improving Data Warehousing and Business Information Quality", Larry English proposed a comprehensive set of indicators, divided into two categories: inherent characteristics and pragmatic characteristics. Inherent characteristics are independent of how the data is used. Pragmatic characteristics are associated with data presentation and are dynamic; their quality value depends on how the data is used.
6 core dimensions of data quality
1) Completeness. Stored data volume as a percentage of potential data volume
2) Uniqueness. No entity instance (thing) is recorded more than once, based on how that thing is identified.
3) Timeliness. The extent to which the data represents reality from the requested point in time.
4) Validity. Data is valid if it conforms to its defined syntax (format, type, range).
5) Accuracy. The extent to which the data accurately describes the "real world" object or event being described.
6) Consistency. The absence of difference when comparing two or more representations of a thing against a definition.
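The following minimal Python sketch shows how three of these dimensions (completeness, uniqueness, validity) might be measured over a record set. The record layout, the phone-format rule, and the choice to score each dimension as a simple proportion are illustrative assumptions, not definitions from the source.

```python
import re

# Hypothetical customer records used only to illustrate the measurements.
records = [
    {"id": 1, "email": "a@example.com", "phone": "555-0100"},
    {"id": 2, "email": None,            "phone": "555-0101"},
    {"id": 2, "email": "c@example.com", "phone": "5550102"},  # duplicate id, malformed phone
]
PHONE_PATTERN = re.compile(r"^\d{3}-\d{4}$")  # assumed syntax rule for the phone field

# Completeness: proportion of populated values against the potential 100%.
populated = sum(value is not None for record in records for value in record.values())
completeness = populated / (len(records) * len(records[0]))

# Uniqueness: no entity instance recorded more than once, based on its identifier.
uniqueness = len({record["id"] for record in records}) / len(records)

# Validity: values conform to the defined syntax (format, type, range).
validity = sum(bool(PHONE_PATTERN.match(record["phone"])) for record in records) / len(records)

print(f"completeness={completeness:.2f} uniqueness={uniqueness:.2f} validity={validity:.2f}")
```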
Data quality and metadata
Metadata is critical to managing data quality. Data quality is about meeting expectations, and metadata is the primary means of clarifying expectations.
Data quality improvement life cycle
Common methods to improve data quality
The Deming Circle is a problem-solving model known as "plan-do-check-act"
P: Plan
D: Do
C: Check
A: Act
Data quality business rule types
1) Definitional conformance.
Confirm that data definitions are understood identically and are implemented and used correctly throughout the organization;
2) Value existence and record completeness.
Rules defining whether missing values are acceptable
3) Format compliance.
Values assigned to a data element must conform to a specified pattern, such as a standard format for telephone numbers
4) Value range matching.
The value assigned to a data element must come from an enumerated value domain. For example, reasonable values for a US state field are the two-character USPS state abbreviations.
5) Scope consistency.
The value assigned to a data element must fall within a defined numeric, lexicographic, or time range, such as a numeric range greater than 0 and less than 100.
6) Mapping consistency.
The value assigned to a data element must correspond to a value selected from a value domain that maps to another equivalent value domain.
7) Consistency rules.
Conditional assertions about the relationship between two (or more) attributes, based on the actual values of those attributes.
8) Accuracy verification.
Compare the data value to the corresponding value in the system of record or other verification source (such as marketing data purchased from the vendor) to verify that the values match.
9) Uniqueness verification.
Rules that specify which entities must have unique representations and have exactly one record for each representation of a real-world object.
10) Timeliness verification.
Rules indicating characteristics expected to be associated with data accessibility and usability.
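The sketch below expresses a few of these rule types (value existence and record completeness, format compliance, value range matching, range consistency) as executable checks. The field names, the state-code excerpt, and the age range are hypothetical and serve only to illustrate the idea.

```python
import re

US_STATE_CODES = {"CA", "NY", "TX", "WA"}          # excerpt of an enumerated value domain
PHONE_FORMAT = re.compile(r"^\d{3}-\d{3}-\d{4}$")  # assumed format-compliance pattern

def check_record(rec: dict) -> list[str]:
    """Return the names of the data quality rules this record violates."""
    violations = []
    if not rec.get("customer_id"):                                 # value existence / completeness
        violations.append("completeness: customer_id missing")
    if rec.get("phone") and not PHONE_FORMAT.match(rec["phone"]):  # format compliance
        violations.append("format: phone does not match NNN-NNN-NNNN")
    if rec.get("state") not in US_STATE_CODES:                     # value range matching
        violations.append("value domain: state is not a valid USPS code")
    if not 0 < rec.get("age", 0) < 130:                            # range consistency
        violations.append("range: age outside (0, 130)")
    return violations

print(check_record({"customer_id": "C-17", "phone": "415-555-0123", "state": "CA", "age": 42}))
print(check_record({"customer_id": "", "phone": "5550123", "state": "ZZ", "age": 0}))
```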
Common causes of data quality issues
1) Problems caused by lack of leadership
2) Problems caused by data entry process
3) Problems caused by data processing functions
4) Problems caused by system design
5) Problems caused by fixing other problems
Data profiling: a form of data analysis used to inspect data and assess its quality
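A minimal profiling sketch, assuming pandas is available and using a tiny hypothetical dataset: it computes the kinds of column-level statistics a profiling step typically produces, such as null rates (a completeness signal), distinct counts (a uniqueness signal), minimum and maximum values, and value distributions.

```python
import pandas as pd

# Hypothetical data; a real profiling exercise runs the same statistics at scale.
df = pd.DataFrame({
    "customer_id": [1, 2, 2, 4],
    "country":     ["US", "US", None, "DE"],
    "order_total": [19.99, 5.00, 5.00, -3.10],
})

profile = pd.DataFrame({
    "null_rate":      df.isna().mean(),           # completeness signal
    "distinct_count": df.nunique(dropna=True),    # uniqueness signal
    "min":            df.min(numeric_only=True),  # range / validity signal
    "max":            df.max(numeric_only=True),
})
print(profile)
print(df["country"].value_counts(dropna=False))   # value distribution for one column
```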
method
Corrective Action
Methods for performing data correction
1. Automatic correction
2. Manual inspection and correction
3. Manual correction
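The sketch below shows how these correction modes can coexist for a hypothetical country-code field: unambiguous values are corrected automatically from a rule table, while ambiguous values are routed to manual review. The mapping table and acceptance rule are illustrative assumptions.

```python
# Rule-based fixes that are unambiguous enough to apply automatically (assumed).
KNOWN_FIXES = {"USA": "US", "U.S.": "US", "GER": "DE"}

def correct_country(value: str) -> tuple[str, str]:
    """Return (corrected_value, correction_mode) for a raw country value."""
    cleaned = value.strip().upper()
    if cleaned in KNOWN_FIXES:
        return KNOWN_FIXES[cleaned], "automatic correction"
    if len(cleaned) == 2 and cleaned.isalpha():
        return cleaned, "no change"            # already looks like an ISO 3166 alpha-2 code
    return value, "flagged for manual review"  # manually inspected or manually corrected

for raw in ["usa", "DE", "Deutschland"]:
    print(raw, "->", correct_country(raw))
```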
Effective data quality metrics
Characteristics
1) Measurability.
2) Business relevance.
3) Acceptability.
4) Accountability/stewardship.
5) Controllability.
6) Trend analysis.
Data quality and data governance
Metrics
High-level indicators of data quality
1) Return on investment
2) Quality level
3) Data quality trends
4) Data problem management indicators
5) Consistency of service levels
6) Schematic diagram of data quality plan
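As an illustration of the quality level and data quality trend indicators, the sketch below aggregates hypothetical monthly dimension scores into a single quality level per month and reports the direction of the trend over the period. The scores and the equal-weight aggregation are assumptions, not prescriptions from the source.

```python
# Hypothetical monthly scores per dimension, each on a 0-1 scale.
monthly_scores = {
    "2024-01": {"completeness": 0.96, "validity": 0.91, "uniqueness": 0.99},
    "2024-02": {"completeness": 0.97, "validity": 0.93, "uniqueness": 0.99},
    "2024-03": {"completeness": 0.98, "validity": 0.95, "uniqueness": 0.99},
}

# Quality level: equal-weight average of the dimension scores for each month.
levels = {month: sum(dims.values()) / len(dims) for month, dims in monthly_scores.items()}

months = sorted(levels)
for month in months:
    print(f"{month}: quality level {levels[month]:.3f}")

# Trend: change in quality level from the first to the last reporting period.
trend = levels[months[-1]] - levels[months[0]]
print(f"trend over the period: {trend:+.3f}")
```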