MindMap Gallery Process and methods
Chapter 3 of Data Science Theory and Practice includes data processing, data auditing, data analysis, data visualization, data storytelling, and data science project management.
Edited at 2023-10-15 10:50:07
Process and methods
Basic process
Digitization
The process of capturing people's lives, business or social activities and converting them into data
Data processing and regularization
clean data
Organize data
Two basic issues in data processing
exploratory data analysis
EDA method
Resistance
residual
re-express
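The three EDA ideas above can be illustrated with a small sketch. It assumes a made-up sample and uses the median as the resistant summary; the function names and data are illustrative only and do not come from the chapter.

```python
import math
from statistics import mean, median

def residuals(values, fit):
    """Residual = observed value minus the fitted (summary) value."""
    return [v - fit for v in values]

def re_express_log10(values):
    """Re-expression: put positive data on a log scale to tame skew."""
    return [math.log10(v) for v in values]

# Hypothetical sample, then the same sample with one extreme value added.
sample = [12, 14, 15, 16, 18]
with_outlier = sample + [400]

# Resistance: the median barely moves, the mean is dragged upward.
median_shift = median(with_outlier) - median(sample)
mean_shift = mean(with_outlier) - mean(sample)

# Residuals around the resistant fit expose the unusual point.
resid = residuals(with_outlier, median(with_outlier))

# After re-expression the extreme value is far less dominant.
logged = re_express_log10(with_outlier)
```

Here the median shifts by 0.5 while the mean shifts by more than 60, which is what "resistance" means in practice.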
Data analysis and insights
descriptive analysis
Predictive analytics
Prescriptive analysis
Presentation of results
Provision of data products
data processing
Data processing refers to the series of activities (auditing, cleaning, transforming, integrating, desensitizing, reducing and annotating) applied to the original data set before formal computation, according to the needs of the subsequent calculations.
Data quality requirements, data calculation requirements
Data cleaning
Missing data handling
Redundant data processing
Noisy data processing
Binning strategy applied to the original data set
Replacement method based on the member values in each bin
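The binning strategy and replacement method for noisy data can be sketched as follows. This is a minimal example assuming equal-depth bins and replacement by the bin mean; the outline does not say which strategy the chapter prefers, and the function name and sample data are hypothetical.

```python
from statistics import mean

def smooth_by_bin_means(values, bin_size):
    """Sort the data, split it into equal-depth bins, and replace
    every member of a bin with that bin's mean."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        members = ordered[start:start + bin_size]
        bin_mean = mean(members)
        smoothed.extend([bin_mean] * len(members))
    return smoothed

# Hypothetical noisy measurements, smoothed with bins of three values.
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
smoothed_prices = smooth_by_bin_means(prices, 3)
```

Other replacement methods (bin median, bin boundaries) fit the same structure by changing the line that computes `bin_mean`.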
data transformation
Smoothing
Feature construction
Aggregation
standardization
discretization
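Two of the transformations listed above, standardization and discretization, can be sketched with the standard library alone. The sketch assumes min-max scaling and z-scores as the standardization variants and equal-width bins for discretization; these are common choices, not necessarily the ones the chapter uses, and all function names are illustrative.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

def z_score(values):
    """Center on the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]

def equal_width_bins(values, n_bins):
    """Map each value to the index of its equal-width interval."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0] * len(values)
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

scaled = min_max_scale([10, 20, 30])
standardized = z_score([2, 4, 4, 4, 5, 5, 7, 9])
binned = equal_width_bins([0, 5, 10], 2)
```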
data integration
Content integration
structural integration
Schema integration
Data redundancy
Conflict detection and elimination
Data desensitization
One-way (irreversible)
No residue
Easy to implement
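One way to approximate the three desensitization properties is a keyed one-way hash. The sketch below is an assumption, not the chapter's method: it uses HMAC-SHA256 from the Python standard library, and the key and sample value are hypothetical. Low-entropy fields such as phone numbers remain guessable if the key leaks, so the key must be kept secret.

```python
import hashlib
import hmac

def desensitize(value: str, secret_key: bytes) -> str:
    """Return a keyed one-way digest of a sensitive value.

    One-way: the digest cannot be inverted to recover the input.
    No residue: the output contains no fragment of the original text.
    Easy to implement: a single standard-library call.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical key and record; the same input always maps to the same token,
# so desensitized columns can still be joined or grouped.
key = b"example-secret-key"
token = desensitize("alice@example.com", key)
```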
data reduction
Dimensionality reduction
Value (numerosity) reduction
Data annotation
Syntactic annotation
Semantic annotation
Data audit
According to the general regulations and evaluation methods of data quality, audit the data content and its elements to identify problems.
Missing values, noise values, inconsistent values, incomplete values
Predefined audits
Data Dictionary
User-defined integrity constraints
Self-describing information in the data
Value domains of attributes
Association information contained in the data itself
Custom audit
Variable definition rules
Function definition rules
Common techniques for data auditing
First-digit law (Benford's law)
Small-probability principle
linguistic rules
data continuity theory
data authentication technology
Visual audit
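Among the audit techniques above, the first-digit law (Benford's law) lends itself to a short sketch. It states that in many naturally occurring data sets the leading digit d appears with probability log10(1 + 1/d). The helper names and the deviation measure (mean absolute deviation) below are illustrative assumptions; a large deviation is only a hint that data deserves a closer look, not proof of a problem.

```python
import math
from collections import Counter

def benford_expected():
    """Expected share of each leading digit 1..9 under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(x):
    """Leading non-zero digit of a number, or None for zero."""
    digits = str(abs(x)).lstrip("0.")
    return int(digits[0]) if digits else None

def first_digit_shares(values):
    """Observed share of each leading digit 1..9 in the data."""
    leading = [d for d in map(first_digit, values) if d is not None]
    counts = Counter(leading)
    return {d: counts.get(d, 0) / len(leading) for d in range(1, 10)}

def benford_deviation(values):
    """Mean absolute deviation between observed and expected shares."""
    expected = benford_expected()
    observed = first_digit_shares(values)
    return sum(abs(observed[d] - expected[d]) for d in range(1, 10)) / 9

# Powers of two are a classic Benford-like sequence; 1..999 is not.
powers = [2 ** n for n in range(1, 101)]
uniform = list(range(1, 1000))
```

Comparing `benford_deviation(powers)` with `benford_deviation(uniform)` shows the second set departing much further from the expected distribution.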
data analysis
descriptive analysis
Focus on the past and answer what has happened
The first step in data analysis
Descriptive statistical analysis methods
diagnostic analysis
Focus on the past and answer why it happened
Correlation analysis and causal analysis
Predictive analytics
Focus on the future and answer what will happen
Use classification analysis and trend analysis
Serves as the basis for prescriptive analysis
Prescriptive analysis
Focus on simulation and optimization and answer how to optimize what will happen
Uses operations research, modeling and simulation techniques
Can directly generate business value
data visualization
basic type
scientific visualization
information visualization
visual analytics
visual analytics
information visualization
data mining
Statistical Analysis
analytical reasoning
human-computer interaction
Visual analytics model
Emphasis on the process of converting data into knowledge
Emphasis on the interplay between visual analytics and automated modeling
Emphasize the importance of data mapping and data mining
Emphasis on the need for data processing
Emphasis on the importance of human-computer interaction
Methodology
Methodological basis
basic method
domain methods
Visual perception and visual cognition
visual perception
The process by which objective things produce direct reactions in the human brain through visual sensory organs
visual cognition
The further processing of visual perception information by individuals
Data types from a visual perspective
Categorical (nominal) data
ordinal data
interval data
Ratio data
Visual channel selection method
Accuracy
Discriminability
Visual illusion
A false or inaccurate visual perception formed by the target user that is inconsistent with the intent of the visualization designer or with the data itself
The context (surrounding environment) in which a visualization is placed may cause visual illusions
The human eye judges brightness and color relatively, which can easily lead to visual illusions
The target user's experience and background may cause visual illusions
Six well-known data visualization projects and their source code
Calculate the age of the universe
Render the moon with Earth's colors
1.3 billion taxi trips in New York City
See the world through 17,000 itineraries
Eclipse formatting
The Jimi Hendrix Experience
Data storytelling
Definition: The process of transforming data into data stories is called data storytelling
easy to remember
Easy to recognize
Easy to experience
Data story model
Business needs
data
Analytical Insights
story model
Storytelling
audience behavior
Related terms for data storytelling
Data-driven storytelling
visual storytelling
Analytical Storytelling
Interactive storytelling
Tell stories with data
digital storytelling
The role of data stories
attract
explain
Inspire
Understanding data stories
Data story perception
The narrator's telling of the story produces a direct response in the audience's brain through the visual sense organs
Understanding data stories
The audience’s further processing of story-based sensory information
Data Stories in Action
The actions audiences take after listening to data stories
Data Science Project Management
Main roles
Project sponsor
project manager
client
data scientist
data engineer
operator
Basic process
Definition of project goals
Data acquisition and management
Insights from patterns and models
Validation and optimization of patterns and models
Visualization and documentation of results
Application and maintenance of patterns and models
Common mistakes in data science projects
Analyzing data without checking it
Analyzing data without understanding it
Putting the model into use without testing it
Analysis work has goals but no research hypotheses
The model is not updated along with the data, so an outdated model remains in use
Drawing conclusions casually without discussing the results of the data analysis
Lack of involvement of business experts
Adopt or train overly complex model algorithms
The existence of data bias
Not enough attention is paid to the presentation effect of data analysis project results
Insufficient emphasis on user experience of data science products
Overestimating or underestimating the target user’s ability to understand