Chapter 5, Data Product Development
Data Science Theory and Practice, Chapter 5. A data product is a product that helps users achieve a goal through data. Data product development involves all activities of the data science project process.
Edited at 2023-10-22 15:16:23
Data product development
definition
A data product is a product that helps users achieve one of their goals through data
Data product development involves all activities of the data science project process
Users include not only people, but also computers and other software and hardware systems
Data products exist in many forms
data processing
Single-dimensional conversion
Multidimensional conversion
key links
data jujitsu
Key technologies
Main features
data-centric
data driven
data intensive
data paradigm
70+ Genders on Facebook—The Difference Between Data Paradigms and Knowledge Paradigms
Diversity
Data products
Information products
Knowledge products
Wisdom products
Hierarchy
Content products
Application products
Service products
Decision-making products
Value-added
work creatively
think critically
Ask curiously
Data object encapsulation
Data system development
Integrated applications
Ancillary services
Derived services
key activities
The basic principle
Three parts technology, seven parts management, and twelve parts data.
Data is the raw material of the data industry
The wisdom of data scientists is the main source of added value in data product development
User experience is the main evaluation indicator of data products
activity elements
creative design
data insights
Visualization
story description
Virtualization
on demand services
Personalized service
Security and privacy protection
user experience
policy Analysis
data jujitsu
The art of turning data into products
Product development must have high artistic quality
Target user-centered product development
D.J. Patil
Introduce design thinking
drop-down list
single button
Smart reminder
Other solutions
Support human-machine collaboration
Amazon Mechanical Turk
The long tail of participants
Capturing labor flexibility
small task
Pay later model
Qualification review
Low data processing costs
Good at retaining users
Outstanding product design
Data, taken from the people, used by the people
Avoid "data vomit" (overloading users with data)
Estimate possible by-products or negative impacts
Balance recall, precision, and response time correctly
Return results in search engines
Restaurant advertising in search engines
Book advertising information in search engines
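The trade-off between recall and precision named above can be made concrete with a small calculation. The following is a minimal sketch, not a method from the chapter: the function name `precision_recall` and the document IDs are hypothetical, chosen only for illustration. Precision is the share of returned results that are relevant; recall is the share of relevant items that were returned.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: items the system returned (hypothetical IDs)
    relevant:  items that are actually relevant to the query
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set  # relevant items that were returned

    # Guard against division by zero when either set is empty.
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Illustrative (made-up) search result: 4 items returned, 5 relevant overall.
returned = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5", "d6", "d7"]

precision, recall = precision_recall(returned, relevant)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Returning more results usually raises recall but tends to lower precision and lengthen response time, which is why the three have to be balanced for each kind of result (organic results versus advertisements, for example).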
The importance of user experience
Pay attention to the subjectivity of user cognitive behavior
Errors and nonsensical results are more readily noticed by target users and can give them a wrong impression of the entire data product.
Recruit more users and obtain effective data
What information needs to be provided by users and whether this information meets the needs of data product development
When asking users for personal information, clearly state the scope of collection, the purpose, how the data will be used, and the services users will receive in return.
Anticipate failure and ensure a good user experience
Data capabilities
Data management
The process by which data evolves through forms such as acquisition, storage, integration, analysis, application, presentation, archiving, and destruction
data governance
The collection of control activities, performance management, and risk management relating to data resources and their application
data processing
The systematic execution of operations on data
data strategy
The organization’s vision, purpose, goals and principles for conducting data work
data architecture
A framework of abstractions, such as data elements, structures, and interfaces, and their interrelationships
Data life cycle
A set of processes that transform raw data into actionable knowledge
metadata
Data about data or data elements
data element
A unit of data whose definition, identification, representation, and permissible values are specified by a set of attributes
Master data
Core business entity data that needs to be shared across systems and departments in the organization
Data management principles
Data is valuable
Data management needs align with business needs
Data management relies on multiple skills
Data management is life cycle management
CMM
key process areas
data strategy
data governance
Data quality
Data operations
Platform and architecture
Supporting processes
maturity level
Performed level
Managed level
Defined level
Measured level
Optimized level
Maturity assessment
Initiating
Diagnosing
Establishing
Acting
Learning
data strategy
A data strategy is the unified plan for an organization's data management vision and functional blueprint
U.S. Department of Defense Data Strategy Framework
visible
accessible
understandable
linked
trustworthy
interoperable
secure
Data strategy positioning
A data strategy not only needs to define the goals of data management, but also needs to provide specific action plans on how to achieve these management goals, as well as a mechanism for dynamically adjusting data management goals.
Data strategy goals
Define a data-driven organization or cultivate a data-driven culture, use data as the driving factor for the organization's decision-making activities, enhance the organization's agility, and thereby improve the organization's core competitiveness.
The focus of data strategy
data intensive problem
The scope of data strategy
China
Europe
USA
U.K.
Germany
Japan
Action Plan to Promote the Development of Big Data
data governance
Management of data management
main content
Understand your data
The scope of enterprise data management proposed by IBM
transaction data
master data
metadata
relational data
Identification and analysis of data stakeholders
Establishment of data department
Formulation of codes of conduct
Determination of data management policies and objectives
Definition of job responsibilities
Emergency Plan and Emergency Management
Level protection and classification management
Effective supervision and dynamic optimization
basic process
Plan
Do
Check
Act
DGI data governance framework
Proactively define or align rules
Provide ongoing, cross-functional protection and services to data stakeholders
Respond to and resolve problems arising from non-compliance with rules
Data Security, Privacy, Morality and Ethics
Data Security
key resources
Ability to restore partial functionality after system damage
important resources
Able to discover important security vulnerabilities and security incidents, and be able to restore some functions within a period of time after the system is compromised.
primary resources
Able to discover security vulnerabilities and security incidents, and be able to quickly restore most functions after the system is damaged.
All resources
Able to detect security vulnerabilities and security incidents, and quickly restore all functions after the system is compromised
P^2DR model
data bias
Data source selection bias
survivorship bias
Data Processing and Preparation Bias
Berkson's Paradox
Algorithm and model selection bias
A/B testing
Bias in the interpretation and presentation of analytical results
Simpson's Paradox
algorithmic discrimination
Big data price discrimination against existing customers
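Simpson's Paradox, listed above under bias in the interpretation and presentation of results, is easiest to see with numbers. The sketch below is an illustration under stated assumptions rather than an example from the chapter: the treatment labels, subgroup names, and counts are illustrative only, and the helper names `rate` and `pooled_rate` are hypothetical. It shows option A beating option B inside every subgroup while B looks better once the subgroups are pooled.

```python
# Illustrative counts only: (successes, trials) per option and subgroup.
outcomes = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, trials):
    """Success rate for a single subgroup."""
    return successes / trials


def pooled_rate(groups):
    """Success rate after summing successes and trials over all subgroups."""
    total_successes = sum(s for s, _ in groups.values())
    total_trials = sum(n for _, n in groups.values())
    return total_successes / total_trials


# Compare the two options within each subgroup.
a_wins_each_subgroup = all(
    rate(*outcomes["A"][g]) > rate(*outcomes["B"][g]) for g in ("small", "large")
)

# Compare them again after ignoring the subgroup variable.
b_wins_overall = pooled_rate(outcomes["B"]) > pooled_rate(outcomes["A"])

for option, groups in outcomes.items():
    per_group = {g: round(rate(*counts), 3) for g, counts in groups.items()}
    print(option, per_group, "pooled:", round(pooled_rate(groups), 3))
print("A better in every subgroup:", a_wins_each_subgroup)
print("B better when pooled:", b_wins_overall)
```

The reversal comes from unequal subgroup sizes: A is mostly applied to the harder "large" cases and B to the easier "small" ones, so the pooled figures mix case difficulty with the effect being measured. Reporting only the pooled rate would therefore misrepresent the analysis.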
data attack
Data attacks and Google bombs
Google bombing refers to maliciously constructing anchor text so that articles or web pages reporting unfavorably on someone rank higher in search engine results, even though these articles or pages may not be relevant to the search topic.
privacy protection
Cambridge Analytica data scandal