HCIA-GaussDB
HCIA-GaussDB is one of Huawei's certifications; its full name is Huawei Certified GaussDB Database Engineer. It is mainly intended for users, partner engineers, Huawei internal engineers, college students, and ICT practitioners who work with Huawei's GaussDB database products, and it is an engineer-level certification in the Huawei certification system.
HCIA-GaussDB
Database introduction
Database Technology Overview
Database Technology
Data
Record
Database (DB)
Database management system (DBMS)
Database system (DBS)
History of database technology development
Database technology emerges
The emergence and development of database technology
Comparison of three stages of data management
Database system advantages
Overall data structuring
Data is highly shareable, low in redundancy and easy to expand
High data independence
Physical independence: The application and the physical storage of data in the database are independent of each other
Logical independence: The logical structures of the application and the database are independent of each other
Unified management and control
Data security protection
Data integrity check
Concurrency control
Database recovery
Database system development characteristics
Hierarchical, network, relational models
Hierarchical model
There is one and only one node with no parent; this node is called the root node (root)
Every node other than the root has one and only one parent node
Network model
More than one node may have no parent
A node can have more than one parent
Relational model
Built on rigorous mathematical concepts
Relations must be normalized
Each component of a relation must be an atomic (indivisible) data item
Comparison of hierarchical, network, and relational models
Structured Query Language (SQL)
A high-level, non-procedural language that lets users work with high-level data structures
Does not require users to specify how data is stored
Does not require users to know the specific data storage method
Relational database systems with completely different underlying structures can all use the same SQL as their interface for data manipulation and management
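To make the declarative style concrete, here is a minimal sketch (table and column names are invented for illustration): the query states only what result is wanted, and the DBMS chooses how to store and retrieve the rows.

```sql
-- Ask for the result; the DBMS decides the access path.
SELECT dept_id, AVG(salary) AS avg_salary
FROM employee
WHERE hire_date >= DATE '2020-01-01'
GROUP BY dept_id
ORDER BY avg_salary DESC;
```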
Other data models
Object-oriented data model (OO model)
XML data model
Extensible Markup Language (XML)
RDF data model
Resource Description Framework (RDF)
New challenges in data management technology
5V characteristics
Volume (large amount of data)
Variety (diverse data types)
Veracity (truthfulness of data)
Velocity (high processing speed)
Value (value of the data)
NoSQL technology characteristics and types
NoSQL (Not Only SQL)
A class of non-relational, distributed data management systems that do not guarantee ACID properties
Technical features
Partition the data and use a large number of nodes to process it in parallel, achieving high performance and horizontal scalability (scale-out)
Relax the ACID consistency constraints: temporary inconsistency is allowed and eventual consistency is accepted, following the CAP theorem and the BASE principles
Each data partition is replicated (usually three copies) to cope with node failures and improve system availability
Introduction to major NoSQL databases
NoSQL is not meant to replace RDBMS
The advantages are obvious, but the disadvantages are also obvious
Build a complete database ecosystem with RDBMS
A brief discussion of NewSQL
NewSQL
Refers to relational database systems that pursue NoSQL-style scalability while still supporting the relational model (including ACID properties), mainly for OLTP scenarios
Supports SQL as the primary language
NewSQL classification
Rebuilding the product using a new architecture
Shared-Nothing, multi-node concurrency control, distributed processing, replication-based fault tolerance, flow control, and other architectural techniques
Google Spanner, H-Store, VoltDB, etc.
Using Transparent Sharding middleware technology
The process of data sharding is transparent to users, and users’ applications do not need to make changes.
Oracle MySQL Proxy, MariaDB MaxScale, etc.
DAAS (Database-as-a-Service)
Database products provided by cloud service providers that carry NewSQL features
Amazon Aurora, Alibaba Cloud's OceanBase, Tencent Cloud's CynosDB
Cloud database
Cloud database refers to a database that is optimized or deployed into a virtual computing environment
Traditional database vs. cloud database
Relational database architecture
Database architecture development
Stand-alone architecture
To avoid contention for resources between application services and database services, the stand-alone architecture evolved from the early single-host mode to a dedicated database host mode, separating application and data services. Application servers can be added and load-balanced to increase the system's concurrency capability.
Advantages
Centralized deployment and convenient operation and maintenance
Disadvantages
Limited scalability
The stand-alone architecture can only scale vertically (scale-up), improving performance by upgrading the hardware configuration, but the configurable resources of a single host have an upper limit
There is a single point of failure
Capacity expansion often requires stopping the machine and suspending the service
Hardware failure results in unavailability of the entire service or even data loss
A single machine encounters a performance bottleneck
Grouped architecture
Primary/standby
Primary-standby architecture
The database is deployed on two servers; the server responsible for data read and write services is called the primary
The other server copies the primary's data through a data synchronization mechanism and is called the standby
At any given time, only one of the servers provides data services externally
Advantages
Applications need no extra development to cope with database failures
Better data fault tolerance than the stand-alone architecture
Disadvantages
Resource waste: the standby has the same configuration as the primary, but its resources sit idle almost all of the time
Performance pressure is still concentrated on a single machine, so the performance bottleneck is not solved
When a failure occurs, switching between primary and standby requires some manual intervention or monitoring
Master-slave
Master-slave architecture
Deployment is similar to the primary/standby mode, but the standby is promoted to a slave (Slave) that also provides some data services externally
Pressure is distributed by read/write splitting
Write, update, and delete operations are executed on the write library (master)
Query requests are assigned to the read library (slave)
Advantages
Higher resource utilization; suitable for read-heavy, write-light application scenarios
Under heavy concurrent reads, load balancing can spread queries across multiple slaves
Slaves can be scaled out flexibly, and expansion does not affect the business
Disadvantages
Latency: synchronizing data to the slaves takes time, so the application must tolerate brief inconsistency; not suitable for scenarios with very high consistency requirements
The performance pressure of write operations is still concentrated on the master
If the master fails, a master-slave switchover is required; manual intervention needs response time, and automatic switchover is complex
Multi-master
Multi-master architecture
The database servers are masters and slaves of each other, and all of them provide complete data services externally at the same time
Advantages
Higher resource utilization and a reduced risk of single points of failure
Disadvantages
Both masters accept writes, so two-way data synchronization is required; bidirectional replication introduces latency and, in extreme cases, data loss
As the number of databases grows, data synchronization becomes extremely complex; in practice, the two-machine mode is the most common
Shared storage multi-active architecture
Shared-storage (Shared-Disk) multi-active architecture
A special multi-master architecture
Database servers share data storage, and multiple servers balance the load
Advantages
Multiple compute servers provide highly available services, avoiding the single point of failure of a server cluster
Convenient horizontal expansion increases the parallel processing capability of the overall system
Disadvantages
Technically difficult to implement
Sharding architecture
The main form of the sharding architecture is horizontal data sharding
Data is distributed across multiple nodes; each node holds a portion of the database, called a shard
All nodes have the same database structure, but the data on different shards is disjoint; the union of all shards constitutes the complete data set
Common sharding algorithms shard data by list values, range values, or hash values, as in the sketch after this block
Advantages
Data is scattered on various nodes in the cluster, and all nodes can work independently
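As a sketch of hash sharding, distributed products in the GaussDB family let the DDL declare a distribution column; the exact syntax varies by product, so treat this as illustrative only (table and column names invented):

```sql
-- Each row is routed to a node by hashing the distribution column.
CREATE TABLE orders (
    order_id    BIGINT,
    customer_id BIGINT,
    amount      DECIMAL(12,2)
)
DISTRIBUTE BY HASH (order_id);
```

A query that filters on order_id can then be routed to a single shard, while queries on other columns must be sent to all shards.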
Shared-Nothing architecture
Each node (processing unit) in the cluster has its own independent CPU/memory/storage, and there are no shared resources.
Each node (processing unit) processes its own local data, and the processing results can be summarized to the upper layer or transferred between nodes through communication protocols.
Nodes are independent of each other and have strong scalability. The entire cluster has powerful parallel processing capabilities
MPP architecture (Massively Parallel Processing)
MPP distributes tasks to multiple servers and nodes in parallel. After the calculation is completed on each node, the results of each part are summarized together to obtain the final result.
Features
Task parallel execution, distributed computing
Common MPP products
No shared master: Vertica, Teradata
Shared Master: Greenplum, Netezza
Comparison of database architecture features
Mainstream application scenarios of relational databases
OLTP (On-Line Transaction Processing)
OLTP is the main application of traditional relational databases
For basic, daily transaction processing, such as bank deposit and withdrawal transactions, transfer transactions, etc.
Features
High throughput: a large number of short online transactions (inserts, updates, deletes), very fast query processing
High concurrency with (near) real-time response
Typical OLTP scenarios
Retail systems
Financial trading systems
Train ticketing systems
Flash-sale events
OLAP (On-Line Analytical Processing)
The concept of online analytical processing was first proposed by E.F. Codd in 1993, in contrast to OLTP systems
It refers to query and analysis operations on data, usually over large amounts of historical data spanning long periods; the multi-level aggregation and summarization involved make these operations more complex than transaction processing
Features
Mainly complex queries, answering "strategic" questions
Data processing centers on analytical operations such as aggregation, summarization, group calculations, and window calculations
Data is used and analyzed from multiple dimensions
Typical OLAP scenarios
Reporting systems, CRM systems
Financial risk prediction and early-warning systems, anti-money-laundering systems
Data marts, data warehouses
Comparative analysis of OLTP and OLAP
Database performance metrics
TPC (Transaction Processing Performance Council)
Its responsibilities are to define specifications, performance metrics, and price metrics for business application benchmarks (Benchmark), and to manage the publication of test results
TPC produces standard specifications, not code; any vendor may build its own optimal system for evaluation according to the specification
Many benchmark standards have been released, including specifications for both OLTP and OLAP
TPC-C specification
For OLTP systems; includes two main indicators
Traffic indicator: tpmC (transactions per minute, the number of transactions the system under test processes per minute)
Cost-effectiveness indicator: Price (price of the system under test) / tpmC
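A worked example with invented numbers: if a tested system costs 1,000,000 USD and sustains 500,000 transactions per minute, its traffic indicator is 500,000 tpmC and its cost-effectiveness indicator is 1,000,000 / 500,000 = 2 USD per tpmC; the lower this ratio, the better.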
TPC-H specification
For OLAP-type systems
Traffic indicator: QphH (queries per hour, the number of complex queries processed per hour)
The size of the test data set must be considered; tests are divided across different data-set scales. 22 query statements are specified, which may be fine-tuned for the product
Test scenarios: data loading, the Power test, and the Throughput test
Database basics
Introduction to database management
Database management and its scope of work
Database management
Database management is the work of managing and maintaining the database management system
The core goal is to ensure the database management system's:
Stability
Security
Data consistency
High performance
Database administrator (DBA)
The collective term for personnel engaged in managing and maintaining database management systems
Database management work scope
Database object management
Physical design work
Physical implementation work
Database security management
Prevent unauthorized access and avoid disclosure of protected information
Prevent security breaches and inappropriate data modification
Ensure data is only available to authorized users
Backup and recovery management
Develop a reasonable backup strategy to implement regular data backup functions
Ensure that the database system can achieve the fastest recovery and minimum loss when a disaster occurs
Database performance management
Monitor and optimize factors affecting database performance
Optimize the resources available to the database to increase system throughput, reduce contention, and maximize workload processing
Database environment management
Database operation and maintenance management, including installation, configuration, upgrade, migration, etc.
Ensure the normal operation of IT infrastructure including database systems
Object management
What is a database object
A general term for various concepts and structures used to store and point to data in a database
Object management is the management process of creating, modifying or deleting various database objects using object definition languages or tools.
Common basic database objects
Develop database object naming conventions
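A minimal sketch of such a convention in practice, with invented prefixes (t_ for tables, idx_ for indexes, v_ for views):

```sql
CREATE TABLE t_customer (
    customer_id BIGINT PRIMARY KEY,
    name        VARCHAR(100) NOT NULL
);
CREATE INDEX idx_customer_name ON t_customer (name);
CREATE VIEW v_customer_names AS
    SELECT customer_id, name FROM t_customer;
```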
Backup and recovery management
Database backup
Backing up a database means saving the database's data and the information needed for its normal operation, so that the database can be restored after a system failure
Backup objects, including but not limited to
data itself
Database objects related to data
Users and permissions
Database environment, such as configuration files, scheduled tasks, etc.
Database recovery
Activities that restore a database system from a failed or paralyzed state to normal operation and restore data to an acceptable state
Disaster recovery
Enterprise-level disaster recovery
For enterprises, the database system together with other application systems constitutes a larger information-system platform, so database backup and recovery is not an isolated function point; the disaster-recovery capability of the entire platform must be considered together with the other application systems
Disaster backup
The process of backing up data, data processing systems, network systems, infrastructure, professional technical capabilities, and operational management capabilities for disaster recovery
Recovery Time Objective (RTO)
The requirement on how long, after a disaster occurs, an information system or business function may take to be restored from standstill to operation
Recovery Point Objective (RPO)
Requirements for the point in time to which systems and data must be restored after a disaster occurs
Disaster recovery level
The relationship between RTO/RPO and disaster recovery capability level in a certain industry
Backup method
Based on the scope of the backed up data collection
Full backup
Also known as a complete backup
A complete backup of all data and the corresponding structures at a specified point in time
Features
The most complete data
Highest security
Backup and recovery times increase significantly with data size
Very important, it is the basis of differential backup and incremental backup
The backup period will have a certain impact on system performance.
Differential backup
Differential backup refers to the backup of data that has changed since the last full backup.
Incremental backup
Incremental backup refers to the backup of data that has changed since the last backup.
Comparison chart
Based on whether the database is taken offline
Hot backup
Back up the database while it is running normally
During the backup, database reads and writes proceed normally
Warm backup
Availability is weaker than with hot backup: during the backup, the database can serve read operations but not write operations
Cold backup
During the backup, the application's read and write operations are unavailable
The backed-up data has the highest reliability
According to the backup content
Physical backup
Directly back up the data files corresponding to the database, or even the entire disk
Logical backup
Export data from the database and archive the exported data
Comparison chart
Security management
Database system security framework
Broadly speaking, the database security framework can be divided into three levels
Network system level security
From a technical perspective, network system level security method technologies mainly include encryption technology, digital signature technology, firewall technology and intrusion detection technology, etc.
Operating system level security
The core is to ensure the security of the server, which is mainly reflected in the server's user account, password, access rights, etc.
Data security is mainly reflected in encryption technology, data storage security, data transmission security, etc., such as Kerberos, IPsec, SSL and VPN technologies
Database management system level security
Database encryption
Data access control
Security audit
Data backup
Security control model
Security control
Provides protection against intentional and unintentional damage at different levels of the database application system, for example:
Encrypted data access -> guards against intentional illegal activities
User authentication and restricted operation permissions -> guard against intentional illegal operations
Improved system reliability and data backups -> guard against unintentional damaging behavior
Security control model
Authentication
Database user authentication is the outermost security protection measure provided by DBMS
Block access by unauthorized users
For database applications, username/password verification is currently the common mode, so password strength must be enforced
Use longer strings, such as 8-20 characters
Use passwords that mix numbers, letters, and symbols
Change passwords regularly
Do not reuse passwords
In developed code or scripts, database user passwords must not appear in clear text
Access control
Access control is the most effective method in database security but also the most prone to problems.
The basic principle
Different permissions are given to different users based on the classification requirements of sensitive data.
Principle of least privilege
Check key permissions
Check permissions on key database objects
Role-based permission management
For large database systems or systems with many users, role-based access control (RBAC) is mainly used for permission management, as in the sketch below
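A minimal RBAC sketch in standard SQL (role, table, and user names are invented): privileges are granted once to a role, and users then inherit them through role membership.

```sql
CREATE ROLE report_reader;                       -- define the role once
GRANT SELECT ON sales_orders TO report_reader;   -- attach privileges to the role
GRANT report_reader TO alice;                    -- users inherit via membership
GRANT report_reader TO bob;
```

Revoking the role from a user, or a privilege from the role, then takes effect for all members at once, which is what makes RBAC manageable at scale.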
Enable auditing
Auditing can help database administrators discover vulnerabilities in existing architecture and usage
Database audit levels
Access and authentication auditing: information related to database user login (logon) and logout (logoff), such as login/logout time, connection method and parameters, and login mode
User and administrator auditing: analysis of and reporting on activities performed by users and administrators
Security activity monitoring: record any unauthorized or suspicious activities in the database and generate audit reports
Vulnerability and Threat Audit: Discover possible vulnerabilities in the database and the "users" who want to exploit these vulnerabilities
Database encryption
Different levels of database encryption
DBMS kernel layer
Data is encrypted/decrypted before physical access
It is transparent and invisible to database users.
Encrypted storage is used and the encryption operations run on the server side, which increases the server's load to a certain extent
DBMS outer-layer encryption
Develop dedicated encryption/decryption tools, or define encryption/decryption methods
The granularity of encryption can be controlled, encrypting and decrypting at the table or field level
Users only need to pay attention to the scope of sensitive information
Performance management
Resources
Supply resources
This type of resource, also called a basic resource, corresponds to the computer hardware
Resources managed by the operating system
Processing speed: CPU > memory >> disk ≈ network
Concurrency control resources
Such resources include but are not limited to: locks, queues, caches, mutual exclusion signals, etc.
Resources managed by database systems
Basic Principles of Performance Management
Make full use of resources without wasting them
The meaning of performance management
Efficient use of resources
Databases in fact always operate in a resource-constrained environment
Effective management of resources ensures that the database system can meet user performance requirements for the system during peak periods
Detect system problems
Real-time system performance monitoring (real-time monitoring of system performance through logs or tools provided by the database)
System historical performance data tracking (analysis of historical performance data)
capacity planning
The data collected by performance management is the basis for system capacity planning and other forward-looking planning
Speak with facts rather than feelings
Performance management goals
Basic indicators of database system
Throughput
Response time
OLTP
Provide the highest possible throughput within acceptable response times
Reduce unit resource consumption, quickly pass through concurrent shared areas, and reduce bottleneck constraints
OLAP
Minimize response time within limited resources
A transaction should fully utilize resources to speed up processing time
Some scenarios for performance optimization work
Go-live optimization, or optimization when performance does not meet expectations
System optimization for gradually slower response times
System optimization (emergency treatment) when the system suddenly slows down during operation
The system slows down suddenly, then returns to normal after some time
System optimization based on reducing resource consumption
Preventive daily inspection work
Data that needs to be collected for performance management
The scope of data that needs to be collected for performance management includes but is not limited to
CPU usage data
space usage
Users and roles using the database system
Heartbeat query response time
SQL submitted to the database is the basic unit of performance data
Performance data for jobs submitted through database tools (such as loading, unloading, backup, and recovery)
Time ranges of concern
Daily scope: peak hours of the week; end of month; seasonal variation data
Within a day: time period when users intensively use the system; time period when system pressure is relatively high, etc.
Create performance reports
The database system has many built-in monitoring reports
Extract performance-related data to create regular performance reports (daily, weekly, monthly reports)
Establish performance trend analysis reports for common indicators, which can provide an intuitive display of current system performance.
Reports on specific trend types, including but not limited to
Reports based on abnormal events
SQL or jobs that consume a lot of resources
Resource consumption reports for specific users and user groups
Resource consumption reports for specific applications
Operation and maintenance management
Database installation
Database uninstallation
Database migration
Migration plans need to be designed based on the needs of different migration scenarios.
factors to consider
Time window available for migration
Tools you can use for migration
Whether the source system must stop write operations during the migration
The network conditions between the source system and the target system during the migration
The backup/restore time estimated from the amount of data to be migrated
Post-migration data consistency verification between the source and target database systems
Database expansion
The capacity of any database system is planned by estimating future data volume from a certain point in time. Capacity is not only data storage volume; the following aspects also need to be considered:
Insufficient computing power (average daily CPU busy rate of the whole system > 90%)
Insufficient response/concurrency capability (QPS and TPS drop significantly and cannot meet the SLA)
Insufficient data capacity (available data space below 15%)
Selection of expansion plans
Vertical scaling
Vertical scaling adds hardware to the database server, such as more memory, more storage, and more network bandwidth, improving the performance configuration of a single machine. This approach is relatively simple, but it eventually hits the hardware performance ceiling of a single machine
Horizontal scaling
Add servers horizontally and use the number of servers in the cluster to improve overall system performance
Downtime expansion
Simple, but the time window is limited, problems during expansion can cause it to fail, and if it takes too long it is hard for customers to accept
Smooth expansion
No impact on database services
The technical solution is relatively complex; as the number of database servers grows, the complexity of expansion rises sharply
Routine maintenance work
Database troubleshooting
Configure database monitoring indicators and alarm thresholds
Set the alarm notification process according to the level of fault events
After receiving alarm information, locate the fault based on the logs
For problems encountered, the original information should be recorded in detail
Strictly abide by operating procedures and industry safety regulations
For major operations, the feasibility of the operation must be confirmed before the operation, and corresponding backup, emergency and safety measures must be taken before the operation is performed by authorized operators.
Database health inspection
View health check tasks
Manage health check reports
Modify health check configuration
Important database concepts
Databases and database instances
Database
A collection of physical operating system files or disk data blocks
Such as data files, index files, structure files
Not all database systems are file-based, and there are also forms that write data directly to data storage.
DatabaseInstance
An instance refers to a series of processes in the operating system and the memory blocks allocated for these processes.
A database instance is a channel for accessing a database
Generally speaking, one database instance corresponds to one database
Multi-instance operation can make full use of hardware resources and maximize server performance.
Distributed cluster
A cluster is a group of independent servers that form a computer system through a high-speed network.
In a distributed cluster, each server may have a complete copy or a partial copy of the database. All servers are connected to each other through the network to form a complete, global, logically centralized, and physically distributed large-scale database.
Database connections and sessions
Database connection(Connection)
At the physical level, a connection is a network link established between a client and a dedicated server (Dedicated Server) or a dispatcher (Shared Server)
Specify connection parameters when establishing a connection, such as server host name or IP, port number, connection user name and password, etc.
Database session (Session)
Logical concept of communication between client and database
A context (Context) maintained between the communicating parties from the start to the end of communication. The context is a piece of memory on the server side that records the client machine of the connection, the corresponding application process number, the logged-in user, and other information
Database connection pool
Establishing a database connection comes at a cost
Frequently establishing and closing database connections will make the allocation and release of connection resources a bottleneck in the database, thereby reducing the performance of the database system.
Connection pool: reuse of database connections
Responsible for allocating, managing, and releasing database connections. It allows applications to reuse an existing database connection instead of establishing a new one.
Database connections can be reused efficiently and securely
Schema
A schema is a structure described in a formal database language: a collection of database objects
Allows multiple users to use one database without interfering with each other
Organize database objects into logical groups to make them easier to manage
Form a namespace to avoid object name conflicts
Schema includes tables and other database objects, data types, functions, operators, etc.
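For example (PostgreSQL/GaussDB-style syntax, names invented), two applications can each keep a table with the same name, because each table lives in its own schema namespace:

```sql
CREATE SCHEMA sales;
CREATE SCHEMA hr;
-- Same table name, no conflict: each lives in its own namespace.
CREATE TABLE sales.staff (id INT);
CREATE TABLE hr.staff (id INT);
```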
Tablespace
A tablespace consists of one or more data files
Tablespaces define the storage locations of database object files
All objects in the database are logically stored in tablespaces
Physically, they are stored in the data files belonging to their tablespace
Tablespace functions
Arrange the physical storage of data according to how database objects are used, to improve performance
Frequently used indexes are placed on disks that are stable and fast
Archived data and tables with low usage frequency and low access-performance requirements are stored on slower disks
Specify the physical disk space occupied by the data through the tablespace
Limit the upper bound of physical space usage through the tablespace, to avoid exhausting disk space
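A sketch in PostgreSQL/GaussDB-style syntax (paths and names invented): tablespaces map logical storage to physical locations, so hot objects can sit on fast disks and archives on cheap ones.

```sql
CREATE TABLESPACE fast_ts LOCATION '/mnt/ssd/tbs';   -- fast disk
CREATE TABLESPACE slow_ts LOCATION '/mnt/hdd/tbs';   -- slow, cheap disk
-- Hot table and index on the fast disk; archive table on the slow disk.
CREATE TABLE orders (order_id BIGINT) TABLESPACE fast_ts;
CREATE INDEX idx_orders_id ON orders (order_id) TABLESPACE fast_ts;
CREATE TABLE orders_archive (order_id BIGINT) TABLESPACE slow_ts;
```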
Table
Temporary tables
How tables are stored
Choice of storage method
Scenarios suited to column storage
Statistical analysis queries (scenarios with many groupings and joins)
Suitable for OLAP, data mining, and other large-scale query applications
Scenarios suited to row storage
Point queries (simple index-based queries returning few records)
Suitable for OLTP: lightweight transactions with many writes and frequent inserts, deletes, and updates
Partition
A partitioned table divides the data of a large table into many small data subsets, called partitions.
Range partitioned table
List partitioned table
Hash partitioned table
Interval partitioned table
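A range-partitioning sketch in GaussDB/Oracle-style DDL (names invented; exact syntax varies by product): each partition holds one year of data.

```sql
CREATE TABLE sales (
    sale_id  BIGINT,
    sale_day DATE
)
PARTITION BY RANGE (sale_day)
(
    PARTITION p2023 VALUES LESS THAN ('2024-01-01'),
    PARTITION p2024 VALUES LESS THAN ('2025-01-01')
);
```

A query filtered on sale_day then scans only the matching partition, which is the partition pruning described below.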
Partition table benefits
Improve query performance
Enhance availability
Easy maintenance
Balanced I/O
The principle of partition pruning
Partition pruning
When querying partition objects, you can only search the partitions you care about to improve retrieval efficiency.
Applicable scenarios for partitioning
Data distribution
Data strategy selection
Distribution column selection principles
Data types
Field design suggestions
Try to use efficient data types
Try to use data types with higher execution efficiency
Try to use short field data types
Use consistent data types
When multiple tables have logical relationships, fields with the same meaning should use the same data type.
For string data, it is recommended to use variable length string data type and specify the maximum length
View
A view is different from a base table: it does not physically exist; it is a virtual table
The role of views
Functions
Simplify operations: define frequently used data as views
Security: users can only query and modify the data visible to them
Logical independence: shields applications from changes in the structure of the underlying tables
Restrictions
Performance: a query on a view may look simple, but the statement the view encapsulates can be complex
Modification restrictions: for complex views, users cannot modify base-table data through the view
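A small sketch (names invented) of the security and simplification roles above: the view exposes only the columns a user may see, and users query the view instead of the base table.

```sql
-- Hide the salary column from general users.
CREATE VIEW v_employee_public AS
    SELECT employee_id, name, dept_id
    FROM employee;
GRANT SELECT ON v_employee_public TO report_reader;  -- role from the RBAC sketch
```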
Index
An index provides pointers to the data values stored in specified columns of a table, like a book's table of contents. It speeds up queries on the table, but it also increases the processing time of insert, update, and delete operations
When creating indexes, the following recommendations serve as a reference
Creating indexes on columns that are frequently searched can speed up searches
Create an index on a column that serves as the primary key to enforce the uniqueness of the column and organize the arrangement of the data in the table
Create indexes on columns that frequently need to be searched based on ranges because the index is already sorted and its specified range is contiguous
Create indexes on columns that often need to be sorted, because the index is already sorted, so that queries can take advantage of the sorting of the index to speed up sorting query times.
Create indexes on columns that frequently use WHERE clauses to speed up the judgment of conditions.
Create indexes for fields that often appear after the keywords ORDER BY, GROUP BY, and DISTINCT
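Putting several of the recommendations above together in one hedged sketch (table and column names invented): a single index on hire_date serves WHERE filtering, range searches, and ORDER BY on that column.

```sql
CREATE INDEX idx_employee_hire_date ON employee (hire_date);
```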
Effective indexes
Creating an index ≠ the index will necessarily be used
After an index is created, the system automatically decides when to use it; the index is used when the system judges an index scan to be faster than a sequential scan
Once created, an index must stay synchronized with its table so that new data can be found accurately, which adds load to data-modification operations
Useless indexes should be deleted regularly
How to check
Run the EXPLAIN statement to view the execution plan and determine whether an index is used, as in the sketch below
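For example (plan output varies by database), EXPLAIN shows whether the optimizer chose the index or a sequential scan; the names continue the index sketch above:

```sql
EXPLAIN
SELECT * FROM employee
WHERE hire_date >= DATE '2024-01-01';
-- An 'Index Scan using idx_employee_hire_date' line means the index is used;
-- a 'Seq Scan' line means the optimizer preferred a full table scan.
```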
Index mode
Constraints
Data integrity refers to the correctness and consistency of data. Integrity constraints can be defined when defining a table.
Integrity constraints are rules that do not occupy database space.
Integrity constraints are stored in the data dictionary along with the table structure definition
Common types of constraints
Uniqueness and primary key constraints (UNIQUE/PRIMARY KEY)
Foreign key constraints (FOREIGN KEY)
Check constraints (CHECK)
Not-null constraints (NOT NULL)
Default constraints (DEFAULT)
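One table definition can carry all five constraint types listed above; a sketch with invented names (it assumes an existing customer table for the foreign key):

```sql
CREATE TABLE account (
    account_id BIGINT PRIMARY KEY,                        -- primary key
    email      VARCHAR(200) UNIQUE,                       -- uniqueness
    owner_id   BIGINT REFERENCES customer (customer_id),  -- foreign key
    balance    DECIMAL(14,2) NOT NULL DEFAULT 0,          -- NOT NULL + default
    CHECK (balance >= 0)                                  -- check constraint
);
```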
Relationships between database objects
Transaction
A transaction is a user-defined series of data operations that are performed as a complete unit of work
Atomicity: A transaction is a logical unit of work in a database. All operations in a transaction must be done or none of them must be done.
Consistency: The execution result of a transaction must be to move the database from one consistency state to another.
Isolation: The execution of a transaction in the database cannot be interfered with by other transactions. That is, the internal operations and data used by a transaction are isolated from other transactions, and transactions executed concurrently cannot interfere with each other.
Durability: once a transaction is committed, its changes to the data in the database are permanent; subsequent operations or failures will not affect the transaction's result
There are two markers for the end of a transaction
Normal end: COMMIT (commit the transaction)
Abnormal end: ROLLBACK (roll back the transaction)
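A classic transfer sketch (account table invented): the two updates form one unit of work, so either both become durable at COMMIT or, on error, ROLLBACK undoes both; this is exactly the atomicity property above.

```sql
START TRANSACTION;
UPDATE account SET balance = balance - 100 WHERE account_id = 1;
UPDATE account SET balance = balance + 100 WHERE account_id = 2;
COMMIT;   -- normal end: both changes become permanent
-- On any error, issue ROLLBACK instead and neither change takes effect.
```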
Transaction processing model
Commit levels
Transaction isolation levels
Correspondence between transaction isolation levels and concurrency problems