Complete knowledge of algorithm data structures
Basics Part 2
Why does Redis use skip lists to implement sorted sets?
Skip list: a linked list plus multiple levels of indexes; a dynamic data structure that supports fast insertion, deletion, and search.
The multi-level index allows a binary-search-like lookup over a linked list, with O(log n) time complexity.
Trading space for time yields higher query efficiency; the space overhead can be reduced by lowering the density of index nodes, and when the stored elements are large objects the index overhead is negligible.
Efficient dynamic insertion and deletion
Insertion: O(log n) to find the position plus O(1) to link the node in; deletion: O(log n) to find the node, then remove it together with its index nodes.
Dynamic update of the skip list index
In extreme cases, the skip list may degenerate into a singly linked list.
When inserting data, a random function generates a level k, and the new node is also added to the first k levels of the index to keep the skip list roughly "balanced" (see the sketch below).
Why use skip lists instead of red-black trees?
Range queries are more efficient on a skip list than on a red-black tree.
Skip lists are also more flexible: the index-building strategy can be tuned to balance execution efficiency against memory consumption.
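As a rough illustration (not Redis's actual code), the sketch below shows the usual way a skip list chooses how many index levels a newly inserted node gets; the class name, the MAX_LEVEL cap of 16, and the 1/2 promotion probability are assumptions for the example.

```java
import java.util.Random;

// Minimal sketch: deciding the index level of a newly inserted skip-list node.
// Each node is promoted to one more index level with probability 1/2, so the
// expected number of index nodes halves per level and the height stays O(log n).
public class SkipListLevel {
    private static final int MAX_LEVEL = 16;   // assumed cap on index levels
    private static final Random RANDOM = new Random();

    static int randomLevel() {
        int level = 1;
        while (RANDOM.nextBoolean() && level < MAX_LEVEL) {
            level++;                           // promote to one more index level
        }
        return level;                          // insert the node into levels 1..level
    }

    public static void main(String[] args) {
        System.out.println("new node goes into " + randomLevel() + " index level(s)");
    }
}
```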
Hash table: How to implement the spell-check feature for words in a document editor?
Hash collision
open addressing method
Compute the slot from the hash function; if it is already occupied, probe the following slots in sequence until an empty one is found.
Lookup works the same way: start at the hashed slot and probe in sequence until the key or an empty slot is found.
Deleting elements: mark the slot as "deleted"; probing skips over such markers instead of stopping at them.
Disadvantages: with a lot of data there are many collisions, probe sequences become long, and in the extreme case a lookup degrades to O(n).
The above is linear probing (sketched below); there are also quadratic probing (probe offsets grow as squares: 1, 4, 9, ...) and double hashing.
Measuring collisions: load factor = number of stored elements / hash table length.
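A minimal, illustrative sketch of open addressing with linear probing; the class and method names are made up for the example, and it assumes the table never fills up completely.

```java
// Open addressing with linear probing: on a collision, walk forward one slot
// at a time until an empty slot (or the key itself) is found.
public class LinearProbingDemo {
    private final String[] keys = new String[8];     // small fixed capacity for illustration

    void put(String key) {
        int i = (key.hashCode() & 0x7fffffff) % keys.length;
        while (keys[i] != null && !keys[i].equals(key)) {
            i = (i + 1) % keys.length;               // probe the next slot on collision
        }
        keys[i] = key;                               // assumes the table never becomes full
    }

    boolean contains(String key) {
        int i = (key.hashCode() & 0x7fffffff) % keys.length;
        while (keys[i] != null) {
            if (keys[i].equals(key)) return true;
            i = (i + 1) % keys.length;               // keep probing until an empty slot
        }
        return false;
    }
}
```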
linked list method
Rehashing
Construct several hash functions; when one produces a collision, compute the slot again with the next one.
Create a public overflow area
Split storage into a primary table and an overflow table: elements that collide in the primary table are placed in the overflow table.
How to design an efficient hash function
The hash function should not be too complex and the generated values should be as random and evenly distributed as possible
Approach: analyze the characteristics of the data and choose a suitably random part of it as the key.
Common constructions: direct addressing, mid-square method, folding method, random number method, ASCII-code based methods.
The load factor must not grow too large: expand the table when the threshold is exceeded, and shrink it when the data volume becomes small.
When the data volume is large, all of it cannot be migrated in one go:
When the load factor exceeds the threshold, only allocate the new, larger table.
On each subsequent insert, put the new entry into the new table and also move one entry from the old table over to the new table (incremental rehashing, sketched below).
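A minimal sketch of this incremental-migration idea, with illustrative names and Java's HashMap standing in for the two underlying tables; it is not how any particular library implements it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Incremental rehashing sketch: instead of moving everything at once when the
// threshold is crossed, each subsequent insert migrates one old entry, spreading
// the O(n) migration cost over n inserts.
public class IncrementalRehash {
    private final Map<String, String> oldTable = new HashMap<>();
    private final Map<String, String> newTable = new HashMap<>();
    private final Deque<String> pendingKeys = new ArrayDeque<>();  // keys still in the old table

    void startResize() {
        pendingKeys.addAll(oldTable.keySet());
    }

    void put(String key, String value) {
        newTable.put(key, value);                    // new data always goes to the new table
        String k = pendingKeys.poll();
        if (k != null) {                             // migrate one old entry per insert
            newTable.put(k, oldTable.remove(k));
        }
    }

    String get(String key) {                         // look in both tables during migration
        String v = newTable.get(key);
        return v != null ? v : oldTable.get(key);
    }
}
```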
Choose a conflict resolution
Open addressing applicable scenarios
Suitable when the amount of data is small and the load factor is low; Java's ThreadLocalMap uses open addressing (linear probing) to resolve hash collisions.
linked list method
More suitable for storing large objects and large amounts of data; it is also more flexible and supports more optimizations, such as replacing the linked list with a red-black tree.
Design requirements
Supports fast query, insertion, and deletion operations
Memory usage must be reasonable and must not waste too much space.
Performance must be stable: even in extreme cases the hash table must not degrade to an unacceptable level.
Solution
Design a suitable hash function
Define load factor thresholds and design dynamic expansion
Choosing an appropriate hash collision resolution method
Hash tables and linked lists used together
LinkedHashMap
Implemented with a hash table plus linked lists: each node sits both in a bucket's collision chain (singly linked) and in a doubly linked list that records ordering.
How to find, how to delete, how to add
With access ordering enabled, the add and access operations behave just like the LRU cache-eviction algorithm.
A hash table alone gives efficient insertion, deletion, and lookup of dynamic data, but it cannot support fast traversal in a meaningful order; chaining the entries into a linked list adds that ability (see the LRU sketch below).
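A minimal sketch of an LRU cache built on Java's LinkedHashMap, which combines exactly these two structures; the capacity value passed in is an arbitrary example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LRU cache sketch: the hash table gives O(1) lookup and the internal doubly
// linked list keeps entries in access order, so the eldest entry is the least
// recently used one and can be evicted when the capacity is exceeded.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true);                      // accessOrder = true: reads move entries to the tail
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;                    // evict the least recently used entry
    }
}
```

Usage is just `new LruCache<String, String>(100)` followed by ordinary `put`/`get` calls.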
hash algorithm
The hash algorithm needs to satisfy
The original data cannot be derived back from the hash value (one-way).
Very sensitive to the input: changing even one bit produces a completely different hash value.
The probability of hash collisions must be very small.
Execution must be efficient.
Applications
Security encryption, data verification, hash function, load balancing, data sharding, distributed storage
What are the applications of hash in distributed systems?
Load balancing: hash the client IP address or session ID and take it modulo the size of the server list, so requests from the same client are always routed to the same server.
Data sharding: e.g. counting how often each "search keyword" appears in a 1 TB log file, or checking whether an image already exists in a huge image library, by hashing keys to distribute the work across machines.
Distributed storage: the consistent hashing algorithm, which limits how much data must move when nodes are added or removed.
Binary tree
definition
Node height, node depth, node level, tree height, complete binary tree, full binary tree
storage
Chained storage, sequential storage
Arrays can be used to store a complete binary tree, which is the most space-saving
Binary tree recursive traversal
Search, delete, add in binary search tree
Deletion has three cases: the deleted node is a leaf, it has one child, or it has two children (replace it with the smallest node of its right subtree).
How to handle duplicate keys in a binary search tree
Store equal values on one node via a linked list or a dynamically expanding array.
Or treat a duplicate as greater than the existing node and keep inserting it into the right subtree.
Why use binary search trees rather than hash tables?
1. Hash table data is stored unordered, whereas a binary search tree yields sorted data with a simple in-order traversal.
2. Hash table expansion is time-consuming and performance is unstable under heavy collisions; a balanced binary search tree keeps the time complexity stably at O(log n).
3. Hash tables are complicated to design well, with many cases to consider, and the load factor wastes some space.
red black tree
definition
Balanced binary tree (strict definition): for any node, the height difference between its left and right subtrees is at most 1.
A red-black tree is only approximately balanced: the subtree height difference is not guaranteed to be 1; it only guarantees that the tree height stays O(log n).
The color of the root node is black
Each leaf is a black empty node, and the leaf node does not store data.
There cannot be two consecutive red nodes from a node to a leaf node.
For every node, all paths from that node to its reachable leaf nodes contain the same number of black nodes.
Balance: Make the tree look symmetrical and balanced on the left and right, ensuring that the height of the entire tree is relatively low, and the efficiency of inserting, deleting, and searching is high.
Recursion tree
Used to analyze the time complexity of recursive code.
Heap sort
Definition: A heap is a complete binary tree
Property: in a max-heap every node's value is greater than or equal to its children's values (in a min-heap, less than or equal).
Two steps of heap sorting: building heap and sorting
Insert an element: heapify (sift up) from bottom to top.
Delete the heap top: move the last element to the top, then heapify (sift down) from top to bottom.
Complexity: building the heap is O(n), sorting is O(n log n), so heap sort is O(n log n) overall (see the sift-down sketch below).
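A minimal heap-sort sketch (0-based array indexing, max-heap) showing the sift-down step used both for the O(n) heap build and the O(n log n) sorting phase; names are illustrative.

```java
// Heap sort sketch: siftDown pushes the node at i down until it is >= both children.
public class HeapSort {
    static void siftDown(int[] a, int i, int size) {
        while (true) {
            int largest = i, left = 2 * i + 1, right = 2 * i + 2;
            if (left < size && a[left] > a[largest]) largest = left;
            if (right < size && a[right] > a[largest]) largest = right;
            if (largest == i) return;                          // heap property restored
            int tmp = a[i]; a[i] = a[largest]; a[largest] = tmp;
            i = largest;
        }
    }

    static void heapSort(int[] a) {
        for (int i = a.length / 2 - 1; i >= 0; i--) siftDown(a, i, a.length); // build heap, O(n)
        for (int end = a.length - 1; end > 0; end--) {
            int tmp = a[0]; a[0] = a[end]; a[end] = tmp;       // move current max to the end
            siftDown(a, 0, end);                               // restore the heap on the prefix
        }
    }
}
```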
Typical applications
priority queue
The queue is dequeued successively according to the priority. You can use the heap to implement the priority queue.
Huffman coding, shortest path of graph, minimum spanning tree
timer
Store scheduled tasks in a heap and just wait for the interval until the task at the heap top is due, instead of polling on a fixed schedule.
Get top 100
Maintain a min-heap (small-top heap) of 100 elements.
Static data: compare each element with the heap top; if it is smaller, skip it; if it is larger, remove the heap top, insert the element, and re-heapify.
Dynamic data: with interleaved "add data" and "query current top 100" operations, simply keep maintaining the same heap (see the sketch below).
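A minimal top-100 sketch using Java's PriorityQueue (a min-heap by default); K = 100 and the class name are assumptions of the example.

```java
import java.util.PriorityQueue;

// Keep the 100 largest values seen so far: only values larger than the current
// heap top (the smallest of the kept 100) displace it.
public class Top100 {
    private static final int K = 100;
    private final PriorityQueue<Integer> heap = new PriorityQueue<>();

    void offer(int value) {
        if (heap.size() < K) {
            heap.add(value);
        } else if (value > heap.peek()) {   // larger than the smallest of the current top 100
            heap.poll();                    // drop the current minimum
            heap.add(value);                // heap stays at exactly K elements
        }
        // values <= the heap top are ignored
    }
}
```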
Computing the median of a data stream, or the 99th-percentile response time of a program interface, can also be done with heaps.
Representation of graph
Like a binary tree, it is a nonlinear structure
Vertices, edges, degrees, in-degrees, out-degrees, weighted graphs, adjacency matrices, adjacency lists
Advantages and disadvantages of adjacency matrix and adjacency list
Adjacency matrix: It wastes space in storage, has high query efficiency, and facilitates matrix operations.
Adjacency list: each vertex stores a linked list of its neighbors, which saves space but makes lookups slower than an adjacency matrix; the lists can be upgraded to balanced binary trees, skip lists, or hash tables.
Depth-first and breadth-first search (the "six degrees of separation" problem)
Breadth-first search (BFS)
Time complexity: O(E), where E is the number of edges
Space complexity: O(V), where V is the number of vertices
Depth-first search (DFS)
Same complexities as above
Breadth-first search is implemented with a queue; depth-first search with a stack (or recursion), as sketched below.
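A minimal BFS sketch over an adjacency-list graph; the adjacency-list representation and the "visit" print action are illustrative assumptions.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

// Breadth-first search: a queue holds the frontier, a visited[] array prevents
// revisiting, giving O(V + E) work overall.
public class Bfs {
    static void bfs(List<List<Integer>> adj, int start) {
        boolean[] visited = new boolean[adj.size()];
        Queue<Integer> queue = new ArrayDeque<>();
        visited[start] = true;
        queue.add(start);
        while (!queue.isEmpty()) {
            int v = queue.poll();
            System.out.println("visit " + v);
            for (int next : adj.get(v)) {
                if (!visited[next]) {
                    visited[next] = true;   // mark when enqueued, not when dequeued
                    queue.add(next);
                }
            }
        }
    }
}
```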
string matching
BF (Brute Force violent matching)
Align the pattern string with every position of the main string and compare character by character (sketched below).
Time complexity: O(n*m)
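A minimal sketch of the BF approach; names are illustrative.

```java
// Brute-force matching: try every starting position in the main string and compare
// the pattern character by character, O(n * m) in the worst case.
public class BruteForceMatch {
    static int indexOf(String text, String pattern) {
        int n = text.length(), m = pattern.length();
        for (int i = 0; i + m <= n; i++) {
            int j = 0;
            while (j < m && text.charAt(i + j) == pattern.charAt(j)) j++;
            if (j == m) return i;      // full match starting at position i
        }
        return -1;                     // no occurrence
    }
}
```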
RK (Rabin-Karp)
Use the hash algorithm to calculate the hash value of all substrings of the string and compare it with the hash value of the pattern string
Time complexity is O(n): the substring hashes are computed incrementally in O(n), and comparing them with the pattern's hash also takes O(n).
BM (Boyer-Moore)
Relatively complex string matching, but has high matching efficiency and is widely used in text editors
The core idea is that when the pattern string does not match the main string, the pattern string slides back a few more places to reduce unnecessary character matching.
There are two main construction rules: bad character rules and good suffix rules
KMP
It is very similar to the BM algorithm. When bad characters are encountered, the pattern string is moved a few more places.
Time complexity: O(n + m)
AC automaton
An improvement built on the Trie tree for multi-pattern matching; each node carries a failure pointer (analogous to KMP's next array).
Build steps
Building an AC automaton
Build Trie tree
Build failure pointer
Match main string in AC automaton
Trie tree
Also called a dictionary tree: a tree data structure specialized for string matching, used to quickly look up whether a string exists in a set.
Time complexity of searching a string of length k in a Trie: O(k)
The Trie trades space for time and can consume a lot of memory.
Mitigation: store each node's children in a hash table or red-black tree instead of a fixed array.
The Trie has fairly strict requirements on the strings it handles:
The character set must not be too large.
The strings should share prefixes heavily.
You usually have to implement the Trie logic yourself (a sketch follows below).
Because nodes are linked by pointers, data blocks are not contiguous in memory, which is unfriendly to the CPU cache and hurts performance.
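A minimal Trie sketch that assumes lowercase a-z input, which is exactly the 26-way child array / space-for-time trade-off mentioned above.

```java
// Trie over lowercase a-z; other character sets would need a map per node.
public class Trie {
    private static class Node {
        Node[] children = new Node[26];
        boolean isWord;
    }

    private final Node root = new Node();

    void insert(String word) {
        Node cur = root;
        for (char c : word.toCharArray()) {
            int idx = c - 'a';
            if (cur.children[idx] == null) cur.children[idx] = new Node();
            cur = cur.children[idx];
        }
        cur.isWord = true;                 // mark the end of a complete word
    }

    boolean search(String word) {          // O(k) in the length of the query string
        Node cur = root;
        for (char c : word.toCharArray()) {
            cur = cur.children[c - 'a'];
            if (cur == null) return false;
        }
        return cur.isWord;
    }
}
```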
application
Automatic input completion, input method automatic completion, IDE compiler automatic completion
greedy algorithm
Huffman compression coding, minimum spanning tree, single source shortest path, knapsack problem
Difficulty: Abstract the problem into a greedy algorithm model
Idea: The basic idea of the greedy algorithm is to find the optimal solution for each small part of the whole, and combine all these local optimal solutions to form an optimal solution for the whole
Scope of application
The overall optimal solution can be found through local optimal solutions
The whole can be divided into multiple parts, and an optimal solution can be found for each part.
Steps
Start from an initial solution of the problem.
In a loop, while a step toward the goal can still be taken, apply the locally optimal strategy to obtain a partial solution and shrink the scope or scale of the remaining problem.
Combine all the partial solutions into the final solution of the problem (see the worked example below).
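A worked example of this pattern, assuming the classic interval-scheduling problem (choose the maximum number of non-overlapping intervals by always taking the one that ends earliest); the data in main is made up.

```java
import java.util.Arrays;
import java.util.Comparator;

// Greedy interval scheduling: the locally optimal choice (earliest-ending compatible
// interval) also yields the globally optimal count here.
public class IntervalScheduling {
    static int maxNonOverlapping(int[][] intervals) {
        Arrays.sort(intervals, Comparator.comparingInt(a -> a[1]));  // sort by end time
        int count = 0, lastEnd = Integer.MIN_VALUE;
        for (int[] it : intervals) {
            if (it[0] >= lastEnd) {       // compatible with everything chosen so far
                count++;
                lastEnd = it[1];
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[][] intervals = {{1, 3}, {2, 4}, {3, 5}, {6, 7}};
        System.out.println(maxNonOverlapping(intervals));  // prints 3: [1,3], [3,5], [6,7]
    }
}
```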
getting Started
Algorithm function
Write better performing code
Algorithms, expand your way of thinking
Can train the brain’s thinking ability
Conducive to framework reading and design thinking
Difficulty learning algorithms
Day-to-day projects rarely use them directly.
easy to forget
Unable to use flexibly
The relationship between algorithms and data structures
Data structures serve algorithms, and algorithms act on specific data structures.
how to learn
Think more, do more, practice while learning, write repeatedly, ask more questions, interact more
First, master algorithm complexity analysis, work hard, and balance time and space resources.
You need to be able to concentrate on learning, and knowledge needs to be continuously accumulated
Learn the way you level up in a game: keep building your interest and experience bit by bit.
Write study notes for each article and implement the code of the week once a week
The origin, characteristics, applicable scenarios of learning algorithms, and the problems it can solve
Learning purpose
Build an awareness of time and space complexity, and write high-quality code.
Able to design infrastructure, improve programming skills, and train logical thinking
Improve the depth of looking at problems and develop angles to solve problems
Recommended books
"Algorithm 4th Edition Java"
"Sword Finger Offer"
"Programming Jewels"
"The Beauty of Programming"
"Data Structure and Algorithm Analysis"
"Introduction to Algorithms"
Basics Part 1
Complexity analysis
How to analyze and count algorithm execution efficiency and resource consumption
What is complexity analysis
Data structures and algorithms are about "how to make computers solve problems faster and with less memory".
Therefore, the performance of data structures and algorithms needs to be evaluated from the two dimensions of execution time and space occupied.
Performance problems are described using the two concepts of time complexity and space complexity respectively, both of which are collectively referred to as complexity.
Complexity describes the growth relationship between the execution time (or space occupied) of an algorithm and the size of the data.
Necessity of complexity analysis
Compared with performance testing, complexity analysis has the characteristics of not relying on the execution environment, low cost, high efficiency, easy operation, and strong guidance.
Mastering complexity analysis will enable you to write code with better performance, which will help reduce system development and maintenance costs.
How to perform complexity analysis
Big O notation
The execution time of an algorithm is proportional to the total number of times its lines of code execute, written T(n) = O(f(n)), where T(n) is the total execution time, f(n) is the total number of line executions, and n usually denotes the size of the data.
Take time complexity as an example: since it describes how execution time grows with data size, constant terms, low-order terms, and coefficients do not affect the growth trend, so they are dropped during the analysis.
Common complexity levels
polynomial order
O(1) (constant order), O(logn) (logarithmic order), O(n) (linear order), O(nlogn) (linear logarithmic order), O(n^2) (square order), O(n^3) (cubic order)
non-polynomial order
O(2^n) (exponential order), O(n!) (factorial order)
Logarithmic order derivation formula
By the change-of-base formula, log₃n = log₃2 × log₂n, so all logarithmic orders differ only by a constant factor and are uniformly written O(log n).
2^x = n: a loop whose variable doubles each iteration exits after x = log₂n iterations.
Time complexity analysis
Best, worst, average, and amortized time complexity
average time complexity
If the complexity of the code differs in magnitude under different circumstances, it is represented by the weighted average of the number of times the code is executed under all possible circumstances.
Amortized time complexity
Applies when the code has low complexity in most cases and high complexity only in a few cases;
and the low- and high-complexity cases occur in a regular, predictable pattern. The amortized result generally equals the low-order complexity.
Why do arrays start numbering from 0?
How to implement random access
linear table
Continuous memory space and data of the same type
Array out of bounds problem
A classic C example: depending on the compiler and on gcc's stack-protector option (-fno-stack-protector), the loop variable i may be laid out before or after the array on the stack, so an out-of-bounds write can silently overwrite i; Java, by contrast, checks array bounds at runtime.
Insertion and deletion optimization
If the array does not need to stay ordered, insertion at position k can be done by moving the element currently at k to the end of the array and writing the new element at k, avoiding shifting.
To avoid moving data on every deletion, mark elements as deleted and physically remove them in one batch when space runs out, similar to a garbage collector's mark-and-sweep.
Array advantage
Java's collection framework cannot store primitive types without boxing, so arrays are preferable for primitives.
If you know the size of the data and the data operation is simple, you can use an array.
Why array indices start from 0
It saves one instruction per access.
a[k]_address = base_address + k * type_size; if indexing started from 1 it would be base_address + (k - 1) * type_size, an extra subtraction on every access.
How to implement LRU (Least Recently Used) cache elimination algorithm
There are two cases of deleting a node from a linked list, and singly and doubly linked lists differ in efficiency:
Delete a node whose value is equal to a certain value
Delete the node pointed to by the given pointer
Programming optimization ideas
Optimize by exchanging space for time
Arrays are simple, easy to use, and CPU-cache friendly because their storage is contiguous and the CPU caches whole blocks; linked-list nodes are scattered in memory, so they do not benefit.
Implement LRU
If the data is already in the cache, delete its node and re-insert it at the head of the list.
Not in cache
The cache is not full, insert into the head
When the cache is full, delete the tail node and insert a new node into the head.
Idea for checking a palindrome string with a singly linked list
Practice LeetCode 206, 141, 21, 19, 876
Better written linked list algorithm
Understand the meaning of pointers or references
Be wary of lost pointers and memory leaks
Simplifying implementation difficulty using Sentinels
Pay attention to the processing of boundary conditions
Linked list is empty
The linked list has only one node
The linked list has two nodes
The code logic is processing the head node and tail node
Drawing to aid thinking
Practice more, use your hands more
Common operations on linked lists
Reverse singly linked list
Speed pointer, forward or backward part reversal
Linked list ring detection
Determine the cycle in a singly linked list
Either traverse while recording visited nodes to detect a repeat, or use fast and slow pointers.
Finding the entry point of the cycle
The distance from the head of the list to the entry point equals the distance from the point where slow and fast meet to the entry point.
Find the number of nodes on the cycle
Start from the meeting point and traverse until you return to it, counting the nodes (a sketch of the detection step follows below).
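A minimal fast/slow-pointer sketch for cycle detection (Node is a bare illustrative class); finding the entry point would then restart one pointer from the head and advance both one step at a time until they meet again.

```java
// Fast/slow pointer cycle detection: fast advances two steps per move, slow one;
// they can only meet if the list contains a cycle.
public class CycleDetection {
    static class Node { int val; Node next; Node(int v) { val = v; } }

    static boolean hasCycle(Node head) {
        Node slow = head, fast = head;
        while (fast != null && fast.next != null) {
            slow = slow.next;
            fast = fast.next.next;
            if (slow == fast) return true;   // pointers met inside the cycle
        }
        return false;                        // fast reached the end: no cycle
    }
}
```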
Merge two ordered linked lists
Advance two pointers, always appending the smaller of the two current nodes.
Delete the n-th node from the end of the list
Either keep two pointers n nodes apart and move them together (one pass), or traverse twice: first find the length, then walk to node length - n + 1.
Find the middle node of the linked list
Fast and slow pointers: when the fast pointer reaches the end, the slow pointer is at the middle.
Using stack to implement browser rollback function
Stack: A linear list with limited operations, only allowing insertion and deletion of data at one end
Implementation method: sequential stack, chain stack
the complexity
Time complexity: best case O(1), worst case O(n) (when the underlying array has to be expanded)
Amortized time complexity: O(1)
Function calls: the operating system gives each thread its own stack memory, and the local/temporary variables of each call are stored there as stack frames.
Application in evaluating arithmetic expressions
Use two stacks, one for operands and one for operators. Before pushing an operator, compare it with the operator on top of the stack: if its precedence is not higher, first pop the top operator and apply it to the top two operands.
Use a stack to match brackets (sketch below).
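A minimal bracket-matching sketch with a stack, limited to the three common bracket pairs as an assumption.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Push opening brackets; on a closing bracket, check it pairs with the stack top.
public class BracketMatcher {
    static boolean isValid(String s) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : s.toCharArray()) {
            if (c == '(' || c == '[' || c == '{') {
                stack.push(c);
            } else if (c == ')' || c == ']' || c == '}') {
                if (stack.isEmpty()) return false;          // closer with no opener
                char open = stack.pop();
                if ((c == ')' && open != '(') || (c == ']' && open != '[')
                        || (c == '}' && open != '{')) {
                    return false;                            // mismatched pair
                }
            }
        }
        return stack.isEmpty();   // any leftover opener means an unmatched bracket
    }
}
```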
memory space
code area
Code area: stores the binary code of method bodies; high-level (job) scheduling, intermediate (memory) scheduling, and low-level (process) scheduling control which code in the code area is being executed.
static data area
Store global variables, static variables, and constants. Constants include final-modified constants and String constants. The system automatically allocates and recycles.
dynamic data area
stack area, heap area
Stack: stores formal parameters, local variables, and return values of running methods. Automatically allocated and recycled by the system
Heap: The reference or address of a new object is stored in the stack area, pointing to the real data of the object stored in the heap area.
LeetCode programming exercises
20, 155, 232, 844, 224, 682, 496
The use of queues in limited resource pools such as thread pools
Queue: A linear table with limited operations, only allowing insertion at one end and deletion at the other end
There are also two ways to implement
Sequential queue implemented by array, chained queue implemented by linked list
An array-based queue looks full when tail == n even if there is free space in front of head; this needs to be handled:
Option 1: copy everything into a larger array, O(n) time.
Option 2: when tail reaches n, shift the existing elements to the front of the array in one batch during enqueue, keeping the amortized time at O(1).
A circular queue avoids the data movement entirely (empty/full test sketched below).
Empty and full judgment
Non-circular array queue: empty when head == tail, full when tail == n
Circular queue: empty when head == tail, full when (tail + 1) % n == head
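A minimal circular-queue sketch matching the empty/full conditions above; it deliberately wastes one slot so that full and empty can be distinguished (the int payload and capacity handling are illustrative).

```java
// Array-based circular queue: head == tail means empty, (tail + 1) % n == head means full.
public class CircularQueue {
    private final int[] items;
    private int head = 0, tail = 0;

    CircularQueue(int capacity) {
        items = new int[capacity + 1];     // one wasted slot distinguishes full from empty
    }

    boolean enqueue(int value) {
        if ((tail + 1) % items.length == head) return false;  // full
        items[tail] = value;
        tail = (tail + 1) % items.length;
        return true;
    }

    Integer dequeue() {
        if (head == tail) return null;                          // empty
        int value = items[head];
        head = (head + 1) % items.length;
        return value;
    }
}
```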
blocking queue
Applied in the producer-consumer model
concurrent queue
Regarding thread safety, using concurrent queues, you can directly lock enqueue() and dequeue(). However, the lock granularity is large and the concurrency is relatively low.
You can use array-based circular queues and use CAS atomic operations to achieve efficient concurrency.
Queues in limited resource pools: thread pools queue tasks waiting for threads; database connection pools queue requests waiting for connections.
Unbounded queue based on linked list
Too many tasks, too long waiting time, not suitable
Array-based bounded queue
Request exceeds queue size, request rejected
Mainly to find the most reasonable queue size
Recursion: How to find the "final recommender" with three lines of code?
Recursion needs to meet three conditions
The problem can be broken down into several sub-problems.
The sub-problems are solved in exactly the same way as the original problem, only at a smaller scale.
There is a recursive termination condition
Prevent stack overflow
Solutions: 1. convert the recursion to a non-recursive implementation; 2. use static instead of non-static variables; 3. increase the stack size.
Be wary of double counting
You can use a hash table to cache the result of each recursive call (memoization, sketched below).
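A minimal memoization sketch using Fibonacci as a stand-in recursive problem (the outline does not name a specific function).

```java
import java.util.HashMap;
import java.util.Map;

// Cache each computed result in a hash table, turning the exponential naive
// Fibonacci recursion into an O(n) computation.
public class MemoizedFib {
    private static final Map<Integer, Long> CACHE = new HashMap<>();

    static long fib(int n) {
        if (n <= 1) return n;                      // recursion termination condition
        Long cached = CACHE.get(n);
        if (cached != null) return cached;         // reuse a previously computed result
        long result = fib(n - 1) + fib(n - 2);
        CACHE.put(n, result);
        return result;
    }
}
```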
Disadvantages of recursion
Extra function-call time cost, high space cost for stack frames, risk of stack overflow, and repeated computation.
Rewrite recursive code and change recursive code to non-recursive code
Recursive debugging methods
Print logs to discover recursive values and debug using conditional breakpoints
Why insertion sort is more popular than bubble sort
Wide variety of sorting
The most commonly used: bubble sort, insertion sort, selection sort, merge sort, quick sort, counting sort, radix sort, bucket sort.
Such as monkey sorting, sleep sorting, noodle sorting, etc.
How to analyze a sorting algorithm
Algorithm execution efficiency
Best, worst, and average case time complexity
Coefficients, constants, low-order of time complexity
The number of comparisons and exchanges
Memory consumption of sorting algorithm
The stability of sorting algorithms
A stable sorting algorithm keeps elements with equal keys in their original relative order after sorting (e.g. sort orders by amount; orders with the same amount remain ordered by order time).
Bubble Sort
An in-place, stable sorting algorithm (adjacent equal elements are not swapped); best case O(n), worst case O(n²)
Analyzing the average time complexity
Analysis via the concepts of "order degree" and "inversion degree"
Order degree: the number of ordered pairs in the array; inversion degree = full order degree (n*(n-1)/2) - order degree
Worst case: order degree 0, so n*(n-1)/2 swaps are needed; on average about n*(n-1)/4 swaps, giving an average time complexity of O(n²)
insertion sort
An in-place, stable sorting algorithm; best case O(n), worst case O(n²), average case O(n²).
selection sort
An in-place but unstable sorting algorithm; best, worst, and average case are all O(n²).
It is precisely because selection sort is unstable that it loses out to bubble sort and insertion sort.
Insertion sort vs bubble sort comparison
Bubble sort requires three assignments to exchange data, while insertion sort only requires one assignment to move data.
Quick sort and merge sort
Merge sort, select the middle point
Whether the merge is stable depends on whether the position of the same elements changes during the merge.
Time complexity O(n log n), space complexity O(n)
Quick sort: pick a pivot, then partition the elements onto its two sides
Quick sort is unstable, with O(1) extra space and O(n log n) time; in extreme cases it degrades to O(n²). Because it sorts in place, it is the more commonly used of the two.
Both use the divide-and-conquer idea and are implemented with recursion: merge sort hinges on its recurrence plus the merge function, quick sort on its recurrence plus the partition function (sketched below).
The partition step of quick sort can also be used to find the k-th largest element.
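A minimal quick-sort sketch built around the partition function; it uses the last element as the pivot for simplicity, whereas the optimizations discussed later would pick a random or median-of-three pivot.

```java
// Quick sort: partition puts the pivot in its final position, then recurse on both sides.
public class QuickSort {
    static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int p = partition(a, lo, hi);
        sort(a, lo, p - 1);
        sort(a, p + 1, hi);
    }

    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi], i = lo;                 // i marks the boundary of the "< pivot" region
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) {
                int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
                i++;
            }
        }
        int tmp = a[i]; a[i] = a[hi]; a[hi] = tmp; // place the pivot in its final position
        return i;
    }
}
```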
Bucket sort O(n), counting sort O(n + k), radix sort O(d·n)
bucket sort
Applicable scenarios: well suited to external sorting, e.g. sorting 10 GB of data with only 300 MB of memory by splitting the data into buckets.
counting sort
A special case of bucket sort with finer-grained buckets (one per value); suitable when the data range is small and the values are non-negative integers.
Radix sort
Suitable when the data can be split into independent high-to-low "digits", such as sorting phone numbers.
How to implement a general-purpose, high-performance sorting function
The time complexity of linear sorting is relatively low, but the applicable scenarios are quite special. If you need to write a general sorting, you cannot choose linear sorting.
Why is quick sort often used?
Its average time complexity is O(n log n) and it sorts in place; the O(n²) behaviour only appears in the worst case.
Optimizations for pivot selection: median-of-three, random pivot.
Merge sort's time complexity is also low, but its space complexity is O(n).
Application-level sorting: achieved through a mix of sorting methods
The qsort implementation in C
It first tries merge sort, trading space for time; when the amount of data is too large, it switches to quick sort.
The quick sort uses median-of-three pivot selection.
Within the quick sort, intervals shorter than 4 elements are sorted with insertion sort.
For small inputs an O(n²) sort does not necessarily run longer than an O(n log n) one, so insertion sort is chosen for small data, and a sentinel is used to simplify the inner loop and squeeze out performance.
binary search
How to use the most memory-saving way to achieve fast search function?
Binary search is highly efficient, with a time complexity of O(log n).
An ordered array data structure is required, suitable for situations where the scale is relatively large, queries are frequent, and there are no frequent data insertion or deletion operations.
Pay attention to the exit of the loop condition, the value of mid, and the update of low and high
Binary search can be implemented using loops and recursion
Hash tables and binary trees can solve the problem of quickly searching dynamic data structures, but they require relatively large additional memory space.
LeetCode 33
Variants of binary search on arrays with duplicate elements (the first one is sketched after this list)
Query the first occurrence of the value
Query the last occurrence of the value
Query the first number greater than a specific value
Query the last number smaller than a specific value
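A minimal sketch of the first variant above (first occurrence of a value in a sorted array with duplicates); the other variants differ only in how mid is compared and which bound moves.

```java
// Binary search variant: index of the first occurrence of target, or -1 if absent.
public class BinarySearchFirst {
    static int firstOccurrence(int[] a, int target) {
        int low = 0, high = a.length - 1, result = -1;
        while (low <= high) {
            int mid = low + (high - low) / 2;       // avoids overflow of (low + high)
            if (a[mid] < target) {
                low = mid + 1;
            } else {
                if (a[mid] == target) result = mid; // record a hit, keep searching to the left
                high = mid - 1;
            }
        }
        return result;
    }
}
```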