MindMap Gallery Python big data mining directory
This mind map is a table of contents for Python big data mining and analysis. It covers Python basics, financial data mining, databases, cloud server deployment practice, and related topics.
Edited at 2022-03-13 22:11:25
Python big data mining and analysis
Chapter 1: Python basics
Installation and use of Python and PyCharm
Installation
Usage
Python basics
Variables, lines, indentation, and comments
Data types
Numbers and strings
Lists and dictionaries, tuples and sets
Operators
Python statements
if conditional statement
for loop statement
while loop statement
try/except exception handling statement
Functions and libraries
Function definition and calling
Function return value and scope
Introduction to commonly used basic functions
Libraries
Chapter 2: Crawler Basics of Financial Data Mining
Basics of crawler technology
Web page structure basics
View web page source code
F12 key
Right-click menu
URL composition and the HTTP and HTTPS protocols
A preliminary understanding of web page structure
Advanced web page structure
HTML basics
Basic structure
Headings, paragraphs, and links
Blocks
class and id
First hands-on practice
Get web page source code
Analyze source code information
Regular expressions
findall() function
Non-greedy matching with (.*?)
Non-greedy matching with .*?
The re.S modifier, which lets . match newlines
Supplementary knowledge points
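As an illustration of the regular expression topics in this chapter, here is a minimal sketch of findall() with non-greedy groups and the re.S modifier. The HTML snippet and its URLs are made up for the example and do not come from any real site.

```python
import re

# Hypothetical HTML snippet; the markup and URLs are invented for illustration.
html = """<h3 class="title"><a href="https://example.com/a">First
headline</a></h3>
<h3 class="title"><a href="https://example.com/b">Second headline</a></h3>"""

# Each (.*?) captures as little text as possible between the fixed parts.
pattern = r'<a href="(.*?)">(.*?)</a>'

# With re.S, "." also matches newlines, so the title split over two lines is found.
matches = re.findall(pattern, html, re.S)

# Without re.S, "." stops at a newline and the first link is missed.
without_flag = re.findall(pattern, html)

for url, title in matches:
    # Collapse the line break left over from the source formatting.
    print(url, " ".join(title.split()))
```

Because the pattern has two groups, findall() returns a list of (url, title) tuples, which is convenient for the cleaning step that usually follows.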
Chapter 3: Financial Data Mining Case Practice 1
Extract Baidu news titles, URLs, dates and sources
Get web page source code
Write regular expressions to extract news information
Data cleaning and printout
Obtain Baidu news from multiple companies in batches to generate data reports
Batch crawl Baidu news from multiple companies
Automatically generate public opinion data report text files
Exception handling and 24-hour real-time data mining practice
Exception handling practice
24-hour real-time crawling practice
Crawl chronologically and crawl multiple pages of content in batches
Crawling Baidu News in chronological order
Crawl multiple pages of content in batches at one time
Sogou News and Sina Finance Data Mining Practice
Sogou News Data Mining Practice
Sina Finance Data Mining Practice
Chapter 4: Detailed explanation and practical use of database
Introduction and installation of MySQL database
MySQL database basics
Introduction to MySQL database management platform phpMyAdmin
Create database and data tables
Basic operations of data tables
Python interaction with MySQL database
Install PyMySQL library
Connect to database using Python
Store data into database using Python
Find and extract data from database using Python
Delete data from database with Python
Case practice: storing financial data in the database
Chapter 5: Data cleaning optimization and data scoring system construction
In-depth analysis - data deduplication and cleaning optimization
Data deduplication
Common data cleaning methods and unifying date formats
Deep filtering of text content: removing noisy data
Handling garbled data
Encoding analysis
Re-encoding and decoding
Empirical methods for solving garbled code problems
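The re-encoding and decoding idea can be sketched as follows. This is only one common rule of thumb, assuming the garbling came from UTF-8 or GBK bytes being decoded as ISO-8859-1; the function name repair and its parameters are hypothetical.

```python
def repair(text, wrong="ISO-8859-1", candidates=("utf-8", "gbk")):
    """Undo a wrong decode: re-encode with the wrong codec, then try better ones."""
    try:
        raw = text.encode(wrong)
    except UnicodeEncodeError:
        # The text contains characters the wrong codec cannot produce,
        # so it was probably not garbled this way; leave it alone.
        return text
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return text


original = "百度新闻"

# Simulate the problem: UTF-8 bytes wrongly decoded as ISO-8859-1.
garbled = original.encode("utf-8").decode("ISO-8859-1")

repaired = repair(garbled)
print(repaired)
```

The order of candidates matters: a byte sequence may decode without error under more than one encoding, so the result should still be checked by eye.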
Construction of public opinion data scoring system
Scoring based on the title
Scoring based on the body text
Solve the problem of garbled characters
Process irrelevant information
Complete Baidu News Data Mining System Construction
Store public opinion data scores in the database
Baidu News Data Mining System Code Integration
Aggregate daily ratings from database
Chapter 6: Data analysis tools: NumPy and pandas libraries
NumPy library basics
NumPy library and arrays
Several ways to create arrays
pandas library basics
Creation of two-dimensional data table DataFrame and modification of indexes
Reading and writing files such as Excel workbooks
Reading and editing data
Data table splicing
Use pandas library to export public opinion data scores
Aggregate public opinion data scores
Export public opinion data scoring table
Chapter 7: Data Visualization and Correlation Analysis
Use Tushare library to retrieve stock price data
Basic usage of Tushare library
Matching public opinion data scores and stock price data
Visualization of public opinion data scores and stock price data
Data visualization basics
Data visualization in action
Correlation analysis between public opinion data scores and stock price data
Pearson correlation coefficient
Correlation analysis in practice
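For reference, the Pearson correlation coefficient named in this chapter can be computed by hand as below. This is a plain-Python sketch rather than the library-based approach a full analysis would normally use, and the score and price values are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variances up to the same factor).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up daily public opinion scores and closing prices.
scores = [2, -1, 3, 0, 4, 1]
prices = [10.2, 9.8, 10.6, 10.1, 10.9, 10.3]

r = pearson(scores, prices)
print(round(r, 3))
```

A value near 1 or -1 indicates a strong linear relationship and a value near 0 a weak one; the coefficient alone says nothing about causation or statistical significance.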
Chapter 8: Advances in crawler technology for financial data mining
Introduction to IP proxy
How IP proxy works
How to use IP proxy
Selenium library detailed explanation
Difficulties in Network Data Mining
Download and installation of simulated browser ChromeDriver
Selenium library installation
Use of Selenium library
Chapter 16: Building a customer default prediction model using machine learning
Application of machine learning in finance
Basic principles of decision tree model
Introduction to decision tree model
The basis for building the decision tree model
Case practice: Building a customer default prediction model
Model building
Model prediction and evaluation
Model visual presentation
Chapter 15: Practical Cloud Server Deployment
Purchase and configuration of cloud servers
Cloud deployment of programs
Install the software required to run the program
Enable the program to run 24 hours a day
Chapter 14: Data analysis based on stock information and its derived variables
Basic ideas of strategy
Obtain basic stock information and derivative variable data
Get basic stock information data
Get stock derivative variable data
Data visualization presentation
Data table optimization and code summary
Data visualization presentation
Generate Excel workbook using xlwings library
Basic usage of xlwings library
Case practice: Automatically generate Excel workbook reports
Strategy deepening ideas
Chapter 13: Generating Word Documents with Python
The basics of creating Word documents with Python
A first look at the python-docx library
Basic operations of the python-docx library
Advanced knowledge of creating Word documents with Python
Set Chinese font
Add text to paragraph
Set font size and color
Format paragraphs
Set table style
Set image style
Case practice: automatically generate data analysis report Word document
Chapter 12: Analysis of investment decisions based on rating reports
Obtain table data from the brokerage research website
General methods of obtaining table data
Use Selenium library to crawl Hexunyanbao.com table data
Advanced usage of pandas library
Duplicate and missing value handling
Use the groupby() function to group and summarize data
Batch processing with pandas library
Evaluate the accuracy of brokerage analyst forecasts
Read analyst rating report data for data preprocessing
Calculate stock return using Tushare library
Calculate average returns and rank analyst forecast accuracy
Strategy extension
Accounting for daily price limits
View each stock's returns by analyst
Calculate multi-period stock returns
Chapter 11: Setting up an email reminder system
Automatically send emails using Python
Send emails through Tencent QQ mailbox
Send emails through NetEase 163 mailbox
Send emails in HTML format
Send email attachments
Case practice: sending data analysis reports regularly
Use Python to extract data and send data analysis report emails
Use Python to send emails regularly every day
Chapter 10: Analysis of financial management announcements of listed companies through PDF text
PDF file batch download practice
Crawl multiple pages of content
Automatically filter what you need
Automatic batch download of financial announcement PDF files
PDF text parsing basics
Extract text content using pdfplumber library
Write regular expressions to extract data
PDF text analysis in practice: finding suitable financial announcements
Traverse all PDF files in a folder
Parse each PDF file in batches
Automatically archive qualified PDF files
Chapter 9: Financial Data Mining Case Practice 2
Sina Finance stock real-time data mining practice
Get web page source code
Data Extraction
Oriental Fortune Network data mining practice
Get web page source code
Write regular expressions to extract data
Data cleaning and printout
Function definition and calling
Judgment Document Network data mining practice
Juchao Information Network Data Mining Practice
Get web page source code
Write regular expressions to extract data
Data cleaning and printout
Function definition and calling