MindMap Gallery Information retrieval mind map
This is a mind map about information retrieval, with a detailed introduction and comprehensive description. I hope it will be helpful to interested friends!
Edited at 2023-12-04 20:28:49
information retrieval
Chapter 1 Information Retrieval
1. Information resource retrieval
(1) Information retrieval and related concepts
1. Information retrieval: the entire process of quickly and accurately finding the information users require in large collections of information. Information retrieval in the broad sense, also called information storage and retrieval, includes two aspects: information storage, that is, organizing and storing information in a certain way; and information retrieval, that is, the process of finding relevant information according to the user's needs. Information retrieval in the narrow sense refers only to the second half of this process, that is, finding the required information in an information collection.
2. Description: It is the process of analyzing, selecting and recording the content and formal characteristics of documents according to certain rules. Information description is the basis for organizing information retrieval systems and an important link in the information storage process.
3. Indexing (information indexing/document indexing): the process of analyzing the content attributes and external attributes of documents and expressing the analyzed attributes or characteristics in a specific language, thereby assigning retrieval identifiers to the documents. Because it is based on the analysis of documents as information objects, it is also called document indexing. (It includes two steps: ① subject analysis ② conversion into identifiers)
(2) Types of information retrieval
1. Divide according to the content of the search object
⑴Literature search
Literature search: refers to a search with the goal of finding relevant literature on a certain topic. Document retrieval is a correlation retrieval rather than a deterministic retrieval, and its search objects are various types of documents containing specific information. Literature retrieval includes: full-text information retrieval and secondary literature information retrieval such as catalogs, bibliographies, indexes, and abstracts.
⑵Data retrieval
Data retrieval: refers to retrieval targeting specific numerical data. Data retrieval is a deterministic retrieval, that is, the retrieval system directly provides the exact data required by the user, and the retrieval results are generally deterministic.
⑶Fact search
Fact retrieval: refers to retrieval targeting specific facts. Fact retrieval is also a deterministic retrieval. This retrieval includes not only the retrieval, operation, and derivation of numerical data, but also the retrieval, comparison, and logical judgment of facts, concepts, etc.
2. Divide by search method
⑴Manual information retrieval
Manual information retrieval: refers to the use of printed retrieval tools to conduct information retrieval manually. Its advantages are: intuitive, flexible, and easy to control the accuracy of retrieval; disadvantages are: the search process is complex, the retrieval speed is slow, and the search workload is large.
⑵Computer information retrieval
Computer retrieval: It is the process of converting information and its retrieval identifiers into a binary coded form that can be read and processed by computers, and storing them in a database system. The computer searches and outputs the digitized information according to the designed program. Computer retrieval has greatly improved the efficiency and comprehensiveness of retrieval, broadened the field of information retrieval, and enriched the research content of information retrieval. It can be further divided into offline retrieval, online retrieval, CD retrieval and network retrieval.
3. Divide according to search requirements
⑴Characteristic search
Feature retrieval: Also known as strong correlation retrieval, it emphasizes providing users with highly relevant information. This type of retrieval emphasizes the accuracy of the retrieval, as long as the retrieved literature information can meet the needs of the user, and usually there is no requirement for the number of retrieval results.
⑵Family search
Family retrieval: Also known as weak correlation retrieval, it emphasizes providing users with complete system information. This type of search focuses on the comprehensiveness of the search and requires retrieving all information on a specific topic within a period of time. In order to avoid missing relevant information as much as possible, the accuracy of the retrieval is relatively low.
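The trade-off between feature retrieval and family retrieval is commonly quantified with precision (the accuracy rate) and recall (the recall rate). The following is a minimal sketch, assuming the set of truly relevant documents is known in advance; the function name and document IDs are hypothetical and chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection: documents d1..d4 are relevant to the topic.
RELEVANT = {"d1", "d2", "d3", "d4"}

# A narrow, feature-style search: few results, all of them relevant.
narrow = precision_recall({"d1", "d2"}, RELEVANT)

# A broad, family-style search: everything relevant, plus irrelevant noise.
broad = precision_recall(
    {"d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"}, RELEVANT
)

print(narrow)  # (1.0, 0.5)
print(broad)   # (0.5, 1.0)
```

In this toy case the narrow search has perfect precision but misses half of the relevant documents, while the broad search finds all of them at the cost of lower precision.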
4. Divide by search time span
⑴Search by fixed topic
Fixed-topic retrieval (SDI, selective dissemination of information): A search query is formulated from the content of the user's topic and retrieval needs, stored in the retrieval system in advance, and run against the information in the system at regular intervals. Its characteristics are that only the latest information is retrieved and the retrieval time span is small. This mode is well suited to information tracking and makes it easy to keep abreast of the latest developments in a subject area.
⑵Backtracking search
Retrospective search (RS): Also known as backtracking search, it is a search for information about a specific topic over a past period of time whose results are delivered to the user all at once, so that a single search gives the user a full picture of how the topic developed during that period. Its characteristics are that it can find information on the topic not only from a past period but also from the recent past. Unlike a fixed-topic search, each retrospective search is generally run only once.
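The difference between the two time-span modes can be sketched in a few lines. This is an illustrative sketch only: the collection, the `SdiProfile` class and the `search` function are hypothetical, and a real system would schedule the stored query automatically.

```python
from datetime import date

# Hypothetical collection: (document id, date added, subject terms).
DOCS = [
    ("d1", date(2023, 1, 10), {"retrieval", "index"}),
    ("d2", date(2023, 6, 5), {"retrieval", "boolean"}),
    ("d3", date(2023, 11, 20), {"retrieval", "vector"}),
    ("d4", date(2023, 11, 25), {"storage"}),
]


def search(terms, start, end):
    """Retrospective search: one run over a chosen period [start, end]."""
    return [doc_id for doc_id, added, subjects in DOCS
            if start <= added <= end and terms <= subjects]


class SdiProfile:
    """A query stored in advance and re-run regularly on new items only."""

    def __init__(self, terms, last_run):
        self.terms = set(terms)
        self.last_run = last_run

    def run(self, today):
        # Only documents added since the previous run are considered.
        hits = [doc_id for doc_id, added, subjects in DOCS
                if self.last_run < added <= today and self.terms <= subjects]
        self.last_run = today
        return hits


profile = SdiProfile({"retrieval"}, last_run=date(2023, 11, 1))
new_items = profile.run(date(2023, 12, 1))   # first scheduled run
repeat = profile.run(date(2023, 12, 8))      # next run, nothing new
retrospective = search({"retrieval"}, date(2023, 1, 1), date(2023, 12, 1))

print(new_items)      # ['d3']
print(repeat)         # []
print(retrospective)  # ['d1', 'd2', 'd3']
```

The stored profile reports only what arrived since its last run, whereas the retrospective search covers the whole period in a single pass.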
5. According to the information expression form of the search object
⑴Full text search
Full-text retrieval: For the entire article or even the entire book stored in the retrieval system, you can obtain relevant chapters, paragraphs, sentences, sections and other information according to your own needs, and you can also perform various frequency statistics and content analysis.
(Text retrieval: It is a search for text documents that contain specific information. The retrieval results reflect documents with specific information in text form. This is a traditional type of information retrieval that still occupies a dominant position in information retrieval.)
⑵Multimedia retrieval
Multimedia retrieval: It is the process of organizing and storing text, sound, graphics, images and other media information according to the user's needs, so as to identify, find and obtain relevant information.
⑶Hypertext retrieval and hypermedia retrieval
Hypertext retrieval: Hypertext is a non-linear text structure formed by connecting many text information through hyperlinks. Hypertext retrieval emphasizes the semantic connection structure between central nodes, relying on the complex tools provided by the system for graphical navigation and node display, and providing browsing queries.
Hypermedia retrieval: It is a supplement to hypertext retrieval. Its storage objects go beyond text and incorporate a variety of media information such as graphics, images, and sounds. The information storage structure has developed from single-dimensional to multi-dimensional, and the scope of storage space has also continued to expand.
(3) Basic principles of information retrieval
The basic principles of information retrieval can be summarized as: matching and selection of information resource collections and information demand collections.
⑴Information resource collection
An information resource collection is a collection of information related to a certain field that has been selectively collected, organized and processed. To ensure the speed and efficiency of retrieval, the collection must be given a formal characteristic representation: the information is analyzed and indexed so that features that were originally implicit and hard to identify are made explicit and receive corresponding identifiers (such as classification numbers and subject headings). Storing these extracted features and their identifiers forms an index database, which becomes the basis and standard for organizing and searching the information resources.
⑵Information needs collection
The information needs that many users express in different forms together make up the collection of information needs. These needs must also be characterized: the content of each need is analyzed, its subject concepts or other attributes are extracted, and the concepts and attributes it contains are expressed in the same identification system (that is, the retrieval language) as the information resource collection. The result is the characteristic representation of the user's need, that is, the query formulation.
⑶ Matching and selection
To obtain the information and knowledge users require from an information resource collection quickly, information retrieval must provide a "matching" mechanism. Its main function is to compare the collection of information needs with the information resource collection against a certain similarity standard and then select the information that meets the user's needs. Once both the information resource collection and the collection of information needs have a characteristic representation, matching them reduces to matching the query against the ordered index database that has been built.
(4) Information retrieval model
⑴Boolean logic retrieval model
The Boolean retrieval model uses the methods of Boolean algebra and set theory: user queries are expressed as Boolean expressions, and documents are retrieved through logical operations on document identifiers and the query expression. The main logical operators are logical "AND", logical "OR" and logical "NOT", written AND (or *), OR (or +) and NOT (or -) respectively. Advantages: there are few logical operators, and the query structure is simple and easy to modify. Disadvantages: search terms carry no weights, the results are not ranked by importance, the recall rate is difficult to control, and users need a strong ability to express their needs as concepts.
⑵Vector space retrieval model
The vector space retrieval model is a retrieval model built on the theory and methods of linear algebra. Its basic premise is to treat both documents and queries as vectors of numerical values in a vector space, which turns the matching of documents and queries into the problem of computing the similarity between document vectors and query vectors. The relevance of a document to a query is determined by the similarity between the two vectors.
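As a rough illustration of the vector space idea, the sketch below represents each text as a term-frequency vector and ranks documents by cosine similarity to the query. The text does not prescribe the weights or the similarity measure, so raw term counts and cosine similarity are assumptions here, as are the sample documents and function names.

```python
import math
from collections import Counter


def to_vector(text):
    """Represent a text as a sparse term-frequency vector (term -> count)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical document collection.
DOCS = {
    "d1": "boolean retrieval model",
    "d2": "vector space retrieval model",
    "d3": "optical disc storage",
}

query = to_vector("vector retrieval")
scores = {d: cosine(query, to_vector(text)) for d, text in DOCS.items()}
ranked = sorted(DOCS, key=scores.get, reverse=True)

print(ranked)  # ['d2', 'd1', 'd3']
```

Unlike a Boolean match, every document receives a graded score, so the results can be ordered by relevance.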
⑶Probability model
The probabilistic model is an information retrieval model that is simple to implement and gives good results. Its basic idea is that, for a given user query, the retrieval system contains an ideal result set, denoted R, that holds only the documents relevant to the query. If the characteristics and description of the set R are known, all relevant documents can be found and all irrelevant documents excluded.
⑷Fuzzy retrieval model
Fuzzy retrieval treats each document as relevant to a query to a certain degree. For each index term there is a fuzzy set of documents related to it; for a given index term, a membership function expresses each document's degree of relevance to that term. This degree of membership takes a value in [0, 1]: 0 means not relevant, 1 means completely relevant, and the larger the value, the higher the relevance.
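A small sketch can make the membership idea concrete. The membership values below are invented, and combining terms with min for AND and max for OR is an assumption (the usual fuzzy-set operators); the text itself does not say how terms are combined.

```python
# Hypothetical membership degrees: MEMBERSHIP[term][doc] lies in [0, 1].
MEMBERSHIP = {
    "retrieval": {"d1": 0.9, "d2": 0.4, "d3": 0.0},
    "fuzzy": {"d1": 0.2, "d2": 0.8, "d3": 0.1},
}
DOCS = ["d1", "d2", "d3"]


def degree(term, doc):
    """Degree to which doc belongs to the fuzzy set of the given term."""
    return MEMBERSHIP.get(term, {}).get(doc, 0.0)


def fuzzy_and(a, b, doc):
    # Assumed operator: intersection of fuzzy sets = minimum membership.
    return min(degree(a, doc), degree(b, doc))


def fuzzy_or(a, b, doc):
    # Assumed operator: union of fuzzy sets = maximum membership.
    return max(degree(a, doc), degree(b, doc))


and_scores = {d: fuzzy_and("retrieval", "fuzzy", d) for d in DOCS}
or_scores = {d: fuzzy_or("retrieval", "fuzzy", d) for d in DOCS}
ranked = sorted(DOCS, key=and_scores.get, reverse=True)

print(and_scores)  # {'d1': 0.2, 'd2': 0.4, 'd3': 0.0}
print(or_scores)   # {'d1': 0.9, 'd2': 0.8, 'd3': 0.1}
print(ranked)      # ['d2', 'd1', 'd3']
```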
2. Information retrieval language
(1) The concept of information retrieval language
Information retrieval language: also known as indexing language or index language, it is a specialized language developed to meet the common needs of processing, storing and retrieving document information. It is a system of concept identifiers used to describe the content, external characteristics and interrelationships of information in a retrieval system, and to express users' information needs as queries.
(2) Functions of information retrieval language
⑴It is used to standardize the indexing of information content and its external characteristics to ensure the consistency of different indexing personnel's representation of information concepts.
⑵It standardizes and controls the information feature identifiers and query identifiers in the retrieval system so that index terms and search terms can be compared consistently, linking information storage with information retrieval.
⑶ Organizing and sorting information, concentrating or explaining the relevance of information with the same content and related content, ensuring centralized, systematic, organized and orderly information storage, making it easier for searchers to conduct orderly retrieval.
⑷ Provides multiple retrieval methods for retrieval systems and is an important part of various retrieval systems.
(3) Types of information retrieval languages
⑴According to the composition principle
① Classification language: also called taxonomy. It uses classification numbers to express subject concepts, organizes and arranges subject concepts into a category system based on knowledge classification, and mainly uses the structure of the category system to display the relationship between concepts. Systematization centered on disciplines and majors is its main feature. It can be subdivided into system classification language, combination classification language, and system-combination classification language.
②Subject language: also known as the subject method. It is an indexing language that expresses subject concepts directly with controlled natural-language words, arranges the subject concepts in alphabetical order of those words, and mainly uses cross-references to show the relationships between concepts. Its main feature is directness centered on the thing being described. It can be further divided into title language, unit word language, descriptor language, and keyword language.
⑵According to the combination method
①Pre-assembled language: an indexing language whose marks are already combined when the vocabulary is compiled (before indexing), so that little or no combination is needed during indexing and retrieval. The title method in the subject method is of this type.
② Post-grouping language: an indexing language whose marks are designed to be combined; they are not combined during indexing but are combined during retrieval, as in the unit word method and the descriptor method.
③Scattered group language: an indexing language whose marks are designed to be combined, and in which the several marks that express a subject concept must be combined at indexing time.
⑶According to the degree of standardization
①Controlled language: refers to the identification vocabulary of information organization and the index vocabulary of information retrieval that have been optimized and standardized before use, and the entire language is often under the management of an authoritative organization or retrieval system. This language is also called a normalized language, as opposed to a natural language. For example, the title language, unit word language, thesaurus language, system classification language, etc. in the subject method are all controlled languages.
②Natural language: refers to the index vocabulary and search vocabulary that come directly from the document itself and have not been optimized and standardized before use, as opposed to controlled language. Such as the keyword language in the subject method.
3. Information retrieval system and retrieval tools
(1) Information retrieval system
Information retrieval system: refers to a complete system for the collection, processing, sorting, storage and retrieval of information established to meet specific information needs. (It consists of four basic parts: retrieval documents, information storage and retrieval equipment, retrieval rules, and personnel.) It is an ordered collection of information resources and a multi-functional open system that can provide users with information services.
1. Composition of information retrieval system
⑴Retrieve documents
A retrieval document is a collection of information that has been sorted and marked with a retrieval identifier, and is the core component of the retrieval system. For example, the bibliography, abstract, and index of a manual retrieval system are composed of several entries, while the retrieval documents of a computer retrieval system are composed of several records.
⑵Information storage and retrieval equipment
Information storage and retrieval equipment refers to the technical equipment used to store information and retrieval identifiers and to compare, match and transmit retrieval identifiers against the information characteristics users require, such as the card catalogs of manual retrieval systems and the input and output devices, memory and communication devices of computer retrieval systems.
⑶Search rules
Search rules refer to the standard systems used by the system to standardize the processes of information collection, indexing, description, organization and management, retrieval and transmission, including search language, indexing methods, description rules, system composition and management methods, information transmission and control standards, output formats, etc.
⑷Personnel
Including system managers, information collection personnel, indexing personnel, information users, etc.
2. Types of information retrieval systems (development history)
⑴Manual search system
Manual retrieval system refers to a retrieval system based on printed retrieval tools. It is a traditional retrieval system in which searchers use manual methods instead of relying on other equipment for retrieval. Common manual retrieval systems include: ① book-based retrieval systems (such as catalogs, indexes, abstracts, encyclopedias, etc.); ② card-based retrieval systems (such as library card catalogs).
Features: ① A manual search system is carried out mainly through the searcher's own judgment. It works with printed search tools, which suits people's reading habits. During retrieval, the searcher can modify the strategy at any time according to his or her own judgment and information needs, and the accuracy rate is high. ② However, the retrieval speed of a manual retrieval system is slow, its content is updated slowly, its recall rate is low, and its overall retrieval efficiency is far below that of a computer retrieval system.
⑵Mechanical retrieval system
A mechanical information retrieval system uses various mechanical devices to perform information retrieval. It is the transitional stage from manual retrieval to modern information retrieval. Mechanical retrieval includes two basic types: ① Electromechanical information retrieval system: an information retrieval system that uses electromechanical equipment such as punching machines, hole inspection machines, and sorting machines to record secondary documents, and uses brushes as retrieval elements. ② Photoelectric information retrieval system: a system that uses microphotography to record secondary documents and uses photoelectric retrieval elements to search for documents.
Features: ① The mechanical information retrieval system used the advanced mechanical devices of its time to improve the storage and retrieval of information and promote the automation of information retrieval. ② However, it did not develop an information retrieval language and was overly dependent on equipment. Retrieval was complex and costly, and retrieval efficiency and quality were not ideal. It was quickly replaced by rapidly developing computer retrieval systems.
⑶Computer retrieval system
Computer retrieval system is a retrieval system that uses computer technology, electronic technology, and network technology to retrieve information resources stored in computers or computer networks. It is also a modern retrieval system that is developing rapidly and is extremely widely used. Computer retrieval systems generally include four parts: hardware, software, network communication and database. The computer retrieval system has gone through the following four stages in its development process:
①Offline retrieval system: The user does not talk directly to the computer. The user hands over the retrieval requirements to the information retrieval personnel. The retrieval personnel collect the retrieval questions, conduct regular batch searches on the computer, and then provide the retrieval results to the user in a centralized manner. Also called offline batch search.
②Online retrieval system: refers to a retrieval method in which users use terminal equipment to directly conduct human-computer dialogue with the computer database center through the communication network. Online retrieval is an organic combination of computer technology, information processing technology and modern communication technology. With the help of the communication network, users can use the terminal to connect to the remote central computer, enter search terms and search strategies according to prescribed instructions, and retrieve the required information from the database stored in advance by the search system. Overcoming the time and space barriers of offline retrieval, users can adjust retrieval strategies at any time and obtain retrieval results in a timely manner. Online retrieval database updates quickly and the retrieval speed is fast, but the cost is high. (Real-time, completeness, sharing, extensiveness)
③ CD retrieval system: It is a computer retrieval system established using CD database as information resource data. It is divided into stand-alone version and network version. An optical disc is a high-density storage carrier that uses a laser beam to record information on an optical medium and can read the information. According to the different ways of accessing information, optical discs can be divided into read-only optical discs, write-once optical discs, and rewritable optical discs. Optical discs have the advantages of high storage density, large capacity, easy storage, fast reading, easy operation, and low cost. CD-ROM retrieval improves retrieval efficiency and reduces retrieval costs.
④Network retrieval system: refers to computers and their peripheral devices that are relatively dispersed in physical locations, interconnected using communication media, and supported by network software to form a retrieval system for resource sharing and data interaction. This is currently the fastest growing and most popular information retrieval system. Through it, people can retrieve information materials of various types and media without being limited by time and space. Its characteristics are rich information materials, easy retrieval, and low cost.
(2) Search tools
1. Definition and sorting method of search tools
⑴Retrieval tools: tools used to report, store and find various types of information, including traditional printed secondary and tertiary retrieval tools, online databases and CD-ROM databases for computers and networks, and network retrieval tools such as search engines.
⑵ Sorting method: the method by which all entries in a search tool are arranged into a system according to certain rules to facilitate retrieval. From the perspective of compilation it is an arrangement method, and from the perspective of retrieval it is a retrieval method; the two are collectively called the sorting method.
①Character ordering method: Also known as word ordering method, it refers to the method of arranging reference book entries in a certain order of glyphs or pronunciations. This method is commonly used in character dictionaries, word dictionaries, encyclopedias, etc.
②Classification and sorting method: refers to the method of classifying and organizing according to the subject system or the nature of things. Such as "Chinese Library Classification" and "International Decimal Classification"
③Theme sorting method: refers to the method of collecting and arranging information by theme. Most search tools now provide search by theme, with a theme index.
④Chronological sorting method: refers to the method of arranging information materials in chronological order. The reference books compiled using this method mainly include chronologies, calendars, chronicles of major events, yearbooks, etc. The clues are clear and easy to search.
⑤ Geographical sequence sorting method: refers to the method of arranging information materials according to the order of geographical divisions or administrative divisions. The reference books compiled by this method are mainly geographical data and local documents, such as atlases, gazetteers, local chronicles, etc.
2. Types of search tools
⑴According to search method
① Manual search tools: refers to various printed search tools. It is a traditional search tool in which people directly participate in the search.
②Computer retrieval tools: refers to various databases in computer retrieval systems. According to the type of information checked in the database, it is divided into three types: full-text database, reference database, and fact database.
③Network retrieval tools: refers to information retrieval tools on the Internet, such as search engines, search directories, subject guides, etc.
⑵By search object
① Literature information retrieval tool: It is mainly used to search for relevant literature information on a certain research topic. The result is to obtain a batch of relevant literature clues, which mainly include four types: catalog, bibliography, index and abstract.
②Data and fact retrieval tool: also called a reference retrieval tool, it is a form of tertiary literature used mainly to look up data or facts. The result is a direct answer that can be used for reference, and the information provided is more specific. Such tools generally include dictionaries, lexicons, encyclopedias, general books, political books, yearbooks, directories, handbooks, etc.
⑶According to the collection range
① Comprehensive search tool: it covers documents from many disciplines. For example, the "Science Citation Index", "Science Abstracts" and "Engineering Index" in the United States and China's "National Newspaper and Periodical Index" are comprehensive search tools covering a wide range of disciplines and specialties.
② Professional search tools: their coverage is limited to a certain subject or professional field, such as "Biological Abstracts" and "Chemical Abstracts" in the United States.
③Single-type search tool: its coverage is limited to one specific type of document, such as China's "Patent Annual Index" and "Catalog of National Standards of the People's Republic of China".
4. Information retrieval technology
(1) Traditional information retrieval technology
1. Boolean logic search
Boolean logic retrieval is a retrieval method that uses Boolean logic relational operators in Boolean algebra to express the logical relationship between search terms.
Boolean logical operators
Boolean logical operators are used to express the logical relationship between two search terms to form a new concept. Commonly used Boolean operators are:
①Logical "AND": a combination used to express intersecting or restricting relationships, expressed by the AND or * operator. For example, the search formula "A AND B" means that a document record must contain both A and B to be a hit. This combination narrows the range of hit documents and improves the accuracy of retrieval.
② Logical "OR": a combination used to express parallel relationships, typically between words for the same concept, expressed by the OR or + operator. "A OR B" means that a document record is a hit as long as it contains either A or B. This combination expands the search scope, increases the number of hit documents, and helps improve the recall rate.
③ Logical "NOT": used to exclude unwanted concepts, or concepts that distort the results, from the search scope, expressed by the NOT or - operator. "A NOT B" means that all records containing A but not B are retrieved. This combination narrows the range of hit documents and improves the accuracy of retrieval.
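The three operators map directly onto set operations over an inverted index. The following is a minimal sketch with a made-up four-document collection; the variable names are illustrative only.

```python
from collections import defaultdict

# Hypothetical collection: document id -> text.
DOCS = {
    1: "information retrieval system",
    2: "information storage",
    3: "retrieval language",
    4: "optical disc storage",
}

# Inverted index: term -> set of ids of the documents that contain it.
index = defaultdict(set)
for doc_id, text in DOCS.items():
    for term in text.split():
        index[term].add(doc_id)

a = index["information"]   # {1, 2}
b = index["retrieval"]     # {1, 3}

and_hits = a & b   # A AND B: both terms present, narrows the result
or_hits = a | b    # A OR B: either term present, widens the result
not_hits = a - b   # A NOT B: A present and B absent

print(and_hits)  # {1}
print(or_hits)   # {1, 2, 3}
print(not_hits)  # {2}
```

Every hit is simply in or out of the result set, which is why Boolean results carry no ranking.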
2. Truncated word search
Truncation search: a retrieval method that uses special truncation symbols to allow a certain part of the search word to vary in form, so that the stem or an incomplete form of the word is used to find information. Any document containing a word that matches all the characters of the untruncated part is treated as a hit. In practice, truncation search is used to reduce the number of search terms that must be entered, expand the search scope, and ensure the recall rate.
There are many ways to truncate words. According to the truncation position, it can be divided into back (right) truncation, middle truncation and front (left) truncation; according to the number of truncated characters, it can be divided into limited truncation and infinite truncation. Limited truncation refers to specifying the number of characters to be intercepted, usually represented by "?"; unlimited truncation refers to not specifying the number of characters to be intercepted, represented by "*".
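Truncation can be illustrated by translating a truncated term into a regular expression. This is only a sketch under stated assumptions: "?" is taken to mean exactly one character and "*" any number of characters (including none); real retrieval systems differ in their symbols and semantics. The function names are hypothetical.

```python
import re

def truncation_to_regex(pattern):
    """Translate a truncated search term into a compiled regex.

    Assumed convention: '?' stands for exactly one word character,
    '*' for any number of word characters (including none).
    """
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(r"\w")
        elif ch == "*":
            parts.append(r"\w*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def truncation_search(pattern, words):
    """Return the words that match the truncated term as a whole."""
    regex = truncation_to_regex(pattern)
    return [w for w in words if regex.fullmatch(w)]

vocabulary = ["compute", "computer", "computers", "computing",
              "woman", "women", "recompute"]

right = truncation_search("comput*", vocabulary)    # back (right) truncation
middle = truncation_search("wom?n", vocabulary)     # middle truncation
left = truncation_search("*compute", vocabulary)    # front (left) truncation
```

Here `right` collects every word starting with "comput", `middle` matches both "woman" and "women", and `left` matches words ending in "compute".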
3. Positional search
Positional retrieval: also known as proximity retrieval. It uses positional operators to specify and restrict the relative positions of search terms, or to restrict where in a record the terms must appear. Positional retrieval mainly works at the following levels: word-position retrieval, same-sentence retrieval, and same-field retrieval.
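Word-position retrieval can be sketched as a check on the distance between two terms. The function below is a hypothetical illustration, not the syntax of any particular system: `max_gap` is the largest number of words allowed between the two terms, and `ordered` controls whether the first term must precede the second.

```python
def within_n_words(text, first, second, max_gap, ordered=True):
    """Return True if `first` and `second` occur with at most `max_gap`
    words between them. With ordered=True, `first` must come before `second`."""
    words = text.lower().split()
    pos_first = [i for i, w in enumerate(words) if w == first]
    pos_second = [i for i, w in enumerate(words) if w == second]
    for i in pos_first:
        for j in pos_second:
            if ordered and j <= i:
                continue
            gap = abs(j - i) - 1   # number of words between the two terms
            if 0 <= gap <= max_gap:
                return True
    return False

text = "information retrieval systems index documents"

adjacent = within_n_words(text, "information", "retrieval", 0)
wrong_order = within_n_words(text, "retrieval", "information", 0)
any_order = within_n_words(text, "retrieval", "information", 0, ordered=False)
one_apart = within_n_words(text, "information", "systems", 1)
too_far = within_n_words(text, "information", "systems", 0)
```

Same-sentence and same-field retrieval follow the same idea with a coarser unit: the check is whether both terms fall inside one sentence or one field rather than within a word distance.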
4. Restricted search
Restricted search: a method of constraining and refining search results by limiting the search scope. There are many ways to restrict retrieval; commonly used ones include field-restricted retrieval and limiter-based retrieval.
5. Weighted search
Weighted retrieval: each search term is assigned a numerical value, its "weight", indicating its importance. During retrieval, the weights of the search terms found in each document are summed, and only documents whose total weight reaches the specified value (called the threshold) are output as search results; the size of the total reflects the relevance of the retrieved document. Currently, there are two basic weighted retrieval methods: term weighting and term-frequency weighting.
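The term-weighting variant can be sketched as follows. The documents, weights, and threshold are hypothetical values chosen for illustration, and the sketch treats "reaches the threshold" as "greater than or equal to"; term-frequency weighting is not shown.

```python
def weighted_search(documents, term_weights, threshold):
    """Score each document by summing the weights of the query terms it
    contains; return (doc_id, score) pairs at or above the threshold,
    highest score first."""
    results = []
    for doc_id, text in documents.items():
        words = set(text.lower().split())
        score = sum(w for term, w in term_weights.items() if term in words)
        if score >= threshold:
            results.append((doc_id, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)

# Hypothetical documents and weights.
documents = {
    "d1": "boolean retrieval of information",
    "d2": "information literacy education",
    "d3": "weighted retrieval and ranking",
}
term_weights = {"information": 3, "retrieval": 5, "ranking": 2}

results = weighted_search(documents, term_weights, threshold=5)
```

Document d2 contains only "information" (weight 3), so it falls below the threshold and is not output; the remaining documents are ordered by total weight.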
(2) New network information retrieval technology
1. Full-text search
Full-text retrieval technology: It is a technology that uses the content of information materials, such as text, sounds, images, etc., as the main processing object, rather than its external characteristics to achieve information retrieval. By providing fast data management tools and powerful data query methods, full-text retrieval technology provides an effective way for people to quickly and conveniently obtain the original text of a document instead of document clues, and has become the core supporting technology of full-text database systems and search engines.
Full-text retrieval systems: definition and existing problems
(Full-text retrieval system: a software system built on full-text retrieval theory to provide full-text retrieval services. Its core functions include building indexes, processing queries and returning result sets, adding to indexes, and optimizing index structures.
Problems: ① The system stores the information sources themselves rather than clues to them, so it takes up a lot of space; ② system response is slow; ③ the system uses natural language for indexing and retrieval, so false associations and wrong combinations are inevitable.)
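The index-building and query-processing functions mentioned above are commonly realized with an inverted index. The source does not name a specific index structure, so the following is a minimal sketch under that assumption, with hypothetical function names and toy documents.

```python
from collections import defaultdict

def build_index(documents):
    """Map each word to the set of document ids whose full text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def query(index, words):
    """Return the ids of documents that contain every query word."""
    postings = [index.get(w.lower(), set()) for w in words]
    return set.intersection(*postings) if postings else set()

# Hypothetical toy documents.
documents = {
    1: "full text retrieval",
    2: "full text database",
    3: "image retrieval",
}
index = build_index(documents)

both = query(index, ["full", "text"])
narrow = query(index, ["retrieval", "full"])
none = query(index, ["missing"])
```

Because every word of every document enters the index, the index grows with the size of the stored texts, which is one source of the space problem noted above.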
2. Multimedia retrieval
Multimedia information retrieval technology: refers to a retrieval technology that analyzes the content semantics and extracts features of multiple types of media objects such as images, audios, and videos based on user needs, and performs similarity matching based on these features. According to the retrieval content, it can be divided into image retrieval technology, video retrieval technology and audio retrieval technology.
3. Intelligent information retrieval
Intelligent information retrieval technology: a technology that applies artificial intelligence to information retrieval. It can simulate the way the human brain thinks, analyze retrieval requests that users express in natural language, and automatically form a retrieval strategy for intelligent, fast, and efficient information retrieval. It includes natural language understanding technology, intelligent agent technology, machine learning, knowledge discovery technology, etc.
4. Data mining
Data mining technology: refers to data processing technology that extracts hidden, unknown but potentially useful information and knowledge from large, incomplete, fuzzy, noisy, and random data in large databases or data warehouses. Data mining tasks mainly include association analysis, cluster analysis, classification, prediction, time series patterns, deviation analysis, etc.
5. Natural language retrieval
Natural language retrieval: users can enter search requirements expressed in natural language. After the search tool receives the user's question, it first uses a stop-word list to remove the words in the question that carry no substantive topical meaning, and then searches with the remaining words as keywords.
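The stop-word step can be sketched as below. The stop-word list is a deliberately tiny, hypothetical one; real search tools use much larger lists and language-specific tokenization.

```python
import re

# Hypothetical, deliberately small stop-word list.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "on", "to", "for",
              "what", "how", "do", "does", "i", "can", "about", "and"}

def extract_keywords(question):
    """Lowercase the question, split it into words, and drop stop words."""
    words = re.findall(r"[a-z0-9]+", question.lower())
    return [w for w in words if w not in STOP_WORDS]

keywords = extract_keywords("What is the recall rate of a retrieval system?")
```

The remaining words ("recall", "rate", "retrieval", "system") are then used as ordinary keywords.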
6. Fuzzy search
Fuzzy retrieval: also known as concept retrieval. The search tool retrieves not only information containing the specified search terms but also information that shares the same subject concept as those terms.
7. Related search
Related search: It is a search technology provided by most search engines at present. It means that when searching using keywords, in addition to getting the corresponding search results, some keywords related to the search terms will also appear on the search page. Click these keywords to get the corresponding search results.
5. Methods, approaches and steps of information retrieval
(1) Information retrieval methods
1. Conventional method
The conventional method, also known as the instrumental method, is currently the most commonly used information resource retrieval method. It refers to the method of directly searching for document information using various retrieval tools such as abstracts, bibliographies, and indexes, or various computer retrieval systems.
⑴ Sequential search method: searches in chronological order from the oldest material to the most recent, starting from the period in which the search topic originated. It suits theoretical or academic topics.
⑵ Backward search method: searches in reverse chronological order, from the most recent material to the oldest, until the information need is met. It is mostly used for new topics, for old topics with new content, or for topics whose research already has some foundation and whose latest developments need to be understood.
⑶ Spot-check method: based on the characteristics of the retrieval topic, the search concentrates on the time period in which literature on the topic is most likely to appear or appears most densely. It takes less time and yields more documents.
2. Backtracking method
Backtracking method: also known as the citation method. Based on the citation relationships between documents, it uses the references, related bibliographies, recommended articles, and citation notes attached at the end of a document as the entry point for searching. Citation relationships reveal internal connections between documents, which lead to more related documents.
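Following reference lists from one document to the next is a graph traversal. The sketch below assumes a hypothetical mapping from each document to the documents in its reference list and walks it breadth-first; `max_depth` limits how many citation hops are followed.

```python
from collections import deque

# Hypothetical data: document -> documents listed in its reference list.
references = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": [],
    "E": ["F"],
}

def trace_references(start, references, max_depth):
    """Breadth-first walk over reference lists, up to max_depth hops.

    Returns the documents found, in order of discovery, without the start.
    """
    found = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        doc, depth = queue.popleft()
        if depth == max_depth:
            continue   # do not follow references beyond the depth limit
        for cited in references.get(doc, []):
            if cited not in seen:
                seen.add(cited)
                found.append(cited)
                queue.append((cited, depth + 1))
    return found

one_hop = trace_references("A", references, max_depth=1)
two_hops = trace_references("A", references, max_depth=2)
```

Each additional hop reaches older material, since a reference list can only cite documents that already existed.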
3. Comprehensive method
Comprehensive method: also called the alternating method or circular method. It combines the conventional method and the backtracking method: when searching for document information, both conventional retrieval tools and the references attached to the original documents serve as search entry points, and the two methods are used alternately, in stages and in cycles. It is very helpful for improving retrieval efficiency.
(2) Approaches to information retrieval
1. Content feature retrieval approaches
Content characteristics of the document: including the things discussed in the document, the questions raised, the basic concepts involved, and the subject scope to which the document content belongs.
⑴ Classification approach
① Classification approach: retrieval based on information content using a classification retrieval language. It searches from the subject category to which the document content belongs and relies on an established classification system.
② The basic process of the classification approach is: first analyze the subject concepts of the question and select classification categories that can express these concepts, then search the classification system by the class numbers or class names of those categories to obtain the required literature information.
③ Classification retrieval generally takes the disciplinary system as its entry point, which reflects the systematic structure of the disciplines. Content from adjacent disciplines is kept together, so it suits generic (class-level) retrieval well. It is strong in generality and offers a high recall rate, but generally can only satisfy retrieval of single-dimensional concepts.
⑵ Subject approach
① Subject approach: retrieval based on information content using a subject retrieval language. It requires the use of various subject-based indexes, such as subject-heading indexes, keyword indexes, and descriptor (thesaurus) indexes.
② The basic process of the subject approach is: first analyze the subject concepts of the question and select subject terms that can express these concepts, then search the subject index under those terms to obtain the required literature information.
③ The subject approach is characterized by specific retrieval, with strong specificity and high precision. It can satisfy retrieval of multi-dimensional concepts and can promptly reflect the development of emerging, interdisciplinary, and fringe disciplines.
2. External feature retrieval approach
The external characteristics of a document refer to the visible features marked on the surface of the document carrier, such as title, author, serial number, etc.
⑴ Title approach
The title approach retrieves literature information by a known document title. Document titles mainly include book titles, article titles, journal titles, publication titles, conference titles, etc.
⑵ Author approach
The author approach searches for documents by the name of a known party responsible for the document. Responsible parties include individual authors, corporate authors, editors, translators, sponsors, patent holders, etc.
⑶ Number approach
The number approach uses the unique serial numbers or identification codes that some documents carry, such as patent numbers, standard numbers, and call numbers, to find document-related information. (For example, ISBN, the International Standard Book Number.)
⑷ Citation approach
There are two routes: the first is to search for citing documents (that is, source documents) through a cited document; the second is to search for cited documents through a citing document, directly using the references attached at the end of the document.
(3) Formulation and implementation of information retrieval strategies
1. Information retrieval strategies
Information retrieval strategy: It is an information retrieval plan designed based on retrieval questions and using retrieval methods and technologies. Its purpose is to achieve a certain recall rate and precision rate.
⑴In a narrow sense: it refers to the construction of retrieval question expressions, that is, using the specific retrieval technology of the retrieval system to determine the logical relationship between search terms to form a retrieval question expression that expresses the user's information needs.
⑵ In a broad sense: it refers to a series of scientific arrangements made on the basis of analyzing the substantive content of the search topic and clarifying the search goals: selecting the search system and tools, deciding on the search approaches and methods, determining the logical relationships between search terms, and working out the best plan for the search steps.
2. Manual information retrieval strategies
⑴Analyze and study information retrieval topics
Analyzing and studying information retrieval topics is the fundamental starting point for determining information retrieval strategies, and is also the key to retrieval efficiency and success.
⑵Choose information retrieval tools
Information retrieval tools are the cards, tables, computer information systems, and specific publications that people process and compile to report, reveal, store, and find information, so that existing information resources can be used fully, accurately, and effectively. Choose high-quality information retrieval tools that are relevant to the topic and meet the time requirements.
⑶Determine information retrieval method
Commonly used information retrieval methods include the sequential method, the backward method, the spot-check method, the backtracking method, and the circular (comprehensive) method. Each has its own characteristics; in practice, they can be selected or combined according to the retrieval requirements so that the retrieval task is completed quickly and accurately and the expected goals are achieved.
⑷ Grasp the clues to obtain original information
When obtaining information clues, read carefully to determine whether the detected information meets the retrieval requirements. If the detected information meets the requirements, it is necessary to record the relevant characteristics of the information material, such as the title of the article, the author and work unit, the source of the information, etc., in order to find the original information.
⑸Get original information
Obtaining the original information is the last step of information retrieval and is essential to achieving its ultimate goal. The main tasks are: ① determine the publication type of the documents; ② sort out the sources of the documents; ③ search the holdings catalog or union catalog of libraries or information institutions, according to publication type, to determine where the documents are held; ④ obtain the original information through as many channels and methods as possible.
3. Formulation and implementation of computer information retrieval strategies
⑴ Analyze the retrieval topic
Analysis of the retrieval topic, that is, topic analysis, is the fundamental starting point for formulating a retrieval strategy and the key to retrieval efficiency and to the success or failure of the search. ① Clarify the main content of the search topic. ② Clarify the disciplines and professional scope involved in the search topic. ③ Clarify the requirements for the type, language, date range, and quantity of the required documents. ④ Clarify the user's requirements and priorities regarding novelty, completeness (recall), and accuracy (precision).
⑵ Select the search system and database
The key to choosing a computer retrieval system is choosing a database, because databases differ in type and in the disciplines they cover, and these differences directly determine which users and which retrieval requirements they suit. Consider: ① the contents of the database; ② the coverage of the database; ③ the timeliness of the database; ④ the cost of the database.
⑶ Determine the search terms
Search terms are the basic unit for expressing information needs and retrieval subject content, and are also the basic unit for matching operations in related databases in the system. Search terms in computer retrieval systems can be divided into three categories: controlled vocabulary, non-controlled vocabulary and manual codes.
⑷ Construct the search expression
The retrieval question expression is the concrete embodiment of the retrieval strategy. In the computer retrieval process, the matching between the retrieval question and the storage identifier is completed by the computer. Therefore, constructing a retrieval question expression that can express the retrieval topic requirements and can be recognized by the computer system has become the key to computer retrieval. Search question expressions consist of search words and operators.
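As an illustration of how search words and operators combine into an expression, the sketch below joins synonyms with OR inside each concept group, joins the groups with AND, and appends NOT clauses. The function name and output syntax are hypothetical; operator symbols, precedence, and bracketing rules differ between retrieval systems.

```python
def build_expression(concept_groups, excluded=()):
    """Build a Boolean search expression as a string.

    concept_groups: list of lists; terms in one inner list are synonyms.
    excluded: terms to exclude with NOT.
    """
    groups = ["(" + " OR ".join(terms) + ")"
              for terms in concept_groups if terms]
    expression = " AND ".join(groups)
    for term in excluded:
        expression += f" NOT {term}"
    return expression

expr = build_expression(
    [["retrieval", "search"], ["library", "archive"]],
    excluded=["patent"],
)
```

For the hypothetical topic above, `expr` is `(retrieval OR search) AND (library OR archive) NOT patent`.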
⑸ Run trial searches and revise the search strategy
Search strategies can be flawed or even wrong for a variety of reasons. Searchers should therefore run a few quick trial searches before formally implementing the search to test whether the strategy is effective, and make full use of the real-time, human-computer interaction features of computer searching to monitor feedback, analyze it repeatedly, eliminate uncertain factors, and adjust the search strategy promptly.
⑹ Implement the search
Implementing computer information retrieval mainly involves entering the constructed search expression into the computer retrieval system, using the retrieval commands recognized by the system to perform matching operations, and outputting the retrieval results. Information users then organize the search results and select and obtain the original information.
6. Information retrieval effect evaluation
(1) Evaluation indicators of search results
⑴ Recall rate R: an index that measures the ability of an information retrieval system to detect relevant documents in a given retrieval operation. It is the percentage of relevant documents detected out of the total number of relevant documents in the system. The recall rate reflects the comprehensiveness of the search, and its complement is the missed detection rate. The calculation method is: R = number of relevant documents detected / total number of relevant documents in the retrieval system.
⑵ Precision rate P: an index that measures the retrieval accuracy of an information system in a given retrieval operation. It is the percentage of relevant documents detected out of the total number of documents detected. The precision rate reflects the accuracy of the search, and its complement is the false detection rate. The calculation method is: P = number of relevant documents detected / total number of documents detected.
⑶ Missed detection rate O: the complement of the recall rate (O = 1 - R). The calculation method is: O = number of relevant documents not detected / total number of relevant documents in the retrieval system.
⑷ False detection rate E: the complement of the precision rate (E = 1 - P). The calculation method is: E = number of non-relevant documents detected / total number of documents detected.
Recall rate and precision rate are two important indicators to evaluate the retrieval effect. They can be used to evaluate the quality of the retrieval system and to measure the retrieval effect of specific topics.
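The four indicators can be computed from two sets: the documents a search detected and the documents in the system that are actually relevant. The sketch below uses hypothetical document ids and a hypothetical function name, and assumes both sets are non-empty.

```python
def evaluate(retrieved, relevant):
    """Compute recall R, precision P, missed detection rate O and
    false detection rate E from sets of document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not retrieved or not relevant:
        raise ValueError("both sets must be non-empty")
    hits = retrieved & relevant
    return {
        "R": len(hits) / len(relevant),                   # recall rate
        "P": len(hits) / len(retrieved),                  # precision rate
        "O": len(relevant - retrieved) / len(relevant),   # missed detection rate
        "E": len(retrieved - relevant) / len(retrieved),  # false detection rate
    }

# Hypothetical example: 10 relevant documents in the system (ids 1..10);
# the search detects 8 documents, 6 of which are relevant.
relevant = set(range(1, 11))
retrieved = {1, 2, 3, 4, 5, 6, 11, 12}

metrics = evaluate(retrieved, relevant)
```

In this example R = 6/10, P = 6/8, O = 4/10, and E = 2/8, so R + O = 1 and P + E = 1, as the definitions require.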
(2) Factors affecting the effect of information retrieval
The factors that affect the retrieval effect mainly come from two aspects: one is the retrieval system itself; the other is the retrieval level of the retrieval personnel (or information users).
⑴ For the retrieval system: ① information storage in the system is incomplete, with serious omissions in coverage; ② the vocabulary structure is imperfect, relationships between terms are ambiguous or incorrect, and the indexing vocabulary lacks control; ③ indexing is not exhaustive, or lacks specificity and depth, and so cannot accurately describe the subject of the information; ④ combination rules are not strict and easily produce ambiguity. All of these affect the recall rate and precision rate.
⑵ For search personnel (or information users): ① the requirements of the search topic are incomplete, or cannot be described comprehensively and completely; ② the retrieval system is poorly chosen, the search approaches and methods are too limited, the search terms are used improperly or lack specificity, or terms are combined incorrectly. These also affect the retrieval effect.
(3) Measures and main methods to improve search results
⑴Improve the quality of the retrieval system
① Expand the scope of information resources included in the retrieval system database and improve the quality of information resources.
②The search topic must conform to the included content of the database.
③The description content of the database must be detailed and accurate, with complete auxiliary indexes, good index language specificity, and high indexing quality.
⑵Improve users’ ability to utilize the retrieval system
① Users must have certain knowledge of search language and be able to correctly select search terms and rationally use operators to fully and accurately express the topic of information needs.
② Flexibly use various search technologies, methods, and approaches.
③Able to use a combination of comprehensive retrieval system and professional retrieval system to implement cross-database retrieval.
④ Develop an optimized search strategy, accurately express the search requirements, try multiple searches, and continuously adjust the search strategy as background knowledge increases.
⑤ Adopt a rigorous scientific attitude, carefully follow the retrieval operation steps, prevent operational errors, and maximize the role of the retrieval system.
⑥According to the needs of different retrieval topics, reasonably consider and adjust the requirements for recall rate and precision rate.
7. The significance of information retrieval
Information retrieval is an important means of information literacy education; information retrieval is a basic skill necessary for innovative talents; information retrieval is an important part of scientific research; information retrieval is an effective way to develop information resources; information retrieval is the prerequisite for scientific decision-making.
Chapter 2 Network Information Retrieval
1. Overview of network information resources
(1) Concept and characteristics of network information resources
1. The concept of network information resources
Network information resources: information in various forms such as text, images, sound, animation, and video that is stored as electronic data on non-paper carriers such as optical and magnetic media and is reproduced through network communications, computers, or terminals. They are the sum of the various information resources available through computer networks.
2. Characteristics of network information resources
⑴Large amount of information and rich content
The Internet is an open data transmission platform with a huge number of information resources of various types, such as academic, business, government, personal, entertainment, news information, etc. On the one hand, it provides users with a large space for information selection; on the other hand, a large amount of worthless and redundant information also brings a lot of trouble to users.
⑵Information is updated in a timely manner and changes are accelerated
Thanks to the development of network technology, network information sources change faster and stay more current than traditional information sources, and the amount of data keeps growing.
⑶Diversified forms of information expression
The Internet has information resources with rich forms of expression, such as sounds, images, text, videos, animations, etc. While expressed in multimedia forms, the interactivity between users and information has been greatly enhanced.
⑷ Information is arranged non-linearly and disorder is enhanced
Network information sources use hyperlinks to form a three-dimensional network information chain, linking information from different countries, different regions, different contents, and different formats through nodes, thereby enhancing the correlation between information. But at the same time, the state of disorder is also increasingly prominent.
(The amount of information is large and widely disseminated; the information content is rich and diverse; the information is timely, dynamic and unstable; the existence state is scattered and disorderly, but the degree of correlation is high; the value of information varies greatly and is difficult to manage)
(2) Types of network information resources
1. According to the corresponding non-network resources
Many network information resources have counterparts to traditional information resources, and have been organized digitally and networked to form network information resources, including: library catalogs, e-books, reference books, databases, and other types of information.
2. According to the method of information exchange
Information exchange requires certain media and carriers, which include formal publications, semi-formal publications, and informal publications. Network information can accordingly be divided into formally published information, semi-formally published information, and informally published information.
3. According to the level of network information resources
Instructions, information units, documentation, information resources
(3) Organization of network information resources
⑴File method
A file is a collection of orderly organized data and is the basic unit for computers to save processing results. The computer has a complete set of file processing technologies and methods that can realize "access by file name". The file management program can automatically complete the data transfer operation according to the file name given by the user. The function of the file transfer protocol FTP that we are familiar with is to transfer various types of text and non-text files to users through the network.
(FTP: File Transfer Protocol, a set of standard protocols for file transfer on a network. FTP allows users to communicate with another host in the form of file operations. Users do not actually log in as full users of the computer they want to access; instead, they use an FTP program to access remote resources, transfer files back and forth, and manage directories, even if the computers on the two sides run different operating systems and use different file storage methods.)
⑵Hypertext/Hypermedia
Hypertext method: Hypertext is a new type of information organization method and is the basis of network information organization. A major feature of hypertext technology is the nonlinear arrangement of information. It uses nodes as the basic unit, and the nodes are connected by link points to organize the information into a certain network structure. Another major feature is the diversity of its information expression forms. Hypertext information can be in various media forms such as text, graphics, images, sounds, animations, etc., so it can also be called "hypermedia".
⑶Database
Database organization method: all collected network information resources are stored in a fixed record format. Users find the required information clues (i.e., links to related sites) through keyword queries and combined queries, and use these clues to connect directly to the corresponding network information resources.
⑷Search engine
Search engine method: a search engine is a type of tool on the Internet dedicated to providing query services, and is currently one of the main ways of organizing secondary information on the Internet. Although the information collected in this way is rich and extensive, it is of mixed quality and retrieval precision is low.
⑸ Subject tree
Subject tree organization method: information resources are organized layer by layer according to a predetermined conceptual hierarchy. Users browse and select layer by layer until they find the required information clues, and then follow those clues directly to the corresponding network information resources. Some well-known Internet search tools, such as Yahoo! and InfoSeek, organize information resources in this way.
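Layer-by-layer browsing of a subject tree can be sketched with nested dictionaries. The categories, the link, and the function name below are hypothetical and only illustrate the structure; they do not reproduce any real directory.

```python
# Hypothetical subject tree: each category maps to its subcategories;
# a leaf category holds a list of links (the "information clues").
subject_tree = {
    "Science": {
        "Computer Science": {
            "Information Retrieval": ["https://example.org/ir-intro"],
        },
        "Physics": {},
    },
    "Arts": {},
}

def browse(tree, path):
    """Descend one layer per category in `path` and return what is found
    there: a dict of subcategories or a list of links.
    Raises KeyError if a category on the path does not exist."""
    node = tree
    for category in path:
        node = node[category]
    return node

top_level = sorted(subject_tree)
under_science = sorted(browse(subject_tree, ["Science"]))
links = browse(subject_tree,
               ["Science", "Computer Science", "Information Retrieval"])
```

Each step narrows the choice to the subcategories of the current node, which is the "select layer by layer" behaviour described above.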
2. Retrieval of network information resources
(1) Concepts and characteristics of network information resource retrieval
1. The concept of network information resource retrieval
Network information resource retrieval in the broad sense includes two aspects: organizing network information resources and searching them.
⑴ Network information resource organization: the process of collecting, analyzing, and indexing information on the Internet according to certain rules, and organizing, ordering, and storing it in databases and other forms to create a retrieval tool or retrieval system;
⑵Network information resource search: refers to the process of using the Internet as a retrieval platform, using corresponding network information retrieval tools, and using certain network information retrieval technologies and strategies to find the required information from a collection of network information resources.
The organization of network information resources is the basis and prerequisite for network information search. Network information resource retrieval in the narrow sense only refers to the search link in the broad concept.
2. Characteristics of network information resource retrieval
⑴Wide search range
Network information retrieval can retrieve information resources in all fields, all types, and various media on the Internet, far exceeding the information sources available through online retrieval, CD-ROM retrieval and other information retrieval methods.
⑵User-friendly interface
The network information resource retrieval tool directly targets users and is simple and convenient to operate. It generally uses a graphical window interface and provides a variety of navigation functions and multiple search methods. Searchers do not need to master complex search instructions. As long as they enter the search formula according to the prompts and rules on the search interface, they can obtain the search results.
⑶Interactive operation mode
The network information resource retrieval tool has the characteristics of interactive operations. It can respond to the user's requirements in a timely manner, obtain corresponding instructions from the user's commands, perform corresponding operations according to the instructions, and finally feed back the execution results to the user.
⑷Integration of traditional retrieval technology and network retrieval technology
Network information retrieval not only follows many traditional retrieval methods and technologies, but also uses new retrieval technologies such as hypertext/hypermedia, full-text retrieval, and intelligent retrieval with the help of the development of network information technology.
⑸High retrieval efficiency
Through hyperlink technology, the retrieval process of network information resources and the information browsing process are carried out in the same interface. With a simple click of the mouse, users can browse and obtain the full text of Web page documents that can be directly read and utilized.
⑹ Large information redundancy
Network information resources lack unified and standardized management and control and are highly dynamic. Current network information retrieval tools have certain deficiencies in information collection and indexing. The information retrieval process will generate a large amount of useless or even junk information. Accuracy, completeness and authority cannot be guaranteed.
(3. Characteristics and development trends of network information retrieval in the Web 2.0 environment)
(2) Network information resource retrieval methods
1. Browse
Browsing generally refers to information browsing in a hypertext file structure, that is, when users read a hypertext document, they use hyperlinks in the document to go from one web page to another related web page. The characteristics of this retrieval method are that it does not rely on any retrieval tools, the retrieval purpose is not strong, and the retrieval results are unpredictable.
2. Use search engines to retrieve
Using search engines is currently one of the most common ways to retrieve network information. The user enters the search engine's address in the browser to open its homepage, then enters a search term in the search box. The search engine quickly returns a list of results, and clicking a hyperlink in that list leads to the relevant website, where the needed information can be found. The advantages of this method are that it is simple and easy to learn, saves time and effort, is fast, and covers a wide range. However, it is difficult to control the relevance and accuracy of retrieval, and the retrieval quality fluctuates greatly.
3. Retrieve with the help of network navigation
Network navigation is a directory-type retrieval system based on a classification system, and it is also a commonly used information retrieval method. Users log in to the network navigation website and click on specific URL links to find the content they are interested in. They can also click on the category list to make more specific choices. Network navigation is provided by professionals who are responsible for resource description, which is of high quality and plays an important guiding role in discovering network information. However, network navigation based on manual description also has its limitations.
4. Search through professional resource system
As network information construction becomes more specialized, a large number of professional resource systems have appeared on the network. These systems generally focus on a specific field or a specific type of resource. With human participation, they store, manage, maintain, and update large volumes of organized information resources on professional platforms and publish them on the Internet, providing query services to users through dedicated web pages. They are customarily called databases, data resource libraries, or information resource libraries. Examples include CNKI, the VIP Resource Information System, and the Wanfang Data Resource Information System.
(3) Network information resource retrieval tools
1. Composition of network information resource retrieval tools
⑴Information collection subsystem
Network information resource retrieval tools collect information in two ways, manual and automatic: ① In manual collection, specialized information personnel track and select valuable network information resources, then classify, organize, and index the collected resources in a defined way and build an index database. ② Automatic collection uses an automatic tracking and indexing program called a Robot. The Robot retrieves a file on the network, automatically follows the file's hypertext structure, and cyclically retrieves all referenced files. It travels through the network information space, visits sites and web pages in the public area of the network, records their addresses, indexes their contents, builds index documents, and forms a database for retrieval.
⑵Database
The information collected and indexed by the information collection subsystem is organized by the database management system software to form a database, which serves as the basis for network information resource retrieval tools to provide retrieval services. Generally speaking, the network resource content provided in the database includes website name, keywords, web page URL, web page summary, related hypertext links, etc. Since the size and quality of the database directly affect the effectiveness of information retrieval, the data in the database needs to be updated and processed in a timely manner.
⑶Search agent software
When the user puts forward a search request, the search software is responsible for searching in the database on behalf of the user, calculating, evaluating, and comparing the search results, sorting the search results according to the degree of relevance to the search request, and providing them to the user.
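The automatic collection described above (a Robot that follows hyperlinks and records the address of every page it reaches) can be sketched roughly as follows. This is only a minimal illustration: the in-memory `TOY_WEB` table, its example addresses, and the `crawl` function are assumptions made for this sketch, and a real Robot would fetch pages over the network.

```python
from collections import deque

# Hypothetical in-memory "web": each address maps to (page text, outgoing links).
# A real Robot would download pages instead of reading this table.
TOY_WEB = {
    "a.example/home": ("welcome to the library portal", ["a.example/news", "b.example/db"]),
    "a.example/news": ("library news and events", ["a.example/home"]),
    "b.example/db": ("journal database search", ["b.example/help", "a.example/home"]),
    "b.example/help": ("help for database users", []),
    "c.example/isolated": ("no page links here", []),
}

def crawl(seed, web):
    """Breadth-first traversal that follows hyperlinks from a seed page."""
    visited = []            # addresses in the order they were recorded
    seen = {seed}           # addresses already queued, to avoid revisiting
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        if url not in web:  # dead link: nothing to record
            continue
        _text, links = web[url]
        visited.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited

pages = crawl("a.example/home", TOY_WEB)
print(pages)
```

Pages that no visited page links to (here `c.example/isolated`) are never reached, which is one reason manual collection is still used alongside automatic collection.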
2. Working principle of network information resource retrieval tools
⑴ Extensive collection of various network information resources through manual collection or automatic tracking and indexing procedures of the data collection subsystem;
⑵After a series of steps such as judgment, selection, indexing, processing, classification, and organization, the database management system organizes the information into a searchable database and creates a directory index. Resource navigation, directory indexes, and a retrieval interface are then provided to users in the form of Web pages.
⑶Users construct search questions according to their own search requirements and the grammatical requirements of the search tool, and enter search questions through the search interface.
⑷After the retrieval software identifies and determines the user's search questions, it searches the database on behalf of the user based on the user's search questions, evaluates and compares the search results and sorts them by relevance before submitting them to the user.
3. Types of network information resource retrieval tools
1) Classify by search content
① Comprehensive type: Comprehensive network resource retrieval tools are also called universal network resource retrieval tools. They do not limit the discipline, subject scope, or data type of the resources they collect, and can be used to retrieve network information resources on almost any subject.
②Specialist type: Specialist network resource retrieval tools collect information resources on a particular subject and scope, and provide finer classification and more in-depth indexing and description suited to the characteristics of those professional resources and their retrieval needs.
③Special type: Special type network information resource retrieval tools are tools designed to provide retrieval services for a special type of information resources.
2) Classification according to the type of information resources retrieved
⑴Non-Web resource retrieval tools: A type of retrieval tool that mainly targets non-Web resources, such as FTP information resources, Gopher information resources, Telnet information resources, Usenet information resources and other special types of information resources.
⑵ Web resource retrieval tool: It is a specialized Web server or Web website established on the Internet using hypertext technology to provide online information resource navigation and retrieval services. It is a clue tool that not only takes Web resources as the main search object, but also provides services in the form of Web.
①Keyword search tool: That is, a search engine. It uses automatic indexing software to discover, collect, and index web pages and build a database; it provides a Web-based search interface where users enter search terms such as keywords or phrases; and it searches the database on the user's behalf for records that match the query, returning the results in order of relevance.
② Directory search tool: It is a hierarchical structure directory that can be searched according to a certain classification system. The classification method is mainly based on subject classification. The retrieval method using such tools is called "categorical search". This is a "top-down, gradually refined" search method that traverses layer by layer.
③Hybrid retrieval tools: At present, the keyword retrieval of search engines and the classification retrieval of catalog-type retrieval tools are gradually integrated. You can not only enter search terms directly, but also browse the catalog to learn about resources in a specific field to enhance retrieval capabilities.
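The "top-down, gradually refined" traversal used by directory search tools can be illustrated with a small sketch. The `DIRECTORY` tree, its category names, and the site names are made up for illustration, and `browse` is a hypothetical helper, not part of any real directory service.

```python
# Hypothetical subject directory: nested dicts are categories, lists are site entries.
DIRECTORY = {
    "Science": {
        "Physics": ["physics-portal.example"],
        "Biology": ["bio-guide.example", "genome-db.example"],
    },
    "Arts": {
        "Music": ["music-archive.example"],
    },
}

def browse(directory, path):
    """Descend one category level per path element, top-down."""
    node = directory
    for category in path:
        if not isinstance(node, dict) or category not in node:
            raise KeyError(f"no category {category!r} at this level")
        node = node[category]
    # At an inner node, list the subcategories; at a leaf, list the sites.
    return sorted(node) if isinstance(node, dict) else list(node)

print(browse(DIRECTORY, []))                      # top-level categories
print(browse(DIRECTORY, ["Science"]))             # refine one level
print(browse(DIRECTORY, ["Science", "Biology"]))  # leaf: site entries
```

Each call narrows the result by one level, which mirrors how a user clicks through a category list layer by layer.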
4. Evaluation of network information resource retrieval tools
⑴Inclusion scope
Each network information resource retrieval tool has specific inclusion objects and inclusion policies. Therefore, when choosing a retrieval tool, you must first understand the range of data resources it covers, its resource types, data volume, indexing depth, data update frequency, processing languages, and so on.
⑵Search function
The retrieval function directly affects the recall rate, precision rate and retrieval flexibility, convenience and retrieval speed of information retrieval. Selecting and evaluating the functions of retrieval tools can be carried out from the following aspects: ① Determine whether the retrieval method is single or diverse; ② Determine whether the retrieval technology used is advanced and diverse; ③ Determine whether you have the right to select and limit the retrieved information resources.
⑶Retrieval efficiency
At present, the indicators for measuring the efficiency of search tools are mainly recall rate and precision rate. In addition, there are also factors such as response time and ease of connection.
⑷User interface
The design of the user interface directly affects the efficiency and effect of human-computer interaction. Generally speaking, a user interface is judged mainly on the following aspects: ① whether it is intuitively easy to use; ② whether online help is provided; ③ whether the function keys and toolbar settings of the search interface are clear, definite and complete; ④ whether the search interface is simple and switching is flexible; ⑤ whether the search steps are simple and compact.
⑸Search result processing and display
The way search results are displayed directly affects the user's browsing experience. At present, most search tools sort results by the authority of the data resource and by the relevance of the website to the search content: the more relevant a result is, the higher it ranks.
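The recall rate and precision rate mentioned under retrieval efficiency have standard definitions, which the following minimal sketch computes for a single query. The document IDs and relevance judgments are invented for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant items retrieved / all items retrieved
    recall    = relevant items retrieved / all relevant items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical judgment data: document IDs are made up for illustration.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d2", "d4", "d5", "d6", "d7"]
p, r = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here two of the four retrieved documents are relevant (precision 0.50), but only two of the five relevant documents were found (recall 0.40), which shows why the two indicators are reported together.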
(5. Evaluation indicators of network information retrieval effect➕Existing problems➕Improvement measures)
(4) Important areas of network information retrieval (Features ➕ Development Trends)
1. Multimedia retrieval
Multimedia information retrieval retrieves information from multimedia such as graphics, images, text, sound, and animation according to the user's requirements. It is divided into text-based retrieval and content-based retrieval.
2. Cross-language search
At present, the main research hotspots of cross-language information retrieval include auxiliary techniques and methods, language conversion methods, and information organization and retrieval models. At the same time, some issues still need further research: semantic-based information retrieval, application-oriented cross-language retrieval platforms, merging of relevant search results, and visualization will become new research directions.
3. Intelligent information retrieval
The intelligent information system is developed from word extraction and full-text retrieval. It is an artificial intelligence retrieval system based on the relevance of search terms and has high judgment, understanding and processing capabilities for search terms. In recent years, intelligent information retrieval based on semantics, agents, and ontology has become a research hotspot.
4.Visualization of information retrieval
Retrieval visualization is the application of information visualization technology to information retrieval. It converts document information, user queries, the various information retrieval models, and the internal semantic relationships that are invisible during retrieval into graphics, and displays them in a two-dimensional or three-dimensional visualization space.
5. Intelligent question and answer system
The rapid development of artificial intelligence and the gradual application of machine learning, neural networks, and other technologies have greatly promoted the development of intelligent question and answer systems, producing representative products such as automated question and answer platforms and voice question and answer robots. The one-question-one-answer service model of these systems makes it easier to pinpoint user needs accurately, enables real-time interaction, and greatly improves the degree of personalized service.
Chapter 3 Search Engine
1. Concept
⑴Search engine
Search engine: refers to a system that uses specific computer programs to collect and process information on the Internet according to certain strategies, stores the processed information in a database, and provides retrieval services to users through an interactive interface. (In other words, it is a retrieval tool that accepts queries from users, searches the database, and returns information objects matching those queries. Broadly speaking, a search engine refers not only to the information retrieval program itself, but also to the retrieval interface and related entry programs, and to the index databases and services that support it.)
⑵Meta search engine
Metasearch engine: also known as multiple search engines or integrated search engines. It refers to a search engine that helps users search in multiple search engines through a unified user interface and optimizes the search results. Metasearch engine is the integration, call, control and optimized utilization of multiple independent search engines. In the process of metasearch, the called search engine is called a source search engine or an independent search engine, which is a search engine in the usual sense. Metasearch engines generally consist of three parts: user interface, search agent and result optimization.
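The division of a metasearch engine into a search agent and a result optimization step can be sketched as below. Everything here is an assumption for illustration: `engine_a` and `engine_b` stand in for source search engines and return fixed lists, and the merging rule (summing 1/rank across engines) is just one simple choice, not the method of any particular metasearch product.

```python
# Hypothetical source engines: each returns a ranked list of URLs for a query.
def engine_a(query):
    return ["x.example", "y.example", "z.example"]

def engine_b(query):
    return ["y.example", "w.example", "x.example"]

def metasearch(query, engines):
    """Send one query to every source engine and merge the ranked lists."""
    scores = {}
    for engine in engines:                      # search agent: dispatch the query
        for rank, url in enumerate(engine(query), start=1):
            # Result optimization: duplicates collapse into one entry whose
            # score is the sum of 1/rank over the engines that returned it.
            scores[url] = scores.get(url, 0.0) + 1.0 / rank
    # Highest combined score first; ties broken alphabetically for stability.
    return sorted(scores, key=lambda u: (-scores[u], u))

merged = metasearch("information retrieval", [engine_a, engine_b])
print(merged)
```

A URL returned near the top by several source engines ends up ahead of one returned by a single engine, and each URL appears only once in the merged list.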
2. Classification
⑴Divided according to search scope
① Comprehensive search engine
It refers to a search engine that does not have clear restrictions on the scope and type of resources included. The inclusion scope of this type of search engine includes the entire Internet, and resource types include all common resource types such as web pages, videos, audios, image files, etc. The more well-known comprehensive search engines include Google, Baidu, Yahoo, Bing, Sogou, etc.
②Vertical/professional search engine
It refers to a professional search engine that limits the scope of resource collection to a specific field or type. It is a subdivision and extension of search engines that integrates information in specific fields on the Internet. It is a new search engine service model proposed to address problems of comprehensive search engines such as low accuracy and insufficient search depth.
⑵Divided according to search function
①Independent search engine
Also known as a single search engine or a regular search engine. It refers to a search engine that independently has a searcher, indexer, index database, retriever, and user interface, and does not rely on other search engines for its work. The more common independent search engines include Google, Baidu, etc.
②Meta search engine
Also known as multiple search engines or integrated search engines. It refers to a search engine that helps users search in multiple search engines through a unified user interface and optimizes the search results. Metasearch engine is the integration, call, control and optimized utilization of multiple independent search engines. In the process of metasearch, the called search engine is called a source search engine or an independent search engine, which is a search engine in the usual sense. Metasearch engines generally consist of three parts: user interface, search agent and result optimization.
⑶Divided according to working methods
①Directory search engine
Also known as a classified index or web resource guide, it is a website-level browser-based search engine. It uses professional information personnel to collect network resource site information manually or semi-automatically, and uses manual methods to describe the collected websites, and compiles a hierarchical structure directory for browsing and retrieval according to a certain subject classification system.
②Index search engine
Also known as a robot search engine or keyword search engine, it is a web page-level search engine. It mainly uses automatic tracking and indexing software called a web robot, web spider, or web crawler to analyze the hyperlinks of web pages automatically, and relies on hyperlink and HTML code analysis to obtain web page content. It builds and maintains its index database using pre-designed rules and methods such as automatic search, automatic indexing, and automatic summarization. It provides a Web-based search interface where users enter keywords, phrases, or logical combinations, and its background search agent software finds records matching the query in the index database on the user's behalf and returns the results to the user.
(⑷According to the information media of the index database: image search engine, video search engine, web search engine)
3. Function
⑴Search network information in a timely and comprehensive manner
⑵Search for effective and valuable network information
⑶Search for network information in a targeted manner
4.System structure
⑴Searcher
It is a special program that collects information from the Internet, also known as network robots, spiders, crawlers, etc. Its function is to roam the Internet day and night, constantly collecting and sending back relevant information in a timely manner.
⑵Indexer
It extracts index items from the plain text information files returned by the searcher, generates inverted working files, and then gradually builds an index database.
⑶Index database
It is the core of the search engine. It is both the product of the indexer and the basis for the retriever's work. It consists of four types of files: inverted address tables, inverted indexes, other index files, and plain text files.
⑷Retriever
It is a special retrieval package developed for the index database of a specific search engine. Its responsibility is to accept and interpret user needs from the user interface, convert them into retrieval instructions, perform retrieval on the index database, sort the result set by content relevance, and return the sorted results to the user.
⑸User interface
Its function is to accept the input of the user's search requirements and perform grammatical checks to make them standardized. It can be divided into two parts, the user demand submission interface and the search result feedback interface. The former is used to accept user needs, and the latter feeds back to the user the results retrieved by the search engine for the needs submitted by the user.
5. Working principle
⑴The searcher discovers and collects information by roaming and traversing the Internet;
⑵The indexer is responsible for extracting index items from the information collected by the searcher, and establishing an index table to form an index database;
⑶The retriever searches the index database according to the user's query conditions, processes the search results and returns them to the user through the user interface;
⑷User interface provides an interactive interface for users.
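The indexer and retriever steps above can be sketched with a toy inverted index. This is a minimal illustration under stated assumptions: the `PAGES` texts are invented, index items are simply whitespace-separated words, and the ranking rule (number of matching query terms) is a placeholder, since the source does not specify how relevance is computed.

```python
from collections import defaultdict

# Hypothetical pages, as if already collected by the searcher (crawler).
PAGES = {
    "p1": "search engine index database",
    "p2": "search engine user interface",
    "p3": "library catalog search",
}

def build_index(pages):
    """Indexer: map each term to the set of pages that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in text.lower().split():
            index[term].add(page_id)
    return index

def retrieve(index, query):
    """Retriever: rank pages by how many query terms they contain."""
    counts = defaultdict(int)
    for term in query.lower().split():
        for page_id in index.get(term, ()):
            counts[page_id] += 1
    # Most matching terms first; ties broken by page ID for stability.
    return sorted(counts, key=lambda p: (-counts[p], p))

index = build_index(PAGES)
results = retrieve(index, "search engine index")
print(results)
```

The index is "inverted" because it maps from terms to pages rather than from pages to terms, so the retriever only touches the entries for the words in the query.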
6. How to use
⑴Boolean search
Boolean logic retrieval refers to information retrieval that supports Boolean logic operations. Search engines generally provide some form of Boolean logic retrieval, using either "simplified" Boolean logic (plus and minus signs) or full Boolean logic (AND, OR, NOT).
⑵Truncation search
Among many search engines, most support truncation search using root words. Truncated words are generally represented by the root plus a truncation symbol (usually an *), which can greatly simplify the retrieval of words with different suffixes.
⑶ Phrase and proximity search
Search engines allow users to search with two words together. The two words can be adjacent, forming a phrase, which is entered in quotation marks. The two words can also be required to appear near each other, within a certain distance; some search engines provide operators such as NEAR to express the distance between them.
⑷Field search
To improve precision, it is often necessary to limit the search to one or several specific parts (fields) of the web page record; this is field search. The idea comes from traditional online retrieval. Fields commonly used in network information retrieval include Title, Date, URL, Links, and Images.
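The Boolean, truncation, and phrase techniques listed above can be sketched over a tiny collection. The `DOCS` texts and the helper names `term`, `truncated`, and `phrase` are assumptions made for this illustration; real search engines implement these operators on top of an index rather than by scanning every document.

```python
# Hypothetical mini-collection; IDs and texts are made up for illustration.
DOCS = {
    1: "digital library retrieval systems",
    2: "library management and cataloging",
    3: "retrieving digital images",
    4: "public libraries in the digital age",
}

def term(word):
    """IDs of documents containing the exact word."""
    return {i for i, text in DOCS.items() if word in text.split()}

def truncated(root):
    """Truncation search: the root 'librar' matches library, libraries, ..."""
    return {i for i, text in DOCS.items()
            if any(w.startswith(root) for w in text.split())}

def phrase(words):
    """Phrase search: the words must appear adjacent and in order."""
    return {i for i, text in DOCS.items() if f" {words} " in f" {text} "}

# Boolean operators map directly onto set operations.
print(term("digital") & term("library"))   # AND
print(term("digital") | term("library"))   # OR
print(term("digital") - term("library"))   # NOT
print(truncated("librar"))                 # librar*
print(phrase("digital library"))           # "digital library"
```

AND narrows the result, OR widens it, and NOT excludes documents, while truncation picks up document 4 ("libraries"), which the exact word "library" misses.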
7. Development trends
⑴Personalization
With the advent of the Web 2.0 era, more emphasis is placed on user experience and individual preferences of users. Many search engines have begun to provide user registration and preference setting functions, and launch personalized search homepages to meet the specific needs of users.
⑵Intelligence
Search engines can improve their intelligence level through information extraction, semantic indexing and other technologies, and more clearly define the semantic characteristics of information.
⑶Integration
Users hope that search engine feedback results include multiple types of relevant information, so that they do not need to repeatedly retrieve various types of information. Therefore, many search engines have begun to feed back search results for images, web pages, and videos to users in an integrated manner.
⑷Verticalization
The amount of information on the Internet keeps growing. Because of this volume, when users search a comprehensive search engine for information in a specific subject area, information from unrelated subject areas is also returned. Therefore, many vertical search engines oriented to specific topics have appeared. Because they focus only on content in specific subject areas, their index coverage of those areas and the relevance of their search results are greatly improved.
⑸Mobile
With the development of the mobile Internet, mobile terminals have gradually become a new tool for obtaining information. Therefore, many search engines have begun to launch search services based on mobile platforms. Search engines can be accessed through mobile terminals such as mobile phones.
⑹Openness
In order to broaden their application scope, various search engines have begun to open search interfaces and databases, allowing third-party developers to quickly build various search services.
8. Application examples
Google
Introduction
Google, known in Chinese as "谷歌" (Guge), is both a company name and a search engine. It was founded in 1998 by Larry Page and Sergey Brin. It is currently the world's largest search engine, providing convenient online information query methods and services and promoting global information exchange.
Features
The network resource organization has a wide range; supports many languages; adopts new technologies; and has strong system functions.
Function
Web retrieval; image retrieval; advanced search; Google web directory.
Baidu
Introduction
Baidu is the world's largest Chinese search engine and a product of the company Baidu. The company was founded in Zhongguancun, Beijing, in January 2000 by Robin Li and Xu Yong. The name "Baidu" comes from the poem "Qing Yu An: Yuan Xi" (Green Jade Table: Lantern Festival) by the Song Dynasty poet Xin Qiji, which contains the line "searching for him in the crowd hundreds and thousands of times"; it symbolizes Baidu's persistent pursuit of Chinese information retrieval technology.
Features
Powerful functions, wide service range, good inclusiveness; intelligence and scalability; technological novelty and forward-lookingness; adaptability and flexibility.
Function
In addition to ordinary web search, Baidu has also launched related vertical search products. It provides two methods: simple search and advanced search. Simple search has few steps and is easy to operate. Baidu's advanced search takes three forms: first, advanced search through advanced syntax in the simple search box; second, advanced search through Baidu's advanced search interface; third, advanced search through Baidu's vertical search. Baidu's search results page mainly includes the title, abstract, Baidu snapshot, related searches, retrieval time, and total number of results.
Bing
Introduction
Bing is a search engine launched by Microsoft. It was released on May 28, 2009, and the Simplified Chinese version opened on June 1, 2009. Its Chinese name, "Biying" (必应), carries the meaning "responding to every request".
Function
In addition to web search, the Simplified Chinese version of Bing also provides vertical search services such as image search, video search, information search, and map search. The English version also provides vertical search services such as travel, history, and shopping. The interface has a soft visual style, and the home page uses a constantly updated picture as its background. For certain search terms, Bing groups the search results into categories.
Chapter 4 Searching Chinese Internet Databases
1. CNKI and Chinese Journal Full-text Database
⑴CNKI
Overview
China Knowledge Infrastructure Project (CNKI) is a key national informatization project with the goal of realizing the sharing and value-added utilization of knowledge and information resources throughout society.
China Journal Network, also known as China National Knowledge Infrastructure or Knowledge Innovation Network, is an important part of the CNKI project. It is an information resource system that integrates journals, papers, patents and newspaper information. Users can use it through China Journal Network Database products.
The main series of Chinese source database products that CNKI has launched include: China Journal Full-text Database, China Doctoral Thesis Full-text Database, China Excellent Master's Thesis Full-text Database, China's Important Newspaper Full-text Database, China's Important Conference Paper Full-text Database, etc. There are three user service modes: online package library, mirror site, and full-text CD, and IP identity authentication is used to confirm legitimate users.
⑵Chinese Journal Full-text Database
Overview
China Journal Full-text Database (CJFD) is a large-scale, integrated, multi-functional, continuously updated Internet-based journal full-text database developed on the basis of "Chinese Academic Journals (CD-ROM Edition)". It is CNKI's most distinctive literature database.
Features
① It integrates bibliography, abstracts, and full-text information to achieve a high degree of integration of massive data and a one-stop literature information retrieval.
②The knowledge content is organized with reference to the popular knowledge classification system at home and abroad, and the database has a knowledge classification navigation function.
③ There are multiple search entrances. Users can not only conduct primary searches through a single search entrance, but also use Boolean logic operators and other flexible search questions to conduct advanced searches.
④ It has citation search and link functions. In addition to building relevant knowledge networks, it can also be used for measurement and evaluation of individuals, institutions, papers, journals, etc.
⑤ The full-text information is completely digitized. By downloading the most advanced reader software for free, the original layout structure and style of the journal article can be displayed and printed without distortion.
⑥ Diversified product forms and timely data updates can meet the personalized information needs of users of different types, industries and sizes.
⑦Every paper in the database has obtained a clear electronic publishing authorization.
⑧Database exchange service centers throughout the country and overseas, coupled with year-round user training and efficient technical support.
Search
① Primary search
When entering the China Journal Full-text Database from the CNKI China Journal Network, the system's default search method is the primary search. The left side of the page is the navigation area, which helps determine the range of subject collections (albums) to be searched. The steps of the primary search are as follows: 1) select the category range; 2) select the search field (topic, title, keywords, abstract, author, first author, unit, reference, CLC classification number, etc.); 3) enter the search terms; 4) set each search restriction (time span, update, range, match, sort, results per page).
②Advanced search
Advanced search can be used to run fast and effective combined queries, with fewer results and a high hit rate. The advanced search page lists three search term input boxes and three search field drop-down lists by default. You can also add or remove search rows using the + and - buttons on the page. Search terms can be combined using five logical relationships: AND, OR, NOT, same sentence, and same paragraph, which allows complex concepts to be retrieved and improves retrieval efficiency. The system's default logical relationship is AND. Advanced search can also set the time span, update, range, matching, and sorting method of search results.
③Professional search
Professional search is more powerful and more precise than advanced search, but it is better suited to professional searchers who are proficient in search techniques. It allows searchers to write search expressions that meet their own information needs using the system's search syntax.
④Search results
The search results page is divided into a bibliographic page and a detailed information page.
⑤Special services
In addition to regular services such as information retrieval, information consultation, and original text delivery, the China Journal Full-text Database also provides some special services, such as citation services, query services, journal evaluation services, scientific research capability evaluation, project background analysis, and topic setting services.
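The way an advanced search combines several rows of field, search term, and logical relationship, together with a time span restriction, can be sketched generically as follows. This is not CNKI's actual query syntax or interface: the `RECORDS` data, the field names, and the `advanced_search` function are hypothetical, and only AND, OR, and NOT are modeled (the same-sentence and same-paragraph relationships are left out).

```python
# Hypothetical records; field names loosely follow the search fields listed above.
RECORDS = [
    {"id": "r1", "title": "Search engine ranking", "author": "Wang", "year": 2019},
    {"id": "r2", "title": "Digital library services", "author": "Li", "year": 2021},
    {"id": "r3", "title": "Search engine evaluation", "author": "Li", "year": 2022},
]

def matches(record, field, value):
    """Case-insensitive substring match on one field."""
    return value.lower() in str(record[field]).lower()

def advanced_search(records, rows, year_from=None, year_to=None):
    """Combine rows of (operator, field, term) left to right.

    The first row's operator is ignored; later rows use AND, OR, or NOT.
    """
    hits = []
    for rec in records:
        # Time span restriction is applied before the search rows.
        if year_from is not None and rec["year"] < year_from:
            continue
        if year_to is not None and rec["year"] > year_to:
            continue
        result = None
        for op, field, value in rows:
            m = matches(rec, field, value)
            if result is None:
                result = m
            elif op == "AND":
                result = result and m
            elif op == "OR":
                result = result or m
            elif op == "NOT":
                result = result and not m
            else:
                raise ValueError(f"unknown operator {op!r}")
        if result:
            hits.append(rec["id"])
    return hits

rows = [("", "title", "search engine"), ("AND", "author", "Li")]
print(advanced_search(RECORDS, rows))
print(advanced_search(RECORDS, [("", "title", "search"), ("NOT", "author", "Li")]))
print(advanced_search(RECORDS, [("", "author", "Li")], year_from=2022))
```

Adding a row with AND or NOT shrinks the result set, which matches the observation above that combined queries return fewer results with a higher hit rate.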
2. VIP Information System and Chinese Science and Technology Journal Database
⑴VIP information system
Overview
Chongqing VIP Information Co., Ltd. is a large-scale professional data company affiliated to the Southwest Information Center of the Ministry of Science and Technology. Since 1989, it has been committed to the in-depth development, promotion and application of newspapers and other information resources.
The VIP Information Network, also known as Tianyuan Data Network, was developed by the company in 2000. After years of commercial operation, it has become a world-famous Chinese information service website and the largest comprehensive literature service network in China. It has also become an important strategic partner of Google Search and the largest Chinese content partner website of Google Scholar.
The three important databases on VIP Information Network are: Chinese Science and Technology Journal Database, Chinese Science and Technology Journal Citation Database and Foreign Science and Technology Journal Abstract Database.
⑵Chinese science and technology journal database
Overview
The Chinese Science and Technology Journal Database is the largest comprehensive full-text database in China developed by Chongqing VIP Information Co., Ltd. in 1989. A CD-ROM version of the database was published in 1992, and an online version of the database service began in 1999.
Features
① It is the Chinese journal database that contains the largest number of domestic journals, the longest period of time, and the largest amount of professional literature.
② It applies common rules such as the Chinese Library Classification for classification indexing and subject indexing, and implements the ISO 9001 international quality management system, making it a standardized database with stronger quality assurance.
③ Adopt domestic first-class full-text retrieval core and international standard PDF full-text data format to implement faster, more stable and clearer database retrieval services.
④The unique synonym library and author library with the same name can more accurately locate the user's search request.
⑤The personalized "My Database" service function can save the user's search history, collected full-text documents, and various customized search plans.
Search
The Chinese Science and Technology Journal Database provides five search methods: quick search, advanced search, traditional search, classification search, and journal navigation. The search scope can be limited by subject category and publication year, and the logical operators AND, OR, and NOT can be used to construct search formulas that express logical combinations of terms.
①Quick search
Quick search is the simple search mode and the system default. On the database search page, select a search field, enter the search terms in the text box that follows, and click the search button.
②Advanced search
Advanced search provides two search methods: guided search and direct input search for readers to choose.
③Traditional search
On the traditional search page, first set the options at the top of the page: synonyms, same-name authors, journal scope, year, search entry, and search formula. Then, in the navigation area on the left side of the page, which is organized by album navigation or classification navigation, select the subject category to search, and run the search.
④Classification search
Classification search is equivalent to traditional search restricted by classification navigation. Professional indexers classify and index each Chinese journal record according to the Chinese Library Classification. Users can select the subject categories to search according to the needs of the search topic, so classification search can satisfy different requirements for how finely the classification is refined.
⑤Journal Navigation
Journal navigation provides three ways to find journals: alphabetical search, navigation by journal subject classification, and navigation by inclusion in foreign databases. Users can browse the included journals, or look up a specific journal by the alphabetical order of its name, its subject category, its title, or its ISSN.
⑥Search result display, output and full text browsing
Search results can be displayed in two formats: simple record and detailed record. Output options include PDF, e-mail, and print, and results can also be browsed by journal title.
⑦Document correlation function
On the detailed record format page of the search results, there is a "related literature" clustering function, which provides literature associations in three directions: topic-related, references, and citations of this article.
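The logical operators AND, OR, and NOT mentioned for this database can be illustrated with a small sketch. This is not VIP's implementation; it assumes a toy inverted index built over hypothetical record titles, and the names `RECORDS`, `build_index`, `search_and`, `search_or`, and `search_not` are invented for illustration.

```python
from collections import defaultdict

# Hypothetical sample records; the titles are illustrative only.
RECORDS = {
    1: "information retrieval in digital libraries",
    2: "patent retrieval methods",
    3: "digital libraries and citation analysis",
}


def build_index(records):
    """Map each lowercase word to the set of record ids that contain it."""
    index = defaultdict(set)
    for rec_id, title in records.items():
        for word in title.lower().split():
            index[word].add(rec_id)
    return index


def search_and(index, a, b):
    """Records containing both terms (set intersection)."""
    return index.get(a, set()) & index.get(b, set())


def search_or(index, a, b):
    """Records containing either term (set union)."""
    return index.get(a, set()) | index.get(b, set())


def search_not(index, a, b):
    """Records containing the first term but not the second (set difference)."""
    return index.get(a, set()) - index.get(b, set())


INDEX = build_index(RECORDS)
print(sorted(search_and(INDEX, "retrieval", "digital")))
print(sorted(search_or(INDEX, "retrieval", "digital")))
print(sorted(search_not(INDEX, "retrieval", "digital")))
```

AND narrows the result set, OR widens it, and NOT excludes records, which is why combining them lets a searcher trade recall against precision.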
3. Wanfang Data Knowledge Service Platform and Wanfang Academic Journal Database
⑴Wanfang Data Knowledge Service Platform
Overview
Wanfang Data Co., Ltd. is the first joint-stock high-tech enterprise in China with information services as its core business.
The Wanfang Data Resource System is a large-scale scientific, technological, and business information platform developed and established by Beijing Wanfang Data Co., Ltd. The system began to provide services to the public in August 1997. In June 2009, it was fully upgraded to the Wanfang Data Knowledge Service Platform. As the largest comprehensive knowledge information service platform in the country, it offers complete data resources, personalized search methods, diversified knowledge-network expansion, and well-organized document management, and provides users with comprehensive online information services.
The Wanfang Data Knowledge Service Platform provides information retrieval services through more than 100 databases in several resource sections, such as academic papers, academic journals, theses, conference papers, patented technologies, Chinese and foreign standards, scientific and technological achievements, policies and regulations, and institutions.
⑵Wanfang Academic Journal Database
Overview
The Wanfang Academic Journal Database is an important part of the Wanfang Data knowledge service platform. It collects the full-text content of a variety of science and technology, humanities and social science journals, most of which are core journals that have entered the statistical source of scientific papers of the Ministry of Science and Technology.
Search
①Browse
You can browse journals by subject, region or alphabetical order of the journal title. The journal page provides information about the journal and can be searched within the journal.
②Search
On the journal search page, click "Search Papers" or "Search Journal Titles" to search papers and journals respectively. For paper retrieval, the system provides simple search, advanced search and professional search.
First, simple search is the system's default search method. Enter keywords in the input box, click Search Papers, and the system will automatically search for documents.
Second, advanced search refers to adding search conditions within a specified range to meet more complex user requirements, thereby enabling users to retrieve satisfactory information.
Third, professional search is intended for experienced users, who use the Common Query Language (CQL) to construct retrieval formulas that express their needs precisely.
③Processing of search results
Results display; view and download full text
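The professional search described above relies on retrieval formulas written in CQL. The sketch below shows how such a formula might be assembled as a string in the general CQL style of `field relation "term"` clauses joined by Boolean operators. The field names `title` and `author` and the helpers `cql_clause` and `cql_combine` are hypothetical; the source does not list the field names or relations that the Wanfang platform actually accepts.

```python
def cql_clause(field, term, relation="="):
    """Return one search clause, e.g. title = "information retrieval"."""
    # Escape backslashes and double quotes so the term stays one quoted string.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field} {relation} "{escaped}"'


def cql_combine(operator, *clauses):
    """Join clauses with a Boolean operator and wrap them in parentheses."""
    if operator not in {"and", "or", "not"}:
        raise ValueError(f"unsupported operator: {operator}")
    return "(" + f" {operator} ".join(clauses) + ")"


# Hypothetical query: a title phrase restricted to either of two authors.
query = cql_combine(
    "and",
    cql_clause("title", "information retrieval"),
    cql_combine("or", cql_clause("author", "Zhang"), cql_clause("author", "Li")),
)
print(query)
```

Wrapping every combination in parentheses keeps the evaluation order explicit, which avoids depending on any particular operator precedence.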
4. The Book and Newspaper Information Center of Renmin University of China and the full-text database of Renmin University photocopied newspapers and periodicals
⑴Book and Newspaper Information Center of Renmin University of China
Overview
The Book and Newspaper Information Center of Renmin University of China (hereinafter referred to as the Book and Newspaper Information Center) was established in 1958. It has developed into a comprehensive, cross-media modern publishing organization and a new type of resource service institution that also operates periodical publishing, online electronic publishing, information consulting, and other businesses.
The printed "Photocopied Journal Materials" selected by the Book and Newspaper Information Center has become the most influential social science literature collection in China due to its wide coverage, large amount of information, scientific classification, strict screening, and reasonable and complete structure. Since 2001, Beijing Boliqun Company has produced and distributed the online version of the Renmin University photocopied newspaper and periodical database, which includes a full-text database, a digital journal database, a newspaper and periodical index database, a newspaper and periodical database, a catalog database, and a special research database.
⑵Full-text database of Renmin University photocopied newspapers and periodicals
Overview
The full-text database of Renmin University's photocopied newspapers and periodicals is a collection of social science and humanities documents selected by more than 100 experts, scholars, and professors engaged by the Book and Newspaper Information Center of Renmin University of China from more than 6,000 core journals and newspapers published nationwide. It is an important reference for researchers, teachers, and students in their studies and research.
Features
⑴Has full search function
This database selects central and local newspapers and periodicals, university journals and other literature. It not only contains independent papers, but also compiles indexes of unselected articles, including article titles and table of contents, covering many fields of social sciences.
⑵Academic and authoritative
This database focuses on selecting information on various academic theories, with special attention to hot issues in the humanities and social sciences. The reprint rate of this database has become one of the main indicators for the academic community to evaluate the quality of journals and academic papers.
⑶Novel and innovative
This database collects the latest special literature in the field of humanities and social sciences, and reflects new theories and trends in a timely manner. It not only pays close attention to the scientific development trends in the information age, but also strives to track new developments in social sciences and humanities.
Search
①Simple query
It is set for cross-database search, and you can select one or more databases for search.
②Advanced query
Advanced search provides single condition or compound query of multiple conditions.
③User customization and auxiliary functions
Using the "user customization" and "accessibility" options provided by the database, users can customize their own personalized interface to assist browsing and retrieval.
5. Comparison of the four major Chinese journal full-text databases
⑴Inclusion scope and quantity
① The VIP Chinese Science and Technology Journal Database includes the largest number of journals and covers the longest period, so it is better suited to retrospective searches of scientific and technological literature.
②The CNKI China Journal Full-text Database includes a large number of journals on education and social sciences, politics, economics, and law, and is very comprehensive. It is highly complementary to the VIP database, and neither can replace the other.
③The Wanfang Academic Journal Database has a high duplication rate with CNKI and VIP.
④The Renmin University full-text database of photocopied materials has the highest proportion of core journals and the best quality of included documents, followed by Wanfang.
⑵Search function
① Search fields: Each database provides search fields such as keyword, article title, journal title, author, institution, and abstract, but each has its own characteristics. For example, CNKI offers the most search fields and returns more hits when the same search terms are searched through the same field, so it is better suited to cutting-edge or unpopular topics with little literature. VIP has a relatively low recall rate, but its hits are relatively useful.
②Search modes: Each database provides navigation search, simple search, and advanced search, with slight differences in implementation. For example, in simple search, CNKI, Wanfang, and the Renmin University photocopied materials database provide only a single search box that accepts one search term at a time and does not support terms combined with multiple operators. VIP's simple search supports entering terms combined with multiple operators in the search box.
③Special search functions: CNKI provides a search term dictionary, which supports more comprehensive and accurate retrieval. VIP has compiled a thesaurus, uses the same-name author database to restrict results by author affiliation, and supports searching by CLC classification number, all of which improve recall and precision. Wanfang can limit results by distribution area. The Renmin University photocopied materials database provides input help for search terms in multiple fields, and lets users pick suitable help terms by matching word, pinyin, stroke count, matching method, and logical relationship.
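The recall rate and precision rate used in this comparison have standard definitions: recall is the share of all relevant documents that a search retrieves, and precision is the share of retrieved documents that are relevant. A minimal sketch, using made-up document ids and a hypothetical helper named `precision_recall`:

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: 4 documents returned, 5 relevant documents exist,
# and 2 of the returned documents are relevant.
precision, recall = precision_recall(retrieved=[1, 2, 3, 4], relevant=[2, 4, 5, 6, 7])
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A database with many hits per query tends toward higher recall, while one whose hits are "relatively useful" tends toward higher precision.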
⑶Search results
①Full-text output format: Documents in several databases can be browsed as original images, and the full text can also be processed by text recognition using the recognition system provided on the document reader.
② Sorting and deduplication: CNKI search results are sorted by topical relevance or by document date. VIP search results are arranged in reverse chronological order, and results from the same period are arranged by journal. Wanfang can sort by relevance, classic papers, or latest papers. The Renmin University photocopied materials database has the most flexible sorting: results can be sorted by record loading time or by any specified field.
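The VIP ordering described above, reverse chronological with ties broken by journal, is a two-key sort. The sketch below assumes that "journal order" means ascending journal name, which the source does not specify; the records and the function name `sort_vip_style` are hypothetical.

```python
# Hypothetical result records; titles and journal names are illustrative only.
records = [
    {"title": "Paper A", "journal": "Journal B", "year": 2008},
    {"title": "Paper B", "journal": "Journal A", "year": 2010},
    {"title": "Paper C", "journal": "Journal A", "year": 2008},
]


def sort_vip_style(records):
    """Sort newest year first; within one year, by ascending journal name."""
    # Negating the year reverses the chronological order while the
    # journal name keeps its natural ascending order.
    return sorted(records, key=lambda r: (-r["year"], r["journal"]))


ordered = sort_vip_style(records)
for rec in ordered:
    print(rec["year"], rec["journal"], rec["title"])
```

Sorting by "any specified field", as in the Renmin University database, would amount to letting the user choose the key passed to the sort.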
⑷User interface
①CNKI and the Renmin University full-text database of photocopied materials have simple interfaces and simple, flexible search methods. The Renmin University database also provides more input help, so even users without professional knowledge can learn to use it.
② VIP Chinese Science and Technology Journal Database enables literature browsing through classification and navigation, and can also be searched through primary and advanced search methods. The search pages are displayed more clearly.
③Wanfang searches for documents through multiple search methods, but the pages are too complex and cumbersome, making it difficult for first-time users to successfully master and achieve high recall and precision rates.
④ In addition, compared with foreign full-text databases, several major domestic databases have not yet developed many personalized search functions, and their level of intelligence is not high, and they need further improvement and perfection.
⑸Service method
① In terms of service methods, the databases offer options such as search cards, CD-ROMs, LAN access, mirror sites, database packages, and usage-based billing.
②From the perspective of order price, CNKI and VIP are relatively cheap and convenient to use and maintain, while Wanfang is relatively expensive.
③From the user's perspective, when selecting a database, the selection can be based on the specific characteristics and needs of the unit.
④In addition, when obtaining the full text from the Internet, each document of CNKI, Wanfang and Renmin University photocopied materials can be directly linked to the full text from the bibliography for download, while some documents of VIP can only be obtained through e-mail.
7. National newspaper and periodical index database
Overview
The National Newspaper and Periodical Index Database was founded in 1993. It is currently the largest continuously and dynamically updated Chinese newspaper and periodical index database in the world. Its content covers the humanities, social sciences, natural sciences, and other fields, and it indexes newspapers and periodicals published in China. At present, it is divided into four databases: a catalog database, an article title database, a conference database, and a Western-language database.
Search
Ordinary search: The default page of the database is the ordinary search page, and ordinary search supports field searching.
Advanced search: In addition to ordinary search functions, advanced search also supports logical combinations between fields.
Professional search: Provides command search query method, you can directly enter the assembled search formula to search.
Journal navigation: First, search based on the journal title, year of publication, sponsor, and place of publication. The second is to browse the journals according to the order of the pinyin initials of the journal titles. The third is to browse journals according to the Chinese Library Classification.
8. Chinese Social Sciences Citation Index
CSSCI Overview
The Chinese Social Sciences Citation Index (CSSCI) is a key research project of the Ministry of Education. It was developed by the Chinese Social Science Research Evaluation Center of Nanjing University and is used to search for the inclusion and citation status of Chinese papers in the humanities and social sciences.
Search
⑴ Source document retrieval: mainly used to look up the authors, article titles, references, and other details of articles in the indexed source journals.
⑵ Cited literature search: mainly used to query the citation status of authors, papers, journals, etc.
9. Chinese Science Citation Database
CSCD Overview
The Chinese Science Citation Database (CSCD) was founded in 1989. It is jointly funded by the National Natural Science Foundation of China and the Chinese Academy of Sciences, and was developed by the Documentation and Information Center of the Chinese Academy of Sciences. The compilation of the database closely follows the system of the American "Science Citation Index".
The database is rich in content, scientific in structure, and accurate in data. In addition to general search functions, the system also provides a new type of index relationship - citation index.
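A citation index inverts the reference lists of source papers, so that a searcher can go from a cited work to the papers that cite it. The sketch below shows the idea on made-up paper ids. It is not CSCD's or CSSCI's actual data model, and `REFERENCES` and `build_citation_index` are hypothetical names.

```python
from collections import defaultdict

# Hypothetical source records: each paper id maps to the ids it cites.
REFERENCES = {
    "P1": ["P3", "P4"],
    "P2": ["P3"],
    "P3": ["P4"],
}


def build_citation_index(references):
    """Invert reference lists: map each cited paper to the papers citing it."""
    cited_by = defaultdict(list)
    for citing, cited_list in references.items():
        for cited in cited_list:
            cited_by[cited].append(citing)
    return dict(cited_by)


CITED_BY = build_citation_index(REFERENCES)
print(CITED_BY["P3"])       # papers that cite P3
print(len(CITED_BY["P4"]))  # citation count of P4
```

Source document retrieval corresponds to looking up `REFERENCES`, while cited literature search corresponds to looking up `CITED_BY`.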
10. China Academic Library and Information System (CALIS)
CALIS overview
The China Academic Library and Information System (CALIS) is one of the public service systems for higher education in China approved by the State Council. Its purpose, under the leadership of the Ministry of Education, is to integrate national investment, modern library concepts, advanced technology, and rich literature and human resources in order to build a comprehensive literature guarantee system for higher education with the China Higher Education Digital Library (CADLIS) at its core. The system aims to achieve joint construction, mutual awareness, and sharing of information resources, to maximize social and economic benefits, and to serve China's higher education.
The CALIS Management Center is located at Peking University. Since its construction began in 1998, it has developed an online cooperative cataloging system, a document delivery and interlibrary loan system, a unified search platform, and a resource registration and scheduling system, forming a relatively complete CALIS document information service network.
Data resources fall into two main categories: foreign-language and Chinese. Foreign-language data resources include full-text, abstract, and fact databases, grouped into categories such as the foreign-language full-text e-book database, the foreign-language doctoral and master's thesis full-text database, the OCLC FirstSearch database system, special resource databases, and other imported databases. Chinese data resources include the union catalog sub-project, the university thesis database sub-project, the special topic database sub-project, the key subject navigation database sub-project, the virtual reference consultation sub-project, the teaching reference information sub-project, the resource assessment sub-project, and standards and specifications construction, among others.
Overview of CALIS College Thesis Database
The CALIS university dissertation database contains doctoral and master's degree theses from 83 member libraries including Peking University, Tsinghua University and other nationally renowned universities.
The database provides two search methods: simple search and complex search.
11. Online bibliographic search system
Overview of the Online Public Access Catalog
The Online Public Access Catalog (OPAC) evolved from the open public access catalog. It was jointly developed by some university and public libraries in the United States in the late 1970s as an online bibliographic retrieval system that lets readers query library holdings. OPAC is the foundation of library automation and an integral part of future electronic libraries.
Characteristics of current OPAC
⑴ Data resources are more abundant. On the basis of providing bibliographic data, the current system adds data sources such as indexes, personal directories, institutional directories, maps, manuscripts, etc., and is linked to full-text databases, making it suitable for users to obtain the full text of documents remotely.
⑵The user interface is more friendly. The design purpose of OPAC is to be standardized, concise, vivid, and suitable for ordinary end users without special training. The system not only prompts and guides users to operate correctly and quickly, and provides feedback information for human-computer dialogue, it also provides detailed error information and a display format that is consistent with user habits.
⑶Retrieval technology is flexible and diverse. OPAC uses various retrieval technologies such as keyword retrieval, natural language retrieval, and Boolean logic retrieval.
⑷Online services are more thoughtful. OPAC is a network-based bibliographic retrieval system that provides a full range of online information retrieval services.
Chapter 5 Searching English Online Databases
EBSCOhost system full-text database
ProQuest system full-text database
Elsevier Science Direct full-text database
SpringerLink full-text journal database
JSTOR (Journal Storage) back-issue full-text database
Chapter 6 Core Search Evaluation System (omitted)
Dialog international online search system
OCLC FirstSearch international online search system
ISI Web of Science database
Engineering Index
Chinese Social Sciences Citation Index
Chapter 7 Retrieval of Special Literature Information
1. Search for conference documents
Overview of conference literature
Concept
Conference documents: refers to the materials and publications produced at domestic and foreign academic and non-academic conferences, including conference papers, conference documents, conference reports, discussion papers, etc., among which conference papers are the most important conference documents.
Features
⑴ Strong professionalism and high academic level;
⑵The content is novel and timely;
⑶Large amount of information and concentrated professional content;
⑷High reliability;
⑸Flexible and diverse publishing forms, etc.
Classification
⑴Divided according to the order of publication time: pre-conference documents, mid-conference documents, and post-conference documents
⑵Divided by publication form: books, journals, scientific and technological reports, audio-visual materials
Retrieval of domestic conference documents
Internet search system
CNKI China's important conference paper full-text database
The CNKI full-text database of China's important conference papers is the conference paper database of the China Journal Network (CNKI). It collects papers from Chinese societies and associations at the second level and above, universities, research institutes, academic institutions, and other organizations since 2000.
Search methods: The database provides multiple search methods such as primary search, advanced search, professional search and conference organizer navigation.
Wanfang academic conference paper database
This database contains conference papers sponsored by the world's major societies and associations from 1985 to the present, provided by the China Institute of Scientific and Technological Information. Most are high-quality papers from conferences sponsored by first-level societies and associations. The papers cover many fields, including natural sciences, engineering technology, agriculture, forestry, and medicine. It is one of the most comprehensive and largest conference paper databases in China.
Search method: Search through Wanfang Data Knowledge Service Platform.
Shanghai Library Conference Information Database
The Shanghai Institute of Science and Technology Information, which merged with the Shanghai Library in 1995, has collected scientific and technological conference documents of all kinds since 1958, forming a specialized collection. It now provides retrieval services for conference materials from 1986 to the present.
Search method: You can choose one of the search methods such as document title, paper title, individual responsible person, conference name, conference location, date/category, etc., and full-text copying services are provided.
CALIS academic conference paper library
Includes papers from international conferences hosted annually by 61 key schools of the 211 Project. Most of the conferences provide conference proceedings with official publication numbers. If visitors want to obtain the full text, they can use interlibrary loan and document delivery.
China Conference Network, China Academic Conference Online and other professional websites, conference sites, etc.
Retrieval of foreign conference documents
ISI Conference Proceedings Database (WOSP)
The ISI Web of Knowledge retrieval platform of the Institute for Scientific Information (ISI) integrates the scientific and technological conference proceedings index and the social science conference proceedings index into ISI Proceedings, which is included in the ISI Web of Science database and is referred to as WOSP. It collects the latest published conference papers from around the world.
Conference Paper Index (CPI) database
"Conference Paper Index" was founded by the American Data Express Company in 1973. It was originally called "Recent Conference Forecast" and was renamed in 1978 as a monthly publication. It provides timely information on the latest research advances in science, technology, and medicine. Each issue of "Conference Paper Index" includes a classification table, a conference address table, the main text, and an index.
The Conference Paper Index database is the online version of the "Conference Paper Index". It collects information on conferences and conference documents around the world since 1982, and provides an index of conference papers and of announced conferences.
2. Search for dissertations
Dissertation Overview
Concept
Dissertations arose with the implementation of the degree system. A dissertation is an academic research paper written by a graduate of a college, university, or scientific research institution in order to qualify for a degree.
Features
⑴Specialized and original content: Dissertations, especially doctoral theses, generally discuss relatively specialized topics and often contain important information or novel and original academic viewpoints, which have great reference value.
⑵Special publication form: Dissertations are written only for review and defense. Most are not published publicly, but are kept as printed copies in the library of the degree-granting institution or in other designated collection locations.
⑶Huge number and decentralized management: As degree education expands, colleges, universities, and scientific research institutions around the world produce a large number of master's and doctoral theses every year. These dissertations are generally held by the individual degree-granting institutions or at designated locations, which makes them difficult to obtain.
Classification
⑴According to the level of the degree awarded, it can be divided into bachelor's thesis, master's thesis and doctoral thesis;
⑵According to the discipline and major studied by the degree applicant, thesis can be divided into social science thesis and natural science thesis;
⑶ According to the country of the awarding unit, dissertations include domestic dissertations and foreign dissertations;
⑷ According to the language of the dissertation, it can be divided into Chinese dissertation, Japanese dissertation, English dissertation, etc.
Search domestic dissertations
Traditional search methods
Consult printed reference books, such as: "China Dissertation Bulletin", "China Doctoral Dissertation Abstracts", etc.
Internet search methods
One is comprehensive or professional search tools, such as: "Chemical Abstracts" and "Science Abstracts" all include dissertations in their respective fields;
The second is specialized thesis retrieval tools, such as: International Dissertation Abstracts, CNKI China's Excellent Doctoral and Master's Thesis Full-text Database, Wanfang China Dissertation Database, CALIS University Dissertation Database, etc.;
Third, the website of the thesis collection institution of the university library provides the database of thesis collected by the institution, such as: Peking University Dissertation Database, Tsinghua University Dissertation Service System, etc.
Retrieval of foreign dissertations
PQDD Doctoral/Master's Thesis Database (ProQuest)
It is the world's largest and most widely used international dissertation abstract index database. It provides 18 search languages, including Chinese, and searches mainly through browsing, basic search, and advanced search.
NDLTD Degree Database
NDLTD (Networked Digital Library of Theses and Dissertations) is an online project for the joint construction and sharing of dissertations, supported by the U.S. National Science Foundation. Using an OAI-based union catalog of theses and dissertations, it provides users with free dissertation abstracts, as well as the full text of some freely accessible dissertations.
3. Patent document search
Overview of patents and patent documents
Concept
Patent: refers to a legally protected, technologically exclusive right granted to patent applicants by the patent administration department in accordance with the law in a country that has established a patent system. Patents usually include three levels of meaning: patent rights, patented inventions, and patent documents.
Patent literature: It is the general name for official documents and publications produced by national and international patent organizations that implement the patent system during the approval process. It is an important information source that integrates technical, legal and economic aspects. Patent documents in a broad sense mainly include the following types: patent specifications, patent gazettes, and patent classifications.
(Intellectual property rights: the exclusive rights that people enjoy in accordance with the law over the results of their intellectual labor. They are usually exclusive or monopoly rights granted by the state to creators over their intellectual achievements for a certain period of time.)
Characteristics of patent documents
⑴ Detailed content and novel technology.
⑵The amount of literature is large and covers a wide range of fields.
⑶ It spreads rapidly and has a large number of repeated reports.
⑷The content is limited and technically conservative.
Types of patent document searches
Novelty search: also commonly called a novelty check, advancement search, or grant-prospect search. By searching patent documents, it determines whether a technical subject has the novelty and inventiveness required by the Patent Law.
Thematic search: A worldwide search of patent and non-patent documents on a certain technical topic to retrieve all relevant documents.
Patent family search: finds the patent applications filed for the same technical subject in multiple countries, in order to determine the regional scope of protection of the patent.
Legal status search: includes patent infringement search and validity search.
Tracking search: regularly tracking a particular technical field helps users understand the development direction of related technologies and keep up with the latest patent information.
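The patent family search described above can be sketched as a grouping step: collect applications that belong to the same family and list the countries they cover. The sketch assumes that a family is identified by a shared priority number, which is one common convention but is not stated in the source. All application numbers, the field names, and the function `family_countries` are hypothetical.

```python
from collections import defaultdict

# Hypothetical application records; the numbers are made up for illustration.
APPLICATIONS = [
    {"number": "CN-001", "country": "CN", "priority": "PRIO-A"},
    {"number": "US-002", "country": "US", "priority": "PRIO-A"},
    {"number": "JP-003", "country": "JP", "priority": "PRIO-A"},
    {"number": "CN-004", "country": "CN", "priority": "PRIO-B"},
]


def family_countries(applications):
    """Group applications by shared priority and list each family's countries."""
    families = defaultdict(set)
    for app in applications:
        families[app["priority"]].add(app["country"])
    # Sort the country codes so the output is stable and easy to read.
    return {prio: sorted(countries) for prio, countries in families.items()}


FAMILIES = family_countries(APPLICATIONS)
for priority, countries in FAMILIES.items():
    print(priority, countries)
```

The list of countries per family is a first approximation of the regional protection scope; whether each application is still in force is a matter for a legal status search.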
Printed search tool for patent documents
UK Derwent patent literature search tools
The British Derwent Company is a world-famous patent document publishing organization founded in 1951.
The "World Patent Document Search Tool" published by the company is the patent document search tool with the widest coverage, the largest scale, and the most complete search system in the world. It reports, in English and in the form of bibliographies and abstracts, patent information from more than 30 countries and regions, two international patent organizations, and two international patent publications, organized by country and by subject field.
It is published quickly and has various media. In addition to printed retrieval tools, it also has microfilm, disk and optical disk databases and other forms, and is widely used by countries around the world.
"China Patent Bulletin"
The "China Patent Bulletin" includes three volumes: "Invention Patent Bulletin", "Utility Model Patent Bulletin" and "Design Patent Bulletin".
It reports, in the form of abstracts or bibliographic entries, the published application specifications, examined specifications, grant announcements, and invention patent affairs announcements issued during the week.
Updated weekly, it is intended to report China's recent patent activity quickly, so the bulletin is a way to keep up with the latest patents in China.
"China Patent Index"
To facilitate retrospective searching of China's patent documents, the State Intellectual Property Office publishes the "China Patent Index". The index covers the publication, examination, announcement, and grant of invention patents, utility model patents, and design patents during each cumulation period. It is now a quarterly publication.
The "China Patent Index" is now published in three volumes: "Classification Number Index", "Applicant and Patentee Index" and "Application Number and Patent Number Index". Users can obtain the classification number, invention title, application number, patent number, applicant or patentee, and the volume and issue number of the corresponding patent bulletin by querying any index.
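The three index volumes can be pictured as three lookup tables over the same set of records. The sketch below is illustrative only: the record fields follow the list above, while the `PatentRecord` class, the `build_indexes` helper, and every sample value are hypothetical and are not taken from the actual publication.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PatentRecord:
    # Fields mirror the items the text says every index entry leads to.
    classification: str
    title: str
    application_no: str
    patent_no: str
    applicant: str
    bulletin_issue: str  # volume and issue of the corresponding patent bulletin


def build_indexes(records):
    """Build three lookup tables, one per index volume."""
    by_class = defaultdict(list)      # classification number index
    by_applicant = defaultdict(list)  # applicant and patentee index
    by_number = {}                    # application number and patent number index
    for rec in records:
        by_class[rec.classification].append(rec)
        by_applicant[rec.applicant].append(rec)
        # The number index accepts either the application or the patent number.
        by_number[rec.application_no] = rec
        by_number[rec.patent_no] = rec
    return by_class, by_applicant, by_number


# Made-up sample data, for illustration only.
records = [
    PatentRecord("G06F", "Example retrieval method", "APP-0001", "PAT-0001",
                 "Example Co.", "Vol. 1, No. 1"),
    PatentRecord("G06F", "Example indexing device", "APP-0002", "PAT-0002",
                 "Sample Institute", "Vol. 1, No. 2"),
]
by_class, by_applicant, by_number = build_indexes(records)
print(by_number["APP-0002"].bulletin_issue)
print([r.title for r in by_class["G06F"]])
```

Whichever table is queried, the result is the same full record, which is the point of the three-volume arrangement: any one known data item is enough to reach the others.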
Search for Chinese patent information
Search method
One is through printed search tools, such as "Patent Gazette", "China Patent Index", "China Patent Abstract", etc.;
The second is through CD-based retrieval systems, such as China Patent Abstracts Database, China Patent Specifications Database, etc.;
The third is to use network-based search systems, such as the China Intellectual Property Office Patent Search System, China Patent Information Network, China Intellectual Property Network, CNKI China Patent Database, etc. These patent databases and intellectual property and patent websites are the most frequently used means of retrieving patent information.
Advantages of online search systems
In terms of coverage, online patent search systems far exceed the scope of traditional patent search tools. They are rich in data, and many patent databases also provide valuable material such as the full text of patent specifications;
They also support multilingual searching, are highly efficient, are not limited by time or place, and are kept current;
In addition, they offer multiple search modes, such as classified browsing, simple search, and menu search, together with online help, operating guides, and other auxiliary functions.
Retrieval of foreign patent information
USPTO patent database
The USPTO patent database is an online patent database provided by the United States Patent and Trademark Office (USPTO). It provides retrieval services for bibliographies, abstracts, and full text of patent specifications including drawings of U.S. patents through the Internet. The data is updated weekly.
It is divided into two parts: the granted patent database and the patent application database.
Derwent series database
The Derwent series of patent databases is produced by Derwent, the world's most authoritative patent document publishing organization. They are currently the patent databases with the most powerful search capabilities, and search services are mainly provided for the following three databases.
⑴ Derwent World Patents Index (WPI): the world's most authoritative, high value-added, deeply processed patent database. It mainly collects patents from 41 industrialized countries and regions and two international patent organizations, and can provide users with the patent specifications issued by the world's major patent offices. Its patent classification system is based on the International Patent Classification, and all abstracts are in English.
⑵ Derwent Innovations Index (DII): a Web-based patent information database launched by Derwent. It integrates the Derwent World Patents Index (WPI) and the Patents Citation Index (PCI) to provide global patent information services.
⑶ Derwent Discovery
4. Search for scientific and technological reports
Overview of scientific and technological reports
concept
Science and technology report: also known as a research report or technical report, it is a formal report on the results of a scientific research project or activity, or a factual record of the research process. It is written by a research institution, research unit, professional academic group, or individual and submitted to the department or organization that funds and sponsors the work, reporting on the progress of the research design or project.
type
⑴ By content: basic theoretical research reports and engineering technical reports;
⑵ By form: technical reports, technical notes, technical papers, technical memoranda, bulletins, technical translations, contractor reports, special publications, and others;
⑶ By stage of research: preliminary reports, progress reports, interim reports, and final reports;
⑷ By scope of circulation: top secret reports, confidential reports, secret reports, unclassified limited-distribution reports, unclassified reports, and declassified reports.
Features
⑴ The content is novel, specialized, and specific;
⑵ New scientific and technological achievements are reported quickly;
⑶ There are many types and large quantities;
⑷ The form of publication is distinctive.
Search for Chinese scientific and technological achievements
Wanfang China Science and Technology Achievements Database
National scientific and technological achievements database of CNKI
Aviation science and technology report abstract database
National research report
National Science and Technology Library and Documentation Center
Four major U.S. government science and technology reports and their retrieval
Four major U.S. government science and technology reports
⑴PB report
After World War II, the United States set up the Publication Board (PB) under the Department of Commerce to organize the internal scientific and technical materials obtained from the defeated countries. Each document number begins with PB, the initials of the board's English name, hence the name PB report. The scope of the PB reports has changed several times and now centers on civilian engineering technology.
⑵AD report
AD reports date from 1951. They were originally the uniformly numbered scientific and technical reports of the U.S. Armed Services Technical Information Agency (ASTIA). The agency was later renamed the Defense Technical Information Center (DTIC), which continues to use AD numbers when collecting and announcing reports on defense research and development results.
⑶NASA report
The NASA report focuses on aviation and aerospace technology. It is a scientific and technological report published by the National Aeronautics and Space Administration (NASA) and has a unified number.
⑷DE report
Formerly known as the DOE report, it is a technical report document published by the U.S. Department of Energy (DOE) and its affiliated scientific research institutions, energy information centers, companies, enterprises, and academic groups focusing on energy and its applications.
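Because each of the four series is identified by the prefix of its report number, a first sorting step can be sketched in a few lines. This is a minimal sketch under stated assumptions: the `classify_report` helper and the pattern (a series prefix followed by a space, hyphen, or digit) are hypothetical simplifications, real report numbers are more varied, and the sample numbers are made up.

```python
import re

# Series prefix -> short description, following the text above.
REPORT_SERIES = {
    "PB": "PB report (civilian engineering technology)",
    "AD": "AD report (defense research and development)",
    "NASA": "NASA report (aeronautics and space technology)",
    "DE": "DE report (energy and its applications)",
}

# Assumed simplification: the prefix must be followed by whitespace,
# a hyphen, or a digit, so ordinary words such as "DEMO" are not matched.
_PREFIX = re.compile(r"^(NASA|PB|AD|DE)(?=[\s\-\d])", re.IGNORECASE)


def classify_report(number: str):
    """Return the series prefix of a report number, or None if unrecognized."""
    match = _PREFIX.match(number.strip())
    if match is None:
        return None
    return match.group(1).upper()


# Made-up report numbers, for illustration only.
for sample in ["PB-000001", "ad 000002", "NASA-TM-0000", "DE00000000", "DEMO-1"]:
    series = classify_report(sample)
    print(sample, "->", REPORT_SERIES.get(series, "unrecognized"))
```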
Print search tools for the four major U.S. science and technology reports
Search tool for the four major reports: "Government Reports Announcements & Index" (GRA&I)
GRA&I reports, in abstract form, the research reports and scientific literature provided by U.S. government agencies and their contractors, and is the main tool for finding the four major science and technology reports. It covers all PB and AD reports and some NASA and DE reports.
Search tool for NASA reports: "Scientific and Technical Aerospace Reports" (STAR)
STAR is the main tool for finding NASA reports. It is a comprehensive abstracting publication issued by the information division of the National Aeronautics and Space Administration, and it also serves as an auxiliary tool for retrieving the four major reports.
Search tool for DE reports: "Energy Research Abstracts" (ERA)
"Energy Research Abstracts" (ERA), edited and published by the U.S. Department of Energy's Office of Scientific and Technical Information, is the primary tool for retrieving DE reports. It mainly reports, in abstract form, the research reports provided by the laboratories, research centers, and contractors affiliated with the U.S. Department of Energy.
Online search of the four major U.S. science and technology reports
NTIS system
Provided by the U.S. National Technical Information Service (NTIS), it is the online version of the "Government Reports Announcements & Index" and is mainly used to search for the four major U.S. government reports.
STINET database
STINET, the report database of the U.S. Defense Technical Information Center (DTIC), provides free search services through the center's scientific and technical information network server.
NASA Technical Reports Server (NTRS)
Used to search for aerospace science and technology reports, you can browse and search the abstracts and full texts of the reports.
5. Search for standard literature
Overview of standards literature
concept
Standards literature: the special body of scientific and technical literature made up of technical documents, such as specifications, quotas, plans, and requirements, that are prepared according to prescribed procedures, approved by a recognized authority, intended for wide and repeated use within a certain scope, and required to be implemented in a specific field of activity.
Features
⑴ They follow a unified production process and a special writing format and style;
⑵ They have a clear scope of application and purpose;
⑶ They are legally binding;
⑷ They are time-sensitive;
⑸ They are coordinated with one another.
type
⑴Divided by scope of use: international standards, regional standards, national standards, industry standards, local standards, and enterprise standards.
⑵Divided according to content and nature: technical standards, management standards.
⑶Divided according to the degree of legal constraints: mandatory standards and recommended standards.
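The three ways of dividing standards above are independent of one another, so a single standard carries one value from each. The sketch below models that with three enumerations. It is only an illustration: the class and function names are hypothetical, the sample number is made up, and the rule that a `GB` prefix marks a mandatory national standard while `GB/T` marks a recommended one is the commonly used Chinese numbering convention, which the text above does not itself state.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    INTERNATIONAL = "international"
    REGIONAL = "regional"
    NATIONAL = "national"
    INDUSTRY = "industry"
    LOCAL = "local"
    ENTERPRISE = "enterprise"


class Content(Enum):
    TECHNICAL = "technical"
    MANAGEMENT = "management"


class Binding(Enum):
    MANDATORY = "mandatory"
    RECOMMENDED = "recommended"


@dataclass(frozen=True)
class StandardRecord:
    number: str
    scope: Scope
    content: Content
    binding: Binding


def binding_from_national_number(number: str) -> Binding:
    """Assumed convention: 'GB' is mandatory, 'GB/T' is recommended."""
    parts = number.strip().upper().split()
    head = parts[0] if parts else ""
    if head == "GB/T":
        return Binding.RECOMMENDED
    if head == "GB":
        return Binding.MANDATORY
    raise ValueError(f"not a recognized national standard number: {number!r}")


# Made-up standard number, for illustration only.
number = "GB/T 00000-0000"
record = StandardRecord(number, Scope.NATIONAL, Content.TECHNICAL,
                        binding_from_national_number(number))
print(record.scope.value, record.content.value, record.binding.value)
```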
Retrieval of Chinese standard documents
Traditional search tools
"National Standard Catalog of the People's Republic of China"
It contains all current national standard information, and also supplements the catalog of replaced and abolished national standards, as well as national standard modifications, corrections, errata notices and other related information.
"China Standardization Yearbook"
Its content includes three parts: the current status of national standardization undertakings, the national standard classification catalog and the standard serial number index.
"Compilation of Chinese National Standards"
This compilation is a large comprehensive collection of national standards. It includes all current national standards officially published in China.
"National Standards Replacement and Abolition Catalog"
It provides the latest replacement, abolition and transformation information of national standards.
"China Standard Herald"
It is a comprehensive standardization periodical that combines policy, scholarship, technology, and information.
"World Standard Information"
It introduces, in bibliographic form, the latest national standards, industry standards, "Taiwan" standards, international and advanced foreign standards, and standardization developments at home and abroad.
Web search tools
Wanfang Chinese and foreign standards database
The database covers a large number of Chinese and foreign standards: all Chinese national standards, the industry standards of certain industries, and technical standards for electrical and electronics engineers; international standards databases, the national standards of the United States, Britain, Germany, and other countries, and international electrotechnical standards; and the industry standards of certain countries.
CNKI Chinese Standard Database
The China Standards Database contains all national standards published by China Standards Press and issued by the Standardization Administration of China from 1950 to the present, accounting for more than 90% of all national standards.
Standardization Administration of China website
The Standardization Administration of China (SAC) is the competent authority authorized by the State Council to perform administrative functions and uniformly manage national standardization work. Its website provides a relatively systematic national standards search database.
National Standard Document Sharing Service Platform
"Construction of Standard Document Sharing Service Network" is one of the key construction projects of the National Science and Technology Basic Conditions Platform. It is a national standards information service portal and the Chinese site of the World Standards Service Network. The standard databases that can be queried include Chinese national standards, ISO international standards, IEC, ANSI, DIN and other standard databases.
ChinaGB national standard channel
This website is China's largest professional website for standards consulting services, providing comprehensive consulting services on Chinese national standards, industry standards, local standards, international standards, and foreign standards.
Retrieval of foreign standard documents
International Organization for Standardization (ISO)
International Electrotechnical Commission (IEC)
American National Standards (ANSI)
Japanese Industrial Standards (JIS)
British Standards (BS)
German standards (DIN)
Chapter 8 Retrieval of data and factual information
1. Data and fact-based reference tools
⑴ Definition and characteristics of reference books
definition
Printed reference book: often shortened to reference book, it is a specific type of book that compiles the knowledge and materials of a certain subject or a specific scope in response to particular social needs, uses specific methods of arrangement and retrieval, and is intended to be consulted as a tool for answering questions and supplying data or factual information.
Features
⑴ The content is intended specifically for consultation
⑵ Entries are concise summaries
⑶ The arrangement is designed for easy lookup
⑷ The content is authoritative and reliable
⑵ Main types, structure, and lookup methods of reference books
type
Dictionaries, lexicons, encyclopedias, yearbooks, biographical materials, handbooks, directories, guides, tables, catalogs, compilations of materials, etc.
structure
Description, Table of Contents, Text, Appendices and Index
Lookup methods
By character order, by classification, by subject, by natural order, and alphabetically
⑶ Search steps for reference books
①Analyze search topics and determine search tools
②View the arrangement structure
③Find detailed content
2. Data and fact database
Characteristics and types of data and fact databases
data database
Also known as a numeric database, it is a type of database that stores survey and statistical data and provides its information specifically in numerical form.
factual database
It refers to a type of database that stores various factual information with retrieval and utilization value. Database information comes from encyclopedias, dictionaries, directories of people, directories of institutions, etc.
Features
①Rich and complete content with extensive links
②Flexible and convenient to use
③Fast data updates and powerful service functions
type
⑴ Character dictionaries and word dictionaries
Dictionaries are the most familiar and commonly used reference tools. They collect words, such as the vocabulary of a language or the names of things, and arrange them in a certain order so that readers can look up pronunciation, spelling, grammar, meaning, usage, and so on. According to the scope of the entries they include, they can be divided into language dictionaries, comprehensive dictionaries, and specialized dictionaries.
⑵Encyclopedia
Encyclopedia: a large-scale reference tool that gathers knowledge from all branches of knowledge, or systematically and completely summarizes the knowledge of one branch. It is the most complete kind of reference tool and is known as the "King of Reference Books". An encyclopedia explains the basic knowledge and important research results of each discipline systematically and concisely, and provides definitions, principles, methods, history, current status, statistics, references, and other information on a discipline, giving readers systematic and comprehensive knowledge.
Features: authoritative content, comprehensive interpretation, completeness of the retrieval system, completeness of the reference system, and completeness of the revision system.
Ancient Chinese encyclopedic works are mainly divided into leishu (category books) and political books
Leishu: large-scale compilations of materials in ancient China. They excerpt materials from various books and arrange them by category, rhyme, and so on for easy lookup.
Political books: specialized books that mainly record the evolution of laws and institutions and the development of politics, the economy, and culture. Because they have some of the properties and characteristics of reference books, they are also classified as reference books.
⑶Yearbook
Yearbook: It is a serial publication that is published annually, summarizes or reflects the major events, major progress and important results in the relevant fields in the previous year, and collects important documents, detailed data and statistical information. According to the content, it can be divided into comprehensive yearbooks, specialist yearbooks, statistical yearbooks and regional yearbooks.
⑷Manual
The manual is a practical and easy-to-read reference tool that collects basic knowledge and basic data within a certain range so that people can frequently check it during specific work processes such as production, scientific research, and teaching.
Features: clear theme, dense information, reliable data, easy to carry, and strong practicality.
⑸ Directories
Directories include directories of people, place names, institutions, and so on. A directory is a reference tool that collects the names of people, places, organizations, and the like, and briefly presents the related information for lookup.
⑹ Tables and chronologies
Tables and chronologies are reference tools that use tables or other neat, concise formats, supplemented by brief text, to record historical facts, dates, geographical changes, and other information.
There are two main types: chronologies and calendars.
A chronology is a table that arranges events in chronological order and is designed for checking historical dates, historical events and other information.
A calendar is a reference tool that arranges the calendar days of different calendars together in a certain order to form a mutual comparison table for checking and converting years, months, and days of different calendars.
⑺ Atlases and pictorial catalogs
Atlases and pictorial catalogs are reference tools that use images, text, and symbols to reflect the characteristics of objective things intuitively, concisely, and clearly. They include maps, pictorial catalogs of people, catalogs of cultural relics, subject atlases in the natural sciences, and collections of design drawings in the technical sciences.
⑻ Comprehensive database
Comprehensive databases contain several professional or multiple types of data and factual information.
Supplement
⑴ Guides to institutions and organizations
Organization guide: a reference book that directs readers to information about institutions and organizations. Whether the need is a general overview of an organization or specific data and facts about it, it can be met through an organization guide or a reference book of the same kind.
⑵ Biographical materials
Biography: mainly records the life of a person, selecting, arranging, describing, and explaining material drawn from written and oral recollections, investigations, and other related sources. Biography and history are closely related, and biographies written long ago are often treated as historical sources.
⑶ Geographic materials
Reference books used to look up and study the name, brief description, historical evolution, and current situation of a place.
⑷ Statistical materials
Statistical data is the general term for the statistical results and other related data that are produced in the course of statistical activities and reflect national economic and social development.
⑸Regulatory information
Chapter 9 Comprehensive Utilization of Network Information Resources
Science and Technology Novelty Search
concept
Science and technology novelty search: an information consulting service in which a novelty searcher at an information consulting agency qualified to conduct novelty searches uses manual and computer retrieval, together with comprehensive analysis and comparison, to provide documentary evidence for evaluating the novelty of research results, proposed research projects, and the like.
procedure
The client commissions the search; the agency accepts the commission and signs a novelty search contract; the literature is searched; the novelty search report is completed and submitted; and the documents are archived.
Reference services
Reference service: an information service model that responds to user needs and draws on authoritative information resources of all kinds to help and guide users in retrieving the information they need, or to supply relevant data, documents, leads to documents, topical content, and other forms of information.
Digital reference services
Digital reference service: a reference service, not limited by time or place, that an information institution provides to users over the Internet by electronic means, drawing on its own collections and on the digital information resources widely distributed on the Internet. The main forms are email-based digital reference, real-time interactive digital reference, and cooperative digital reference.
Topic setting service (SDI)
Topic setting service: also known as selective dissemination of information (SDI), it is a service model that, based on readers' needs, delivers the latest matching information to them either once or on a regular basis. It also refers to the continuing service in which an information institution collects, screens, and organizes information according to user needs and supplies it to users regularly or irregularly until the project is complete.
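The core of an SDI service, matching each batch of new records against a stored reader profile and delivering only what has not been sent before, can be sketched as follows. This is a minimal sketch, not a description of any real system: the `Profile` class, the `run_sdi` function, the keyword-overlap matching rule, and the sample records are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    reader: str
    keywords: set                                # topic terms the reader registered
    delivered: set = field(default_factory=set)  # record ids already sent


def run_sdi(profiles, new_records):
    """Match new (record_id, title) pairs against each profile.

    Returns {reader: [record_id, ...]} containing only records that match
    the profile and have not been delivered to that reader before.
    """
    deliveries = {}
    for profile in profiles:
        wanted = {k.lower() for k in profile.keywords}
        hits = []
        for record_id, title in new_records:
            words = set(title.lower().split())
            if record_id not in profile.delivered and wanted & words:
                hits.append(record_id)
                profile.delivered.add(record_id)
        deliveries[profile.reader] = hits
    return deliveries


# Made-up profile and records, for illustration only.
profiles = [Profile("reader-a", {"patent", "standards"})]
batch = [("r1", "Patent retrieval basics"), ("r2", "Yearbook survey")]

first_run = run_sdi(profiles, batch)
second_run = run_sdi(profiles, batch)  # same batch again: nothing new to send
print(first_run, second_run)
```

Running the function on a schedule corresponds to the regular delivery described above, and the `delivered` set is what keeps each run limited to the latest information.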
Thesis
concept
A thesis is an academic paper that a degree applicant submits in order to obtain a degree. It reflects the applicant's knowledge, ability, and academic contribution, and is the main basis for assessing whether the applicant can graduate and be awarded the corresponding degree. Theses include bachelor's, master's, and doctoral theses.
Features
The thesis statement is objective and original; the evidence is detailed, verifiable, and scientific; the argumentation is scholarly and logical; the format is clear and the language is standard.
Literature review and proposal report are important parts of a thesis
Literature review: a special research report written after the topic has been chosen. Based on extensive reading and understanding of the literature in the research field, it comprehensively analyzes, summarizes, and comments on the field's current state, new levels, new trends, new technologies, new findings, and development prospects, and presents the author's own insights and research ideas. It is characterized by concise language, a large amount of information, objective comments, and eye-catching titles.
Proposal report: a topic plan that graduating students write, on the basis of investigation and research, once the direction of the graduation thesis topic has been determined, and submit to the expert committee for approval. It is the written explanation of the choice of thesis topic.
Supplement
(1) American Chemical Abstracts CA
"Chemical Abstracts" (CA), produced by the Chemical Abstracts Service in the United States, is the world's largest and most up-to-date database of chemical literature. It is also the most widely used and important information retrieval tool for chemistry, chemical engineering, and related disciplines.
(2) American Biological Abstracts BA
"Biological Abstracts" (BA), produced by the BioSciences Information Service in the United States, is an important tool for searching the literature of biology, medicine, agronomy, and related disciplines. BA is published in several formats: in addition to the print edition, there are CD-ROM and online editions.
(3) American Engineering Index EI
EI is a network-based information service system developed by Engineering Information Inc. in the United States and is widely influential in applied science, technology, and engineering research.
(4) British "Science Abstracts" SA
"Science Abstracts" was founded in 1898 and is now jointly published by the Institution of Engineering and Technology in the United Kingdom and the Institute of Electrical and Electronics Engineers in the United States. Its database name is INSPEC. The main subject areas covered by SA are physics, electrical and electronic engineering, and computers and control.
Academic journals in the field of library and information science
"Library and Information Work", "Journal of Library Science in China", "Journal of University Libraries", "Journal of Information Science", "Library Magazine", "Library Forum", "Information and Documentation Work", "Library Theory and Practice", "Modern Library and Information Technology", "Information Science", "Archival Science Newsletter", "Archival Science Research"
personal digital library
Personal digital library: an organized collection of information that an individual builds on his or her own computer, using free or nearly free full-text database software, by collecting and storing relevant online information and self-created digital resources for the purposes of study and research. A personal digital library is a type of digital library, and the one that comes closest to a user's individual needs.