Semantic Scholar Overview and Features
- Semantic Scholar provides one-sentence summaries of scientific papers.
- It aims to address the challenge of reading numerous titles and lengthy abstracts on mobile devices.
- The project uses artificial intelligence to capture the essence of a paper through an abstractive summarization technique.
- It combines machine learning, natural language processing, and machine vision to add semantic analysis to citation analysis.
- Research Feeds, an AI-powered feature, recommends the latest research to help scholars stay up to date.
- Semantic Reader is an augmented reader that aims to make scientific reading more accessible and contextual.
- It provides in-line citation cards with TLDR summaries and skimming highlights to help users digest information faster.
- Semantic Scholar is designed to highlight the most important and influential elements of a paper.
- The AI technology identifies hidden connections and links between research topics.
- It exploits graph structures, including the Microsoft Academic Knowledge Graph and Springer Nature's SciGraph.
- Semantic Scholar does not search for material behind a paywall and is free to use.
Number of Users and Publications
- After biomedical literature was added in 2017, the Semantic Scholar corpus included more than 40 million papers from computer science and biomedicine.
- The corpus later grew to metadata for more than 173 million papers once Microsoft Academic Graph records were added.
- A partnership with the University of Chicago Press Journals made all articles published under the press available in the corpus.
- Semantic Scholar had indexed 190 million papers by the end of 2020.
- It reached seven million users per month in 2020.
Comparison with Other Search Engines
- A study found that for papers cited by secondary studies in computer science, Semantic Scholar and Google Scholar had comparable coverage.
- Semantic Scholar relies on AI and cannot distinguish between book reviews and the books themselves.
- It struggles to distinguish between authors with common names due to its use of science standards for author identification.
- The field assigned to a paper is not always accurate; for example, Literature papers may be listed under History.
Related Concepts and Resources
- Citation analysis examines the frequency, patterns, and graphs of citations in documents.
- A citation index is an index of citations between publications.
- Knowledge extraction involves creating knowledge from structured and unstructured sources.
- There is a list of academic databases and search engines for further exploration.
- Scientometrics is the study of measuring and analyzing science, technology, and innovation.
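As a minimal sketch of the first two concepts above, the following Python example treats a small, made-up set of papers as a citation graph, counts how often each paper is cited, and inverts the graph into a citation index. The paper IDs and the helper names `citation_counts` and `build_citation_index` are hypothetical and purely illustrative; they are not part of Semantic Scholar or any real dataset.

```python
from collections import Counter

# Hypothetical toy data: each paper ID maps to the IDs of the papers it cites.
citations = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": [],
    "D": ["A", "C"],
}


def citation_counts(graph):
    """Count how often each paper is cited (its in-degree in the citation graph)."""
    counts = Counter({paper: 0 for paper in graph})
    for cited_list in graph.values():
        counts.update(cited_list)
    return counts


def build_citation_index(graph):
    """Invert the graph: map each cited paper to the sorted list of papers citing it."""
    index = {paper: [] for paper in graph}
    for citing, cited_list in graph.items():
        for cited in cited_list:
            index.setdefault(cited, []).append(citing)
    return {paper: sorted(citers) for paper, citers in index.items()}


counts = citation_counts(citations)
index = build_citation_index(citations)

print(counts.most_common(1))  # the most frequently cited paper and its count
print(index["C"])             # every paper that cites "C"
```

Counting incoming citations is the simplest form of citation analysis, and the inverted mapping is what a citation index stores. Real systems work with far larger graphs and add further signals on top of raw counts.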
Semantic Scholar Partnerships and Expansion
- Semantic Scholar has coverage of over 500 publishers, including the University of Chicago Press.
- In 2020, Semantic Scholar added 25 million scientific papers through new publisher partnerships.
- AI2, the Allen Institute for Artificial Intelligence, expanded Semantic Scholar to include biomedical research.
- AI2 collaborated with Microsoft Research to upgrade search tools for scientific studies.
- Semantic Scholar is used across the scientific community and aims to improve the accessibility and visibility of scientific research.
Semantic Scholar is a research tool for scientific literature powered by artificial intelligence. It is developed at the Allen Institute for AI and was publicly released in November 2015. Semantic Scholar uses modern techniques in natural language processing to support the research process, for example by providing automatically generated summaries of scholarly papers. The Semantic Scholar team is actively researching the use of artificial intelligence in natural language processing, machine learning, human–computer interaction, and information retrieval.
Type of site | Search engine |
---|---|
Created by | Allen Institute for Artificial Intelligence |
URL | semanticscholar |
Launched | November 2, 2015 |
Semantic Scholar began as a database for the topics of computer science, geoscience, and neuroscience. In 2017, the system began including biomedical literature in its corpus. As of September 2022, it includes over 200 million publications from all fields of science.