PubMed Central Open Access Subset (PMC OA)

Description

Not all articles in PMC are available for text mining and other reuse; many are protected by copyright. Articles in the PMC Open Access Subset, however, are made available for download under a Creative Commons or similar license that generally allows more liberal redistribution and reuse than a traditional copyrighted work.

Resources

Name Format Description Link
21 https://www.ncbi.nlm.nih.gov/pmc/tools/ftp/#oasubset
21 The AWS Registry of Open Data (RODA), the PMC OAI service, the PMC FTP service, and the BioC API are the only services that may be used for automated downloading of PMC content. Systematic retrieval (or bulk downloading) of articles through any other automated process is prohibited. https://www.ncbi.nlm.nih.gov/pmc/tools/openftlist/
21 The PubMed Central OAI-PMH service (PMC-OAI) provides access to metadata of all items in the PubMed Central (PMC) archive, as well as to the full text of a subset of these items. PMC-OAI is an implementation of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), a standard for retrieving metadata from digital document repositories. Visit the Open Archives Initiative site for more information about the protocol and other activities of the OAI group. PMC-OAI supports OAI-PMH version 2.0. It does not support earlier versions of the protocol. https://www.ncbi.nlm.nih.gov/pmc/tools/oai/
21 All the PubMed Central (PMC) Open Access articles are available in the BioC format. This provides a large number of full text research articles for text mining and information retrieval research. BioC is a simple format designed for straightforward text processing. These articles are available in BioC XML or BioC JSON, in Unicode or ASCII, and via PubMed ID or PMC ID. https://www.ncbi.nlm.nih.gov/research/bionlp/APIs/BioC-PMC/
21 PubMed Central (PMC) has several datasets of articles hosted in the cloud and listed in the Registry of Open Data on Amazon Web Services (AWS), all of which can be accessed free of charge through either an HTTPS or S3 URL. The datasets are organized on AWS by license type, including the PMC Open Access (OA) Subset: all articles in PMC with a machine-readable Creative Commons license. https://www.ncbi.nlm.nih.gov/pmc/tools/pmcaws/
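
The request patterns described for the OAI-PMH and BioC services can be sketched as simple URL builders. This is a minimal illustration, not an official client: both base URLs and the BioC path layout (format, then ID, then encoding) are assumptions that do not appear in this listing and should be checked against the linked documentation pages. The function names and the article ID in the usage note are hypothetical. The verb names are the six request types defined by OAI-PMH 2.0. Nothing here makes a network request.

```python
from urllib.parse import quote, urlencode

# ASSUMPTION: neither base URL is given in the listing above. Confirm the
# current endpoints in the linked PMC documentation before using them.
OAI_BASE = "https://www.ncbi.nlm.nih.gov/pmc/oai/oai.cgi"
BIOC_BASE = "https://www.ncbi.nlm.nih.gov/research/bionlp/RESTful/pmcoa.cgi"

# The six request verbs defined by OAI-PMH version 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def oai_request_url(verb, **params):
    """Build an OAI-PMH 2.0 request URL from a verb and its arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"not an OAI-PMH 2.0 verb: {verb!r}")
    query = {"verb": verb}
    for key, value in params.items():
        # 'from' is a Python keyword, so callers pass it as 'from_'.
        query[key.rstrip("_")] = value
    return f"{OAI_BASE}?{urlencode(query)}"


def bioc_url(article_id, fmt="xml", encoding="unicode"):
    """Build a BioC request URL for a PubMed ID or PMC ID.

    ASSUMPTION: the path layout BioC_<format>/<id>/<encoding> is not
    stated in the listing; it is a guess at how the options combine.
    """
    if fmt not in ("xml", "json"):
        raise ValueError("fmt must be 'xml' or 'json'")
    if encoding not in ("unicode", "ascii"):
        raise ValueError("encoding must be 'unicode' or 'ascii'")
    return f"{BIOC_BASE}/BioC_{fmt}/{quote(str(article_id))}/{encoding}"
```

For example, `oai_request_url("ListRecords", metadataPrefix="oai_dc", from_="2024-01-01")` would build a harvesting request for records changed since that date, and `bioc_url("PMC1234567", fmt="json")` (a made-up ID) would build a BioC JSON request. Keeping URL construction separate from fetching makes it easy to review the requests before sending any, which matters given the restriction on automated downloading noted above.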

Tags

  • dataset
  • literature
  • api
