This repository has been archived by the owner on Mar 21, 2019. It is now read-only.
Overview
cstubben edited this page Sep 5, 2014 · 5 revisions
The pmcOAI function loads PMC Open Access articles into an XMLInternalDocument. Other functions are used to parse the XML document, including:
- pmcText splits the XML into a list of subsections, where each subsection is a vector of paragraphs or sentences
- pmcTable extracts tables into a list of data frames
- pmcSupp lists supplementary files and optionally downloads them
- pmcRef returns a data frame containing references
- pmcMetadata lists metadata fields
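To make the "list of subsections, each a vector of paragraphs" shape concrete, here is a minimal sketch that uses only the XML package directly. It does not call pmcText, and it is not how the package is implemented. The article fragment is made up for illustration and only loosely follows the layout of a PMC article.

```r
library(XML)

# A tiny, made-up article fragment (an assumption, not real PMC output)
txt <- '<article><body>
  <sec><title>Introduction</title><p>First paragraph.</p><p>Second paragraph.</p></sec>
  <sec><title>Results</title><p>Only paragraph.</p></sec>
</body></article>'

# Parse the string; the result is an XMLInternalDocument
doc <- xmlParse(txt, asText = TRUE)

# One list element per section, each a character vector of paragraphs
secs <- getNodeSet(doc, "//body/sec")
paras <- lapply(secs, function(s) xpathSApply(s, "./p", xmlValue))

# Name each list element after its section title
names(paras) <- sapply(secs, function(s) xpathSApply(s, "./title", xmlValue))

str(paras)
```

Real articles have nested sections, captions, and footnotes, so a production parser needs considerably more handling than this.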
The package was initially described in BMC Bioinformatics. That paper focused on extracting locus tags mentioned in full text and tables. You can use this code to find Burkholderia pseudomallei locus tags:
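# Read the B. pseudomallei GFF annotation from the NCBI FTP site
bpgff <- read.ncbi.ftp("Burkholderia_pseudomallei/GCF_000011545", "gff")
# Wildcard search terms covering the locus tag ranges
tags <- "(BPSL0* OR BPSL1* OR BPSL2* OR BPSL3* OR BPSS0* OR BPSS1* OR BPSS2*)"
# Search PMC for open access articles matching the tags, with Burkholderia in the title or abstract
bp <- ncbiPMC(paste(tags, "AND (Burkholderia[TITLE] OR Burkholderia[ABSTRACT]) AND open access[FILTER]"))
# Loop through the articles and save the locus tags found to bp.tab
pmcLoop(bp, bpgff, prefix = "BPS[SL]", suffix = "[abc]", file = "bp.tab")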
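The prefix and suffix arguments look like regular expression fragments. As a rough illustration of how such fragments could be combined to pick locus tags out of text, here is a base R sketch. The helper find_tags is hypothetical and is not part of the package, the pattern (prefix, digits, optional suffix) is an assumption about the tag format, and the sentences are invented.

```r
# Hypothetical helper, not part of the package: match prefix + digits + optional suffix
find_tags <- function(text, prefix, suffix) {
  pattern <- paste0("\\b", prefix, "[0-9]+(", suffix, ")?\\b")
  # gregexpr finds every match per string; regmatches extracts them
  unique(unlist(regmatches(text, gregexpr(pattern, text, perl = TRUE))))
}

# Invented example sentences
sentences <- c(
  "Deletion of BPSL1549 reduced virulence.",
  "BPSS1492 and BPSS1498a were upregulated; BPSL1549 was not."
)

found <- find_tags(sentences, prefix = "BPS[SL]", suffix = "[abc]")
print(found)
```

A pattern like this only finds tags that are written out in full. Ranges such as "BPSS1492-1498" would need extra handling.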
Check the links for more details.