pestr
The package is a set of functions and wrappers that allow quick and
painless retrieval of data on pests and their hosts from EPPO Data Services
and EPPO Global Database. First, it allows extraction of scientific names of
organisms (and viruses), as well as synonyms and common names, from an
SQLite database. The database can be downloaded with the
eppo_database_download() function. Second, four functions in the package use
the REST API to extract data on hosts, categorization, taxonomy and pests.
Further, one function downloads csv files containing information on the
distribution of organisms (and viruses). An important feature is that the
csv files are never saved to the hard drive; instead they are used directly
to create a data.frame that can be assigned to a variable in R. Besides the
above features, the package provides some helper functions, e.g. for
connecting to the database or storing the EPPO token as a variable.
token and connecting to SQLite database

In order to start working with the pestr package, you should register
(free of charge) with EPPO Data Services. Then run create_eppo_token and
assign the result to a variable, which will be used by the functions that
connect to the REST API.

Next, you can run the eppo_database_download function, which will download
an archive with the SQLite file (by default to your working directory, which
can be overridden with the filepath argument). If you are on a Linux
operating system, the file will be extracted into your working directory (or
the directory provided in the filepath argument). On Windows you will be
asked to extract the database file manually.

The last step of the setup is to connect to the database file, which can be
done with the eppo_database_connect function.

With these three short steps you are ready to go.
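The three setup steps can be sketched as one helper. This is only an illustration, not part of pestr: setup_pestr is a hypothetical name, the token string in the usage comment is a placeholder, and calling eppo_database_connect() without arguments assumes that its defaults look for the database in the working directory.

```r
# Hypothetical helper wrapping the three setup steps described above.
# Nothing is downloaded or contacted until the function is called.
setup_pestr <- function(token_string, download = TRUE) {
  if (!requireNamespace("pestr", quietly = TRUE)) {
    stop("Package 'pestr' is required for this sketch.")
  }
  # Step 1: store the EPPO token obtained after registration
  eppo_token <- pestr::create_eppo_token(token_string)
  # Step 2: download the archive with the SQLite file
  # (by default into the working directory)
  if (download) {
    pestr::eppo_database_download()
  }
  # Step 3: connect to the extracted database file
  # (assumption: the defaults point at the working directory)
  eppo_SQLite <- pestr::eppo_database_connect()
  list(token = eppo_token, connection = eppo_SQLite)
}

# Usage (placeholder token, not run here):
# setup <- setup_pestr("<your EPPO token>")
```

On Windows the archive has to be extracted manually between steps 2 and 3, so there it may be more practical to run the steps one at a time.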
Searching for pest names currently supports scientific names, synonyms and
common names. By default the search matches partial names: a query for
Cydia packardi returns information on that species, while a query for Cydia
returns information on the whole genus. Likewise, a search for Droso returns
information on all species that contain Droso in their names. You can also
pass a whole vector of terms in one call,
e.g. c('Xylella', 'Cydia packardi', 'Droso').
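To make the idea of partial matching concrete, the sketch below applies a substring match to a handful of names copied from the example output in this document. It is not how pestr queries the SQLite database; match_partial is a hypothetical helper, and case-insensitive matching is an assumption that is merely consistent with the example output (a query for Cydia also returns Ephialtes cydiae).

```r
# Names taken from the example output shown in this document
db_names <- c("Cydia pomonella", "Cydia inopinata", "Cydia leucostoma",
              "Cydia sp.", "Ephialtes cydiae")

# Hypothetical helper: literal, case-insensitive substring match per term
match_partial <- function(terms, candidates) {
  lapply(stats::setNames(terms, terms), function(term) {
    candidates[grepl(tolower(term), tolower(candidates), fixed = TRUE)]
  })
}

hits <- match_partial(c("Cydia", "Cydia pom", "abcdef"), db_names)
hits$Cydia        # all five names, including "Ephialtes cydiae"
hits$`Cydia pom`  # only "Cydia pomonella"
hits$abcdef       # empty: the term matches nothing
```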
### Create vector of names that you are looking for
pests_query <- c('Cydia', "Triticum aestivum", "abcdef", "cadang")
Then query for the names and assign the results to a variable. This
variable will contain the eppocodes that other functions use to extract data
from the EPPO REST API. eppo_names_tables takes two arguments: the first is
a vector of names to query the database, the second is a variable holding
the connection to the SQLite database.
pest_names <- eppo_names_tables(pests_query, eppo_SQLite)
names(pest_names)
#> [1] "exist_in_DB" "not_in_DB" "pref_names"
#> [4] "all_associated_names"
### names that exist in database
head(pest_names[[1]], 5)
### names that were not found
head(pest_names[[2]], 5)
### preferred names for eppocodes from first table
head(pest_names[[3]], 5)
### all names that are associated to eppocodes from first data frame
head(pest_names[[4]], 5)
codeid | fullname |
---|---|
6698 | Cydia pomonella |
8607 | Cydia inopinata |
8608 | Cydia leucostoma |
8609 | Cydia sp. |
9907 | Ephialtes cydiae |
codeid | fullname | eppocode |
---|---|---|
6698 | Cydia pomonella | CARPPO |
8607 | Grapholita inopinata | CYDIIN |
8608 | Cydia leucostoma | CYDILE |
8609 | Cydia sp. | CYDISP |
9907 | Ephialtes cydiae | EPHICY |
codeid | fullname | preferred | codelang | eppocode |
---|---|---|---|---|
6698 | æblevikler | 0 | da | CARPPO |
6698 | Obstmade | 0 | de | CARPPO |
6698 | carpocapse des pommes | 0 | fr | CARPPO |
6698 | pyrale de la pomme | 0 | fr | CARPPO |
6698 | ver des pommes et des poires | 0 | fr | CARPPO |
As a result you will get a list containing 3 data.frames and a vector:

- exist_in_DB: data.frame with names that are present in EPPO Data Services;
- not_in_DB: vector with names that are not present in the database;
- pref_names: data.frame with preferred names and eppocodes;
- all_associated_names: data.frame with all names associated with the eppocodes from the third data.frame.

REMEMBER: Other eppo_tabletools_ functions use the results of this function
or raw eppocodes to access data from EPPO Global Database and EPPO
Data Services.
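As a small illustration of how the names result can feed the raw-eppocode route, the sketch below picks eppocodes out of a data frame shaped like pref_names. The data frame is rebuilt by hand from the example table in this document so that the snippet runs without a database connection; in a real session you would use pest_names$pref_names instead.

```r
# Hand-made stand-in for pest_names$pref_names (values from the table above)
pref_names <- data.frame(
  codeid   = c(6698, 8607, 8608, 8609, 9907),
  fullname = c("Cydia pomonella", "Grapholita inopinata", "Cydia leucostoma",
               "Cydia sp.", "Ephialtes cydiae"),
  eppocode = c("CARPPO", "CYDIIN", "CYDILE", "CYDISP", "EPHICY"),
  stringsAsFactors = FALSE
)

# Keep only eppocodes whose preferred name starts with "Cydia"
raw_codes <- unique(pref_names$eppocode[startsWith(pref_names$fullname, "Cydia")])
raw_codes
```

A character vector like raw_codes is the kind of value the raw_eppocodes argument expects when use_raw_codes is set to TRUE.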
eppo_tabletools_ functions to extract categorization, hosts, taxonomy, distribution and pests

These functions work independently of each other, so there is no need to use all of them or to call them in any particular order. The functions for categorization, hosts, taxonomy and pests take two arguments:

- names_table: variable containing the result of eppo_names_tables;
- token: variable created with create_eppo_token

OR three arguments:

- token: same as above;
- raw_eppocodes: character vector of eppocodes (e.g. c("XYLEFA", "ABIAL"));
- use_raw_codes: logical set to TRUE.

As a result of eppo_tabletools_cat you will get a list with two elements:

- data.frame with categorization tables;
- data.frame with categorization for each eppocode condensed to a single cell.

pests_cat <- eppo_tabletools_cat(pest_names, eppo_token)
### long format table
head(pests_cat[[1]], 5)
### compact table with information for each eppocode condensed into one cell
head(pests_cat[[2]], 5)
eppocode | nomcontinent | isocode | country | qlist | qlistlabel | yr_add | yr_del | yr_trans |
---|---|---|---|---|---|---|---|---|
CARPPO | Africa | EG | Egypt | 2 | A2 list | 2018 | NA | NA |
CARPPO | Africa | 3G | Southern Africa | 2 | A2 list | 2001 | NA | NA |
CARPPO | America | CA | Canada | X | Quarantine pest | 2019 | NA | NA |
CARPPO | Asia | BH | Bahrain | 1 | A1 list | 2003 | NA | NA |
CARPPO | Asia | CN | China | 1 | A1 list | 1993 | NA | NA |
eppocode | categorization |
---|---|
CARPPO | Africa: Egypt: A2 list: add/del/trans: 2018/NA/NA; Southern Africa: A2 list: add/del/trans: 2001/NA/NA | America: Canada: Quarantine pest: add/del/trans: 2019/NA/NA | Asia: Bahrain: A1 list: add/del/trans: 2003/NA/NA; China: A1 list: add/del/trans: 1993/NA/NA | RPPO/EU: APPPC: A2 list: add/del/trans: 1993/NA/NA |
CYDIIN | Africa: Egypt: A1 list: add/del/trans: 2018/NA/NA; Morocco: Quarantine pest: add/del/trans: 2018/NA/NA; Tunisia: Quarantine pest: add/del/trans: 2012/NA/NA | America: Canada: Quarantine pest: add/del/trans: 2019/NA/NA; Mexico: Quarantine pest: add/del/trans: 2018/NA/NA | Asia: Bahrain: A1 list: add/del/trans: 2003/NA/NA; Israel: Quarantine pest: add/del/trans: 2009/NA/NA; Jordan: A1 list: add/del/trans: 2013/NA/NA | Europe: Turkey: A1 list: add/del/trans: 2016/NA/NA; Ukraine: A1 list: add/del/trans: 2019/NA/NA | RPPO/EU: EPPO: A2 list: add/del/trans: 1994/NA/1999; EU: A1 Quarantine pest (Annex II A): add/del/trans: 2019/NA/NA |
CYDILE | NA: NA: NA: add/del/trans: NA/NA/NA |
CYDISP | NA: NA: NA: add/del/trans: NA/NA/NA |
EPHICY | NA: NA: NA: add/del/trans: NA/NA/NA |
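The relation between the long table and the compact one can be illustrated with a few lines of base R. The sketch below rebuilds the five CARPPO rows shown above by hand and condenses them into one cell; condense_one is a hypothetical helper that only mimics the layout of the compact table, it is not the code pestr uses.

```r
# Hand-made copy of the five long-format rows shown above
cat_long <- data.frame(
  eppocode     = rep("CARPPO", 5),
  nomcontinent = c("Africa", "Africa", "America", "Asia", "Asia"),
  country      = c("Egypt", "Southern Africa", "Canada", "Bahrain", "China"),
  qlistlabel   = c("A2 list", "A2 list", "Quarantine pest", "A1 list", "A1 list"),
  yr_add       = c(2018, 2001, 2019, 2003, 1993),
  yr_del       = NA,
  yr_trans     = NA,
  stringsAsFactors = FALSE
)

# Hypothetical helper: collapse all rows of one eppocode into one string
condense_one <- function(df) {
  entry <- paste0(df$country, ": ", df$qlistlabel, ": add/del/trans: ",
                  df$yr_add, "/", df$yr_del, "/", df$yr_trans)
  # join entries within a continent with "; ", continents with " | "
  per_continent <- tapply(entry, df$nomcontinent, paste, collapse = "; ")
  paste(paste0(names(per_continent), ": ", per_continent), collapse = " | ")
}

pieces <- split(cat_long, cat_long$eppocode)
compact <- data.frame(
  eppocode       = names(pieces),
  categorization = unname(vapply(pieces, condense_one, character(1))),
  stringsAsFactors = FALSE
)
compact$categorization
```

Because only five rows are used, the resulting string covers the Africa, America and Asia entries and stops before the RPPO/EU part present in the full output.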
If you want to limit the data received from EPPO Data Services, and you are confident that you know exactly what you are looking for, you can use eppocodes directly.
pests_cat <- eppo_tabletools_cat(token = eppo_token,
                                 raw_eppocodes = c("LASPPA", "TRZAX", "CCCVD0"),
                                 use_raw_codes = TRUE)
pests_cat[[2]]
eppocode | categorization |
---|---|
LASPPA | Africa: Egypt: A1 list: add/del/trans: 2018/NA/NA; Morocco: Quarantine pest: add/del/trans: 2018/NA/NA; Tunisia: Quarantine pest: add/del/trans: 2012/NA/NA | America: Mexico: Quarantine pest: add/del/trans: 2018/NA/NA | Asia: Bahrain: A1 list: add/del/trans: 2003/NA/NA; Israel: Quarantine pest: add/del/trans: 2009/NA/NA; Jordan: A1 list: add/del/trans: 2013/NA/NA | Europe: Russia: A1 list: add/del/trans: 2014/NA/NA; Turkey: A1 list: add/del/trans: 2016/NA/NA; Ukraine: A1 list: add/del/trans: 2019/NA/NA | RPPO/EU: EAEU: A1 list: add/del/trans: 2018/NA/NA; EPPO: A1 list: add/del/trans: 1995/NA/NA; EU: A1 Quarantine pest (Annex II A): add/del/trans: 2019/NA/NA |
TRZAX | NA: NA: NA: add/del/trans: NA/NA/NA |
CCCVD0 | Africa: Egypt: Regulated non-quarantine pest: add/del/trans: 2018/NA/NA; Morocco: Quarantine pest: add/del/trans: 2018/NA/NA | America: Argentina: A1 list: add/del/trans: 2019/NA/NA; Brazil: A1 list: add/del/trans: 2018/NA/NA; Chile: A1 list: add/del/trans: 2019/NA/NA; Mexico: Quarantine pest: add/del/trans: 2018/NA/NA; United States of America: Quarantine pest: add/del/trans: 1989/NA/NA | Asia: Bahrain: A1 list: add/del/trans: 2003/NA/NA; China: A2 list: add/del/trans: 1988/NA/NA; Israel: Quarantine pest: add/del/trans: 2009/NA/NA | Europe: Turkey: A1 list: add/del/trans: 2016/NA/NA | RPPO/EU: APPPC: A2 list: add/del/trans: 1988/NA/NA; CAHFSA: A1 list: add/del/trans: 1990/NA/NA; COSAVE: A2 list: add/del/trans: 2018/NA/NA; EPPO: A1 list: add/del/trans: 1994/NA/NA; EU: A1 Quarantine pest (Annex II A): add/del/trans: 2019/NA/NA; PPPO: A2 list: add/del/trans: 1993/NA/NA |
eppo_tabletools_hosts returns a list of two data.frames, a long-format
table and a compact table with the hosts of each eppocode condensed to a
single cell:
pests_hosts <- eppo_tabletools_hosts(pest_names, eppo_token)
head(pests_hosts[[1]], 5)
head(pests_hosts[[2]], 5)
eppocode | codeid | host_eppocode | idclass | labelclass | full_name |
---|---|---|---|---|---|
CARPPO | 37021 | MABSD | 1 | Major host | Malus domestica |
CARPPO | 29214 | CYDOB | 9 | Host | Cydonia oblonga |
CARPPO | 35259 | IUGRE | 9 | Host | Juglans regia |
CARPPO | 41521 | PRNAR | 9 | Host | Prunus armeniaca |
CARPPO | 41563 | PRNDO | 9 | Host | Prunus domestica |
eppocode | hosts |
---|---|
CARPPO | Major host: Malus domestica; Host: Cydonia oblonga, Juglans regia, Prunus armeniaca, Prunus domestica, Prunus dulcis, Prunus persica, Pyrus communis |
CYDIIN | Major host: Malus domestica; Wild/Weed: Malus baccata; Host: Cydonia oblonga, Malus, Pyrus, Pyrus communis; Experimental: Prunus |
CYDILE | Host: NA |
CYDISP | Host: NA |
EPHICY | Host: NA |
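The long table is convenient for ordinary data frame operations. The sketch below rebuilds the five CARPPO rows from the example by hand (so it runs without a token) and then selects the major hosts and counts records per host class; with real data you would start from pests_hosts[[1]].

```r
# Hand-made copy of the long-format host rows shown above
hosts_long <- data.frame(
  eppocode      = "CARPPO",
  host_eppocode = c("MABSD", "CYDOB", "IUGRE", "PRNAR", "PRNDO"),
  labelclass    = c("Major host", "Host", "Host", "Host", "Host"),
  full_name     = c("Malus domestica", "Cydonia oblonga", "Juglans regia",
                    "Prunus armeniaca", "Prunus domestica"),
  stringsAsFactors = FALSE
)

# Rows classified as major hosts
major_hosts <- hosts_long[hosts_long$labelclass == "Major host", ]
major_hosts$full_name

# Number of records in each host class
class_counts <- table(hosts_long$labelclass)
class_counts
```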
eppo_tabletools_taxo, like the other functions in this family, returns a
list with two data.frames.
Suppose that from the previous name query we are interested only in viroids and viruses. As they usually have the phrase viroid or virus in their name, we can simply limit the query to certain eppocodes.
library(magrittr) ## provides the %>% pipe used below
virs_eppocodes <- pest_names$all_associated_names %>%
  dplyr::filter(grepl("viroid", fullname) | grepl("virus", fullname)) %>%
  .[,5] %>% ## eppocodes are in 5th column
  unique()
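The same selection can be written in base R, picking the eppocode column by name rather than by position. The data frame below is a simplified, hand-made stand-in for pest_names$all_associated_names that keeps only the two columns needed here; the name and eppocode pairs are taken from the tables in this document.

```r
# Simplified stand-in for pest_names$all_associated_names
all_names <- data.frame(
  fullname = c("Cydia pomonella", "Cydia pomonella granulovirus",
               "Coconut cadang-cadang viroid"),
  eppocode = c("CARPPO", "CPGV00", "CCCVD0"),
  stringsAsFactors = FALSE
)

# TRUE for names containing "viroid" or "virus"
is_vir <- grepl("viroid|virus", all_names$fullname)

# Unique eppocodes of the matching rows
virs_codes <- unique(all_names$eppocode[is_vir])
virs_codes
```

Selecting the column by name avoids depending on the eppocode being the fifth column.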
We can now pass virs_eppocodes as the raw_eppocodes argument and, as a
consequence, receive the taxonomy of viroids and viruses only.
virs_taxonomy <- eppo_tabletools_taxo(token = eppo_token,
                                      raw_eppocodes = virs_eppocodes,
                                      use_raw_codes = TRUE)
virs_taxonomy$long_table ## you can also access list elements by their names
virs_taxonomy$compact_table
codeid | eppocode | prefname | level |
---|---|---|---|
60969 | CPGV00 | Viruses and viroids | 1 |
64582 | CPGV00 | Baculoviridae | 2 |
84121 | CPGV00 | Betabaculovirus | 3 |
65443 | CPGV00 | Cydia pomonella granulovirus | 4 |
60969 | CCCVD0 | Viruses and viroids | 1 |
111354 | CCCVD0 | Riboviria | 2 |
65268 | CCCVD0 | Pospiviroidae | 3 |
65799 | CCCVD0 | Cocadviroid | 4 |
64718 | CCCVD0 | Coconut cadang-cadang viroid | 5 |
eppocode | taxonomy |
---|---|
CPGV00 | Baculoviridae |
CCCVD0 | Riboviria |
It is possible to obtain data on the pests of particular hosts with the
eppo_tabletools_pests function. Let's say we want to know all the pests
associated with Abies alba (eppocode: ABIAL).
abies_pests <- eppo_tabletools_pests(token = eppo_token,
                                     raw_eppocodes = "ABIAL",
                                     use_raw_codes = TRUE)
head(abies_pests[[1]], 5)
head(abies_pests[[2]], 5)
eppocode | pests_eppocode | idclass | labelclass | fullname |
---|---|---|---|---|
ABIAL | MELMME | 10 | Experimental | Melampsora medusae (as Abies) |
ABIAL | MELMMD | 10 | Experimental | Melampsora medusae f. sp. deltoidis (as Abies) |
ABIAL | ACLRGL | 9 | Host | Acleris gloverana (as Abies) |
ABIAL | ACLRVA | 9 | Host | Acleris variana (as Abies) |
ABIAL | AREAB | 9 | Host | Arceuthobium abietinum (as Abies) |
eppocode | pests |
---|---|
ABIAL | Experimental: Melampsora medusae (as Abies), Melampsora medusae f. sp. deltoidis (as Abies); Host: Acleris gloverana (as Abies), Acleris variana (as Abies), Arceuthobium abietinum (as Abies), Arceuthobium douglasii (as Abies), Arceuthobium laricis (as Abies), Arceuthobium tsugense (as Abies), Bursaphelenchus xylophilus (as Abies), Chionaspis pinifoliae, Chionaspis pinifoliae (as Abies), Choristoneura freemani (as Abies), Choristoneura fumiferana (as Abies), Chrysomyxa abietis (as Abies), Coniferiporia weirii (as Pinaceae), Crisicoccus pini (as Abies), Dendroctonus micans, Dendrolimus sibiricus (as Abies), Dendrolimus spectabilis (as Abies), Dendrolimus superans (as Abies), Dothistroma septosporum, Dryocoetes confusus (as Abies), Gnathotrichus sulcatus (as Pinaceae), Gremmeniella abietina (as Abies), Heterobasidion irregulare (as Abies), Ips amitinus, Ips amitinus (as Abies), Ips subelongatus (as Abies), Ips typographus, Leptoglossus occidentalis (as Abies), Malacosoma disstria (as Abies), Monochamus alternatus (as Abies), Monochamus marmorator (as Abies), Monochamus obtusus (as Abies), Monochamus saltuarius (as Abies), Monochamus scutellatus (as Abies), Monochamus sutor (as Abies), Monochamus titillator (as Abies), Monochamus urussovi (as Abies), Phacidium coniferarum (as Abies), Phytophthora cinnamomi (as Pinaceae), Phytophthora ramorum, Pissodes castaneus, Polygraphus proximus (as Abies), Sirex ermak (as Abies), Sirex noctilio (as Abies), Tetropium gracilicorne (as Abies), Trichoferus campestris (as Abies); Major host: Chrysomyxa abietis, Monochamus sutor, Neonectria neomacrospora |
eppo_tabletools_distri does not connect to the REST API; it downloads
information from csv files directly from EPPO Global Database. As a
consequence it has no token argument (it does not need the EPPO token) and
takes only a variable containing the result of eppo_names_tables. The
function returns a two-element list:

- data.frame with the distribution of each organism/virus, including invalid records and eradicated status;
- data.frame with the distribution for each eppocode condensed to a single cell.

pest_distri <- eppo_tabletools_distri(pest_names)
head(pest_distri[[1]], 5)
head(pest_distri[[2]], 5)
eppocode | continent | country | state | country.code | state.code | Status |
---|---|---|---|---|---|---|
CARPPO | Africa | Algeria | NA | DZ | NA | Present, no details |
CARPPO | Africa | Egypt | NA | EG | NA | Present, no details |
CARPPO | Africa | Libya | NA | LY | NA | Present, no details |
CARPPO | Africa | Mauritius | NA | MU | NA | Present, no details |
CARPPO | Africa | Morocco | NA | MA | NA | Present, no details |
eppocode | distribution |
---|---|
CARPPO | Africa: Algeria, Egypt, Libya, Mauritius, Morocco, South Africa, Tunisia; America: Argentina, Bolivia, Brazil, Canada, Chile, Colombia, Mexico, Peru, United States of America, Uruguay; Asia: Afghanistan, China, India, Iran, Iraq, Israel, Jordan, Kazakhstan, Kyrgyzstan, Lebanon, Pakistan, Syria, Tajikistan, Turkmenistan, Uzbekistan; Europe: Albania, Armenia, Austria, Azerbaijan, Belarus, Belgium, Bulgaria, Cyprus, Czech Republic, Denmark, Estonia, Finland, France, Georgia, Germany, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, Malta, Moldova, Netherlands, Norway, Poland, Portugal, Romania, Russia, Serbia, Slovakia, Slovenia, Spain, Sweden, Switzerland, Turkey, Ukraine, United Kingdom; Oceania: Australia, New Zealand |
CYDIIN | Asia: China, Japan; Europe: Russia |
CYDILE | NA: NA |
CYDISP | NA: NA |
EPHICY | NA: NA |
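If you need the individual countries back from a compact distribution cell, the string can be split on its separators. The sketch below does this in base R for the CYDIIN cell shown above; parse_distribution is a hypothetical helper that assumes the "Continent: country, country; Continent: country" layout seen in the example output.

```r
# Hypothetical helper: turn one compact distribution cell into a named list
parse_distribution <- function(cell) {
  # one chunk per continent
  chunks <- strsplit(cell, "; ", fixed = TRUE)[[1]]
  # split each chunk into continent name and country list
  parts <- strsplit(chunks, ": ", fixed = TRUE)
  countries <- lapply(parts, function(p) strsplit(p[2], ", ", fixed = TRUE)[[1]])
  names(countries) <- vapply(parts, function(p) p[1], character(1))
  countries
}

cydiin <- parse_distribution("Asia: China, Japan; Europe: Russia")
cydiin$Asia
cydiin$Europe
```

Cells of the form NA: NA would come back as a single element named "NA", so they should be filtered out before parsing.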
Last but not least, the package offers a simple wrapper over the functions
described above (all except eppo_tabletools_pests). If you want one table
with all the information (names, categorization, hosts, distribution and
taxonomy) condensed to one cell per pest, use the eppo_table_full function,
which takes the following arguments:

- names_vector: a character vector of pest/host names;
- sqlConnection: a variable holding the SQLite connection (result of eppo_database_connect);
- token: a variable storing the EPPO token (eppo_token).

eppo_fulltable <- eppo_table_full(c("Meloidogyne ethiopica", "Crataegus mexicana"),
                                  eppo_SQLite,
                                  eppo_token)
eppo_fulltable
codeid | eppocode | Preferred_name | Other_names | hosts | categorization | distribution | taxonomy |
---|---|---|---|---|---|---|---|
84193 | CSCME | Crataegus mexicana | Other languages: aubépine du Mexique, Mexican hawthorn, tejocote | Host: NA | NA: NA: NA: add/del/trans: NA/NA/NA | NA: NA | Plantae |
79276 | MELGET | Meloidogyne ethiopica | Other languages: root-knot nematode | Major host: Actinidia chinensis, Actinidia deliciosa, Solanum lycopersicum, Vitis labrusca, Vitis vinifera; Wild/Weed: Ageratum conyzoides, Datura stramonium, Solanum nigrum; Host: Acacia mearnsii, Agave sisalana, Asparagus officinalis, Beta vulgaris, Brassica oleracea, Capsicum frutescens, Citrullus lanatus, Cucumis melo, Cucumis sativus, Cucurbita, Ensete ventricosum, Glycine max, Lactuca sativa, Nicotiana tabacum, Phaseolus vulgaris, Polymnia sonchifolia, Prunus persica, Saccharum officinarum, Sida rhombifolia, Solanum tuberosum, Vicia faba, Vigna unguiculata | Africa: Morocco: Quarantine pest: add/del/trans: 2018/NA/NA | RPPO/EU: EPPO: Alert list: add/del/trans: 2011/NA/NA | Africa: Ethiopia, Kenya, Mozambique, South Africa, Tanzania, Zimbabwe; America: Brazil, Chile, Peru | Nematoda |