Sends a request to the Google Scholar service and retrieves the results (title, authors, source and year of publications, and the total number of citations).
As no API is provided by Google Scholar (except the one for authors with a
Google Scholar ID), this function scrapes the service using the package
RSelenium.
To get around Google IP bans, the IP address and the user agent are changed whenever a ban occurs.
scrap_gscholar(
  search_terms,
  exact = TRUE,
  exclude_terms = NULL,
  search_author = NULL,
  search_source = NULL,
  metadata = FALSE,
  where = NULL,
  years = NULL,
  lang = NULL,
  start = 0,
  n_max = NULL,
  include_patents = FALSE,
  include_citations = FALSE,
  ovpn_country,
  agent = TRUE,
  verbose = TRUE,
  keep_html = FALSE,
  output_path = "."
)
search_terms
  a character of length 1. Terms to search papers for (optional).
exact
  a logical. If TRUE, search for the exact terms, otherwise search for
  at least one of the terms.
exclude_terms
  a character of length 1. Terms to exclude from the search (optional).
search_author
  a character of length 1. Authors to search for (optional).
search_source
  a character of length 1. Publication sources to search for (optional).
metadata
  a logical. If TRUE, all publication data are extracted. Otherwise, only
  the total number of publications is returned.
where
  a character of length 1. One of 'any' (search in the whole document) or
  'title' (search only in the title).
years
  an integer of length 1 or 2. Year(s) specifying the temporal extent of
  the search.
lang
  a character of length 1. The ISO-2 code of the language to search for
  (optional). Use get_languages() to get a list.
start
  a numeric of length 1. The position of the first result from which
  results are extracted (default is 0, i.e. start from the first result).
n_max
  a numeric of length 1. The number of results to extract.
include_patents
  a logical. If TRUE, patents are included in the search results.
include_citations
  a logical. If TRUE, citations are included in the search results.
ovpn_country
  a character vector. The ISO-2 code of the country from which to pick a
  VPN server. Use get_countries() to get a list.
agent
  a logical. If TRUE, the web browser user agent is changed at random.
verbose
  a logical. If TRUE, connection and scraping information is printed.
keep_html
  a logical. If TRUE, raw HTML pages are kept.
output_path
  a character of length 1. The path to the folder in which to save data.
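Several of the constraints stated in the argument descriptions can be checked before a browser session is started. The sketch below is only an illustration: check_scholar_args() is a hypothetical helper, not a function of the package, and the requirements that start be non-negative and n_max be positive are assumptions not stated in the descriptions.

```r
# Hypothetical helper (not part of the package): checks a few of the
# documented argument constraints before any scraping is attempted.
check_scholar_args <- function(where = NULL, years = NULL, start = 0,
                               n_max = NULL) {
  if (!is.null(where)) {
    # Documented: one of 'any' or 'title'
    stopifnot(is.character(where), length(where) == 1,
              where %in% c("any", "title"))
  }
  if (!is.null(years)) {
    # Documented: integer of length 1 or 2
    stopifnot(is.numeric(years), length(years) %in% c(1, 2),
              all(years == round(years)))
  }
  # Assumption: start cannot be negative
  stopifnot(is.numeric(start), length(start) == 1, start >= 0)
  if (!is.null(n_max)) {
    # Assumption: n_max must be positive
    stopifnot(is.numeric(n_max), length(n_max) == 1, n_max > 0)
  }
  invisible(TRUE)
}

# Passes silently; an invalid value such as where = "abstract" would stop
# with an error.
check_scholar_args(where = "title", years = c(2015, 2020), n_max = 50)
```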
No return value.
if (FALSE) {
  scrap_gscholar()
}
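A fuller call might look like the sketch below. All values are illustrative assumptions: the search terms are made up, and the language and country codes ("en", "FR") should be checked against get_languages() and get_countries(), since their exact form is not shown here. The call itself stays guarded by if (FALSE) because it needs the package, a running Selenium session and a VPN setup.

```r
# Illustrative argument values only; none of them come from the package docs.
scholar_args <- list(
  search_terms = "coral reef bleaching",
  exact        = TRUE,
  metadata     = TRUE,             # extract publication data, not just the count
  where        = "title",          # 'any' or 'title'
  years        = c(2015L, 2020L),  # temporal extent of the search
  lang         = "en",             # assumed code; see get_languages()
  n_max        = 50,
  ovpn_country = "FR",             # assumed code; see get_countries()
  output_path  = tempdir()
)

# Guarded so that nothing is scraped when this snippet is sourced.
if (FALSE) {
  do.call(scrap_gscholar, scholar_args)
}
```

Building the arguments as a list keeps the search settings in one place, so they can be inspected or saved alongside the results before the scraper is launched.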