camel.toolkits.google_scholar_toolkit
GoogleScholarToolkit
A toolkit for retrieving information about authors and their publications from Google Scholar.
Attributes:
author_identifier (Union[str, None]): The author’s Google Scholar URL or name of the author to search for.
is_author_name (bool): Flag to indicate if the identifier is a name. (default: False)
scholarly (module): The scholarly module for querying Google Scholar.
author (Optional[Dict[str, Any]]): Cached author details, allowing manual assignment if desired.
__init__
Initializes the GoogleScholarToolkit with the author’s identifier.
Parameters:
- author_identifier (str): The author’s Google Scholar URL or name of the author to search for.
- is_author_name (bool): Flag to indicate if the identifier is a name. (default: False)
- use_free_proxies (bool): Whether to use free proxies. (default: False)
- proxy_http (Optional[str]): HTTP proxy address passed to pg.SingleProxy. (default: None)
- proxy_https (Optional[str]): HTTPS proxy address passed to pg.SingleProxy. (default: None)
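The proxy parameters above could plausibly be applied as in the following minimal sketch. It is not the toolkit's actual code. The helper name configure_proxy is hypothetical, pg stands for a scholarly-style ProxyGenerator object, and requiring both proxy addresses before calling SingleProxy is an assumption made for illustration.

```python
from typing import Any, Optional


def configure_proxy(
    pg: Any,
    use_free_proxies: bool = False,
    proxy_http: Optional[str] = None,
    proxy_https: Optional[str] = None,
) -> bool:
    """Apply proxy options to a ProxyGenerator-like object (hypothetical helper).

    Returns True if a proxy was configured, False if none was requested.
    """
    if use_free_proxies:
        # Free proxies take precedence over an explicit single proxy.
        pg.FreeProxies()
        return True
    if proxy_http and proxy_https:
        # Assumption: both addresses must be given for a single proxy.
        pg.SingleProxy(http=proxy_http, https=proxy_https)
        return True
    return False
```

With the scholarly package installed, pg could be a ProxyGenerator instance, and scholarly.use_proxy(pg) would be called when the helper returns True.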
author
Returns:
Dict[str, Any]: A dictionary containing author details. If no data is available, returns an empty dictionary.
author
Sets or overrides the cached author information.
Parameters:
- value (Optional[Dict[str, Any]]): A dictionary containing author details to cache, or None to clear the cached data.
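The getter and setter described above follow a common cached-property pattern, sketched below under stated assumptions. The class name CachedAuthor and the fetch callable are hypothetical stand-ins for the toolkit and its Google Scholar lookup. The sketch only illustrates the documented behaviour: cached details, an empty dictionary when no data is available, and None to clear the cache.

```python
from typing import Any, Callable, Dict, Optional


class CachedAuthor:
    """Hypothetical stand-in showing a lazily filled, overridable cache."""

    def __init__(self, fetch: Callable[[], Optional[Dict[str, Any]]]) -> None:
        self._fetch = fetch  # stand-in for the real Google Scholar lookup
        self._author: Optional[Dict[str, Any]] = None

    @property
    def author(self) -> Dict[str, Any]:
        # Fetch only when nothing is cached yet.
        if self._author is None:
            self._author = self._fetch()
        # Fall back to an empty dictionary when no data is available.
        return self._author or {}

    @author.setter
    def author(self, value: Optional[Dict[str, Any]]) -> None:
        # A dictionary overrides the cache; None clears it.
        self._author = value
```

Assigning None in this sketch means the next read triggers a fresh fetch.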
_extract_author_id
Returns:
Optional[str]: The extracted author ID, or None if not found.
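Google Scholar profile URLs carry the author ID in the user query parameter. The sketch below shows one way to extract it. The function name extract_author_id is hypothetical, the use of urllib.parse is an assumption (the toolkit may parse the URL differently), and the ID in the usage note is a made-up placeholder.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_author_id(url: str) -> Optional[str]:
    """Return the value of the `user` query parameter, or None if absent."""
    values = parse_qs(urlparse(url).query).get("user")
    return values[0] if values else None
```

For example, extract_author_id("https://scholar.google.com/citations?user=abc123XYZ&hl=en") yields the placeholder ID, while a URL without a user parameter yields None.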
get_author_detailed_info
Returns:
dict: A dictionary containing detailed information about the author.
get_author_publications
Returns:
List[str]: A list of publication titles authored by the author.
get_publication_by_title
Retrieves detailed information about a specific publication by its title. Note that this method cannot retrieve the full content of the paper.
Parameters:
- publication_title (str): The title of the publication to search for.
Returns:
Optional[dict]: A dictionary containing detailed information about the publication if found; otherwise, None.
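The found-or-None contract above can be illustrated with a small lookup over already fetched records. This is a sketch, not the toolkit's implementation. The function name find_publication is hypothetical, the {"bib": {"title": ...}} record layout is an assumption modelled on how the scholarly package structures publications, and the case-insensitive exact match is a choice made for illustration.

```python
from typing import Any, Dict, List, Optional


def find_publication(
    publications: List[Dict[str, Any]], publication_title: str
) -> Optional[Dict[str, Any]]:
    """Return the first record whose title matches, or None if none does."""
    wanted = publication_title.strip().lower()
    for pub in publications:
        # Assumed layout: the title lives under pub["bib"]["title"].
        title = pub.get("bib", {}).get("title", "")
        if title.strip().lower() == wanted:
            return pub
    return None
```

Callers should check for None before reading fields from the result.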
get_full_paper_content_by_link
Retrieves the full paper content from a given PDF URL using the arxiv2text tool.
Parameters:
- pdf_url (str): The URL of the PDF file.
Returns:
Optional[str]: The full text extracted from the PDF, or None if an error occurs.
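Returning None on failure, as described above, amounts to wrapping the extraction call in an exception handler. The sketch below shows that pattern with an injectable extractor so it runs without extra dependencies. The name extract_text_or_none and the extractor parameter are hypothetical.

```python
from typing import Callable, Optional


def extract_text_or_none(
    pdf_url: str, extractor: Callable[[str], str]
) -> Optional[str]:
    """Run the extractor on the URL and return None if it raises."""
    try:
        return extractor(pdf_url)
    except Exception:
        # Any failure (download, parsing, and so on) is reported as None.
        return None
```

With arxiv2text installed, its text-extraction function (arxiv_to_text, assuming the installed version exposes it) could be passed as the extractor.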
get_tools
Returns:
List[FunctionTool]: A list of FunctionTool objects representing the functions in the toolkit.