camel.toolkits.data_commons_toolkit
DataCommonsToolkit
A class representing a toolkit for Data Commons.
This class provides methods for querying and retrieving data from the Data Commons knowledge graph. It includes functionality for:
- Executing SPARQL queries
- Retrieving triples associated with nodes
- Fetching statistical time series data
- Analyzing property labels and values
- Retrieving places within a given place type
- Obtaining statistical values for specific variables and locations
All the data are grabbed from the knowledge graph of Data Commons. Refer to https://datacommons.org/browser/ for more details.
__init__
Initialize the DataCommonsToolkit.
Parameters:
- timeout (Optional[float], optional): Maximum time in seconds to wait for API calls to complete. If None, waits indefinitely. (default: None)
query_data_commons
Query the Data Commons knowledge graph using SPARQL.
Parameters:
- query_string (str): A SPARQL query string.
Returns:
Optional[List[Dict[str, Any]]]: A list of dictionaries, each representing a node that matches the query conditions, if successful; None otherwise.
Note:
- Only supports a limited subset of SPARQL functionality (ORDER BY, DISTINCT, LIMIT).
- Each variable in the query should have a ‘typeOf’ condition.
- The Python SPARQL library currently only supports the V1 version of the API.
Reference: https://docs.datacommons.org/api/python/query.html
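A minimal sketch of a query that respects the constraints in the note above. The query text is modeled on the pattern shown in the Data Commons reference linked above. The helper name `run_query`, the 30-second timeout, and the import path `camel.toolkits.data_commons_toolkit` are assumptions made for illustration, and the call itself needs the `camel` package and network access.

```python
from typing import Any, Dict, List, Optional

# The node variable ?a carries a typeOf condition, as the note requires.
QUERY = """
SELECT ?name ?dcid
WHERE {
  ?a typeOf Place .
  ?a name ?name .
  ?a dcid ("geoId/06" "geoId/21" "geoId/24") .
  ?a dcid ?dcid
}
"""


def run_query(query: str = QUERY, timeout: float = 30.0) -> List[Dict[str, Any]]:
    """Run the query and return an empty list if the toolkit returns None."""
    # Imported lazily so this module still loads when camel is not installed.
    # The import path is an assumption based on this page's module name.
    from camel.toolkits.data_commons_toolkit import DataCommonsToolkit

    toolkit = DataCommonsToolkit(timeout=timeout)
    rows: Optional[List[Dict[str, Any]]] = toolkit.query_data_commons(query)
    return rows if rows is not None else []
```

Calling `run_query()` should yield one dictionary per matching node. Treating `None` as an empty list is a convenience choice here, not something the toolkit does itself.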
get_triples
Retrieve triples associated with nodes.
Parameters:
- dcids (Union[str, List[str]]): A single DCID or a list of DCIDs to query.
- limit (int): The maximum number of triples per combination of property and type. (default: 500)
Returns:
Optional[Dict[str, List[tuple]]]: A dictionary where keys are DCIDs and values are lists of associated triples, if successful; None otherwise.
Note:
- The function will raise a ValueError if any of the required arguments are missing.
- The function will raise a TypeError if the dcids are not a string or a list of strings.
- The function will raise a ValueError if the limit is not between 1 and 500.
- The function will raise a KeyError if one or more of the provided DCIDs do not exist in the Data Commons knowledge graph.
- The function will raise an Exception if an unexpected error occurs.
Reference: https://docs.datacommons.org/api/python/triple.html
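The argument rules in the note above can be checked before making a call. The following is a hypothetical pre-check that mirrors the documented rules; `check_triples_args` is not part of the toolkit, and the toolkit's own validation may differ in detail.

```python
from typing import List, Union

MAX_TRIPLES_LIMIT = 500  # upper bound stated in the note above


def check_triples_args(
    dcids: Union[str, List[str]], limit: int = 500
) -> List[str]:
    """Hypothetical pre-check; returns the DCIDs normalized to a list."""
    if isinstance(dcids, str):
        dcid_list = [dcids]
    elif isinstance(dcids, list) and all(isinstance(d, str) for d in dcids):
        dcid_list = dcids
    else:
        raise TypeError("dcids must be a string or a list of strings")
    if not dcid_list:
        raise ValueError("at least one DCID is required")
    if not 1 <= limit <= MAX_TRIPLES_LIMIT:
        raise ValueError("limit must be between 1 and 500")
    return dcid_list
```

A check like this catches bad input locally, but it cannot detect a DCID that is missing from the knowledge graph; that case only surfaces when the request is made.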
get_stat_time_series
Retrieve statistical time series for a place.
Parameters:
- place (str): The dcid of the Place to query for.
- stat_var (str): The dcid of the StatisticalVariable.
- measurement_method (str, optional): The technique used for measuring a statistical variable. (default: None)
- observation_period (str, optional): The time period over which an observation is made. (default: None)
- scaling_factor (str, optional): Property of statistical variables indicating factor by which a measurement is multiplied to fit a certain format. (default: None)
- unit (str, optional): The unit of measurement. (default: None)
Returns:
Optional[Dict[str, Any]]: A dictionary containing the statistical time series data, if successful; None otherwise.
Reference: https://docs.datacommons.org/api/python/stat_series.html
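Because the method can return None, callers usually need a guard before reading the series. The sketch below assumes the returned dictionary maps ISO 8601 date strings to observed values, as in the underlying Data Commons series API; that shape is not spelled out on this page. `latest_observation` is a hypothetical helper name.

```python
from typing import Any, Dict, Optional, Tuple


def latest_observation(
    series: Optional[Dict[str, Any]],
) -> Optional[Tuple[str, Any]]:
    """Return the (date, value) pair with the greatest date key, or None."""
    if not series:  # covers None (failed call) and an empty series
        return None
    # ISO 8601 strings of the same granularity sort chronologically.
    latest_date = max(series)
    return latest_date, series[latest_date]
```

For example, passing the result of `get_stat_time_series(place, stat_var)` straight into `latest_observation` avoids a separate None check at the call site. The string comparison is only reliable when all keys share one granularity (all years, or all full dates).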
get_property_labels
Retrieves and analyzes property labels for given DCIDs.
Parameters:
- dcids (list): A list of Data Commons IDs (DCIDs) to analyze.
- out (bool): Direction of properties to retrieve. (default: True)
Returns:
Optional[Dict[str, List[str]]]: Analysis results for each DCID, if successful; None otherwise.
Reference: https://docs.datacommons.org/api/python/property_label.html
get_property_values
Retrieves and analyzes property values for given DCIDs.
Parameters:
- dcids (list): A list of Data Commons IDs (DCIDs) to analyze.
- prop (str): The property to analyze.
- value_type (str, optional): The type of the property value to filter by. Only applicable if the value refers to a node. (default: None)
- out (bool, optional): The label's direction. If True, only returns response nodes directed towards the requested node. If False, only returns response nodes directed away from the requested node. (default: True)
- limit (int, optional): Maximum number of values returned per node; must be at most 500. (default: datacommons.utils._MAX_LIMIT)
Returns:
Optional[Dict[str, Any]]: Analysis results for each DCID, if successful; None otherwise.
Reference: https://docs.datacommons.org/api/python/property_value.html
get_places_in
Retrieves places within a given place type.
Parameters:
- dcids (list): A list of Data Commons IDs (DCIDs) to analyze.
- place_type (str): The type of the place to filter by.
Returns:
Optional[Dict[str, Any]]: Analysis results for each DCID, if successful; None otherwise.
Reference: https://docs.datacommons.org/api/python/place_in.html
get_stat_value
Retrieves the value of a statistical variable for a given place and date.
Parameters:
- place (str): The DCID of the Place to query for.
- stat_var (str): The DCID of the StatisticalVariable.
- date (str, optional): The preferred date of observation in ISO 8601 format. If not specified, returns the latest observation. (default: None)
- measurement_method (str, optional): The DCID of the preferred measurementMethod value. (default: None)
- observation_period (str, optional): The preferred observationPeriod value. (default: None)
- unit (str, optional): The DCID of the preferred unit value. (default: None)
- scaling_factor (str, optional): The preferred scalingFactor value. (default: None)
Returns:
Optional[float]: The value of the statistical variable for the given place and date, if successful; None otherwise.
Reference: https://docs.datacommons.org/api/python/stat_value.html
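The `date` parameter expects an ISO 8601 string, and the result may be None. A small wrapper can handle both concerns. This is a sketch only: `stat_value_on` and its `default` argument are hypothetical, and `toolkit` is any object exposing `get_stat_value` with the parameters documented above.

```python
import datetime
from typing import Any, Optional


def stat_value_on(
    toolkit: Any,
    place: str,
    stat_var: str,
    day: Optional[datetime.date] = None,
    default: float = float("nan"),
) -> float:
    """Fetch one value, formatting the date and replacing None with a default."""
    # date.isoformat() produces an ISO 8601 string such as "2020-01-01".
    date_str = day.isoformat() if day is not None else None
    value = toolkit.get_stat_value(place, stat_var, date=date_str)
    return default if value is None else float(value)
```

Leaving `day` unset passes `date=None`, which per the parameter description above asks for the latest observation. Taking the toolkit as an argument keeps the wrapper easy to exercise with a stub object instead of a live API call.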
get_stat_all
Retrieves all statistical time series for the given places and statistical variables.
Parameters:
- places (str): The DCIDs of the Place objects to query for. (DCID stands for Data Commons ID, the unique identifier assigned to all entities in Data Commons.)
- stat_vars (str): The DCIDs of the StatisticalVariables listed at https://datacommons.org/browser/StatisticalVariable
Returns:
Optional[dict]: A dictionary with the DCID of the place as the key and a list of tuples as the value, if successful; None otherwise.
Reference: https://docs.datacommons.org/api/python/stat_all.html
get_tools
Returns:
List[FunctionTool]: A list of FunctionTool objects representing the functions in the toolkit.
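One way to inspect what `get_tools` exposes is to collect the name of each returned tool. The sketch below assumes each `FunctionTool` offers a `get_function_name()` method; if your installed version does not, the wrapped function's `__name__` is a possible fallback. `tool_names` is a hypothetical helper.

```python
from typing import Any, List


def tool_names(toolkit: Any) -> List[str]:
    """Return the function name of each tool from toolkit.get_tools()."""
    # get_function_name() is assumed to exist on FunctionTool.
    return [tool.get_function_name() for tool in toolkit.get_tools()]
```

With `camel` installed, passing a `DataCommonsToolkit` instance should list the query methods documented on this page.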