camel.datahubs.huggingface
HuggingFaceDatasetManager
A dataset manager for Hugging Face datasets. This class provides methods to create, add, update, delete, and list records in a dataset on the Hugging Face Hub.
Parameters:
- token (str): The Hugging Face API token. If not provided, the token will be read from the environment variable `HF_TOKEN`.
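The fallback described above can be sketched as follows. This is a minimal illustration, not the library's own code: `resolve_token` is a hypothetical helper, and only the `HF_TOKEN` variable name comes from the documentation.

```python
import os
from typing import Optional


def resolve_token(token: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: prefer an explicit token, else read HF_TOKEN."""
    if token is not None:
        return token
    # No explicit token was given, so fall back to the environment variable.
    return os.environ.get("HF_TOKEN")
```

Passing the token explicitly is useful in scripts that juggle several accounts; relying on `HF_TOKEN` keeps the secret out of source code.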
__init__
create_dataset_card
Creates and uploads a dataset card to the Hugging Face Hub in YAML format.
Parameters:
- dataset_name (str): The name of the dataset.
- description (str): A description of the dataset.
- license (str): The license of the dataset. (default: :obj:`None`)
- version (str): The version of the dataset. (default: :obj:`None`)
- tags (list): A list of tags for the dataset. (default: :obj:`None`)
- authors (list): A list of authors of the dataset. (default: :obj:`None`)
- size_category (list): A size category for the dataset. (default: :obj:`None`)
- language (list): A list of languages the dataset is in. (default: :obj:`None`)
- task_categories (list): A list of task categories. (default: :obj:`None`)
- content (str): Custom markdown content that the user wants to add to the dataset card. (default: :obj:`None`)
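To give a feel for what a card with YAML metadata looks like, here is a small sketch that renders a few of the fields above into a YAML block followed by markdown. `render_card` is a hypothetical helper written for illustration; it is not how the library builds the card, and it assumes simple string values that need no YAML quoting.

```python
from typing import Any, Dict, List, Optional


def render_card(
    description: str,
    license: Optional[str] = None,
    tags: Optional[List[str]] = None,
    language: Optional[List[str]] = None,
    content: Optional[str] = None,
) -> str:
    """Hypothetical helper: build card text with a YAML metadata header."""
    metadata: Dict[str, Any] = {
        "license": license,
        "tags": tags,
        "language": language,
    }
    lines = ["---"]
    for key, value in metadata.items():
        if value is None:
            continue  # optional fields left at None are omitted
        if isinstance(value, list):
            lines.append(f"{key}:")
            lines.extend(f"- {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    lines.append("---")
    lines.append("")
    lines.append(description)
    if content:
        # Custom markdown goes after the description.
        lines.extend(["", content])
    return "\n".join(lines) + "\n"


print(render_card("Demo data.", license="mit", tags=["demo", "text"]))
```

Fields left at `None` are simply skipped in this sketch, which mirrors the fact that every metadata parameter except `dataset_name` and `description` is optional.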
create_dataset
Creates a new dataset on the Hugging Face Hub.
Parameters:
- name (str): The name of the dataset.
- private (bool): Whether the dataset should be private. (default: :obj:`False`)
- kwargs (Any): Additional keyword arguments.
Returns:
str: The URL of the created dataset.
list_datasets
Lists the datasets that belong to the given user.
Parameters:
- username (str): The username of the user whose datasets to list.
- limit (int): The maximum number of datasets to list. (default: :obj:`100`)
- kwargs (Any): Additional keyword arguments.
Returns:
List[str]: A list of dataset ids.
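Because the method returns plain dataset ids, filtering can be done on the client side. The sketch below assumes `manager` is a `HuggingFaceDatasetManager` instance (any object with a compatible `list_datasets` method works); `find_datasets` is a hypothetical helper, not part of the library.

```python
from typing import Any, List


def find_datasets(
    manager: Any, username: str, keyword: str, limit: int = 100
) -> List[str]:
    """Hypothetical helper: list a user's datasets and keep matching ids."""
    # Only the documented parameters (username, limit) are passed through.
    dataset_ids = manager.list_datasets(username=username, limit=limit)
    return [dataset_id for dataset_id in dataset_ids if keyword in dataset_id]
```

Note that `limit` caps the listing before the keyword filter runs, so a small limit can hide matches further down the list.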
delete_dataset
Deletes a dataset from the Hugging Face Hub.
Parameters:
- dataset_name (str): The name of the dataset to delete.
- kwargs (Any): Additional keyword arguments.
add_records
Adds records to a dataset on the Hugging Face Hub.
Parameters:
- dataset_name (str): The name of the dataset.
- records (List[Record]): A list of records to add to the dataset.
- filepath (str): The path to the file containing the records.
- kwargs (Any): Additional keyword arguments.
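A typical first use combines `create_dataset` with `add_records`. The following sketch assumes `manager` is a `HuggingFaceDatasetManager` instance and that `records` is already a list of `Record` objects; `publish_dataset` and the example file path are hypothetical and only the documented parameters are used.

```python
from typing import Any, List


def publish_dataset(
    manager: Any, name: str, records: List[Any], filepath: str
) -> str:
    """Hypothetical helper: create a private dataset and upload records."""
    # create_dataset is documented to return the URL of the new dataset.
    url = manager.create_dataset(name=name, private=True)
    manager.add_records(dataset_name=name, records=records, filepath=filepath)
    return url


# Usage (not executed here; requires a valid token and network access):
# url = publish_dataset(manager, "my-dataset", records, "records/records.json")
```

Creating the dataset as private first leaves room to review the uploaded records before changing its visibility.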
update_records
Updates records in a dataset on the Hugging Face Hub.
Parameters:
- dataset_name (str): The name of the dataset.
- records (List[Record]): A list of records to update in the dataset.
- filepath (str): The path to the file containing the records.
- kwargs (Any): Additional keyword arguments.
delete_record
Deletes a record from the dataset.
Parameters:
- dataset_name (str): The name of the dataset.
- record_id (str): The ID of the record to delete.
- filepath (str): The path to the file containing the records.
- kwargs (Any): Additional keyword arguments.
list_records
Lists all records in a dataset.
Parameters:
- dataset_name (str): The name of the dataset.
- filepath (str): The path to the file containing the records.
- kwargs (Any): Additional keyword arguments.
Returns:
List[Record]: A list of records in the dataset.
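`delete_record` and `list_records` pair naturally when a caller wants to confirm what remains after a deletion. The sketch below assumes `manager` is a `HuggingFaceDatasetManager` instance; `remove_record` is a hypothetical helper, and the same `filepath` must be passed to both calls so they address the same records file.

```python
from typing import Any, List


def remove_record(
    manager: Any, dataset_name: str, record_id: str, filepath: str
) -> List[Any]:
    """Hypothetical helper: delete one record, then return what is left."""
    manager.delete_record(
        dataset_name=dataset_name, record_id=record_id, filepath=filepath
    )
    # Re-read the records file to see the state after the deletion.
    return manager.list_records(dataset_name=dataset_name, filepath=filepath)
```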