Storages
1. Concept
The Storage module provides a common interface over several data storage mechanisms. It is composed of abstract base classes and concrete implementations, covering key-value storage, vector storage, and graph storage systems.
2. Types
2.1 Key Value Storages
BaseKeyValueStorage:
- Purpose: Serves as the foundational abstract class for creating various key-value storage systems.
- Functionality: Standardizes operations like saving, loading, and clearing data records. It primarily interfaces through Python dictionaries.
- Use Cases: Applicable for JSON file storage, NoSQL databases (like MongoDB and Redis), and in-memory Python dictionaries.
InMemoryKeyValueStorage:
- Description: A concrete implementation of BaseKeyValueStorage, utilizing in-memory lists.
- Feature: Ideal for temporary storage as data is volatile and lost when the program terminates.
- Functionality: Implements methods for saving, loading, and clearing records in memory.
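The relationship between the abstract base class and the in-memory implementation can be illustrated with a minimal, self-contained sketch. The class names SketchKeyValueStorage and SketchInMemoryStorage are hypothetical stand-ins written for this illustration, not the library's own classes; only the save, load, and clear operations are taken from the description above.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class SketchKeyValueStorage(ABC):
    """Hypothetical stand-in for an abstract key-value storage interface."""

    @abstractmethod
    def save(self, records: List[Dict[str, Any]]) -> None:
        """Append a batch of records to the storage."""

    @abstractmethod
    def load(self) -> List[Dict[str, Any]]:
        """Return every stored record."""

    @abstractmethod
    def clear(self) -> None:
        """Remove every stored record."""


class SketchInMemoryStorage(SketchKeyValueStorage):
    """Keeps records in a plain Python list; contents vanish on exit."""

    def __init__(self) -> None:
        self._records: List[Dict[str, Any]] = []

    def save(self, records: List[Dict[str, Any]]) -> None:
        # Copy each dict so later caller-side mutation does not leak in.
        self._records.extend(dict(r) for r in records)

    def load(self) -> List[Dict[str, Any]]:
        # Return copies so callers cannot mutate the internal list.
        return [dict(r) for r in self._records]

    def clear(self) -> None:
        self._records.clear()


storage = SketchInMemoryStorage()
storage.save([{"key1": "value1"}, {"key2": "value2"}])
loaded = storage.load()      # both records, in insertion order
storage.clear()
remaining = storage.load()   # empty list after clearing
```

Because the records live only in a Python list, nothing survives a restart, which is the volatility the Feature bullet refers to.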
JsonStorage:
- Description: Another concrete implementation of BaseKeyValueStorage, focusing on JSON file storage.
- Feature: Ensures persistent storage of records in a human-readable format. Supports customization through a custom JSON encoder for specific enumerated types.
- Functionality: Includes methods for saving data in JSON format, loading, and clearing data.
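The idea of persisting records to a JSON file with a custom encoder for enumerated types can be sketched as follows. Everything here is an assumption made for illustration: SketchJsonStorage, EnumEncoder, decode_enum, and the Color enum are hypothetical names, and the one-record-per-line layout and the "__enum__" tag are choices of this sketch rather than the module's documented file format.

```python
import json
import tempfile
from enum import Enum
from pathlib import Path
from typing import Any, Dict, List


class Color(Enum):  # hypothetical enum, used only for this illustration
    RED = "red"
    GREEN = "green"


class EnumEncoder(json.JSONEncoder):
    """Serializes Enum members as a tagged dict; other types use defaults."""

    def default(self, obj: Any) -> Any:
        if isinstance(obj, Enum):
            return {"__enum__": f"{type(obj).__name__}.{obj.name}"}
        return super().default(obj)


def decode_enum(d: Dict[str, Any]) -> Any:
    """object_hook that restores the tagged dicts written by EnumEncoder."""
    if "__enum__" in d:
        cls_name, member = d["__enum__"].split(".")
        if cls_name == Color.__name__:
            return Color[member]
    return d


class SketchJsonStorage:
    """Stores one JSON object per line in a file on disk."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.path.touch()  # create the file if it does not exist yet

    def save(self, records: List[Dict[str, Any]]) -> None:
        with self.path.open("a", encoding="utf-8") as f:
            for record in records:
                f.write(json.dumps(record, cls=EnumEncoder) + "\n")

    def load(self) -> List[Dict[str, Any]]:
        with self.path.open("r", encoding="utf-8") as f:
            return [
                json.loads(line, object_hook=decode_enum)
                for line in f
                if line.strip()
            ]

    def clear(self) -> None:
        self.path.write_text("", encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    storage = SketchJsonStorage(Path(tmp) / "records.json")
    storage.save([{"name": "apple", "color": Color.RED}])
    raw_text = storage.path.read_text(encoding="utf-8")  # human-readable
    loaded = storage.load()
    storage.clear()
    remaining = storage.load()
```

The file stays readable in any text editor, and the encoder and decoder pair keeps enum values round-trippable instead of degrading them to bare strings.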
2.2 VectorDB Storages
BaseVectorStorage:
- Purpose: An abstract base class designed to be extended for specific vector storage implementations.
- Features: Supports various operations like adding, deleting vectors, querying similar vectors, and maintaining the status of the vector database.
- Functionality: Offers flexibility in specifying vector dimensions, collection names, distance metrics, and more.
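The operations listed above (adding, deleting, querying similar vectors, and reporting status) can be shown with a small in-memory sketch. SketchVectorStorage and SketchVectorRecord are hypothetical names for this illustration, and cosine similarity with a brute-force scan is an assumed choice of distance metric and search strategy; real vector databases use their own indexes and metrics.

```python
import math
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class SketchVectorRecord:
    """Hypothetical record: an id, a vector, and an optional payload."""
    id: str
    vector: List[float]
    payload: Dict[str, Any] = field(default_factory=dict)


def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SketchVectorStorage:
    """Brute-force vector store with a fixed dimension."""

    def __init__(self, vector_dim: int) -> None:
        self.vector_dim = vector_dim
        self._records: Dict[str, SketchVectorRecord] = {}

    def add(self, records: List[SketchVectorRecord]) -> None:
        for record in records:
            if len(record.vector) != self.vector_dim:
                raise ValueError(
                    f"expected dimension {self.vector_dim}, "
                    f"got {len(record.vector)}"
                )
            self._records[record.id] = record

    def delete(self, ids: List[str]) -> None:
        for record_id in ids:
            self._records.pop(record_id, None)  # ignore unknown ids

    def query(
        self, query_vector: List[float], top_k: int = 1
    ) -> List[Tuple[SketchVectorRecord, float]]:
        # Score every stored vector, then keep the top_k most similar.
        scored = [
            (record, cosine_similarity(query_vector, record.vector))
            for record in self._records.values()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]

    def status(self) -> Dict[str, int]:
        return {
            "vector_dim": self.vector_dim,
            "vector_count": len(self._records),
        }


store = SketchVectorStorage(vector_dim=2)
store.add([
    SketchVectorRecord("a", [1.0, 0.0], {"label": "x-axis"}),
    SketchVectorRecord("b", [0.0, 1.0], {"label": "y-axis"}),
])
best, score = store.query([0.9, 0.1], top_k=1)[0]
store.delete(["b"])
count_after_delete = store.status()["vector_count"]
```

Checking the dimension on insert mirrors the point about specifying vector dimensions up front: a mismatch is caught when a record is added rather than at query time.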
MilvusStorage:
- Description: A concrete implementation of BaseVectorStorage, tailored for interacting with Milvus, a cloud-native vector search engine.
Reference: Milvus
TiDBStorage:
- Description: A concrete implementation of BaseVectorStorage, tailored for interacting with TiDB, a database that handles vector embeddings, knowledge graphs, and operational data.
Reference: TiDB
QdrantStorage:
- Description: A concrete implementation of BaseVectorStorage, tailored for interacting with Qdrant, a vector search engine.
Reference: Qdrant
OceanBaseStorage:
- Description: A concrete implementation of BaseVectorStorage, tailored for interacting with the OceanBase vector engine.
Reference: OceanBase
2.3 Graph Storages
BaseGraphStorage:
- Purpose: An abstract base class designed to be extended for specific graph storage implementations.
- Features: Supports various operations like get_client, get_schema, get_structured_schema, refresh_schema, add_triplet, delete_triplet, and query.
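A few of the operations named above (add_triplet, delete_triplet, get_schema, and query) can be illustrated with an in-memory sketch. SketchGraphStorage is a hypothetical class, the argument order is an assumption, and query here does simple pattern matching with None as a wildcard, whereas a real graph backend would normally accept a query string in its own query language.

```python
from typing import List, Optional, Set, Tuple

# A triplet is stored as (subject, object, relation) in this sketch.
Triplet = Tuple[str, str, str]


class SketchGraphStorage:
    """Hypothetical in-memory store of subject-relation-object triplets."""

    def __init__(self) -> None:
        self._triplets: Set[Triplet] = set()

    def add_triplet(self, subj: str, obj: str, rel: str) -> None:
        self._triplets.add((subj, obj, rel))

    def delete_triplet(self, subj: str, obj: str, rel: str) -> None:
        self._triplets.discard((subj, obj, rel))  # no error if absent

    def get_schema(self) -> str:
        # Summarize the graph by listing its distinct relation types.
        rels = sorted({rel for _, _, rel in self._triplets})
        return "Relations: " + ", ".join(rels)

    def query(
        self,
        subj: Optional[str] = None,
        obj: Optional[str] = None,
        rel: Optional[str] = None,
    ) -> List[Triplet]:
        # None acts as a wildcard for that position.
        return sorted(
            t for t in self._triplets
            if (subj is None or t[0] == subj)
            and (obj is None or t[1] == obj)
            and (rel is None or t[2] == rel)
        )


graph = SketchGraphStorage()
graph.add_triplet("Alice", "Bob", "KNOWS")
graph.add_triplet("Alice", "CAMEL", "USES")
graph.add_triplet("Bob", "CAMEL", "USES")

users = graph.query(obj="CAMEL", rel="USES")
graph.delete_triplet("Bob", "CAMEL", "USES")
users_after_delete = graph.query(obj="CAMEL", rel="USES")
schema = graph.get_schema()
```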
NebulaGraph:
- Description: A concrete implementation of BaseGraphStorage, tailored for interacting with NebulaGraph, an open source, distributed, scalable, lightning fast graph database.
Reference: NebulaGraph
Neo4jGraph:
- Description: A concrete implementation of BaseGraphStorage, tailored for interacting with Neo4j, one of the most trusted graph databases.
Reference: Neo4jGraph
3. Get Started
To get started with the storage module, you need to understand the basic usage of its key classes and their methods. The module includes the abstract base class BaseKeyValueStorage and its concrete implementations InMemoryKeyValueStorage and JsonStorage, as well as a vector storage system through BaseVectorStorage and implementations such as MilvusStorage and QdrantStorage.
3.1. Using InMemoryKeyValueStorage
3.2. Using JsonStorage
3.3. Using MilvusStorage
3.4. Using TiDBStorage
If you use TiDB Serverless, you can get the database URL from the TiDB Cloud Web Console (select Connect with “SQLAlchemy” > “PyMySQL” on the connection panel).
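For orientation, the URL shown in that panel follows the SQLAlchemy "mysql+pymysql" scheme. The sketch below assembles a URL of that shape from its parts; build_tidb_url is a hypothetical helper, the host, user, and password are placeholders, and the port 4000 default and the TLS query parameters are assumptions based on typical TiDB Cloud connection strings. The value copied from the console should always take precedence.

```python
from urllib.parse import quote_plus


def build_tidb_url(
    user: str,
    password: str,
    host: str,
    port: int = 4000,        # assumed default TiDB port
    database: str = "test",  # assumed default database name
) -> str:
    """Assemble a SQLAlchemy-style URL for the PyMySQL driver."""
    # Percent-encode credentials so characters like "@" or "/" in the
    # password cannot be confused with URL delimiters.
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
        "?ssl_verify_cert=true&ssl_verify_identity=true"
    )


# Placeholder values only; substitute the ones from your own cluster.
url = build_tidb_url(
    user="demo.root",
    password="p@ss/word",
    host="example-host.tidbcloud.com",
)
```

Encoding the password matters in practice: an unescaped "@" would otherwise be read as the separator between credentials and host.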