VisionLanguageEmbedding
- model_name: The model type to be used for generating embeddings. (default: `openai/clip-vit-base-patch32`)
init

Initializes the VisionLanguageEmbedding class with a specified model and returns the dimension of the embeddings.
Parameters:
- model_name (str, optional): The version name of the model to use. (default: `openai/clip-vit-base-patch32`)
embed_list

Generates embeddings for the given images or texts.

Parameters:
- objs (List[Image.Image|str]): The list of images or texts for which to generate the embeddings.
- image_processor_kwargs: Extra kwargs passed to the image processor.
- tokenizer_kwargs: Extra kwargs passed to the text tokenizer (processor).
- model_kwargs: Extra kwargs passed to the main model.
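The mixed image/text signature above implies that `embed_list` dispatches each input by type: images go through the image processor and the vision tower, strings through the tokenizer and the text tower, and both paths yield vectors of the same dimension. Below is a minimal, dependency-free sketch of that dispatch logic. The stub encoders, the `StubImage` placeholder (standing in for `PIL.Image.Image`), and the fixed 512-dimensional output are illustrative assumptions, not the actual implementation; the real class delegates to the CLIP model named by `model_name`.

```python
from typing import List, Union

OUTPUT_DIM = 512  # assumed output width; CLIP ViT-B/32 emits 512-dim vectors


class StubImage:
    """Placeholder for PIL.Image.Image so the sketch has no heavy deps."""


def embed_list(objs: List[Union[StubImage, str]]) -> List[List[float]]:
    """Sketch of the documented dispatch: images take the image-processor
    branch, strings take the tokenizer branch, anything else is rejected.
    """
    embeddings = []
    for obj in objs:
        if isinstance(obj, StubImage):
            vec = [0.0] * OUTPUT_DIM  # stub for the image branch
        elif isinstance(obj, str):
            vec = [float(len(obj))] * OUTPUT_DIM  # stub for the text branch
        else:
            raise ValueError("Input must be an Image.Image or a str.")
        embeddings.append(vec)
    return embeddings


vectors = embed_list([StubImage(), "a photo of a cat"])
print(len(vectors), len(vectors[0]))  # 2 512
```

With the real class, the call shape would be the same: pass a mixed list of `PIL.Image.Image` objects and strings, and receive one embedding per input, in order, all of the model's output dimension.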