Embeddings
1. Concept
Creating embeddings for different types of data (text, images, videos) involves transforming these inputs into a numerical form that machines can understand and process efficiently. Each type of embedding focuses on capturing the essential features of its respective data type.
1.1. Text Embeddings
Text embeddings convert textual data into numerical vectors. Each vector represents the semantic meaning of the text, enabling us to process and compare texts based on their meaning rather than just their raw form. Techniques like OpenAI Embedding use large-scale language models to understand context and nuances in language. SentenceTransformerEncoder, on the other hand, is specifically designed to create sentence embeddings, often using BERT-like models.
Consider two sentences:
-
“A young boy is playing soccer in a park.”
-
“A child is kicking a football on a playground.”
Text embedding models would transform these sentences into two high-dimensional vector (e.g., 1536 dimension if using text-embedding-3-small
). Despite different wordings, the vectors will be similar, capturing the shared concept of a child playing a ball game outdoors. This transformation into vectors allows machines to understand and compare the semantic similarities between the context.
1.2. Image Embeddings
Image embeddings convert images into numerical vectors, capturing essential features like shapes, colors, textures, and spatial hierarchies. This transformation is typically performed by Convolutional Neural Networks (CNNs) or other advanced neural network architectures designed for image processing. The resulting embeddings can be used for tasks like image classification, similarity comparison, and retrieval.
Suppose we have an image of a cat. An image embedding model would analyze the visual content (e.g., the shape of the ears, the pattern of the fur) and convert it into a vector. This vector encapsulates the essence of the image, allowing the model to recognize it as a cat and differentiate it from other images.
2. Types
2.1. OpenAIEmbedding
Utilizes OpenAI’s models for generating text embeddings. This will requires OpenAI API Key.
2.2. MistralEmbedding
Utilizes Mistral’s models for generating text embeddings. This will requires Mistral API Key.
2.3. SentenceTransformerEncoder
Utilizes open-source models from the Sentence Transformers library for generating text embeddings.
2.4. VisionLanguageEmbedding
Utilizes OpenAI’s models for generating image embeddings. This will requires OpenAI API Key.
2.5. AzureOpenAI
Utilizes OpenAI’s models for generating text embeddings. This will requires Azure OpenAI API Key.
2.6. TogetherEmbedding
Utilizes Together AI’s models for generating text embeddings. This will requires Together AI API Key.
3. Get Started
To use the embedding functionalities, you need to import the necessary classes.