Agentic Data generation with CAMEL and finetuning Qwen models with Unsloth
For more detailed usage information, please refer to our cookbook To run this, press “Runtime” and press “Run all” on a free Tesla T4 Google Colab instance! ⭐ Star us on GitHub, join our Discord, or follow us on XCAMEL and Unsloth make an excellent pair. In this notebook we will combine the two to train a model to be proficient at content on a page You will learn how to do data generation with CAMEL, how to train, and how to run the model.

Data models
We want to generate data in the Alpaca format, so we can use CAMEL’s built-in AlpacaItem class which has some handy conversion functions for us. We will be using CAMEL’s structured output to generate all of these items in one request, which is much faster and cheaper. Here we create a wrapper around the AlpacaItem to help the model know how many have been generated as it’s going along, and another wrapper class that represents a list of these.Data generation
Next we define our data generation function. It takes a source content, and generates a list of instruction-input-response triplets around it. We will use this later to train our model to be proficient with the source content.Point to content and generate data!
Now we point to the content that we wish to generate SFT data around and use CAMEL’s Firecrawl integration to get this content in a nice markdown format. You can get a Firecrawl API key from hereInference
Let’s run the model! You can change the instruction and input - leave the output blank!- 🐫 Creating Your First CAMEL Agent free Colab
- Graph RAG Cookbook free Colab
- 🧑⚖️ Create A Hackathon Judge Committee with Workforce free Colab
- 🔥 3 ways to ingest data from websites with Firecrawl & CAMEL free Colab
- 🦥 Agentic SFT Data Generation with CAMEL and Mistral Models, Fine-Tuned with Unsloth free Colab
- 🦥 Agentic SFT Data Generation with CAMEL and Meta Models, Fine-Tuned with Unsloth free Colab