Step 1: Install Required Libraries
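The body of this step appears to be missing; judging from the imports in Step 2, the only third-party dependency is `requests` (everything else is in the standard library):

```shell
# Install the requests HTTP client
pip install requests
```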
Step 2: Set Up API Configuration
```python
import os
import requests
import json

BASE_URL = "https://orch.zenbase.ai/api"
API_KEY = "YOUR_API_KEY"

def api_call(method, endpoint, data=None, files=None):
    url = f"{BASE_URL}/{endpoint}"
    headers = {"Authorization": f"Api-Key {API_KEY}"}
    response = requests.request(method, url, headers=headers, data=data, files=files)
    return response
```
Step 3: Create a Dataset
First, create a dataset where you’ll add your items:
```python
dataset_data = {
    "name": "My Bulk Dataset",
    "description": "Dataset for bulk creation example",
    "embedding_type": "OPENAI",                  # Optional: if you want to use embeddings
    "embedding_api_key": "YOUR_OPENAI_API_KEY",  # Optional: if using embeddings
    "model_name": "text-embedding-3-small"       # Optional: if using embeddings
}

dataset = api_call("POST", "datasets/", dataset_data)
dataset_id = dataset.json()["id"]
print(f"Created dataset with ID: {dataset_id}")
```
Step 4: Prepare Your Bulk Data
Create a JSON file containing an array of items you want to add to your dataset. Each item should include the dataset ID, inputs, and outputs:
```python
bulk_data = [
    {
        "dataset": dataset_id,
        "inputs": {
            "text": "This movie was fantastic!",
            "rating": 5
        },
        "outputs": {
            "sentiment": "positive"
        }
    },
    {
        "dataset": dataset_id,
        "inputs": {
            "text": "I didn't enjoy this film at all.",
            "rating": 1
        },
        "outputs": {
            "sentiment": "negative"
        }
    },
    {
        "dataset": dataset_id,
        "inputs": {
            "text": "The movie was okay, nothing special.",
            "rating": 3
        },
        "outputs": {
            "sentiment": "neutral"
        }
    }
]

# Save the bulk data to a file
bulk_file_path = "./bulk_items.json"
with open(bulk_file_path, "w") as f:
    json.dump(bulk_data, f, indent=2)
```
Step 5: Submit the Bulk Create Request
Submit your bulk create request with the data file:
```python
# Set the mode for the bulk create (optional)
data = {"mode": "upsert"}  # Options: "insert", "upsert", or "replace"

# Open the data file in binary mode and submit the bulk create request
with open(bulk_file_path, "rb") as f:
    bulk_create = api_call(
        "POST",
        f"datasets/{dataset_id}/bulk_create/",
        data=data,
        files={"file": f},
    )

# Get the bulk create task ID
bulk_create_task_id = bulk_create.json()["bulk_create_task_id"]
print(f"Created bulk create task with ID: {bulk_create_task_id}")
```
Step 6: Monitor Bulk Create Status
You can check the status of your bulk create task in the admin panel. The task will process your items in the background and add them to your dataset.
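If you prefer to monitor the task from code rather than the admin panel, a small polling helper like the sketch below can wrap whatever status check you use. Note that this guide does not document a status endpoint, so the helper takes the status lookup as a callable, and the terminal status strings ("SUCCESS", "FAILURE") are assumptions — check the API reference for the actual endpoint and values:

```python
import time

def wait_for_task(get_status, timeout=120, interval=2.0):
    """Poll a status callable until it reports a terminal state.

    get_status: zero-argument callable returning a status string,
    e.g. one that GETs a (hypothetical) task-status endpoint.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        status = get_status()
        if status in ("SUCCESS", "FAILURE"):  # assumed terminal states
            return status
        time.sleep(interval)
    raise TimeoutError("bulk create task did not finish in time")

# Example with a stub in place of a real API call:
statuses = iter(["PENDING", "STARTED", "SUCCESS"])
result = wait_for_task(lambda: next(statuses), interval=0.0)
print(result)  # SUCCESS
```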
The bulk create feature supports three modes:
- insert: Creates new items only
- upsert: Creates new items or updates existing ones if they match
- replace: Deletes existing items and creates new ones
This feature is particularly useful when you need to:
- Import large datasets from external sources
- Update multiple dataset items simultaneously
- Populate datasets with training data
Remember to structure your bulk data according to your dataset’s schema, ensuring that all required fields are included for each item.
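As a quick sanity check before uploading, you can verify that every item carries the three fields used in this guide (dataset, inputs, outputs). This is only a sketch — your dataset's schema may require additional fields that this check knows nothing about:

```python
def check_bulk_items(items):
    """Return (index, problem) pairs for items missing the required top-level fields."""
    problems = []
    for i, item in enumerate(items):
        for field in ("dataset", "inputs", "outputs"):
            if field not in item:
                problems.append((i, f"missing '{field}'"))
    return problems

items = [
    {"dataset": "abc", "inputs": {"text": "hi"}, "outputs": {"sentiment": "neutral"}},
    {"dataset": "abc", "inputs": {"text": "oops"}},  # no outputs
]
print(check_bulk_items(items))  # [(1, "missing 'outputs'")]
```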