Datasets

A Dataset accepts the following parameters:
  • name (str): The name of the dataset
  • description (str): A description of what the dataset is
  • embedding_api_key (str): The API key for generating embeddings
  • embedding_type (str): The embedding provider; currently only OPENAI is supported
  • model_name (str): The name of the model to use for generating embeddings
embedding_type and model_name are required only if you want to use online optimization or the search_similar endpoint.
Example of how to create a dataset:
import requests
import json

BASE_URL = "https://orch.zenbase.ai/api"
API_KEY = "YOUR ZENBASE API KEY"

def api_call(method, endpoint, data=None):
    """Send an authenticated JSON request to the API and return the response."""
    url = f"{BASE_URL}/{endpoint}"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Api-Key {API_KEY}"
    }
    # Serialize the payload to JSON only when one is provided
    response = requests.request(method, url, headers=headers, data=json.dumps(data) if data else None)
    return response

dataset_data = {
    "name": "my-dataset",
    "description": "This is my dataset",
    "embedding_api_key": "MY API KEY",
    "embedding_type": "OPENAI",
    "model_name": "text-embedding-ada-002",
}
dataset = api_call("POST", "datasets/", dataset_data)
dataset_id = dataset.json()['id']
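
Because the embedding fields are only needed for online optimization or the search_similar endpoint, the request body can be assembled conditionally. The following is a minimal sketch, not part of any official SDK: the helper name build_dataset_payload is hypothetical, and it assumes the API accepts a body that simply omits the embedding fields when they are not needed.

```python
def build_dataset_payload(name, description, embedding_api_key=None,
                          embedding_type=None, model_name=None):
    """Build the JSON body for POST datasets/ (hypothetical helper)."""
    payload = {"name": name, "description": description}
    optional = {
        "embedding_api_key": embedding_api_key,
        "embedding_type": embedding_type,
        "model_name": model_name,
    }
    # Assumption: fields left as None may be omitted from the request body.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


# A dataset without embeddings
basic = build_dataset_payload("my-dataset", "This is my dataset")

# A dataset configured for online optimization or search_similar
with_embeddings = build_dataset_payload(
    "my-dataset",
    "This is my dataset",
    embedding_api_key="MY API KEY",
    embedding_type="OPENAI",
    model_name="text-embedding-ada-002",
)
```

Either dictionary could then be passed as the data argument of api_call("POST", "datasets/", ...).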

Dataset Items

A Dataset Item accepts the following parameters:
  • dataset (int): The id of the dataset this item belongs to
  • inputs (dict): The input data for the dataset item
  • outputs (dict): The output data for the dataset item
Example of how to create a dataset item:
import requests
import json

BASE_URL = "https://orch.zenbase.ai/api"
API_KEY = "YOUR ZENBASE API KEY"

def api_call(method, endpoint, data=None):
    """Send an authenticated JSON request to the API and return the response."""
    url = f"{BASE_URL}/{endpoint}"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Api-Key {API_KEY}"
    }
    # Serialize the payload to JSON only when one is provided
    response = requests.request(method, url, headers=headers, data=json.dumps(data) if data else None)
    return response


dataset_item_data = {
    "dataset": 1, # The id of the dataset this item belongs to
    "inputs": {"text": "This is my input"},
    "outputs": {"sentiment": "positive"},
}
response = api_call("POST", "dataset-items/", dataset_item_data)
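
When a dataset needs more than one item, the same body shape can be generated for each example. The following is a minimal sketch under stated assumptions: build_dataset_items is a hypothetical helper, the second example row is invented purely for illustration, and the sketch assumes items are created one request at a time, since no bulk endpoint is described here.

```python
def build_dataset_items(dataset_id, examples):
    """Turn (inputs, outputs) pairs into dataset item bodies (hypothetical helper)."""
    return [
        {"dataset": dataset_id, "inputs": inputs, "outputs": outputs}
        for inputs, outputs in examples
    ]


examples = [
    ({"text": "This is my input"}, {"sentiment": "positive"}),
    ({"text": "This is another input"}, {"sentiment": "negative"}),  # illustrative row
]

# 1 stands in for the id returned when the dataset was created
items = build_dataset_items(1, examples)
```

Each entry in items could then be sent with api_call("POST", "dataset-items/", item).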