Getting Started Guide

This guide walks you through a first end-to-end example: install Rakam Systems, create an agent, and run an evaluation. For detailed usage patterns, see the User Guide.

Prerequisites

Before you begin, make sure you have:

  • Python 3.10 or higher
  • pip package manager
  • Docker
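
You can check the Python, pip, and Docker versions from a terminal (standard commands, nothing specific to Rakam Systems):

python3 --version   # should report 3.10 or higher
pip --version
docker --version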

Set up the environment

Create a virtual environment

python3 -m venv venv
source venv/bin/activate # On macOS/Linux
# On Windows: venv\Scripts\activate

Install Rakam Systems

pip install rakam-systems
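
To confirm the install, pip's standard show command lists the package metadata:

pip show rakam-systems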

Configure API keys

Create a .env file in your project root with your OpenAI API key:

# .env
OPENAI_API_KEY=sk-your-api-key

Then load it in your code:

from dotenv import load_dotenv
load_dotenv()
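
If the agent later fails with an authentication error, a quick way to check that the key was actually picked up is Python's standard os.getenv (nothing Rakam-specific here):

import os
from dotenv import load_dotenv
load_dotenv()

# True if OPENAI_API_KEY was loaded from .env (or the environment)
print(os.getenv("OPENAI_API_KEY") is not None)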

Create your first agent

Create a file named my_first_agent.py:

import asyncio
from dotenv import load_dotenv
load_dotenv()

from rakam_systems_agent import BaseAgent

async def main():
    agent = BaseAgent(
        name="my_assistant",
        model="openai:gpt-4o",
        system_prompt="You are a helpful assistant.",
    )
    result = await agent.arun("What is Python?")
    print(result.output_text)

asyncio.run(main())

Run it with:

python my_first_agent.py
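
The script should print the model's answer to "What is Python?". If you want to reuse the same agent for several questions, here is a minimal sketch; it assumes arun can be awaited repeatedly on one BaseAgent instance, which this guide does not state explicitly:

# ask_many.py: a sketch; assumes BaseAgent.arun can be called
# repeatedly on the same instance (not confirmed by this guide)
import asyncio
from dotenv import load_dotenv
load_dotenv()

from rakam_systems_agent import BaseAgent

async def main():
    agent = BaseAgent(
        name="my_assistant",
        model="openai:gpt-4o",
        system_prompt="You are a helpful assistant.",
    )
    for question in ["What is Python?", "What is asyncio?"]:
        result = await agent.arun(question)  # one call per question
        print(f"Q: {question}\nA: {result.output_text}\n")

asyncio.run(main())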

Write an evaluation function

Note: The evaluation service connection is configured separately. Contact us for details on setting up the evaluation service.

  1. Create an eval/ directory in your project if it doesn't exist.
  2. Add your evaluation functions there. Each function must:
    • Be decorated with @eval_run
    • Return an EvalConfig object

Example:

# eval/examples.py
from rakam_systems_cli.decorators import eval_run
from rakam_systems_tools.evaluation.schema import (
    EvalConfig,
    TextInputItem,
    ClientSideMetricConfig,
    ToxicityConfig,
)

@eval_run
def test_simple_text_eval():
"""A simple text evaluation showcasing a basic client-side metric."""
return EvalConfig(
component="text_component_1",
label="demo_simple_text",
data=[
TextInputItem(
id="txt_001",
input="Hello world",
output="Hello world",
expected_output="Hello world",
metrics=[ClientSideMetricConfig(name="relevance", score=1)],
)
],
metrics=[ToxicityConfig(name="toxicity_demo", include_reason=False)],
)
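
Because data is a list, a single run can presumably hold several items. The sketch below (hypothetical, reusing only the classes imported above) adds a second item whose output deliberately disagrees with its expected_output; how the evaluation service reports that mismatch depends on the backend:

# eval/examples.py (continued): hypothetical second function;
# assumes `data` accepts multiple TextInputItems
@eval_run
def test_two_item_eval():
    """Two items: one matching its expected output, one not."""
    return EvalConfig(
        component="text_component_1",
        label="demo_two_items",
        data=[
            TextInputItem(
                id="txt_001",
                input="Hello world",
                output="Hello world",
                expected_output="Hello world",
                metrics=[ClientSideMetricConfig(name="relevance", score=1)],
            ),
            TextInputItem(
                id="txt_002",
                input="Hello world",
                output="Goodbye world",
                expected_output="Hello world",
                metrics=[ClientSideMetricConfig(name="relevance", score=0)],
            ),
        ],
        metrics=[ToxicityConfig(name="toxicity_demo", include_reason=False)],
    )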

Run evaluations

From your project root:

rakam eval run

To list runs:

rakam eval list runs

To view the latest results:

rakam eval show

Compare two runs to see what changed:

rakam eval compare --id 42 --id 45