# Getting Started Guide
This guide walks you through a first end-to-end example: install Rakam Systems, create an agent, and run an evaluation. For detailed usage patterns, see the User Guide.
## Prerequisites
Before you begin, make sure you have:
- Python 3.10 or higher
- pip package manager
- Docker installed
## Set up the environment

### Create a virtual environment

```bash
python3 -m venv venv
source venv/bin/activate  # On macOS/Linux
# On Windows: venv\Scripts\activate
```
### Install Rakam Systems

```bash
pip install rakam-systems
```
### Configure API keys

Create a `.env` file in your project root with your OpenAI API key:

```bash
# .env
OPENAI_API_KEY=sk-your-api-key
```
Then load it in your code:

```python
from dotenv import load_dotenv

load_dotenv()
```
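If the key is missing, the agent call will fail later with a less obvious error, so it can help to fail fast right after loading the file. A minimal check using only the standard library (`require_api_key` is a hypothetical helper name, not part of Rakam Systems):

```python
import os


def require_api_key() -> str:
    """Return OPENAI_API_KEY, or raise a clear error if the .env was not loaded."""
    key = os.getenv("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; check your .env file")
    return key
```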
## Create your first agent

Create a file named `my_first_agent.py`:
```python
import asyncio

from dotenv import load_dotenv

load_dotenv()

from rakam_systems_agent import BaseAgent


async def main():
    agent = BaseAgent(
        name="my_assistant",
        model="openai:gpt-4o",
        system_prompt="You are a helpful assistant.",
    )
    result = await agent.arun("What is Python?")
    print(result.output_text)


asyncio.run(main())
```
Run it with:

```bash
python my_first_agent.py
```
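Because `arun` is a coroutine, you can also fan several prompts out concurrently with `asyncio.gather`. The sketch below uses a stand-in coroutine so it runs without API access; in a real script you would await `agent.arun` instead (this assumes `arun` tolerates concurrent awaits, which is typical for async clients but worth verifying):

```python
import asyncio


# Stand-in for agent.arun so this sketch runs offline;
# replace fake_arun(...) with agent.arun(...) in a real script.
async def fake_arun(prompt: str) -> str:
    await asyncio.sleep(0)  # simulates network I/O
    return f"answer to: {prompt}"


async def main() -> list[str]:
    # Both prompts are in flight at the same time.
    return await asyncio.gather(
        fake_arun("What is Python?"),
        fake_arun("What is asyncio?"),
    )


results = asyncio.run(main())
print(results)
```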
## Write an evaluation function

> **Note:** The evaluation service connection is configured separately. Contact us for evaluation service setup details.
- Create an `eval/` directory in your project if it doesn't exist.
- Add your evaluation functions there. Each function must:
  - Be decorated with `@eval_run`
  - Return an `EvalConfig` object
Example:
```python
# eval/examples.py
from rakam_systems_cli.decorators import eval_run
from rakam_systems_tools.evaluation.schema import (
    EvalConfig,
    TextInputItem,
    ClientSideMetricConfig,
    ToxicityConfig,
)


@eval_run
def test_simple_text_eval():
    """A simple text evaluation showcasing a basic client-side metric."""
    return EvalConfig(
        component="text_component_1",
        label="demo_simple_text",
        data=[
            TextInputItem(
                id="txt_001",
                input="Hello world",
                output="Hello world",
                expected_output="Hello world",
                metrics=[ClientSideMetricConfig(name="relevance", score=1)],
            )
        ],
        metrics=[ToxicityConfig(name="toxicity_demo", include_reason=False)],
    )
```
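Conceptually, a decorator like `@eval_run` only needs to register the function so the CLI can discover and run it later. The sketch below illustrates that registration pattern with hypothetical names; it is not the actual `rakam_systems_cli` implementation:

```python
# Hypothetical registry illustrating the decorator pattern; the real
# rakam_systems_cli internals may differ.
_EVAL_REGISTRY = []


def eval_run(fn):
    """Register fn so a runner can later call every discovered evaluation."""
    _EVAL_REGISTRY.append(fn)
    return fn


@eval_run
def demo_eval():
    # A real function would return an EvalConfig; a dict stands in here.
    return {"component": "text_component_1", "label": "demo"}


# A runner would iterate the registry and collect each config:
configs = [fn() for fn in _EVAL_REGISTRY]
print(configs)
```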
## Run evaluations

From your project root:

```bash
rakam eval run
```

To list runs:

```bash
rakam eval list runs
```

To view the latest results:

```bash
rakam eval show
```

To compare two runs and see what changed:

```bash
rakam eval compare --id 42 --id 45
```