Getting Started Guide
Welcome! This guide will help you set up and use the Rakam Systems tools for local development and evaluation.
Prerequisites
Before you begin, make sure you have:
- Python 3.10 or higher
- pip package manager
- Docker installed
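The prerequisites above can be checked from a short script before you continue. This is a minimal sketch using only the Python standard library. `check_prerequisites` is a hypothetical helper name, not part of the Rakam Systems packages, and it only confirms that the `pip` and `docker` executables are on your PATH, not that the Docker daemon is running.

```python
import shutil
import sys

# Minimum Python version listed in the prerequisites.
MIN_PYTHON = (3, 10)


def check_prerequisites(min_python=MIN_PYTHON):
    """Return a dict mapping each prerequisite name to True or False."""
    return {
        # Compare the running interpreter against the required version.
        "python": sys.version_info >= min_python,
        # Either executable name counts as having pip available.
        "pip": shutil.which("pip") is not None or shutil.which("pip3") is not None,
        # Only checks that the docker CLI is on PATH.
        "docker": shutil.which("docker") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

Run it with the same interpreter you plan to use for the virtual environment, since the version check applies to whichever Python executes the script.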
1. Start the Evaluation Service (required for evaluation)
The evaluation service must be running to use the evaluation features. Contact us if you need help setting it up.
2. Set Up the Environment
1. Create and activate a Python virtual environment:
python3 -m venv venv
source venv/bin/activate # On macOS/Linux
# On Windows: venv\Scripts\activate
2. Install the Rakam Systems package:
pip install rakam-systems
3. Set up your API keys:
Create a .env file in your project root with your OpenAI API key:
# .env
OPENAI_API_KEY=sk-your-api-key
Then load it in your code:
from dotenv import load_dotenv
load_dotenv()
The evaluation service connection is configured separately. Contact us for evaluation service setup details.
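If the .env file is missing or the key name is misspelled, the failure usually surfaces later as an authentication error. The sketch below fails early instead. It assumes `load_dotenv()` has already been called as shown in step 3, and `require_env` is a hypothetical helper name introduced for illustration, not part of the Rakam Systems packages.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise if it is unset or blank."""
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or export it in your shell"
        )
    return value


# Example usage, after load_dotenv() has run:
# api_key = require_env("OPENAI_API_KEY")
```

Calling this once at startup gives a clear message that names the missing variable.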
3. Your First Agent
Create a file named my_first_agent.py:
import asyncio

from dotenv import load_dotenv

# Load OPENAI_API_KEY from .env before the agent is created.
load_dotenv()

from rakam_systems_agent import BaseAgent


async def main():
    agent = BaseAgent(
        name="my_assistant",
        model="openai:gpt-4o",
        system_prompt="You are a helpful assistant.",
    )
    result = await agent.arun("What is Python?")
    print(result.output_text)


asyncio.run(main())
Run it with:
python my_first_agent.py
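Once the single-prompt script works, you may want to send several prompts in one run. The sketch below relies only on what the example above shows: an `arun` coroutine and an `output_text` attribute on its result. `ask_all` is a hypothetical helper name, and the prompts are sent one after another because this guide does not say whether an agent can safely handle concurrent calls.

```python
import asyncio


async def ask_all(agent, prompts):
    """Send each prompt to the agent in order and collect the reply texts."""
    answers = []
    for prompt in prompts:
        # Same call as in my_first_agent.py, repeated once per prompt.
        result = await agent.arun(prompt)
        answers.append(result.output_text)
    return answers


# Example usage inside main(), with `agent` created as in my_first_agent.py:
# answers = await ask_all(agent, ["What is Python?", "What is asyncio?"])
```

The helper accepts any object with a compatible `arun` method, which also makes it easy to try with a stub before spending API calls.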
4. Write Your First Evaluation Function
- Create an eval/ directory in your project if it doesn't exist.
- Add your evaluation functions there. Each function must:
  - Be decorated with @eval_run
  - Return an EvalConfig object
Example:
# eval/examples.py
from rakam_systems_cli.decorators import eval_run
from rakam_systems_tools.evaluation.schema import (
    EvalConfig,
    TextInputItem,
    ClientSideMetricConfig,
    ToxicityConfig,
)


@eval_run
def test_simple_text_eval():
    """A simple text evaluation showcasing a basic client-side metric."""
    return EvalConfig(
        component="text_component_1",
        label="demo_simple_text",
        data=[
            TextInputItem(
                id="txt_001",
                input="Hello world",
                output="Hello world",
                expected_output="Hello world",
                metrics=[ClientSideMetricConfig(name="relevance", score=1)],
            )
        ],
        metrics=[ToxicityConfig(name="toxicity_demo", include_reason=False)],
    )
5. Run Your Evaluation
To run your evaluation functions, run the following from your project root:
rakam eval run
To list runs:
rakam eval list runs
To view the latest results:
rakam eval show
Compare two runs to see what changed:
rakam eval compare --id 42 --id 45