Version: v0.3.0

Getting Started Guide

Welcome! This guide will help you set up and use the Rakam Systems tools for local development and evaluation.

Prerequisites

Before you begin, make sure you have:

  • Python 3.10 or higher
  • pip package manager
  • Docker installed

1. Start the Evaluation Service (required for evaluation)

The evaluation service must be running to use the evaluation features. Contact us if you need help setting it up.

2. Set Up the Environment

1. Create and activate a Python virtual environment:

python3 -m venv venv
source venv/bin/activate # On macOS/Linux
# On Windows: venv\Scripts\activate

2. Install the Rakam Systems package:

pip install rakam-systems

3. Set up your API keys:

Create a .env file in your project root with your OpenAI API key:

# .env
OPENAI_API_KEY=sk-your-api-key

Then load it in your code (load_dotenv is provided by the python-dotenv package):

from dotenv import load_dotenv
load_dotenv()
Note: The evaluation service connection is configured separately. Contact us for evaluation service setup details.
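A missing or blank key usually only surfaces when the first model call fails. As a minimal sketch, the standard-library check below could run right after load_dotenv() to fail early with a readable message. The helper names missing_env_vars and require_env_vars are hypothetical and not part of Rakam Systems.

```python
import os
from typing import Iterable, Mapping, Optional


def missing_env_vars(
    names: Iterable[str],
    env: Optional[Mapping[str, str]] = None,
) -> list[str]:
    """Return the names that are unset or blank in the environment.

    Hypothetical helper for illustration; it is not part of Rakam Systems.
    """
    # Fall back to the real process environment when no mapping is passed in.
    source = os.environ if env is None else env
    return [name for name in names if not source.get(name, "").strip()]


def require_env_vars(
    names: Iterable[str],
    env: Optional[Mapping[str, str]] = None,
) -> None:
    """Raise a clear error if any required variable is missing or blank."""
    missing = missing_env_vars(names, env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling require_env_vars(["OPENAI_API_KEY"]) immediately after load_dotenv() would stop the script before any agent is created if the key was not picked up from .env.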

3. Your First Agent

Create a file named my_first_agent.py:

import asyncio
from dotenv import load_dotenv
load_dotenv()

from rakam_systems_agent import BaseAgent

async def main():
    agent = BaseAgent(
        name="my_assistant",
        model="openai:gpt-4o",
        system_prompt="You are a helpful assistant."
    )
    result = await agent.arun("What is Python?")
    print(result.output_text)

asyncio.run(main())

Run it with:

python my_first_agent.py
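To go from one question to several, the same arun call can be wrapped in a small loop. The sketch below relies only on the arun coroutine and the output_text attribute shown in my_first_agent.py. StubAgent and StubResult are hypothetical stand-ins so the example runs offline without an API key; they are not Rakam Systems classes.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StubResult:
    """Stand-in for the agent result; it only mirrors the output_text attribute."""
    output_text: str


class StubAgent:
    """Offline stand-in exposing the same arun() coroutine used in the guide."""

    async def arun(self, prompt: str) -> StubResult:
        return StubResult(output_text=f"(stub answer to: {prompt})")


async def ask_all(agent, prompts):
    """Send each prompt in order and collect the text of every reply."""
    answers = []
    for prompt in prompts:
        result = await agent.arun(prompt)
        answers.append(result.output_text)
    return answers


async def main():
    # Replace StubAgent() with the BaseAgent(...) built in my_first_agent.py.
    agent = StubAgent()
    for answer in await ask_all(agent, ["What is Python?", "What is pip?"]):
        print(answer)


if __name__ == "__main__":
    asyncio.run(main())
```

The prompts are sent one at a time rather than concurrently, since this guide does not say whether a single agent instance supports overlapping calls.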

4. Write Your First Evaluation Function

  1. Create an eval/ directory in your project if it doesn't exist.
  2. Add your evaluation functions there. Each function must:
    • Be decorated with @eval_run
    • Return an EvalConfig object

Example:

# eval/examples.py
from rakam_systems_cli.decorators import eval_run
from rakam_systems_tools.evaluation.schema import (
    EvalConfig,
    TextInputItem,
    ClientSideMetricConfig,
    ToxicityConfig,
)

@eval_run
def test_simple_text_eval():
    """A simple text evaluation showcasing a basic client-side metric."""
    return EvalConfig(
        component="text_component_1",
        label="demo_simple_text",
        data=[
            TextInputItem(
                id="txt_001",
                input="Hello world",
                output="Hello world",
                expected_output="Hello world",
                metrics=[ClientSideMetricConfig(name="relevance", score=1)],
            )
        ],
        metrics=[ToxicityConfig(name="toxicity_demo", include_reason=False)],
    )
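With more than a handful of test cases, writing each TextInputItem by hand gets repetitive. The sketch below builds the keyword arguments for each item from plain tuples, using only the field names that appear in the example above (id, input, output, expected_output). The helper rows_to_item_kwargs is hypothetical, and it is an assumption that TextInputItem accepts exactly these keywords without further required fields.

```python
def rows_to_item_kwargs(rows, id_prefix="txt"):
    """Turn (input, output, expected_output) tuples into keyword dicts.

    Hypothetical helper; each dict is meant to be unpacked into
    TextInputItem(**kwargs), assuming the fields shown in the guide.
    """
    items = []
    for index, (text_in, text_out, expected) in enumerate(rows, start=1):
        items.append(
            {
                # Zero-padded ids such as txt_001, matching the example's style.
                "id": f"{id_prefix}_{index:03d}",
                "input": text_in,
                "output": text_out,
                "expected_output": expected,
            }
        )
    return items
```

Inside an evaluation function, the data argument could then be written as data=[TextInputItem(**kwargs) for kwargs in rows_to_item_kwargs(rows)].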

5. Run Your Evaluation

To run your evaluation functions, run the following from your project root:

rakam eval run

To list runs:

rakam eval list runs

To view the latest results:

rakam eval show

Compare two runs to see what changed:

rakam eval compare --id 42 --id 45
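When these commands are driven from a script (for example in CI), it can help to build the argument lists in one place. The sketch below wraps only the rakam eval compare invocation shown above. The function names are hypothetical, and using check=True assumes the CLI exits with a non-zero status on failure, which this guide does not confirm.

```python
import subprocess


def compare_command(first_id: int, second_id: int) -> list[str]:
    """Build the argument list for `rakam eval compare --id A --id B`."""
    return ["rakam", "eval", "compare", "--id", str(first_id), "--id", str(second_id)]


def run_compare(first_id: int, second_id: int) -> None:
    """Run the comparison from the current directory (the project root).

    Assumption: the CLI signals failure through a non-zero exit status,
    in which case subprocess raises CalledProcessError.
    """
    subprocess.run(compare_command(first_id, second_id), check=True)
```

For the run IDs used above, run_compare(42, 45) issues the same command as the shell example.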