Tool Learning Demo

Complete Application

This is a complete, runnable application demonstrating Atulya integration. View source on GitHub →

An interactive Streamlit demo showing how Atulya helps LLMs learn which tool to use when tool names are ambiguous.

The Problem

When building AI agents with tool/function calling, tool names and descriptions aren't always self-explanatory. An LLM may pick arbitrarily between similarly named tools, leading to incorrect behavior.

The Scenario

This demo simulates a customer service routing system with two channels:

| Tool | Description (what the LLM sees) | Actual purpose (hidden) |
|------|---------------------------------|-------------------------|
| route_to_channel_alpha | "Routes to channel Alpha for appropriate request types" | Financial issues (refunds, billing, payments) |
| route_to_channel_omega | "Routes to channel Omega for appropriate request types" | Technical issues (bugs, features, errors) |

The descriptions are intentionally vague! Without prior knowledge, the LLM must guess which channel handles what.
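The two ambiguous tools can be sketched as OpenAI-style function-calling specs. The names and descriptions come from the table above; the parameter schema is an assumption for illustration, not necessarily what `app.py` uses.

```python
# Build the two deliberately vague tool specs in OpenAI function-calling
# format. Note the descriptions differ only in the channel name, so the
# model has nothing to go on without learned memory.

def make_routing_tools() -> list:
    def channel_tool(name: str, channel: str) -> dict:
        return {
            "type": "function",
            "function": {
                "name": name,
                "description": f"Routes to channel {channel} for appropriate request types",
                "parameters": {  # assumed schema: one free-text argument
                    "type": "object",
                    "properties": {
                        "request": {
                            "type": "string",
                            "description": "The customer request text",
                        }
                    },
                    "required": ["request"],
                },
            },
        }

    return [
        channel_tool("route_to_channel_alpha", "Alpha"),
        channel_tool("route_to_channel_omega", "Omega"),
    ]
```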

The Solution: Learning with Atulya

With Atulya memory:

  1. Store routing feedback about which channel handles which request type
  2. Retrieve learned knowledge when making routing decisions
  3. Consistently route correctly based on past experience
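The store/retrieve loop above can be sketched as payload builders. The endpoint paths and field names in the comments are assumptions for illustration; consult the Atulya API documentation for the real interface.

```python
# Minimal sketch of the learn/recall cycle. Only the payload shapes are
# built here; the endpoint paths shown in the comments below are
# hypothetical, not the confirmed Atulya API.

ATULYA_URL = "http://localhost:8888"  # default from the demo's config table

def routing_feedback_payload(tool_name: str, request_type: str) -> dict:
    """Natural-language feedback to store once a routing outcome is known."""
    return {
        "content": (
            f"Requests about {request_type} issues should be routed with "
            f"{tool_name}."
        )
    }

def routing_query(request: str, limit: int = 3) -> dict:
    """Search parameters for recalling routing knowledge before deciding."""
    return {"q": request, "limit": limit}

# Sending these would look roughly like (hypothetical endpoints):
#   requests.post(f"{ATULYA_URL}/memories", json=routing_feedback_payload(...))
#   requests.get(f"{ATULYA_URL}/search", params=routing_query(...))
```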

Quick Start

Prerequisites

  1. Atulya Server running (Docker):

docker run -d -p 8888:8888 -p 9999:9999 \
  -e ATULYA_API_LLM_PROVIDER=openai \
  -e ATULYA_API_LLM_API_KEY=$OPENAI_API_KEY \
  -e ATULYA_API_LLM_MODEL=gpt-4o-mini \
  ghcr.io/eight-atulya/atulya:latest

  2. OpenAI API key:

export OPENAI_API_KEY=your-key-here

Run the Demo

./run.sh

Or manually:

pip install -r requirements.txt
streamlit run app.py

How to Use the Demo

Step 1: Test Without Memory (Baseline)

  1. Select a Financial Request (e.g., "I need a refund...")
  2. Click Route Request
  3. Observe: The "Without Atulya" column may route incorrectly

Step 2: Route First Customer and Learn

  1. Route a customer → both LLMs (with and without Atulya) route the request simultaneously
  2. Feedback is automatically stored to Atulya
  3. Wait ~5 seconds for Atulya to index the memory
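The fixed ~5-second wait can be made more robust with a short polling loop that retries retrieval until the stored feedback becomes searchable. `search_memories` here is a stand-in for whatever retrieval call your client exposes, an assumption rather than the real Atulya API.

```python
# Poll until the newly stored feedback is retrievable, instead of sleeping
# a fixed 5 seconds. Returns the hits, or [] if the timeout expires.
import time

def wait_for_memory(search_memories, query: str,
                    timeout: float = 10.0, interval: float = 1.0) -> list:
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        hits = search_memories(query)  # stand-in for the Atulya search call
        if hits:
            return hits
        time.sleep(interval)
    return []
```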

Step 3: Test With Memory

  1. Select another request (financial or technical)
  2. Click Route Request
  3. Observe: The "With Atulya" column should now route correctly!

Step 4: View Statistics

  • See the accuracy comparison between the "Without Atulya" and "With Atulya" columns
  • Review test history to see the improvement over time

Demo Features

  • Side-by-side comparison: See routing results with and without memory
  • Pre-defined test requests: Financial and technical scenarios
  • Custom requests: Enter your own customer requests
  • Memory Explorer: Query stored routing knowledge directly
  • Live statistics: Track accuracy improvement

Key Insight

Even when tool names and descriptions don't reveal their purpose, Atulya allows the LLM to learn from experience which tool to use for which type of request.
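One way this learning can feed back into the routing call is to prepend the retrieved memories to the system prompt before asking the model to pick a tool. The message shapes follow the OpenAI chat format; the prompt wording is an assumption, not necessarily what the demo uses.

```python
# Condition the routing decision on learned knowledge: inject retrieved
# memories into the system prompt, then let the model choose between the
# ambiguous tools.

def build_routing_messages(request: str, memories: list) -> list:
    system = "You route customer requests to the correct channel."
    if memories:
        system += "\nLearned routing knowledge:\n" + "\n".join(
            f"- {m}" for m in memories
        )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": request},
    ]
```

Without memories the baseline prompt is unchanged, which is exactly the "Without Atulya" column in the demo.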

This is especially valuable for:

  • Enterprise systems with legacy tool names
  • Multi-tenant systems where tools have generic names
  • Agents that need to learn organization-specific workflows

Configuration

| Setting | Default | Description |
|---------|---------|-------------|
| Model | gpt-4o-mini | LLM model for routing decisions |
| Temperature (No Memory) | 0.7 | Randomness for baseline tests |
| Atulya API URL | http://localhost:8888 | Atulya server URL |
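These defaults could be surfaced as environment-variable overrides. The variable names below are hypothetical, chosen for illustration; the defaults match the table above.

```python
# Hypothetical env-var overrides for the demo's settings; the defaults
# mirror the configuration table.
import os

MODEL = os.getenv("DEMO_MODEL", "gpt-4o-mini")
BASELINE_TEMPERATURE = float(os.getenv("DEMO_BASELINE_TEMPERATURE", "0.7"))
ATULYA_API_URL = os.getenv("ATULYA_API_URL", "http://localhost:8888")
```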

Files

  • app.py - Main Streamlit application
  • requirements.txt - Python dependencies
  • run.sh - Launch script with dependency checking
  • README.md - This file