Tool Learning Demo
This is a complete, runnable application demonstrating Atulya integration. View source on GitHub →
An interactive Streamlit demo showing how Atulya helps LLMs learn which tool to use when tool names are ambiguous.
The Problem
When building AI agents with tool/function calling, tool names and descriptions aren't always clear. An LLM might randomly select between similarly-named tools, leading to incorrect behavior.
The Scenario
This demo simulates a customer service routing system with two channels:
| Tool | Description (What the LLM sees) | Actual Purpose (Hidden) |
|---|---|---|
route_to_channel_alpha | "Routes to channel Alpha for appropriate request types" | Financial issues (refunds, billing, payments) |
route_to_channel_omega | "Routes to channel Omega for appropriate request types" | Technical issues (bugs, features, errors) |
The descriptions are intentionally vague! Without prior knowledge, the LLM must guess which channel handles what.
The Solution: Learning with Atulya
With Atulya memory:
- Store routing feedback about which channel handles which request type
- Retrieve learned knowledge when making routing decisions
- Consistently route correctly based on past experience
Quick Start
Prerequisites
- Atulya Server running (Docker):
docker run -d -p 8888:8888 -p 9999:9999 \
-e ATULYA_API_LLM_PROVIDER=openai \
-e ATULYA_API_LLM_API_KEY=$OPENAI_API_KEY \
-e ATULYA_API_LLM_MODEL=gpt-4o-mini \
ghcr.io/eight-atulya/atulya:latest
- OpenAI API Key:
export OPENAI_API_KEY=your-key-here
Run the Demo
./run.sh
Or manually:
pip install -r requirements.txt
streamlit run app.py
How to Use the Demo
Step 1: Test Without Memory (Baseline)
- Select a Financial Request (e.g., "I need a refund...")
- Click Route Request
- Observe: The "Without Atulya" column may route incorrectly
Step 2: Route First Customer and Learn
- Route a customer → Both LLMs route simultaneously
- Feedback is automatically stored to Atulya
- Wait ~5 seconds for Atulya to index the memory
Step 3: Test With Memory
- Select another request (financial or technical)
- Click Route Request
- Observe: The "With Atulya" column should now route correctly!
Step 4: View Statistics
- See accuracy comparison between "Without Memory" vs "With Atulya"
- Review test history to see the improvement over time
Demo Features
- Side-by-side comparison: See routing results with and without memory
- Pre-defined test requests: Financial and technical scenarios
- Custom requests: Enter your own customer requests
- Memory Explorer: Query stored routing knowledge directly
- Live statistics: Track accuracy improvement
Key Insight
Even when tool names and descriptions don't reveal their purpose, Atulya allows the LLM to learn from experience which tool to use for which type of request.
This is especially valuable for:
- Enterprise systems with legacy tool names
- Multi-tenant systems where tools have generic names
- Agents that need to learn organization-specific workflows
Configuration
| Setting | Default | Description |
|---|---|---|
| Model | gpt-4o-mini | LLM model for routing decisions |
| Temperature (No Memory) | 0.7 | Randomness for baseline tests |
| Atulya API URL | http://localhost:8888 | Atulya server URL |
Files
app.py- Main Streamlit applicationrequirements.txt- Python dependenciesrun.sh- Launch script with dependency checkingREADME.md- This file