Agent Lab: AI Eval Tool
AI Observability & Debugging Platform

01 The Problem
Building reliable autonomous agents requires deep visibility into execution traces, tool usage, and cost metrics. Without proper observability, debugging agent failures becomes a nightmare of guesswork and console.log statements.
The challenge: Create a production-grade platform that gives AI engineers the visibility they need to move from experimental prototypes to reliable, production-ready agent workflows.
02 Key Features
Multi-Agent Orchestration - Visualize complex workflows coordinated across Planner, Researcher, Analyst, and Synthesizer agents.
Real-Time Trace Visualization - Watch execution steps unfold live with full tool call details, timing, and state transitions.
Cost & Token Metrics - Track input/output tokens and costs per agent step to optimize LLM spending.
Session History - Browse and replay past sessions to debug issues and understand agent behavior patterns.
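As a rough illustration of the per-step cost tracking described above, here is a minimal TypeScript sketch. The `TraceStep` shape, the `Pricing` fields, and the helper names are hypothetical and are not taken from Agent Lab's code. The prices are placeholders, since real per-token rates depend on the model.

```typescript
// Hypothetical shape of one recorded agent step; field names are assumptions.
type AgentName = "Planner" | "Researcher" | "Analyst" | "Synthesizer";

interface TraceStep {
  agent: AgentName;
  inputTokens: number;
  outputTokens: number;
}

// Assumed pricing model: USD per one million tokens, split by direction.
interface Pricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

interface AgentTotals {
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
}

// Cost of a single step under the assumed pricing model.
function stepCost(step: TraceStep, pricing: Pricing): number {
  return (
    (step.inputTokens / 1e6) * pricing.inputPerMillion +
    (step.outputTokens / 1e6) * pricing.outputPerMillion
  );
}

// Sum tokens and cost per agent across all steps of a session.
function totalsByAgent(
  steps: TraceStep[],
  pricing: Pricing
): Map<AgentName, AgentTotals> {
  const totals = new Map<AgentName, AgentTotals>();
  for (const step of steps) {
    const current: AgentTotals =
      totals.get(step.agent) ?? { inputTokens: 0, outputTokens: 0, costUsd: 0 };
    current.inputTokens += step.inputTokens;
    current.outputTokens += step.outputTokens;
    current.costUsd += stepCost(step, pricing);
    totals.set(step.agent, current);
  }
  return totals;
}
```

Grouping by agent rather than by session makes it easier to spot which role in the workflow accounts for most of the spending. The same loop could group by step or by model instead.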
03 Tech Stack
Next.js 15 with App Router and Server Actions
Vercel AI SDK for agent orchestration and streaming
OpenRouter API with support for multiple free models
Zustand for state management with session persistence
Neobrutalist design system with Tailwind CSS