Pathfinder AI: Multi-Agent Career Guidance
4-agent career coach that turns user intake into ranked paths and actionable plans.
Overview
Pathfinder AI is a career guidance platform my teammate Sammy Hamouda and I built in 24 hours at DiamondHacks 2026. The core idea: instead of giving you a generic list of careers to research on your own, it interviews you across 12 dimensions — interests, values, skills, risk tolerance, geography, salary expectations, burnout concerns, and more — runs that intake through a multi-agent analysis pipeline, and hands you back scored career cards with concrete next steps.
The system is split across three services: a React frontend, a Fastify API, and a Python agent service built on Fetch.ai's uAgents framework. Sammy drove the agent pipeline and core product logic. I focused almost entirely on making sure the thing would actually hold together under live demo conditions.
Architecture
- Frontend: React 19 + TypeScript with TanStack Query for server state, React Router for navigation, Zod for form validation, Vite for builds
- API: Fastify 5 + TypeScript with Drizzle ORM over SQLite, Server-Sent Events for streaming responses, shared Zod schemas with the frontend
- Agent service: Python + Fetch.ai uAgents, coordinating four specialized agents that each evaluate a different set of career-fit dimensions
The Fastify API streams responses back to the frontend over Server-Sent Events instead of waiting for the full agent pipeline to complete, so users see partial results as each agent finishes rather than staring at a spinner for the full duration. Because the frontend and API share Zod schemas, with TypeScript types derived from those schemas, a change to the API's response shape shows up as a compile error on the frontend instead of a runtime failure.
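As a rough illustration of the streaming step, each agent result could be serialized as one Server-Sent Events frame. The sketch below shows only the wire format. The event name `agent_result` and the `AgentResult` shape are assumptions made for the example, not the project's actual schema.

```typescript
// Hypothetical payload shape; the real shared schema is not shown in this write-up.
interface AgentResult {
  agent: string; // which of the four agents produced this result
  score: number; // that agent's fit score, 0 to 100
}

// Serialize one SSE frame: an "event:" line, one "data:" line per payload
// line, and a trailing blank line, which is what terminates a frame.
function formatSseEvent(event: string, payload: object): string {
  const json = JSON.stringify(payload);
  const dataLines = json.split("\n").map((line) => `data: ${line}`);
  return [`event: ${event}`, ...dataLines, "", ""].join("\n");
}

const result: AgentResult = { agent: "skills", score: 82 };
const frame = formatSseEvent("agent_result", result);
```

In a Fastify handler, a string like `frame` would typically be written to `reply.raw` after setting `Content-Type: text/event-stream`, once per finished agent.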
The intake conversation collects data across 12 dimensions including interests, skills, values alignment, geographic preference, timeline, risk tolerance, and burnout history. Once the intake is complete, four agents run in parallel against this data — each scoring a different cluster of career-fit factors. The results are then combined into ranked career cards with fit scores from 0 to 100, a breakdown of match reasoning, identified concerns, suggested next steps, and salary range estimates.
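The combining step could look something like the sketch below. The write-up does not say how the four agents' scores are merged, so the unweighted mean used here is purely an assumption, and the `AgentScore` and `CareerCard` names are hypothetical.

```typescript
// Hypothetical shapes. The real cards also carry match reasoning, concerns,
// next steps, and salary ranges, which are omitted here.
interface AgentScore {
  careerId: string;
  agent: string;
  score: number; // expected range 0 to 100
}

interface CareerCard {
  careerId: string;
  fitScore: number; // 0 to 100, rounded to an integer
}

// Assumption: a career's fit score is the unweighted mean of the agent scores
// it received. The project may use a different combination rule.
function rankCareers(scores: AgentScore[]): CareerCard[] {
  const byCareer: Record<string, number[]> = {};
  for (const s of scores) {
    const clamped = Math.min(100, Math.max(0, s.score)); // keep scores in range
    (byCareer[s.careerId] = byCareer[s.careerId] ?? []).push(clamped);
  }
  const cards = Object.keys(byCareer).map((careerId) => {
    const values = byCareer[careerId];
    const mean = values.reduce((a, b) => a + b, 0) / values.length;
    return { careerId, fitScore: Math.round(mean) };
  });
  // Highest fit first; ties are broken by id so the order is deterministic.
  return cards.sort(
    (a, b) =>
      b.fitScore - a.fitScore ||
      (a.careerId < b.careerId ? -1 : a.careerId > b.careerId ? 1 : 0)
  );
}
```

Clamping before averaging means a single out-of-range agent score cannot push a card outside the 0 to 100 scale.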
My Contribution
My instinct going into the hackathon was that a demo that fails halfway through is worse than a demo that does less but works every time. So while Sammy focused on building the product, I focused on the infrastructure around it.
- 29 integration tests covering happy paths, error paths, and edge cases across the API and agent service
- A /ready self-test endpoint that runs a smoke check against all three services and returns structured health status — judges could hit it directly to verify the system was up before the demo
- Observability headers on every response (X-Request-Id, X-Response-Time) so that when something went wrong we could correlate requests across service logs
- A deterministic fallback engine that produces valid career recommendations without calling the LLM, so the demo could still run if the LLM API rate-limited us or the agent service was slow
- A state machine with transition guards on the intake flow, preventing the frontend from reaching broken states if the API returned unexpected data
- Demo mode with deterministic timing so the streaming animation would look consistent regardless of actual agent response time
- Operator playbooks documenting how to restart individual services, re-seed the database, and recover from the most likely failure modes mid-demo
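To make the transition-guard idea from the list above concrete, here is a minimal sketch of a guarded intake state machine. The state names, event names, and guard conditions are all hypothetical. The write-up only says that guards kept the frontend out of broken states.

```typescript
// Hypothetical intake states; the real flow's states are not listed in the write-up.
type IntakeState = "idle" | "interviewing" | "analyzing" | "results" | "error";

type IntakeEvent =
  | { type: "START" }
  | { type: "INTAKE_COMPLETE"; answeredDimensions: number }
  | { type: "RESULTS_RECEIVED"; cardCount: number }
  | { type: "FAILURE" }
  | { type: "RESET" };

const REQUIRED_DIMENSIONS = 12; // the intake covers 12 dimensions

// Returns the next state. When an event is not allowed from the current state,
// or its guard fails, the current state is returned unchanged.
function transition(state: IntakeState, event: IntakeEvent): IntakeState {
  switch (event.type) {
    case "START":
      return state === "idle" ? "interviewing" : state;
    case "INTAKE_COMPLETE":
      // Guard: only leave the interview once every dimension is answered.
      return state === "interviewing" &&
        event.answeredDimensions >= REQUIRED_DIMENSIONS
        ? "analyzing"
        : state;
    case "RESULTS_RECEIVED":
      if (state !== "analyzing") return state; // ignore stray results
      // Guard: an empty result set is an error, not a valid results screen.
      return event.cardCount > 0 ? "results" : "error";
    case "FAILURE":
      return state === "interviewing" || state === "analyzing" ? "error" : state;
    case "RESET":
      return "idle";
    default:
      return state;
  }
}
```

Because `transition` is a pure function, every guard can be unit-tested without rendering any UI, and an unexpected API response can at worst land the flow in the explicit `error` state.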
Building for a live hackathon demo is a different problem than building for production. In production you have logs, alerts, rollback options, and time. In a hackathon you have 24 hours and one shot in front of judges. The reliability work I did — the /ready endpoint, the fallback engine, the operator playbooks — was specifically scoped to that context. None of it would be the right architecture for a production system, but it was exactly the right investment for what we were trying to do.
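For a sense of how small a demo-scoped fallback engine can be, here is a minimal sketch of a rule-based scorer that never touches the LLM. The catalog, the tag vocabulary, and the tag-overlap scoring rule are all invented for illustration. The write-up does not describe how the actual engine scores careers.

```typescript
// Hypothetical data: a tiny hard-coded catalog standing in for whatever the
// real fallback engine ships with.
interface CatalogEntry {
  id: string;
  tags: string[]; // lowercase tags
}

const CATALOG: CatalogEntry[] = [
  { id: "data-analyst", tags: ["analysis", "statistics", "remote"] },
  { id: "ux-designer", tags: ["design", "research", "remote"] },
  { id: "field-engineer", tags: ["hardware", "travel", "analysis"] },
];

interface FallbackCard {
  careerId: string;
  fitScore: number; // 0 to 100
}

// Assumed rule: score = share of a career's tags found in the user's interests.
// No randomness, clock, or network access is involved, so the same intake
// always yields the same cards in the same order.
function fallbackRecommendations(interests: string[]): FallbackCard[] {
  const wanted = interests.map((i) => i.toLowerCase());
  return CATALOG.map((entry) => {
    const hits = entry.tags.filter((t) => wanted.indexOf(t) !== -1).length;
    return {
      careerId: entry.id,
      fitScore: Math.round((100 * hits) / entry.tags.length),
    };
  }).sort(
    (a, b) =>
      b.fitScore - a.fitScore ||
      (a.careerId < b.careerId ? -1 : a.careerId > b.careerId ? 1 : 0)
  );
}
```

A caller would invoke something like `fallbackRecommendations` whenever the LLM-backed path errors out or exceeds a time budget, so the frontend receives cards of the same shape either way.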
A demo that fails halfway through is worse than a demo that does less but works every time.