Interview me

Hard production AI questions, answered with sources

Static curated answers for architecture reviews, technical interviews, role-fit screens, and senior AI platform conversations.

Portfolio chat

Ask the portfolio like an interview loop

Ask a hiring-style question and get a concise answer with links back to the relevant public portfolio pages.

Ask about orchestration, evals, cost controls, infra rescue, or where the systems failed before they became reliable.

Architecture

Why should we not just use LangGraph for orchestration?

The library is not the architecture. Production systems need explicit state, typed interfaces, retry boundaries, observability, evaluation hooks, and cost controls regardless of the orchestration framework.

  • At Knit, the durable unit was a DAG of inspectable tasks, not a vague autonomous loop.
  • Explicit workflows made parallel execution, retries, judge verification, and debugging practical.
  • Frameworks can help, but production ownership lives in the domain-specific boundaries around them.
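The contrast above can be sketched as a minimal explicit DAG executor. This is an illustrative sketch, not the Knit implementation: the task names and the `Task`/`execute` helpers are hypothetical, but they show where the durable units live, with state explicit and a retry boundary on each task rather than inside an autonomous loop.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Task:
    """One inspectable unit of work with an explicit retry boundary."""
    name: str
    run: Callable[[dict], dict]       # explicit interface: state in, state delta out
    deps: list[str] = field(default_factory=list)
    max_retries: int = 2

def execute(tasks: list[Task]) -> dict:
    """Run tasks in dependency order; state stays explicit and loggable."""
    by_name = {t.name: t for t in tasks}
    done: dict[str, dict] = {}
    state: dict = {}

    def run_task(t: Task) -> None:
        for dep in t.deps:                    # resolve dependencies first
            if dep not in done:
                run_task(by_name[dep])
        last_err = None
        for _attempt in range(t.max_retries + 1):
            try:
                state.update(t.run(state))
                done[t.name] = dict(state)    # snapshot per task, for observability
                return
            except Exception as e:            # retry boundary is per-task, not global
                last_err = e
        raise RuntimeError(f"{t.name} failed after retries") from last_err

    for t in tasks:
        if t.name not in done:
            run_task(t)
    return state

# Usage: a two-step pipeline with an explicit dependency edge.
pipeline = [
    Task("fetch", lambda s: {"raw": [1, 2, 3]}),
    Task("summarize", lambda s: {"total": sum(s["raw"])}, deps=["fetch"]),
]
print(execute(pipeline)["total"])  # → 6
```

Because every task is a named node with a snapshot, parallelizing independent branches, attaching a judge step, or replaying a failure is a structural change, not a rewrite.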
Agentic Market Research Platform

Decision fork comparing free-form agents with explicit DAG execution.

Cost and infra

How do you control LLM and ML infrastructure costs?

I treat cost as an architectural constraint. Cost control comes from task classification, model routing, sandbox reuse, caching, selective judge coverage, retry limits, and visibility into unit economics.

  • Knit required model routing, persistent sandbox reuse, and task-level observability.
  • Epic required infrastructure ownership that reduced cost by 10x and pod usage by 100x.
  • The Cost Anatomy challenge shows normalized cost units only, never actual internal figures.
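The routing piece of that list can be sketched in a few lines. The model names, per-token prices, and `route` helper below are hypothetical placeholders (consistent with showing normalized units only, never internal figures); the point is that classification, cheapest-adequate selection, and a budget gate are ordinary code, not framework magic.

```python
# Cost-aware model routing sketch: hypothetical model names and per-1K-token
# prices. Real routing would also weigh latency and quality, not just cost.
ROUTES = {
    "extract": {"model": "small-model", "usd_per_1k_tokens": 0.0002},
    "analyze": {"model": "large-model", "usd_per_1k_tokens": 0.0100},
}

def route(task_kind: str, est_tokens: int, budget_usd: float) -> str:
    """Classify the task, pick the matching model, and enforce a hard budget."""
    r = ROUTES.get(task_kind, ROUTES["analyze"])   # default to the capable model
    est_cost = est_tokens / 1000 * r["usd_per_1k_tokens"]
    if est_cost > budget_usd:                      # fail loudly instead of overspending
        raise ValueError(f"{task_kind}: est ${est_cost:.4f} exceeds budget")
    return r["model"]

# Usage: a cheap extraction task routes to the small model and fits the budget.
print(route("extract", est_tokens=4000, budget_usd=0.01))  # → small-model
```

Putting the budget check inside the router is what makes unit economics visible per task rather than discovered on the monthly invoice.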
ML Infrastructure Rescue

Production ML ownership across cost and reliability.

Cost Anatomy

Workflow unit economics model for routing and verification tradeoffs.

Evals and reliability

How do you keep AI-generated analysis from becoming fluent but wrong?

I separate generation from verification. For analytics, that means executable artifacts, independent checks, source-linked outputs, and failure modes that can be inspected instead of trusted blindly.

  • Generated Python gave analysis an executable audit path.
  • Independent judge verification reduced self-confirming errors.
  • Chart and narrative quality gates made artifact quality visible before deck assembly.
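A toy sketch of that separation, with hypothetical helpers: the generator emits a narrative plus a machine-checkable claim, and an independent judge recomputes the claim instead of re-reading the prose. This is not the platform's code, just the shape of the idea.

```python
# Generation vs. verification: a numeric claim is checked by recomputation,
# not trusted because the surrounding text reads fluently.
def generate_summary(values: list[float]) -> dict:
    """Stand-in for an LLM step: returns prose plus a checkable claim."""
    return {"text": "Average spend rose.", "claimed_mean": sum(values) / len(values)}

def judge(summary: dict, values: list[float], tol: float = 1e-9) -> bool:
    """Independent verifier: recompute the claim from source data."""
    return abs(summary["claimed_mean"] - sum(values) / len(values)) <= tol

# Usage: an honest summary passes; a fluent-but-wrong one would fail.
data = [10.0, 12.0, 14.0]
summary = generate_summary(data)
print(judge(summary, data))  # → True
```

The executable audit path matters because the failure mode it catches, a confident narrative wrapped around a wrong number, is exactly the one human review misses.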
Agentic Market Research Platform

Evaluation and reliability section for the flagship system.

Full-stack execution

Can you own the whole product path, not only the model layer?

My strongest work sits across product, backend, ML, data, frontend, and operations. I can connect model behavior to user-facing artifacts and production constraints.

  • Knit combined LLM orchestration, Python execution, charting, Deck IR, APIs, streaming, and observability.
  • Epic combined ML infra, Elasticsearch, recommendations, Kubernetes, and product experiments.
  • Osmo combined computer vision, learning-product UX, data collection, and real-time constraints.
Case study grid

Breadth across LLM systems, infrastructure, search, CV, and product systems.

Risk and weaknesses

What is the risk in hiring you, and how do you manage it?

The risk is that I bias toward building robust systems even when a team only needs a quick demo. I manage that by making scope gates explicit and choosing the lightest system that preserves correctness.

  • V1 of this portfolio deliberately avoided a live assistant until content and evals were ready.
  • I prefer static, typed, reviewable artifacts first, then add dynamic systems when the proof is stable.
  • That same discipline applies to product work: prototype quickly, but do not confuse a prototype with the architecture.
Portfolio execution alignment

The site itself stages static proof before live AI behavior.

Role fit

What roles are the strongest fit?

The strongest fit is a senior AI engineering role where production LLM systems, evaluation, observability, workflow architecture, and full-stack execution matter.

  • Strong fit: AI Platform Engineer, Senior AI Engineer, or LLM Systems Architect.
  • Strong environments: serious AI products, workflow automation, analytics, research tooling, infra-heavy AI applications.
  • Less ideal: pure research roles, frontend-only roles, or teams optimizing for demos over production systems.
Where I am useful

Signals mapped to practical role fit and shipped systems.

Static curated answers stay available alongside the optional live assistant. Both rely on approved public or sanitized evidence.