AI / ML
RAG design
Chunking, embedding, retrieval, reranking, and evaluation: the full stack.
Prompt engineering
Harden prompts, review chain-of-thought, design tool-calling.
Eval harness
Golden set + metrics + regression detection for an LLM feature.
Fine-tuning brief
Decide between fine-tuning, prompt engineering, and RAG using an explicit decision framework.
Model selection
Pick a model with the latency, cost, and capability tradeoffs spelled out.
Guardrail design
Input and output filters for a deployed LLM system.
Hallucination triage
A deployed system is fabricating answers: what to investigate, and in what order.