Building trustworthy AI systems and delightful products

I design, ship, and study AI systems — from secure RAG and evaluation harnesses to multimodal HRI research. Currently a Senior AI/ML Engineer (Associate) at Booz Allen and a graduate TA at UMBC.

Featured Projects

A snapshot of recent work. See more on the Projects page.

  • ResumeTailor

AI-powered resume & cover letter generator with job matching and an auto-apply workflow.

Next.js · OpenAI · PostgreSQL · Tailwind
  • SCOUT++ Toolkit

Toolkit for multimodal HRI experiments and benchmarking instruction grounding on the SCOUT++ dataset.

Python · PyTorch · Vision-Language · Evaluation
  • FSR Release Planner

    Constraint-aware scheduler for Field Service Representatives across multiple systems.

TypeScript · Algorithms · UX
    Details coming soon

Featured Research

Highlights from publications and works in progress. See the full list on the Research page.

  • Grounded Instruction Understanding with Large Language Models: Toward Trustworthy Human-Robot Interaction

    AAAI 2025 Fall Symposium · 2025

    E. Ogbadu, S. Lukin, C. Matuszek

    Understanding natural language as a representational bridge between perception and action is critical for deploying autonomous robots in complex, high-risk environments. This work investigates how large language models (LLMs) can support this bridge by interpreting unconstrained human instructions in urban disaster response scenarios. Leveraging the SCOUT corpus, a multimodal dataset capturing human-robot dialogue through Wizard-of-Oz experiments, we construct SCOUT++, aligning over 11,000 visual frames with language commands and robot actions. We evaluate three instruction classification approaches: a neural network trained on tokenized text, GPT-4 using text alone, and GPT-4 with synchronized visual input. Results show that while GPT-4 (text-only) outperforms traditional models in accuracy, its multimodal variant exhibits degraded performance, often producing vague or hallucinated outputs. These findings expose the challenges of reliably grounding language in visual context and raise questions about the trustworthiness of foundation models in safety-critical settings. We contribute SCOUT++, a reproducible multimodal pipeline, and benchmark results that shed light on the capabilities and current limitations of vision-language models for risk-sensitive human-robot interaction.
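
For readers curious what the text-only vs. multimodal comparison looks like in practice, here is a minimal sketch, not the paper's actual harness. The record format, label set, and model names are assumptions for illustration; the real evaluation pipeline ships with the SCOUT++ toolkit.

```python
"""Sketch of the text-only vs. multimodal classification conditions.
Record format, labels, and models are illustrative assumptions."""
import base64
from openai import OpenAI

LABELS = ["move", "turn", "stop", "describe"]  # hypothetical label set
client = OpenAI()  # reads OPENAI_API_KEY from the environment


def classify_text(instruction: str) -> str:
    """Condition: the model sees the instruction text alone."""
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": (f"Classify this robot instruction as one of "
                        f"{LABELS}. Reply with the label only.\n\n{instruction}"),
        }],
    )
    return resp.choices[0].message.content.strip().lower()


def classify_multimodal(instruction: str, frame_path: str) -> str:
    """Condition: a vision-capable model also sees the synchronized frame."""
    with open(frame_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    resp = client.chat.completions.create(
        model="gpt-4o",  # assumed vision-capable variant
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": (f"Given the robot's camera view, classify the "
                          f"instruction as one of {LABELS}. Label only."
                          f"\n\n{instruction}")},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content.strip().lower()


def accuracy(records, predict) -> float:
    """records: dicts with 'instruction', 'frame', and gold 'label' keys."""
    return sum(predict(r) == r["label"] for r in records) / len(records)


# Compare both conditions on the same aligned test split:
# text_acc = accuracy(test_set, lambda r: classify_text(r["instruction"]))
# mm_acc = accuracy(test_set, lambda r: classify_multimodal(r["instruction"], r["frame"]))
```

Running both conditions over the same aligned split yields the paired accuracy numbers the paper compares, where the text-only condition proved the stronger of the two GPT-4 variants.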