Posts by Tags

Building AI agents using LangChain

less than 1 minute read

Published: September 27, 2025

LangChain

AI Agents: Buzzword or Paradigm Shift?

less than 1 minute read

Published: December 15, 2024

“AI agent”—a fancy buzzword for AI model integration, or is there more to it? With increasing autonomy and decision-making capabilities, are we just rebranding models, or is this a true paradigm shift?

What Automotive AI Taught Me About Responsible Deployment

9 minute read

Published: April 20, 2025

When you ship AI into a vehicle, the consequences of failure aren’t abstract. They’re not a bad user review or a dip in engagement metrics. They’re a driver who takes their eyes off the road because the system said something confidently wrong, or a voice assistant that crashes mid-navigation and leaves someone lost on an unfamiliar highway. Shipping AI where failures have physical consequences changes how you think about deployment — permanently.

Stop Training Your LLM Router: Zero-Shot Confidence Beats Supervised Baselines

5 minute read

Published: May 04, 2026

TL;DR: We tested whether you actually need labeled training data to route queries between a cheap local LLM and an expensive cloud model. You don’t. Average token log-probability — available for free from the first query — matches supervised routing in-distribution and crushes it when the query distribution shifts. We tested across 3 model families, 2 datasets, ~4,500 queries, and $123 in cloud costs. All code and data are open.

Things I got wrong building a confidence evaluator for local LLMs

10 minute read

Published: April 25, 2026

An in-progress lab-notebook post from the Autodidact project.

Best Student Paper Award at HOST 2020

less than 1 minute read

Published: May 04, 2020

I’m excited to share that our paper “A Novel Golden-Chip-Free Clustering Technique Using Backscattering Side Channel for Hardware Trojan Detection” won the Best Student Paper Award at the 2020 IEEE International Symposium on Hardware Oriented Security and Trust (HOST).

Upcoming Talk: Cybersecurity Lecture Series at Georgia Tech

less than 1 minute read

Published: January 15, 2020

I’m excited to announce that I will be giving a talk as part of the Cybersecurity Lecture Series at Georgia Tech on January 24, 2020.

Paul Nguyen

Posts by Tags

ai agents

ai-safety

applied-ai

artificial intelligence

autodidact

automotive

awards

calibration

cybersecurity

deployment

georgia tech

hardware security

hardware trojans

llm

machine learning

research

responsible-ai

routing

talks

technology trends