Darrell S. Best Jr.

01. About

Applied AI engineer building production ML and agentic systems end-to-end — model development, orchestration, evaluation, and deployment — for problems where accuracy, privacy, and reliability are non-negotiable.

  • 3+ years building agents and agentic systems — working hands-on with tool use, orchestration, and multi-step reasoning since the start of the modern agent era. This is where the field is going, and it’s where I spend most of my time.
  • 7+ years shipping applied AI: multilingual LLMs, federated learning, NLP-based classification, and applied transformer research.
  • Comfortable across the stack — training and fine-tuning in PyTorch / Hugging Face, distributed training with DeepSpeed, privacy-preserving training with Flower, and production deployment.
  • Experience spanning enterprise, healthcare, and defense domains — with a focus on systems that have to work in the real world, not just on a benchmark.
  • Published researcher and active graduate student at USC, working at the intersection of NLP, data science, and applied ML.

02. Work Experience

USC ISI 2024 – Present

MADEIRA

Role: Senior Research Engineer II — agent orchestration, agent memory, build automation.

Software debloating at scale is a multi-step problem: you have to build the code before you can prune it, and every prune risks breaking it. MADEIRA agents pull source from git, infer the build requirements, containerize the project in Docker, and drive the build loop themselves. They persist build failures to ECHO, an agent-memory tool I built, so agents learn from past mistakes instead of repeating them. Once the code builds, agents call debloating tools and get progressively more aggressive, backing off to the last working configuration the moment something breaks, then generate a full report of what was removed and why.
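The back-off behavior described above can be sketched roughly as follows. This is an illustrative simplification, not MADEIRA's actual code: the function name `last_working_level`, the integer "aggressiveness levels", the `build_ok` callback, and the in-memory `failure_log` (standing in for a persistent failure memory) are all hypothetical.

```python
from typing import Callable, List, Optional, Sequence


def last_working_level(
    levels: Sequence[int],
    build_ok: Callable[[int], bool],
    failure_log: List[str],
) -> Optional[int]:
    """Try debloating levels from least to most aggressive.

    Returns the last level whose build still worked, or None if even the
    first level broke the build. Failures are appended to failure_log.
    """
    last_good: Optional[int] = None
    for level in levels:
        if build_ok(level):
            last_good = level  # remember the most aggressive working level
        else:
            failure_log.append(f"level {level} broke the build")
            break  # back off: stop at the first breakage
    return last_good


# Hypothetical project whose build breaks once level 4 is reached.
log: List[str] = []
result = last_working_level([1, 2, 3, 4, 5], lambda lvl: lvl < 4, log)
print(result, log)
```

In this toy run the loop stops at level 4, records the failure, and reports level 3 as the last working configuration.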

Agentic AI, Agent Memory, Software Debloating, Docker

USC ISI 2019 – 2020

Hawkeye

Role: Research Engineer II — modeling, distributed training, deployment.

Cross-language communication is a hard problem when latency, domain vocabulary, and conversational fluency all matter at once. I designed and built a scalable multilingual chat system covering four foreign languages, fine-tuned transformer-based language models with DeepSpeed for distributed training, and delivered it as a real-time conversational surface usable in operational settings.

GPT-2, DeepSpeed, PyTorch, Hugging Face, NLP

USC ISI

Danube

Role: Senior Research Engineer I — federated training pipelines, anomaly handling.

Training across organizations that cannot share raw data is a real-world constraint in healthcare and defense. I engineered federated learning pipelines on Flower that train models across distributed nodes without centralizing sensitive data, and added anomaly-resistant aggregation so a single misbehaving participant cannot poison the global model.
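To make "anomaly-resistant aggregation" concrete, here is a minimal sketch of one common robust aggregator, the coordinate-wise median. The source does not say which aggregation rule Danube used, so treat the choice of median, the function name `coordinate_median`, and the toy update vectors as assumptions for illustration only. It uses plain Python rather than Flower's strategy API.

```python
from statistics import median
from typing import List, Sequence


def coordinate_median(updates: Sequence[Sequence[float]]) -> List[float]:
    """Aggregate client updates by taking the median of each coordinate."""
    if not updates:
        raise ValueError("need at least one client update")
    width = len(updates[0])
    if any(len(u) != width for u in updates):
        raise ValueError("all client updates must have the same length")
    # zip(*updates) groups the i-th value from every client together.
    return [median(column) for column in zip(*updates)]


# Four well-behaved clients and one (hypothetical) misbehaving participant.
client_updates = [
    [1.0, 2.0],
    [0.9, 2.1],
    [1.1, 1.9],
    [1.0, 2.0],
    [100.0, -100.0],  # outlier that would drag a plain average far off
]
aggregated = coordinate_median(client_updates)
print(aggregated)  # [1.0, 2.0]
```

A plain mean of the first coordinate would be 20.8 here, while the median stays at 1.0. That holds as long as misbehaving clients remain a minority.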

Flower, PyTorch, Federated Learning

USC ISI

Sonic Screwdriver

Role: Senior Research Engineer I — tokenization, modeling, evaluation.

FPGA bitstreams are not text, but their structure is recoverable. I designed a transformer-based error-correction system that treated bitstreams as a token stream, trained masked language models to repair corrupted regions, and opened a new path for applying NLP techniques to hardware recovery.
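As a rough illustration of treating a bitstream as a token stream, the sketch below maps each byte to a token id and masks a corrupted span so a masked language model could be asked to fill it in. The byte-level vocabulary, the reserved `MASK_ID`, and both helper names are assumptions made for this example. The source does not describe the actual tokenization scheme.

```python
from typing import List

MASK_ID = 256  # hypothetical id reserved for the mask token (bytes use 0..255)


def bytes_to_tokens(bitstream: bytes) -> List[int]:
    """Map each byte of the bitstream to one integer token id."""
    return list(bitstream)


def mask_span(tokens: List[int], start: int, length: int) -> List[int]:
    """Return a copy of tokens with tokens[start:start+length] masked out."""
    if start < 0 or length < 0 or start + length > len(tokens):
        raise ValueError("span must lie inside the token sequence")
    return tokens[:start] + [MASK_ID] * length + tokens[start + length:]


# Arbitrary example bytes; positions 1 and 2 are treated as corrupted.
tokens = bytes_to_tokens(bytes([0xAA, 0x99, 0x55, 0x66, 0x00]))
masked = mask_span(tokens, start=1, length=2)
print(masked)  # [170, 256, 256, 102, 0]
```

A trained model would then predict replacement ids for the masked positions. That prediction step is outside the scope of this sketch.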

Hugging Face, MLM, FPGA

USC ISI 2024 – 2026

HealthMap

Role: Senior Research Engineer II — agent design, classification pipeline, evaluation.

Unstructured clinical notes do not fit neatly into ICD diagnostic codes, and manual mapping is slow and error-prone. I built an agent-driven NLP pipeline that automates mapping from raw medical text to ICD codes — agents read the notes, reason over candidate codes, and produce the mapping — accelerating clinical workflows and cutting down on the manual review that dominates traditional coding.

Agentic AI, NLP, Healthcare AI, Text Classification

USC ISI

GreenSight

Role: Senior Research Engineer I — encoder design, labeled data strategy.

Understanding narratives at scale requires more than sentiment. I engineered text encoders that classify content against Schwartz’s 19 universal moral values, enabling researchers to study cultural and political narratives at a level of nuance that off-the-shelf sentiment tools cannot reach.

Hugging Face, Text Encoding, Values AI

QinetiQ US 2018 – 2022

HMDS & IGSR

Role: ML Engineer — model development for sensor and imagery pipelines.

Two high-stakes detection problems, two different sensor modalities. For HMDS I built IED detection models for Army Husky vehicles using ground-penetrating radar; for IGSR I developed ResNet-based computer vision models for border-crossing detection used by the FBI. Both systems had to work reliably in the field, not just on curated datasets.

Computer Vision, ResNet, Defense

Junior Software Engineer, Windstream — Greenville, SC

Sep 2017 – Mar 2018

Built end-to-end provisioning software (PUMA) for DSLAMs and network devices using multiple databases and remote connections.

Research Assistant, Clemson University — Clemson, SC

May 2014 – Feb 2018

Addressed the Midas touch problem in eye tracking (unintended activations triggered by ordinary looking) using natural eye gestures. Published "A Rotary Dial for Gaze-based PIN Entry" at ETRA 2016.

03. Education

MS in Computer Science

2024 – Present

A graduate CS program with AI as the common thread across every course — applied NLP, machine learning, data mining, information retrieval, and even database systems were all taught through an AI lens. Concentrated specifically on building, evaluating, and deploying AI systems.

  • University of Southern California, Viterbi School of Engineering
  • Focus: Data Science (AI-centered coursework)
  • Key areas: AI foundations, applied ML and NLP, data mining, information retrieval, algorithms, and AI-driven data systems.
Graduate coursework
  • CSCI 544 — Applied Natural Language Processing: Transformer-based language models, fine-tuning strategies, and end-to-end NLP pipelines built on modern LLMs.
  • DSCI 552 — Machine Learning for Data Science: Supervised and unsupervised ML, recommendation systems, and adaptive user interfaces, applied to real-world data.
  • DSCI 553 — Foundations and Applications of Data Mining: Large-scale data mining for AI-era datasets — frequent patterns, locality-sensitive hashing, clustering, link analysis, and streaming algorithms at web scale.
  • CSCI 561 — Foundations of Artificial Intelligence: Search, constraint satisfaction, probabilistic reasoning, planning, and game-playing agents — classical AI foundations that still underpin modern agentic systems.
  • CSCI 567 — Machine Learning: Supervised and unsupervised learning, kernel methods, ensembles, and deep learning — the algorithmic and mathematical core of modern ML.
  • CSCI 570 — Analysis of Algorithms: Dynamic programming, graph algorithms, NP-completeness, and complexity analysis — taught through algorithmic problems drawn from ML, optimization, and search.
  • CSCI 572 — Information Retrieval & Web Search Engines: Crawling, indexing, and ranking — from classical inverted indexes through AI/ML-powered modern search and neural retrieval.
  • CSCI 585 — Database Management Systems: Relational and NoSQL databases, query optimization, and distributed data — framed throughout around AI workloads and AI-integrated data systems.

BS in Computer Science

2012 – 2017

A four-year CS degree grounded in systems programming, algorithms, and software engineering — with graduate-level research electives in human-computer interaction and eye tracking through the School of Computing.

  • Clemson University, School of Computing
  • Minor: Philosophy
  • Key areas: systems programming, algorithms, operating systems, networks, software engineering, and graduate-level HCI and eye-tracking research.
Undergraduate coursework
  • CPSC 1010 — Introduction to Programming in C: Fundamentals of C — control flow, pointers, memory management, and basic data structures.
  • CPSC 1020 — Introduction to Programming in C++: Object-oriented programming in C++ — classes, inheritance, templates, and the STL.
  • CPSC 2120 — Algorithms & Data Structures: Lists, trees, heaps, graphs, and algorithmic analysis in C++.
  • CPSC 2150 — Software Development Foundations: Design patterns, interfaces, testing, and disciplined software construction.
  • CPSC 2310 — Computer Organization: Machine-level representation, assembly, memory hierarchy, and architecture fundamentals.
  • CPSC 3220 — Operating Systems: Processes, threads, scheduling, synchronization, memory management, and file systems.
  • CPSC 3500 — Foundations of Computer Science: Formal languages, automata, computability, and the theoretical underpinnings of computation.
  • CPSC 3520 — Programming Systems: Functional and logic programming paradigms, language design, and runtime systems.
  • CPSC 3600 — Networks & Network Programming: TCP/IP, socket programming, protocol design, and distributed communication.
  • CPSC 3720 — Software Engineering: Requirements, design, project management, and the full software development lifecycle.
  • CPSC 4620 — Computer Graphics: 3D rendering pipelines, shaders, geometric transformations, and interactive graphics programming.
  • CPSC 4140 / 6140 — Human-Computer Interaction (Prof. Andrew Duchowski): Graduate-level HCI — interaction design, user studies, and evaluation methodology. Taken as an undergraduate via the 4xx / 6xx slash-listing.
  • CPSC 4120 / 6120 — Eye Tracking Methodology (Prof. Andrew Duchowski): Gaze measurement, fixation and saccade analysis, and applied eye tracking for HCI research — direct foundation for my ETRA ’16 publication. Graduate slash-listed course taken at the undergraduate level.

04. Publications

A Rotary Dial for Gaze-based PIN Entry

Best, Darrell S. and Duchowski, Andrew T. (2016). In Proceedings of the Ninth Biennial ACM Symposium on Eye Tracking Research & Applications (ETRA '16), pages 69–76. ACM.

https://doi.org/10.1145/2857491.2857527


05. What I Build & Ship

Tools are easy to list. What actually matters is what I can build with them. These are the capabilities I bring to an applied AI team.

Applied LLMs & NLP

Fine-tuning, domain adaptation, and deployment of transformer-based models for classification, generation, and structured extraction on real-world data.

Distributed & privacy-preserving training

Multi-GPU training with DeepSpeed and federated learning with Flower — including anomaly-resistant aggregation for sensitive data across organizations.

Model development & evaluation

Data pipelines, training loops, experiment tracking, and evaluation harnesses that hold up under real-world distribution shift — not just on benchmarks.

Production deployment

Dockerized services, CI/CD, and Linux-first deployment patterns for shipping ML and agent systems into environments with real uptime constraints.

Applied research

Published work and applied research on non-obvious uses of transformer architectures — including NLP techniques for non-text domains like FPGA bitstreams.

Tools & Stack

Languages

Python, JavaScript, C++, SQL, Bash

AI / ML

PyTorch, Hugging Face, TensorFlow, DeepSpeed, Flower, scikit-learn

DevOps & Tools

Docker, Git, CI/CD, Linux

06. Contact

Open to senior applied AI, AI platform, and solutions architecture roles.

Also available for consulting on production AI workflows, federated learning, and applied NLP.