Founding AI Engineer

Taj Khunkhun

Transforming experimental prototypes into production-grade agentic AI systems. 8+ years in backend systems, 3+ years specializing in Multi-Agent Orchestration and Autonomous Reasoners.

LangGraph · CrewAI · AutoGen · MCP · RAG

What I Work With

Technical Skills

🤖

AI & Agents

LangChain · LangGraph · CrewAI · AutoGen · AutoGPT · MCP · Agentic RAG · GraphRAG · LangSmith · Multi-Agent Orchestration · Prompt Engineering

🧠

ML & Deep Learning

PyTorch · TensorFlow · Keras · scikit-learn · spaCy · NLP · LoRA / QLoRA · Reinforcement Learning · Neural Networks · MLOps / LLMOps

⚡

Languages & Frameworks

Python · Golang · TypeScript · JavaScript · Django · FastAPI · Flask · Node.js · Next.js · React · Angular

🗄️

Data & Infrastructure

Snowflake · Databricks · Spark · Kafka · Airflow · BigQuery · Neo4j · Redis · Pinecone · Elasticsearch · ChromaDB · PostgreSQL

☁️

Cloud & DevOps

AWS · Azure · GCP · Kubernetes · Docker · Terraform · CI/CD · Microsoft Fabric · Copilot Studio · Azure Data Lake

Where I've Worked

Experience

Lextar AI Legal Solutions, Inc.

Founding AI Engineer

Remote

Jan 2026 - Present

Lextar AI - Governance-Grade Legal Reasoning Platform (2026 - Present)

▸Architected a full-stack enterprise legal AI platform for Canadian and US immigration law using FastAPI, Next.js, PostgreSQL, SQLAlchemy, and ChromaDB for hybrid RAG.
▸Engineered an 11-step lawyer-faithful reasoning pipeline integrating DeepSeek LLM, hierarchical issue framing, conflict resolution, and strict grounding verification.
▸Built a hybrid GraphRAG retrieval system combining ChromaDB vector search with an in-memory knowledge graph (1,300+ legal nodes, 445 relationships) across 6,200+ ingested chunks.
▸Implemented multi-tenant SaaS with Clerk authentication, role-based access control, Stripe checkout, PDF invoices, and automated SMTP email receipts.
▸Designed superadmin dashboard with platform-wide analytics, user management, per-organization usage tracking, and real-time reasoning unit consumption monitoring.
▸Built real-time streaming legal analysis workspace with SSE, live reasoning trace visualization, and role-specific output formatting.
▸Implemented strict legal output governance including grounding verification, cross-reference resolution, and confidence-weighted scoring.

Hewlett Packard Enterprise

Senior AI Engineer | Agentic AI Engineer

Spring, Texas (Remote)

Jan 2023 - Dec 2025

HPE Private Cloud AI - NVIDIA AI Computing (2024 - 2025)

▸Designed multi-agent Plan-and-Execute architecture using LangGraph, reducing infinite loop failures by 35%.
▸Architected MCP server layer standardizing tool integration, reducing custom tool-binding code by 60%.
▸Reduced inference costs by 45% via Router Agent dynamically triaging between Llama 2 7B and GPT-4.
▸Implemented HITL checkpoint system for financial workflows with 0.85 confidence threshold.
▸Built observability dashboards using LangSmith and Arize Phoenix, resolving 10s+ latency bottlenecks.
▸Architected LLM evaluation framework using DeepEval across 5,000+ test cases.
▸Reduced costs by 52% and p95 latency by 300ms through semantic caching with Redis.
▸Developed Shadow Deployment pipeline for prompt engineering A/B tests on live traffic with 0% user impact.
▸Designed multi-agent collaboration framework with Planner, Executor, and Critic agents via structured message-passing.

HPE Ezmeral Unified Analytics & Data Fabric (2023 - 2024)

▸Eliminated catastrophic forgetting in Llama 2 by mixing 15% pre-training replay data.
▸Reduced training VRAM by 65% using QLoRA 4-bit quantization for 70B parameter models.
▸Improved inference throughput 4x via multi-LoRA serving (vLLM/LoRAX) for 10+ adapters.
▸Architected multi-region Kafka with sub-second cross-region replication.
▸Engineered tiered memory system preserving intent across 20+ agent handoffs.

HPE GreenLake Cloud & OpsRamp AIOps (2023 - 2025)

▸Led migration from a Django monolith to microservices running on Docker and Kubernetes.
▸Built high-concurrency FastAPI architecture handling 10k+ concurrent WebSocket connections.
▸Built MCP-compliant tool registry enabling dynamic tool discovery at runtime.
▸Architected Self-Correction loop for SQL agent, reducing syntax errors by 50%.
▸Built NER pipeline with Keras Bi-LSTM achieving 25% F1-score improvement.
▸Resolved critical GIL bottlenecks by refactoring to Python multiprocessing and a Celery/Redis worker cluster.
▸Engineered custom spaCy and Transformer models integrated with Fabric Lakehouse using lazy-loading across Spark clusters.

Adobe

Data & Machine Learning Engineer

San Jose, California

Aug 2019 - Jan 2023

Adobe Experience Platform Pipeline & Data Lake (2019 - 2022)

▸Architected ETL pipeline processing 5TB+ multi-modal data using Spark.
▸Eliminated vector-relational desync via CDC (Debezium + Kafka) with 99.9% consistency.
▸Engineered Blue-Green re-indexing for zero-downtime migrations across 50M+ vectors.
▸Optimized semantic search by 40% via hierarchical document indexing.
▸Reduced vector storage costs by $8k/month through tiered data strategy.

Adobe Sensei ML Framework & Content Intelligence (2020 - 2023)

▸Deployed Semantic Data Guard monitoring data drift with 15% deviation alerting.
▸Standardized AI Data Contracts across four teams enforcing GDPR/CCPA compliance.
▸Reduced inference latency by 65% via model distillation on NVIDIA A100 GPUs.
▸Built synthetic data engine using SDV and GPT-3.5, improving minority tasks by 18%.
▸Engineered Fail-Soft orchestration saving $15k/month in compute costs.
▸Solved multi-modal cold start problem with tiered embedding cache (Redis + Elasticsearch), reducing first-response time from 2.4s to 400ms.
▸Built programmatic A/B testing framework promoting models based on Ground Truth alignment with statistical significance tests.

Enterprise Work

Projects

Lextar AI - Governance-Grade Legal Reasoning Platform (2026 - Present)

Full-stack legal AI SaaS platform with hybrid GraphRAG retrieval (1,300+ node knowledge graph + vector search), 11-step structured legal reasoning pipeline, multi-tenant RBAC, real-time streaming analysis workspace, and governance-grade output verification for Canadian and US immigration law.

FastAPI · Next.js · PostgreSQL · ChromaDB · DeepSeek · GraphRAG · Clerk · Stripe · SSE

HPE Private Cloud AI & NVIDIA AI Computing (2024 - 2025)

Multi-agent Plan-and-Execute architectures, Router Agents, MCP server integrations, HITL checkpoints, semantic caching, shadow deployment pipelines, and LLM evaluation frameworks for HPE's enterprise AI platform.

LangGraph · MCP · NVIDIA NIM · RAG · Multi-Agent · LangSmith · DeepEval · GPT-4 · Llama 2

HPE Ezmeral Unified Analytics & Data Fabric (2023 - 2024)

QLoRA fine-tuning of LLMs on GPU clusters, multi-region Kafka replication, NER pipelines, multi-agent memory systems, and multi-LoRA inference serving on a Kubernetes-based ML platform.

Spark · Kafka · Kubernetes · spaCy · Keras · Bi-LSTM · PyTorch · QLoRA · vLLM

HPE GreenLake Cloud & OpsRamp AIOps (2023 - 2025)

Monolith-to-microservices migration, high-concurrency event-driven APIs, MCP-compliant tool registries, SQL-generating agents, and GIL/integration bottleneck resolution.

Django · FastAPI · Docker · Kubernetes · MCP · Agentic AI · Celery · Redis

Adobe Experience Platform & Sensei ML (2019 - 2023)

High-throughput ETL/search pipelines, CDC-based vector sync, embedding drift management, synthetic data generation, inference optimization, and A/B testing frameworks.

Spark · Kafka · Debezium · Pinecone · Elasticsearch · Grafana · GPT-3.5 · NVIDIA A100

Open Source

Side Projects

Personal projects exploring agentic AI patterns, multi-agent architectures, and intelligent automation.

Agentic AI Chat Analyzer

★ 1

AI-powered platform for analyzing agent chat transcripts. Performs exploratory data analysis, LLM-based summarization, and sentiment classification through an interactive Streamlit frontend.

▸Modular data pipeline (ingestion, cleaning, transformation)
▸EDA with word clouds and sentiment visualizations
▸Model caching for offline operation

FastAPI · Streamlit · HuggingFace · Flan-T5 · RoBERTa · Docker · Pandas

AI Recruiter

★ 2

Intelligent recruitment platform that automates candidate discovery by scanning GitHub profiles and Google Scholar to identify qualified AI/ML professionals with relevance scoring.

▸Multi-source profile analysis (GitHub + Google Scholar)
▸Relevance-based intelligence scoring for AI/ML skills
▸Geographic filtering and co-author extraction

Python · Flask · Web Scraping · Docker · NLP

AI Email Agent (Supervisor Mode)

Email automation system using supervisor-pattern multi-agent architecture that categorizes emails, generates RAG-powered responses, proofreads with AI, and sends replies via Gmail.

▸Supervisor pattern for dynamic agent coordination
▸RAG-powered response generation from knowledge base
▸AI proofreading layer before sending

LangChain · LangGraph · Groq · Llama 3.3 · ChromaDB · Gmail API · FastAPI

Simple Chatbot

Lightweight, rule-based chatbot with a Gradio UI that answers questions about healthcare automation agents using fuzzy string matching, with no external LLM calls required.

▸Weighted scoring: string similarity (60%) + keyword matching (40%)
▸Confidence threshold for answer selection
▸Graceful fallback responses listing available topics
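
The weighted scoring above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the `FAQ` entries, the `score` and `answer` helpers, and the 0.3 confidence threshold are all hypothetical, and Python's standard-library `difflib.SequenceMatcher` stands in for whichever fuzzy-matching method the project uses. Only the 60/40 weighting comes from the description.

```python
from difflib import SequenceMatcher

# Hypothetical knowledge base: topic -> (answer text, keywords).
FAQ = {
    "appointment scheduling agent": (
        "The scheduling agent books and reschedules patient appointments.",
        {"appointment", "schedule", "booking"},
    ),
    "claims processing agent": (
        "The claims agent validates and submits insurance claims.",
        {"claims", "insurance", "billing"},
    ),
}

SIMILARITY_WEIGHT = 0.6      # weighting stated in the project description
KEYWORD_WEIGHT = 0.4         # weighting stated in the project description
CONFIDENCE_THRESHOLD = 0.3   # assumed value, chosen only for illustration


def score(question: str, topic: str, keywords: set[str]) -> float:
    """Blend string similarity with the fraction of topic keywords present."""
    q = question.lower()
    similarity = SequenceMatcher(None, q, topic).ratio()
    words = set(q.replace("?", " ").split())
    keyword_hits = len(words & keywords) / len(keywords)
    return SIMILARITY_WEIGHT * similarity + KEYWORD_WEIGHT * keyword_hits


def answer(question: str) -> str:
    """Return the best-scoring answer, or a fallback listing known topics."""
    best_topic = max(FAQ, key=lambda t: score(question, t, FAQ[t][1]))
    answer_text, keywords = FAQ[best_topic]
    if score(question, best_topic, keywords) >= CONFIDENCE_THRESHOLD:
        return answer_text
    return "I can help with: " + ", ".join(sorted(FAQ))
```

Calling `answer("How does the claims processing agent handle insurance?")` selects the claims entry, while an unrelated input such as `answer("xyz")` falls below the threshold and returns the list of available topics.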

Python · Gradio · NLP · Fuzzy Matching

Academic Background

Education

Master of Computer Science

Santa Clara University

Aug 2019 - Jun 2020

Bachelor of Computer Science

Santa Clara University

Sep 2015 - Jun 2019

Let's Connect

Get in Touch

I'm always open to discussing new opportunities in Agentic AI, Multi-Agent Systems, and production ML engineering.

taj479505@gmail.com · +1 408 380 2726