Subhadip Mitra

Engineering Leader & AI Systems Architect

Senior Engineering Leader with 15+ years building the teams, frameworks, and systems that turn Data and AI from research to production. Currently Head of Data & Analytics for Google Cloud in Southeast Asia - a practice built from zero, delivering enterprise Data and AI transformation across 7 countries.

Dual track as "Player-Coach": leading petabyte-scale data platforms and multi-agent systems for Fortune 500 clients, while driving innovation through published research (5 technical disclosures, 8 published packages on PyPI and Maven Central, plus open-source AI safety tools including sandbagging detection and activation steering). Member of Google Cloud delta, architecting solutions at the intersection of applied AI and enterprise scale.

Last updated: February 2026

Note: This is a public version with certain details removed for privacy. For a comprehensive resume including specific project metrics and contact details, please reach out via email or LinkedIn.

Experience

Head of Data & Analytics, Southeast Asia | Site Lead, PSO Southeast Asia

Google Cloud - Professional Services Organization

Dual-track role combining technical innovation leadership with regional delivery management. Built Google Cloud's Data Analytics practice for Southeast Asia with delivery scope across JAPAC, while serving as Site Lead overseeing cross-practice operations in SEA. Member of delta - Google Cloud's innovation and transformation team architecting enterprise AI solutions at scale.

Strategic Leadership & Delivery

  • Practice & Regional Leadership: Built Data Analytics practice for Southeast Asia from 0 to 1 across 6 countries. Serve as Site Lead overseeing all 7 PSO practices' delivery in SEA, owning utilization and CSAT metrics (97%) and contributing to 100% annual revenue target attainment.
  • Enterprise Delivery: Delivered first-of-kind solutions including GenAI-powered reconciliation framework for a major airline (now replicated across JAPAC), large-scale ML platform migrations (30K+ notebooks), and petabyte-scale data platform modernizations for Fortune 500 clients across financial services, telcos, and consumer electronics.
  • Escalation & Recovery: Led cross-practice rescue operations for at-risk enterprise accounts with multi-million dollar project values, recovering strategic customers and converting potential platform exits into long-term partnerships.
  • Agentic AI Transformation: Pioneered agentic AI adoption across all 7 PSO practices and 6 JAPAC sub-regions. Built SDKs, agent catalog, architecture discovery tools (100M+ node graph modeling), automated pipeline generation, and governance frameworks that reduced delivery costs.
  • Data Strategy: Built Data Strategy competency from 0, delivering 8-figure pursuit value across 14 strategic pitches in Asia Pacific. Partner with C-level stakeholders (CTOs, CDOs) to define data modernization and AI transformation roadmaps.

Technical Innovation & Research

  • Published Research: 5 Google Technical Disclosures (UPIR, FTCS, ARTEMIS, rule generation for tiered systems, cost-benefit routing for risk systems) plus the ETLC whitepaper on context-first data processing for GenAI. 8 open-source packages published on PyPI and Maven Central.
  • CatchMe - Intelligent Trust Engine: Industry-agnostic agentic AI for enterprise trust decisions. Combines APLS self-learning with cascade routing, achieving 86% cost reduction and sub-50ms latency.

Principal Engineer - Data & Analytics Transformation

Standard Chartered Bank

Led design and development of the retail bank's data & analytics platform serving 11 markets, 100+ systems, and 1200+ users.

  • Developed self-service ML Workbench reducing model deployment time from months to weeks
  • Architected MarTech strategy driving 30% increase in customer acquisition through data-driven personalization
  • Created credit risk models covering 15,000+ named entities, leveraging news trends and social signals, reducing potential losses by $5M
  • Defined enterprise data strategy including third-party data governance, privacy frameworks, and cloud adoption roadmap

Principal Data Engineer / Solution Architect

Think Big Analytics (a Teradata company)

Architected enterprise-scale data solutions for Fortune 500 clients across APAC.

  • Designed 5 global data lakes with ETL pipelines handling 1.2 PB/hour and 40K daily files
  • Engineered real-time platform processing 2.5M events/second, improving ad campaign responsiveness by 80%
  • Built ML fraud detection system achieving 60% fewer false positives and 25% higher detection rates, resulting in $3M savings
  • Built and managed large-scale Hadoop clusters (300+ nodes) for banks and telcos across JAPAC

Software Engineering & Technical Leadership

Microsoft, Truckaurbus (Founder), UTU

Progressive advancement through software engineering, entrepreneurship, and technical leadership across systems development, marketplace platforms, and payments infrastructure.

  • Microsoft (2010-2014): Windows Kernel development (Windows 7/8, Server 2012 R2), Azure ML implementations, CDN architecture optimization
  • Truckaurbus (2014-2016): Founded B2B commercial vehicle marketplace - 15 cities, 25+ OEM/bank partnerships
  • UTU Singapore (2016-2017): Led the first technical development effort for Thailand; bank integration; payment/rewards systems for merchants

Research & Open Source

2025

Spark LLM Eval - Distributed Evaluation Framework

Distributed LLM evaluation framework built on Apache Spark for enterprise-scale model assessment. Addresses the gap in evaluating LLMs at scale with statistical rigor, integrating seamlessly with Databricks infrastructure.

2025 - Present

LLM Inference Efficiency Research

Research implementations addressing the fundamental bottleneck in LLM inference: memory-bandwidth constraints rather than compute limits. Explores acceleration through speculative decoding, custom GPU kernels, and quantization strategies.

2025 - Present

AI Metacognition Toolkit

Activation-level detection of sandbagging, deception, and situational awareness in LLMs. Linear probes achieve 90-96% accuracy across Mistral, Gemma, and Qwen models. Published on PyPI.

2025

Steering Vectors for Agent Behavior Control

Runtime control of LLM agent behaviors through activation steering vectors - modifying model outputs at inference time without retraining. Demonstrates more calibrated control than traditional prompting approaches, and includes LangChain integration.

Education

MBA, Business Analytics

Birla Institute of Technology and Science, Pilani

Financial Risk Analytics, Marketing Models, Strategic Management, Predictive Analytics, Operations Management

MTech, Software Systems

Birla Institute of Technology and Science, Pilani

Algorithms, Distributed Systems, Deep Learning, NLP, Machine Learning, Artificial Intelligence

Publications & Technical Disclosures

Technical Expertise

Technology Leadership & Strategy

Enterprise Architecture, Digital Transformation, AI & Data Strategy, C-Suite Advisory, Innovation Leadership, Strategic Planning

Data Engineering & Architecture

Data Pipelines, Real-Time Processing, Data Mesh & Fabric, Data Governance, Apache Spark, Delta Lake, Apache Kafka, Apache Iceberg

Generative AI & Machine Learning

Multi-Agent Systems, Large Language Models, RAG Architecture, Vector Databases, PyTorch, LangChain, LangGraph, LlamaIndex, Google ADK, MCP, A2A Protocol, MLflow, LLMOps

Cloud Platforms & Infrastructure

Google Cloud Platform, BigQuery, Vertex AI, Dataproc, Cloud Composer, GKE, Terraform, Kubernetes

Programming & Development

Python, SQL, Scala, Triton, CUDA, Algorithm Design, Formal Verification, Program Synthesis, Distributed Systems

Notable Projects

2024 - Present

LLMConsent

Privacy-preserving consent management protocol for LLM training data - enabling transparent opt-in/opt-out mechanisms with cryptographic verification.

  • Decentralized consent registry on public blockchain
  • Cryptographic proof of consent status
  • Real-time opt-out enforcement for AI training
  • GDPR-compliant privacy controls for GenAI era

2022 - 2023

Open Location Proof Protocol

A privacy-aware open protocol for non-repudiable location verification in physical or virtual spaces.

  • Cryptographically secure yet privacy-preserving protocol
  • Fully decentralized architecture resistant to tampering
  • Published comprehensive specifications for industry adoption

2021 - 2022

OConsent - Open Consent Protocol

Production implementation of the OConsent research protocol - a working system for managing user consent and privacy on public blockchains.

  • Full-stack implementation with live deployment at oconsent.io
  • Smart contract suite for on-chain consent management
  • GDPR-compliant with automated audit capabilities

2024 - 2025

SMPP Core

Modern Java 21 SMPP protocol implementation with virtual threads for high-performance SMS messaging.

  • 1.8M PDU decodes/sec, 1.5M encodes/sec, 25K network round-trips/sec
  • Complete SMPP 3.3, 3.4, 5.0 support with modular architecture
  • Published on Maven Central

2016 - 2025

ISO8583 Simulator

High-performance financial message processing tool for ISO 8583, used by banks and payment processors.

  • 180K+ TPS message parsing with Cython optimization
  • Multi-network support (VISA, Mastercard, AMEX, Discover, JCB, UnionPay)
  • Published on PyPI with AI-powered test generation