Open Source Projects
Building frameworks and protocols that advance AI safety, privacy engineering, and distributed systems
Production-Ready Frameworks
UPIR: Universal Plan Intermediate Representation
Framework for automated synthesis and verification of distributed systems - bridging the gap between high-level specifications and provably correct implementations.
- Compositional verification with proof caching (274x speedup)
- CEGIS-based synthesis for automated implementation generation
- Constrained RL with PPO for optimization under correctness guarantees
- First framework to unify synthesis, verification, and optimization
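As a rough illustration of the CEGIS (counterexample-guided inductive synthesis) idea named above, the sketch below alternates between a synthesizer that proposes a candidate consistent with the counterexamples seen so far and a verifier that either accepts it or returns a new counterexample. The toy specification, the candidate space, and every function name here are assumptions made for illustration. This is not UPIR's API, and UPIR's real synthesis targets distributed-system plans rather than linear functions.

```python
from itertools import product

# Toy specification (an assumption for this sketch): f(x) must equal 2*x + 3.
def spec(x):
    return 2 * x + 3

DOMAIN = range(-10, 11)                             # inputs the verifier checks
CANDIDATES = list(product(range(-5, 6), repeat=2))  # (a, b) for f(x) = a*x + b

def synthesize(examples):
    """Return the first candidate that agrees with every recorded example."""
    for a, b in CANDIDATES:
        if all(a * x + b == y for x, y in examples):
            return a, b
    return None

def verify(candidate):
    """Return a counterexample input, or None if the candidate meets the spec."""
    a, b = candidate
    for x in DOMAIN:
        if a * x + b != spec(x):
            return x
    return None

def cegis():
    examples = []  # grows by one counterexample per failed candidate
    while True:
        candidate = synthesize(examples)
        if candidate is None:
            return None, examples       # nothing in the search space fits
        counterexample = verify(candidate)
        if counterexample is None:
            return candidate, examples  # verified against the whole domain
        examples.append((counterexample, spec(counterexample)))

result, examples = cegis()
print(result, examples)
```

The point of the loop is that the synthesizer only ever reasons about a small set of concrete examples, while the expensive full check stays inside the verifier.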
AI Metacognition Toolkit
Activation-level detection of sandbagging, deception, and situational awareness in LLMs. Linear probes achieve 90-96% accuracy across Mistral, Gemma, and Qwen models. Includes steering vectors for runtime behavior control.
- Sandbagging detection via linear probes with 90-96% accuracy
- Steering vectors reduce sandbagging behavior by 20%
- Bayesian situational awareness detection with KL divergence
- 275 tests, 95% code coverage, production-hardened
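A linear probe in this sense is a linear classifier trained on hidden activations. The sketch below trains a logistic-regression probe on synthetic 4-dimensional "activations" using only the standard library. The data, the dimensionality, the hyperparameters, and the function names are all assumptions for illustration. It uses no real model activations, it is not the toolkit's code, and its accuracy says nothing about the 90-96% figure above.

```python
import math
import random

def train_probe(xs, ys, lr=0.5, epochs=200):
    """Fit logistic regression with full-batch gradient descent."""
    dim = len(xs[0])
    w, b, n = [0.0] * dim, 0.0, len(xs)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(xs, ys):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))  # predicted P(label = 1)
            err = p - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Synthetic stand-in for activations: label 1 clusters around +1, label 0 around -1.
rng = random.Random(0)

def sample(label, dim=4):
    center = 1.0 if label == 1 else -1.0
    return [rng.gauss(center, 0.3) for _ in range(dim)]

labels = [i % 2 for i in range(200)]
acts = [sample(y) for y in labels]
train_x, train_y = acts[:150], labels[:150]
test_x, test_y = acts[150:], labels[150:]

w, b = train_probe(train_x, train_y)
accuracy = sum(predict(w, b, x) == y for x, y in zip(test_x, test_y)) / len(test_y)
print(f"held-out accuracy: {accuracy:.2f}")
```

On real models the inputs would be activations captured at a chosen layer, and held-out accuracy would need to be checked against a control task to rule out probes that merely memorize.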
ARTEMIS: Multi-Agent Debate Framework
ARTEMIS (Adaptive Reasoning Through Evaluation of Multi-agent Intelligent Systems) is a production-ready multi-agent debate framework. It orchestrates structured debates between AI agents with hierarchical argument generation, jury-based evaluation, and integrated safety monitoring.
- H-L-DAG: Hierarchical argument synthesis at strategic, tactical, and operational levels
- L-AE-CR: Adaptive evaluation with causal reasoning and jury scoring mechanism
- Built-in safety: sandbagging detection, deception monitoring, behavioral drift tracking
- Native support for reasoning models (o1, R1, Gemini 2.5) and multimodal evidence
Speculative Decoding
Reference implementation of LLM inference acceleration techniques - achieving faster generation through speculative decoding, tree speculation, EAGLE, Medusa, KV-cache compression, and diffusion efficiency optimizations.
- 1.10x speedup with exact target distribution guarantee
- Six major techniques: speculative decoding, tree speculation, EAGLE, Medusa, KV-cache compression, and diffusion efficiency optimizations
- KV-cache compression with H2O, sliding window, and INT8/INT4 quantization
- Production-quality code with comprehensive benchmarks and interactive demos
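The "exact target distribution guarantee" comes from the accept/reject rule of speculative sampling: a token drafted from the small model's distribution q is accepted with probability min(1, p/q), and on rejection a replacement is drawn from the normalized residual max(0, p - q). The sketch below shows a single-token step and checks analytically that the output distribution equals the target p. The toy distributions and function names are assumptions for illustration, and a real implementation drafts several tokens per step and works on model logits.

```python
import random

def residual(p, q):
    """Normalized max(0, p - q): where to resample after a rejection."""
    raw = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
    total = sum(raw)
    return [r / total for r in raw] if total > 0 else list(p)

def speculative_step(p, q, rng):
    """Draft one token from q, then accept it or resample so the result follows p."""
    x = rng.choices(range(len(q)), weights=q)[0]
    if rng.random() < min(1.0, p[x] / q[x]):  # q[x] > 0 because x was drawn from q
        return x, True
    r = residual(p, q)
    return rng.choices(range(len(r)), weights=r)[0], False

def output_distribution(p, q):
    """Exact distribution of the token returned by speculative_step."""
    accepted = [min(pi, qi) for pi, qi in zip(p, q)]  # q(x) * min(1, p(x)/q(x))
    reject_mass = 1.0 - sum(accepted)
    r = residual(p, q)
    return [a + reject_mass * ri for a, ri in zip(accepted, r)]

p = [0.6, 0.3, 0.1]  # target model distribution (toy values)
q = [0.3, 0.5, 0.2]  # draft model distribution (toy values)
out = output_distribution(p, q)
token, was_accepted = speculative_step(p, q, random.Random(0))
print(out, token, was_accepted)
```

Because the accepted mass min(p, q) plus the residual mass always sums back to p, the speedup depends only on how often drafts are accepted, not on any change to output quality.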
Triton Kernels for LLM Inference
High-performance GPU kernels for LLM inference operations using OpenAI Triton. Educational implementations demonstrating memory-bandwidth optimization techniques for transformer operations on A100 GPUs.
- RMSNorm kernel: 8.1x faster than PyTorch, reaching 1365 GB/s (88% of A100 peak memory bandwidth)
- Fused RMSNorm+Residual: 6.0x speedup through operation fusion
- SwiGLU activation: 1.6x improvement with custom kernel
- INT8 GEMM: 2x memory savings through weight quantization
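For readers unfamiliar with the operations being fused, here is a plain-Python reference of the RMSNorm math and of the fused residual-add variant. It is only a sketch of what the kernels compute, not Triton code, and the function names and the default `eps` are assumptions. The motivation for fusion on a GPU is that the intermediate sum `x + residual` is normalized while still on chip instead of being written to memory and read back.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """y_i = x_i / sqrt(mean(x^2) + eps) * weight_i"""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]

def fused_add_rms_norm(x, residual, weight, eps=1e-6):
    """Add the residual, then normalize the sum in the same pass.

    Returns both the updated residual stream and the normalized output,
    which is what a fused kernel would write back to memory.
    """
    h = [a + b for a, b in zip(x, residual)]
    return h, rms_norm(h, weight, eps)

y = rms_norm([3.0, 4.0], [1.0, 1.0], eps=0.0)
h, y_fused = fused_add_rms_norm([1.0, 2.0], [2.0, 2.0], [1.0, 1.0], eps=0.0)
print(y, h, y_fused)
```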
Steering Vectors for Agent Behavior Control
Runtime control of LLM agent behaviors through activation steering vectors - modifying model outputs at inference time without retraining. Demonstrates more calibrated control than traditional prompting approaches.
- Contrastive Activation Addition (CAA) for steering vector extraction
- Dynamic strength adjustment per-request for behavior intensity control
- Multi-vector composition with interference mitigation
- LangChain integration for production deployment
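Contrastive Activation Addition can be summarized in two steps: take the difference between the mean activation on prompts that show a behavior and the mean on prompts that do not, then add a scaled copy of that vector to the hidden state at inference time. The sketch below shows both steps on toy 2-dimensional vectors, plus a naive sum for composing several vectors. The values and function names are assumptions for illustration, and the naive sum does not include the interference mitigation mentioned above.

```python
def mean_vector(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def extract_steering_vector(positive_acts, negative_acts):
    """CAA: mean activation with the behavior minus mean activation without it."""
    pos = mean_vector(positive_acts)
    neg = mean_vector(negative_acts)
    return [p - n for p, n in zip(pos, neg)]

def steer(hidden, vector, strength):
    """Add the scaled steering vector to a hidden state."""
    return [h + strength * v for h, v in zip(hidden, vector)]

def compose(vectors_with_strengths, dim):
    """Naive composition: weighted sum of several steering vectors."""
    total = [0.0] * dim
    for vector, strength in vectors_with_strengths:
        total = [t + strength * v for t, v in zip(total, vector)]
    return total

# Toy activations standing in for one layer's residual stream.
positive = [[1.0, 2.0], [3.0, 4.0]]
negative = [[0.0, 1.0], [2.0, 1.0]]

vector = extract_steering_vector(positive, negative)
steered = steer([0.0, 0.0], vector, strength=0.5)
combined = compose([(vector, 0.5), ([1.0, 0.0], 2.0)], dim=2)
print(vector, steered, combined)
```

Per-request strength adjustment then amounts to choosing `strength` at call time, with negative values pushing away from the behavior.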
Spark LLM Eval
Distributed LLM evaluation framework built on Apache Spark for enterprise-scale model assessment. Handles millions of examples with statistical rigor, integrating seamlessly with Databricks infrastructure.
- Pandas UDFs with Arrow for efficient distributed batching
- Bootstrap confidence intervals and statistical significance testing
- Multi-provider support: OpenAI, Anthropic Claude, Google Gemini
- LLM-as-judge evaluation patterns with agent trajectory support
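To make the bootstrap confidence interval concrete, the sketch below computes a percentile bootstrap interval for mean accuracy on a single machine with the standard library. The per-example scores, the resample count, and the function name are assumptions for illustration. It does not use Spark, and the framework may use a different bootstrap variant.

```python
import random

def bootstrap_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `scores`."""
    rng = random.Random(seed)
    n = len(scores)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        sum(rng.choices(scores, k=n)) / n for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Toy per-example correctness: 80 correct answers out of 100.
scores = [1] * 80 + [0] * 20
point_estimate = sum(scores) / len(scores)
lower, upper = bootstrap_ci(scores)
print(f"accuracy {point_estimate:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

Reporting the interval alongside the point estimate is what lets two models be compared without over-reading a difference that falls inside the resampling noise.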
LLMConsent
Privacy-preserving consent management protocol for LLM training data - enabling transparent opt-in/opt-out mechanisms with cryptographic verification.
- Decentralized consent registry on public blockchain
- Cryptographic proof of consent status
- Real-time opt-out enforcement for AI training
- GDPR-compliant privacy controls for the GenAI era
OConsent Protocol
Open-source blockchain-based protocol for transparent personal data consent management - enabling granular control and tamper-proof audit trails.
- Published research paper (2021, BITS Pilani)
- Transparent consent lifecycle management
- GDPR-compliant with automated compliance reporting
- Real-time privacy breach alerts
Open Location Proof Protocol
Privacy-aware open protocol for non-repudiable location verification in physical or virtual spaces - with cryptographic proof and decentralized architecture.
- Cryptographically secure location attestation
- Privacy-preserving proof mechanisms
- Fully decentralized, tamper-resistant
- Complete technical specifications published
SMPP Gateway
Modern Java 21 implementation of the SMPP protocol - the actively maintained replacement for Cloudhopper. Built on Netty with virtual threads for high-performance SMS messaging at scale.
- Java 21 virtual threads, records, and sealed interfaces for clean APIs
- 1.8M PDU decodes/sec, 1.5M encodes/sec, 25K network round-trips/sec
- Complete SMPP 3.3, 3.4, and 5.0 protocol support
- Modular architecture: core, netty transport, server, client, metrics
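For context on what a "PDU decode" involves: every SMPP PDU starts with a 16-byte header of four big-endian 32-bit integers (command_length, command_id, command_status, sequence_number), and responses set the top bit of command_id. The sketch below decodes and encodes that header. It is written in Python purely as a language-neutral illustration of the wire format. The class and function names are assumptions and do not reflect the Java library's API, and only a few command IDs are listed.

```python
import struct
from dataclasses import dataclass

HEADER = struct.Struct(">IIII")  # four big-endian unsigned 32-bit integers
RESPONSE_BIT = 0x80000000
COMMAND_NAMES = {                # small subset of SMPP command IDs
    0x00000004: "submit_sm",
    0x00000005: "deliver_sm",
    0x00000015: "enquire_link",
}

@dataclass(frozen=True)
class PduHeader:
    command_length: int
    command_id: int
    command_status: int
    sequence_number: int

    @property
    def is_response(self):
        return bool(self.command_id & RESPONSE_BIT)

    @property
    def name(self):
        base = COMMAND_NAMES.get(self.command_id & ~RESPONSE_BIT, "unknown")
        return base + "_resp" if self.is_response else base

def decode_header(data):
    if len(data) < HEADER.size:
        raise ValueError("need at least 16 bytes for an SMPP header")
    header = PduHeader(*HEADER.unpack_from(data))
    if header.command_length < HEADER.size:
        raise ValueError("command_length is smaller than the header itself")
    return header

def encode_header(header):
    return HEADER.pack(
        header.command_length,
        header.command_id,
        header.command_status,
        header.sequence_number,
    )

# An enquire_link_resp with sequence number 42 (header only, so length 16).
raw = bytes.fromhex("00000010" "80000015" "00000000" "0000002a")
header = decode_header(raw)
print(header.name, header.sequence_number)
```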
SMPP Kafka Producer
Production-ready bridge between SMPP protocol and Apache Kafka - receives SMS messages via SMPP and publishes to Kafka topics. Features HTTP/2 REST API aligned with 3GPP TS 29.540 SMSF standards for 5G compatibility.
- Dual protocol support: SMPP 3.x/5.x and HTTP/2 REST API
- Java 21 virtual threads for high-throughput async processing
- Prometheus metrics and health check endpoints
- Cloud-native: Docker, Kubernetes manifests, and Helm charts
ISO8583 Simulator
High-performance financial message processing tool for ISO 8583 payment protocol. Features CLI, Python SDK, and AI-powered message explanation. Supports VISA, Mastercard, AMEX, Discover, JCB, and UnionPay networks.
- 180k+ TPS message parsing with Cython optimization
- LLM integration: explain messages in plain English, generate from natural language
- Multi-provider AI: OpenAI, Anthropic, Google, local Ollama
- Native EMV/chip card data support (Field 55)
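Much of ISO 8583 parsing hinges on the bitmap, which says which data elements follow: bit 1 is the most significant bit of the first byte, and when bit 1 is set a secondary bitmap extends the range to field 128. The sketch below converts between a hex bitmap and a list of field numbers. The function names are assumptions for illustration and are not the simulator's SDK, and real messages also need per-field length and encoding rules that are omitted here.

```python
def parse_bitmap(hex_bitmap):
    """Return the field numbers whose bits are set in a hex-encoded bitmap."""
    raw = bytes.fromhex(hex_bitmap)
    if len(raw) not in (8, 16):
        raise ValueError("expected an 8-byte primary or 16-byte extended bitmap")
    fields = []
    for byte_index, byte in enumerate(raw):
        for bit in range(8):
            if byte & (0x80 >> bit):  # bit 0 here is the most significant bit
                fields.append(byte_index * 8 + bit + 1)
    return fields

def build_bitmap(fields):
    """Build a hex bitmap, adding the secondary bitmap when a field exceeds 64."""
    present = set(fields)
    if any(f > 64 for f in present):
        present.add(1)  # field 1 flags that a secondary bitmap follows
    size = 16 if 1 in present else 8
    raw = bytearray(size)
    for f in present:
        if not 1 <= f <= size * 8:
            raise ValueError(f"field {f} is out of range")
        raw[(f - 1) // 8] |= 0x80 >> ((f - 1) % 8)
    return raw.hex().upper()

primary = build_bitmap([2, 3, 4, 11, 55])  # includes Field 55 (EMV data)
extended = build_bitmap([2, 70])
print(primary, parse_bitmap(primary))
print(extended, parse_bitmap(extended))
```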
Want to Contribute?
These projects are open source and welcome contributions. Whether you want to report issues, suggest features, or submit pull requests - your input helps advance the field.