-
Engineering Autonomous Multi-Agent Systems - A Technical Deep Dive into Telecom Customer Service
January 05, 2025
Note: This blog post covers generative AI / autonomous agents. For a traditional software agent system view, refer to the Banking/FSI post here.
-
Engineering Multi-Agent Systems - A Retail Banking Case Study
December 28, 2024
Note: this blog post covers traditional software agents and doesn’t cover generative AI or autonomous agents. For a GenAI Agents System design case study, refer to this post.
-
ETLC 2.0 - Building Context-Aware Data Pipelines
December 07, 2024
This blog post proposes a novel concept - ETL-C, a context-first approach for building Data / AI platforms in the Generative AI-dominant era. Read the prior post here.
-
The End of Data Warehouses? Enter the Age of Dynamic Context Engines
November 18, 2024
Before diving in, you might find it helpful to explore some foundational ideas around ETLC (Extract, Transform, Load, Contextualize) that I’ve previously discussed:
- Adaptive Contexts and Contextual Joins
- The Context-First Paradigm
- ETLC 2.0 - Building Context-Aware Data Pipelines
These posts explore how context reshapes data pipelines and lay the groundwork for understanding the transformative potential of the Dynamic Context Engines discussed here.
-
(Part 3/3) - Reimagining ETL with Large Language Models—The Path to Intelligent Pipelines
October 20, 2024
Introduction: A New Era of ETL
-
Data Pipelines Gone Wild - 10 WTF Moments That'll Make You Rethink Your Architecture
August 02, 2024
If you’ve ever stared at a cryptic error message in the middle of the night, muttering “WTF is happening with this pipeline?”, then buckle up. You’re about to embark on a wild ride through the data engineering twilight zone, where we’ll dissect ten mind-blowing pipeline snafus that’ll make you question everything you thought you knew. From ancient code that refuses to die to algorithms with a mind of their own, this ain’t your grandma’s data plumbing.
-
Introducing ETL-C (Extract, Transform, Load, Contextualize) - a new data processing paradigm
May 04, 2024
This blog post proposes a novel concept - ETL-C, a context-first approach for building Data / AI platforms in the Generative AI-dominant era. More discussions to follow.
-
(Part 2/3) Rethinking ETLs - How Large Language Models (LLM) can enhance Data Transformation and Integration
April 20, 2024
Part 2: Exploring examples and optimization goals
-
(Part 1/3) Rethinking ETLs - How Large Language Models (LLM) can enhance Data Transformation and Integration
April 15, 2024
Part 1: Searching for an Optimal Algorithm for ETL planning
-
Who Needs Exact Answers Anyway? The Joy of Approximate Big Data
January 16, 2024
The explosion of big data has created an insatiable demand for analytical insights. However, traditional computational methods often struggle to keep up with the sheer volume and velocity of data in many real-world applications. This is where approximation techniques offer a lifeline — trading a small degree of accuracy for a significant boost in processing speed and efficiency.
-
Evolutionary Bytes - Harnessing Genetic Algorithms for Smarter Data Platforms (Part 2/2)
December 29, 2023
In part 1 of this series, we explored the power of genetic algorithms in shaping data platforms and powering e-commerce personalization. Now, we’ll take a more platform-specific technical turn. Let’s uncover how genetic algorithms revolutionize database query optimization, leading to lightning-fast responses and efficient resource usage.
-
Evolutionary Bytes - Harnessing Genetic Algorithms for Smarter Data Platforms (Part 1/2)
December 25, 2023
Genetically-Inspired Data Platforms leverage the principles of genetic algorithms (GAs), a class of evolutionary algorithms, to solve optimization and search problems through mechanisms inspired by natural selection and genetics. These platforms can be highly effective in environments where the solution space is large, complex, and not well-understood. Integrating such algorithms into data platforms allows for dynamic optimization and adaptation of data management processes, including data organization, indexing, query optimization, and more.
-
Quantum vs. Classical - Data Management Computational Complexity
December 10, 2023
In the ever-evolving landscape of data management, the distinction between quantum and classical computing is becoming increasingly significant. Traditional methods of searching and processing vast amounts of data are being challenged by the advent of quantum algorithms, which promise to drastically improve efficiency and performance. Among these quantum innovations, Grover’s Algorithm stands out as a revolutionary development in the field of quantum search efficiency.
-
Quantum Experiment Data Exchange (QEDX) - Building an Interoperability Standard
November 20, 2023
In this post, we will design the foundations for an interoperability standard for our Quantum Data Management (QDM) Platform. Read more about interoperability in QDM here.
-
Data at Quantum Speed - The Promise and Potential of QDP
October 28, 2023
For a comprehensive understanding of Quantum Data Platforms, read this post alongside the posts on Quantum vs. Classical Data Management Complexity and Quantum Data Exchange, which delve deeper into the associated complexities and interactions.
-
The Next Frontier - Envisioning the Future of Data Platforms Beyond Data Mesh, Data Lakehouse, and Data Hub/Fabric
October 12, 2023
In the rapidly evolving landscape of data management, the progression from traditional data warehouses to more innovative structures like Data Mesh, Data Lakehouse, and Data Hub has marked significant milestones in how businesses handle and leverage their data. As we peer into the future, it’s clear that the next evolution of data platforms is on the horizon, promising even more robust capabilities and revolutionary approaches to data architecture. The following are some conceptual directions and potential innovations that could define the next generation of data platforms, including an exciting integration of concepts inspired by genetic algorithms.
-
Part 4 - Building a Massive-Scale Real-Time Data Platform - Memory Management with Apache Ignite
December 05, 2022
In Parts 1-3, we explored our system architecture, data partitioning, and memory management. Today, we’ll dive deep into how we optimized Apache Kafka to handle 2.5 million events per second while ensuring reliable message delivery and processing.
-
Part 3 - Building a Massive-Scale Real-Time Data Platform - Memory Management with Apache Ignite
November 27, 2022
In Parts 1 and 2, we explored our system architecture and data partitioning strategies. Today, we’ll dive deep into how we managed memory using Apache Ignite to handle 2.5 million events per second while maintaining sub-millisecond response times.
-
Part 2 - Building a Massive-Scale Real-Time Data Platform - Data Partitioning and Flow
November 18, 2022
In Part 1 of this series, we introduced our telecommunications data platform that processes 2.5 million events per second and handles 350GB of DPI data every 15 minutes. Today, we’ll dive deep into how we designed and implemented the data partitioning strategy and managed the massive data flow through the system.
-
Part 1 - Building a Massive-Scale Real-Time Data Platform - System Overview and Architecture
November 12, 2022
In the telecommunications industry, the ability to process and analyze data in real-time can mean the difference between proactive customer service and missed opportunities. This blog series details our journey in building a real-time data platform capable of processing 2.5 million events per second and handling 350GB of Deep Packet Inspection (DPI) data every 15 minutes for a major telecommunications provider.
-
Overcoming Synchronization Hurdles in Cellular Network Positioning
April 22, 2022
In the rapidly evolving landscape of cellular technologies, precision in network synchronization isn’t just a technical requirement—it’s the backbone of effective positioning and navigation services. Time Difference of Arrival (TDOA) methods, encompassing E-OTD, OTDOA, and U-TDOA, are at the forefront of this technology but face substantial challenges due to synchronization requirements.
-
Reimagining System Design: Balancing Time-Tested Principles with Modern Innovations
January 16, 2021
In the ever-evolving landscape of technology, the principles that once defined robust system design are now being revisited and revised to address new challenges and harness emerging opportunities. From Amazon’s pioneering strategies that effectively shaped the cloud services realm to the latest frameworks emphasizing sustainability and ethics, the journey of system design principles reflects a broader narrative of adaptation and foresight. This blog delves into the timeless wisdom of Amazon’s early system design principles, juxtaposed with newly proposed concepts tailored for today’s digital and societal demands. We’ll explore how these frameworks not only coexist but also complement each other in driving innovation and ensuring technology serves broader human and environmental needs.
-
Designing a Real Time Data Processing System
January 16, 2021
Designing and developing large-scale distributed real-time data processing systems is a complex undertaking. These systems must handle massive volumes of data, often with strict latency requirements, while maintaining reliability and scalability. Over the years, I’ve encountered various challenges and developed a set of principles to guide the design process. These principles have evolved alongside changing usage patterns and the increasing demands placed on these systems. This guide aims to share these insights, covering key considerations in various stages of the design and implementation process.
-
Introducing OConsent — Open Consent Protocol
December 10, 2020
In today’s connected world, websites, mobile apps, and IoT devices collect a large volume of users’ personally identifiable activity data. This collected data is used for various purposes such as analytics, marketing, and personalisation of services. Data is gathered through site cookies, device-ID tracking, embedded JavaScript, and pixels, to name a few. Much of this tracking and use of collected data happens behind the scenes and is not apparent to an average user. Consequently, many countries and regions have enacted legislation (e.g. GDPR in the EU) that allows users to control their personal data, be informed about its processing, and consent to it in a comprehensible and user-friendly manner.
-
Reading List
January 02, 1970
last updated: 2024-05-09
-
Welcome to my blog
January 01, 1970
Welcome to my blog - binary breakthroughs!