February 2024 - Present

Senior Software Engineer (SDE 3)
Mosaic Digital

Gurugram, India (Hybrid)

Role Overview

As a Senior Software Engineer (SDE 3) at Mosaic Digital, I lead the development of scalable data processing systems, microservice architectures, and AI-driven solutions. My work focuses on building high-performance systems that handle millions of data points reliably and efficiently.

Key Achievements & Projects

Apache Airflow Web Scraping System

Created a comprehensive web scraping system to download financial filings from the Ministry of Corporate Affairs portal for 4M+ companies.

  • Implemented rate-limiting and proxy rotation to handle large-scale scraping operations
  • Built auto-retry mechanisms for reliable data ingestion
  • Designed scalable architecture using Apache Airflow for workflow orchestration
  • Processed and stored millions of financial documents with proper indexing
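As an illustration of the rate-limiting and auto-retry pattern listed above, here is a minimal Python sketch. It is not the production code: the names RateLimiter and fetch_with_retry, the default attempt count, and the delay values are all hypothetical, and the Airflow wiring and proxy rotation are left out.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimiter:
    """Hypothetical helper: allow at most one call every `min_interval` seconds."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def wait(self) -> None:
        """Block until enough time has passed since the previous call."""
        if self._last_call is not None:
            remaining = self.min_interval - (self._clock() - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
        self._last_call = self._clock()


def fetch_with_retry(fetch: Callable[[], T],
                     limiter: RateLimiter,
                     max_attempts: int = 4,
                     base_delay: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `fetch`, retrying failures with exponential backoff.

    The delay doubles after each failed attempt (base_delay, 2x, 4x, ...).
    The last failure is re-raised once `max_attempts` is exhausted.
    """
    for attempt in range(max_attempts):
        limiter.wait()  # respect the request rate before every attempt
        try:
            return fetch()
        except Exception:  # assumption: any exception is treated as retryable
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
    raise ValueError("max_attempts must be at least 1")
```

In a sketch like this, `fetch` would wrap a single download request, so the same retry policy can be reused for every filing; a real system would likely retry only on specific network or HTTP errors rather than on every exception.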

Query Engine with Cube.js

Architected a flexible Query Engine using Cube.js and a Java-based Orchestrator Service to enable multi-platform integrations.

  • Drove an API-as-a-Service model that generated INR 50 lakh in new revenue
  • Enabled flexible querying across multiple data sources
  • Built Java-based orchestrator for service coordination
  • Integrated with various analytics platforms and BI tools

High-Performance Spring Boot Microservice

Developed a Spring Boot microservice with advanced aggregation pipelines and full-text search capabilities.

  • Efficiently processed aggregations over 2-3 million documents across multiple collections
  • Implemented proper MongoDB indexes for optimal query performance
  • Built full-text search functionality with relevance scoring
  • Designed RESTful APIs with comprehensive error handling

Database Migration Project

Led the migration of 200+ unindexed MySQL tables to MongoDB with enhanced schemas and real-time monitoring.

  • Designed enhanced MongoDB schemas for better performance
  • Integrated Kafka-based pipeline for real-time data change monitoring
  • Migrated diverse data sources with zero downtime
  • Improved query performance by 10x through proper indexing

AI-Driven Business Categorization Service

Built an AI service for business model categorization using RAG and the Brave Search API.

  • Implemented Retrieval-Augmented Generation (RAG) for accurate categorization
  • Integrated Brave Search API for real-time company information extraction
  • Created taxonomies for business model differentiation
  • Achieved 95% accuracy in automated business categorization
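To show the general shape of a retrieve-then-generate flow like the one described above, here is a toy Python sketch. It is an assumption-heavy illustration, not the actual service: the snippets stand in for search results, retrieval uses simple bag-of-words cosine similarity instead of embeddings, and `llm` is a hypothetical callable supplied by the caller. All function names are invented for this example.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, snippets: List[str], k: int = 2) -> List[str]:
    """Return the k snippets most similar to the query (retrieval step)."""
    q = Counter(tokenize(query))
    ranked = sorted(snippets,
                    key=lambda s: cosine(q, Counter(tokenize(s))),
                    reverse=True)
    return ranked[:k]


def build_prompt(company: str, context: List[str], categories: List[str]) -> str:
    """Combine retrieved context and the taxonomy into one prompt (augmentation step)."""
    lines = "\n".join(f"- {s}" for s in context)
    return (f"Company: {company}\n"
            f"Context:\n{lines}\n"
            f"Choose exactly one category from: {', '.join(categories)}")


def categorize(company: str, query: str, snippets: List[str],
               categories: List[str], llm: Callable[[str], str]) -> str:
    """Retrieve context, build the prompt, and delegate to the model (generation step)."""
    context = retrieve(query, snippets)
    return llm(build_prompt(company, context, categories))
```

Keeping the model behind a plain callable means the retrieval and prompt-building steps can be checked with a stub before any real model or search API is connected.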

Technical Stack

Backend

  • Java & Spring Boot
  • Apache Airflow
  • Cube.js
  • Python for AI/ML

Database & Infrastructure

  • MongoDB
  • MySQL
  • Apache Kafka
  • Docker & Kubernetes

AI/ML

  • Retrieval-Augmented Generation (RAG)
  • Large Language Models
  • Brave Search API
  • Natural Language Processing

DevOps

  • CI/CD Pipelines
  • Monitoring & Logging
  • Cloud Infrastructure
  • Performance Optimization

Impact & Results

  • Generated INR 50 lakh in new revenue through an API-as-a-Service model
  • Improved data processing speed by 300% through optimized microservices
  • Reduced query response time from seconds to milliseconds
  • Successfully migrated 200+ tables with 100% data integrity
  • Achieved 95% accuracy in AI-driven business categorization