Senior Software Engineer (SDE 3)
Mosaic Digital
Gurugram, India (Hybrid)
Role Overview
As a Senior Software Engineer (SDE 3) at Mosaic Digital, I lead the development of scalable data processing systems, microservice architectures, and AI-driven solutions. My work focuses on building high-performance systems that process millions of records reliably and efficiently.
Key Achievements & Projects
Apache Airflow Web Scraping System
Built a web scraping system that downloads financial filings from the Ministry of Corporate Affairs portal for 4M+ companies.
- Implemented rate-limiting and proxy rotation to handle large-scale scraping operations
- Built auto-retry mechanisms for reliable data ingestion
- Designed scalable architecture using Apache Airflow for workflow orchestration
- Processed and stored millions of financial documents with proper indexing
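The retry and proxy rotation points above can be illustrated with a minimal sketch. It is not the production code: the function name `fetch_with_retry`, its parameters, and the exponential backoff policy are assumptions made for illustration, and the actual download is passed in as a callable so the sketch stays independent of any HTTP library.

```python
import itertools
import time
from typing import Callable, Optional, Sequence


def fetch_with_retry(
    fetch: Callable[[str, str], bytes],
    url: str,
    proxies: Sequence[str],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call fetch(url, proxy) until it succeeds or attempts run out.

    Hypothetical helper: each attempt uses the next proxy in the list,
    and the wait between attempts doubles (1s, 2s, 4s, ...).
    """
    if not proxies:
        raise ValueError("at least one proxy is required")
    proxy_cycle = itertools.cycle(proxies)
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        proxy = next(proxy_cycle)  # rotate to the next proxy
        try:
            return fetch(url, proxy)
        except Exception as exc:  # a real scraper would catch narrower errors
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(base_delay * (2 ** attempt))  # exponential backoff
    raise RuntimeError(f"all {max_attempts} attempts failed for {url}") from last_error
```

In an Airflow setup, a helper like this would typically be called from inside a task, with Airflow's own task-level retries acting as a second safety net.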
Query Engine with Cube.js
Architected a flexible Query Engine using Cube.js and a Java-based orchestrator service to enable multi-platform integrations.
- Drove an API-as-a-Service model that generated INR 50 lakh in new revenue
- Enabled flexible querying across multiple data sources
- Built the Java-based orchestrator for service coordination
- Integrated with various analytics platforms and BI tools
High-Performance Spring Boot Microservice
Developed a Spring Boot microservice with advanced aggregation pipelines and full-text search capabilities.
- Ran aggregations over 2-3 million documents across multiple collections
- Added MongoDB indexes matched to the service's query patterns
- Built full-text search functionality with relevance scoring
- Designed RESTful APIs with comprehensive error handling
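As a rough illustration of full-text search with relevance scoring in MongoDB, the sketch below builds an aggregation pipeline as plain Python data. It is written in Python for brevity rather than in the service's Java, and the function name `build_search_pipeline` and the `score` field are assumptions. The `$text`, `$search`, and `$meta: "textScore"` operators are standard MongoDB; the pipeline only works on a collection that has a text index, and `$text` has to appear in the first stage.

```python
from typing import Any, Dict, List


def build_search_pipeline(query: str, limit: int = 20) -> List[Dict[str, Any]]:
    """Return an aggregation pipeline that ranks matches by text relevance.

    Hypothetical helper: the result can be passed to a driver's
    aggregate() call on a collection that has a text index.
    """
    return [
        # Full-text match; must be the first stage of the pipeline.
        {"$match": {"$text": {"$search": query}}},
        # Order results by MongoDB's computed relevance score.
        {"$sort": {"score": {"$meta": "textScore"}}},
        {"$limit": limit},
        # Expose the score on each returned document.
        {"$addFields": {"score": {"$meta": "textScore"}}},
    ]
```

Keeping the pipeline as data makes it easy to unit test without a running database.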
Database Migration Project
Led the migration of 200+ unindexed MySQL tables to MongoDB with enhanced schemas and real-time monitoring.
- Designed enhanced MongoDB schemas for better performance
- Integrated a Kafka-based pipeline for real-time monitoring of data changes
- Migrated diverse data sources with zero downtime
- Improved query performance by 10x through proper indexing
AI-Driven Business Categorization Service
Built an AI service for business model categorization using RAG and Brave Search API.
- Implemented Retrieval-Augmented Generation (RAG) for accurate categorization
- Integrated Brave Search API for real-time company information extraction
- Created taxonomies for business model differentiation
- Achieved 95% accuracy in automated business categorization
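The retrieval step of a RAG flow like the one above can be sketched as follows. This is a simplified stand-in, not the service itself: it ranks snippets with bag-of-words cosine similarity instead of learned embeddings, the snippets are assumed to have been fetched already (for example from a search API), and the names `retrieve` and `build_prompt` are hypothetical. The resulting prompt would then be sent to a language model, which is outside this sketch.

```python
import math
import re
from collections import Counter
from typing import List


def _vectorize(text: str) -> Counter:
    """Lowercased word counts; a stand-in for a real embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, snippets: List[str], k: int = 2) -> List[str]:
    """Return the k snippets most similar to the query."""
    q = _vectorize(query)
    ranked = sorted(snippets, key=lambda s: _cosine(q, _vectorize(s)), reverse=True)
    return ranked[:k]


def build_prompt(company: str, categories: List[str], snippets: List[str]) -> str:
    """Assemble a categorization prompt grounded in the retrieved snippets."""
    query = f"{company} business model"
    context = "\n".join(f"- {s}" for s in retrieve(query, snippets))
    options = ", ".join(categories)
    return (
        f"Context about {company}:\n{context}\n\n"
        f"Choose the best-fitting business model from: {options}."
    )
```

Grounding the prompt in retrieved text, rather than asking the model about the company name alone, is what lets the categorization draw on current information about the company.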
Technical Stack
Backend
- Java & Spring Boot
- Apache Airflow
- Cube.js
- Python for AI/ML
Database & Infrastructure
- MongoDB
- MySQL
- Apache Kafka
- Docker & Kubernetes
AI/ML
- Retrieval-Augmented Generation (RAG)
- Large Language Models
- Brave Search API
- Natural Language Processing
DevOps
- CI/CD Pipelines
- Monitoring & Logging
- Cloud Infrastructure
- Performance Optimization
Impact & Results
- Generated INR 50 lakh in new revenue through the API-as-a-Service model
- Improved data processing speed by 300% through optimized microservices
- Reduced query response time from seconds to milliseconds
- Successfully migrated 200+ tables with 100% data integrity
- Achieved 95% accuracy in AI-driven business categorization