Building intelligent data systems at the intersection of ML, cloud engineering, and real-time pipelines. Currently researching LLMs & RAG for rare disease diagnosis at Auburn University.
I'm a Data Engineering master's candidate at Auburn University, where I also serve as a Graduate Teaching Assistant for Software Engineering, Web Development, and Data Structures courses.
My research sits at the frontier of LLMs and Retrieval-Augmented Generation applied to rare disease diagnosis, where clean data pipelines meet meaningful human impact.
From building serverless AWS pipelines with Terraform and Step Functions to training deep learning gesture recognition models that reach 99% training accuracy, I thrive at the intersection of data engineering and applied ML.
Deep learning pipeline on the HaGRID 30K dataset achieving 99% training accuracy with ResNet, MobileNetV2, EfficientNet, and ConvNeXt. Real-time inference pipeline with OpenCV and MediaPipe.
Fully automated serverless data pipeline on AWS using Terraform IaC, ingesting YouTube statistics through Bronze/Silver/Gold data lake layers in S3. Streaming-ready with Kinesis Data Streams.
Relational SQL data models with complex CTEs and window functions for scalable analytical reporting. Python-driven ETL with data integrity validation and interactive BI dashboards for revenue insights.
Live from github.com/NIKHILSAI9390
Open to Data Engineering, Data Science, Analytics, and ML roles. Let's connect.
nzr0066@auburn.edu nikhilsairaishetty2026@gmail.com