Open to opportunities

Data Engineer building production SaaS with AI

I ship end-to-end data platforms, from ETL pipelines to AI-powered analytics tools. I currently run 3 live SaaS products in production.

  • 3 Live SaaS Products
  • 7 GitHub Projects
  • 60+ Features Shipped
  • 14+ ML Models Integrated

Production SaaS Platforms

Three fully deployed products built from scratch, from backend to frontend to AI integration

Data Engineering SaaS

DataPulse AI

Intelligent data quality monitoring with AI-powered SQL chat, multi-database support, and enterprise security.

  • AI chat — ask questions in plain English, get SQL + results
  • 14+ quality checks with auto health scores
  • Multi-tenant with JWT auth + encryption
  • PostgreSQL / MySQL / Snowflake support
  • Groq (cloud) or Ollama (local) LLM options
  • Scheduled checks, email alerts, audit logs
FastAPI · Streamlit · PostgreSQL · JWT · Groq LLM · Railway
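As a rough illustration of how a set of quality checks might roll up into a single health score, here is a minimal sketch. It is not DataPulse AI's actual implementation: the `CheckResult` type, the `null_rate_check` helper, the 5% threshold, and the weighting scheme are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Outcome of one quality check (hypothetical structure)."""
    name: str
    passed: bool
    weight: float = 1.0  # assumed: more important checks carry more weight


def null_rate_check(rows: list[dict], column: str, max_rate: float = 0.05) -> CheckResult:
    """Pass when the share of missing values in `column` is at most `max_rate`."""
    name = f"null_rate:{column}"
    if not rows:
        # An empty table has no nulls to report, so treat it as passing.
        return CheckResult(name, passed=True)
    nulls = sum(1 for row in rows if row.get(column) is None)
    return CheckResult(name, passed=nulls / len(rows) <= max_rate)


def health_score(results: list[CheckResult]) -> float:
    """Weighted share of passed checks, scaled to 0-100."""
    total = sum(r.weight for r in results)
    if total == 0:
        return 0.0
    passed = sum(r.weight for r in results if r.passed)
    return round(100 * passed / total, 1)


if __name__ == "__main__":
    rows = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": None}]
    results = [null_rate_check(rows, "id"), null_rate_check(rows, "email")]
    print(health_score(results))
```

A weighted pass rate is only one possible design; a real monitor might also factor in severity levels or trends over time.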
Data Science Workbench

DataLab AI

No-code data science platform with 21 features — from EDA to Auto-ML to SHAP explanations, all in one dashboard.

  • Upload any CSV → trained ML model in 5 min
  • Auto-ML trains 14+ models, picks the winner
  • SHAP explanations + ROC + feature selection
  • Clustering (K-Means, DBSCAN), PCA, t-SNE
  • Time series forecasting (Prophet + ARIMA)
  • Export as Jupyter notebook or .pkl model
Streamlit · scikit-learn · SHAP · Prophet · Plotly · Pandas

Data Analytics Platform

DataInsight AI

Automated data analyst — upload any dataset and get complete analysis with statistical tests, visualizations, and PDF reports.

  • Full EDA: skewness, kurtosis, outliers, correlations
  • Statistical tests: t-test, ANOVA, chi-square, regression
  • 25+ chart types (Matplotlib + Seaborn + Plotly)
  • AI explains every metric in plain English
  • Hypothesis testing with auto-test selection
  • Professional PDF report with recommendations
Streamlit · SciPy · Matplotlib · Seaborn · Groq LLM · FPDF2
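To make "auto-test selection" concrete, the sketch below shows one simple rule-based way to choose among the four tests listed above from the types of the two variables. The function name, the `"numeric"` / `"categorical"` labels, and the rules themselves are assumptions for illustration, not the product's actual logic; a real selector would likely also check assumptions such as normality and sample size.

```python
def select_test(outcome_kind: str, predictor_kind: str, n_groups: int = 0) -> str:
    """Pick a statistical test from variable kinds (hypothetical rule set).

    Kinds are "numeric" or "categorical". `n_groups` is the number of
    distinct categories in the predictor when it is categorical.
    """
    if outcome_kind == "numeric" and predictor_kind == "categorical":
        if n_groups < 2:
            raise ValueError("need at least two groups to compare")
        # Two groups: compare means with a t-test; more than two: ANOVA.
        return "t-test" if n_groups == 2 else "ANOVA"
    if outcome_kind == "categorical" and predictor_kind == "categorical":
        # Two categorical variables: test independence.
        return "chi-square"
    if outcome_kind == "numeric" and predictor_kind == "numeric":
        # Two numeric variables: model the relationship.
        return "regression"
    raise ValueError(f"no rule for {outcome_kind!r} vs {predictor_kind!r}")


if __name__ == "__main__":
    print(select_test("numeric", "categorical", n_groups=3))
```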

Data Engineering Projects

ETL pipelines, analytics platforms, and data processing systems

NYC Taxi ETL Pipeline

Scalable big data ETL using PySpark and Delta Lake to process millions of NYC taxi trip records.

PySpark · Delta Lake · Python
View on GitHub →

E-Commerce Analytics Platform

Full analytics pipeline on 1M+ records with star schema, RFM analysis, and complex window functions.

PostgreSQL · SQL · Python
View on GitHub →
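For readers unfamiliar with RFM analysis, the sketch below computes the three raw values per customer: recency (days since the last order), frequency (order count), and monetary (total spend). It is a plain-Python illustration under assumed inputs, not code from the linked project; the `rfm_table` name and the `(customer_id, order_date, amount)` tuple layout are hypothetical.

```python
from collections import defaultdict
from datetime import date


def rfm_table(orders, as_of: date) -> dict:
    """Aggregate (customer_id, order_date, amount) tuples into RFM values."""
    agg = defaultdict(lambda: {"last": None, "frequency": 0, "monetary": 0.0})
    for customer_id, order_date, amount in orders:
        row = agg[customer_id]
        # Track the most recent order date seen for this customer.
        if row["last"] is None or order_date > row["last"]:
            row["last"] = order_date
        row["frequency"] += 1
        row["monetary"] += amount
    return {
        customer_id: {
            "recency_days": (as_of - row["last"]).days,
            "frequency": row["frequency"],
            "monetary": row["monetary"],
        }
        for customer_id, row in agg.items()
    }


if __name__ == "__main__":
    orders = [
        ("a", date(2024, 1, 1), 10.0),
        ("a", date(2024, 1, 10), 5.0),
        ("b", date(2024, 1, 5), 20.0),
    ]
    for customer, values in rfm_table(orders, as_of=date(2024, 1, 11)).items():
        print(customer, values)
```

In a SQL pipeline, the same aggregation is typically a GROUP BY per customer, and the values are often bucketed into scores with a window function such as NTILE.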

Keyword Analytics Pipeline

Automated data pipeline for keyword trend analysis with scheduled processing and visualization.

Python · APIs · Analytics
View on GitHub →

Weather Analytics Pipeline

Real-time weather data ingestion from APIs with transformation, storage, and trend analysis.

Python · REST APIs · Pandas
View on GitHub →

Data Engineer with a builder's mindset

I'm Aswin, a Data Engineer from Chennai, India. I build production data pipelines by day and ship SaaS products that solve real data problems by night.

I care about end-to-end ownership — designing the schema, writing the ETL, deploying the API, building the UI, and making sure users can actually get value from data.

  • Location: Chennai, India
  • Role: Data Engineer
  • Focus: Python, ETL, AI
  • Status: Open to work

From data to decisions

Data pipelines that handle millions of rows, clean messy data, and transform it into analytics-ready schemas using PySpark, Python, and SQL.

Analytics platforms with star schemas, window functions, and performance optimization on PostgreSQL and Snowflake.

AI-powered apps that let non-technical users query databases in plain English, build ML models without writing code, and monitor data quality automatically.

Production SaaS with authentication, encryption, multi-tenancy, scheduled jobs, and real deployments on Railway + Streamlit Cloud.

Skills & Technologies

The tools I use to ship data products

Data Engineering

  • Python
  • PySpark
  • ETL / ELT
  • Airflow
  • dbt
  • Data Modeling
  • Star Schema
  • Delta Lake

Databases & Storage

  • PostgreSQL
  • MySQL
  • Snowflake
  • Advanced SQL
  • Window Functions
  • Indexing
  • Query Tuning

Backend & APIs

  • FastAPI
  • SQLAlchemy
  • REST APIs
  • JWT Auth
  • Pydantic
  • Encryption
  • Rate Limiting

Machine Learning

  • scikit-learn
  • SHAP
  • Prophet
  • Auto-ML
  • Feature Engineering
  • Clustering
  • Classification
  • Regression

AI & LLMs

  • Groq API
  • Ollama
  • Llama 3.3
  • Prompt Engineering
  • Text-to-SQL
  • RAG Patterns

Deployment & DevOps

  • Railway
  • Streamlit Cloud
  • Git / GitHub
  • Docker Basics
  • CI/CD
  • Environment Variables

Certifications

Verified courses and programs I've completed

Great Lakes

Post Graduate Program in Data Science and Engineering

May 2021 · Verify →

Microsoft × Great Learning

Master Data Analytics in Excel

January 2026 · Verify →

Great Learning

Master Generative AI

March 2026 · Verify →

Let's build something together

I'm open to Data Engineer, ML Engineer, and full-stack data roles. Feel free to reach out — I usually reply within a day.