Data Scientist · ML Engineer
Open to Full-Time · 2026
🥇 GOLD MEDALIST

ATHARVA JOSHI

3.73 GPA
MS Data Science
SUNY Buffalo
🥇 Gold Medalist · B.E. E&TC
♞ Chess 2000
01 · Selected Works

Projects That Ship

~/finreg-ml · python3
HuggingFace Demo
PROJECT_01 / 05 · FLAGSHIP
finreg-ml
Regulation-aware ML pipeline for finance. GovernedModel, SHAP explainability, fairness audits, EU AI Act compliance, drift detection (KS+PSI), consolidated reports. Published on PyPI.
46 Tests Passing · v0.2.0 PyPI Published · 10 Modules · MIT Open Source
Python · scikit-learn · SHAP · PyPI · GitHub Actions · FastAPI
# finreg-ml v0.2.0
$ pip install finreg-ml
Successfully installed finreg-ml-0.2.0
$ python -c "from finreg import GovernedModel; print(GovernedModel.__doc__)"
EU AI Act compliant ML pipeline with SHAP explainability, fairness audits, drift detection (KS + PSI), and auto-reports.
Supports: scikit-learn estimators | 10 modules | 46 tests
$ pytest tests/ -ra
✓ 46 passed in 4.31s
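The two drift checks named for finreg-ml (KS + PSI) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not finreg-ml's actual API: the function names `psi` and `ks_statistic`, the 10 equal-width bins, and the `eps` floor are hypothetical choices made for this example.

```python
import math
from bisect import bisect_right


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference and a current sample.

    Assumption: equal-width bins over the reference sample's range. Values
    in the current sample outside that range are clamped into the first or
    last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant reference

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between ECDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


if __name__ == "__main__":
    reference = [i / 100 for i in range(100)]
    shifted = [x + 0.5 for x in reference]
    print("PSI:", psi(reference, shifted))
    print("KS :", ks_statistic(reference, shifted))
```

PSI is 0 for identical samples and grows as the current sample moves away from the reference, while the KS statistic stays between 0 and 1. Any alerting thresholds would be a separate design choice.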
~/stocksense · typescript+python
Live Dashboard
PROJECT_02 / 05 · LIVE DEMO
stocksense
End-to-end demand forecasting and inventory health platform. HGBT + SARIMAX with walk-forward backtesting, Supabase Postgres backing realtime dashboard, PySpark feature parity. Next.js + Recharts UI deployed on Vercel.
22% RMSE Lift · 31 Tests Passing · 12 × 2 SKU Panels · Live Realtime
Python · PySpark · Next.js · Supabase · Vercel
# stocksense v0.1.0
$ python -m stocksense.run
→ Walk-forward CV across 24 panels · 3 folds · 14d horizon
→ HGBT MAPE 21.9% vs Seasonal Naive 27.4%
→ Selected HGBT on every panel · bias near zero
✓ 31 tests · Pandas/PySpark parity verified
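The walk-forward evaluation in the log above (3 folds, 14-day horizon, MAPE against a seasonal naive baseline) could look roughly like the sketch below. The expanding training window, the 7-day season, and the names `walk_forward_splits`, `seasonal_naive`, and `mape` are assumptions for illustration. They are not taken from the stocksense codebase.

```python
def walk_forward_splits(n_obs, n_folds=3, horizon=14):
    """Yield (train_indices, test_indices) for expanding-window walk-forward CV.

    The last n_folds * horizon observations are reserved for testing. Each
    fold trains on everything before its test window, so no future data
    leaks into training.
    """
    first_test_start = n_obs - n_folds * horizon
    if first_test_start <= 0:
        raise ValueError("not enough observations for the requested folds")
    for fold in range(n_folds):
        test_start = first_test_start + fold * horizon
        yield range(test_start), range(test_start, test_start + horizon)


def seasonal_naive(history, horizon, season=7):
    """Baseline forecast: repeat the value observed one season earlier.

    Assumption: a weekly season (7 observations) and len(history) >= season.
    """
    start = len(history) - season
    return [history[start + (h % season)] for h in range(horizon)]


def mape(actual, forecast):
    """Mean absolute percentage error in percent, skipping zero actuals."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    return 100.0 * sum(abs((a - f) / a) for a, f in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical daily demand with a weekly pattern, for demonstration only.
    series = [100 + 10 * (day % 7) for day in range(120)]
    for train_idx, test_idx in walk_forward_splits(len(series)):
        history = [series[i] for i in train_idx]
        actual = [series[i] for i in test_idx]
        forecast = seasonal_naive(history, len(actual))
        print(len(history), "train obs, MAPE:", mape(actual, forecast))
```

A candidate model would be scored on the same folds and compared with the baseline MAPE, which is how a per-panel model selection like the one in the log could be made.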
~/clarify · typescript
AI Agent
PROJECT_03 / 05 · LIVE DEMO
clarify
Production LLM agent that turns a free-text brief into a validated BA artifact pack. Multi-step reasoning with a self-correcting clarification loop, typed IDs, RACI matrix, traceability. Llama 3.3 70B fallback chain via OpenRouter.
LLM Agent · Tool Calling
Next.js · AI SDK · OpenRouter · Vercel
# clarify
$ # input: "vendor invoice approval tool"
→ Agent asks 6 clarifying questions before assuming
→ Restates assumptions, then ships artifact pack
→ Typed IDs (BR/SR/FR/NFR/TR/TC) · RTM · RACI register
✓ First-draft turnaround: ~6 hours → under 5 minutes
~/agenteval · python3
CLI Framework
PROJECT_04 / 05
agenteval
AI agent evaluation framework with AgentRunner, LLM-as-judge scoring (OpenAI + Anthropic), CLI, safety checks for PII & prompt injection, multi-format export.
LLM-as-judge · Safety Checks
Python · OpenAI · Anthropic · pydantic · CLI
# agenteval v0.2.0
$ agenteval run --agent my_agent.py --suite evals/suite.json
→ Loading 11 modules · 62 tests passing
→ LLM Judge: claude-sonnet-4 (Anthropic)
→ Safety: PII scan ✓ · Injection scan ✓
→ Export: JSON · CSV · Markdown
✓ Report saved: eval_results.json
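A safety pass like the PII and injection scans listed above might, in its simplest form, be a pattern check over each agent response. The sketch below is illustrative only: `scan_output`, the two regexes, and the marker phrases are hypothetical and far narrower than what a production scanner needs. It does not describe how agenteval implements its checks.

```python
import re

# Illustrative patterns only; real PII detection needs much broader coverage.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

# Hypothetical phrases that often appear in prompt-injection attempts.
INJECTION_MARKERS = (
    "ignore previous instructions",
    "disregard the system prompt",
)


def scan_output(text):
    """Return only the non-empty findings for one agent response."""
    findings = {name: pat.findall(text) for name, pat in PII_PATTERNS.items()}
    lowered = text.lower()
    findings["injection"] = [m for m in INJECTION_MARKERS if m in lowered]
    return {kind: hits for kind, hits in findings.items() if hits}


if __name__ == "__main__":
    print(scan_output("Contact jane@example.com, SSN 123-45-6789"))
    print(scan_output("All clear"))
```

An empty dict means the response passed both checks. Anything else lists what matched, which is the kind of record that can be exported alongside the judge scores.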
~/crypto-stat-arb · python3
Backtest Engine
PROJECT_05 / 05
crypto-stat-arb
Statistical arbitrage engine. Engle-Granger cointegration, Johansen basket trading, Kalman filter hedge ratios, walk-forward backtesting, regime detection, paper trading via Kraken API.
Kraken API · Kalman Filter
Python · statsmodels · scipy · pandas
# crypto-stat-arb v0.1.0
$ python -m cryptoarb.backtest --pair BTC-ETH --window 90d
→ Engle-Granger cointegration test
→ Kalman filter hedge ratio (rolling)
→ Walk-forward backtest · regime detection
→ Market neutral confirmed (BTC corr ≈ 0.03)
✓ 107 tests · paper trading via Kraken API
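The Kalman filter hedge ratio step can be illustrated with a scalar filter that treats the ratio as a slowly drifting state. This is a simplified sketch, not the engine's actual code: it assumes no intercept, a random-walk hedge ratio, and hand-picked noise variances. The name `kalman_hedge_ratio` and its parameters are hypothetical.

```python
def kalman_hedge_ratio(x, y, process_var=1e-5, obs_var=1e-3):
    """Track a time-varying hedge ratio beta_t in y_t = beta_t * x_t + noise.

    Assumptions: beta follows a random walk with variance process_var, and
    obs_var is the measurement noise variance. Returns the filtered beta at
    every step.
    """
    beta, p = 0.0, 1.0  # initial state estimate and its variance
    betas = []
    for xt, yt in zip(x, y):
        p += process_var               # predict: random-walk state
        innovation = yt - beta * xt    # forecast error, i.e. the spread
        s = xt * p * xt + obs_var      # innovation variance
        gain = p * xt / s              # Kalman gain
        beta += gain * innovation      # update state estimate
        p *= 1.0 - gain * xt           # update state variance
        betas.append(beta)
    return betas


if __name__ == "__main__":
    # Hypothetical prices where y is exactly twice x, for demonstration only.
    x = [1.0 + 0.01 * i for i in range(200)]
    y = [2.0 * v for v in x]
    print("final hedge ratio:", kalman_hedge_ratio(x, y)[-1])
```

The spread `y_t - beta_t * x_t` is what a pairs strategy would then test for mean reversion. A larger `process_var` lets the ratio adapt faster at the cost of a noisier estimate.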
02 · Open Source DNA

Building in Public

100k+ COMBINED GITHUB STARS ACROSS CONTRIBUTIONS
Merged Pull Requests
#549 · TauricResearch/TradingAgents ⭐ 50.8k · Unicode encoding fix · +12 −3 · MERGED
#776 · Microsoft/agent-governance-toolkit ⭐ 1.2k · EU AI Act risk classifier · +247 −18 · MERGED
#786 · Microsoft/agent-governance-toolkit ⭐ 1.2k · Docs follow-up · +31 −8 · MERGED
#1410 · AI4Finance-Foundation/FinRL ⭐ 14.6k · Threading bug fix · +7 −14 · MERGED
Open Pull Requests
#345 · goldmansachs/gs-quant ⭐ 10k · Pandas 2.x compatibility · +89 −41 · OPEN
#113 · google/tf-quant-finance ⭐ 5.3k · MD5 to SHA-256 security fix · +3 −3 · OPEN
#9809 · sktime/sktime ⭐ 9.7k · NaiveForecaster bug fix · +18 −22 · OPEN
#512 · ranaroussi/quantstats ⭐ 7k · Compounded flag for calmar/rar · +14 −6 · OPEN
#364 · bukosabino/ta ⭐ 5k · Rank + Percentile indicators · +67 −0 · OPEN
03 · Tech Stack

The Toolbox

// Languages
Python · SQL · C++ · TypeScript
// ML & Deep Learning
scikit-learn · PyTorch · SHAP · statsmodels · scipy
// Quant Finance
pandas · numpy · Engle-Granger · Kalman Filter · GARCH
// DevOps & Tools
Docker · FastAPI · GitHub Actions · pydantic · Git
// AI & LLMs
Claude · ChatGPT · Gemini · LLMs · Prompt Eng.
◉ Currently Learning
→ Reinforcement Learning → CUDA Programming → Rust → Transformer Fine-tuning → JAX/XLA → Options Pricing → RLHF
04 · Where I've Worked

Professional Experience

Research · Buffalo, NY
University at Buffalo
Research Data Analyst
Built data pipelines and statistical models for university research stakeholders. Combined Python and R for hypothesis testing and regression analysis, plus SQL data integration into automated reporting.
Oct 2025 to Apr 2026 · Buffalo, NY
★ Impact
Reduced 696-feature dataset to 80 high-signal variables (88% cut) · Integrated 3 sources into 41 production reports · Reclaimed 15+ hrs/week of stakeholder time
Stack
Python, R, SQL, Statistical Analysis, Regression, Hypothesis Testing, Data Pipelines
Data Science · Buffalo, NY
Machinery Monitoring Systems LLC
Data Scientist
Shipped a Python AI service end to end, from EDA through validation and deployment. Built model evaluation infrastructure tracking accuracy, latency, drift, and exception rate across iterative retraining cycles.
Aug 2025 to Dec 2025 · Buffalo, NY
★ Impact
99.98% production accuracy · Automated manual triage costing ~4 hrs/day · 3 retraining cycles sustaining 99%+ accuracy across releases
Stack
Python, FastAPI, Docker, Model Evaluation, MLOps, Drift Monitoring
ML Engineer · India
Rucha Yantra LLP
Machine Learning Engineer
Designed and trained classification models on multi-sensor industrial telemetry. Built Python and SQL backend pipelines on AWS processing 10K+ daily records across multiple device classes.
Feb 2023 to Jul 2024 · India
★ Impact
Shipped 3 ML modules from scoping to deployment · Zero rollbacks across 17 months · ~$80K estimated annual customer cost savings
Stack
Python, SQL, AWS, Classification Models, Multi-Sensor Telemetry, Industrial ML
Internship · India
Chandra Engineering
Data Analyst Intern
Trained 3 forecasting models on 12+ months of multi-product sales data using walk-forward backtesting, replacing a manual Excel workflow. Surfaced insights through Tableau dashboards for ops planning.
Sep 2021 to Dec 2022 · India
★ Impact
+20% forecast accuracy · -15% stock-outs · -10% overstocking · $10K annual savings · +5% on-time fulfillment
Stack
Random Forest, XGBoost, TensorFlow, Python, SQL, Tableau, Walk-Forward Backtesting
05 · Education

Education

MS · STEM · GPA 3.73
State University of New York at Buffalo
MS, Data Science
STEM-designated Data Science program combining rigorous statistical modeling with production ML engineering, quantitative finance, and real-world systems work.
Aug 2024 to Dec 2025 · Buffalo, NY
★ Achievement
GPA 3.73 · 5 open source projects · 11 PRs across Microsoft, Google, Goldman Sachs ecosystems
Focus Areas
Machine Learning, Statistical Modeling, Quantitative Finance, MLOps, AI Evaluation, Data Engineering
B.E. · 🥇 Gold Medalist
Jawaharlal Nehru Engineering College, India
B.E., Electronics & Telecommunication Engineering
Graduated as university topper with the Gold Medal, the highest academic distinction. Built the foundation in algorithms, electronics, systems, and applied mathematics.
Aug 2019 to Jul 2023 · India
🥇 Gold Medalist · 8.41/10 GPA · Electronics & Telecom
Focus Areas
Algorithms, Data Structures, Signal Processing, Communication Systems, Applied Mathematics, Embedded Systems
06 · The Human

About

I'm a data scientist and ML engineer who builds production-grade ML systems. My work bridges statistical modeling and software engineering, from cointegration-based stat arb engines to regulation-aware ML pipelines for finance. I'm passionate about open source, with PRs across repos at Microsoft, Google, and Goldman Sachs (100k+ combined stars).

🥇
Featured Achievement
Bachelor's Gold Medalist
Graduated top of the batch with the university gold medal in B.E. Electronics & Telecommunication Engineering. A credential earned through consistent excellence, not circumstance.
// Always Supporting
Man United · Real Madrid · India Cricket · Buffalo Bills
// Over the Board
atharva@portfolio:~
Atharva Joshi
// Right Now
Reading · Me Before You · Jojo Moyes
Listening · The Night We Met · Lord Huron
Watching · Grave of the Fireflies · Isao Takahata · 1988
// FUEL
Cristiano Ronaldo · Football · "Hard work beats talent"
Novak Djokovic · Tennis · "Believe in yourself"
Kobe Bryant · Basketball · "The details are not the details"
Bobby Fischer · Chess · "Chess is life"
07 · Let's Connect
Let's Build Something.
Open for full-time roles starting 2026 · Data Scientist · ML Engineer · Quant Developer · AI Engineer