Best-in-class F1 on both synthetic and real-world CodeQL alerts.
Takes raw CodeQL alerts and runs them through contextual reasoning + structured evidence validation to filter false positives. Evaluated 10 LLMs across 6 model families on two benchmarks (synthetic + real-world Java CVE-grounded alerts).
- LLMs
- CodeQL
- Python
- Static Analysis
- Multi-Stage Prompting
LLM-powered agentic IDE. I own the multi-agent DAG orchestration, subagent system, and context-management layers.
Authored the empirical study behind Fabric’s externally published March 2026 benchmark report: 99% of frontier accuracy at 18% of frontier cost on Aider Polyglot (225+ exercises, 6 languages).
Production agentic IDE in the Cursor product space. Shipped: a six-tool subagent surface (DelegateTask/SendMessage/WaitForTask/etc.), a TDD-style RED→GREEN DAG orchestrator with Mission Control dashboard, chain-of-density + KV-cache-aware summarization, the prepare→permission→execute tool lifecycle, SWE-Bench evaluation, and an MCP server exposing the test-and-break loop to AI agents.
- TypeScript
- Electron
- React
- LLM Agents
- MCP
- SWE-Bench
- Docker
Iran’s leading crypto social-trading platform — ~40k users in 18 months, then acquired.
Architected the trading engine: smart order routing across 5+ exchanges (Binance, KuCoin, and regional exchanges) with best-execution price aggregation, a per-exchange adapter pattern, async Python with Celery, idempotent copy-trade replication with slippage controls, and circuit breakers, all against a ~99.5% uptime SLO. Shipped the MVP in ~2 months.
- Python
- Django
- PostgreSQL
- Celery
- Redis
- asyncio
- WebSocket
- Docker
- Real-time Systems
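As a rough illustration of the best-execution idea described above (not the production code), the sketch below picks the cheapest fee-adjusted ask across per-exchange adapters and rejects the order if it breaches a slippage limit. Every name here (`Quote`, `ExchangeAdapter`, `StaticAdapter`, `route_buy`) is hypothetical, and the adapters return fixed numbers instead of calling any real exchange API.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Quote:
    exchange: str
    ask: float       # lowest price a seller currently accepts
    fee_rate: float  # taker fee as a fraction, e.g. 0.002 = 0.2%

    @property
    def effective_ask(self) -> float:
        """Per-unit price a buyer pays once the fee is included."""
        return self.ask * (1 + self.fee_rate)


class ExchangeAdapter(Protocol):
    """Common interface each per-exchange adapter is assumed to expose."""

    def quote(self, symbol: str) -> Quote: ...


class StaticAdapter:
    """Stand-in adapter with a fixed quote; a real one would query the exchange."""

    def __init__(self, name: str, ask: float, fee_rate: float) -> None:
        self._quote = Quote(name, ask, fee_rate)

    def quote(self, symbol: str) -> Quote:
        return self._quote


def route_buy(
    symbol: str,
    adapters: list[ExchangeAdapter],
    reference_price: float,
    max_slippage: float,
) -> Quote:
    """Return the quote with the lowest fee-adjusted ask.

    Raises ValueError when even the best quote is more than `max_slippage`
    (a fraction) above `reference_price`.
    """
    quotes = [adapter.quote(symbol) for adapter in adapters]
    best = min(quotes, key=lambda q: q.effective_ask)
    limit = reference_price * (1 + max_slippage)
    if best.effective_ask > limit:
        raise ValueError(
            f"best price {best.effective_ask:.4f} exceeds slippage limit {limit:.4f}"
        )
    return best
```

Comparing fee-adjusted prices instead of raw asks matters because the venue with the lowest quoted price is not always the cheapest once fees are applied.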
SnappFood — ETA, Churn, Fraud Models (10M+ users)
Production ML on Iran’s largest food-delivery platform, with measured results: 27% ETA accuracy improvement, 13% churn reduction, 10% CSAT lift.
On the Customer Experience team, I built the Octopus BI layer (department-specific KPI dashboards); adapted Uber’s DeepETA to motorbike delivery, improving ETA accuracy by 27% and cutting delivery delays by 24%; shipped a churn-prediction pipeline (RFM features + logistic regression on 3M+ users) that fed reactivation campaigns, which reduced monthly churn by 13%; and built a vendor-fraud detection system that lifted CSAT by 10%.
- Python
- PyTorch
- Keras
- scikit-learn
- SQL
- Power BI
- Pandas
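For readers unfamiliar with RFM, the sketch below shows one common way to derive recency, frequency, and monetary features per user from raw order records. It is a standard-library illustration under assumed names (`Order`, `rfm_features`) and an assumed record shape, not the pipeline described above; in a churn model, features like these would then be passed to a classifier such as logistic regression.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Order:
    user_id: str
    day: date
    amount: float


def rfm_features(orders: list[Order], as_of: date) -> dict[str, dict[str, float]]:
    """Compute per-user RFM features from orders placed on or before `as_of`.

    recency_days: days since the user's most recent order
    frequency:    number of orders
    monetary:     total amount spent
    """
    by_user: dict[str, list[Order]] = defaultdict(list)
    for order in orders:
        if order.day <= as_of:  # ignore orders after the snapshot date
            by_user[order.user_id].append(order)

    features: dict[str, dict[str, float]] = {}
    for user_id, user_orders in by_user.items():
        last_order_day = max(o.day for o in user_orders)
        features[user_id] = {
            "recency_days": (as_of - last_order_day).days,
            "frequency": len(user_orders),
            "monetary": sum(o.amount for o in user_orders),
        }
    return features
```

Fixing a snapshot date (`as_of`) keeps the features free of information from after the prediction point, which is the usual precaution when the target is future churn.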
Three shipped components: desktop agent + LoRA fine-tune + MCP server.
Personal project. Global-hotkey audio capture, dual-path Whisper (OpenAI API + local whisper.cpp), Claude-Haiku prompt structuring with project context, auto-paste via osascript. Also shipped a companion Whisper-Large-V3 LoRA fine-tuned on a bilingual Persian-English technical-speech corpus, and an MCP-server wrapper exposing the pipeline to Claude Code / Cursor.
- Rust
- Tauri
- Svelte
- Whisper
- LoRA
- MCP
- Anthropic SDK