Hi, I'm
Mohsen Iranmanesh
M.Sc. Computing Science @ SFU · Research Engineer Intern @ Farpoint
I build agentic LLM systems that ship — and write papers about them. Currently working on the agent layer of an LLM-powered IDE at Farpoint, and on LLM-driven static analysis under Dr. Mohammad Tayebi at SFU.
Research
I work on agentic AI systems for software engineering: specifically,
pipelines that combine LLMs with static-analysis tooling for vulnerability
detection, triage, and remediation. I'm completing my M.Sc. in Computing
Science at SFU under Dr. Mohammad Tayebi.
Mohsen Iranmanesh, Sina Moradi Sabet, Sina Marefat, Ali Javidi Ghasr, Allison Wilson, Iman Sharafaldin, Mohammad A. Tayebi
A multi-stage LLM pipeline that takes raw static-analyzer alerts and triages them through contextual reasoning and structured evidence validation, reducing false positives without sacrificing recall. Evaluated 10 LLMs across 6 model families on two benchmarks; achieves best-in-class F1 on both synthetic and real-world CodeQL alerts.
First-author submission, currently under review.
arXiv
Full list and project pages on Research.
Selected work
Engineering projects from internship work, founding-team builds, and
personal experiments. Each entry reads in two layers: a headline claim
for a fast scan, and a click-through page covering architecture,
tradeoffs, and evaluation.
Best-in-class F1 on both synthetic and real-world CodeQL alerts.
Takes raw CodeQL alerts and runs them through contextual reasoning + structured evidence validation to filter false positives. Evaluated 10 LLMs across 6 model families on two benchmarks (synthetic + real-world Java CVE-grounded alerts).
- LLMs
- CodeQL
- Python
- Static Analysis
- Multi-Stage Prompting
Read more → Repo →
LLM-powered agentic IDE. I own the multi-agent DAG orchestration, subagent system, and context-management layers.
Authored the empirical study behind Fabric’s externally published March 2026 benchmark report: 99% of frontier accuracy at 18% of frontier cost on Aider Polyglot (225+ exercises, 6 languages).
Production agentic IDE in the Cursor product space. Shipped: a six-tool subagent surface (DelegateTask/SendMessage/WaitForTask/etc.), a TDD-style RED→GREEN DAG orchestrator with Mission Control dashboard, chain-of-density + KV-cache-aware summarization, the prepare→permission→execute tool lifecycle, SWE-Bench evaluation, and an MCP server exposing the test-and-break loop to AI agents.
- TypeScript
- Electron
- React
- LLM Agents
- MCP
- SWE-Bench
- Docker
Read more →
Iran’s leading crypto social-trading platform — ~40k users in 18 months, then acquired.
Architected the trading engine: smart order routing across 5+ exchanges (Binance, KuCoin, and regional exchanges) with best-execution price aggregation and a per-exchange adapter pattern, built on async Python + Celery. It also covers idempotent copy-trade replication with slippage controls, circuit breakers, and a ~99.5% uptime SLO. Shipped the MVP in ~2 months; the platform reached ~40k users in 18 months.
- Python
- Django
- PostgreSQL
- Celery
- Redis
- asyncio
- WebSocket
- Docker
- Real-time Systems
SnappFood — ETA, Churn, Fraud Models (10M+ users)
Production ML on Iran’s largest food-delivery platform: 27% better ETA, 13% lower churn, 10% CSAT lift.
27% ETA accuracy improvement, 13% churn reduction, 10% CSAT lift — measured.
On the Customer Experience team, I built the Octopus BI layer (department-specific KPI dashboards) and adapted Uber’s DeepETA to motorbike delivery, improving ETA accuracy by 27% and cutting delivery delays by 24%. I also shipped a churn-prediction pipeline (RFM features + logistic regression on 3M+ users) that fed reactivation campaigns, which dropped monthly churn by 13%, and a vendor-fraud detection system that lifted CSAT by 10%.
- Python
- PyTorch
- Keras
- scikit-learn
- SQL
- Power BI
- Pandas
Three shipped components: desktop agent + LoRA fine-tune + MCP server.
Personal project. Global-hotkey audio capture, dual-path Whisper (OpenAI API + local whisper.cpp), Claude-Haiku prompt structuring with project context, auto-paste via osascript. Also shipped a companion Whisper-Large-V3 LoRA fine-tuned on a bilingual Persian-English technical-speech corpus, and an MCP-server wrapper exposing the pipeline to Claude Code / Cursor.
- Rust
- Tauri
- Svelte
- Whisper
- LoRA
- MCP
- Anthropic SDK
All projects — including research infrastructure and smaller experiments —
on Projects.
About
I'm a software & ML engineer with a research line: eight years of
production engineering across fintech, food delivery (10M+ users), and
developer tools, plus a publication track in applied LLM systems for
software security. Right now I'm at Farpoint, where I own the
multi-agent DAG orchestration, subagent system, and context-management
layers of an LLM-powered agentic IDE, and at SFU, where I'm finishing
my M.Sc. thesis on LLM-driven vulnerability remediation.
I'm targeting full-time AI Engineer / ML Engineer / SWE / AI Researcher
/ ML Researcher roles starting in late 2026. If you're building
agentic systems, LLM evaluation infrastructure, code-intelligence
tooling, or research-product engineering — I'd love to chat.
Email is the fastest way.