Hi, I'm

Mohsen Iranmanesh

M.Sc. Computing Science @ SFU · Research Engineer Intern @ Farpoint

I build agentic LLM systems that ship — and write papers about them. Currently working on the agent layer of an LLM-powered IDE at Farpoint, and on LLM-driven static analysis under Dr. Mohammad Tayebi at SFU.

Research

I work on agentic AI systems for software engineering: specifically, pipelines that combine LLMs with static-analysis tooling for vulnerability detection, triage, and remediation. I'm completing my M.Sc. in Computing Science at SFU under Dr. Mohammad Tayebi.

ZeroFalse: Improving Precision in Static Analysis with LLMs

Under review · RAID 2026

Mohsen Iranmanesh, Sina Moradi Sabet, Sina Marefat, Ali Javidi Ghasr, Allison Wilson, Iman Sharafaldin, Mohammad A. Tayebi

A multi-stage LLM pipeline that takes raw static-analyzer alerts and triages them through contextual reasoning and structured evidence validation, reducing false positives without sacrificing recall. Evaluated with 10 LLMs across 6 model families on two benchmarks, it achieves best-in-class F1 on both synthetic and real-world CodeQL alerts.

First-author submission, currently under review.

Full list and project pages on Research.


Selected work

Engineering projects from internship work, founding-team builds, and personal experiments. Each entry reads in two layers: a headline claim for fast scanning, and a project page you can click into for architecture, tradeoffs, and evaluation.

ZeroFalse

Multi-stage LLM pipeline that reduces false positives in static analysis.

Best-in-class F1 on both synthetic and real-world CodeQL alerts.

Takes raw CodeQL alerts and runs them through contextual reasoning + structured evidence validation to filter false positives. Evaluated 10 LLMs across 6 model families on two benchmarks (synthetic + real-world Java CVE-grounded alerts).

  • LLMs
  • CodeQL
  • Python
  • Static Analysis
  • Multi-Stage Prompting

Fabric — Agentic IDE (Farpoint)

LLM-powered agentic IDE. I own the multi-agent DAG orchestration, subagent system, and context-management layers.

Authored the empirical study behind Fabric’s externally published March 2026 benchmark report: 99% of frontier accuracy at 18% of frontier cost on Aider Polyglot (225+ exercises, 6 languages).

Production agentic IDE in the Cursor product space. Shipped: a six-tool subagent surface (DelegateTask/SendMessage/WaitForTask/etc.), a TDD-style RED→GREEN DAG orchestrator with Mission Control dashboard, chain-of-density + KV-cache-aware summarization, the prepare→permission→execute tool lifecycle, SWE-Bench evaluation, and an MCP server exposing the test-and-break loop to AI agents.

  • TypeScript
  • Electron
  • React
  • LLM Agents
  • MCP
  • SWE-Bench
  • Docker

Pabla — Crypto Social-Trading Engine

Real-time copy-trading engine for crypto markets. Acquired by Nobitex (Middle East’s largest crypto exchange).

Iran’s leading crypto social-trading platform — ~40k users in 18 months, then acquired.

Architected the trading engine: smart order routing across 5+ exchanges (Binance, KuCoin, regional), best-execution price aggregation, per-exchange adapter pattern, async Python + Celery, copy-replication idempotency with slippage controls, circuit breakers, ~99.5% uptime SLO. Shipped MVP in ~2 months; platform reached ~40k users in 18 months.

  • Python
  • Django
  • PostgreSQL
  • Celery
  • Redis
  • asyncio
  • WebSocket
  • Docker
  • Real-time Systems
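Two of the ideas above, best-execution price selection and idempotent copy replication with a slippage cap, can be sketched in a few lines. This is an illustrative sketch only, not Pabla's production code: the `Quote` type, the `best_execution` function, the `CopyReplicator` class, the venue labels, and the 1% slippage limit are all hypothetical names and values chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Quote:
    exchange: str  # hypothetical venue label
    bid: float     # best price a buyer on that venue will pay
    ask: float     # best price a seller on that venue will accept


def best_execution(quotes: list[Quote], side: str) -> Optional[Quote]:
    """Pick the venue with the best price: lowest ask to buy, highest bid to sell."""
    if not quotes:
        return None
    if side == "buy":
        return min(quotes, key=lambda q: q.ask)
    if side == "sell":
        return max(quotes, key=lambda q: q.bid)
    raise ValueError(f"unknown side: {side!r}")


class CopyReplicator:
    """Replicates a leader order to a follower at most once, within a slippage cap."""

    def __init__(self, max_slippage: float) -> None:
        self.max_slippage = max_slippage
        # Idempotency keys: one entry per (leader order, follower) already placed.
        self._placed: set[tuple[str, str]] = set()

    def replicate(self, leader_order_id: str, follower_id: str,
                  leader_price: float, quote_price: float) -> str:
        key = (leader_order_id, follower_id)
        if key in self._placed:
            return "duplicate"  # a retry or redelivery must not place a second order
        slippage = abs(quote_price - leader_price) / leader_price
        if slippage > self.max_slippage:
            return "rejected"   # price moved too far from the leader's fill
        self._placed.add(key)
        return "placed"


quotes = [Quote("venue_a", bid=99.8, ask=101.0), Quote("venue_b", bid=99.5, ask=100.5)]
replicator = CopyReplicator(max_slippage=0.01)
buy_venue = best_execution(quotes, "buy")
first = replicator.replicate("order-1", "follower-1", 100.0, buy_venue.ask)
retry = replicator.replicate("order-1", "follower-1", 100.0, buy_venue.ask)
```

In this sketch a rejected replication is not recorded as placed, so it can be retried once the price comes back inside the cap. A real engine would keep the idempotency keys in durable storage rather than in memory.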

SnappFood — ETA, Churn, Fraud Models (10M+ users)

Production ML on Iran’s largest food-delivery platform: 27% better ETA, 13% lower churn, 10% CSAT lift.

27% ETA accuracy improvement, 13% churn reduction, 10% CSAT lift — measured.

On the Customer Experience team, I built the Octopus BI layer (department-specific KPI dashboards) and adapted Uber’s DeepETA to motorbike delivery, for a 27% ETA-accuracy improvement and 24% fewer delivery delays. I also shipped a churn-prediction pipeline (RFM features + logistic regression on 3M+ users) that fed reactivation campaigns and dropped monthly churn by 13%, and a vendor-fraud detection system that lifted CSAT by 10%.

  • Python
  • PyTorch
  • Keras
  • scikit-learn
  • SQL
  • Power BI
  • Pandas
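For readers unfamiliar with RFM features (recency, frequency, monetary value), here is a minimal standard-library sketch of how they can be derived from an order log. It is not the SnappFood pipeline: the `rfm_features` function, the tuple layout, and the sample orders are hypothetical. In a setup like the one described above, the resulting vectors would then feed a classifier such as logistic regression.

```python
from collections import defaultdict
from datetime import date


def rfm_features(orders, as_of):
    """Compute (recency_days, frequency, monetary) per user.

    orders: iterable of (user_id, order_date, amount) tuples (hypothetical layout).
    as_of:  snapshot date; orders after it are ignored to avoid leaking the future.
    """
    last_order = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for user_id, order_date, amount in orders:
        if order_date > as_of:
            continue
        previous = last_order.get(user_id, order_date)
        last_order[user_id] = max(previous, order_date)
        frequency[user_id] += 1
        monetary[user_id] += amount
    return {
        user: ((as_of - last_order[user]).days, frequency[user], monetary[user])
        for user in last_order
    }


sample_orders = [
    ("u1", date(2024, 1, 1), 10.0),
    ("u1", date(2024, 1, 20), 5.0),
    ("u2", date(2024, 1, 10), 7.5),
    ("u2", date(2024, 2, 5), 3.0),  # after the snapshot date, so ignored
]
features = rfm_features(sample_orders, as_of=date(2024, 1, 31))
```

Cutting the log at a snapshot date matters for churn models: features must only use what was known at the time the prediction would have been made.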

Clarion — Voice-to-Prompt Desktop Agent

Tauri/Rust macOS menu-bar agent: hotkey → Whisper → Haiku rewrite → paste. Built for bilingual developers.

Three shipped components: desktop agent + LoRA fine-tune + MCP server.

Personal project. Global-hotkey audio capture, dual-path Whisper (OpenAI API + local whisper.cpp), Claude-Haiku prompt structuring with project context, auto-paste via osascript. Also shipped a companion Whisper-Large-V3 LoRA fine-tuned on a bilingual Persian-English technical-speech corpus, and an MCP-server wrapper exposing the pipeline to Claude Code / Cursor.

  • Rust
  • Tauri
  • Svelte
  • Whisper
  • LoRA
  • MCP
  • Anthropic SDK

All projects — including research infrastructure and smaller experiments — on Projects.


About

I'm a software & ML engineer with a research line. Eight years of production engineering across fintech, food-delivery (10M+ users), and developer tools — and a publication track in applied LLM systems for software security. Right now I'm at Farpoint, where I own the multi-agent DAG orchestration, subagent system, and context-management layers of an LLM-powered agentic IDE, and at SFU where I'm finishing my M.Sc. thesis on LLM-driven vulnerability remediation.

I'm targeting full-time AI Engineer / ML Engineer / SWE / AI Researcher / ML Researcher roles starting in late 2026. If you're building agentic systems, LLM evaluation infrastructure, code-intelligence tooling, or research-product engineering — I'd love to chat. Email is the fastest way.