Feature Roadmap

Overview

Transform PolySuggest from a market ideation tool into a precision decision-support terminal for Polymarket traders. Total duration: 12 months across 4 phases.

Phase 1: MVP Market Intelligence Hub (4-6 weeks)

Goal: Real-time market data + edge detection + basic calibration tracking

Week 1: Real-Time Market Data Pipeline

Objectives:

  • Async Gamma API polling (1-second updates)
  • OHLCV storage in TimescaleDB
  • In-memory cache for latest prices
  • REST endpoints for market data

Deliverables:

  • market_data.py – Async poller, caching, rate limiting
  • liquidity_analyzer.py – Spread, depth, volume metrics
  • time_series.py – OHLCV storage & retrieval
  • Tests: test_market_data.py
  • Docs: "Real-Time Data" section in README

Acceptance Criteria:

  • Sub-1s latency for price updates
  • 99.5% API availability (with circuit breaker)
  • Support 50+ markets simultaneously

PR: #1-real-time-gamma-api
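A minimal sketch of the Week 1 polling loop, assuming an async `fetch_prices` stand-in for the real Gamma API call (the market IDs, payload shape, and helper names here are illustrative, not the actual `market_data.py` interface):

```python
import asyncio
import time

# In-memory cache of latest prices: market_id -> (price, unix_timestamp).
PRICE_CACHE: dict[str, tuple[float, float]] = {}

async def fetch_prices() -> dict[str, float]:
    # Placeholder: a real implementation would GET the Gamma markets
    # endpoint with rate limiting and a circuit breaker.
    return {"market-a": 0.62, "market-b": 0.41}

async def poll_once() -> None:
    prices = await fetch_prices()
    now = time.time()
    for market_id, price in prices.items():
        PRICE_CACHE[market_id] = (price, now)

async def poll_loop(interval: float = 1.0, iterations: int = 3) -> None:
    # Bounded loop for illustration; production code runs until cancelled.
    for _ in range(iterations):
        await poll_once()
        await asyncio.sleep(interval)

asyncio.run(poll_loop(interval=0.01, iterations=2))
```

The 1-second `interval` matches the sub-1s update target; the cache would back the REST endpoints while OHLCV rows flow to TimescaleDB.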


Week 2: Edge Detection & Alerts

Objectives:

  • Implement edge detector (fair prob vs market prob)
  • Compare trend sentiment to market probability
  • Flag mismatches > 25%
  • Real-time alert system (CLI + webhook)

Deliverables:

  • edge_detector.py – Divergence scoring, decay
  • alerts.py – Alert broker (stdout, slack, webhook)
  • Extend schemas.py with Edge model
  • CLI command: polysuggest edges --top 10

Acceptance Criteria:

  • Edge detection with >70% historical accuracy
  • Alerts fire within 10 seconds of detection
  • Webhook integration with Slack/Discord

PR: #2-edge-detection
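The core of the divergence check could look like the sketch below: flag any market where the fair-probability estimate and the Gamma market probability disagree by more than 25 points. The `Edge` fields mirror the planned `schemas.py` model but are an assumption here:

```python
from dataclasses import dataclass

@dataclass
class Edge:
    market_id: str
    fair_prob: float    # model's fair probability
    market_prob: float  # current Gamma market probability

    @property
    def divergence(self) -> float:
        return self.fair_prob - self.market_prob

def detect_edges(estimates: list[Edge], threshold: float = 0.25) -> list[Edge]:
    """Return edges whose |fair - market| exceeds the mismatch threshold."""
    return [e for e in estimates if abs(e.divergence) > threshold]

edges = detect_edges([
    Edge("rate-cut-june", fair_prob=0.70, market_prob=0.40),  # +0.30: flagged
    Edge("etf-approval", fair_prob=0.55, market_prob=0.50),   # +0.05: ignored
])
```

Flagged edges would then be handed to the alert broker for stdout/Slack/webhook fan-out.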


Week 3: Calibration Tracking

Objectives:

  • Track all suggestions in database
  • Link suggestions to actual resolutions
  • Compute calibration metrics (Brier score, accuracy, ROI)
  • Generate calibration heatmap

Deliverables:

  • calibration.py – Compute historical accuracy metrics
  • Extend storage.py with resolution tracking
  • Backfill historical suggestions from database
  • CLI command: polysuggest calibrate

Acceptance Criteria:

  • 100% of past suggestions linked to resolutions (if available)
  • Calibration curve within 5% of actual resolution rate
  • Brier score calculated correctly

PR: #3-calibration
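The Brier score named in the acceptance criteria is the mean squared error between predicted probability and the 0/1 resolution; a self-contained sketch (the sample history is made up):

```python
def brier_score(predictions: list[tuple[float, int]]) -> float:
    """Mean squared error between predicted probability and 0/1 outcome.

    Lower is better; 0.25 is the score of always predicting 0.5."""
    return sum((p - o) ** 2 for p, o in predictions) / len(predictions)

# (predicted probability, resolved outcome) pairs pulled from storage
history = [(0.8, 1), (0.3, 0), (0.6, 1), (0.9, 0)]
score = brier_score(history)
# (0.04 + 0.09 + 0.16 + 0.81) / 4 = 0.275
```

`polysuggest calibrate` would compute this over every resolution-linked suggestion in the database.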


Week 4: Dashboard MVP

Objectives:

  • Real-time price ticker (top 10 trending markets)
  • Edge leaderboard (sorted by EV)
  • Calibration heatmap (confidence vs actual accuracy)
  • WebSocket server for live updates

Deliverables:

  • Next.js components:
    • PriceTicker.tsx – Live prices
    • EdgeLeaderboard.tsx – Top edges
    • CalibrationChart.tsx – Historical accuracy
  • FastAPI WebSocket endpoint: WS /ws/edges
  • Export to CSV

Acceptance Criteria:

  • Dashboard refreshing every 5 seconds
  • WebSocket connection stable for 1+ hour
  • <100ms latency for UI updates

PR: #4-dashboard-mvp
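Behind the planned `WS /ws/edges` endpoint sits a publish/subscribe core: each connected client gets its own queue, and the edge detector publishes updates to all of them. A framework-free sketch of that pattern (the FastAPI wiring is omitted; the update payload is hypothetical):

```python
import asyncio

# One queue per connected WebSocket client.
subscribers: set[asyncio.Queue] = set()

def subscribe() -> asyncio.Queue:
    q: asyncio.Queue = asyncio.Queue()
    subscribers.add(q)
    return q

async def publish(update: dict) -> None:
    # Fan an edge update out to every connected client.
    for q in subscribers:
        await q.put(update)

async def demo() -> dict:
    client = subscribe()
    await publish({"market_id": "rate-cut-june", "ev": 0.18})
    return await client.get()

msg = asyncio.run(demo())
```

In the real endpoint, each connection's handler would loop on `queue.get()` and forward messages over the socket, keeping UI latency under the 100 ms target.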


Phase 1 Success Metrics

  • Sub-1s latency for price updates
  • Edge detection with >70% historical accuracy
  • Calibration curve within 5% of actual rate
  • Dashboard refreshing every 5 seconds
  • 10+ beta users testing MVP

Phase 2: Edge Engine & Probability Calibration (6-8 weeks)

Goal: Calibrated probability estimates + EV ranking + Bayesian inference

Week 5-6: Bayesian Probability Estimation

Objectives:

  • Implement Bayesian inference engine
  • Train on 12 months of historical Polymarket data
  • Compute fair probability with credible intervals
  • Integrate base rate calculator (by market category)

Deliverables:

  • inference_engine.py – Bayesian model (PyMC3/Pyro)
  • base_rates.py – Historical priors by category
  • feature_engineering.py – Sentiment, volume, order flow features
  • Training pipeline: train_bayesian_model.py
  • Tests: test_inference_engine.py

Acceptance Criteria:

  • Fair prob estimates within ±0.08 credible interval
  • Model trained on 500+ historical markets
  • Inference time <5s per market

PR: #5-bayesian-inference
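Before reaching for PyMC/Pyro, the base-rate-plus-evidence idea can be sketched with a conjugate Beta-Binomial model: category base rates supply prior pseudo-counts, market-specific signals update them, and a Monte Carlo draw gives the credible interval. All counts below are invented for illustration:

```python
import random

def beta_posterior_interval(prior_yes: float, prior_no: float,
                            obs_yes: int, obs_no: int,
                            level: float = 0.9, draws: int = 20000,
                            seed: int = 7) -> tuple[float, float, float]:
    """Posterior mean and Monte Carlo credible interval for a Beta posterior.

    Prior pseudo-counts come from category base rates; observations are
    positive/negative signals for this market."""
    a = prior_yes + obs_yes
    b = prior_no + obs_no
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    lo = samples[int((1 - level) / 2 * draws)]
    hi = samples[int((1 + level) / 2 * draws)]
    return a / (a + b), lo, hi

mean, lo, hi = beta_posterior_interval(prior_yes=12, prior_no=18,
                                       obs_yes=9, obs_no=3)
# Posterior mean = (12 + 9) / (12 + 9 + 18 + 3) = 0.5
```

The full Bayesian model adds sentiment, volume, and order-flow features, but the ±0.08 credible-interval criterion is checked the same way: against the interval width.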


Week 7: Expected Value (EV) Ranking

Objectives:

  • Compute fair price for all markets
  • Compare to Gamma market price
  • Rank by risk-adjusted EV
  • Implement Kelly fraction sizing

Deliverables:

  • ev_calculator.py – EV, sizing, risk metrics
  • portfolio_optimizer.py – Kelly fraction, risk parity
  • API endpoint: GET /api/edges?sort=ev
  • CLI: polysuggest edges list --sort ev

Acceptance Criteria:

  • EV calculated correctly (fair_prob / market_prob - 1)
  • Leaderboard updates every 1 minute
  • Kelly sizing never exceeds bankroll limits

PR: #6-ev-ranking
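The EV formula from the acceptance criteria and a capped Kelly fraction can be sketched together; the 25% cap is an illustrative bankroll limit, not a spec value:

```python
def expected_value(fair_prob: float, market_prob: float) -> float:
    """EV per dollar staked on YES at the market price: fair/market - 1."""
    return fair_prob / market_prob - 1

def kelly_fraction(fair_prob: float, market_prob: float,
                   cap: float = 0.25) -> float:
    """Kelly fraction for a binary contract bought at market_prob.

    Net odds b = (1 - price) / price; the cap enforces bankroll limits."""
    b = (1 - market_prob) / market_prob
    f = (fair_prob * (b + 1) - 1) / b
    return max(0.0, min(f, cap))

ev = expected_value(0.60, 0.40)   # 0.60/0.40 - 1 = 0.5 (50% edge)
f = kelly_fraction(0.60, 0.40)    # raw Kelly 1/3, capped to 0.25
```

Ranking by risk-adjusted EV would then sort markets by `ev` discounted for liquidity and resolution-time risk.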


Week 8: Advanced Calibration

Objectives:

  • Compute Brier score, accuracy, Sharpe ratio
  • Group calibration by: confidence bucket, category, time horizon
  • Detect model bias (over- or underconfidence)
  • ROI tracking per suggestion

Deliverables:

  • calibration.py – Enhanced metrics (Brier, Sharpe, ROI)
  • Dashboard: CalibrationBuckets.tsx – Breakdowns by confidence
  • CLI: polysuggest calibrate show --by-confidence

Acceptance Criteria:

  • Calibration curve shows increasing accuracy with confidence
  • Model bias <5% (predictions vs actuals)

PR: #7-advanced-calibration
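Grouping by confidence bucket reduces to comparing mean stated confidence with realized accuracy per bucket; a gap in either direction is the bias signal. A sketch with invented sample data:

```python
from collections import defaultdict

def calibration_buckets(predictions: list[tuple[float, int]],
                        width: float = 0.1) -> dict:
    """Bucket (confidence, outcome) pairs and compare mean confidence
    to realized accuracy; a large gap flags over/underconfidence."""
    buckets = defaultdict(list)
    top = int(1 / width) - 1
    for conf, outcome in predictions:
        buckets[min(int(conf / width), top)].append((conf, outcome))
    report = {}
    for key, items in sorted(buckets.items()):
        mean_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(o for _, o in items) / len(items)
        report[round(key * width, 1)] = (round(mean_conf, 2), round(accuracy, 2))
    return report

report = calibration_buckets([(0.65, 1), (0.62, 0), (0.91, 1), (0.95, 1)])
# bucket 0.6: accuracy 0.5 vs ~0.64 mean confidence -> overconfident there
```

The same grouping keys (category, time horizon) would just swap the bucketing function.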


Phase 2 Success Metrics

  • Fair probability estimates within ±0.08 credible interval
  • Edge detection accuracy improved to 75%+
  • Calibration heatmap shows proper confidence scaling
  • EV ranking leaderboard updated every 1 min

Phase 3: Portfolio & Execution (8-12 weeks)

Goal: Position tracking, risk management, order execution

Week 9-10: Position Tracking & Risk Management

Objectives:

  • Track all user positions (entry price, qty, date)
  • Compute real-time P&L against Gamma prices
  • Calculate portfolio heat (worst-case loss)
  • Estimate correlation between markets
  • Liquidation risk warnings

Deliverables:

  • portfolio_tracker.py – Position ledger, P&L, exposure
  • correlation_analyzer.py – Outcome correlations
  • risk_manager.py – VaR, stress testing, hedging
  • PostgreSQL tables: positions, orders, trades
  • API endpoints: GET /api/portfolio/*

Acceptance Criteria:

  • Position P&L accurate within 0.1%
  • Portfolio heat computed in <1s
  • Correlation estimates from historical data (or crowdsourced)

PR: #8-portfolio-tracking
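Marking positions to the latest Gamma price and computing portfolio heat are both one-liners over the position ledger. A sketch assuming YES-side positions only (the `Position` fields are illustrative, not the final schema):

```python
from dataclasses import dataclass

@dataclass
class Position:
    market_id: str
    entry_price: float  # price paid per YES share
    qty: float          # number of shares

def unrealized_pnl(positions: list[Position],
                   latest_prices: dict[str, float]) -> float:
    """Mark every position to the latest Gamma price."""
    return sum((latest_prices[p.market_id] - p.entry_price) * p.qty
               for p in positions)

def portfolio_heat(positions: list[Position]) -> float:
    """Worst-case loss: every YES position resolves NO (price -> 0)."""
    return sum(p.entry_price * p.qty for p in positions)

book = [Position("rate-cut-june", 0.40, 100), Position("etf-approval", 0.55, 50)]
pnl = unrealized_pnl(book, {"rate-cut-june": 0.52, "etf-approval": 0.50})
# (0.52-0.40)*100 + (0.50-0.55)*50 = 12.0 - 2.5 = 9.5
heat = portfolio_heat(book)  # 0.40*100 + 0.55*50 = 67.5
```

Correlation-aware heat (correlated outcomes resolving together) layers on top of this worst-case baseline.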


Week 11-12: Order Management & Execution

Objectives:

  • Place limit orders via Gamma API
  • Ladder orders (split into tranches)
  • Estimate slippage from order book depth
  • Track fills and execution quality
  • Paper trading mode (log but don't execute)

Deliverables:

  • order_manager.py – Order lifecycle
  • execution_simulator.py – Backtest fills
  • order_ledger.py – Order history + analytics
  • CLI: polysuggest orders place
  • CLI: polysuggest orders status

Acceptance Criteria:

  • Limit orders placed successfully
  • Slippage estimate within ±2% of actual
  • Paper trading mode fully functional

PR: #9-order-management (paper trading)
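Slippage estimation from order-book depth is a walk down the ask side until the target quantity fills; average fill price minus best ask is the slippage. A sketch with an invented book:

```python
def estimate_fill(book_asks: list[tuple[float, float]],
                  qty: float) -> tuple[float, float]:
    """Walk the ask side (sorted best-first) to estimate the average
    fill price and slippage vs the best ask for a market buy of qty."""
    filled, cost = 0.0, 0.0
    for price, size in book_asks:
        take = min(size, qty - filled)
        filled += take
        cost += take * price
        if filled >= qty:
            break
    avg_price = cost / filled
    slippage = avg_price - book_asks[0][0]
    return avg_price, slippage

asks = [(0.40, 100), (0.42, 200), (0.45, 500)]
avg, slip = estimate_fill(asks, 250)
# 100 @ 0.40 + 150 @ 0.42 -> avg 0.412, slippage 0.012 vs best ask
```

Laddering inverts the same logic: split the parent order into tranches sized so each stays within one book level. Paper trading mode would log these estimates instead of submitting orders.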


Week 13-14: Backtesting Engine

Objectives:

  • Replay historical market data
  • Simulate edge detection at each timestamp
  • Execute orders at historical prices
  • Compute realized P&L, Sharpe, max drawdown
  • Benchmark vs buy-and-hold

Deliverables:

  • backtester.py – Historical simulation engine
  • benchmarks.py – Baseline strategies
  • Dashboard: BacktestRunner.tsx + BacktestResults.tsx
  • CLI: polysuggest backtest run

Acceptance Criteria:

  • Backtest 12-month period in <5 minutes
  • P&L results match manual calculation
  • Sharpe ratio computed correctly

PR: #10-backtester
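The two headline backtest statistics can be pinned down precisely; a sketch with a zero risk-free rate and an invented equity curve:

```python
import statistics

def sharpe_ratio(returns: list[float], periods_per_year: int = 365) -> float:
    """Annualized Sharpe of per-period returns (risk-free rate ~ 0)."""
    mu = statistics.mean(returns)
    sigma = statistics.stdev(returns)
    return (mu / sigma) * periods_per_year ** 0.5

def max_drawdown(equity_curve: list[float]) -> float:
    """Largest peak-to-trough fall, as a fraction of the running peak."""
    peak, worst = equity_curve[0], 0.0
    for value in equity_curve:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst

dd = max_drawdown([100, 110, 95, 120, 90])
# peak 110 -> trough 95 is 13.6%; peak 120 -> 90 is 25% => 0.25
```

These are the numbers the "P&L results match manual calculation" criterion would be checked against, alongside the buy-and-hold benchmark.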


Phase 3 Success Metrics

  • Position P&L accurate within 0.1%
  • Portfolio heat calculations <1s
  • Order slippage estimates within ±2%
  • Backtest engine handles 12-month periods
  • Paper trading validated against live data

Phase 4: Research & ML (12+ weeks)

Goal: Multi-model ensemble, semantic search, automated research reports

Week 15-16: Multi-Model Consensus

Objectives:

  • Train SVM classifier on historical patterns
  • Train LSTM on price momentum & volume
  • Ensemble voting (confidence = % agreement)
  • Flag high-uncertainty predictions

Deliverables:

  • models/svm_classifier.py – Historical pattern matching
  • models/lstm_momentum.py – Time series model
  • consensus.py – Ensemble voting + weighting
  • Model registry + versioning

Acceptance Criteria:

  • Ensemble beats GPT-4o baseline by 5%+
  • Model agreement correlates with accuracy

PR: #11-ensemble-models


Week 17-18: Semantic Search (RAG)

Objectives:

  • Embed market descriptions + context
  • Semantic search on market history
  • RAG: retrieve similar markets as examples
  • Build vector database (Chroma)

Deliverables:

  • semantic_search.py – Market history RAG
  • embedding_service.py – Maintain vector DB
  • CLI: polysuggest research search "..."
  • Vector DB schema + ingestion pipeline

Acceptance Criteria:

  • Search returns semantically similar markets
  • Embedding quality validated by manual review

PR: #12-semantic-search
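At its core, the retrieval step ranks embedded market descriptions by cosine similarity to the query embedding; in the real build that ranking would be delegated to a Chroma collection query. A dependency-free sketch with toy 3-dimensional embeddings:

```python
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm

def search(query_vec: list[float], index: dict[str, list[float]],
           top_k: int = 2) -> list[str]:
    """Rank embedded markets by cosine similarity to the query."""
    ranked = sorted(index, key=lambda m: cosine(query_vec, index[m]),
                    reverse=True)
    return ranked[:top_k]

# Toy index: market_id -> embedding (real vectors are ~hundreds of dims)
index = {
    "fed-rate-cut": [0.9, 0.1, 0.0],
    "btc-etf": [0.1, 0.9, 0.2],
    "election-2024": [0.0, 0.2, 0.9],
}
hits = search([0.8, 0.2, 0.1], index)
```

The retrieved markets then become few-shot examples in the RAG prompt for the suggestion model.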


Week 19-20: Automated Research Reports

Objectives:

  • Daily digest: top edges, emerging trends, P&L
  • Market calendar with close-to-resolution warnings
  • Portfolio rebalance suggestions
  • Delivery: email + Slack + web dashboard

Deliverables:

  • report_generator.py – Daily digest builder
  • email_service.py – Email delivery
  • slack_integration.py – Slack posting
  • Scheduled task (Celery cron)
  • Dashboard: ResearchReport.tsx

Acceptance Criteria:

  • Reports generated daily at fixed time
  • Email delivery 99%+ success rate
  • Report content manually reviewed for quality

PR: #13-research-reports


Phase 4 Success Metrics

  • Ensemble model beats GPT-4o by 5%+
  • Semantic search returns relevant markets
  • Daily reports delivered on schedule
  • Report quality validated by 5+ beta users

Parallel Work Streams

Infrastructure (Throughout All Phases)

  • PostgreSQL setup + migrations
  • TimescaleDB setup + hypertables
  • Chroma vector DB setup
  • Redis cache setup
  • Docker Compose for local dev
  • CI/CD pipeline (GitHub Actions)
  • Monitoring + observability (Prometheus, Grafana)
  • Logging centralization (ELK or Loki)

Testing & QA (Throughout All Phases)

  • Unit tests (>80% coverage)
  • Integration tests for API endpoints
  • Load testing (50+ concurrent users)
  • End-to-end tests (CLI + dashboard)
  • Historical data validation (backtests)

Documentation (Throughout All Phases)

  • API reference (OpenAPI/Swagger)
  • CLI command reference
  • Architecture guide
  • Deployment guide
  • Development setup
  • Database schema documentation

Success Criteria by Phase

Phase 1 (MVP)

  • ✅ Sub-1s price updates
  • ✅ >70% edge detection accuracy
  • ✅ Calibration within 5% of actual
  • ✅ 10+ beta users

Phase 2 (Calibration)

  • ✅ Fair probability within ±0.08 credible interval
  • ✅ 75%+ edge accuracy
  • ✅ Proper confidence scaling in calibration

Phase 3 (Execution)

  • ✅ Position P&L accurate within 0.1%
  • ✅ Order slippage within ±2%
  • ✅ Backtest engine validated

Phase 4 (ML)

  • ✅ Ensemble beats baseline by 5%+
  • ✅ Semantic search quality validated
  • ✅ 5+ beta users actively using reports

Monetization Timeline

  • Phase 1-2: Free tier (public edges, CLI-only)
  • Phase 3: Launch Pro tier ($99/mo)
  • Phase 4: Enterprise tier (custom, SLA)

Risk Mitigation

Risk | Likelihood | Impact | Mitigation
--- | --- | --- | ---
Gamma API downtime | Medium | High | Cache + fallback to CCIP oracles
Model overfitting | Medium | High | Walk-forward validation, Sharpe caps
Regulatory (US) | Low | High | Position as "decision support", not automated algo
Competitor execution platforms | Medium | Medium | Own research/ML, not just API wrapper
LLM API costs | Low | Medium | Fine-tune open-source (Llama) fallback
Database scaling | Low | Medium | Use managed Postgres (AWS RDS), TimescaleDB

Resource Requirements

  • 1 Backend Engineer (Python/FastAPI, DB, inference)
  • 1 Frontend Engineer (React/Next.js, dashboard)
  • 1 ML Engineer (Bayesian inference, model training)
  • 1 DevOps/Infrastructure (databases, deployments, monitoring)
  • 1 PM/QA (roadmap, testing, user feedback)

Total: 5 FTE (can be compressed with overlap)


Metrics Dashboard

Track progress with:

  • Product KPIs: Active users, edge accuracy, calibration Brier score
  • Technical KPIs: API latency, uptime, backtest duration
  • Business KPIs: Paid users, ARR, retention
  • ML KPIs: Ensemble vs baseline accuracy, edge decay time

Success Definition (Year 1)

Product Metrics

  • 500+ active users (free or paid)
  • $100k+ ARR (from Pro users)
  • 65%+ calibration accuracy (Brier <0.20)
  • 15%+ monthly active trader retention

Business Metrics

  • 50+ paid users ($99/mo Pro tier)
  • 5+ prop shop partnerships
  • Featured in 3+ major crypto/trading publications

ML Metrics

  • Ensemble beats GPT-4o by 5%+
  • Edge decay: median 5 minutes
  • Backtested P&L: +3-8% monthly (risk-adjusted)

Next: Review roadmap with stakeholders, prioritize features, begin Week 1 implementation.