Market Discovery

Why Smart Discovery Matters

Polymarket lists hundreds of active markets at any given time, but only a small subset is suitable for automated trading. Smart discovery is the process of identifying these high-potential markets while filtering out noise and low-quality data.

Effective market discovery serves several critical purposes:

Efficiency: Focuses computational and capital resources on markets where a statistical edge is most likely to be found.
Risk Mitigation: Avoids markets with thin liquidity or excessive spreads that can lead to significant slippage.
Accuracy: Prioritizes markets where the strike price is close to the current spot price, as these are the most sensitive to latency-based opportunities.
Profitability: Incorporates theoretical pricing to identify discrepancies between the market price and the statistical probability of an outcome.

Scoring Factors

The MarketScorer class evaluates each market based on five primary factors. Each factor is assigned a weight that determines its influence on the final score.
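
As a rough illustration of how the five weights could combine, the sketch below takes one sub-score per factor and returns a weighted sum. It assumes each sub-score is already normalized to the range 0 to 1 and that the factors are combined linearly; the names `WEIGHTS` and `final_score` are hypothetical and do not come from the MarketScorer source.

```python
# Weights as listed in this section; they sum to 1.0.
WEIGHTS = {
    "strike_proximity": 0.35,
    "volume": 0.25,
    "liquidity": 0.20,
    "spread": 0.10,
    "edge": 0.10,
}


def final_score(sub_scores: dict) -> float:
    """Combine per-factor scores (each assumed to lie in 0..1) into one total."""
    missing = WEIGHTS.keys() - sub_scores.keys()
    if missing:
        raise ValueError(f"missing factor scores: {sorted(missing)}")
    # Assumption: a plain linear combination of the five factors.
    return sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS)
```

With this shape, a market that is perfect on strike proximity but scores zero everywhere else would total 0.35, which matches the intuition that proximity dominates the ranking.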

Strike Proximity (35%)

This is the most significant factor. It measures how close the market’s strike price is to the current reference price of the underlying asset.

Strike-based Markets: For markets with explicit price levels (e.g., “BTC above 100,000”), the score decreases as the distance between the strike and the spot price increases, reaching zero if the distance exceeds 15%.
Daily “Up or Down” Markets: These markets (e.g., “BTC Up or Down on January 11”) compare the price at expiry to the opening price of the day or time window.
- Opening Price Tracking: The bot automatically captures the first reference price at the start of the market’s time window as the “Opening Price”.
- Strike Handling: This captured price is used as the effective strike price for all probability calculations. Until the price is captured, the market is marked as “(awaiting capture)” in the TUI.
- These markets are treated as “at the money” (ATM) by definition for scoring purposes.
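
The proximity rule above might be sketched as follows. The text only says that the score falls with distance and reaches zero beyond 15%, so the linear decay, the use of spot as the denominator, and the function name `strike_proximity_score` are all assumptions made for illustration.

```python
from typing import Optional

MAX_STRIKE_DISTANCE = 0.15  # 15% cutoff stated in the text


def strike_proximity_score(
    spot: float,
    strike: Optional[float],
    is_up_or_down: bool = False,
    max_distance: float = MAX_STRIKE_DISTANCE,
) -> float:
    """Return a score in 0..1: 1.0 at the money, 0.0 at or beyond max_distance."""
    if is_up_or_down or strike is None:
        # "Up or Down" markets are treated as at the money for scoring.
        return 1.0
    distance = abs(strike - spot) / spot  # relative distance (assumed vs. spot)
    # Assumption: linear decay; the real curve may differ.
    return max(0.0, 1.0 - distance / max_distance)
```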

Trading Volume (25%)

Measures 24-hour trading activity. Higher volume indicates a more active and established market, reducing the risk of participating in “dead” or abandoned markets. The bot requires a minimum 24-hour volume (defaulting to 100 USD) to consider a market.

Liquidity Depth (20%)

Evaluates the total depth of the orderbook (combined bids and asks). Markets with high liquidity allow for larger position sizes with minimal price impact. The bot requires minimum liquidity (defaulting to 100 USD) and scores markets higher as they approach 50,000 USD in depth.
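
One way to read the liquidity rule is a hard minimum followed by a score that saturates at 50,000 USD. The sketch below assumes linear scaling up to that level; the function name and the use of `None` to signal "filtered out" are hypothetical choices, not the bot's actual interface.

```python
from typing import Optional

MIN_LIQUIDITY_USD = 100.0      # default minimum from the text
FULL_LIQUIDITY_USD = 50_000.0  # depth at which the score is assumed to saturate


def liquidity_score(depth_usd: float) -> Optional[float]:
    """Return None if the market fails the minimum, else a score in 0..1."""
    if depth_usd < MIN_LIQUIDITY_USD:
        return None  # would be discarded by the hard filter
    # Assumption: linear scaling, capped at 1.0.
    return min(1.0, depth_usd / FULL_LIQUIDITY_USD)
```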

Bid-Ask Spread (10%)

The difference between the highest bid and the lowest ask. Tighter spreads are preferred as they lower the entry and exit costs for trades. Any market with a spread exceeding 15% is automatically filtered out.
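
A minimal sketch of the spread check follows. It assumes the 15% limit is expressed in price units (0.15 on a 0 to 1 token price) and that the score decays linearly with the spread; both are assumptions, as is the name `spread_score`.

```python
from typing import Optional

MAX_SPREAD = 0.15  # 15% limit from the text, read here as 0.15 in price units


def spread_score(best_bid: float, best_ask: float) -> Optional[float]:
    """Return None if the spread exceeds the limit, else a score in 0..1."""
    spread = best_ask - best_bid
    if spread > MAX_SPREAD:
        return None  # filtered out
    # Assumption: tighter spreads score linearly higher.
    return 1.0 - max(spread, 0.0) / MAX_SPREAD
```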

Theoretical Edge (10%)

Compares the current market price of the “Yes” or “No” tokens against a calculated theoretical probability. A larger discrepancy indicates a potential mispricing. The score for this factor scales between a minimum edge of 50 basis points (0.5 percentage points) and a maximum of 500 basis points (5 percentage points).
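
The edge calculation can be sketched as below. The 50 and 500 basis point bounds come from the text; treating the score as a linear ramp between them, and counting only edges where the theoretical probability exceeds the token price, are assumptions. Function names are illustrative.

```python
MIN_EDGE_BPS = 50.0
MAX_EDGE_BPS = 500.0


def edge_bps(theoretical_prob: float, market_price: float) -> float:
    """Edge in basis points; positive when the token looks underpriced."""
    return (theoretical_prob - market_price) * 10_000


def edge_score(theoretical_prob: float, market_price: float) -> float:
    """Map the edge onto 0..1 between the minimum and maximum bounds."""
    edge = edge_bps(theoretical_prob, market_price)
    # Assumption: linear ramp, clamped at both ends.
    scaled = (edge - MIN_EDGE_BPS) / (MAX_EDGE_BPS - MIN_EDGE_BPS)
    return min(1.0, max(0.0, scaled))
```

For a “No” token the same functions would apply with `1 - theoretical_prob` as the theoretical value and the “No” price as the market price.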

The Discovery Process

The bot follows a systematic multi-step process to discover and rank markets. It supports two discovery modes: Series (recommended) and Legacy.

Series-Based Discovery

This is the modern discovery method that leverages Polymarket’s Series API. It is specifically designed to find structured “Up or Down” markets for major crypto assets.

Categorization: It targets specific categories like crypto-updown.
Time Window Prioritization: Discovers markets across multiple time windows (e.g., 15m, hourly, daily) and prioritizes them based on the configured order.
Reliability: Uses structured API data rather than fragile text parsing of market titles.
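
Time window prioritization amounts to ordering discovered markets by the position of their window in a configured list. The sketch below shows that ordering on plain dictionaries; the `window` and `slug` keys, the example slugs, and the `prioritize` helper are hypothetical and do not reflect the Series API response format.

```python
from typing import Dict, List

# Hypothetical configured order: earlier entries are preferred.
WINDOW_ORDER = ["15m", "hourly", "daily"]


def prioritize(markets: List[Dict[str, str]], order: List[str]) -> List[Dict[str, str]]:
    """Sort markets by configured window order; unknown windows go last."""
    rank = {window: index for index, window in enumerate(order)}
    # sorted() is stable, so markets within one window keep their input order.
    return sorted(markets, key=lambda m: rank.get(m["window"], len(order)))


discovered = [
    {"slug": "btc-updown-daily", "window": "daily"},
    {"slug": "eth-updown-15m", "window": "15m"},
    {"slug": "btc-updown-hourly", "window": "hourly"},
]
ordered = prioritize(discovered, WINDOW_ORDER)
```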

Legacy Discovery

The legacy mode uses keyword matching against market slugs and titles.

Asset Selection: Fetches all active markets from the Polymarket API and filters them using keywords to isolate crypto-specific assets like Bitcoin, Ethereum, and Solana.
Slug and API Parsing: Analyzes the market’s metadata to extract key information. The bot uses a dual-source approach:
- Gamma API: Primary source for strike prices using the groupItemTitle field (e.g., “86,000”, “↑ 150,000”), providing high reliability.
- Slug Regex: Fallback source that parses the unique identifier (e.g., “bitcoin-above-100k”) for asset name, strike, and direction.
- Daily Market Detection: Recognizes “up or down” patterns and assigns a daily strike price type.
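
The dual-source approach can be illustrated with two small parsers and a fallback. The regular expressions below are guesses that fit the examples quoted in this section (“86,000”, “↑ 150,000”, “bitcoin-above-100k”); the bot's actual patterns are not shown in the source, and all function names are hypothetical.

```python
import re
from typing import Optional

SLUG_RE = re.compile(
    r"^(?P<asset>[a-z]+)-(?P<direction>above|below)-(?P<strike>\d+(?:\.\d+)?)(?P<suffix>[km])?"
)
MULTIPLIERS = {"k": 1_000, "m": 1_000_000, None: 1}


def parse_group_item_title(title: Optional[str]) -> Optional[float]:
    """Extract a strike such as 86000.0 from '86,000' or '↑ 150,000'."""
    if not title:
        return None
    match = re.search(r"\d[\d,]*(?:\.\d+)?", title)
    return float(match.group().replace(",", "")) if match else None


def parse_slug(slug: str) -> Optional[dict]:
    """Extract asset, direction, and strike from a slug like 'bitcoin-above-100k'."""
    match = SLUG_RE.match(slug)
    if not match:
        return None
    strike = float(match.group("strike")) * MULTIPLIERS[match.group("suffix")]
    return {"asset": match.group("asset"), "direction": match.group("direction"), "strike": strike}


def resolve_strike(group_item_title: Optional[str], slug: str) -> Optional[float]:
    """Prefer the structured title field; fall back to the slug."""
    strike = parse_group_item_title(group_item_title)
    if strike is not None:
        return strike
    parsed = parse_slug(slug)
    return parsed["strike"] if parsed else None
```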

Ranking and Filtering

Regardless of the discovery mode, all candidates go through the same ranking pipeline:

Price Synchronization: Retrieves real-time reference prices from Binance for all identified assets.
Statistical Modeling: Calculates the theoretical probability of the market outcome based on current price, time remaining until expiration, and historical volatility.
Multi-Factor Scoring: Applies the scoring weights to each market.
Hard Filtering: Discards any markets that fail to meet minimum requirements for volume, liquidity, spread, or strike proximity.
Ranking: Sorts the remaining candidates by their final score to produce a prioritized list for the trading strategy.
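
The last two steps of the pipeline, hard filtering and ranking, might look like the sketch below. The thresholds are the defaults quoted earlier in this document; the `Candidate` fields and helper names are hypothetical, and the spread and strike distance are assumed to be expressed as fractions.

```python
from dataclasses import dataclass
from typing import List

MIN_VOLUME_USD = 100.0
MIN_LIQUIDITY_USD = 100.0
MAX_SPREAD = 0.15
MAX_STRIKE_DISTANCE = 0.15


@dataclass
class Candidate:
    slug: str
    volume_24h: float
    liquidity: float
    spread: float           # as a fraction, e.g. 0.02 for 2%
    strike_distance: float  # as a fraction of spot
    score: float            # final weighted score, computed upstream


def passes_hard_filters(c: Candidate) -> bool:
    """True only if the candidate meets every minimum requirement."""
    return (
        c.volume_24h >= MIN_VOLUME_USD
        and c.liquidity >= MIN_LIQUIDITY_USD
        and c.spread <= MAX_SPREAD
        and c.strike_distance <= MAX_STRIKE_DISTANCE
    )


def rank(candidates: List[Candidate]) -> List[Candidate]:
    """Drop failing candidates, then sort the rest by score, best first."""
    kept = [c for c in candidates if passes_hard_filters(c)]
    return sorted(kept, key=lambda c: c.score, reverse=True)
```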

Market Lifecycle and Rotation

The bot continuously monitors the lifecycle of active markets to ensure it only trades relevant, liquid opportunities.

Expiration Monitoring: Markets reaching their expiry (or within the closeoutBufferMs window) are automatically removed from the active trading list.
Market Rotation: When a market expires, the bot triggers a new discovery cycle to find the next highest-scoring market, maintaining the configured maxMarkets limit.
Error Handling: If an orderbook becomes unavailable (e.g., 404 error from the API), the market is immediately removed and replaced.
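
Expiration monitoring and rotation can be sketched as a single pass: drop anything inside the closeout buffer, then refill from the ranked list up to the limit. The parameter names mirror closeoutBufferMs and maxMarkets from the text, but the data shapes and the `rotate` helper are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def is_tradable(expiry_ms: int, now_ms: int, closeout_buffer_ms: int) -> bool:
    """A market is tradable until it enters the closeout buffer before expiry."""
    return now_ms < expiry_ms - closeout_buffer_ms


def rotate(
    active: Dict[str, int],          # slug -> expiry timestamp in ms
    ranked: List[Tuple[str, int]],   # (slug, expiry_ms), best score first
    now_ms: int,
    closeout_buffer_ms: int,
    max_markets: int,
) -> Dict[str, int]:
    """Remove expiring markets and refill from ranked candidates."""
    kept = {
        slug: expiry
        for slug, expiry in active.items()
        if is_tradable(expiry, now_ms, closeout_buffer_ms)
    }
    for slug, expiry in ranked:
        if len(kept) >= max_markets:
            break
        if slug not in kept and is_tradable(expiry, now_ms, closeout_buffer_ms):
            kept[slug] = expiry
    return kept
```

A market whose orderbook returns an error could be handled the same way, by removing it from `active` before calling `rotate`.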

Theoretical Probability Calculation

The bot uses a model inspired by option pricing mathematics to estimate the probability of an outcome. Conceptually, it treats a “Price Above” market as a digital call option and a “Price Below” market as a digital put option.

The calculation considers the following variables:

Reference Price: The current spot price of the asset.
Strike Price: The price level specified in the market.
Time to Expiry: The duration remaining until the market is settled.
Volatility: A measure of how much the asset’s price typically fluctuates, annualized for the calculation.

The result is a probability between 0 and 1, representing the statistical likelihood of the outcome occurring. This value is compared to the market price (also between 0 and 1) to determine if there is a theoretical edge.
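
The source does not give the exact formula, so the following is only a sketch of one common option-style model: the undiscounted value of a digital call under a lognormal price with zero drift, N(d2). The zero-drift choice, the function names, and the handling of expired markets are assumptions.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_above(spot: float, strike: float, years_to_expiry: float, annual_vol: float) -> float:
    """Probability that the price finishes above the strike (digital call)."""
    if years_to_expiry <= 0 or annual_vol <= 0:
        # At or past expiry the outcome is already determined.
        return 1.0 if spot > strike else 0.0
    sigma_t = annual_vol * math.sqrt(years_to_expiry)
    # d2 from the Black-Scholes framework with zero drift (assumption).
    d2 = (math.log(spot / strike) - 0.5 * sigma_t ** 2) / sigma_t
    return norm_cdf(d2)


def prob_below(spot: float, strike: float, years_to_expiry: float, annual_vol: float) -> float:
    """Probability that the price finishes at or below the strike (digital put)."""
    return 1.0 - prob_above(spot, strike, years_to_expiry, annual_vol)
```

Under this model a market with the strike slightly above spot and one day to expiry returns a probability a little under 0.5, and the value moves toward 0 or 1 as expiry approaches or as the strike moves away from spot.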

Example: Scoring a Bitcoin Market

Consider a market asking if BTC will be above 65,000 USD with the following characteristics:

Current BTC Price: 64,800 USD
24h Volume: 50,000 USD
Total Liquidity: 15,000 USD
Bid-Ask Spread: 2%
Theoretical Probability: 0.48
Current Market Price: 0.45

In this scenario:

Strike Proximity: High score because 65,000 is very close to 64,800 (0.3% distance).
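Volume & Liquidity: Both clear the 100 USD minimums by a wide margin. Volume of 50,000 USD indicates an active market, while liquidity of 15,000 USD is 30% of the 50,000 USD depth toward which the liquidity score scales.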
Spread: Receives a high score for being well below the 15% maximum allowed spread.
Edge: Provides a positive score as the theoretical probability (0.48) is higher than the market price (0.45), indicating a 300 basis point edge.

The weighted sum of these factors would result in a high final score, likely placing this market at the top of the discovery list for the trading engine.
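
The figures quoted in this example can be checked with a few lines of arithmetic. The snippet below only reproduces the derived quantities (strike distance, edge, and how far liquidity and spread sit from their reference levels); it does not compute the bot's actual sub-scores, and the variable names are illustrative.

```python
spot, strike = 64_800.0, 65_000.0
theoretical, market_price = 0.48, 0.45
liquidity_usd, spread = 15_000.0, 0.02

# Distance between strike and spot, relative to spot.
strike_distance_pct = abs(strike - spot) / spot * 100

# Theoretical edge in basis points (1 bp = 0.0001 in probability terms).
edge_in_bps = (theoretical - market_price) * 10_000

# Position relative to the reference levels quoted in this document.
liquidity_fraction = liquidity_usd / 50_000.0  # share of the 50,000 USD depth level
spread_fraction = spread / 0.15                # share of the 15% maximum spread

print(f"strike distance: {strike_distance_pct:.2f}%")
print(f"edge: {edge_in_bps:.0f} bps")
print(f"liquidity vs. 50,000 USD: {liquidity_fraction:.0%}")
print(f"spread vs. 15% limit: {spread_fraction:.0%}")
```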