Operational Runbooks

This guide covers common operational tasks for running Hydra in production.

Before starting the bot:

  1. Verify configuration

    grep mode config.yaml # Should match intended mode
  2. Check credentials (live mode only)

    # Ensure secrets are set
    printf '%s' "$POLYMARKET_PRIVATE_KEY" | wc -c # Should be > 60 chars
  3. Verify network connectivity

    curl -s https://gamma-api.polymarket.com/health
    curl -s https://api.binance.com/api/v3/ping
  4. Check network latency (critical for latency arbitrage)

    hydra latency # Using binary
    bun run latency # From source

    Expected results for production deployment:

    • Binance WebSocket: <50ms (ideally <10ms from Tokyo)
    • Polymarket REST: <200ms

    If Binance latency >100ms, consider deploying to AWS ap-northeast-1 (Tokyo).

  5. Check disk space

    df -h ./runs # Ensure adequate space for logs
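
The credential and disk-space checks above can be bundled into a script so they run the same way before every start. This is a minimal sketch, not part of Hydra: the function names and the default 1 GB free-space floor are assumptions, while the 60-character key threshold comes from step 2.

```shell
# Hypothetical pre-flight helpers; names and the free-space floor are assumptions.

# Succeeds if the given secret is longer than 60 characters (step 2).
key_length_ok() {
  [ "${#1}" -gt 60 ]
}

# Succeeds if the filesystem holding directory $1 has at least $2 KB available.
disk_space_ok() {
  local dir="$1" min_kb="$2" free_kb
  # -P forces the portable one-line-per-filesystem format; column 4 is "Available".
  free_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
  [ -n "$free_kb" ] && [ "$free_kb" -ge "$min_kb" ]
}

# Runs both checks, prints one line per result, returns non-zero on any failure.
# $1 = directory to check (default .), $2 = minimum free KB (default 1048576).
preflight() {
  local rc=0
  if key_length_ok "${POLYMARKET_PRIVATE_KEY:-}"; then
    echo "key: ok"
  else
    echo "key: missing or too short"
    rc=1
  fi
  if disk_space_ok "${1:-.}" "${2:-1048576}"; then
    echo "disk: ok"
  else
    echo "disk: low"
    rc=1
  fi
  return "$rc"
}
```

Calling `preflight ./runs` before a live start prints the two results and fails if either check does not pass. In paper mode the key check does not apply, so only `disk_space_ok` is relevant there.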

Start the bot:

# Paper trading
hydra run # Using binary
bun run paper # From source
# Live trading
hydra run --mode live # Using binary
bun run bot # From source
# With Docker
docker compose up -d hydra

After startup, verify that the bot is running:

  1. Check that the IPC server is listening:

    curl http://localhost:8787/health
  2. Connect TUI to verify data flow:

    hydra tui # Using binary
    bun run tui # From source
  3. Check logs for market discovery:

    tail -f runs/*/events.jsonl | grep MarketSelected
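
The IPC server may take a moment to come up, so a single `curl` right after start can fail spuriously. The sketch below polls the health endpoint with a bounded number of attempts. The function name, attempt count, and delay are assumptions, and it treats any successful HTTP response from `/health` as healthy, which may not match what Hydra actually reports.

```shell
# Polls a health URL until it answers successfully or attempts run out.
# $1 = URL, $2 = max attempts (default 10), $3 = seconds between attempts (default 2)
wait_for_health() {
  local url="$1" attempts="${2:-10}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP error responses (4xx/5xx).
    if curl -fsS --max-time 3 -o /dev/null "$url" 2>/dev/null; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then sleep "$delay"; fi
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempt(s)" >&2
  return 1
}
```

For example, `wait_for_health http://localhost:8787/health` can run directly after `docker compose up -d hydra`, before connecting the TUI.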
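
| Metric         | Normal Range | Alert Threshold    |
|----------------|--------------|--------------------|
| Orders/minute  | 0-30         | >30 (rate limited) |
| Drawdown       | 0-5%         | >10%               |
| Data staleness | <500ms       | >3000ms            |
| Fill rate      | >80%         | <50%               |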
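
The thresholds in the table can be encoded in a small helper for use in an alerting script. This is a sketch only: the metric names passed as the first argument are hypothetical, and the "elevated" label for readings that fall between the normal range and the alert threshold is an assumption, since the table does not name that band.

```shell
# Prints "normal", "elevated", or "alert" for a metric reading.
# $1 = metric name (hypothetical identifiers), $2 = numeric value
classify_metric() {
  awk -v m="$1" -v v="$2" 'BEGIN {
    if (m == "orders_per_minute")  s = (v > 30)   ? "alert" : "normal"
    else if (m == "drawdown_pct")  s = (v > 10)   ? "alert" : (v > 5    ? "elevated" : "normal")
    else if (m == "staleness_ms")  s = (v > 3000) ? "alert" : (v >= 500 ? "elevated" : "normal")
    else if (m == "fill_rate_pct") s = (v < 50)   ? "alert" : (v <= 80  ? "elevated" : "normal")
    else                           s = "unknown"
    print s
  }'
}
```

For example, `classify_metric staleness_ms 3500` prints `alert`, while a reading of `120` prints `normal`.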

Healthy system:

  • TUI shows real-time price updates
  • Reference prices updating every ~100ms
  • No staleness warnings in logs
  • Positions reconcile with exchange

Unhealthy indicators:

  • Stale data warnings
  • Kill switch triggered
  • No orders for extended periods
  • Position mismatch warnings

Log monitoring commands:

# Watch for errors
tail -f runs/*/events.jsonl | grep -E '"type":"(Error|KillSwitch|RiskTrip)"'
# Watch order flow
tail -f runs/*/events.jsonl | grep -E '"type":"Order(Placed|Filled)"'
# Monitor reference prices
tail -f runs/*/events.jsonl | grep ReferencePriceEvent
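
For a quick summary of a finished or running session, the same `"type"` field that the commands above filter on can be counted per event type. This sketch assumes each line of `events.jsonl` is one JSON object with a `"type":"..."` string and no whitespace around the colon, as the grep patterns above imply. The function name is hypothetical, and any nested `"type"` fields would be counted too.

```shell
# Prints "<count> <type>" for every event type in the given JSONL files,
# most frequent first.
count_event_types() {
  grep -ho '"type":"[^"]*"' "$@" \
    | cut -d'"' -f4 \
    | sort | uniq -c | sort -rn \
    | awk '{ print $1, $2 }'
}
```

For example, `count_event_types runs/*/events.jsonl` gives a rough picture of how many orders were placed versus filled and whether any error events occurred.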

Graceful shutdown:

  1. Stop new orders - The bot handles SIGTERM gracefully

    docker compose stop hydra
    # or
    kill -TERM $(pgrep -f "bun.*main.ts")
  2. Verify open orders cancelled - Check logs for order cancellation confirmations

  3. Archive session data

    tar -czf session-$(date +%Y%m%d-%H%M%S).tar.gz runs/
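
When the bot runs outside Docker, it helps to know whether SIGTERM actually took effect before moving on. The sketch below sends SIGTERM and waits a bounded time for the process to exit. The function name and the 30-second default are assumptions. It deliberately does not escalate to SIGKILL, because a forced stop requires cancelling open orders manually.

```shell
# Sends SIGTERM to a PID and waits up to $2 seconds (default 30) for it to exit.
# Returns 0 if the process is gone, 1 if it is still running after the timeout.
stop_gracefully() {
  local pid="$1" timeout="${2:-30}" waited=0
  kill -TERM "$pid" 2>/dev/null || true
  # kill -0 sends no signal; it only tests whether the process still exists.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "PID $pid still running after ${timeout}s" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

For example, `stop_gracefully "$(pgrep -f "bun.*main.ts")"` returns non-zero if the bot has not exited in time, which is the point at which the forced-stop procedure applies.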

If graceful shutdown fails:

# Force stop
docker compose kill hydra
# or
kill -9 $(pgrep -f "bun.*main.ts")
# IMPORTANT: Manually cancel any open orders via Polymarket UI

Symptoms: Bot stops trading, “KILL SWITCH TRIGGERED” in logs

What happens automatically:

  1. Risk mode is set to killed, blocking all new orders
  2. All open orders are cancelled via tradingService.cancelAllOrders()
  3. All positions are neutralized (UP/DOWN tokens sold to close positions)
  4. Detailed warnings are logged with the trigger reason

Actions:

  1. Check trigger reason in logs (look for “KILL SWITCH” entries)
  2. If data staleness: Check network/API connectivity
  3. If drawdown: Review recent trades, check position sizing
  4. If position/exposure limit: Review limit configuration vs trading strategy
  5. If manual: The trigger was intentional; verify the reason before restarting

Recovery:

  1. Fix underlying issue
  2. Verify all positions were properly neutralized via Polymarket UI
  3. Restart bot with fresh state
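
To find the trigger reason quickly, the log can be filtered for kill-switch entries. This sketch assumes the entries appear either as the "KILL SWITCH" text quoted above or as events of type `KillSwitch`, as in the monitoring commands. The exact log format is not specified here, so the pattern may need adjusting, and the function name is hypothetical.

```shell
# Prints the most recent $2 (default 5) kill-switch lines found in log file $1.
# Matches "KILL SWITCH" and "KillSwitch", case-insensitively.
kill_switch_entries() {
  local file="$1" n="${2:-5}"
  grep -iE 'kill ?switch' "$file" | tail -n "$n"
}
```

For example, `kill_switch_entries runs/<session>/events.jsonl 1` shows only the latest entry, where `<session>` stands for the session directory being inspected.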

Symptoms: “Position mismatch” warnings, PnL looks wrong

Actions:

  1. Check Polymarket UI for actual positions
  2. Compare with bot’s reported positions in TUI
  3. If mismatch persists, restart bot to force resync

Symptoms: Stale data warnings, no price updates

Actions:

  1. Check Binance/Polymarket API status
  2. Verify network connectivity
  3. Check for IP rate limiting

Recovery: The bot reconnects automatically. If the problem persists, restart it.
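
The three checks above can be combined into one probe that reports the HTTP status of each endpoint. This is a sketch under assumptions: the function names are hypothetical. HTTP 429 is the standard status for rate limiting; treating 418 the same way follows Binance's documented IP-ban response and may not apply to Polymarket. `curl` reports `000` when it receives no HTTP response at all.

```shell
# Maps an HTTP status code to a rough diagnosis.
classify_http_status() {
  case "$1" in
    2??)     echo "ok" ;;
    429|418) echo "rate-limited" ;;
    000)     echo "unreachable" ;;
    5??)     echo "server-error" ;;
    *)       echo "unexpected" ;;
  esac
}

# Prints "<url> <status> <diagnosis>" for each URL given.
probe_endpoints() {
  local url code
  for url in "$@"; do
    code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$url") || true
    echo "$url $code $(classify_http_status "$code")"
  done
}
```

For example, `probe_endpoints https://gamma-api.polymarket.com/health https://api.binance.com/api/v3/ping` checks the two endpoints used in the pre-flight connectivity step.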

Symptoms: Fills at worse prices than expected

Actions:

  1. Check market liquidity in TUI
  2. Review slippage protection settings
  3. Consider reducing position sizes
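
To put a number on "worse than expected", slippage can be expressed as a percentage of the expected price. The helper below is a worked-arithmetic sketch only. The function name is hypothetical, and how Hydra itself computes slippage for its slippage protection settings is not stated here, so the two may differ.

```shell
# Prints slippage as a percentage of the expected price, to two decimals.
# $1 = expected price, $2 = fill price. Positive means a buy filled above
# the expected price.
slippage_pct() {
  awk -v expected="$1" -v fill="$2" \
    'BEGIN { printf "%.2f\n", (fill - expected) / expected * 100 }'
}
```

For example, an order expected at 0.50 that fills at 0.51 gives `slippage_pct 0.50 0.51`, which prints `2.00`.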

Prevention:

  • Increase minLiquidityUSDC
  • Decrease maxSlippagePercent
  • Use more conservative edgeThreshold

Routine maintenance:

  • Review overnight PnL
  • Check for warning/error logs
  • Verify positions match exchange
  • Monitor fill rates
  • Archive old session logs
  • Review market discovery scores
  • Check for Polymarket API changes
  • Update dependencies if needed
  • Full strategy performance review
  • Adjust risk parameters based on data
  • Test disaster recovery procedures
  • Rotate API credentials if policy requires
  1. Restore config from backup/version control
  2. Regenerate API keys if needed
  3. Start in paper mode to verify
  4. Switch to live after validation

To clear local state and resync from the exchange:

# Stop bot
docker compose stop hydra
# Clear state (positions will resync from exchange)
rm -rf runs/*
# Restart
docker compose up -d hydra
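
Because `rm -rf runs/*` discards the session logs, it can be safer to archive them first, reusing the archive naming from the shutdown procedure. The sketch below does both in one step and skips the delete if archiving fails. The function name and the separate archive directory are assumptions.

```shell
# Archives the runs directory $1 into directory $2, then clears it.
# Prints the archive path. Leaves the runs directory untouched if tar fails.
archive_and_clear_runs() {
  local runs_dir="$1" archive_dir="$2" archive
  [ -d "$runs_dir" ] || return 1
  mkdir -p "$archive_dir" || return 1
  archive="$archive_dir/session-$(date +%Y%m%d-%H%M%S).tar.gz"
  tar -czf "$archive" -C "$(dirname "$runs_dir")" "$(basename "$runs_dir")" || return 1
  # ${runs_dir:?} aborts instead of expanding to "/*" if the variable is empty.
  rm -rf "${runs_dir:?}"/*
  echo "$archive"
}
```

Run it between the stop and restart commands, for example `archive_and_clear_runs ./runs ./archive`.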

The bot handles reconnection automatically. After an extended outage:

  1. Check all positions reconciled correctly
  2. Review any orders that may have filled during outage
  3. Verify reference prices are fresh before trading resumes

Current limits:

  • ~30 orders/minute (Polymarket rate limit)
  • ~15 markets simultaneously (recommended max)
  • ~1GB memory typical usage

Running multiple instances is not currently supported. It will cause:

  • Order conflicts
  • Position tracking issues
  • Rate limit exhaustion

If issues persist after following runbooks:

  1. Collect logs: tar -czf debug.tar.gz runs/
  2. Note exact error messages
  3. Document steps taken
  4. Open GitHub issue with details