Getting Started

What Can ExoBench Do For You?

Overview of ExoBench capabilities with real examples

ExoBench is an MCP-powered SQL benchmarking service. You connect it to your AI tool, describe your SQL performance problem in plain English, and the AI autonomously spins up real ephemeral databases, generates test data, benchmarks your queries at different scales, and iterates on indexes and rewrites until it converges on the fastest solution.

You don't call ExoBench directly. You describe your problem to your AI assistant and it uses ExoBench behind the scenes. Here's the kind of thing you'd type:

  • Query Optimization: "This query takes 30 seconds on 500K rows. Find me the right index." Paste your query and schema; ExoBench benchmarks before/after and proves the fix.
  • Scaling Analysis: "We look up a user's recent events. 10K rows now, growing to 500K. Will it hold up?" ExoBench tests at five scales, shows the degradation curve, and you can ask the AI to chart the results.
  • Query Technique Validation: "Does cursor pagination actually outperform OFFSET/LIMIT? Benchmark both." Give ExoBench two approaches and it runs them head-to-head with real numbers.
  • Data Distribution Impact: "90% of our rows are status='completed'. Why is the same query 71x slower for that status?" ExoBench shows how the planner picks completely different strategies based on data skew.

Each section below includes the full prompt, actual ExoBench output, and an explanation of what happened.


Query Optimization

You have a slow query. You want to figure out why and fix it.

What you'd type:

I have an orders table joined to customers. This query is slow at scale:

SELECT o.id, o.total, o.created_at, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'shipped'
ORDER BY o.created_at DESC
LIMIT 20

Orders has about 500K rows, customers about 10K. Status is evenly distributed across 5 values. There are no indexes besides the primary keys. Find me the right index.

What ExoBench does:

ExoBench runs the query at 10K, 100K, and 500K orders, first without any indexes, then with a composite index on (status, created_at DESC):

Orders   Without Index   With Index   Speedup
10K      1.7 ms          0.10 ms      17x
100K     14.9 ms         0.13 ms      115x
500K     110.9 ms        0.14 ms      792x

Why: Without the index, Postgres does a Parallel Seq Scan on the orders table, finds all ~100K matching rows, sorts them, and returns 20. With the index on (status, created_at DESC), Postgres walks the index in order, finds the first 20 shipped orders by date, and stops. The execution time stays flat regardless of table size because it never looks past the 20th match.
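The shape of that fix is easy to sketch locally. Below is a minimal illustration using SQLite's EXPLAIN QUERY PLAN rather than ExoBench's real Postgres harness (the plan output differs, but the before/after flip is the same idea); the index name is invented for the example:

```python
import sqlite3

# Hedged sketch: SQLite stand-in for the Postgres databases ExoBench
# actually provisions. Schema follows the prompt above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        status TEXT,
        total REAL,
        created_at TEXT
    );
""")

query = """
    SELECT o.id, o.total, o.created_at, c.name
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    WHERE o.status = 'shipped'
    ORDER BY o.created_at DESC
    LIMIT 20
"""

# Before: only the primary keys exist, so the plan scans orders and sorts.
before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

# The composite index matches the filter column first, then the sort
# column in sort order, so the planner can read rows already sorted.
conn.execute("CREATE INDEX idx_orders_status_created "
             "ON orders (status, created_at DESC)")
after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

for _, _, _, detail in after:
    print(detail)  # orders is now searched via idx_orders_status_created
```

Running the same two EXPLAINs against Postgres (as ExoBench does) shows the analogous flip from Parallel Seq Scan + Sort to an ordered index scan.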

Key takeaway: For filtered + sorted + limited queries, a composite index matching the filter and sort order eliminates both the scan and the sort. The LIMIT becomes effective immediately.

Other prompts that work:

  • "Compare these two query approaches and tell me which scales better." (paste both)
  • "Will adding a composite index on (status, created_at) help this query?"
  • "My DBA suggested a partial index. Benchmark my query with and without it."

Scaling Analysis

You have a query that works fine today, but you need to know if it'll hold up at 10x or 50x the data.

What you'd type:

We have an events table that stores user activity. This query fetches a single user's 10 most recent events. It's fast now at 10K rows, but we're growing to 500K. No indexes besides the primary key. Will it hold up?

SELECT id, action, created_at
FROM events
WHERE user_id = 42
ORDER BY created_at DESC
LIMIT 10

There are about 5,000 distinct users. Test at 10K, 50K, 100K, 250K, and 500K rows.

What ExoBench does:

ExoBench tests the query at five scales without an index, then adds a composite index on (user_id, created_at DESC) and re-runs:

Rows   Without index   With index
10K    0.54 ms         0.04 ms
50K    2.50 ms         0.07 ms
100K   5.12 ms         0.04 ms
250K   25.76 ms        0.04 ms
500K   40.26 ms        0.04 ms

The without-index curve grows with the table. At 250K rows, Postgres even spins up parallel workers to compensate, but the query still takes over 25 ms because it scans the entire table to find ~50 matching rows. The with-index curve is flat: Postgres jumps directly to user 42's section of the index, reads 10 entries in order, and stops. At 500K rows the difference is ~1,000x.

Why: The query returns a fixed-size result (10 rows) from a growing table. Without the index, Postgres must scan all N rows, filter, and sort. The cost grows linearly with table size even though the result never changes. With the index on (user_id, created_at DESC), Postgres walks the B-tree to user 42, reads 10 entries in descending order, and stops. The cost is constant regardless of table size.

Key takeaway: Scaling analysis reveals whether your query's cost grows with the table or with the result set. If the table is growing but the result set isn't, an index can make the difference between "works fine" and "production incident."
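The sweep itself is simple to reproduce in miniature. The sketch below runs a scaled-down version in SQLite (not ExoBench's real ephemeral Postgres instances); the table and index names are invented, and the row counts are trimmed so it finishes in a second or two:

```python
import random
import sqlite3
import time

# Hedged sketch of a scaling sweep: same query, growing table,
# with and without the composite index.
QUERY = ("SELECT id, action, created_at FROM events "
         "WHERE user_id = 42 ORDER BY created_at DESC LIMIT 10")

def bench(n_rows, with_index):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, "
                 "user_id INTEGER, action TEXT, created_at TEXT)")
    random.seed(0)  # ~5,000 distinct users, as in the prompt
    conn.executemany(
        "INSERT INTO events (user_id, action, created_at) VALUES (?, ?, ?)",
        ((random.randrange(5000), "click", f"2025-01-01T{i:08d}")
         for i in range(n_rows)))
    if with_index:
        conn.execute("CREATE INDEX idx_events_user_time "
                     "ON events (user_id, created_at DESC)")
    start = time.perf_counter()
    conn.execute(QUERY).fetchall()
    return time.perf_counter() - start

results = {}
for n in (10_000, 50_000, 100_000):
    results[n] = (bench(n, with_index=False), bench(n, with_index=True))
    scan_ms, idx_ms = (t * 1000 for t in results[n])
    print(f"{n:>7} rows: {scan_ms:6.2f} ms scan vs {idx_ms:.2f} ms indexed")
```

Even at these small scales the unindexed time grows with the table while the indexed time stays flat, which is the degradation curve ExoBench surfaces for you.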

Charting the results

Most AI assistants can generate charts natively — SVG, Mermaid diagrams, Chart.js, or their own built-in visualization tools. Once you have benchmark data, you can follow up in the same conversation and ask the AI to visualize it.

What you'd type:

Now chart those results — show me execution time vs table size, with and without the index.

What the AI produces:

[Chart: execution time vs table size, with and without the index]

The "Without index" curve climbs as the table grows. The "With index" line stays flat — that's O(N) vs O(1) in one picture. A follow-up prompt turns raw benchmark numbers into visual proof you can share with your team.

Other prompts that work:

  • "At what data volume does this query plan change from index scan to seq scan?"
  • "Chart this query's performance across 1K, 10K, 100K, 500K, 1M rows."
  • "Will this join strategy still work at 5M rows or will it switch to a hash join?"

Query Technique Validation

You want to know if a specific SQL pattern actually helps or if it's folklore. Give ExoBench both approaches and it runs them head-to-head.

What you'd type:

Does cursor-based pagination actually outperform OFFSET/LIMIT? Test both at 10K, 100K, and 500K posts. 85% of posts are published. We have an index on (published, created_at DESC).

OFFSET version:

SELECT id, title, created_at FROM posts
WHERE published = true
ORDER BY created_at DESC
OFFSET 50000 LIMIT 20

Cursor version:

SELECT id, title, created_at FROM posts
WHERE published = true AND created_at < '2025-06-15T12:00:00'
ORDER BY created_at DESC
LIMIT 20

What ExoBench does:

ExoBench runs both versions at 10K, 100K, and 500K posts with the (published, created_at DESC) index in place:

Posts   OFFSET/LIMIT   Cursor-based   Speedup
10K     4.2 ms         0.07 ms        60x
100K    39.6 ms        0.10 ms        396x
500K    223.4 ms       0.12 ms        1,861x

Why: OFFSET walks through 50,000 entries and discards them. Cursor pagination (WHERE created_at < ?) starts at the right position in the index and reads exactly 20 rows. It stays constant regardless of how many rows come before the cursor.
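The equivalence of the two styles is worth seeing concretely. This sketch builds a small posts table in SQLite (a stand-in for ExoBench's Postgres setup; names and the 85% published split follow the prompt) and shows that a cursor taken from the previous page returns exactly the same rows as the OFFSET query:

```python
import sqlite3

# Hedged sketch: the two pagination styles side by side.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, "
             "published INTEGER, created_at TEXT)")
conn.executemany(
    "INSERT INTO posts (title, published, created_at) VALUES (?, ?, ?)",
    ((f"post {i}", 1 if i % 100 < 85 else 0, f"2025-01-01T{i:06d}")
     for i in range(10_000)))
conn.execute("CREATE INDEX idx_posts_pub_time "
             "ON posts (published, created_at DESC)")

# OFFSET pagination: the database reads and discards 5,000 rows first.
offset_page = conn.execute(
    "SELECT id, title, created_at FROM posts WHERE published = 1 "
    "ORDER BY created_at DESC LIMIT 20 OFFSET 5000").fetchall()

# Cursor pagination: remember the last created_at you served, then seek
# straight to it in the index. Here the cursor is the row just before
# the OFFSET page, so both queries return the same page.
cursor = conn.execute(
    "SELECT created_at FROM posts WHERE published = 1 "
    "ORDER BY created_at DESC LIMIT 1 OFFSET 4999").fetchone()[0]
cursor_page = conn.execute(
    "SELECT id, title, created_at FROM posts "
    "WHERE published = 1 AND created_at < ? "
    "ORDER BY created_at DESC LIMIT 20", (cursor,)).fetchall()

print(offset_page == cursor_page)  # True: same page, very different cost
```

In your application the cursor would simply be the created_at (plus a tiebreaker id, if timestamps can collide) of the last row on the page you just rendered.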

Key takeaway: Stack Overflow says cursor pagination is better. Now you have the numbers to prove it to your team. ExoBench turns an opinion into evidence.

Other prompts that work:

  • "Is a CTE faster than a subquery for this pattern? Benchmark both."
  • "Should I use EXISTS or IN for this anti-join? Test at scale."
  • "Is a lateral join faster than a correlated subquery here?"

Data Distribution Impact

You suspect your performance issue is data-related, not query-related.

What you'd type:

90% of our transactions have status='completed', about 9.5% are 'pending', and only 0.5% are 'failed'. We have an index on status. This aggregation runs fine when filtering by 'failed' but is slow when filtering by 'completed':

SELECT account_id, SUM(amount), COUNT(*)
FROM transactions
WHERE status = ?
GROUP BY account_id
ORDER BY SUM(amount) DESC
LIMIT 20

We have 500K transactions. Why does the same query with the same index perform completely differently depending on which status we filter?

What ExoBench does:

ExoBench runs the same query twice at 500K rows, once filtering by 'failed' (0.5% of rows) and once by 'completed' (90% of rows):

Filter                 Matching rows      Plan                                 Execution time
status = 'failed'      ~2,500 (0.5%)      Index Scan → HashAggregate           3.9 ms
status = 'completed'   ~450,000 (90%)     Parallel Seq Scan → HashAggregate    280.5 ms

Why: The query planner knows the statistics. When only 0.5% of rows match 'failed', the index is highly selective, so it's cheaper to jump into the index and grab just those rows. When 90% match 'completed', the index is nearly useless, so Postgres ignores it and scans the entire table sequentially. Same query, same index, 71x performance difference purely because of data distribution.
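You can reproduce the skew effect in miniature. The sketch below loads a 90/9.5/0.5 status distribution into SQLite (a stand-in for the Postgres databases ExoBench uses, where the seq-scan-vs-index-scan plan flip is what you'd actually see) and times the same aggregation for both statuses; names and percentages follow the prompt:

```python
import random
import sqlite3
import time

# Hedged sketch: same query, same index, skewed data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (id INTEGER PRIMARY KEY, "
             "account_id INTEGER, status TEXT, amount REAL)")
random.seed(0)

def skewed_status():
    # 90% completed, 9.5% pending, 0.5% failed
    r = random.random()
    return "completed" if r < 0.90 else ("pending" if r < 0.995 else "failed")

conn.executemany(
    "INSERT INTO transactions (account_id, status, amount) VALUES (?, ?, ?)",
    ((random.randrange(1000), skewed_status(), random.uniform(1, 500))
     for _ in range(100_000)))
conn.execute("CREATE INDEX idx_tx_status ON transactions (status)")

QUERY = ("SELECT account_id, SUM(amount), COUNT(*) FROM transactions "
         "WHERE status = ? GROUP BY account_id "
         "ORDER BY SUM(amount) DESC LIMIT 20")

timings = {}
for status in ("failed", "completed"):
    start = time.perf_counter()
    conn.execute(QUERY, (status,)).fetchall()
    timings[status] = time.perf_counter() - start

print({s: f"{t * 1000:.2f} ms" for s, t in timings.items()})
```

The 'failed' filter touches a few hundred rows while 'completed' aggregates roughly 90% of the table, so the runtime gap appears even before the planner-level plan flip that Postgres adds on top.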

Key takeaway: An index only helps when it's selective. If your filter matches most of the table, the planner will ignore it entirely. This is also why telling ExoBench about your data distribution matters. If you don't mention that 90% of rows have the same status, ExoBench will assume an even distribution and produce a plan that doesn't match what happens in production.

Other prompts that work:

  • "Our user_id column has extreme skew, one user has 500K rows, everyone else has <100. Benchmark the query with that distribution."
  • "We have a boolean column that's 99% true. Does the index on it even help?"
  • "Does the join strategy change when one side of the join has 10x more rows than the other?"