# Claude on Claude Code's default fetch: “wasted work.” · Onto
> We ran the same prompt against news.ycombinator.com with and without the Onto MCP, then asked Claude to grade its own behavior. The default re-fetched the same page three times for three follow-ups. Onto fetched it once. Claude called the default “wasted work” — verbatim. Three findings compounded: 1 fetch vs N, citation accuracy unlocked, ~2× faster cached queries.

**Source:** /case-studies/fetch-once
**Extracted:** 2026-05-21T20:21:01.044Z

---
Research · self-test · Onto MCP vs Claude Code default fetch · May 2026

## Claude on Claude Code's default fetch:  
_“wasted work.”_

We ran the same prompt against `news.ycombinator.com` with and without the Onto MCP, then asked Claude to grade its own behavior. **The default fetched the same page three times** for three follow-up questions. Onto fetched it once. **Claude called its own default behavior “wasted work.”** Three findings compounded: 1 fetch vs N, source attribution that the default strips, and ~2× faster cached follow-ups. Verbatim transcripts below. Reproducible. Verifiable. Live.

[Live target · news.ycombinator.com](https://news.ycombinator.com)

_5×_ network reduction. _~2×_ speed on cached queries.  
Citation accuracy unlocked as a byproduct.

1 vs N · Round trips

13 / 0 · Domains preserved

~2× · Cached speed

// The setup

Two Claude Code sessions. Identical prompts. Identical target: `news.ycombinator.com`. Session A uses Claude's default WebFetch tool. Session B uses the Onto MCP via `mcp__onto__read_url`. In each session we asked Claude to fetch the front page and answer three follow-up questions: titles 1–3, titles 4–8, titles 9–15.

After the third follow-up we asked Claude one final question: _"explain what you just did and why."_ Both explanations are quoted verbatim in Finding 1 below. Nobody at Onto needs to make the argument — Claude explains the difference from inside the tool.

00 // The three findings

### Three multipliers stacking, not one.

The original insight was "fetch once, query forever." The data from this session shows three separate wins for an agent reading the web, and each is valuable on its own.

01 · Finding

Network round trips

How many fetches did 3 follow-up queries trigger?

1 vs 3 fetches across the full session

Without Onto, each of the 3 follow-ups triggered its own fetch of the same page. With Onto, the first read returned all 30 stories on the page, and the later follow-ups were answered from context. For an agent doing 100 follow-ups on a source, that's 99 fewer network round trips.

Without Onto

3 fetches

With Onto

1 fetch

02 · Finding

Citation preservation

Did the agent return source domains alongside the titles?

13 / 0 source domains preserved vs lost

Onto's clean Markdown extraction returns titles with their source domains in parentheses. The default WebFetch model strips them. For research agents, fact-checkers, journalism pipelines, and RAG systems, this is the difference between trustable and untrustable output.

Without Onto

Titles only

With Onto

Titles + domains

03 · Finding

Time per response

How long did Claude take to answer each follow-up?

5s vs 10s on the same cached query

On the second follow-up, which needed no fetch, Onto returned in 5 seconds against 10 seconds without: a 2× speed improvement. Across the whole session: 35 seconds without Onto vs 29 seconds with, even though the third query, which covered 7 stories, took 12 seconds with Onto.

Cached query

5s with · 10s without

Session total (3)

29s with · 35s without

01 // Finding 1 — Network round trips

### 3 follow-ups, 3 fetches without Onto. 1 with.

The terminal output below is verbatim from both sessions. Without Onto, every follow-up triggered a separate fetch — three round trips for the same page. With Onto, the first read returned all 30 stories; the follow-ups were served from context.

Session A · Default WebFetch · verbatim

3 follow-ups → _3 fetches_

claude-code · session-a

❯ Fetch news.ycombinator.com and give me the first 3 story titles.
● Fetch(https://news.ycombinator.com)
  ⎿  Received 33.4KB (200 OK)
● The top 3 Hacker News stories:
  Amazon, Facebook, FBI have access to a private
  intelligence-sharing network
  BBEdit 16
  Project Hail Mary – Stellar Navigation Chart
✻ Brewed for 14s
❯ Give me the next 5 stories.
● Fetch(https://news.ycombinator.com)
  ⎿  Received 33.4KB (200 OK)
● Stories 4–8:
  More than 340 local news outlets are limiting
  the Internet Archive's access
  Flipper One – we need your help
  Indexing a year of video locally on a 2021
  MacBook with Gemma4-31B (50GB swap)
  Where are all the UK red telephone kiosks?
  Python 3.15: features that didn't make the headlines
✻ Cooked for 10s
❯ Now give me stories 9 through 15.
● Fetch(https://news.ycombinator.com)
  ⎿  Received 33.4KB (200 OK)
● Stories 9–15:
  Lost Images from the 1945 Trinity Nuclear Test Restored
  Launch HN: Runtime (YC P26) – Sandboxed coding
  agents for everyone on a team
  ParadeDB (YC S23) Is Hiring Distributed Systems
  Bournegol???
  Mounting Git commits as folders with NFS
  Show HN: Agent.email – sign up via curl, claim with OTP
  We're testing new ad formats in Search…
✻ Churned for 11s
❯ Why did you fetch the page again instead of using what you already had?
● _(see Claude's self-explanation below)_

Session B · Onto MCP · verbatim

3 follow-ups → _1 fetch_

claude-code · session-b

❯ Use Onto to fetch news.ycombinator.com and give me the first 3 story titles.
  Called onto (ctrl+o to expand)
● The first 3 story titles on Hacker News:
  Amazon, Facebook, FBI have access to a private
  intelligence-sharing network (prismreports.org)
  BBEdit 16 (barebones.com)
  Project Hail Mary – Stellar Navigation Chart
  (valhovey.github.io)
✻ Cogitated for 12s
❯ Give me the next 5 stories.
● Stories 4–8 from Hacker News:
  More than 340 local news outlets are limiting
  the Internet Archive's access (niemanlab.org)
  Flipper One – we need your help (flipper.net)
  Indexing a year of video locally on a 2021
  MacBook with Gemma4-31B (simbastack.com)
  Where are all the UK red telephone kiosks?
  (thek6project.co.uk)
  Python 3.15: features that didn't make the headlines
  (changs.co.uk)
✻ Churned for 5s
❯ Now give me stories 9 through 15.
● Stories 9–15 from Hacker News:
  Lost Images from the 1945 Trinity Nuclear Test
  Restored (ieee.org)
  Launch HN: Runtime (YC P26) (runtm.com)
  ParadeDB (YC S23) Is Hiring
  Bournegol??? (schmorp.de)
  Mounting Git commits as folders with NFS (jvns.ca)
  Show HN: Agent.email – sign up via curl, claim with OTP
  We're testing new ad formats in Search…
  (blog.google)
✻ Cogitated for 12s
❯ Why didn't you fetch the page again this time?
● _(see Claude's self-explanation below)_

Claude explains both sessions, in its own words

Claude · without Onto · verbatim

Fair point. WebFetch doesn't return the raw page content to me — it runs my prompt against the page through a separate small model and only returns that model's answer. So when I asked for “the first 3 titles,” I only got those 3 back, not the full list cached for me to slice later. I could have asked for all ~30 front-page titles in the first call and then answered subsequent questions from that, which would have been cheaper. I didn't anticipate the follow-ups. The 15-minute cache means the refetches were fast, but you're right that it was wasteful.

Claude · with Onto · verbatim

I still had the full page contents from my first mcp\_\_onto\_\_read\_url call in this conversation — it returned all 30 stories on the front page, not just the top 3. So stories 4–8 and 9–15 were already in my context, and re-fetching would have been wasted work (and would have hit the network again for the same data). If you want fresher data (e.g., the rankings have shifted), I can re-fetch with fresh: true.
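The pattern Claude describes, reading the full page once and answering later questions from what it already holds, can be sketched in a few lines of Python. This is an illustration only, not Onto's or Claude Code's implementation: `SessionPageCache`, `fake_fetch`, and `titles` are hypothetical names, the `fresh` parameter is borrowed from the `fresh: true` option Claude mentions, and the fetcher is a stand-in that touches no network.

```python
from typing import Callable, Dict


class SessionPageCache:
    """Hypothetical per-session cache: one fetch per URL unless fresh=True."""

    def __init__(self, fetcher: Callable[[str], str]) -> None:
        self._fetcher = fetcher          # injected so the sketch needs no network
        self._pages: Dict[str, str] = {}
        self.fetch_count = 0             # round trips actually made

    def read(self, url: str, fresh: bool = False) -> str:
        # Fetch on first sight of the URL, or when the caller asks for fresh data.
        if fresh or url not in self._pages:
            self._pages[url] = self._fetcher(url)
            self.fetch_count += 1
        return self._pages[url]


def fake_fetch(url: str) -> str:
    """Stand-in fetcher returning 30 numbered placeholder titles."""
    return "\n".join(f"Story {i}" for i in range(1, 31))


def titles(page: str, first: int, last: int) -> list:
    """Slice titles first..last (1-indexed, inclusive) out of the cached page."""
    return page.splitlines()[first - 1:last]


cache = SessionPageCache(fake_fetch)
url = "https://news.ycombinator.com"
top3 = titles(cache.read(url), 1, 3)
next5 = titles(cache.read(url), 4, 8)
rest = titles(cache.read(url), 9, 15)
print(cache.fetch_count)  # one fetch served all three queries
```

Calling `cache.read(url, fresh=True)` forces a second round trip, which mirrors the trade-off Claude points out: cached answers are cheap, but rankings on a live page can shift.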

02 // Finding 2 — Citation preservation

### Onto returns source domains. The default fetch strips them.

This is the finding we didn't set out to test — and it's arguably more important than the fetch-once benefit. The default WebFetch model returns titles only. Onto returns titles _with_ their source domains preserved. For any agent that needs to cite sources, that is the difference between trustable and untrustable output.

Without Onto · domains lost

session-a · titles only

● Amazon, Facebook, FBI have access to a private
  intelligence-sharing network
● BBEdit 16
● Project Hail Mary – Stellar Navigation Chart
● Flipper One – we need your help
● Lost Images from the 1945 Trinity Nuclear Test Restored
…titles continue, no sources attached

The agent has no way to cite, verify, or weight any of these claims by source authority.

With Onto · domains preserved

session-b · titles + sources

● Amazon, Facebook, FBI have access to a private
  intelligence-sharing network (prismreports.org)
● BBEdit 16 (barebones.com)
● Project Hail Mary – Stellar Navigation Chart
  (valhovey.github.io)
● Flipper One – we need your help (flipper.net)
● Lost Images from the 1945 Trinity Nuclear Test
  Restored (ieee.org)
…13 domains preserved across the 3 follow-ups

Citation accuracy is a byproduct of Onto returning the full clean Markdown — not a feature we built on top.

13 source domains preserved in the Onto session

prismreports.org · barebones.com · valhovey.github.io · niemanlab.org · flipper.net · simbastack.com · thek6project.co.uk · changs.co.uk · ieee.org · runtm.com · schmorp.de · jvns.ca · blog.google
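For a pipeline that consumes these lines, the trailing domain can be split from the title mechanically. The sketch below assumes the `Title (domain)` line shape seen in the Session B transcript. `split_title_and_domain` and its regular expression are illustrative, not part of Onto.

```python
import re
from typing import Optional, Tuple

# Trailing "(domain.tld)" token: lowercase, contains a dot, no spaces,
# so a parenthetical such as "(YC S23)" is not mistaken for a source.
DOMAIN_SUFFIX = re.compile(
    r"^(?P<title>.*\S)\s+\((?P<domain>[a-z0-9-]+(?:\.[a-z0-9-]+)+)\)$"
)


def split_title_and_domain(line: str) -> Tuple[str, Optional[str]]:
    """Return (title, domain); domain is None when the line carries no source."""
    text = line.strip()
    match = DOMAIN_SUFFIX.match(text)
    if match is None:
        return text, None
    return match.group("title"), match.group("domain")


lines = [
    "BBEdit 16 (barebones.com)",
    "Launch HN: Runtime (YC P26) (runtm.com)",
    "ParadeDB (YC S23) Is Hiring",
]
for line in lines:
    print(split_title_and_domain(line))
```

Lines without a source, such as the hiring post, come back with `None` in place of a domain. That is why the session preserves 13 domains across 15 titles.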

03 // Finding 3 — Response speed

### Cached follow-ups return in half the time.

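The 5-second response on query 2 with Onto is the strongest data point on this page. Same question, same model: 10 seconds without Onto, 5 seconds with. The difference is the cost of re-fetching a page the session has already read. Query 3 did not repeat the gap (11 seconds without, 12 with), so the 2× figure is a single measurement, not a session average.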

| Query | Without Onto | With Onto |
| --- | --- | --- |
| Query 1 | 14s · fetch | 12s · single fetch |
| Query 2 | 10s · fetch | 5s · no fetch · 2× faster |
| Query 3 | 11s · fetch | 12s · no fetch · 7 stories |
| Total · 3 | 35s · avg 11.7s | 29s · avg 9.7s |
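The totals, averages, and per-query ratios can be rechecked from the six timings in the transcripts. A short sketch, with illustrative variable names:

```python
# Per-query response times in seconds, copied from the two transcripts.
without_onto = [14, 10, 11]
with_onto = [12, 5, 12]

total_without = sum(without_onto)
total_with = sum(with_onto)
avg_without = total_without / len(without_onto)
avg_with = total_with / len(with_onto)

# A ratio above 1 means the Onto session answered that query faster.
ratios = [round(a / b, 2) for a, b in zip(without_onto, with_onto)]

print(total_without, total_with)                    # 35 29
print(round(avg_without, 1), round(avg_with, 1))    # 11.7 9.7
print(ratios)                                       # [1.17, 2.0, 0.92]
```

Only query 2 shows the 2× ratio. Query 1 is close to parity, and query 3 is slightly slower with Onto.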

04 // The economic compound

### For 50 sources × 5 follow-ups, the three multipliers stack.

One source, three follow-ups, three findings is a demo. The real economics show up when you project the same pattern across a research workflow. Below: an agent reading 50 sources with 5 follow-ups each — 250 interactions total.

50 vs 250

Network fetches

5× reduction · 1 per source

1,235 s

Total wait time

vs 2,925 s · ~58% less wait time

Every response

Citation preservation

vs none without Onto

5× fewer

Page-content tokens

compounded across follow-ups
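The fetch counts and the percentage above follow from simple arithmetic. The sketch below assumes, as the 50 vs 250 figure implies, that each of the 5 queries per source costs one fetch without Onto and that only the first does with it. The 2,925 s figure matches 250 interactions at the 11.7 s session average. The 1,235 s figure is taken from this page as given, not re-derived.

```python
sources = 50
queries_per_source = 5

interactions = sources * queries_per_source      # 250 interactions
fetches_without = interactions                   # assumed: one fetch per query
fetches_with = sources                           # assumed: one fetch per source
reduction = fetches_without / fetches_with       # 5.0

avg_without_s = 11.7                             # session average without Onto
wait_without_s = interactions * avg_without_s    # about 2,925 s
wait_with_s = 1235                               # quoted projection, taken as given

time_saved = 1 - wait_with_s / wait_without_s
print(fetches_with, fetches_without, reduction)
print(round(wait_without_s), f"{time_saved:.0%}")
```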

05 // Install

### Three lines. Five minutes. Reproduce this on your own machine.

Add Onto MCP to your Claude Code config and re-run the same prompts on any URL. The findings reproduce without instrumentation.

~/.claude.json · mcpServers

{
  "mcpServers": {
    "onto": {
      "command": "npx",
      "args": ["-y", "@onto/mcp"]
    }
  }
}
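If `~/.claude.json` already holds other settings, the entry needs to be merged in, not pasted over the file. A minimal sketch of that merge on an in-memory dict follows. `add_onto_server` is a hypothetical helper, the `other` server is a placeholder, and reading and writing the file is left to the reader.

```python
import copy
import json

# The entry shown in the config above.
ONTO_ENTRY = {"command": "npx", "args": ["-y", "@onto/mcp"]}


def add_onto_server(config: dict) -> dict:
    """Return a copy of config with mcpServers.onto set, keeping other keys."""
    merged = copy.deepcopy(config)
    merged.setdefault("mcpServers", {})["onto"] = copy.deepcopy(ONTO_ENTRY)
    return merged


# Placeholder for a config that already lists another MCP server.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
updated = add_onto_server(existing)
print(json.dumps(updated, indent=2))
```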

1,000 free requests / month

No card required

No SDK changes

Reproduce this

### Try it. The findings reproduce.

Onto MCP is free up to 1,000 requests per month. No card, no signup friction. If you don't see the same three wins on your own workload, tell us.

[Install Onto MCP](/mcp-server) // 3 lines · 5 minutes · buildonto.dev
