NAME
readable — Article extraction service built for AI agents—handles client-side rendered content with a real browser backend
SYNOPSIS
status: active
tags: service, web
website: https://readable.page
Overview
Readable is a web service that extracts article content from any URL and returns clean markdown. It runs as an MCP server at readable.page/mcp and as a standalone reader at readable.page.
The core problem it solves: AI agents trying to fetch modern web pages often get incomplete HTML—the actual content only appears after JavaScript runs. Readable puts a real browser in the backend to handle this automatically.
Why
Standard HTTP fetching fails on client-side rendered sites. The initial HTML is often just a <div id="root"></div> placeholder, so the agent gets nothing useful. Running a headless browser in the agent's environment is expensive and complicated, and most agent frameworks don't handle it well.
Even when they do, the output is raw HTML soup that wastes tokens. Readable solves both problems: browser rendering server-side when needed, and clean markdown instead of HTML.
How It Works
The extraction pipeline has two stages:
- Fast path: Fetch HTML directly and parse with Mozilla Readability
- Fallback: If Readability returns minimal content, retry using Cloudflare Browser Rendering to execute JavaScript before extraction
Most static sites hit the fast path. JavaScript-heavy sites like Medium, Substack, or modern docs automatically get browser rendering. The caller doesn't need to know which path ran—they just get markdown.
Processed articles cache in Cloudflare R2. Subsequent requests serve cached versions without re-rendering.
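The two stages and the cache can be sketched roughly as follows. This is a minimal illustration, not Readable's actual code: the names (Article, Extractor, isMinimal, extract), the 200-character threshold, and the in-memory Map standing in for R2 are all assumptions, and the two extractors are passed in as plain functions so the control flow stays visible.

```typescript
// Sketch only: every name here is hypothetical, not taken from Readable's source.
interface Article {
  title: string;
  markdown: string;
}

// An extractor resolves to an article, or null when nothing could be extracted.
type Extractor = (url: string) => Promise<Article | null>;

// Assumed threshold; the real cut-off for "minimal content" is not documented.
const MIN_CHARS = 200;

// True when the fast path produced too little text to trust.
function isMinimal(article: Article | null, minChars: number = MIN_CHARS): boolean {
  return article === null || article.markdown.trim().length < minChars;
}

async function extract(
  url: string,
  fastPath: Extractor,         // e.g. plain fetch + Readability
  browserPath: Extractor,      // e.g. render in a real browser, then extract
  cache: Map<string, Article>, // stand-in for an object store such as R2
): Promise<Article | null> {
  // Serve a previously processed article without re-rendering.
  const cached = cache.get(url);
  if (cached !== undefined) return cached;

  let article = await fastPath(url);
  if (isMinimal(article)) {
    // Fall back to browser rendering; keep the fast-path result if rendering yields nothing.
    const rendered = await browserPath(url);
    article = rendered ?? article;
  }

  if (article !== null) cache.set(url, article);
  return article;
}
```

The caller passes a URL and gets back an article or null; which path ran is not exposed, which matches the behavior described above.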
The app runs on Cloudflare Workers using TanStack Start with SolidJS. Effect-TS handles the service layer with functional patterns throughout. Browser sessions are managed by Cloudflare's Browser Rendering API—no infrastructure to maintain.
MCP Server
The primary interface is the MCP server:
// Endpoint: https://readable.page/mcp
// Tool: read
// Parameters: { url: string, format: 'md' | 'html' }
// Returns: Clean article content in the requested format
AI coding assistants call this tool instead of trying to fetch and parse web content themselves. The tool is rate-limited: anonymous users get 50k tokens per day, authenticated users get 500k.
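For illustration, a call to the read tool would travel as a JSON-RPC 2.0 message using MCP's standard tools/call method. The sketch below only builds that message body. The helper name buildReadRequest and the default of 'md' for format are assumptions, and the transport details (session initialization, headers, authentication) are left out because an MCP client library normally handles them.

```typescript
// Sketch only: buildReadRequest is a hypothetical helper, not part of Readable.
type Format = "md" | "html";

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: { url: string; format: Format };
  };
}

// Build the JSON-RPC body for invoking the `read` tool.
// Defaulting `format` to "md" is an assumption made for this example.
function buildReadRequest(url: string, format: Format = "md", id: number = 1): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "read", arguments: { url, format } },
  };
}

// Example: the serialized body a client would send for one article.
const body: string = JSON.stringify(buildReadRequest("https://example.com/post"));
```

In practice an agent framework builds and sends this message itself once the server at readable.page/mcp is registered.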
Web Reader and Chat
The web interface at readable.page shows extracted articles in a clean reading view. A side panel lets you ask questions about the current article using Gemini. Chat history is persisted per user+URL in Cloudflare D1, so you can return to an article and continue a previous discussion.
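Keying history on the user and URL together is what lets a returning reader pick up the same thread. A minimal in-memory sketch of that idea follows. The ChatHistory class and ChatMessage shape are hypothetical; the real service stores this in Cloudflare D1, whose schema is not described here.

```typescript
// Sketch only: an in-memory stand-in for per-user, per-URL chat persistence.
interface ChatMessage {
  role: "user" | "assistant";
  text: string;
}

class ChatHistory {
  private threads = new Map<string, ChatMessage[]>();

  // Composite key; JSON encoding avoids collisions from delimiters inside either part.
  private static key(userId: string, url: string): string {
    return JSON.stringify([userId, url]);
  }

  // Add one message to the thread for this user and article.
  append(userId: string, url: string, message: ChatMessage): void {
    const k = ChatHistory.key(userId, url);
    const thread = this.threads.get(k) ?? [];
    thread.push(message);
    this.threads.set(k, thread);
  }

  // Return the thread for this user and article, or an empty list if none exists.
  get(userId: string, url: string): ChatMessage[] {
    return this.threads.get(ChatHistory.key(userId, url)) ?? [];
  }
}
```

The same user on a different article, or a different user on the same article, gets a separate empty thread.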
Chrome Extension
A Chrome extension provides an overlay reader view on any page—it runs Mozilla Readability directly in the content script, so no server call is needed for extraction. The side panel loads the chat UI from readable.page. Built but not yet published to the Chrome Web Store.
AUTHOR
Xiaoxing Hu