How to Fetch a Single Page with CLI
Learn how to fetch a single URL using crawler.sh CLI without crawling the entire site. Get clean output with smart, path-based filenames.
Sometimes you do not need to crawl an entire website. You just want to grab one page: check its metadata, extract its content, or spot-check a single URL. Running a full crawl for that is overkill. The `crawler fetch` command solves this by fetching a single URL and saving the result with a clean, path-based filename.
This guide shows you how to use crawler fetch to capture individual pages.
Step 1: Install crawler.sh CLI
Install the CLI with a single command:
```
curl -fsSL https://install.crawler.sh | sh
```

This downloads the correct binary for your operating system and architecture, places it in `~/.crawler/bin/`, and adds it to your PATH. Restart your terminal or run `source ~/.bashrc` (or `~/.zshrc`) to pick up the new PATH entry.
Verify the installation:
```
crawler --version
```

Step 2: Fetch a single URL
Run the fetch command with the URL you want to capture:
```
crawler fetch https://example.com/about/team
```

This creates a file called `example-com-about-team.crawl` in the current directory. The filename is derived from the URL path, so each page gets a unique, readable output file.
Compare this with `crawler crawl`, which would name the file `example-com.crawl` regardless of which pages were crawled.
Step 3: Understand the filename convention
The fetch command generates filenames using the pattern `{domain}-{slug}.{ext}`. Here is how different URLs map to filenames:
| URL | Output file |
|---|---|
| `https://example.com/` | `example-com-index.crawl` |
| `https://example.com/about/team` | `example-com-about-team.crawl` |
| `https://example.com/blog/my-post.html` | `example-com-blog-my-post.crawl` |
| `https://example.com/products/widgets?sort=price` | `example-com-products-widgets.crawl` |
The rules are simple: path segments are joined with hyphens, file extensions like .html are stripped, and query parameters are ignored. If you fetch the root URL, the slug becomes index.
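These naming rules can be approximated in a few lines of shell. This is an illustrative sketch of the documented convention, not the CLI's actual implementation (the function name `url_to_filename` is made up):

```shell
# Sketch of the documented naming rules:
# drop the scheme and query string, strip a .html/.htm extension,
# join path segments with hyphens, and use "index" for the root URL.
url_to_filename() {
  local rest="${1#*://}"        # drop scheme (https://, http://)
  rest="${rest%%\?*}"           # drop query parameters
  rest="${rest%/}"              # drop trailing slash
  local domain="${rest%%/*}"
  local path="${rest#"$domain"}"
  path="${path#/}"
  local slug
  if [ -z "$path" ]; then
    slug="index"                # root URL maps to "index"
  else
    slug="${path%.html}"
    slug="${slug%.htm}"
    slug="${slug//\//-}"        # join segments with hyphens
  fi
  echo "${domain//./-}-${slug}.crawl"
}

url_to_filename "https://example.com/"                    # example-com-index.crawl
url_to_filename "https://example.com/blog/my-post.html"   # example-com-blog-my-post.crawl
```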
Step 4: Override the output filename
If you want a custom filename instead of the auto-generated one, use the `--output` flag (short form `-o`):

```
crawler fetch -o team-page.crawl https://example.com/about/team
```

This is useful when you want a shorter name or when organizing fetched pages into a specific naming scheme.
Step 5: Choose an output format
The default output format is NDJSON (.crawl), but you can also fetch as JSON or sitemap XML:
```
# JSON format
crawler fetch --format json https://example.com/about

# Sitemap XML format
crawler fetch --format sitemap https://example.com/about
```

The `--format` flag also changes the file extension: `.json` for JSON and `.xml` for sitemap.
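The format-to-extension mapping described above amounts to a simple lookup. A sketch of that mapping (the helper name `ext_for_format` is hypothetical, not part of the CLI):

```shell
# Map a --format value to the output extension described above
ext_for_format() {
  case "$1" in
    json)    echo "json" ;;
    sitemap) echo "xml" ;;
    *)       echo "crawl" ;;   # default NDJSON format
  esac
}

ext_for_format json      # json
ext_for_format sitemap   # xml
```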
Step 6: Extract or skip content
By default, crawler fetch extracts the page content as Markdown, including word count, byline, and excerpt. If you only need metadata (status code, title, meta description, canonical URL) and want a faster, smaller result, disable content extraction:
```
crawler fetch --no-extract https://example.com/about
```

This is useful when you are checking page health rather than reading content.
Step 7: Analyze the fetched page
The output file from fetch is the same format as crawl, so all existing analysis commands work on it:
```
# View page metadata
crawler info example-com-about.crawl

# Run SEO checks on the single page
crawler seo example-com-about.crawl

# Convert to JSON
crawler export example-com-about.crawl --format json
```

This makes fetch a fast way to spot-check SEO issues on a specific page without waiting for a full site crawl.
When to use fetch vs crawl
Use crawler fetch when:
- You need to check a single page quickly
- You want to extract content from a specific article or blog post
- You are debugging how the crawler handles a particular URL
- You want to compare the same page over time by re-fetching it
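The re-fetch-and-compare workflow can be scripted with `diff`. In this sketch, the snapshot files are created with placeholder content standing in for real `crawler fetch -o` output (the dates, filenames, and record contents are all illustrative):

```shell
# In practice each snapshot would come from a dated fetch, e.g.:
#   crawler fetch -o "team-$(date +%F).crawl" https://example.com/about/team
# Placeholder snapshots stand in for two fetches on different days:
printf '{"url":"https://example.com/about/team","title":"Our Team"}\n' > team-2024-01-01.crawl
printf '{"url":"https://example.com/about/team","title":"Meet the Team"}\n' > team-2024-02-01.crawl

# diff exits non-zero when the two snapshots differ
if diff -q team-2024-01-01.crawl team-2024-02-01.crawl >/dev/null; then
  echo "page unchanged"
else
  echo "page changed"
fi
```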
Use crawler crawl when:
- You need to analyze an entire website
- You want to find broken links, orphan pages, or site-wide SEO issues
- You need to discover all pages on a domain