Firecrawl Tools
Firecrawl provides web scraping, search, crawling, and structured data extraction for AI applications.
Installation
npm install firecrawl-aisdk
Environment Variables
FIRECRAWL_API_KEY=fc-your-api-key
Get your API key from Firecrawl.
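A missing or malformed key would otherwise surface only when a tool first runs, so it can help to check the environment at startup. The sketch below is an illustration only: `checkFirecrawlKey` is a hypothetical helper that is not part of praisonai or firecrawl-aisdk, and the `fc-` prefix check is an assumption based on the placeholder value shown above.

```typescript
// Hypothetical startup check; not part of praisonai or firecrawl-aisdk.
type Env = Record<string, string | undefined>;

interface KeyCheck {
  ok: boolean;
  reason?: string;
}

// Assumption: Firecrawl keys start with "fc-", as in the placeholder above.
function checkFirecrawlKey(env: Env): KeyCheck {
  const key = env['FIRECRAWL_API_KEY']?.trim();
  if (!key) {
    return { ok: false, reason: 'FIRECRAWL_API_KEY is not set' };
  }
  if (!key.startsWith('fc-')) {
    return { ok: false, reason: 'FIRECRAWL_API_KEY does not start with "fc-"' };
  }
  return { ok: true };
}

// In Node.js you would pass process.env; a literal keeps this sketch self-contained.
const keyCheck = checkFirecrawlKey({ FIRECRAWL_API_KEY: 'fc-your-api-key' });
console.log(keyCheck.ok ? 'Firecrawl key looks valid' : keyCheck.reason);
```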
| Tool | Description |
|---|---|
| scrapeTool | Scrape a single URL |
| searchTool | Search the web |
| mapTool | Discover URLs on a site |
| crawlTool | Crawl multiple pages |
| batchScrapeTool | Scrape multiple URLs |
| extractTool | Extract structured data |
Quick Start
import { Agent } from 'praisonai';
import { firecrawlScrape, firecrawlCrawl } from 'praisonai/tools';

const agent = new Agent({
  name: 'WebScraper',
  instructions: 'You scrape and analyze web content.',
  tools: [firecrawlScrape(), firecrawlCrawl()],
});

const result = await agent.run('Scrape https://example.com and summarize the content');
console.log(result.text);
Scrape a single URL and get clean markdown content.
import { Agent } from 'praisonai';
import { firecrawlScrape } from 'praisonai/tools';

const scrapeTool = firecrawlScrape({
  // Output formats
  formats: ['markdown', 'html'],
  // Wait for page to load
  waitFor: 1000,
  // Include/exclude tags
  includeTags: ['article', 'main'],
  excludeTags: ['nav', 'footer'],
  // Screenshot options
  screenshot: true,
});

const agent = new Agent({
  name: 'Scraper',
  tools: [scrapeTool],
});
Crawl multiple pages starting from a URL.
import { Agent } from 'praisonai';
import { firecrawlCrawl } from 'praisonai/tools';

const crawlTool = firecrawlCrawl({
  // Maximum pages to crawl
  limit: 10,
  // Crawl depth
  maxDepth: 2,
  // URL patterns to include/exclude
  includePaths: ['/docs/*', '/blog/*'],
  excludePaths: ['/admin/*'],
  // Allow external links
  allowExternalLinks: false,
});

const agent = new Agent({
  name: 'Crawler',
  tools: [crawlTool],
});
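Since crawl options often come from user input or configuration, one way to keep API usage bounded is to apply defaults and upper caps before building the tool. This is a minimal sketch, not library behavior: the option names are copied from the crawl example above, while `boundedCrawlOptions`, `MAX_LIMIT`, and `MAX_DEPTH` are hypothetical and their values are arbitrary.

```typescript
// Option names are taken from the crawl example above; everything else is illustrative.
interface CrawlOptions {
  limit?: number;
  maxDepth?: number;
  includePaths?: string[];
  excludePaths?: string[];
  allowExternalLinks?: boolean;
}

const MAX_LIMIT = 100; // hypothetical ceiling on pages per crawl
const MAX_DEPTH = 5;   // hypothetical ceiling on crawl depth

// Round down to an integer and keep the value within [min, max].
function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, Math.floor(value)));
}

// Fill in defaults (10 pages, depth 2, no external links) and enforce the caps.
function boundedCrawlOptions(options: CrawlOptions = {}): CrawlOptions {
  return {
    ...options,
    limit: clamp(options.limit ?? 10, 1, MAX_LIMIT),
    maxDepth: clamp(options.maxDepth ?? 2, 0, MAX_DEPTH),
    allowExternalLinks: options.allowExternalLinks ?? false,
  };
}

// An oversized limit is reduced to MAX_LIMIT; other fields pass through unchanged.
console.log(boundedCrawlOptions({ limit: 5000, includePaths: ['/docs/*'] }));
```

The returned object could then be passed to `firecrawlCrawl(...)` in place of a hand-written literal.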
Using with AI SDK Directly
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { scrapeTool, searchTool, mapTool, crawlTool } from 'firecrawl-aisdk';

// Scrape a page
const { text } = await generateText({
  model: openai('gpt-4o'),
  prompt: 'Scrape https://firecrawl.dev and summarize what it does',
  tools: { scrape: scrapeTool },
});

// Search the web
const { text: searchResult } = await generateText({
  model: openai('gpt-4o'),
  prompt: 'Search for Firecrawl and summarize what you find',
  tools: { search: searchTool },
});

// Map a site
const { text: mapResult } = await generateText({
  model: openai('gpt-4o'),
  prompt: 'Map https://docs.firecrawl.dev and list the main sections',
  tools: { map: mapTool },
});
Batch Scraping
Scrape multiple URLs efficiently.
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { batchScrapeTool, pollTool } from 'firecrawl-aisdk';

const { text } = await generateText({
  model: openai('gpt-4o'),
  prompt: 'Scrape https://firecrawl.dev and https://docs.firecrawl.dev, then compare',
  tools: {
    batchScrape: batchScrapeTool,
    poll: pollTool,
  },
});
Extract specific data from web pages.
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { extractTool, pollTool } from 'firecrawl-aisdk';

const { text } = await generateText({
  model: openai('gpt-4o'),
  prompt: 'Extract the main features from https://firecrawl.dev',
  tools: {
    extract: extractTool,
    poll: pollTool,
  },
});
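The batch scrape and extract examples above pair their tool with pollTool, which suggests these jobs finish asynchronously. The general pattern behind polling can be sketched as follows. This is a generic illustration, not how pollTool is implemented: `pollUntilDone`, `JobStatus`, and `PollOptions` are hypothetical names, and the fake job stands in for a real status check.

```typescript
// Generic polling loop; all names are illustrative, not firecrawl-aisdk APIs.
interface JobStatus<T> {
  done: boolean;
  data?: T;
}

interface PollOptions {
  intervalMs: number;  // delay between status checks
  maxAttempts: number; // give up after this many checks
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function pollUntilDone<T>(
  check: () => Promise<JobStatus<T>>,
  { intervalMs, maxAttempts }: PollOptions,
): Promise<T | undefined> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await check();
    if (status.done) return status.data;
    // Wait before the next check, but not after the final attempt.
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error(`Job did not finish after ${maxAttempts} attempts`);
}

// Usage with a fake job that completes on the third check.
let checks = 0;
const fakeJob = async (): Promise<JobStatus<string[]>> => {
  checks += 1;
  return checks < 3 ? { done: false } : { done: true, data: ['page-1', 'page-2'] };
};

pollUntilDone(fakeJob, { intervalMs: 10, maxAttempts: 5 })
  .then((pages) => console.log(`Finished after ${checks} checks:`, pages));
```

Capping the number of attempts keeps a stuck job from blocking the caller indefinitely.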
Scrape Result
interface FirecrawlScrapeResult {
  url: string;
  markdown: string;
  html?: string;
  metadata?: {
    title?: string;
    description?: string;
    language?: string;
  };
  screenshot?: string;
}
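As a hedged illustration of consuming this shape, the sketch below builds a one-line summary while tolerating the optional fields. `describeScrape` is a hypothetical helper, the interface is repeated so the block is self-contained, and the sample object is made-up data rather than real Firecrawl output.

```typescript
// Interface repeated from above so the sketch is self-contained.
interface FirecrawlScrapeResult {
  url: string;
  markdown: string;
  html?: string;
  metadata?: { title?: string; description?: string; language?: string };
  screenshot?: string;
}

// Hypothetical helper: one-line summary that tolerates missing optional fields.
function describeScrape(result: FirecrawlScrapeResult): string {
  const title = result.metadata?.title ?? result.url; // fall back to the URL
  const words = result.markdown.split(/\s+/).filter(Boolean).length;
  const extras = [
    result.html !== undefined ? 'html' : null,
    result.screenshot !== undefined ? 'screenshot' : null,
  ].filter((x): x is string => x !== null);
  const suffix = extras.length > 0 ? ` (+${extras.join(', ')})` : '';
  return `${title}: ${words} words${suffix}`;
}

// Made-up sample data for illustration.
const sample: FirecrawlScrapeResult = {
  url: 'https://example.com',
  markdown: '# Example Domain\n\nThis domain is for use in examples.',
  metadata: { title: 'Example Domain' },
};

console.log(describeScrape(sample)); // prints "Example Domain: 10 words"
```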
Crawl Result
interface FirecrawlCrawlResult {
  status: string;
  total: number;
  completed: number;
  data: Array<{
    url: string;
    markdown: string;
    metadata?: object;
  }>;
}
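The `total` and `completed` fields make it straightforward to report progress on a crawl that is still running. The following sketch assumes the shape above; `crawlProgress` is a hypothetical helper, and both the sample data and the `'scraping'` status string are made up for illustration.

```typescript
// Interface repeated from above so the sketch is self-contained.
interface FirecrawlCrawlResult {
  status: string;
  total: number;
  completed: number;
  data: Array<{ url: string; markdown: string; metadata?: object }>;
}

// Hypothetical helper: progress as a rounded percentage plus the URLs returned so far.
function crawlProgress(result: FirecrawlCrawlResult): { percent: number; urls: string[] } {
  // Guard against division by zero before any pages have been discovered.
  const percent = result.total > 0
    ? Math.round((result.completed / result.total) * 100)
    : 0;
  return { percent, urls: result.data.map((page) => page.url) };
}

// Made-up sample; the 'scraping' status value is an assumption.
const partial: FirecrawlCrawlResult = {
  status: 'scraping',
  total: 4,
  completed: 3,
  data: [
    { url: 'https://example.com/', markdown: '# Home' },
    { url: 'https://example.com/docs', markdown: '# Docs' },
    { url: 'https://example.com/blog', markdown: '# Blog' },
  ],
};

const { percent, urls } = crawlProgress(partial);
console.log(`${partial.status}: ${percent}% (${urls.length} pages so far)`);
```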
Advanced Example
import { Agent } from 'praisonai';
import { firecrawlScrape, firecrawlCrawl } from 'praisonai/tools';

const agent = new Agent({
  name: 'ContentAnalyzer',
  instructions: `You are a content analyst.
1. Scrape the provided URL
2. Extract key information
3. Provide a structured summary`,
  tools: [
    firecrawlScrape({ formats: ['markdown'] }),
    firecrawlCrawl({ limit: 5, maxDepth: 1 }),
  ],
});

const result = await agent.run(
  'Analyze the documentation structure of https://docs.firecrawl.dev'
);
console.log(result.text);
Error Handling
import { firecrawlScrape } from 'praisonai/tools';

const tool = firecrawlScrape();

try {
  const result = await tool.execute({ url: 'https://example.com' });
  console.log(result);
} catch (error) {
  // A caught value is typed `unknown` in strict TypeScript, so normalize it first
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('FIRECRAWL_API_KEY')) {
    console.error('Missing API key');
  } else if (message.includes('rate limit')) {
    console.error('Rate limited - try again later');
  } else {
    console.error('Scrape failed:', message);
  }
}
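Rather than only logging a rate-limit error, a caller may want to retry after a delay. The sketch below shows one possible approach using exponential backoff. It reuses the message substrings from the example above (matched case-insensitively here), but `classifyError`, `backoffMs`, and `withRetry` are hypothetical helpers and the delay values are arbitrary.

```typescript
type ScrapeErrorKind = 'missing-key' | 'rate-limited' | 'other';

// Mirrors the message checks in the example above; the substrings are taken from it.
function classifyError(error: unknown): ScrapeErrorKind {
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('FIRECRAWL_API_KEY')) return 'missing-key';
  if (message.toLowerCase().includes('rate limit')) return 'rate-limited';
  return 'other';
}

// Exponential backoff: baseMs * 2^attempt, capped at capMs. Defaults are arbitrary.
function backoffMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Runs `task`, retrying only on rate-limit errors, up to `maxRetries` extra attempts.
async function withRetry<T>(
  task: () => Promise<T>,
  maxRetries = 3,
  baseMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      // Rethrow anything that is not a rate limit, or once retries are exhausted.
      if (classifyError(error) !== 'rate-limited' || attempt >= maxRetries) throw error;
      await new Promise<void>((resolve) => setTimeout(resolve, backoffMs(attempt, baseMs)));
    }
  }
}

// Usage, assuming `tool` from the example above:
//   const result = await withRetry(() => tool.execute({ url: 'https://example.com' }));
```

Retrying only on rate limits avoids repeating requests that would fail the same way again, such as a missing API key.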
Best Practices
- Use the appropriate tool - Scrape for single pages, crawl for multiple pages
- Set limits - Always set crawl limits to avoid excessive API usage
- Filter content - Use includeTags/excludeTags to keep only relevant content
- Handle async jobs - Use pollTool for crawl, batch scrape, and extract operations
- Cache results - Store scraped content to avoid repeated requests
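The caching practice can be sketched with a small in-memory store keyed by URL. This is a minimal illustration under stated assumptions: `ScrapeCache` and `cachedScrape` are hypothetical names, the five-minute TTL is arbitrary, and `fakeScrape` stands in for a real scrape call.

```typescript
// Minimal in-memory TTL cache keyed by URL; all names are illustrative.
interface CacheEntry<T> {
  value: T;
  expiresAt: number; // timestamp in milliseconds
}

class ScrapeCache<T> {
  private readonly entries = new Map<string, CacheEntry<T>>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(url: string): T | undefined {
    const entry = this.entries.get(url);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(url); // drop stale content
      return undefined;
    }
    return entry.value;
  }

  set(url: string, value: T): void {
    this.entries.set(url, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Fetch through the cache: `scrape` is only called on a miss.
async function cachedScrape<T>(
  cache: ScrapeCache<T>,
  url: string,
  scrape: (url: string) => Promise<T>,
): Promise<T> {
  const hit = cache.get(url);
  if (hit !== undefined) return hit;
  const fresh = await scrape(url);
  cache.set(url, fresh);
  return fresh;
}

// Usage with a stand-in scraper; a real one would call the scrape tool.
const cache = new ScrapeCache<string>(5 * 60 * 1000); // 5-minute TTL, an arbitrary choice
let scrapeCalls = 0;
const fakeScrape = async (url: string): Promise<string> => {
  scrapeCalls += 1;
  return `# Content of ${url}`;
};

cachedScrape(cache, 'https://example.com', fakeScrape)
  .then(() => cachedScrape(cache, 'https://example.com', fakeScrape))
  .then(() => console.log(`scrape calls: ${scrapeCalls}`)); // second call is served from the cache
```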
Related Tools
- Tavily - Web search and extraction
- Exa - Semantic web search
- Parallel - Token-efficient web search