Firecrawl Lua API for KosmoKrator Agents
Agent-facing Lua documentation and function reference for the Firecrawl KosmoKrator integration.
Lua Namespace
Agents call this integration through app.integrations.firecrawl.*.
Use lua_read_doc("integrations.firecrawl") inside KosmoKrator to discover the same reference at runtime.
Call Lua from the Headless CLI
Use kosmo integrations:lua when a shell script, CI job, cron job, or another coding CLI should run a deterministic
Firecrawl workflow without starting an interactive agent session.
kosmo integrations:lua --eval 'dump(app.integrations.firecrawl.scrape_url({url = "example_url", formats = "example_formats", onlyMainContent = true, includeTags = "example_includeTags", excludeTags = "example_excludeTags", waitFor = 1, timeout = 1, actions = "example_actions"}))' --json
kosmo integrations:lua --eval 'print(docs.read("firecrawl"))' --json
kosmo integrations:lua --eval 'print(docs.read("firecrawl.scrape_url"))' --json
Workflow file
Put repeatable logic in a Lua file, then execute it with JSON output for the calling process.
local firecrawl = app.integrations.firecrawl
local result = firecrawl.scrape_url({url = "example_url", formats = "example_formats", onlyMainContent = true, includeTags = "example_includeTags", excludeTags = "example_excludeTags", waitFor = 1, timeout = 1, actions = "example_actions"})
dump(result)
kosmo integrations:lua workflow.lua --json
kosmo integrations:lua workflow.lua --force --json
integrations:lua exposes app.integrations.firecrawl, app.mcp.*, docs.*, json.*, and regex.*. Use app.integrations.firecrawl.default.* or app.integrations.firecrawl.work.* when you have configured named credential accounts.
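A workflow file can also guard the call so that a failed scrape yields a readable message instead of aborting the script. The sketch below assumes the integration raises a Lua error on failure, which this page does not confirm. `pick_account` and `safe_scrape` are hypothetical helper names, and the namespace is passed in as an argument so the logic can be exercised outside KosmoKrator.

```lua
-- workflow.lua: a minimal sketch, not the integration's own code.

-- Return the named credential account if it exists, else the namespace itself.
local function pick_account(firecrawl, account)
  if account ~= nil and type(firecrawl[account]) == "table" then
    return firecrawl[account]
  end
  return firecrawl
end

-- Call scrape_url under pcall. Assumption: failures surface as Lua errors.
local function safe_scrape(firecrawl, url)
  local ok, result = pcall(firecrawl.scrape_url, {
    url = url,
    formats = { "markdown" },
    onlyMainContent = true,
  })
  if not ok then
    return nil, "scrape failed: " .. tostring(result)
  end
  return result
end

-- Inside KosmoKrator the global `app` exists; elsewhere this branch is skipped.
if app and app.integrations and app.integrations.firecrawl then
  local client = pick_account(app.integrations.firecrawl, "work")
  local page, err = safe_scrape(client, "https://example.test")
  dump(page or { error = err })
end
```

Returning `nil` plus a message keeps the JSON output of the calling process predictable whether or not the scrape succeeded.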
MCP-only Lua
If the script only needs configured MCP servers and does not need Firecrawl, use the narrower mcp:lua command.
# Use mcp:lua for MCP-only scripts; use integrations:lua for this integration namespace.
kosmo mcp:lua --eval 'dump(mcp.servers())' --json
Agent-Facing Lua Docs
This is the rendered version of the full Lua documentation exposed to agents when they inspect the integration namespace.
Firecrawl — Lua API Reference
Namespace: app.integrations.firecrawl
This integration targets Firecrawl v2 JSON endpoints under https://api.firecrawl.dev/v2.
Covered endpoints include scrape, crawl, map, search, batch scrape, extract status, agent jobs, browser sessions, team usage, queue status, and activity. File upload parsing is intentionally not exposed by this JSON-only package slice.
Core Content Tools
local page = app.integrations.firecrawl.scrape({
url = "https://example.test",
formats = { "markdown", "links" },
onlyMainContent = true
})
local results = app.integrations.firecrawl.search({
query = "Firecrawl v2 batch scrape",
limit = 5,
scrapeOptions = { formats = { "markdown" } }
})
local links = app.integrations.firecrawl.map({
url = "https://example.test",
limit = 100
})
Crawl Jobs
local crawl = app.integrations.firecrawl.crawl({
url = "https://example.test/docs",
limit = 50,
formats = { "markdown" }
})
local status = app.integrations.firecrawl.get_crawl_status({ id = crawl.id })
local errors = app.integrations.firecrawl.get_crawl_errors({ id = crawl.id })
local active = app.integrations.firecrawl.get_active_crawls({})
app.integrations.firecrawl.cancel_crawl({ id = crawl.id })
Use preview_crawl_params to turn a plain-English crawl intent into a candidate crawl config before running an expensive crawl.
local preview = app.integrations.firecrawl.preview_crawl_params({
url = "https://example.test",
prompt = "Crawl only the developer docs and ignore changelog pages."
})
Batch Scrape
local batch = app.integrations.firecrawl.batch_scrape({
urls = {
"https://example.test/a",
"https://example.test/b"
},
formats = { "markdown" },
ignoreInvalidURLs = true
})
local status = app.integrations.firecrawl.get_batch_scrape_status({ id = batch.id })
local errors = app.integrations.firecrawl.get_batch_scrape_errors({ id = batch.id })
app.integrations.firecrawl.cancel_batch_scrape({ id = batch.id })
Extract And Agent Jobs
local extract = app.integrations.firecrawl.extract({
urls = { "https://example.test/product/1" },
prompt = "Extract product name, price, and availability."
})
local extract_status = app.integrations.firecrawl.get_extract_status({
id = extract.id
})
local agent = app.integrations.firecrawl.agent({
url = "https://example.test",
prompt = "Find the pricing page and extract all plan names."
})
local agent_status = app.integrations.firecrawl.get_agent_status({
job_id = agent.id
})
app.integrations.firecrawl.cancel_agent({ job_id = agent.id })
Browser Sessions
local browser = app.integrations.firecrawl.create_browser({
url = "https://example.test"
})
local result = app.integrations.firecrawl.execute_browser({
session_id = browser.sessionId,
prompt = "Click the pricing link and return the page title."
})
local sessions = app.integrations.firecrawl.list_browsers({})
app.integrations.firecrawl.delete_browser({ session_id = browser.sessionId })
Team Usage And Activity
local credits = app.integrations.firecrawl.credit_usage({})
local credit_history = app.integrations.firecrawl.historical_credit_usage({})
local tokens = app.integrations.firecrawl.token_usage({})
local token_history = app.integrations.firecrawl.historical_token_usage({})
local queue = app.integrations.firecrawl.queue_status({})
local activity = app.integrations.firecrawl.activity({ limit = 20 })
Multi-Account Usage
app.integrations.firecrawl.scrape({ url = "https://example.test" })
app.integrations.firecrawl.default.scrape({ url = "https://example.test" })
app.integrations.firecrawl.production.scrape({ url = "https://example.test" })
Raw agent markdown
# Firecrawl — Lua API Reference
Namespace: `app.integrations.firecrawl`
This integration targets Firecrawl v2 JSON endpoints under `https://api.firecrawl.dev/v2`.
Covered endpoints include scrape, crawl, map, search, batch scrape, extract status, agent jobs, browser sessions, team usage, queue status, and activity. File upload parsing is intentionally not exposed by this JSON-only package slice.
## Core Content Tools
```lua
local page = app.integrations.firecrawl.scrape({
url = "https://example.test",
formats = { "markdown", "links" },
onlyMainContent = true
})
local results = app.integrations.firecrawl.search({
query = "Firecrawl v2 batch scrape",
limit = 5,
scrapeOptions = { formats = { "markdown" } }
})
local links = app.integrations.firecrawl.map({
url = "https://example.test",
limit = 100
})
```
## Crawl Jobs
```lua
local crawl = app.integrations.firecrawl.crawl({
url = "https://example.test/docs",
limit = 50,
formats = { "markdown" }
})
local status = app.integrations.firecrawl.get_crawl_status({ id = crawl.id })
local errors = app.integrations.firecrawl.get_crawl_errors({ id = crawl.id })
local active = app.integrations.firecrawl.get_active_crawls({})
app.integrations.firecrawl.cancel_crawl({ id = crawl.id })
```
Use `preview_crawl_params` to turn a plain-English crawl intent into a candidate crawl config before running an expensive crawl.
```lua
local preview = app.integrations.firecrawl.preview_crawl_params({
url = "https://example.test",
prompt = "Crawl only the developer docs and ignore changelog pages."
})
```
## Batch Scrape
```lua
local batch = app.integrations.firecrawl.batch_scrape({
urls = {
"https://example.test/a",
"https://example.test/b"
},
formats = { "markdown" },
ignoreInvalidURLs = true
})
local status = app.integrations.firecrawl.get_batch_scrape_status({ id = batch.id })
local errors = app.integrations.firecrawl.get_batch_scrape_errors({ id = batch.id })
app.integrations.firecrawl.cancel_batch_scrape({ id = batch.id })
```
## Extract And Agent Jobs
```lua
local extract = app.integrations.firecrawl.extract({
urls = { "https://example.test/product/1" },
prompt = "Extract product name, price, and availability."
})
local extract_status = app.integrations.firecrawl.get_extract_status({
id = extract.id
})
local agent = app.integrations.firecrawl.agent({
url = "https://example.test",
prompt = "Find the pricing page and extract all plan names."
})
local agent_status = app.integrations.firecrawl.get_agent_status({
job_id = agent.id
})
app.integrations.firecrawl.cancel_agent({ job_id = agent.id })
```
## Browser Sessions
```lua
local browser = app.integrations.firecrawl.create_browser({
url = "https://example.test"
})
local result = app.integrations.firecrawl.execute_browser({
session_id = browser.sessionId,
prompt = "Click the pricing link and return the page title."
})
local sessions = app.integrations.firecrawl.list_browsers({})
app.integrations.firecrawl.delete_browser({ session_id = browser.sessionId })
```
## Team Usage And Activity
```lua
local credits = app.integrations.firecrawl.credit_usage({})
local credit_history = app.integrations.firecrawl.historical_credit_usage({})
local tokens = app.integrations.firecrawl.token_usage({})
local token_history = app.integrations.firecrawl.historical_token_usage({})
local queue = app.integrations.firecrawl.queue_status({})
local activity = app.integrations.firecrawl.activity({ limit = 20 })
```
## Multi-Account Usage
```lua
app.integrations.firecrawl.scrape({ url = "https://example.test" })
app.integrations.firecrawl.default.scrape({ url = "https://example.test" })
app.integrations.firecrawl.production.scrape({ url = "https://example.test" })
```
local result = app.integrations.firecrawl.scrape_url({url = "example_url", formats = "example_formats", onlyMainContent = true, includeTags = "example_includeTags", excludeTags = "example_excludeTags", waitFor = 1, timeout = 1, actions = "example_actions"})
print(result)
Functions
scrape_url Read
Scrape a single URL and extract its content. Returns the page content in the requested format (markdown by default). Supports actions like waiting for JavaScript, taking screenshots, and extracting specific elements.
- Lua path: app.integrations.firecrawl.scrape_url
- Full name: firecrawl.firecrawl_scrape
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | yes | The URL to scrape (e.g., "https://example.com"). |
| formats | array | no | Output formats to return. Options: "markdown", "html", "rawHtml", "content", "links", "screenshot", "actions". Default: ["markdown"]. |
| onlyMainContent | boolean | no | Extract only the main content, removing navigation, footers, etc. Default: true. |
| includeTags | array | no | CSS selectors to include. Only these elements will be scraped. |
| excludeTags | array | no | CSS selectors to exclude. These elements will be removed from the result. |
| waitFor | integer | no | Time in milliseconds to wait for dynamic content to load before scraping. |
| timeout | integer | no | Timeout in milliseconds for the scrape request. Default: 30000. |
| actions | array | no | List of actions to perform before scraping (e.g., click, scroll, wait, screenshot). |
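The table lists formats, includeTags, excludeTags, and actions as arrays, so in Lua they are passed as list tables rather than strings. The following is a minimal sketch of a parameter table with those types; `article_params` is a hypothetical helper and the selector values are illustrative only.

```lua
-- Build scrape_url parameters with array-typed fields as Lua list tables.
local function article_params(url)
  return {
    url = url,
    formats = { "markdown", "links" },
    onlyMainContent = true,
    includeTags = { "article" },        -- CSS selectors to keep (illustrative)
    excludeTags = { "nav", "footer" },  -- CSS selectors to drop (illustrative)
    waitFor = 2000,                     -- milliseconds
    timeout = 30000,                    -- milliseconds, the documented default
  }
end

-- Usage inside KosmoKrator:
-- local page = app.integrations.firecrawl.scrape_url(article_params("https://example.test"))
```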
search Read
Search the web with Firecrawl and optionally scrape result pages using scrapeOptions.
- Lua path: app.integrations.firecrawl.search
- Full name: firecrawl.firecrawl_search
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | yes | Search query. |
| limit | integer | no | Maximum results per source, 1 to 100. |
| sources | array | no | Sources to search, such as web, images, or news. |
| categories | array | no | Category filters, such as github, research, or pdf. |
| includeDomains | array | no | Restrict results to these domains. |
| excludeDomains | array | no | Exclude results from these domains. |
| tbs | string | no | Time-based search filter such as qdr:w or custom date range. |
| location | string | no | Geo-targeted search location. |
| country | string | no | ISO country code for search results. |
| scrapeOptions | object | no | Optional Firecrawl scrape options for each result. |
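Because limit is documented as 1 to 100, a script that takes the limit from user input may want to clamp it before calling search. A small sketch under that assumption; `search_params` is a hypothetical helper, and `qdr:w` is simply the example filter value from the table.

```lua
-- Build search parameters, clamping limit into the documented 1..100 range.
local function search_params(query, limit, domains)
  local clamped = math.max(1, math.min(100, math.floor(limit or 5)))
  return {
    query = query,
    limit = clamped,
    includeDomains = domains,  -- nil leaves results unrestricted
    tbs = "qdr:w",
    scrapeOptions = { formats = { "markdown" } },
  }
end

-- Usage inside KosmoKrator:
-- local results = app.integrations.firecrawl.search(search_params("Firecrawl v2 batch scrape", 5))
```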
website Read
Start a crawl job to scrape all pages from a website starting at the given URL. Returns a crawl job ID — use firecrawl_get_crawl_status to check progress and retrieve results.
- Lua path: app.integrations.firecrawl.website
- Full name: firecrawl.firecrawl_crawl
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | yes | The root URL to start crawling from (e.g., "https://example.com"). |
| limit | integer | no | Maximum number of pages to crawl. Default: 10. |
| maxDepth | integer | no | Maximum crawl depth from the root URL. Default: based on plan. |
| formats | array | no | Output formats for each page. Options: "markdown", "html", "rawHtml", "content", "links". Default: ["markdown"]. |
| excludePaths | array | no | URL path patterns to exclude from crawling (e.g., ["/blog/*"]). |
| includePaths | array | no | Only crawl URLs matching these path patterns (e.g., ["/docs/*"]). |
| allowBackwardLinks | boolean | no | Allow crawling links that go back to parent pages. Default: false. |
| allowExternalLinks | boolean | no | Allow crawling links to external domains. Default: false. |
| onlyMainContent | boolean | no | Extract only main content from each page. Default: true. |
status Read
Check the status and retrieve results of a crawl job. Returns the current status (scraping, completed, failed, cancelled) and all scraped data once complete.
- Lua path: app.integrations.firecrawl.status
- Full name: firecrawl.firecrawl_get_crawl_status
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | yes | The crawl job ID returned by the firecrawl_crawl tool. |
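Since a crawl runs asynchronously, a script usually polls the status call until the job reaches one of the terminal states named above (completed, failed, cancelled). The sketch below assumes the response carries that state in a `status` field, which this page implies but does not spell out. `wait_for_job` is a hypothetical helper; it takes the status function as an argument because this page shows the call both as `get_crawl_status` (agent docs) and as `status` (Lua path above). Lua has no built-in sleep, so the caller supplies one.

```lua
local TERMINAL = { completed = true, failed = true, cancelled = true }

-- Poll get_status(args) until the job is terminal or max_polls is exhausted.
-- opts.sleep is a caller-supplied function; the default does not wait at all.
local function wait_for_job(get_status, args, opts)
  opts = opts or {}
  local max_polls = opts.max_polls or 30
  local interval = opts.interval or 2
  local sleep = opts.sleep or function(_) end
  for attempt = 1, max_polls do
    local job = get_status(args)
    if TERMINAL[job.status] then
      return job, attempt
    end
    sleep(interval)
  end
  return nil, "job still running after " .. max_polls .. " polls"
end

-- Usage inside KosmoKrator (function name as listed in the Lua path above):
-- local job, err = wait_for_job(app.integrations.firecrawl.status, { id = crawl.id })
```

The same helper should fit the batch scrape, extract, and agent status calls, provided they report state the same way.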
cancel Write
Cancel a running Firecrawl crawl job.
- Lua path: app.integrations.firecrawl.cancel
- Full name: firecrawl.firecrawl_cancel_crawl
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | yes | Crawl job ID. |
errors Read
List failed pages and errors for a Firecrawl crawl job.
- Lua path: app.integrations.firecrawl.errors
- Full name: firecrawl.firecrawl_get_crawl_errors
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | yes | Crawl job ID. |
active Read
List currently active Firecrawl crawl jobs for the configured team.
- Lua path: app.integrations.firecrawl.active
- Full name: firecrawl.firecrawl_get_active_crawls
| Parameter | Type | Required | Description |
|---|---|---|---|
| No parameters. | | | |
preview_params Read
Preview crawl parameters generated from a natural language crawl prompt.
- Lua path: app.integrations.firecrawl.preview_params
- Full name: firecrawl.firecrawl_preview_crawl_params
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | no | Target site URL. |
| prompt | string | yes | Natural-language crawl intent. |
map_urls Read
Map a website to discover all linked URLs. Returns a list of all URLs found on the site without scraping full content. Useful for understanding site structure before crawling.
- Lua path: app.integrations.firecrawl.map_urls
- Full name: firecrawl.firecrawl_map
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | yes | The root URL to map (e.g., "https://example.com"). |
| limit | integer | no | Maximum number of URLs to return. Default: based on plan. |
| includeSubdomains | boolean | no | Include URLs from subdomains. Default: false. |
| search | string | no | Filter URLs that match a search term (only returns URLs containing this string). |
| ignoreSitemap | boolean | no | Skip sitemap.xml discovery and only use on-page links. Default: false. |
| includePaths | array | no | Only include URLs matching these path patterns. |
| excludePaths | array | no | Exclude URLs matching these path patterns. |
batch_scrape Read
Scrape multiple URLs in one Firecrawl batch job and poll the status with firecrawl_get_batch_scrape_status.
- Lua path: app.integrations.firecrawl.batch_scrape
- Full name: firecrawl.firecrawl_batch_scrape
| Parameter | Type | Required | Description |
|---|---|---|---|
| urls | array | yes | URLs to scrape. |
| formats | array | no | Output formats such as markdown, html, links, screenshot, images, or json. |
| onlyMainContent | boolean | no | Extract only main content. |
| ignoreInvalidURLs | boolean | no | Skip invalid URLs instead of failing the whole batch. |
| webhook | object | no | Optional webhook config for batch events. |
batch_scrape_status Read
Check Firecrawl batch scrape status and retrieve available results.
- Lua path: app.integrations.firecrawl.batch_scrape_status
- Full name: firecrawl.firecrawl_get_batch_scrape_status
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | yes | Batch scrape job ID. |
cancel_batch_scrape Write
Cancel a running Firecrawl batch scrape job.
- Lua path: app.integrations.firecrawl.cancel_batch_scrape
- Full name: firecrawl.firecrawl_cancel_batch_scrape
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | yes | Batch scrape job ID. |
batch_scrape_errors Read
List failed URLs and errors for a Firecrawl batch scrape job.
- Lua path: app.integrations.firecrawl.batch_scrape_errors
- Full name: firecrawl.firecrawl_get_batch_scrape_errors
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | yes | Batch scrape job ID. |
extract_data Read
Extract structured data from one or more URLs using AI. Provide a prompt describing what to extract, or a JSON schema for the expected output format. Ideal for pulling specific data points from web pages.
- Lua path: app.integrations.firecrawl.extract_data
- Full name: firecrawl.firecrawl_extract
| Parameter | Type | Required | Description |
|---|---|---|---|
| urls | array | yes | List of URLs to extract data from (e.g., ["https://example.com/about"]). |
| prompt | string | no | Natural language description of what data to extract from the pages. |
| schema | object | no | JSON schema defining the expected output structure. The response will conform to this schema. |
| systemPrompt | string | no | System prompt to guide the AI extraction behavior. |
| allowExternalLinks | boolean | no | Allow following links to external domains during extraction. Default: false. |
| enableWebSearch | boolean | no | Enable web search to supplement extraction with additional context. Default: false. |
| includeSubdomains | boolean | no | Include subdomains when following links. Default: false. |
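A JSON schema is passed as a nested Lua table. The sketch below shows one possible schema for the product prompt used in the agent docs; the field names (`name`, `price`, `available`) and the `product_extract_params` helper are hypothetical and chosen only for illustration.

```lua
-- A JSON Schema expressed as a Lua table (illustrative field names).
local product_schema = {
  type = "object",
  properties = {
    name = { type = "string" },
    price = { type = "number" },
    available = { type = "boolean" },
  },
  required = { "name", "price" },
}

-- Combine a prompt and the schema into one extract request.
local function product_extract_params(urls)
  return {
    urls = urls,
    prompt = "Extract product name, price, and availability.",
    schema = product_schema,
  }
end

-- Usage inside KosmoKrator:
-- local job = app.integrations.firecrawl.extract_data(
--   product_extract_params({ "https://example.test/product/1" }))
```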
extract_status Read
Check status and retrieve results for a Firecrawl extract job.
- Lua path: app.integrations.firecrawl.extract_status
- Full name: firecrawl.firecrawl_get_extract_status
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | yes | Extract job ID. |
agent_task Read
Start a Firecrawl agent task for autonomous web navigation and data extraction.
- Lua path: app.integrations.firecrawl.agent_task
- Full name: firecrawl.firecrawl_agent
| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | yes | Agent task prompt. |
| url | string | no | Optional starting URL. |
| schema | object | no | Optional JSON schema for structured output. |
agent_status Read
Check status and retrieve results for a Firecrawl agent job.
- Lua path: app.integrations.firecrawl.agent_status
- Full name: firecrawl.firecrawl_get_agent_status
| Parameter | Type | Required | Description |
|---|---|---|---|
| job_id | string | yes | Agent job ID. |
cancel_agent Write
Cancel a running Firecrawl agent job.
- Lua path: app.integrations.firecrawl.cancel_agent
- Full name: firecrawl.firecrawl_cancel_agent
| Parameter | Type | Required | Description |
|---|---|---|---|
| job_id | string | yes | Agent job ID. |
create_browser Write
Create a Firecrawl browser session for interactive web tasks.
- Lua path: app.integrations.firecrawl.create_browser
- Full name: firecrawl.firecrawl_create_browser
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | no | Optional URL to open when the session starts. |
| timeout | integer | no | Optional session timeout in milliseconds. |
list_browsers Read
List Firecrawl browser sessions, optionally filtered by status.
- Lua path: app.integrations.firecrawl.list_browsers
- Full name: firecrawl.firecrawl_list_browsers
| Parameter | Type | Required | Description |
|---|---|---|---|
| status | string | no | Optional browser session status filter. |
execute_browser Write
Execute browser automation code or an AI prompt in a Firecrawl browser session.
- Lua path: app.integrations.firecrawl.execute_browser
- Full name: firecrawl.firecrawl_execute_browser
| Parameter | Type | Required | Description |
|---|---|---|---|
| session_id | string | yes | Browser session ID. |
| code | string | no | Browser automation code to execute. |
| prompt | string | no | Natural-language browser task prompt. |
delete_browser Write
Delete or stop a Firecrawl browser session.
- Lua path: app.integrations.firecrawl.delete_browser
- Full name: firecrawl.firecrawl_delete_browser
| Parameter | Type | Required | Description |
|---|---|---|---|
| session_id | string | yes | Browser session ID. |
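Browser sessions stay open until they are deleted, so it can help to wrap create, execute, and delete in one helper that always cleans up. A minimal sketch, assuming `create_browser` returns the session ID in a `sessionId` field as the agent docs example suggests; `with_browser` is a hypothetical helper name.

```lua
-- Run task(session_id) inside a fresh browser session and always delete it.
local function with_browser(firecrawl, url, task)
  local browser = firecrawl.create_browser({ url = url })
  local session_id = browser.sessionId
  local ok, result = pcall(task, session_id)
  -- Release the session even when the task raised an error.
  firecrawl.delete_browser({ session_id = session_id })
  if not ok then
    error(result, 0)  -- re-raise the original error unchanged
  end
  return result
end

-- Usage inside KosmoKrator:
-- local fc = app.integrations.firecrawl
-- local title = with_browser(fc, "https://example.test", function(session_id)
--   return fc.execute_browser({
--     session_id = session_id,
--     prompt = "Click the pricing link and return the page title.",
--   })
-- end)
```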
credit_usage Read
Get remaining Firecrawl credits for the configured team.
- Lua path: app.integrations.firecrawl.credit_usage
- Full name: firecrawl.firecrawl_credit_usage
| Parameter | Type | Required | Description |
|---|---|---|---|
| No parameters. | | | |
historical_credit_usage Read
Get historical Firecrawl credit usage for the configured team.
- Lua path: app.integrations.firecrawl.historical_credit_usage
- Full name: firecrawl.firecrawl_historical_credit_usage
| Parameter | Type | Required | Description |
|---|---|---|---|
| startDate | string | no | Optional start date filter. |
| endDate | string | no | Optional end date filter. |
token_usage Read
Get remaining Firecrawl extract tokens for the configured team.
- Lua path: app.integrations.firecrawl.token_usage
- Full name: firecrawl.firecrawl_token_usage
| Parameter | Type | Required | Description |
|---|---|---|---|
| No parameters. | | | |
historical_token_usage Read
Get historical Firecrawl extract token usage for the configured team.
- Lua path: app.integrations.firecrawl.historical_token_usage
- Full name: firecrawl.firecrawl_historical_token_usage
| Parameter | Type | Required | Description |
|---|---|---|---|
| startDate | string | no | Optional start date filter. |
| endDate | string | no | Optional end date filter. |
queue_status Read
Get Firecrawl scrape queue metrics for the configured team.
- Lua path: app.integrations.firecrawl.queue_status
- Full name: firecrawl.firecrawl_queue_status
| Parameter | Type | Required | Description |
|---|---|---|---|
| No parameters. | | | |
activity Read
List recent Firecrawl API activity for the configured team.
- Lua path: app.integrations.firecrawl.activity
- Full name: firecrawl.firecrawl_activity
| Parameter | Type | Required | Description |
|---|---|---|---|
| limit | integer | no | Maximum activities to return. |
| cursor | string | no | Pagination cursor. |
| endpoint | string | no | Optional endpoint filter. |