Sources ingest API

The Sources REST API lets you push knowledge into a Pitchbar workspace from any system that can make an HTTP request. It is useful for piping content out of an internal wiki, a CMS, a data pipeline, or a curated CSV without going through the admin UI.

Authentication

Every request needs a workspace API token issued from Settings → API tokens. The token must carry the sources:write ability. The plaintext is shown once at creation; only the SHA-256 hash is persisted.

Authorization: Bearer pbar_…48-character-token…

Tokens are workspace-scoped: anything you push lands in the token's workspace, never another. Revoke a token from the same page; a revoked token fails on its next call, and historical audit rows are unaffected.
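
As a minimal sketch, this is one way to attach the token from Python using only the standard library. The helper name build_request and the host app.example.com are illustrative assumptions, and the pbar_ prefix check is a client-side convenience inferred from the header example above, not documented validation.

```python
import urllib.request


def build_request(host: str, path: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated request to the Sources API."""
    # Assumption: tokens start with "pbar_", as in the header example above.
    if not token.startswith("pbar_"):
        raise ValueError("expected a token starting with 'pbar_'")
    url = f"https://{host}/api/v1/workspace{path}"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    )


# Hypothetical host and placeholder token, for illustration only.
req = build_request("app.example.com", "/sources", "pbar_xxxx")
print(req.full_url)
print(req.get_header("Authorization"))
```

Because the plaintext is shown only once, a real script would more likely read the token from an environment variable or a secret store than hard-code it.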

Endpoints

List sources

GET https://{your-pitchbar-host}/api/v1/workspace/sources

Returns up to the 100 most recent sources for the token's workspace.

{
    "data": [
        {
            "id": "019e2000-…",
            "agent_id": "019e1fec-…",
            "kind": "url",
            "status": "indexed",
            "config": { "url": "https://example.com/pricing" },
            "last_synced_at": "2026-05-13T08:00:00+00:00",
            "created_at": "2026-05-13T07:50:00+00:00"
        }
    ]
}
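
A short sketch of reading that response in Python, assuming the body has already been fetched as a string. The function name summarize_sources is hypothetical, and treating last_synced_at as possibly null (for sources that have not synced yet) is an assumption, not something the response sample confirms.

```python
import json
from datetime import datetime

# The sample body from above, with the elided ids kept as-is.
SAMPLE_BODY = """
{
    "data": [
        {
            "id": "019e2000-…",
            "agent_id": "019e1fec-…",
            "kind": "url",
            "status": "indexed",
            "config": { "url": "https://example.com/pricing" },
            "last_synced_at": "2026-05-13T08:00:00+00:00",
            "created_at": "2026-05-13T07:50:00+00:00"
        }
    ]
}
"""


def summarize_sources(body: str) -> list:
    """Reduce each row in "data" to the fields a monitoring script usually needs."""
    summary = []
    for row in json.loads(body)["data"]:
        synced = row.get("last_synced_at")  # assumed to be null before first sync
        summary.append({
            "id": row["id"],
            "kind": row["kind"],
            "status": row["status"],
            "last_synced_at": datetime.fromisoformat(synced) if synced else None,
        })
    return summary


for item in summarize_sources(SAMPLE_BODY):
    print(item["id"], item["kind"], item["status"], item["last_synced_at"])
```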

Create a source

POST https://{your-pitchbar-host}/api/v1/workspace/sources
Content-Type: application/json
Authorization: Bearer pbar_…

Three kinds are supported. agent_id must reference an agent in the token's workspace; the call 404s otherwise.

1. Crawl a single URL

{
    "agent_id": "019e1fec-…",
    "kind": "url",
    "url": "https://example.com/pricing"
}

2. Crawl a sitemap (fans out to every URL inside)

{
    "agent_id": "019e1fec-…",
    "kind": "sitemap",
    "url": "https://example.com/sitemap.xml"
}

3. Push raw text (skips the crawler, so it works for pages behind auth that the crawler cannot reach)

{
    "agent_id": "019e1fec-…",
    "kind": "text",
    "title": "Q3 pricing breakdown",
    "content": "Our Starter plan is …",
    "source_url": "https://example.com/internal/pricing"
}

The response (201 Created) returns the new source with status: "pending". The crawler / indexer runs on the Pitchbar queue and flips the status to indexed when chunks land in the vector store. Existing sources with the same content are deduped by SHA-256 hash, so re-running the same call is safe.
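
The three request bodies can be built and sent from Python roughly as follows. This is a sketch under stated assumptions: source_payload and create_source are hypothetical helper names, and the client-side checks (title and content both required for kind=text, source_url optional) are inferred from the examples above, not from a documented schema.

```python
import json
import urllib.request
from typing import Optional


def source_payload(agent_id: str, kind: str, *, url: Optional[str] = None,
                   title: Optional[str] = None, content: Optional[str] = None,
                   source_url: Optional[str] = None) -> dict:
    """Build the JSON body for one of the three supported kinds."""
    if kind in ("url", "sitemap"):
        if not url:
            raise ValueError(f"kind={kind!r} needs a url")
        return {"agent_id": agent_id, "kind": kind, "url": url}
    if kind == "text":
        # Assumption: title and content are required, source_url is optional.
        if not title or not content:
            raise ValueError("kind='text' needs a title and content")
        payload = {"agent_id": agent_id, "kind": "text",
                   "title": title, "content": content}
        if source_url:
            payload["source_url"] = source_url
        return payload
    raise ValueError(f"unsupported kind: {kind!r}")


def create_source(host: str, token: str, payload: dict) -> dict:
    """POST the payload and return the parsed response body.

    urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    """
    request = urllib.request.Request(
        f"https://{host}/api/v1/workspace/sources",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


body = source_payload("019e1fec-…", "text", title="Q3 pricing breakdown",
                      content="Our Starter plan is …")
print(json.dumps(body, ensure_ascii=False, indent=4))
```

Since identical content is deduped by hash, calling create_source again with the same payload after a network failure should be safe.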

Fetch one source

GET https://{your-pitchbar-host}/api/v1/workspace/sources/{id}

Returns a single source in the same shape as a row from the list endpoint.

Curl example

curl -X POST https://app.example.com/api/v1/workspace/sources \
  -H "Authorization: Bearer pbar_xxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "019e1fec-…",
    "kind": "text",
    "title": "Refund policy",
    "content": "Customers have 30 days to request a refund …"
  }'

Error shape

| Status | Body | Meaning |
|--------|------|---------|
| 401 | {"error":{"code":"missing_token"}} | No token in Authorization header. |
| 401 | {"error":{"code":"invalid_token"}} | Token revoked or doesn't match a workspace. |
| 403 | {"error":{"code":"missing_ability"}} | Token doesn't carry sources:write. |
| 404 | {"error":{"code":"agent_not_found"}} | agent_id isn't in the token's workspace. |
| 422 | Laravel validation envelope | Missing or malformed field. |
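
A client can turn these responses into readable messages along the following lines. The function name describe_error is hypothetical, and the handling of 422 assumes Laravel's default validation envelope (a top-level "errors" object keyed by field name), which this page does not spell out.

```python
import json

# Meanings copied from the error codes listed above.
MEANINGS = {
    "missing_token": "No token in Authorization header.",
    "invalid_token": "Token revoked or doesn't match a workspace.",
    "missing_ability": "Token doesn't carry sources:write.",
    "agent_not_found": "agent_id isn't in the token's workspace.",
}


def describe_error(status: int, body: str) -> str:
    """Map an error response to a one-line, human-readable description."""
    try:
        parsed = json.loads(body)
    except ValueError:  # covers json.JSONDecodeError
        parsed = {}
    if not isinstance(parsed, dict):
        parsed = {}

    error = parsed.get("error")
    code = error.get("code") if isinstance(error, dict) else None
    if code in MEANINGS:
        return f"{status} {code}: {MEANINGS[code]}"

    if status == 422:
        # Assumption: Laravel's default shape, {"message": ..., "errors": {...}}.
        errors = parsed.get("errors")
        fields = sorted(errors) if isinstance(errors, dict) else []
        return "422 validation failed: " + (", ".join(fields) or "see response body")

    return f"{status} unrecognized error"


print(describe_error(403, '{"error":{"code":"missing_ability"}}'))
```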

Rate limits

Per-token throttling kicks in at 120 requests / minute. Over-quota requests return 429 with a Retry-After header.
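
One way to respect the limit is to wait for the advertised Retry-After interval and try again. The sketch below keeps the HTTP call abstract: send is a hypothetical callable returning (status, headers, body), so any client library can be plugged in. It assumes Retry-After carries a number of seconds and falls back to a default delay otherwise (for example, if the header holds an HTTP date).

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Read Retry-After as seconds; use the default if missing or not numeric."""
    # Note: a plain dict lookup is case-sensitive; normalize header names if needed.
    raw = headers.get("Retry-After", "")
    try:
        return max(0.0, float(raw))
    except ValueError:
        return default


def send_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it returns something other than 429 or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = send()
        if response[0] != 429 or attempt == max_attempts:
            return response
        sleep(retry_after_seconds(response[1]))
    raise AssertionError("unreachable")


# Demo with canned responses instead of real HTTP calls.
fake = iter([(429, {"Retry-After": "2"}, ""), (201, {}, "{}")])
waits = []
status, _, _ = send_with_retry(lambda: next(fake), sleep=waits.append)
print(status, waits)  # 201 [2.0]
```

Passing sleep as a parameter keeps the helper testable without real delays.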

What happens after a successful push

  1. A Source row is created in the agent's workspace.
  2. For kind=url / kind=sitemap a CrawlSourceJob is queued. Pitchbar's crawler hits the URL (Cloudflare Browser Rendering → Browserless → plain HTTP).
  3. For kind=text an IndexTextSourceJob bypasses the crawler and sends the content straight into chunk extraction.
  4. Chunks are embedded with the workspace's configured embedding model and upserted to Cloudflare Vectorize / Qdrant.
  5. The next visitor question that matches the content gets it as a citation.

Status typically reaches indexed within 10 to 60 seconds of a push (longer for very large sitemaps that fan out to many pages). Watch the source row in the dashboard, or poll the GET endpoint until the status changes from pending.
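
Polling can be sketched as a small loop that stops as soon as the status is no longer pending. The names wait_for_transition and fetch_status are hypothetical; fetch_status stands in for a call to the fetch-one-source endpoint that returns the status field. The 5-second interval and 120-second timeout are illustrative defaults, chosen to stay well under the rate limit.

```python
import time
from typing import Callable


def wait_for_transition(fetch_status: Callable[[], str],
                        timeout: float = 120.0,
                        interval: float = 5.0,
                        sleep: Callable[[float], None] = time.sleep,
                        clock: Callable[[], float] = time.monotonic) -> str:
    """Poll until the source leaves "pending"; return the new status."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status != "pending":
            return status  # "indexed" on success; other values are not listed here
        if clock() >= deadline:
            raise TimeoutError("source still pending after timeout")
        sleep(interval)


# Demo with canned statuses instead of real GET requests.
statuses = iter(["pending", "pending", "indexed"])
final = wait_for_transition(lambda: next(statuses), sleep=lambda seconds: None)
print(final)  # indexed
```

For large sitemaps, a longer timeout is probably appropriate, since the fan-out can take well over a minute.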