Ingest API Reference

The Ingest API lets external scanners and pipelines submit structured signals to the APAC Tech Signals platform. It is designed for integration with local OpenClaw scanner instances, RSS processors, and custom data pipelines.

Authentication

All API requests require a Bearer token in the Authorization header. The token must match the OPENCLAW_API_KEY configured on the server.

Authorization: Bearer <YOUR_OPENCLAW_API_KEY>

Keep your API key secure. Do not expose it in client-side code or public repositories.
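As a minimal sketch of the header described above, the helper below builds the request headers in Python. The function name build_headers and the idea of reading the key from a client-side environment variable named OPENCLAW_API_KEY are illustrative assumptions, not part of the API.

```python
import os


def build_headers(api_key: str) -> dict[str, str]:
    """Return the headers sent with every Ingest API request."""
    if not api_key:
        raise ValueError("API key is empty; refusing to build an unauthenticated request")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Assumption: the client keeps the key in an environment variable so it never
# lands in source control. The variable name here is hypothetical.
api_key = os.environ.get("OPENCLAW_API_KEY", "")
headers = build_headers(api_key) if api_key else None
```

Reading the key at runtime, rather than hard-coding it, is one way to follow the advice about keeping it out of public repositories.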

Base URL

https://your-domain.manus.space

Replace this with your actual deployment URL. All endpoint paths are relative to this base.

Deduplication

The API performs two-layer deduplication to prevent duplicate signals:

Layer 1: URL Hash Match

The API computes a SHA-256 hash of the normalized source URL. An exact match with an existing signal returns 409 Conflict.
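The sketch below illustrates the idea of hashing a normalized URL. The normalization rules shown (lowercasing the scheme and host, dropping the fragment, trimming a trailing slash) are assumptions for illustration; the server's actual rules are not documented here, so the hashes may not match the deduplicationHash the API returns.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Hypothetical normalization: the real server-side rules may differ."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    # Lowercase scheme and host, keep the query string, drop the fragment.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def url_hash(url: str) -> str:
    """SHA-256 hex digest of the normalized URL."""
    return hashlib.sha256(normalize_url(url).encode("utf-8")).hexdigest()


a = url_hash("HTTPS://News.Samsung.com/global/ai-chip-2026/")
b = url_hash("https://news.samsung.com/global/ai-chip-2026#top")
print(a == b)  # both URLs normalize to the same string under these rules
```

A scanner could use a local hash like this to skip URLs it has already submitted before calling the API at all.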

Layer 2: Headline Similarity

The API computes Jaccard similarity on headline word sets (threshold: 0.8) against the last 50 signals for the same company. This catches near-duplicate headlines published under different URLs.
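To make the similarity check concrete, here is a small sketch of Jaccard similarity on word sets. The tokenization (lowercasing and splitting on non-alphanumeric characters) and the use of "greater than or equal to" at the 0.8 threshold are assumptions; only the metric and the threshold value come from the description above.

```python
import re

SIMILARITY_THRESHOLD = 0.8  # value stated in the deduplication description


def word_set(headline: str) -> set[str]:
    # Assumed tokenization: lowercase, split on anything that is not a letter or digit.
    return set(re.findall(r"[a-z0-9]+", headline.lower()))


def jaccard(a: str, b: str) -> float:
    """Size of the intersection divided by size of the union of the two word sets."""
    set_a, set_b = word_set(a), word_set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def is_near_duplicate(a: str, b: str) -> bool:
    return jaccard(a, b) >= SIMILARITY_THRESHOLD


original = "Samsung Unveils New AI Chip for Edge Computing"
variant = "Samsung Unveils New AI Chip for Edge Computing Devices"
other = "Samsung Partners with TSMC on 2nm Process"

print(is_near_duplicate(original, variant))  # 8 shared words out of 9 total
print(is_near_duplicate(original, other))    # 1 shared word out of 14 total
```

In this example the variant headline scores 8/9 (about 0.89) and would be treated as a near-duplicate, while the partnership headline scores 1/14 and would not.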

POST /api/openclaw/ingest

Submit a single signal for ingestion. The signal will be created with 'candidate' status for admin review.

Request Body

Field | Type | Required | Description
headline | string | required | Signal headline / title
sourceUrl | string | required | Original source URL (used for deduplication)
summaryEn | string | optional | English summary paragraph
summaryZh | string | optional | Chinese summary paragraph
companySlug | string | optional | Company slug for matching (e.g., 'samsung-electronics')
companyName | string | optional | Company name for fuzzy matching (fallback if slug not found)
topicSlug | string | optional | Industry/topic slug (e.g., 'ai-models')
topicName | string | optional | Industry/topic name for fuzzy matching
eventType | string | optional | Event classification (e.g., 'product_launch', 'partnership', 'funding')
sourceClass | string | optional | 'canonical' (official source) or 'discovery' (third-party). Default: 'canonical'
sourceName | string | optional | Human-readable source name (e.g., 'Samsung Newsroom')
originalPublishedAt | string | optional | ISO 8601 date string of original publication
region | string | optional | Geographic region (e.g., 'East Asia', 'Southeast Asia')
country | string | optional | Country name (e.g., 'South Korea', 'Japan')
keyTakeaways | string or string[] | optional | Key takeaways (single string or array)
whyItMatters | string | optional | Analysis of why this signal matters
whyItMattersZh | string | optional | Chinese version of why it matters
apacRelevanceNote | string | optional | Note on APAC-specific relevance
confidenceScore | number | optional | Confidence score (0-100) from the extraction pipeline
rawContent | string | optional | Raw extracted content for audit trail
processingNotes | string | optional | Notes from the processing pipeline
language | string | optional | Source language code (e.g., 'en', 'zh', 'ja')
canonicalSourceType | string | optional | Canonical source type (e.g., 'newsroom', 'ir_page', 'press_release')
discoverySourceUrl | string | optional | URL where the signal was first discovered (if different from sourceUrl)
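A pipeline may want to check a payload locally before sending it. The following is a rough pre-flight sketch based only on the constraints listed in the field table (two required strings, a 0-100 score, two allowed sourceClass values); the function name validate_signal is hypothetical and the server may apply additional checks.

```python
def validate_signal(signal: dict) -> list[str]:
    """Return a list of problems found in a signal payload (empty if none)."""
    problems = []

    # headline and sourceUrl are the only required fields.
    for field in ("headline", "sourceUrl"):
        value = signal.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{field} is required and must be a non-empty string")

    # confidenceScore is optional, but documented as a number from 0 to 100.
    score = signal.get("confidenceScore")
    if score is not None and not (isinstance(score, (int, float)) and 0 <= score <= 100):
        problems.append("confidenceScore must be a number between 0 and 100")

    # sourceClass is optional and defaults to 'canonical' when omitted.
    source_class = signal.get("sourceClass")
    if source_class is not None and source_class not in ("canonical", "discovery"):
        problems.append("sourceClass must be 'canonical' or 'discovery'")

    return problems


print(validate_signal({"headline": "TSMC Reports Record Q4 Revenue"}))
```

Running the last line reports the missing sourceUrl, which would otherwise come back from the API as a 400 response.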

Example Request

curl -X POST https://your-domain.manus.space/api/openclaw/ingest \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "headline": "Samsung Unveils New AI Chip for Edge Computing",
    "sourceUrl": "https://news.samsung.com/global/ai-chip-2026",
    "summaryEn": "Samsung announced a new AI accelerator chip...",
    "companySlug": "samsung-electronics",
    "topicSlug": "semiconductors",
    "eventType": "product_launch",
    "sourceClass": "canonical",
    "sourceName": "Samsung Newsroom",
    "originalPublishedAt": "2026-03-08T09:00:00Z",
    "region": "East Asia",
    "country": "South Korea",
    "confidenceScore": 85,
    "keyTakeaways": [
      "New chip targets edge AI workloads",
      "40% improvement in power efficiency"
    ]
  }'

Success Response (201)

{
  "success": true,
  "signal": {
    "id": 142,
    "slug": "samsung-unveils-new-ai-chip-for-edge-computing-a1b2c3d4",
    "status": "candidate",
    "companyId": 5,
    "companyMatchMethod": "slug",
    "topicId": 3,
    "topicMatchMethod": "slug",
    "deduplicationHash": "e3b0c44298fc1c149afb..."
  }
}

Error Responses

400: Missing required fields (headline, sourceUrl)
401: Missing or invalid Authorization header
403: Invalid API key
409: Duplicate signal detected (includes duplicateType and existingSignalId)
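One way a client might branch on these status codes is sketched below. The outcome labels and the function name classify_ingest_response are illustrative assumptions; only the status codes and the 409 fields come from the list above.

```python
def classify_ingest_response(status_code: int, body: dict) -> str:
    """Map a single-ingest response to a coarse outcome label (labels are illustrative)."""
    if status_code == 201:
        return "created"
    if status_code == 409:
        # Duplicates carry duplicateType and existingSignalId; skip rather than retry.
        return "duplicate"
    if status_code == 400:
        # Missing headline or sourceUrl; fix the payload before resending.
        return "invalid_payload"
    if status_code in (401, 403):
        # Header or key problem; retrying with the same credentials will not help.
        return "auth_error"
    return "unexpected"


outcome = classify_ingest_response(409, {"duplicateType": "url_exact", "existingSignalId": 98})
print(outcome)
```

Treating 409 as a normal, non-retryable outcome keeps a scanner from resubmitting the same article on every run.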

POST /api/openclaw/ingest/batch

Submit multiple signals in a single batch. Creates an ingest batch record for tracking and audit. Maximum 100 signals per batch.

Request Body

Field | Type | Required | Description
signals | SignalInput[] | required | Array of signal objects (max 100 per batch)
batchLabel | string | optional | Human-readable label for this batch
sourceMethod | string | optional | One of 'rss', 'api', 'crawler', 'manual', 'bulk_import'
triggeredBy | string | optional | Identifier of who/what triggered this import

Each item in the signals array follows the same schema as the single ingest endpoint above.
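Because a batch may hold at most 100 signals, a pipeline with a longer backlog needs to split it across several requests. The helper below is a minimal sketch of that splitting; the name chunk_signals is hypothetical.

```python
from typing import Iterator

MAX_BATCH_SIZE = 100  # documented per-batch limit


def chunk_signals(signals: list[dict], size: int = MAX_BATCH_SIZE) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `size` signals each."""
    if not 1 <= size <= MAX_BATCH_SIZE:
        raise ValueError(f"size must be between 1 and {MAX_BATCH_SIZE}")
    for start in range(0, len(signals), size):
        yield signals[start:start + size]


# 250 placeholder signals split into batches of 100, 100, and 50.
backlog = [{"headline": f"Item {i}", "sourceUrl": f"https://example.com/{i}"} for i in range(250)]
batch_sizes = [len(batch) for batch in chunk_signals(backlog)]
print(batch_sizes)
```

Each yielded list can then be sent as the signals array of one batch request.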

Example Request

curl -X POST https://your-domain.manus.space/api/openclaw/ingest/batch \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "batchLabel": "Samsung RSS Feed - March 8",
    "sourceMethod": "rss",
    "triggeredBy": "openclaw-scanner-v1",
    "signals": [
      {
        "headline": "Samsung Unveils New AI Chip for Edge Computing",
        "sourceUrl": "https://news.samsung.com/global/ai-chip-2026",
        "companySlug": "samsung-electronics",
        "topicSlug": "semiconductors",
        "eventType": "product_launch",
        "confidenceScore": 85
      },
      {
        "headline": "Samsung Partners with TSMC on 2nm Process",
        "sourceUrl": "https://news.samsung.com/global/tsmc-partnership",
        "companySlug": "samsung-electronics",
        "topicSlug": "semiconductors",
        "eventType": "partnership",
        "confidenceScore": 72
      }
    ]
  }'

Success Response (200)

{
  "success": true,
  "batch": {
    "id": 5,
    "batchKey": "batch-1709884800000-a1b2c3d4",
    "totalItems": 2,
    "newSignals": 1,
    "duplicates": 1,
    "errors": 0,
    "durationMs": 342
  },
  "results": [
    {
      "index": 0,
      "success": true,
      "signalId": 143,
      "slug": "samsung-unveils-new-ai-chip-...",
      "status": "candidate",
      "companyId": 5,
      "companyMatchMethod": "slug",
      "topicId": 3,
      "topicMatchMethod": "slug"
    },
    {
      "index": 1,
      "success": false,
      "error": "Duplicate signal (url_exact)",
      "duplicateType": "url_exact",
      "existingSignalId": 98
    }
  ]
}
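Note that a batch can return 200 even when individual items fail, so the per-item results array needs to be inspected. The sketch below groups the results using only the fields shown in the example response; the function name summarize_batch and the shape of its return value are illustrative assumptions.

```python
def summarize_batch(body: dict) -> dict:
    """Group per-item batch results into created, duplicate, and failed buckets."""
    created, duplicates, failed = [], [], []
    for item in body.get("results", []):
        if item.get("success"):
            created.append(item["signalId"])
        elif "duplicateType" in item:
            # Pair the input index with the signal it collided with.
            duplicates.append((item["index"], item.get("existingSignalId")))
        else:
            failed.append((item["index"], item.get("error")))
    return {"created": created, "duplicates": duplicates, "failed": failed}


example = {
    "success": True,
    "results": [
        {"index": 0, "success": True, "signalId": 143},
        {"index": 1, "success": False, "error": "Duplicate signal (url_exact)",
         "duplicateType": "url_exact", "existingSignalId": 98},
    ],
}
summary = summarize_batch(example)
print(summary)
```

The index values refer to positions in the submitted signals array, which lets a scanner map each failure back to its source item.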

Company & Topic Matching

The API resolves companies and topics using a two-step matching strategy:

Step 1: Slug Match (exact)

If companySlug (or topicSlug) is provided, the API looks for an exact match in the database. This is the preferred method.

Step 2: Name Fuzzy Match (fallback)

If the slug match fails and companyName (or topicName) is provided, the API falls back to a LIKE query on the name. This is less precise but useful for discovery sources.

The response includes companyMatchMethod and topicMatchMethod fields so you can verify how the match was resolved.
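The two-step strategy can be illustrated with an in-memory SQLite database. This is a sketch under stated assumptions, not the server's implementation: the companies table, its columns, and the "name" label returned for a fuzzy match are all hypothetical (the documented examples only show "slug").

```python
import sqlite3


def resolve_company(conn, slug=None, name=None):
    """Return (company_id, match_method), or (None, None) when nothing matches."""
    if slug:
        # Step 1: exact slug match.
        row = conn.execute("SELECT id FROM companies WHERE slug = ?", (slug,)).fetchone()
        if row:
            return row[0], "slug"
    if name:
        # Step 2: LIKE fallback. Wildcards in `name` are not escaped in this sketch.
        row = conn.execute(
            "SELECT id FROM companies WHERE name LIKE ? ORDER BY id LIMIT 1",
            (f"%{name}%",),
        ).fetchone()
        if row:
            return row[0], "name"  # hypothetical label for a fuzzy match
    return None, None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (id INTEGER PRIMARY KEY, slug TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO companies VALUES (?, ?, ?)",
    [(5, "samsung-electronics", "Samsung Electronics"), (7, "tsmc", "TSMC")],
)

print(resolve_company(conn, slug="samsung-electronics"))
print(resolve_company(conn, slug="samsung", name="Samsung"))
```

The second call shows why slugs are preferred: the misspelled slug fails, and the name fallback succeeds only because a substring happens to match.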

Signal Lifecycle

All ingested signals enter the review pipeline with the following status flow:

candidate -> in_review -> approved -> published
Alternative paths: rejected, archived, draft

Signals with a low confidenceScore (0-39) are automatically flagged for priority review by the admin team.

Integration Patterns

RSS Scanner

# Python example: RSS feed scanner
import time

import feedparser
import requests

feed = feedparser.parse("https://news.samsung.com/global/feed")
signals = []

# Take the 20 newest entries, well under the 100-signal batch limit
for entry in feed.entries[:20]:
    signal = {
        "headline": entry.title,
        "sourceUrl": entry.link,
        "summaryEn": entry.get("summary", ""),
        "companySlug": "samsung-electronics",
        "topicSlug": "semiconductors",
        "eventType": "news",
        "sourceClass": "canonical",
        "sourceName": "Samsung Newsroom RSS",
        "confidenceScore": 90,
    }
    # RSS dates are not ISO 8601; convert feedparser's parsed UTC time instead
    published = entry.get("published_parsed")
    if published:
        signal["originalPublishedAt"] = time.strftime("%Y-%m-%dT%H:%M:%SZ", published)
    signals.append(signal)

response = requests.post(
    "https://your-domain.manus.space/api/openclaw/ingest/batch",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "signals": signals,
        "batchLabel": "Samsung RSS Feed",
        "sourceMethod": "rss",
        "triggeredBy": "rss-scanner-v1",
    },
    timeout=30,
)
print(response.status_code, response.json())

OpenClaw Crawler

# Python example: OpenClaw crawler integration
import requests

# After OpenClaw extracts structured data from a webpage:
extracted = {
    "headline": "TSMC Reports Record Q4 Revenue",
    "sourceUrl": "https://www.tsmc.com/english/news/2026-q4",
    "summaryEn": "TSMC reported record quarterly revenue of...",
    "companySlug": "tsmc",
    "topicSlug": "semiconductors",
    "eventType": "earnings",
    "sourceClass": "canonical",
    "sourceName": "TSMC Investor Relations",
    "confidenceScore": 75,
    "rawContent": "<extracted HTML content for audit>",
}

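response = requests.post(
    "https://your-domain.manus.space/api/openclaw/ingest",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=extracted,
    timeout=30,
)

result = response.json()
if response.status_code == 201 and result.get("success"):
    print(f"Signal created: #{result['signal']['id']}")
elif response.status_code == 409:
    # Duplicate responses include duplicateType and existingSignalId
    print(f"Duplicate: {result.get('duplicateType')} (existing signal #{result.get('existingSignalId')})")
else:
    print(f"Ingest failed with HTTP {response.status_code}: {result}")

Limits & Best Practices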

Batch Size

Max 100 signals per batch request

Dedup Window

URL hash matches never expire; headline similarity is checked against the last 50 signals per company

Slug Matching

Always prefer companySlug over companyName for reliable matching

Confidence Score

Set 70-100 for high confidence, 40-69 for medium, 0-39 for low (triggers priority review)
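The confidence bands above translate directly into a small helper that a pipeline could use to label its own output before submission. The function name confidence_tier and the tier labels are illustrative; the band boundaries are the ones listed above.

```python
def confidence_tier(score: float) -> str:
    """Map a 0-100 confidence score to the documented band."""
    if not 0 <= score <= 100:
        raise ValueError("confidenceScore must be between 0 and 100")
    if score >= 70:
        return "high"
    if score >= 40:
        return "medium"
    return "low"  # this band triggers priority review


for score in (85, 72, 40, 39):
    print(score, confidence_tier(score))
```

Scores from the examples in this reference (85, 72, 90, 75) all fall in the high band.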