Ingest API Reference
The Ingest API allows external scanners and pipelines to submit structured signals into the APAC Tech Signals platform. It is designed for integration with local OpenClaw scanner instances, RSS processors, and custom data pipelines.
All API requests require a Bearer token in the Authorization header. The token must match the OPENCLAW_API_KEY configured on the server.
Authorization: Bearer <YOUR_OPENCLAW_API_KEY>
Keep your API key secure. Do not expose it in client-side code or public repositories.
https://your-domain.manus.space
Replace with your actual deployment URL. All endpoints are relative to this base.
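As a minimal sketch of the authentication and base URL rules, the helper below builds the full endpoint URL and the request headers. The function name `build_request` is illustrative and not part of the API; the Bearer header and the relative endpoint path come from this page.

```python
from urllib.parse import urljoin


def build_request(base_url: str, api_key: str, path: str) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for an Ingest API call."""
    # Endpoints are relative to the deployment base URL.
    url = urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return url, headers


url, headers = build_request(
    "https://your-domain.manus.space", "YOUR_OPENCLAW_API_KEY", "/api/openclaw/ingest"
)
print(url)
```

In a real pipeline the key would typically be read from an environment variable or a secrets store rather than written into the script.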
The API performs two-layer deduplication to prevent duplicate signals:
URL Hash Match
SHA-256 hash of the normalized source URL. Exact match returns 409 Conflict.
Headline Similarity
Jaccard similarity on word sets (threshold: 0.8) within the same company. Prevents near-duplicate headlines from different URLs.
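The two layers can be approximated on the client side, for example to skip obvious duplicates before sending. This is only a sketch: the server's URL normalization rules and its word tokenization are not documented here, so `normalize_url` and the regular expression below are assumptions, as is treating the threshold as inclusive.

```python
import hashlib
import re
from urllib.parse import urlsplit, urlunsplit

SIMILARITY_THRESHOLD = 0.8  # threshold stated in this section


def normalize_url(url: str) -> str:
    """Assumed normalization: lowercase host, drop fragment and trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def url_hash(url: str) -> str:
    """SHA-256 hex digest of the normalized URL."""
    return hashlib.sha256(normalize_url(url).encode("utf-8")).hexdigest()


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the lowercase word sets of two headlines."""
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


first = "Samsung Unveils New AI Chip for Edge Computing"
second = "Samsung unveils new AI chip for edge computing today"
# 8 shared words out of 9 distinct words, so the pair counts as a near-duplicate.
print(jaccard(first, second) >= SIMILARITY_THRESHOLD)
```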
POST /api/openclaw/ingest
Submit a single signal for ingestion. The signal is created with 'candidate' status for admin review.
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| headline | string | required | Signal headline / title |
| sourceUrl | string | required | Original source URL (used for deduplication) |
| summaryEn | string | optional | English summary paragraph |
| summaryZh | string | optional | Chinese summary paragraph |
| companySlug | string | optional | Company slug for matching (e.g., 'samsung-electronics') |
| companyName | string | optional | Company name for fuzzy matching (fallback if slug not found) |
| topicSlug | string | optional | Industry/topic slug (e.g., 'ai-models') |
| topicName | string | optional | Industry/topic name for fuzzy matching |
| eventType | string | optional | Event classification (e.g., 'product_launch', 'partnership', 'funding') |
| sourceClass | string | optional | 'canonical' (official source) or 'discovery' (third-party). Default: 'canonical' |
| sourceName | string | optional | Human-readable source name (e.g., 'Samsung Newsroom') |
| originalPublishedAt | string | optional | ISO 8601 date string of original publication |
| region | string | optional | Geographic region (e.g., 'East Asia', 'Southeast Asia') |
| country | string | optional | Country name (e.g., 'South Korea', 'Japan') |
| keyTakeaways | string \| string[] | optional | Key takeaways (single string or array) |
| whyItMatters | string | optional | Analysis of why this signal matters |
| whyItMattersZh | string | optional | Chinese version of why it matters |
| apacRelevanceNote | string | optional | Note on APAC-specific relevance |
| confidenceScore | number | optional | Confidence score (0-100) from the extraction pipeline |
| rawContent | string | optional | Raw extracted content for audit trail |
| processingNotes | string | optional | Notes from the processing pipeline |
| language | string | optional | Source language code (e.g., 'en', 'zh', 'ja') |
| canonicalSourceType | string | optional | Canonical source type (e.g., 'newsroom', 'ir_page', 'press_release') |
| discoverySourceUrl | string | optional | URL where the signal was first discovered (if different from sourceUrl) |
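A pipeline may want to check payloads locally before sending them. The sketch below covers only the constraints stated in the table (required fields, the two sourceClass values, the 0-100 score range, and the keyTakeaways shape); the function name `validate_signal` is hypothetical, and the server may apply further rules that are not listed here.

```python
def validate_signal(signal: dict) -> list[str]:
    """Return a list of problems found in a signal payload (empty if none)."""
    problems = []
    for field in ("headline", "sourceUrl"):
        value = signal.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing required field: {field}")

    source_class = signal.get("sourceClass", "canonical")  # documented default
    if source_class not in ("canonical", "discovery"):
        problems.append(f"unknown sourceClass: {source_class!r}")

    score = signal.get("confidenceScore")
    if score is not None and not (isinstance(score, (int, float)) and 0 <= score <= 100):
        problems.append("confidenceScore must be a number from 0 to 100")

    takeaways = signal.get("keyTakeaways")
    if takeaways is not None and not (
        isinstance(takeaways, str)
        or (isinstance(takeaways, list) and all(isinstance(t, str) for t in takeaways))
    ):
        problems.append("keyTakeaways must be a string or a list of strings")
    return problems


print(validate_signal({"headline": "TSMC Reports Record Q4 Revenue"}))
```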
Example Request
curl -X POST https://your-domain.manus.space/api/openclaw/ingest \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "headline": "Samsung Unveils New AI Chip for Edge Computing",
    "sourceUrl": "https://news.samsung.com/global/ai-chip-2026",
    "summaryEn": "Samsung announced a new AI accelerator chip...",
    "companySlug": "samsung-electronics",
    "topicSlug": "semiconductors",
    "eventType": "product_launch",
    "sourceClass": "canonical",
    "sourceName": "Samsung Newsroom",
    "originalPublishedAt": "2026-03-08T09:00:00Z",
    "region": "East Asia",
    "country": "South Korea",
    "confidenceScore": 85,
    "keyTakeaways": [
      "New chip targets edge AI workloads",
      "40% improvement in power efficiency"
    ]
  }'
Success Response (201)
{
  "success": true,
  "signal": {
    "id": 142,
    "slug": "samsung-unveils-new-ai-chip-for-edge-computing-a1b2c3d4",
    "status": "candidate",
    "companyId": 5,
    "companyMatchMethod": "slug",
    "topicId": 3,
    "topicMatchMethod": "slug",
    "deduplicationHash": "e3b0c44298fc1c149afb..."
  }
}
Error Responses
Missing required fields (headline, sourceUrl)
Missing or invalid Authorization header
Invalid API key
409 Conflict: Duplicate signal detected (response includes duplicateType and existingSignalId)
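A client might condense these outcomes into a single log line. The sketch below relies only on what this page states: a 201 response carries a `signal` object, and a 409 response carries `duplicateType` and `existingSignalId`. The function name is hypothetical, and reading an `error` key from other failure bodies is an assumption borrowed from the batch result items.

```python
def describe_ingest_result(status_code: int, body: dict) -> str:
    """Summarize a single-ingest response for logging."""
    if status_code == 201 and body.get("success"):
        signal = body.get("signal", {})
        return f"created signal #{signal.get('id')} ({signal.get('status')})"
    if status_code == 409:
        return (
            f"duplicate ({body.get('duplicateType')}) of signal "
            f"#{body.get('existingSignalId')}"
        )
    # Remaining failures are validation or authentication problems.
    # Assumption: the body may carry an "error" message.
    return f"failed with HTTP {status_code}: {body.get('error', 'unknown error')}"


print(describe_ingest_result(409, {"duplicateType": "url_exact", "existingSignalId": 98}))
```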
POST /api/openclaw/ingest/batch
Submit multiple signals in a single batch. The API creates an ingest batch record for tracking and audit. Maximum 100 signals per batch.
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| signals | SignalInput[] | required | Array of signal objects (max 100 per batch) |
| batchLabel | string | optional | Human-readable label for this batch |
| sourceMethod | string | optional | One of 'rss', 'api', 'crawler', 'manual', 'bulk_import' |
| triggeredBy | string | optional | Identifier of who/what triggered this import |
Each item in the signals array follows the same schema as the single ingest endpoint above.
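Because a batch accepts at most 100 signals, a larger collection has to be split across several requests. A minimal sketch of that chunking follows; the helper names and the "(part N)" label suffix are illustrative choices, not API requirements.

```python
from typing import Iterator

MAX_BATCH_SIZE = 100  # documented per-batch limit


def chunk_signals(signals: list[dict], size: int = MAX_BATCH_SIZE) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `size` signals."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(signals), size):
        yield signals[start:start + size]


def build_batch_payloads(signals: list[dict], label: str, source_method: str) -> list[dict]:
    """Build one batch request body per chunk, numbering the batch labels."""
    return [
        {"batchLabel": f"{label} (part {i})", "sourceMethod": source_method, "signals": chunk}
        for i, chunk in enumerate(chunk_signals(signals), start=1)
    ]


demo = [{"headline": f"Item {i}", "sourceUrl": f"https://example.com/{i}"} for i in range(250)]
print([len(p["signals"]) for p in build_batch_payloads(demo, "Demo import", "bulk_import")])
```

Each payload can then be posted to the batch endpoint in turn.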
Example Request
curl -X POST https://your-domain.manus.space/api/openclaw/ingest/batch \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "batchLabel": "Samsung RSS Feed - March 8",
    "sourceMethod": "rss",
    "triggeredBy": "openclaw-scanner-v1",
    "signals": [
      {
        "headline": "Samsung Unveils New AI Chip for Edge Computing",
        "sourceUrl": "https://news.samsung.com/global/ai-chip-2026",
        "companySlug": "samsung-electronics",
        "topicSlug": "semiconductors",
        "eventType": "product_launch",
        "confidenceScore": 85
      },
      {
        "headline": "Samsung Partners with TSMC on 2nm Process",
        "sourceUrl": "https://news.samsung.com/global/tsmc-partnership",
        "companySlug": "samsung-electronics",
        "topicSlug": "semiconductors",
        "eventType": "partnership",
        "confidenceScore": 72
      }
    ]
  }'
Success Response (200)
{
  "success": true,
  "batch": {
    "id": 5,
    "batchKey": "batch-1709884800000-a1b2c3d4",
    "totalItems": 2,
    "newSignals": 1,
    "duplicates": 1,
    "errors": 0,
    "durationMs": 342
  },
  "results": [
    {
      "index": 0,
      "success": true,
      "signalId": 143,
      "slug": "samsung-unveils-new-ai-chip-...",
      "status": "candidate",
      "companyId": 5,
      "companyMatchMethod": "slug",
      "topicId": 3,
      "topicMatchMethod": "slug"
    },
    {
      "index": 1,
      "success": false,
      "error": "Duplicate signal (url_exact)",
      "duplicateType": "url_exact",
      "existingSignalId": 98
    }
  ]
}
The API resolves companies and topics using a two-step matching strategy:
Step 1: Slug Match (exact)
If companySlug is provided, the API looks for an exact match in the database. This is the preferred method.
Step 2: Name Fuzzy Match (fallback)
If slug match fails and companyName is provided, the API performs a LIKE query. Less precise but useful for discovery sources.
The response includes companyMatchMethod and topicMatchMethod fields so you can verify how the match was resolved.
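The two-step strategy can be pictured with a small in-memory sketch. It is not the server's implementation: the `COMPANIES` list stands in for the database, the substring test only approximates a LIKE query, the TSMC id is invented, and the "name" match-method value is an assumption (only "slug" appears in the example responses).

```python
from typing import Optional

# Hypothetical in-memory stand-in for the companies table.
COMPANIES = [
    {"id": 5, "slug": "samsung-electronics", "name": "Samsung Electronics"},
    {"id": 7, "slug": "tsmc", "name": "Taiwan Semiconductor Manufacturing Company"},
]


def resolve_company(
    slug: Optional[str], name: Optional[str]
) -> tuple[Optional[int], Optional[str]]:
    """Return (companyId, matchMethod), trying the slug first and the name second."""
    # Step 1: exact slug match.
    if slug:
        for company in COMPANIES:
            if company["slug"] == slug:
                return company["id"], "slug"
    # Step 2: fallback, roughly a case-insensitive LIKE '%name%' query.
    needle = (name or "").strip().lower()
    if needle:
        for company in COMPANIES:
            if needle in company["name"].lower():
                return company["id"], "name"
    return None, None


print(resolve_company("samsung-electronics", None))
print(resolve_company("unknown-slug", "Samsung"))
```

The second call shows why the slug is preferred: a short name such as "Samsung" would match any company whose name contains it.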
All ingested signals enter the review pipeline with 'candidate' status, pending admin review.
Signals with a low confidenceScore (0-39) are automatically flagged for priority review by the admin team.
RSS Scanner
# Python example: RSS feed scanner
import time

import feedparser
import requests

feed = feedparser.parse("https://news.samsung.com/global/feed")

signals = []
for entry in feed.entries[:20]:
    signal = {
        "headline": entry.title,
        "sourceUrl": entry.link,
        "summaryEn": entry.get("summary", ""),
        "companySlug": "samsung-electronics",
        "topicSlug": "semiconductors",
        "eventType": "news",
        "sourceClass": "canonical",
        "sourceName": "Samsung Newsroom RSS",
        "confidenceScore": 90,
    }
    # RSS dates are usually not ISO 8601; feedparser also exposes the parsed
    # date as a UTC struct_time, which is converted here. Omit the field if absent.
    published = entry.get("published_parsed")
    if published:
        signal["originalPublishedAt"] = time.strftime("%Y-%m-%dT%H:%M:%SZ", published)
    signals.append(signal)

response = requests.post(
    "https://your-domain.manus.space/api/openclaw/ingest/batch",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "signals": signals,
        "batchLabel": "Samsung RSS Feed",
        "sourceMethod": "rss",
        "triggeredBy": "rss-scanner-v1",
    },
)
print(response.json())
OpenClaw Crawler
# Python example: OpenClaw crawler integration
import requests

# After OpenClaw extracts structured data from a webpage:
extracted = {
    "headline": "TSMC Reports Record Q4 Revenue",
    "sourceUrl": "https://www.tsmc.com/english/news/2026-q4",
    "summaryEn": "TSMC reported record quarterly revenue of...",
    "companySlug": "tsmc",
    "topicSlug": "semiconductors",
    "eventType": "earnings",
    "sourceClass": "canonical",
    "sourceName": "TSMC Investor Relations",
    "confidenceScore": 75,
    "rawContent": "<extracted HTML content for audit>",
}

response = requests.post(
    "https://your-domain.manus.space/api/openclaw/ingest",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=extracted,
)
result = response.json()

# Use .get() so an error body without a "success" key does not raise KeyError
if result.get("success"):
    print(f"Signal created: #{result['signal']['id']}")
elif response.status_code == 409:
    print(f"Duplicate: {result.get('duplicateType')}")
else:
    print(f"Ingest failed with HTTP {response.status_code}")
Batch Size
Max 100 signals per batch request
Dedup Window
URL hash matching is permanent; headline similarity checks only the last 50 signals per company
Slug Matching
Always prefer companySlug over companyName for reliable matching
Confidence Score
Set 70-100 for high confidence, 40-69 for medium, 0-39 for low (triggers priority review)
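The confidence bands above translate directly into a small helper that a pipeline could use to label its own output before submission. The function name `confidence_band` is illustrative; the boundaries are the ones listed here.

```python
def confidence_band(score: float) -> str:
    """Map a 0-100 confidence score to the documented band."""
    if not 0 <= score <= 100:
        raise ValueError("confidenceScore must be between 0 and 100")
    if score >= 70:
        return "high"
    if score >= 40:
        return "medium"
    return "low"  # this band triggers priority review


for score in (85, 55, 20):
    print(score, confidence_band(score))
```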
