XHub Platform Docs
Everything you need to integrate XHub into your workflow — from a five-minute quickstart to full API reference and advanced configuration.
Overview
XHub is an intelligence platform built on top of the full stream of X (formerly Twitter). It indexes posts in real time, applies a multi-stage NLP pipeline, and lets you retrieve structured analysis via a web interface or REST API.
Unlike X's own search, XHub does not filter by algorithmic relevance. It indexes the complete post corpus and lets you define your own relevance model through query parameters, filters, and weighting rules.
What XHub does
- Ingests the full public post stream of X — approximately 40 million posts per day.
- Indexes every post, account, hashtag, and entity into a structured database with sub-second query latency.
- Analyzes each query result using sentiment scoring, entity extraction, narrative clustering, and influence weighting.
- Returns a structured report with a natural-language summary, statistical breakdown, and a complete list of source posts with direct X links.
What XHub does not do
- XHub does not post, reply, or take any write action on X.
- XHub does not access private accounts or direct messages.
- XHub does not provide real-time socket push — use the streaming endpoint for near-real-time polling.
All data returned by XHub is sourced from public posts only. XHub complies with X's API terms of service and data use policies.
Quickstart
Get your first report in under five minutes.
1. Get an API key
Sign up at xhub.io and navigate to Settings → API Keys. Create a new key. Keep it secret — treat it like a password.
2. Make your first request
# POST a query and receive a structured report
curl -X POST https://api.xhub.io/v1/query \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "AI breakthroughs this week",
    "timeframe": "7d",
    "limit": 500,
    "sentiment": true,
    "sources": true
  }'
3. Read the response
{
  "id": "rpt_01hwx9bkzabcd1234",
  "query": "AI breakthroughs this week",
  "generated_at": "2025-03-19T10:42:00Z",
  "summary": "Discourse around AI breakthroughs peaked on Mar 15...",
  "post_count": 847203,
  "sentiment": {
    "positive": 0.62,
    "negative": 0.21,
    "neutral": 0.17
  },
  "top_accounts": ["@elonmusk", "@sama", "@ylecun"],
  "sources": [
    {
      "post_id": "1899234567890",
      "author": "@sama",
      "text": "We are approaching a critical threshold...",
      "url": "https://x.com/sama/status/1899234567890",
      "engagement": 142800,
      "sentiment_score": 0.78
    }
  ]
}
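A minimal sketch of reading this response with the Python standard library. The `raw` string is an abbreviated copy of the sample above, and the 0.01 tolerance on the sentiment sum is an assumption made for illustration, not a documented guarantee.

```python
import json
import math

# Abbreviated copy of the sample response shown above.
raw = """
{
  "id": "rpt_01hwx9bkzabcd1234",
  "post_count": 847203,
  "sentiment": {"positive": 0.62, "negative": 0.21, "neutral": 0.17},
  "sources": [
    {
      "author": "@sama",
      "url": "https://x.com/sama/status/1899234567890",
      "engagement": 142800,
      "sentiment_score": 0.78
    }
  ]
}
"""

report = json.loads(raw)

# The aggregate shares are expected to sum to 1 (tolerance is an assumption).
total = sum(report["sentiment"].values())
assert math.isclose(total, 1.0, abs_tol=0.01)

# Label with the largest aggregate share.
dominant = max(report["sentiment"], key=report["sentiment"].get)

for source in report["sources"]:
    print(source["author"], source["url"], source["sentiment_score"])
```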
Authentication
All API requests must include a bearer token in the Authorization header.
Authorization: Bearer xhub_live_sk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Key types
| Prefix | Environment | Description |
|---|---|---|
| xhub_live_sk_ | Production | Full access to the production dataset and all endpoints. |
| xhub_test_sk_ | Sandbox | Returns synthetic data. Safe for development — no rate limits. |
Never expose API keys in client-side code or public repositories. If a key is compromised, revoke it immediately under Settings → API Keys → Revoke.
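One way to keep keys out of source code is to read them from the environment and check the prefix before sending anything. The sketch below assumes the `XHUB_API_KEY` environment variable; the helper names `key_environment` and `auth_headers` are hypothetical and not part of any XHub SDK.

```python
import os
from typing import Optional

# Prefixes taken from the key types table above.
KEY_PREFIXES = {
    "xhub_live_sk_": "production",
    "xhub_test_sk_": "sandbox",
}


def key_environment(api_key: str) -> str:
    """Return 'production' or 'sandbox' based on the key prefix."""
    for prefix, environment in KEY_PREFIXES.items():
        if api_key.startswith(prefix):
            return environment
    raise ValueError("unrecognized API key prefix")


def auth_headers(api_key: Optional[str] = None) -> dict:
    """Build request headers, falling back to the XHUB_API_KEY variable."""
    key = api_key or os.environ.get("XHUB_API_KEY", "")
    key_environment(key)  # fail early on a missing or malformed key
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Checking the prefix locally also makes it harder to run a development script against the production dataset by accident.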
Queries
A query is the primary way to retrieve data from XHub. Queries accept natural language, boolean expressions, or a structured filter object.
Query syntax
XHub supports three query modes:
- Natural language: plain English description of the topic. XHub uses semantic matching to retrieve relevant posts beyond exact keyword hits.
- Boolean: standard AND, OR, NOT operators with parentheses.
- Structured filter: a JSON object with explicit field filters (account, hashtag, date, language, engagement threshold).
# Natural language
"AI breakthroughs announced this week"

# Boolean
"(GPT-5 OR Claude OR Gemini) AND NOT rumor"

# Structured filter object
{
  "keywords": ["GPT-5", "Claude"],
  "from_accounts": ["@openai", "@anthropic"],
  "min_engagement": 1000,
  "language": "en",
  "timeframe": { "from": "2025-03-01", "to": "2025-03-19" }
}
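The boolean and structured modes can be assembled programmatically. The sketch below is illustrative only: `boolean_query` is a hypothetical helper that assumes single-token terms (quoting rules for multi-word terms are not described here), and the request body assumes the structured filter object is passed as the `query` value.

```python
import json


def boolean_query(any_of, none_of=()):
    """Build a string of the form (A OR B ...) AND NOT X AND NOT Y ..."""
    query = "(" + " OR ".join(any_of) + ")"
    for term in none_of:
        query += f" AND NOT {term}"
    return query


# Same filter as the structured example above.
structured_filter = {
    "keywords": ["GPT-5", "Claude"],
    "from_accounts": ["@openai", "@anthropic"],
    "min_engagement": 1000,
    "language": "en",
    "timeframe": {"from": "2025-03-01", "to": "2025-03-19"},
}

# Assumption: the filter object is sent in place of the query string.
body = json.dumps({"query": structured_filter, "sentiment": True})

print(boolean_query(["GPT-5", "Claude", "Gemini"], none_of=["rumor"]))
```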
Reports
A report is the output of a query. Every report contains a natural-language summary, statistical breakdowns, and a source list. Reports are immutable once generated — they represent a snapshot of the corpus at the time of generation.
Report object schema
| Field | Type | Description |
|---|---|---|
| id | string | Unique report identifier. Prefixed rpt_. |
| query | string | The original query string. |
| generated_at | string (ISO8601) | Timestamp of report generation. |
| summary | string | Natural-language synthesis of the top findings. |
| post_count | integer | Total number of posts analyzed. |
| sentiment | object | Aggregate sentiment: positive, negative, neutral (0–1 floats, sum to 1). |
| top_accounts | string[] | Accounts with highest influence weight in the result set. |
| trending_hashtags | string[] | Most frequent hashtags within the matched posts. |
| sources | Source[] | Ordered list of source posts. See Source object below. |
Source object schema
| Field | Type | Description |
|---|---|---|
| post_id | string | X post ID. |
| author | string | X handle (e.g. @sama). |
| text | string | Full text of the post at time of indexing. |
| url | string (URL) | Direct link to the original post on X. |
| posted_at | string (ISO8601) | Timestamp of original post. |
| engagement | integer | Combined likes + reposts + replies at time of indexing. |
| sentiment_score | float | Per-post sentiment: -1.0 (negative) to +1.0 (positive). |
| relevance_score | float | Query relevance: 0.0–1.0. |
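For typed access on the client side, the Source schema maps naturally onto a dataclass. This is a sketch, not the SDK's own model: `parse_source` and `parse_timestamp` are hypothetical helpers, and `posted_at` and `relevance_score` are treated as optional only because the quickstart sample omits them.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Source:
    post_id: str
    author: str
    text: str
    url: str
    engagement: int
    sentiment_score: float
    # Assumption: optional, because the quickstart sample omits these fields.
    posted_at: Optional[datetime] = None
    relevance_score: Optional[float] = None


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2025-03-19T10:42:00Z."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_source(data: dict) -> Source:
    posted_at = data.get("posted_at")
    return Source(
        post_id=data["post_id"],
        author=data["author"],
        text=data["text"],
        url=data["url"],
        engagement=int(data["engagement"]),
        sentiment_score=float(data["sentiment_score"]),
        posted_at=parse_timestamp(posted_at) if posted_at else None,
        relevance_score=data.get("relevance_score"),
    )


source = parse_source({
    "post_id": "1899234567890",
    "author": "@sama",
    "text": "We are approaching a critical threshold...",
    "url": "https://x.com/sama/status/1899234567890",
    "engagement": 142800,
    "sentiment_score": 0.78,
})
print(source.author, source.engagement)
```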
Sentiment analysis
XHub applies a three-pass sentiment pipeline to every post in a result set.
Pipeline stages
- Token classification — a fine-tuned BERT model classifies tokens as positive, negative, or neutral based on a training corpus of 180M labeled social posts.
- Context resolution — a second model handles negation, irony, and sarcasm — common failure modes in single-pass classifiers trained on formal text.
- Aspect decomposition — when a post mentions multiple entities, XHub assigns per-entity sentiment scores rather than a single post-level score.
Accuracy benchmarks
| Metric | XHub | Baseline (VADER) |
|---|---|---|
| Accuracy (3-class) | 91.4% | 74.2% |
| F1 (negative) | 88.7% | 68.1% |
| Irony recall | 76.3% | 31.0% |
| Latency per post | 1.2 ms | 0.1 ms |
Source attribution
Every claim in an XHub report is traced to a specific post. Source attribution is not optional — it is a core invariant of the system. This section explains how sources are ranked and linked.
Ranking factors
- Relevance score: semantic similarity between post and query (0–1). Posts below 0.4 are excluded.
- Influence weight: a composite of account follower count, engagement rate, and network centrality within the result set.
- Recency bonus: posts within the last 24 hours receive a 1.3x recency multiplier by default. Configurable via the recency_weight parameter.
- Diversity penalty: to prevent one account from dominating the source list, posts beyond the 3rd from a single account receive a 0.7x penalty.
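The factors above can be sketched as a small scoring function. How XHub actually combines them is not specified here, so this sketch assumes a simple product (relevance × influence × multipliers) and assumes each account's posts are counted in descending score order when applying the diversity penalty. `Candidate` and `rank` are hypothetical names, and the influence weight is taken as a precomputed input.

```python
from collections import defaultdict
from dataclasses import dataclass

MIN_RELEVANCE = 0.4       # posts below this are excluded
RECENCY_WINDOW_H = 24     # hours that qualify for the recency bonus
DIVERSITY_FREE_POSTS = 3  # posts per account before the penalty applies
DIVERSITY_PENALTY = 0.7


@dataclass
class Candidate:
    author: str
    relevance: float  # 0..1
    influence: float  # composite weight, assumed precomputed
    age_hours: float


def rank(candidates, recency_weight=1.3):
    """Return (score, candidate) pairs, best first."""
    scored = []
    for c in candidates:
        if c.relevance < MIN_RELEVANCE:
            continue
        score = c.relevance * c.influence  # assumption: multiplicative
        if c.age_hours <= RECENCY_WINDOW_H:
            score *= recency_weight
        scored.append((score, c))

    # Visit posts best-first so each account's top posts escape the penalty.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    seen = defaultdict(int)
    final = []
    for score, c in scored:
        seen[c.author] += 1
        if seen[c.author] > DIVERSITY_FREE_POSTS:
            score *= DIVERSITY_PENALTY
        final.append((score, c))

    final.sort(key=lambda pair: pair[0], reverse=True)
    return final
```

With four equally scored posts from one account, the fourth drops to 0.7 of its score, which can let a slightly weaker post from a different account rank above it.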
Direct links
Every source object includes a url field containing the canonical link to the original post on X. XHub does not proxy or cache the destination pages — links resolve directly to x.com.
API overview
The XHub REST API is the primary integration surface. All endpoints are under the base URL:
https://api.xhub.io/v1
All requests and responses use JSON. All timestamps are ISO 8601 in UTC.
Versioning
The API version is embedded in the path (/v1/). Breaking changes will introduce a new version. Non-breaking additions (new fields, new endpoints) do not increment the version. Deprecated fields will include a deprecated_at date and a minimum 90-day migration window.
POST /query
Submit a query and receive a fully analyzed report. This is a synchronous endpoint — the response is returned when analysis is complete (typically under 400ms).
POST https://api.xhub.io/v1/query
Request body
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string or object | required | Natural-language or boolean query string (max 1000 chars), or a structured filter object. |
| timeframe | string or object | optional | Duration string (1h, 7d, 30d) or ISO date range object. Default: 7d. |
| limit | integer | optional | Max posts to analyze. Range: 1–10,000. Default: 1000. |
| sentiment | boolean | optional | Include sentiment analysis. Adds ~40ms. Default: true. |
| sources | boolean | optional | Include source list in response. Default: true. |
| sources_limit | integer | optional | Max sources returned. Range: 1–200. Default: 20. |
| language | string | optional | ISO 639-1 language code filter. Omit for all languages. |
| min_engagement | integer | optional | Filter posts below this engagement threshold. Default: 0. |
| recency_weight | float | optional | Multiplier for recent posts (0.0–3.0). Default: 1.3. |
| format | string | optional | json (default), markdown, pdf. |
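Checking a request body against these ranges before sending it avoids a round trip for avoidable 400 responses. The validator below is a client-side sketch built only from the table above; `validate_query_request` is a hypothetical helper, and the server may enforce additional rules. For brevity it does not distinguish integers from floats.

```python
FORMATS = ("json", "markdown", "pdf")


def _in_range(body, name, default, low, high):
    value = body.get(name, default)
    # bool is a subclass of int, so reject it explicitly.
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    return is_number and low <= value <= high


def validate_query_request(body: dict) -> list:
    """Return a list of problems; an empty list means the body looks valid."""
    errors = []
    query = body.get("query")
    if not query:
        errors.append("query is required")
    elif isinstance(query, str) and len(query) > 1000:
        errors.append("query exceeds 1000 characters")
    if not _in_range(body, "limit", 1000, 1, 10_000):
        errors.append("limit must be between 1 and 10000")
    if not _in_range(body, "sources_limit", 20, 1, 200):
        errors.append("sources_limit must be between 1 and 200")
    if not _in_range(body, "recency_weight", 1.3, 0.0, 3.0):
        errors.append("recency_weight must be between 0.0 and 3.0")
    if body.get("format", "json") not in FORMATS:
        errors.append("format must be one of json, markdown, pdf")
    return errors


print(validate_query_request({"query": "AI breakthroughs this week", "limit": 500}))
```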
Python example
import xhub

client = xhub.Client(api_key="xhub_live_sk_...")

report = client.query.create(
    query="AI breakthroughs this week",
    timeframe="7d",
    sentiment=True,
    sources=True,
    sources_limit=10,
)

print(report.summary)
for source in report.sources:
    print(source.author, source.url)
TypeScript example
import { XHub } from '@xhub/sdk';

const client = new XHub({ apiKey: process.env.XHUB_API_KEY });

const report = await client.query.create({
  query: 'AI breakthroughs this week',
  timeframe: '7d',
  sentiment: true,
  sources: true,
  sourcesLimit: 10,
});

console.log(report.summary);
report.sources.forEach(s => console.log(s.author, s.url));
GET /report/:id
Retrieve a previously generated report by ID. Reports are stored for 90 days (Researcher plan) or 365 days (Professional and Enterprise).
GET https://api.xhub.io/v1/report/{report_id}
curl https://api.xhub.io/v1/report/rpt_01hwx9bkzabcd1234 \
-H "Authorization: Bearer YOUR_API_KEY"
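The same request can be built without the SDK using Python's standard library. This is a sketch: `build_report_request` is a hypothetical helper, and the block only constructs the request object without sending it.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://api.xhub.io/v1"


def build_report_request(report_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a stored report."""
    if not report_id.startswith("rpt_"):
        raise ValueError("report IDs are prefixed with rpt_")
    url = f"{BASE_URL}/report/{urllib.parse.quote(report_id, safe='')}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


request = build_report_request("rpt_01hwx9bkzabcd1234", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request, timeout=30)
```

A 404 with code report_not_found can mean either an unknown ID or a report past its retention window, so callers may want to re-run the query in that case.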
GET /stream
Subscribe to a continuous query stream. XHub will re-run your query every N seconds and return new results incrementally via server-sent events (SSE).
GET https://api.xhub.io/v1/stream?query={encoded_query}&interval=30
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | required | URL-encoded query string. |
| interval | integer | optional | Poll interval in seconds. Min: 10. Max: 3600. Default: 60. |
| since | string | optional | Only return posts newer than this ISO timestamp. |
The streaming endpoint is only available on Professional and Enterprise plans. Researcher plans must poll via POST /query.
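A sketch of consuming the stream: one helper builds the URL-encoded request URL, the other parses lines in the standard server-sent events format. The event payload is assumed to be one JSON object per event, and the `post_id` field in the demo is hypothetical; the actual event schema is not described here.

```python
import json
from typing import Iterable, Iterator
from urllib.parse import urlencode

BASE_URL = "https://api.xhub.io/v1"


def stream_url(query: str, interval: int = 60) -> str:
    """Build the /stream URL with a URL-encoded query."""
    if not 10 <= interval <= 3600:
        raise ValueError("interval must be between 10 and 3600 seconds")
    return f"{BASE_URL}/stream?" + urlencode({"query": query, "interval": interval})


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON payload per SSE event."""
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data_lines:
                yield json.loads("\n".join(data_lines))
                data_lines = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            data_lines.append(value[1:] if value.startswith(" ") else value)


# Demo input; the payload shape is an assumption.
demo = [
    ": keep-alive\n",
    'data: {"post_id": "1"}\n',
    "\n",
    'data: {"post_id":\n',
    'data: "2"}\n',
    "\n",
]
events = list(iter_sse_events(demo))
print(stream_url("AI breakthroughs", interval=30))
```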
Rate limits
| Plan | Requests / day | Requests / minute | Max limit param |
|---|---|---|---|
| Researcher | 500 | 10 | 1,000 |
| Professional | 10,000 | 60 | 10,000 |
| Enterprise | Unlimited | Custom | Unlimited |
Rate limit headers are included in every response:
X-RateLimit-Limit: 10000
X-RateLimit-Remaining: 9847
X-RateLimit-Reset: 1710806400
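These headers can drive client-side throttling. The sketch below assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, which the sample value suggests but the text does not state; both helper names are hypothetical.

```python
import time


def seconds_until_reset(headers, now=None):
    """Seconds to wait until the rate limit window resets (never negative)."""
    now = time.time() if now is None else now
    # Assumption: the reset value is Unix epoch seconds.
    reset = int(headers["X-RateLimit-Reset"])
    return max(0.0, reset - now)


def remaining_fraction(headers):
    """Share of the quota still available, between 0 and 1."""
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    return remaining / limit


headers = {
    "X-RateLimit-Limit": "10000",
    "X-RateLimit-Remaining": "9847",
    "X-RateLimit-Reset": "1710806400",
}
print(remaining_fraction(headers))
```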
SDK
Official SDKs are available for Python and TypeScript/JavaScript.
Python
pip install xhub
TypeScript / JavaScript
npm install @xhub/sdk
# or
yarn add @xhub/sdk
Configuration
import xhub

# Explicit key
client = xhub.Client(api_key="xhub_live_sk_...")

# From environment variable XHUB_API_KEY
client = xhub.Client()

# Custom options
client = xhub.Client(
    api_key="xhub_live_sk_...",
    timeout=30,
    max_retries=3,
    base_url="https://api.xhub.io/v1",
)
Webhooks
Webhooks allow XHub to push reports to your endpoint automatically on a schedule or when a trigger condition is met.
Creating a webhook
curl -X POST https://api.xhub.io/v1/webhooks \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-server.com/xhub-hook",
    "query": "AI breakthroughs",
    "schedule": "0 9 * * *",
    "secret": "whsec_xxxxxxxxxxxxxx"
  }'
Verifying webhook signatures
Every webhook delivery includes an X-XHub-Signature-256 header — an HMAC-SHA256 of the raw request body signed with your webhook secret.
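import hashlib
import hmac


def verify_webhook(payload: bytes, signature: str, secret: str) -> bool:
    # Recompute the HMAC-SHA256 of the raw body with the webhook secret.
    expected = hmac.new(
        secret.encode(),
        payload,
        hashlib.sha256,
    ).hexdigest()
    # Constant-time comparison against the X-XHub-Signature-256 header value.
    return hmac.compare_digest(f"sha256={expected}", signature)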
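For local testing it helps to produce a signature yourself and run it through the same check. In this sketch `sign_payload` and `handle_delivery` are hypothetical helpers, and the `sha256=` prefix follows the verification snippet above. The signature is checked against the raw bytes before any JSON parsing, because re-serializing the body can change the byte sequence.

```python
import hashlib
import hmac
import json


def sign_payload(payload: bytes, secret: str) -> str:
    """Produce a header value in the form sha256=<hex digest>."""
    digest = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def handle_delivery(raw_body: bytes, signature: str, secret: str) -> dict:
    """Verify the signature on the raw bytes, then decode the JSON body."""
    expected = sign_payload(raw_body, secret)
    if not hmac.compare_digest(expected, signature):
        raise PermissionError("webhook signature mismatch")
    return json.loads(raw_body)


# Round trip with a made-up body and secret.
body = b'{"id": "rpt_test"}'
header = sign_payload(body, "whsec_test")
print(handle_delivery(body, header, "whsec_test"))
```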
Exports
Reports can be exported in three formats. Specify the format in the query request or convert an existing report via the export endpoint.
| Format | Content-Type | Description |
|---|---|---|
| json | application/json | Full structured report. Best for programmatic consumption. |
| markdown | text/markdown | Human-readable narrative with inline source citations. |
| pdf | application/pdf | Formatted PDF with charts and source table. Professional plan only. |
# Export an existing report as PDF (the URL is quoted so the shell does not interpret "?")
curl "https://api.xhub.io/v1/report/rpt_01hwx9bkzabcd1234/export?format=pdf" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  --output report.pdf
Error codes
XHub uses standard HTTP status codes. Error responses include a JSON body with a machine-readable code and a human-readable message.
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "You have exceeded your daily request limit of 500.",
    "retry_after": 3600
  }
}
| HTTP | Code | Description |
|---|---|---|
| 400 | invalid_query | Query string is malformed or exceeds 1000 characters. |
| 401 | unauthorized | API key missing or invalid. |
| 403 | plan_limitation | Feature not available on current plan. |
| 404 | report_not_found | Report ID does not exist or has expired. |
| 422 | unprocessable_query | Query parsed but returned no indexable results. |
| 429 | rate_limit_exceeded | Rate limit hit. Check X-RateLimit-Reset header. |
| 500 | internal_error | XHub server error. Retry with exponential backoff. |
| 503 | service_unavailable | Temporary outage. Check status.xhub.io. |
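A sketch of a retry policy based on this table. It honors `retry_after` on 429 responses and uses exponential backoff with jitter otherwise. Treating 503 as retryable, the 1-second base, and the 60-second cap are assumptions; `retry_delay` is a hypothetical helper.

```python
import random

RETRYABLE_STATUSES = {429, 500, 503}  # assumption: 503 is worth retrying


def retry_delay(status, attempt, retry_after=None, base=1.0, cap=60.0):
    """Return seconds to wait before the next attempt, or None to give up.

    attempt is 0 for the first retry, 1 for the second, and so on.
    """
    if status not in RETRYABLE_STATUSES:
        return None  # 400, 401, 403, 404, 422: retrying will not help
    if status == 429 and retry_after is not None:
        return float(retry_after)
    # Exponential backoff with full jitter, capped at `cap` seconds.
    return random.uniform(0, min(cap, base * 2 ** attempt))


print(retry_delay(429, 0, retry_after=3600))
```

Jitter spreads retries from many clients over time so they do not all hit the API again at the same moment.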
Changelog
v1.4.0 — March 2025
- Added aspect-level sentiment decomposition — per-entity sentiment scores within a single post.
- Added influencer graph field in Professional and Enterprise reports.
- Streaming endpoint now supports the since parameter for incremental delivery.
- Reduced median query latency from 480ms to 340ms.
v1.3.0 — January 2025
- Introduced trend forecasting — velocity spike detection up to 6 hours ahead of virality.
- Added TypeScript SDK (@xhub/sdk).
- PDF export now includes embedded charts.
- Historical archive extended back to January 2020.
v1.2.0 — October 2024
- Added webhook support with HMAC signature verification.
- Added recency_weight parameter.
- Structured filter object now supported as query value.
v1.0.0 — June 2024
- Initial public release. POST /query, GET /report, Python SDK.