XHub Documentation

Overview

XHub is an intelligence platform built on top of the full stream of X (formerly Twitter). It indexes posts in real time, applies a multi-stage NLP pipeline, and lets you retrieve structured analysis via a web interface or REST API.

Unlike X's own search, XHub does not filter by algorithmic relevance. It indexes the complete post corpus and lets you define your own relevance model through query parameters, filters, and weighting rules.

What XHub does

Ingests the full public post stream of X — approximately 40 million posts per day.
Indexes every post, account, hashtag, and entity into a structured database with sub-second query latency.
Analyzes each query result using sentiment scoring, entity extraction, narrative clustering, and influence weighting.
Returns a structured report with a natural-language summary, statistical breakdown, and a complete list of source posts with direct X links.

What XHub does not do

XHub does not post, reply, or take any write action on X.
XHub does not access private accounts or direct messages.
XHub does not provide real-time socket push. For near-real-time updates, use the streaming endpoint (GET /stream), which re-runs your query at a fixed interval.

Note

All data returned by XHub is sourced from public posts only. XHub complies with X's API terms of service and data use policies.

Quickstart

Get your first report in under five minutes.

1. Get an API key

Sign up at xhub.io and navigate to Settings → API Keys. Create a new key. Keep it secret — treat it like a password.

2. Make your first request

curl

# POST a query and receive a structured report
curl -X POST https://api.xhub.io/v1/query \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "AI breakthroughs this week",
    "timeframe": "7d",
    "limit": 500,
    "sentiment": true,
    "sources": true
  }'

3. Read the response

json

{
  "id": "rpt_01hwx9bkzabcd1234",
  "query": "AI breakthroughs this week",
  "generated_at": "2025-03-19T10:42:00Z",
  "summary": "Discourse around AI breakthroughs peaked on Mar 15...",
  "post_count": 847203,
  "sentiment": {
    "positive": 0.62,
    "negative": 0.21,
    "neutral": 0.17
  },
  "top_accounts": ["@elonmusk", "@sama", "@ylecun"],
  "sources": [
    {
      "post_id": "1899234567890",
      "author": "@sama",
      "text": "We are approaching a critical threshold...",
      "url": "https://x.com/sama/status/1899234567890",
      "engagement": 142800,
      "sentiment_score": 0.78
    }
  ]
}
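The same request can be made without an SDK. The sketch below uses only the Python standard library: it builds the request from step 2 and condenses a response shaped like the sample above. The helper names (`build_query_request`, `summarize_report`, `fetch_report`) are illustrative and not part of any XHub SDK.

```python
import json
import urllib.request

API_URL = "https://api.xhub.io/v1/query"  # endpoint from step 2


def build_query_request(api_key: str, query: str, timeframe: str = "7d",
                        limit: int = 500) -> urllib.request.Request:
    """Build (but do not send) the POST request from step 2."""
    body = {
        "query": query,
        "timeframe": timeframe,
        "limit": limit,
        "sentiment": True,
        "sources": True,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def summarize_report(report: dict) -> str:
    """Return a one-line digest of a report dict shaped like the sample response."""
    sentiment = report.get("sentiment", {})
    top_source = report["sources"][0]["url"] if report.get("sources") else "n/a"
    return (
        f"{report['id']}: {report['post_count']} posts, "
        f"{sentiment.get('positive', 0):.0%} positive, top source {top_source}"
    )


def fetch_report(api_key: str, query: str) -> dict:
    """Send the request and decode the JSON body (performs a network call)."""
    request = build_query_request(api_key, query)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test before spending any of your request quota.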

Authentication

All API requests must include a bearer token in the Authorization header.

header

Authorization: Bearer xhub_live_sk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Key types

Prefix	Environment	Description
xhub_live_sk_	Production	Full access to the production dataset and all endpoints.
xhub_test_sk_	Sandbox	Returns synthetic data. Safe for development — no rate limits.

Security

Never expose API keys in client-side code or public repositories. If a key is compromised, revoke it immediately under Settings → API Keys → Revoke and create a replacement.

Queries

A query is the primary way to retrieve data from XHub. Queries accept natural language, boolean expressions, or a structured filter object.

Query syntax

XHub supports three query modes:

Natural language — plain English description of the topic. XHub uses semantic matching to retrieve relevant posts beyond exact keyword hits.
Boolean — standard AND, OR, NOT operators with parentheses.
Structured filter — a JSON object with explicit field filters (account, hashtag, date, language, engagement threshold).

examples

# Natural language
"AI breakthroughs announced this week"

# Boolean
"(GPT-5 OR Claude OR Gemini) AND NOT rumor"

# Structured filter object
{
  "keywords": ["GPT-5", "Claude"],
  "from_accounts": ["@openai", "@anthropic"],
  "min_engagement": 1000,
  "language": "en",
  "timeframe": { "from": "2025-03-01", "to": "2025-03-19" }
}
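The boolean and structured forms can also be assembled in code rather than written by hand. The following is a minimal sketch: `boolean_query` and `structured_filter` are hypothetical helpers, and only the field names are taken from the examples above.

```python
import json
from datetime import date
from typing import Sequence


def boolean_query(any_of: Sequence[str], exclude: Sequence[str] = ()) -> str:
    """Join terms into a boolean expression such as '(A OR B) AND NOT C'."""
    expr = "(" + " OR ".join(any_of) + ")"
    for term in exclude:
        expr += f" AND NOT {term}"
    return expr


def structured_filter(keywords: Sequence[str], from_accounts: Sequence[str],
                      min_engagement: int, language: str,
                      start: date, end: date) -> dict:
    """Build a filter object using the field names from the example above."""
    if start > end:
        raise ValueError("timeframe 'from' must not be after 'to'")
    return {
        "keywords": list(keywords),
        "from_accounts": list(from_accounts),
        "min_engagement": min_engagement,
        "language": language,
        "timeframe": {"from": start.isoformat(), "to": end.isoformat()},
    }


expression = boolean_query(["GPT-5", "Claude", "Gemini"], exclude=["rumor"])
print(expression)  # (GPT-5 OR Claude OR Gemini) AND NOT rumor

filter_object = structured_filter(
    ["GPT-5", "Claude"], ["@openai", "@anthropic"], 1000, "en",
    date(2025, 3, 1), date(2025, 3, 19),
)
print(json.dumps(filter_object, indent=2))
```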

Reports

A report is the output of a query. Every report contains a natural-language summary, statistical breakdowns, and a source list. Reports are immutable once generated — they represent a snapshot of the corpus at the time of generation.

Report object schema

Field	Type	Description
id	string	Unique report identifier. Prefixed `rpt_`.
query	string	The original query string.
generated_at	string (ISO 8601)	Timestamp of report generation.
summary	string	Natural-language synthesis of the top findings.
post_count	integer	Total number of posts analyzed.
sentiment	object	Aggregate sentiment: `positive`, `negative`, `neutral` (0–1 floats, sum to 1).
top_accounts	string[]	Accounts with highest influence weight in the result set.
trending_hashtags	string[]	Most frequent hashtags within the matched posts.
sources	Source[]	Ordered list of source posts. See Source object below.

Source object schema

Field	Type	Description
post_id	string	X post ID.
author	string	X handle (e.g. `@sama`).
text	string	Full text of the post at time of indexing.
url	string (URL)	Direct link to the original post on X.
posted_at	string (ISO 8601)	Timestamp of original post.
engagement	integer	Combined likes + reposts + replies at time of indexing.
sentiment_score	float	Per-post sentiment: -1.0 (negative) to +1.0 (positive).
relevance_score	float	Query relevance: 0.0–1.0.
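If you are not using an SDK, the two schemas above map naturally onto typed containers. The sketch below is one possible mapping with Python dataclasses. Treating `posted_at`, `relevance_score`, and `trending_hashtags` as optional is an assumption, made because the Quickstart sample response omits them.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


def parse_ts(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 UTC timestamp such as '2025-03-19T10:42:00Z'."""
    if value is None:
        return None
    # fromisoformat only accepts a trailing 'Z' on Python 3.11+, so normalize it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class Source:
    post_id: str
    author: str
    text: str
    url: str
    engagement: int
    sentiment_score: float
    posted_at: Optional[datetime] = None     # assumed optional
    relevance_score: Optional[float] = None  # assumed optional

    @classmethod
    def from_dict(cls, d: dict) -> "Source":
        return cls(
            post_id=d["post_id"], author=d["author"], text=d["text"],
            url=d["url"], engagement=d["engagement"],
            sentiment_score=d["sentiment_score"],
            posted_at=parse_ts(d.get("posted_at")),
            relevance_score=d.get("relevance_score"),
        )


@dataclass
class Report:
    id: str
    query: str
    generated_at: datetime
    summary: str
    post_count: int
    sentiment: dict
    top_accounts: List[str] = field(default_factory=list)
    trending_hashtags: List[str] = field(default_factory=list)
    sources: List[Source] = field(default_factory=list)

    @classmethod
    def from_dict(cls, d: dict) -> "Report":
        return cls(
            id=d["id"], query=d["query"],
            generated_at=parse_ts(d["generated_at"]),
            summary=d["summary"], post_count=d["post_count"],
            sentiment=d.get("sentiment", {}),
            top_accounts=d.get("top_accounts", []),
            trending_hashtags=d.get("trending_hashtags", []),
            sources=[Source.from_dict(s) for s in d.get("sources", [])],
        )
```

Calling `Report.from_dict(payload)` on a decoded JSON response gives attribute access (`report.sources[0].url`) and timezone-aware timestamps.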

Sentiment analysis

XHub applies a three-pass sentiment pipeline to every post in a result set.

Pipeline stages

Token classification — a fine-tuned BERT model classifies tokens as positive, negative, or neutral based on a training corpus of 180M labeled social posts.
Context resolution — a second model handles negation, irony, and sarcasm — common failure modes in single-pass classifiers trained on formal text.
Aspect decomposition — when a post mentions multiple entities, XHub assigns per-entity sentiment scores rather than a single post-level score.

Accuracy benchmarks

Metric	XHub	Baseline (VADER)
Accuracy (3-class)	91.4%	74.2%
F1 (negative)	88.7%	68.1%
Irony recall	76.3%	31.0%
Latency per post	1.2 ms	0.1 ms

Source attribution

Every claim in an XHub report is traced to a specific post. Source attribution is not optional — it is a core invariant of the system. This section explains how sources are ranked and linked.

Ranking factors

Relevance score — semantic similarity between post and query (0–1). Posts below 0.4 are excluded.
Influence weight — a composite of account follower count, engagement rate, and network centrality within the result set.
Recency bonus — posts within the last 24 hours receive a 1.3x recency multiplier by default. Configurable via recency_weight param.
Diversity penalty — to prevent one account from dominating the source list, posts beyond the 3rd from a single account receive a 0.7x penalty.
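To make the interaction of these factors concrete, here is a small sketch. The documentation does not say how the factors are combined, so the multiplicative score (relevance × influence × recency bonus), the scale of `influence`, and the order in which the diversity penalty is applied are all assumptions. Only the constants 0.4, 1.3, 3, and 0.7 come from the list above.

```python
from collections import defaultdict
from dataclasses import dataclass

MIN_RELEVANCE = 0.4       # posts below this are excluded
DIVERSITY_FREE_POSTS = 3  # posts per account before the penalty applies
DIVERSITY_PENALTY = 0.7   # multiplier for the 4th and later posts


@dataclass
class Candidate:
    post_id: str
    author: str
    relevance: float   # semantic similarity, 0-1
    influence: float   # composite influence weight (scale assumed)
    age_hours: float   # hours since the post was published


def rank(candidates, recency_weight: float = 1.3):
    """Return (score, candidate) pairs, best first, under an ASSUMED formula."""
    scored = []
    for c in candidates:
        if c.relevance < MIN_RELEVANCE:
            continue  # excluded outright
        bonus = recency_weight if c.age_hours <= 24 else 1.0
        scored.append((c.relevance * c.influence * bonus, c))
    scored.sort(key=lambda pair: pair[0], reverse=True)

    # Walk the list best-first and penalize an account's 4th and later posts.
    seen = defaultdict(int)
    final = []
    for score, c in scored:
        seen[c.author] += 1
        if seen[c.author] > DIVERSITY_FREE_POSTS:
            score *= DIVERSITY_PENALTY
        final.append((score, c))
    final.sort(key=lambda pair: pair[0], reverse=True)
    return final
```

Under these assumptions, a fourth post from one account with relevance 0.6 drops to 0.42 and falls behind a post from another account with relevance 0.5.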

Direct links

Every source object includes a url field containing the canonical link to the original post on X. XHub does not proxy or cache the destination pages — links resolve directly to x.com.

API overview

The XHub REST API is the primary integration surface. All endpoints are under the base URL:

https://api.xhub.io/v1

All requests and responses use JSON. All timestamps are ISO 8601 in UTC.

Versioning

The API version is embedded in the path (/v1/). Breaking changes will introduce a new version. Non-breaking additions (new fields, new endpoints) do not increment the version. Deprecated fields will include a deprecated_at date and a minimum 90-day migration window.

POST /query

Submit a query and receive a fully analyzed report. This is a synchronous endpoint — the response is returned when analysis is complete (typically under 400ms).

endpoint

POST https://api.xhub.io/v1/query

Request body

Parameter	Type	Required	Description
query	string or object	required	Natural-language or boolean query string (max 1000 chars), or a structured filter object.
timeframe	string or object	optional	Duration string (`1h`, `7d`, `30d`) or ISO date range object. Default: `7d`.
limit	integer	optional	Max posts to analyze. Range: 1–10,000. Default: 1000.
sentiment	boolean	optional	Include sentiment analysis. Adds ~40ms. Default: `true`.
sources	boolean	optional	Include source list in response. Default: `true`.
sources_limit	integer	optional	Max sources returned. Range: 1–200. Default: 20.
language	string	optional	ISO 639-1 language code filter. Omit for all languages.
min_engagement	integer	optional	Filter posts below this engagement threshold. Default: 0.
recency_weight	float	optional	Multiplier for recent posts (0.0–3.0). Default: 1.3.
format	string	optional	`json` (default), `markdown`, `pdf`.
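Checking a request body against these documented ranges before sending it avoids spending a request on a 400 response. The sketch below is a client-side check only: `validate_query_body` is a hypothetical helper, and the server may apply additional rules, such as plan-specific caps on `limit`.

```python
ALLOWED_FORMATS = {"json", "markdown", "pdf"}

# Inclusive ranges taken from the parameter table above.
RANGES = {
    "limit": (1, 10_000),
    "sources_limit": (1, 200),
    "recency_weight": (0.0, 3.0),
}


def validate_query_body(body: dict) -> list:
    """Return a list of problems; an empty list means the body looks valid."""
    problems = []

    query = body.get("query")
    if not query:
        problems.append("query is required")
    elif isinstance(query, str) and len(query) > 1000:
        problems.append("query exceeds 1000 characters")

    for name, (low, high) in RANGES.items():
        value = body.get(name)
        if value is not None and not low <= value <= high:
            problems.append(f"{name} must be between {low} and {high}")

    if body.get("format", "json") not in ALLOWED_FORMATS:
        problems.append(f"format must be one of {sorted(ALLOWED_FORMATS)}")

    return problems


print(validate_query_body({"query": "AI breakthroughs", "limit": 500}))  # []
print(validate_query_body({"query": "AI", "sources_limit": 500}))
# ['sources_limit must be between 1 and 200']
```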

Python example

python

import xhub

client = xhub.Client(api_key="xhub_live_sk_...")

report = client.query.create(
    query="AI breakthroughs this week",
    timeframe="7d",
    sentiment=True,
    sources=True,
    sources_limit=10,
)

print(report.summary)
for source in report.sources:
    print(source.author, source.url)

TypeScript example

typescript

import { XHub } from '@xhub/sdk';

const client = new XHub({ apiKey: process.env.XHUB_API_KEY });

const report = await client.query.create({
  query: 'AI breakthroughs this week',
  timeframe: '7d',
  sentiment: true,
  sources: true,
  sourcesLimit: 10,
});

console.log(report.summary);
report.sources.forEach(s => console.log(s.author, s.url));

GET /report/:id

Retrieve a previously generated report by ID. Reports are stored for 90 days (Researcher plan) or 365 days (Professional and Enterprise).

endpoint

GET https://api.xhub.io/v1/report/{report_id}

curl

curl https://api.xhub.io/v1/report/rpt_01hwx9bkzabcd1234 \
  -H "Authorization: Bearer YOUR_API_KEY"

GET /stream

Subscribe to a continuous query stream. XHub re-runs your query at the configured interval (in seconds) and returns new results incrementally via server-sent events (SSE).

endpoint

GET https://api.xhub.io/v1/stream?query={encoded_query}&interval=30

Parameter	Type	Required	Description
query	string	required	URL-encoded query string.
interval	integer	optional	Poll interval in seconds. Min: 10. Max: 3600. Default: 60.
since	string	optional	Only return posts newer than this ISO timestamp.

Professional plan required

The streaming endpoint is only available on Professional and Enterprise plans. Researcher plans must poll via POST /query.
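Consuming the stream means building the URL and parsing the server-sent events format. The sketch below follows the general SSE framing rules (events end at a blank line, `data:` lines are joined with newlines, lines starting with `:` are comments). The payload inside each event is not documented here, so the parser leaves `data` as a raw string. Both helper names are hypothetical.

```python
from typing import Iterable, Iterator, Optional
from urllib.parse import urlencode

STREAM_URL = "https://api.xhub.io/v1/stream"


def stream_url(query: str, interval: int = 60, since: Optional[str] = None) -> str:
    """Build the GET /stream URL, enforcing the documented interval bounds."""
    if not 10 <= interval <= 3600:
        raise ValueError("interval must be between 10 and 3600 seconds")
    params = {"query": query, "interval": interval}
    if since is not None:
        params["since"] = since
    return f"{STREAM_URL}?{urlencode(params)}"


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per server-sent event from an iterable of text lines."""
    event: dict = {}
    data: list = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the event; events without data are dropped.
            if data:
                event["data"] = "\n".join(data)
                yield event
            event, data = {}, []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        else:
            name, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if name == "data":
                data.append(value)
            else:
                event[name] = value
```

`iter_sse_events` accepts any iterable of decoded lines, so it can be fed from an HTTP response read line by line or from a recorded fixture in tests.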

Rate limits

Plan	Requests / day	Requests / minute	Max limit param
Researcher	500	10	1,000
Professional	10,000	60	10,000
Enterprise	Unlimited	Custom	Unlimited

Rate limit headers are included in every response:

X-RateLimit-Limit: 10000
X-RateLimit-Remaining: 9847
X-RateLimit-Reset: 1710806400
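A client can use these headers to pause before it hits a 429. The sketch below assumes that `X-RateLimit-Reset` is a Unix timestamp in seconds. That matches the sample value, which decodes to 2024-03-19T00:00:00Z, but the header's unit is not stated here. `parse_rate_limit` is a hypothetical helper.

```python
from datetime import datetime, timezone


def parse_rate_limit(headers: dict, now: float) -> dict:
    """Summarize the rate limit headers; `now` is the current Unix time."""
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_epoch = int(headers["X-RateLimit-Reset"])  # ASSUMED: Unix seconds
    return {
        "limit": limit,
        "remaining": remaining,
        "resets_at": datetime.fromtimestamp(reset_epoch, tz=timezone.utc),
        # Only wait when the quota is exhausted.
        "wait_seconds": max(0.0, reset_epoch - now) if remaining == 0 else 0.0,
    }


sample = {
    "X-RateLimit-Limit": "10000",
    "X-RateLimit-Remaining": "9847",
    "X-RateLimit-Reset": "1710806400",
}
info = parse_rate_limit(sample, now=1710800000)
print(info["remaining"], info["resets_at"].isoformat())
```

HTTP header names are case-insensitive, so a real client should normalize the keys before looking them up in a plain dict.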

SDK

Official SDKs are available for Python and TypeScript/JavaScript.

Python

terminal

pip install xhub

TypeScript / JavaScript

terminal

npm install @xhub/sdk
# or
yarn add @xhub/sdk

Configuration

python

import xhub

# Explicit key
client = xhub.Client(api_key="xhub_live_sk_...")

# From environment variable XHUB_API_KEY
client = xhub.Client()

# Custom options
client = xhub.Client(
    api_key="xhub_live_sk_...",
    timeout=30,
    max_retries=3,
    base_url="https://api.xhub.io/v1",
)

Webhooks

Webhooks allow XHub to push reports to your endpoint automatically on a schedule or when a trigger condition is met.

Creating a webhook

curl

curl -X POST https://api.xhub.io/v1/webhooks \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-server.com/xhub-hook",
    "query": "AI breakthroughs",
    "schedule": "0 9 * * *",
    "secret": "whsec_xxxxxxxxxxxxxx"
  }'

Verifying webhook signatures

Every webhook delivery includes an X-XHub-Signature-256 header — an HMAC-SHA256 of the raw request body signed with your webhook secret.

python

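import hashlib
import hmac

def verify_webhook(payload: bytes, signature: str, secret: str) -> bool:
    """Return True if `signature` matches the HMAC-SHA256 of the raw body."""
    expected = hmac.new(
        secret.encode(),
        payload,
        hashlib.sha256
    ).hexdigest()
    # compare_digest runs in constant time, which guards against timing attacks.
    return hmac.compare_digest(
        f"sha256={expected}",
        signature
    )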
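In a receiving service, verification has to run on the raw request bytes before any JSON parsing, because re-serialized JSON may differ from the signed body byte for byte. The sketch below shows that ordering in a framework-neutral handler. The function names and the choice of a 401 response for a bad signature are assumptions, not documented XHub behavior.

```python
import hashlib
import hmac
import json

SIGNATURE_HEADER = "X-XHub-Signature-256"


def sign(body: bytes, secret: str) -> str:
    """Compute a header value in the 'sha256=<hex digest>' form."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def handle_delivery(headers: dict, body: bytes, secret: str):
    """Return (http_status, parsed_body_or_None) for one webhook delivery."""
    received = headers.get(SIGNATURE_HEADER, "")
    # Compare as bytes so unexpected non-ASCII input cannot raise TypeError.
    if not hmac.compare_digest(sign(body, secret).encode(), received.encode()):
        return 401, None  # ASSUMED status for a failed check
    return 200, json.loads(body)


# Local round trip: sign a test payload, then tamper with it.
body = b'{"id": "rpt_01hwx9bkzabcd1234"}'
headers = {SIGNATURE_HEADER: sign(body, "whsec_test")}
print(handle_delivery(headers, body, "whsec_test")[0])         # 200
print(handle_delivery(headers, body + b" ", "whsec_test")[0])  # 401
```

The `sign` helper is also handy for generating test deliveries in a development environment.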

Exports

Reports can be exported in three formats. Specify the format in the query request or convert an existing report via the export endpoint.

Format	Content-Type	Description
json	application/json	Full structured report. Best for programmatic consumption.
markdown	text/markdown	Human-readable narrative with inline source citations.
pdf	application/pdf	Formatted PDF with charts and source table. Professional plan only.

curl

# Export an existing report as PDF
curl "https://api.xhub.io/v1/report/rpt_01hwx9bkzabcd1234/export?format=pdf" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  --output report.pdf

Error codes

XHub uses standard HTTP status codes. Error responses include a JSON body with a machine-readable code and a human-readable message.

json

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "You have exceeded your daily request limit of 500.",
    "retry_after": 3600
  }
}

HTTP	Code	Description
400	invalid_query	Query string is malformed or exceeds 1000 characters.
401	unauthorized	API key missing or invalid.
403	plan_limitation	Feature not available on current plan.
404	report_not_found	Report ID does not exist or has expired.
422	unprocessable_query	Query parsed but returned no indexable results.
429	rate_limit_exceeded	Rate limit hit. Check `X-RateLimit-Reset` header.
500	internal_error	XHub server error. Retry with exponential backoff.
503	service_unavailable	Temporary outage. Check status.xhub.io.
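The table suggests a simple retry policy: retry 429, 500, and 503, and treat other 4xx codes as final. The sketch below implements exponential backoff and honors `retry_after` from the error body when it is present. Which statuses are safe to retry, the base delay, and the cap are assumptions. `send` is a caller-supplied function that returns `(status, body)`.

```python
import time

RETRYABLE_STATUSES = {429, 500, 503}  # ASSUMED retryable set


def retry_delay(attempt: int, body: dict, base: float = 0.5, cap: float = 60.0) -> float:
    """Seconds to wait before the next try; `attempt` is 0-based."""
    retry_after = body.get("error", {}).get("retry_after")
    if retry_after is not None:
        return float(retry_after)         # the server's hint takes priority
    return min(cap, base * 2 ** attempt)  # 0.5, 1, 2, 4, ... capped


def call_with_retries(send, max_retries: int = 3, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or retries run out."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if status not in RETRYABLE_STATUSES or attempt == max_retries:
            return status, body
        sleep(retry_delay(attempt, body))


# Simulated sequence: outage, rate limit with a hint, then success.
responses = iter([
    (503, {}),
    (429, {"error": {"code": "rate_limit_exceeded", "retry_after": 5}}),
    (200, {"id": "rpt_01hwx9bkzabcd1234"}),
])
waits = []
print(call_with_retries(lambda: next(responses), sleep=waits.append), waits)
```

Production clients commonly add random jitter to the computed delay so that many clients do not retry in lockstep. Note also that a `retry_after` of 3600 on a daily-limit error may be better surfaced to the caller than slept through.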

Changelog

v1.4.0 — March 2025

Added aspect-level sentiment decomposition — per-entity sentiment scores within a single post.
Added influencer graph field in Professional and Enterprise reports.
The streaming endpoint now supports the since parameter for incremental delivery.
Reduced median query latency from 480ms to 340ms.

v1.3.0 — January 2025

Introduced trend forecasting — velocity spike detection up to 6 hours ahead of virality.
Added TypeScript SDK (@xhub/sdk).
PDF export now includes embedded charts.
Historical archive extended back to January 2020.

v1.2.0 — October 2024

Added webhook support with HMAC signature verification.
Added recency_weight parameter.
Structured filter object now supported as query value.

v1.0.0 — June 2024

Initial public release. POST /query, GET /report, Python SDK.
