XHub Platform Docs

Everything you need to integrate XHub into your workflow — from a five-minute quickstart to full API reference and advanced configuration.

Overview

XHub is an intelligence platform built on top of the full stream of X (formerly Twitter). It indexes posts in real time, applies a multi-stage NLP pipeline, and lets you retrieve structured analysis via a web interface or REST API.

Unlike X's own search, XHub does not filter by algorithmic relevance. It indexes the complete post corpus and lets you define your own relevance model through query parameters, filters, and weighting rules.

What XHub does

  • Ingests the full public post stream of X — approximately 40 million posts per day.
  • Indexes every post, account, hashtag, and entity into a structured database with sub-second query latency.
  • Analyzes each query result using sentiment scoring, entity extraction, narrative clustering, and influence weighting.
  • Returns a structured report with a natural-language summary, statistical breakdown, and a complete list of source posts with direct X links.

What XHub does not do

  • XHub does not post, reply, or take any write action on X.
  • XHub does not access private accounts or direct messages.
  • XHub does not push posts over a socket in real time. For near-real-time updates, use the streaming endpoint, which re-runs your query on a fixed interval.

Note

All data returned by XHub is sourced from public posts only. XHub complies with X's API terms of service and data use policies.

Quickstart

Get your first report in under five minutes.

1. Get an API key

Sign up at xhub.io and navigate to Settings → API Keys. Create a new key. Keep it secret — treat it like a password.

2. Make your first request

curl
# POST a query and receive a structured report
curl -X POST https://api.xhub.io/v1/query \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "AI breakthroughs this week",
    "timeframe": "7d",
    "limit": 500,
    "sentiment": true,
    "sources": true
  }'

3. Read the response

json
{
  "id": "rpt_01hwx9bkzabcd1234",
  "query": "AI breakthroughs this week",
  "generated_at": "2025-03-19T10:42:00Z",
  "summary": "Discourse around AI breakthroughs peaked on Mar 15...",
  "post_count": 500,
  "sentiment": {
    "positive": 0.62,
    "negative": 0.21,
    "neutral": 0.17
  },
  "top_accounts": ["@elonmusk", "@sama", "@ylecun"],
  "sources": [
    {
      "post_id": "1899234567890",
      "author": "@sama",
      "text": "We are approaching a critical threshold...",
      "url": "https://x.com/sama/status/1899234567890",
      "engagement": 142800,
      "sentiment_score": 0.78
    }
  ]
}

Authentication

All API requests must include a bearer token in the Authorization header.

header
Authorization: Bearer xhub_live_sk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Key types

| Prefix | Environment | Description |
| --- | --- | --- |
| xhub_live_sk_ | Production | Full access to the production dataset and all endpoints. |
| xhub_test_sk_ | Sandbox | Returns synthetic data. Safe for development; no rate limits apply. |

Security

Never expose API keys in client-side code or public repositories. If a key is compromised, revoke it immediately under Settings → API Keys → Revoke, then create a replacement.
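
The key prefixes above make it possible to check which environment a key targets before sending a request. The following is a minimal sketch, not part of the official SDK: the helper names key_environment and auth_header are hypothetical, and only the two prefixes and the XHUB_API_KEY environment variable come from this documentation.

```python
import os
from typing import Dict, Optional

# Prefixes taken from the key types table; the labels are illustrative.
KEY_PREFIXES = {
    "xhub_live_sk_": "production",
    "xhub_test_sk_": "sandbox",
}


def key_environment(api_key: str) -> str:
    """Return the environment a key belongs to, based on its prefix."""
    for prefix, environment in KEY_PREFIXES.items():
        if api_key.startswith(prefix):
            return environment
    raise ValueError("unrecognized XHub API key prefix")


def auth_header(api_key: Optional[str] = None) -> Dict[str, str]:
    """Build the Authorization header, falling back to XHUB_API_KEY."""
    key = api_key or os.environ.get("XHUB_API_KEY", "")
    key_environment(key)  # raises early on a missing or malformed key
    return {"Authorization": f"Bearer {key}"}
```

Reading the key from the environment keeps it out of source control, in line with the security note above.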

Queries

A query is the primary way to retrieve data from XHub. Queries accept natural language, boolean expressions, or a structured filter object.

Query syntax

XHub supports three query modes:

  • Natural language: a plain-English description of the topic. XHub uses semantic matching to retrieve relevant posts beyond exact keyword hits.
  • Boolean: standard AND, OR, NOT operators with parentheses.
  • Structured filter: a JSON object with explicit field filters (account, hashtag, date, language, engagement threshold).

examples
# Natural language
"AI breakthroughs announced this week"

# Boolean
"(GPT-5 OR Claude OR Gemini) AND NOT rumor"

# Structured filter object
{
  "keywords": ["GPT-5", "Claude"],
  "from_accounts": ["@openai", "@anthropic"],
  "min_engagement": 1000,
  "language": "en",
  "timeframe": { "from": "2025-03-01", "to": "2025-03-19" }
}
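
As a rough illustration, the three query modes could be turned into POST /query request bodies as follows. This sketch assumes that a structured filter object is passed as the value of the query field (as the v1.2.0 changelog entry suggests) and that the 1000-character limit applies only to string queries. The helper build_body is hypothetical.

```python
import json

MAX_QUERY_CHARS = 1000  # limit documented for the query parameter


def build_body(query, **options):
    """Return a JSON request body for POST /query.

    `query` is a string (natural language or boolean) or a dict
    (structured filter object).
    """
    if isinstance(query, str):
        if not query.strip():
            raise ValueError("query must not be empty")
        if len(query) > MAX_QUERY_CHARS:
            raise ValueError("query exceeds 1000 characters")
    elif not isinstance(query, dict):
        raise TypeError("query must be a str or a dict")
    return json.dumps({"query": query, **options})


natural = build_body("AI breakthroughs announced this week", timeframe="7d")
boolean = build_body("(GPT-5 OR Claude OR Gemini) AND NOT rumor")
structured = build_body({
    "keywords": ["GPT-5", "Claude"],
    "from_accounts": ["@openai", "@anthropic"],
    "min_engagement": 1000,
    "language": "en",
    "timeframe": {"from": "2025-03-01", "to": "2025-03-19"},
})
```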

Reports

A report is the output of a query. Every report contains a natural-language summary, statistical breakdowns, and a source list. Reports are immutable once generated — they represent a snapshot of the corpus at the time of generation.

Report object schema

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique report identifier. Prefixed rpt_. |
| query | string | The original query string. |
| generated_at | string (ISO8601) | Timestamp of report generation. |
| summary | string | Natural-language synthesis of the top findings. |
| post_count | integer | Total number of posts analyzed. |
| sentiment | object | Aggregate sentiment: positive, negative, neutral (floats from 0 to 1 that sum to 1). |
| top_accounts | string[] | Accounts with highest influence weight in the result set. |
| trending_hashtags | string[] | Most frequent hashtags within the matched posts. |
| sources | Source[] | Ordered list of source posts. See Source object below. |

Source object schema

| Field | Type | Description |
| --- | --- | --- |
| post_id | string | X post ID. |
| author | string | X handle (e.g. @sama). |
| text | string | Full text of the post at time of indexing. |
| url | string (URL) | Direct link to the original post on X. |
| posted_at | string (ISO8601) | Timestamp of original post. |
| engagement | integer | Combined likes + reposts + replies at time of indexing. |
| sentiment_score | float | Per-post sentiment: -1.0 (negative) to +1.0 (positive). |
| relevance_score | float | Query relevance: 0.0–1.0. |
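
For clients that call the REST API directly, the two schemas above can be mirrored as typed objects. The following is a sketch only: the class layout, the parse_report helper, the choice to treat fields missing from a response as optional, and the rounding tolerance on the sentiment sum are all assumptions, not behavior of the official SDK.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List, Optional


@dataclass
class Source:
    post_id: str
    author: str
    text: str
    url: str
    posted_at: Optional[str] = None
    engagement: int = 0
    sentiment_score: Optional[float] = None
    relevance_score: Optional[float] = None


@dataclass
class Report:
    id: str
    query: str
    generated_at: str
    summary: str
    post_count: int
    sentiment: Optional[Dict[str, float]] = None
    top_accounts: List[str] = field(default_factory=list)
    trending_hashtags: List[str] = field(default_factory=list)
    sources: List[Source] = field(default_factory=list)


def _known(cls, data: dict) -> dict:
    """Keep only keys that match dataclass fields, so new API fields are ignored."""
    names = {f.name for f in fields(cls)}
    return {key: value for key, value in data.items() if key in names}


def parse_report(data: dict) -> Report:
    """Convert a decoded JSON report into typed objects."""
    sentiment = data.get("sentiment")
    # The schema says the aggregate values sum to 1; allow for rounding (assumed tolerance).
    if sentiment and abs(sum(sentiment.values()) - 1.0) > 0.02:
        raise ValueError("aggregate sentiment should sum to 1")
    sources = [Source(**_known(Source, item)) for item in data.get("sources", [])]
    return Report(**{**_known(Report, data), "sources": sources})
```

Ignoring unknown keys keeps the parser working when non-breaking fields are added to the API.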

Sentiment analysis

XHub applies a three-pass sentiment pipeline to every post in a result set.

Pipeline stages

  1. Token classification — a fine-tuned BERT model classifies tokens as positive, negative, or neutral based on a training corpus of 180M labeled social posts.
  2. Context resolution — a second model handles negation, irony, and sarcasm — common failure modes in single-pass classifiers trained on formal text.
  3. Aspect decomposition — when a post mentions multiple entities, XHub assigns per-entity sentiment scores rather than a single post-level score.

Accuracy benchmarks

| Metric | XHub | Baseline (VADER) |
| --- | --- | --- |
| Accuracy (3-class) | 91.4% | 74.2% |
| F1 (negative) | 88.7% | 68.1% |
| Irony recall | 76.3% | 31.0% |
| Latency per post | 1.2 ms | 0.1 ms |

Source attribution

Every claim in an XHub report is traced to a specific post. Source attribution is not optional — it is a core invariant of the system. This section explains how sources are ranked and linked.

Ranking factors

  • Relevance score — semantic similarity between post and query (0–1). Posts below 0.4 are excluded.
  • Influence weight — a composite of account follower count, engagement rate, and network centrality within the result set.
  • Recency bonus — posts within the last 24 hours receive a 1.3x recency multiplier by default. Configurable via recency_weight param.
  • Diversity penalty — to prevent one account from dominating the source list, posts beyond the 3rd from a single account receive a 0.7x penalty.
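
The ranking factors above can be sketched as a small scoring function. This documentation does not state how the factors are combined, so the sketch assumes a simple product of relevance and influence, with the recency multiplier and diversity penalty applied on top. The input field names (relevance, influence, age_hours) and the function rank_sources are hypothetical; only the 0.4 threshold, the 24-hour window, the 1.3x default, and the 0.7x penalty beyond the third post come from the list above.

```python
from collections import defaultdict

MIN_RELEVANCE = 0.4        # posts below this relevance are excluded
RECENCY_WINDOW_HOURS = 24  # posts this recent receive the multiplier
DIVERSITY_PENALTY = 0.7    # applied beyond the 3rd post per account
FULL_WEIGHT_POSTS = 3


def rank_sources(posts, recency_weight=1.3):
    """Rank posts given as dicts with author, relevance, influence, age_hours."""
    scored = []
    for post in posts:
        if post["relevance"] < MIN_RELEVANCE:
            continue
        # Assumption: base score is the product of relevance and influence.
        score = post["relevance"] * post["influence"]
        if post["age_hours"] <= RECENCY_WINDOW_HOURS:
            score *= recency_weight
        scored.append((score, post))

    # Walk posts in descending score order so each account's three
    # strongest posts keep their full score.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    seen = defaultdict(int)
    final = []
    for score, post in scored:
        seen[post["author"]] += 1
        if seen[post["author"]] > FULL_WEIGHT_POSTS:
            score *= DIVERSITY_PENALTY
        final.append((score, post))

    final.sort(key=lambda pair: pair[0], reverse=True)
    return [(round(score, 4), post["author"]) for score, post in final]
```

With four older posts from one account and one from another, the fourth post from the first account drops below the second account's post once the penalty is applied.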

Direct links

Every source object includes a url field containing the canonical link to the original post on X. XHub does not proxy or cache the destination pages — links resolve directly to x.com.

API overview

The XHub REST API is the primary integration surface. All endpoints are under the base URL:

https://api.xhub.io/v1

All requests and responses use JSON. All timestamps are ISO 8601 in UTC.

Versioning

The API version is embedded in the path (/v1/). Breaking changes will introduce a new version. Non-breaking additions (new fields, new endpoints) do not increment the version. Deprecated fields will include a deprecated_at date and a minimum 90-day migration window.

POST /query

Submit a query and receive a fully analyzed report. This is a synchronous endpoint — the response is returned when analysis is complete (typically under 400ms).

endpoint
POST https://api.xhub.io/v1/query

Request body

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| query | string or object | required | Natural language or boolean query string (max 1000 chars), or a structured filter object. |
| timeframe | string or object | optional | Duration string (1h, 7d, 30d) or ISO date range object. Default: 7d. |
| limit | integer | optional | Max posts to analyze. Range: 1–10,000. Default: 1000. |
| sentiment | boolean | optional | Include sentiment analysis. Adds ~40ms. Default: true. |
| sources | boolean | optional | Include source list in response. Default: true. |
| sources_limit | integer | optional | Max sources returned. Range: 1–200. Default: 20. |
| language | string | optional | ISO 639-1 language code filter. Omit for all languages. |
| min_engagement | integer | optional | Exclude posts below this engagement threshold. Default: 0. |
| recency_weight | float | optional | Multiplier for recent posts (0.0–3.0). Default: 1.3. |
| format | string | optional | json (default), markdown, pdf. |
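
Checking parameters client-side can catch out-of-range values before a request counts against your rate limit. The sketch below copies the defaults and ranges from the request body table. The helper prepare_params is hypothetical, and the server remains the authority on what is valid.

```python
# Defaults and ranges copied from the request body table.
DEFAULTS = {
    "timeframe": "7d",
    "limit": 1000,
    "sentiment": True,
    "sources": True,
    "sources_limit": 20,
    "min_engagement": 0,
    "recency_weight": 1.3,
    "format": "json",
}
RANGES = {
    "limit": (1, 10_000),
    "sources_limit": (1, 200),
    "recency_weight": (0.0, 3.0),
}
FORMATS = {"json", "markdown", "pdf"}


def prepare_params(query, **overrides):
    """Merge overrides into the documented defaults and check ranges."""
    # language has no default, so it is allowed but not pre-filled.
    unknown = set(overrides) - set(DEFAULTS) - {"language"}
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    params = {"query": query, **DEFAULTS, **overrides}
    for name, (low, high) in RANGES.items():
        if not low <= params[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    if params["format"] not in FORMATS:
        raise ValueError("format must be json, markdown, or pdf")
    return params
```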

Python example

python
import xhub

client = xhub.Client(api_key="xhub_live_sk_...")

report = client.query.create(
    query="AI breakthroughs this week",
    timeframe="7d",
    sentiment=True,
    sources=True,
    sources_limit=10,
)

print(report.summary)
for source in report.sources:
    print(source.author, source.url)

TypeScript example

typescript
import { XHub } from '@xhub/sdk';

const client = new XHub({ apiKey: process.env.XHUB_API_KEY });

const report = await client.query.create({
  query: 'AI breakthroughs this week',
  timeframe: '7d',
  sentiment: true,
  sources: true,
  sourcesLimit: 10,
});

console.log(report.summary);
report.sources.forEach(s => console.log(s.author, s.url));

GET /report/:id

Retrieve a previously generated report by ID. Reports are stored for 90 days (Researcher plan) or 365 days (Professional and Enterprise).

endpoint
GET https://api.xhub.io/v1/report/{report_id}

curl
curl https://api.xhub.io/v1/report/rpt_01hwx9bkzabcd1234 \
  -H "Authorization: Bearer YOUR_API_KEY"

GET /stream

Subscribe to a continuous query stream. XHub re-runs your query at the configured interval and returns new results incrementally via server-sent events (SSE).

endpoint
GET https://api.xhub.io/v1/stream?query={encoded_query}&interval=30

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | required | URL-encoded query string. |
| interval | integer | optional | Poll interval in seconds. Min: 10. Max: 3600. Default: 60. |
| since | string | optional | Only return posts newer than this ISO timestamp. |

Professional plan required

The streaming endpoint is only available on Professional and Enterprise plans. Researcher plans must poll via POST /query.
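
Since the stream is delivered as server-sent events, a client needs to split the response into events and decode each one. The following is a minimal parsing sketch based on the standard SSE wire format. It assumes that each event's data field carries a JSON payload, which this documentation does not specify, and the function iter_sse_events is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON payloads from an iterable of SSE lines."""
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield json.loads("\n".join(data_lines))
                data_lines = []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            data_lines.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event, id, retry) are ignored in this sketch.
```

The function accepts any iterable of text lines, so it can be fed from a file-like HTTP response body or from a list of strings in a test.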

Rate limits

| Plan | Requests / day | Requests / minute | Max limit param |
| --- | --- | --- | --- |
| Researcher | 500 | 10 | 1,000 |
| Professional | 10,000 | 60 | 10,000 |
| Enterprise | Unlimited | Custom | Unlimited |

Rate limit headers are included in every response:

X-RateLimit-Limit: 10000
X-RateLimit-Remaining: 9847
X-RateLimit-Reset: 1710806400
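
A client can use these headers to pause before its quota runs out. The sketch below assumes that X-RateLimit-Reset is a Unix timestamp in seconds, which the example value suggests but this documentation does not state. The helper seconds_until_reset is hypothetical.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait if no requests remain, else 0."""
    remaining = int(headers["X-RateLimit-Remaining"])
    if remaining > 0:
        return 0.0
    # Assumption: X-RateLimit-Reset is a Unix timestamp in seconds.
    reset_at = int(headers["X-RateLimit-Reset"])
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)
```

Passing now explicitly makes the function easy to test. In production code, omit it so the current time is used.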

SDK

Official SDKs are available for Python and TypeScript/JavaScript.

Python

terminal
pip install xhub

TypeScript / JavaScript

terminal
npm install @xhub/sdk
# or
yarn add @xhub/sdk

Configuration

python
import xhub

# Explicit key
client = xhub.Client(api_key="xhub_live_sk_...")

# From environment variable XHUB_API_KEY
client = xhub.Client()

# Custom options
client = xhub.Client(
    api_key="xhub_live_sk_...",
    timeout=30,
    max_retries=3,
    base_url="https://api.xhub.io/v1",
)

Webhooks

Webhooks allow XHub to push reports to your endpoint automatically on a schedule or when a trigger condition is met.

Creating a webhook

curl
curl -X POST https://api.xhub.io/v1/webhooks \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-server.com/xhub-hook",
    "query": "AI breakthroughs",
    "schedule": "0 9 * * *",
    "secret": "whsec_xxxxxxxxxxxxxx"
  }'

Verifying webhook signatures

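Every webhook delivery includes an X-XHub-Signature-256 header. Its value is sha256= followed by the hex-encoded HMAC-SHA256 of the raw request body, keyed with your webhook secret. Verify it against the unmodified request bytes, before any JSON parsing.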

python
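import hashlib
import hmac


def verify_webhook(payload: bytes, signature: str, secret: str) -> bool:
    """Return True if the signature header matches the raw request body."""
    # Recompute the HMAC-SHA256 of the exact bytes received, keyed with the secret.
    expected = hmac.new(
        secret.encode(),
        payload,
        hashlib.sha256
    ).hexdigest()
    # compare_digest runs in constant time, which avoids timing side channels.
    return hmac.compare_digest(
        f"sha256={expected}",
        signature
    )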
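
To exercise a webhook handler locally without waiting for a real delivery, you can generate a signature yourself. This is a sketch under the same sha256=<hex digest> header format used above. The helper names sign_payload and is_valid are hypothetical, and the secret is a placeholder.

```python
import hashlib
import hmac


def sign_payload(payload: bytes, secret: str) -> str:
    """Produce a header value in the sha256=<hex digest> form."""
    digest = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def is_valid(payload: bytes, signature: str, secret: str) -> bool:
    # Same check as verify_webhook, repeated so this block runs on its own.
    return hmac.compare_digest(sign_payload(payload, secret), signature)


body = b'{"id": "rpt_01hwx9bkzabcd1234"}'
secret = "whsec_test"  # placeholder secret for local testing
header = sign_payload(body, secret)

print(is_valid(body, header, secret))         # True
print(is_valid(body + b" ", header, secret))  # False: any change to the bytes breaks the match
```

The second call shows why the raw body matters: re-serializing the JSON before verification can change whitespace and invalidate the signature.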

Exports

Reports can be exported in three formats. Specify the format in the query request or convert an existing report via the export endpoint.

| Format | Content-Type | Description |
| --- | --- | --- |
| json | application/json | Full structured report. Best for programmatic consumption. |
| markdown | text/markdown | Human-readable narrative with inline source citations. |
| pdf | application/pdf | Formatted PDF with charts and source table. Professional plan only. |

curl
# Export an existing report as PDF (URL quoted so the shell does not interpret "?")
curl "https://api.xhub.io/v1/report/rpt_01hwx9bkzabcd1234/export?format=pdf" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  --output report.pdf

Error codes

XHub uses standard HTTP status codes. Error responses include a JSON body with a machine-readable code and a human-readable message.

json
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "You have exceeded your daily request limit of 500.",
    "retry_after": 3600
  }
}

| HTTP | Code | Description |
| --- | --- | --- |
| 400 | invalid_query | Query string is malformed or exceeds 1000 characters. |
| 401 | unauthorized | API key missing or invalid. |
| 403 | plan_limitation | Feature not available on current plan. |
| 404 | report_not_found | Report ID does not exist or has expired. |
| 422 | unprocessable_query | Query parsed but returned no indexable results. |
| 429 | rate_limit_exceeded | Rate limit hit. Check X-RateLimit-Reset header. |
| 500 | internal_error | XHub server error. Retry with exponential backoff. |
| 503 | service_unavailable | Temporary outage. Check status.xhub.io. |
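
The retry guidance in the table can be sketched as a small wrapper. This is an illustration under stated assumptions: send is a hypothetical callable that performs one request and returns (status, body), retry_after is interpreted as seconds, and treating 503 as retryable alongside 429 and 500 is a choice made here, not something this documentation prescribes.

```python
import time

RETRYABLE = {429, 500, 503}


def call_with_retries(send, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds or retries are exhausted.

    `send` is a hypothetical callable returning (status, body), where
    body is the decoded JSON response.
    """
    for attempt in range(max_retries + 1):
        status, body = send()
        if status not in RETRYABLE or attempt == max_retries:
            return status, body
        retry_after = body.get("error", {}).get("retry_after")
        if status == 429 and retry_after is not None:
            delay = retry_after  # server-provided wait (assumed to be seconds)
        else:
            delay = base_delay * 2 ** attempt  # exponential backoff: 1s, 2s, 4s, ...
        sleep(delay)
```

Client errors such as 400, 401, 403, 404, and 422 are returned immediately, since repeating the same request would not change the outcome.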

Changelog

v1.4.0 — March 2025

  • Added aspect-level sentiment decomposition — per-entity sentiment scores within a single post.
  • Added influencer graph field in Professional and Enterprise reports.
  • Streaming endpoint now supports since parameter for incremental delivery.
  • Reduced median query latency from 480ms to 340ms.

v1.3.0 — January 2025

  • Introduced trend forecasting — velocity spike detection up to 6 hours ahead of virality.
  • Added TypeScript SDK (@xhub/sdk).
  • PDF export now includes embedded charts.
  • Historical archive extended back to January 2020.

v1.2.0 — October 2024

  • Added webhook support with HMAC signature verification.
  • Added recency_weight parameter.
  • Structured filter object now supported as query value.

v1.0.0 — June 2024

  • Initial public release. POST /query, GET /report, Python SDK.