Topic Sourcing
Topic sourcing is the first pipeline stage. It discovers content ideas from external sources and uses AI to select the best ones.
What It Does
The system pulls raw topics from RSS feeds, Reddit, and external APIs like TMDB. It deduplicates them against recent content to avoid repeats, then uses AI to select the most promising topics for short-form video. Selected topics become draft content records, ready for script generation.
How It Works
- The system queries all active sources for a channel from the
sourcestable - For each source, it fetches raw topics (headlines, titles, descriptions)
- Topics are deduplicated against the last 200 content items (by MD5 hash) to avoid repeats
- The AI (GPT-4o-mini) filters the remaining topics, selecting the best N (where N = the channel’s
posting_cadence) - Selected topics become
draftcontent records in the database
Source Types
RSS Feeds
Pull headlines from any RSS/Atom feed. Good for news sites, blogs, and industry publications.
| Config Field | Type | Example |
|---|---|---|
url | string | "https://www.ign.com/articles.rss" |
name | string | "IGN" |
Pull hot or top posts from any subreddit. No API key required — uses Reddit’s public JSON API.
| Config Field | Type | Example |
|---|---|---|
subreddit | string | "gaming" |
sort | string | "hot", "top", or "new" |
limit | number | 25 |
time | string | "week", "month", "all" (for sort: "top" only) |
Posts with score below 10 and stickied posts are automatically filtered out.
External APIs
TMDB (Movies and TV trends):
| Config Field | Type | Example |
|---|---|---|
provider | string | "tmdb" |
endpoint | string | "trending/all/week" |
name | string | "TMDB Trending" |
Requires the TMDB_API_KEY environment variable.
AI Topic Filtering
After fetching raw topics, GPT-4o-mini selects the best ones based on:
- Would it generate curiosity in a scrolling viewer?
- Is there enough substance for a 20-30 second script?
- Is it surprising, nostalgic, funny, fascinating, or opinion-provoking?
- Is it evergreen? (Content should still work in 3 months)
- Avoids: breaking tragedy, highly political, legally risky, or too niche
The filtering prompt can be customized per-channel via prompt templates (purpose: topic_filter).
Where to Find It
- Dashboard: Channel Settings, Sources tab — add, edit, and test sources
- Trigger: Pipeline page, “Fetch Topics” button
- API:
POST /pipeline/fetch-topics(all channels) orPOST /pipeline/fetch-source/{source_id}(single source)
Configuration
Sources are managed per-channel in the sources table. Each source has:
| Field | Type | Description |
|---|---|---|
channel_id | uuid | Which channel this source belongs to |
type | string | "rss", "reddit", or "api" |
config | json | Type-specific configuration (see tables above) |
active | boolean | Whether this source is included in topic fetching |
name | string | Display name for the source |
Dependencies
OPENAI_API_KEY— Required for AI topic filteringTMDB_API_KEY— Only if using TMDB sourcesREDDIT_CLIENT_ID/REDDIT_CLIENT_SECRET— Optional; Reddit’s public API works without authentication