Content Pipeline
The content pipeline is the core of the system. Every piece of content follows a linear state machine from idea to posted video.
What It Does
The pipeline takes a raw topic (a headline, a trending movie, a Reddit post) and transforms it step by step into a fully produced short-form video ready for YouTube. Each step is a discrete stage that can be triggered independently, retried on failure, or run as part of the full automated sequence.
How It Works
The State Machine
draft --> scripted --> voiced --> assembled --> review --> approved --> posted
                                                  |
                                               rejected
Each status corresponds to one pipeline stage. Content progresses forward one stage at a time. If something goes wrong, items can be retried at any individual stage.
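The diagram can be read as a transition table. The following is a minimal sketch of that reading, not the system's actual implementation: the `Status` enum, the `TRANSITIONS` map, and `can_advance` are hypothetical names introduced for illustration, and the `rejected` branch is attached to `review` because that is where the Reject action is described.

```python
from enum import Enum


class Status(str, Enum):
    DRAFT = "draft"
    SCRIPTED = "scripted"
    VOICED = "voiced"
    ASSEMBLED = "assembled"
    REVIEW = "review"
    APPROVED = "approved"
    REJECTED = "rejected"
    POSTED = "posted"


# Forward transitions read off the state machine diagram.
# rejected and posted are treated as terminal (an assumption).
TRANSITIONS = {
    Status.DRAFT: {Status.SCRIPTED},
    Status.SCRIPTED: {Status.VOICED},
    Status.VOICED: {Status.ASSEMBLED},
    Status.ASSEMBLED: {Status.REVIEW},
    Status.REVIEW: {Status.APPROVED, Status.REJECTED},
    Status.APPROVED: {Status.POSTED},
    Status.REJECTED: set(),
    Status.POSTED: set(),
}


def can_advance(current: Status, target: Status) -> bool:
    """Return True if moving from current to target is a single forward step."""
    return target in TRANSITIONS[current]
```

Under this sketch, `can_advance(Status.DRAFT, Status.VOICED)` is false because content moves one stage at a time.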
Stage Breakdown
draft — A topic has been fetched from a source (RSS, Reddit, API) and filtered by the AI. The content record exists with a source_headline, source_url, and source_type. No script, no media yet.
scripted — The AI (GPT-4o-mini) has generated a hook (the opening line) and a full voiceover script. The hook, script, and duration_seconds fields are populated. Duration is calibrated to the target word count based on speaking pace.
voiced — OpenAI TTS has generated an MP3 audio file, and Whisper has produced word-level subtitle timing. The audio_path points to the file in Supabase Storage. subtitle_data contains the word-by-word timestamps.
assembled — FFmpeg has composed the final video: background + subtitles + audio + optional music + branded thumbnail. The video_path and thumbnail_path are populated. Status moves to review for human approval.
review — Waiting for you to watch the video and decide. From here you can:
- Approve: moves to approved (ready for posting)
- Reject: moves to rejected (archived)
- Retry a specific stage: resets to that stage and re-runs it
approved — Ready to post. Can be posted immediately or scheduled for a specific time via scheduled_for.
posted — Successfully uploaded to YouTube. The posted_platforms field contains the YouTube video ID and timestamp.
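The stage breakdown implies which fields should be set by the time a record reaches a given status. The sketch below illustrates that as a sanity check. The field names come from the descriptions above, but the `ContentRecord` class, the field types, and the `missing_fields` helper are assumptions made for illustration, not the system's schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ContentRecord:
    # Field names follow the stage breakdown; the types are assumptions.
    status: str
    source_headline: str
    source_url: str
    source_type: str
    hook: Optional[str] = None
    script: Optional[str] = None
    duration_seconds: Optional[float] = None
    audio_path: Optional[str] = None
    subtitle_data: Optional[list] = None
    video_path: Optional[str] = None
    thumbnail_path: Optional[str] = None


# Fields each producing stage is described as populating.
STAGE_FIELDS = {
    "draft": ["source_headline", "source_url", "source_type"],
    "scripted": ["hook", "script", "duration_seconds"],
    "voiced": ["audio_path", "subtitle_data"],
    "assembled": ["video_path", "thumbnail_path"],
}
STAGE_ORDER = ["draft", "scripted", "voiced", "assembled"]
# Statuses that are only reachable after assembly has completed.
AFTER_ASSEMBLY = {"review", "approved", "rejected", "posted"}


def missing_fields(record: ContentRecord) -> List[str]:
    """List fields that should already be populated for the record's status."""
    if record.status in AFTER_ASSEMBLY:
        reached = len(STAGE_ORDER) - 1
    elif record.status in STAGE_ORDER:
        reached = STAGE_ORDER.index(record.status)
    else:
        raise ValueError(f"unhandled status: {record.status}")
    required = [f for stage in STAGE_ORDER[: reached + 1] for f in STAGE_FIELDS[stage]]
    return [f for f in required if getattr(record, f) in (None, "")]
```

A check like this could run before a status change is saved, so that a record never claims to be `voiced` without an `audio_path`.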
Full Pipeline Run
When the full pipeline is triggered (either manually or by the daily cron), it runs the automated stages in sequence, leaving assembled videos in review for human approval:
- Fetch topics for all active channels
- Generate scripts for up to 20 draft items
- Generate voice for up to 20 scripted items
- Assemble video for up to 1 voiced item (memory-limited)
The assembly limit of 1 per run is a safeguard against running out of memory on the 512 MB Render starter plan. Voiced items that are not assembled in one run are picked up on the next.
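The per-stage limits can be pictured as a simple batched loop. This is a rough sketch under stated assumptions rather than the real runner: `run_full`, the `STAGES` table, the stage names, and the `process` callback are hypothetical, items are plain dicts, and topic fetching (the first step) is left out.

```python
from typing import Callable, Dict, List

# (stage name, status it consumes, status it produces, per-run limit).
# The limits are the ones listed above; assembly is capped at 1 for memory.
STAGES = [
    ("script", "draft", "scripted", 20),
    ("voice", "scripted", "voiced", 20),
    ("assemble", "voiced", "review", 1),
]


def run_full(items: List[dict], process: Callable[[dict, str], None]) -> Dict[str, int]:
    """Run script, voice, and assembly in order, honoring each stage's limit."""
    counts: Dict[str, int] = {}
    for stage, input_status, output_status, limit in STAGES:
        batch = [item for item in items if item["status"] == input_status][:limit]
        for item in batch:
            process(item, stage)  # hypothetical stage worker
            item["status"] = output_status
        counts[stage] = len(batch)
    return counts
```

With three drafts and a no-op worker, one run scripts and voices all three but assembles only one; the other two stay at `voiced` until the next run.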
Where to Find It
- Library (sidebar) — Browse all content by status
- Review Queue (sidebar) — Content waiting for approval
- Pipeline (sidebar) — Trigger pipeline stages and see run history
- Content detail page — View any individual content item, retry stages, approve/reject
- API: POST /pipeline/run-full (all stages), or individual stage endpoints
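As an illustration of triggering the full run over HTTP, the sketch below builds the request with Python's standard library. Only the `POST /pipeline/run-full` path comes from this page. The base URL, the empty request body, and the absence of authentication headers are assumptions, and `build_run_full_request` is a hypothetical helper.

```python
import urllib.request


def build_run_full_request(base_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the full pipeline run."""
    url = base_url.rstrip("/") + "/pipeline/run-full"
    # An empty body is assumed; the real endpoint may expect JSON or auth headers.
    return urllib.request.Request(url, data=b"", method="POST")
```

To send it, pass the result to `urllib.request.urlopen`, for example against a local instance at a base URL such as `http://localhost:8000` (a placeholder, not a documented address).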
Error Handling
If any stage fails, the content item is marked with status failed, and its error_message field records what went wrong. You can retry any individual stage from the content detail page or via the Pipeline API. A retry resets the status to the appropriate stage and re-runs that stage's processing.
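The fail-and-retry behavior might look roughly like the sketch below. It assumes that "the appropriate stage" means the status the stage consumes (for example, retrying voice generation resets the item to scripted). The `STAGES` table, `run_stage`, `retry_stage`, and the `worker` callback are hypothetical names, and items are plain dicts.

```python
from typing import Callable

# stage name -> (status it consumes, status it produces on success)
STAGES = {
    "script": ("draft", "scripted"),
    "voice": ("scripted", "voiced"),
    "assemble": ("voiced", "review"),
}


def run_stage(item: dict, stage: str, worker: Callable[[dict], None]) -> bool:
    """Run one stage; on error mark the item failed and record the message."""
    _, done_status = STAGES[stage]
    try:
        worker(item)
    except Exception as exc:
        item["status"] = "failed"
        item["error_message"] = str(exc)
        return False
    item["status"] = done_status
    item["error_message"] = None
    return True


def retry_stage(item: dict, stage: str, worker: Callable[[dict], None]) -> bool:
    """Reset the item to the stage's input status, clear the error, and re-run."""
    input_status, _ = STAGES[stage]
    item["status"] = input_status
    item["error_message"] = None
    return run_stage(item, stage, worker)
```

Keeping the error on the record rather than raising it lets the rest of a batch continue while the failed item waits for a manual retry.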
Dependencies
- OPENAI_API_KEY: Required for script generation, TTS, and Whisper
- ffmpeg / ffprobe: Required for video assembly and audio duration detection
- PEXELS_API_KEY: Only if using slideshow backgrounds
- YOUTUBE_CLIENT_ID, YOUTUBE_CLIENT_SECRET, YOUTUBE_REFRESH_TOKEN: Only for posting
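A startup check for these dependencies could be sketched as follows. The variable and binary names are the ones listed above; `check_dependencies` and its return shape are hypothetical, and the sketch assumes configuration is read from environment variables and that the binaries are found via PATH.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional, Tuple

REQUIRED_ENV = ["OPENAI_API_KEY"]
REQUIRED_BINARIES = ["ffmpeg", "ffprobe"]
# Optional features and the variables each one needs.
OPTIONAL_ENV = {
    "slideshow backgrounds": ["PEXELS_API_KEY"],
    "posting": ["YOUTUBE_CLIENT_ID", "YOUTUBE_CLIENT_SECRET", "YOUTUBE_REFRESH_TOKEN"],
}


def check_dependencies(
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Tuple[List[str], List[str]]:
    """Return (problems, warnings): problems block the pipeline, warnings do not."""
    env = os.environ if env is None else env
    problems = [f"missing env var {name}" for name in REQUIRED_ENV if not env.get(name)]
    problems += [f"{binary} not found on PATH" for binary in REQUIRED_BINARIES if which(binary) is None]
    warnings = []
    for feature, names in OPTIONAL_ENV.items():
        missing = [name for name in names if not env.get(name)]
        if missing:
            warnings.append(f"{feature} unavailable: missing {', '.join(missing)}")
    return problems, warnings
```

Calling `check_dependencies()` with no arguments inspects the real environment; the `env` and `which` parameters exist only so the check can be exercised without touching the machine's configuration.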