Content Creator
Unified content engine. Select what to generate, choose a mode, and let the Conceptual Twin create coherent content.
Frost AI is currently optimized for solo artist music videos — one character performing to your song with accurate lip-sync. A full band with multiple members isn't something any audio-driven AI model handles well right now; there's no reliable way to generate multiple consistent band members performing together. For a solo performance (vocalist singing to camera), it works great. For a full band, AI video isn't there yet across the industry, not just us.
Your all-in-one content engine. Select what to create, configure how the AI generates it, and fine-tune with studio controls. Everything here works together with your Conceptual Twin persona, Creative DNA, and Style Profile.
Main Controls
Choose what to generate: Caption, Image, and/or Video. Select multiple to create a full content package in one go. Each type has its own settings that appear when selected.
Standard — full-detail, high-quality output using all your data. Low Energy — quick casual posts, lighter AI processing. Recap — generates a summary/recap style post from recent content.
Use Persona infuses your Conceptual Twin persona into every prompt — your style, voice, and brand identity. Artist Character injects your LoRA-trained face/appearance so AI images and videos look like you.
Video Options
How long the generated video clip will be. Longer videos cost more Frost Bites. Some models have max duration limits shown in the model selector.
MiniMax ($0.045/s) — affordable, good quality. Kling 3.0 ($0.14/s) — high quality, built-in audio generation. Veo 3.1 ($0.40/s) — highest quality, up to 4K, 8s max. Veo Fast ($0.15/s) — same quality, faster, 8s max.
Kling & Veo models can generate audio. Configure Dialogue (narration/speech), Ambient (sound effects), and Background Music (genre & prominence). These are AI-generated — not your uploaded music.
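To budget a clip before generating, the per-second rates listed above can be turned into a quick estimate. This is a minimal sketch, not Frost AI's billing code: the function name and dictionary keys are illustrative, and only the duration caps stated above (8s for the Veo models) are included, so other models may have limits not shown here.

```python
# Per-second rates and duration caps copied from the model list above.
# The names below are illustrative and not part of Frost AI itself.
RATES_USD_PER_SECOND = {
    "MiniMax": 0.045,
    "Kling 3.0": 0.14,
    "Veo 3.1": 0.40,
    "Veo Fast": 0.15,
}
MAX_SECONDS = {"Veo 3.1": 8, "Veo Fast": 8}  # only the caps stated in the text


def estimate_video_cost(model: str, seconds: float) -> float:
    """Estimate the cost of one clip in USD."""
    if model not in RATES_USD_PER_SECOND:
        raise ValueError(f"unknown model: {model}")
    if seconds <= 0:
        raise ValueError("duration must be positive")
    cap = MAX_SECONDS.get(model)
    if cap is not None and seconds > cap:
        raise ValueError(f"{model} is limited to {cap}s per clip")
    # Round to avoid floating-point noise in the displayed estimate.
    return round(RATES_USD_PER_SECOND[model] * seconds, 4)


print(estimate_video_cost("MiniMax", 10))  # 0.45
print(estimate_video_cost("Veo 3.1", 8))   # 3.2
```

The actual charge is always shown in the app before you confirm, so treat this only as a planning aid.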
More Options
Sets the aspect ratio and content style for a specific platform (TikTok 9:16, Instagram 1:1, YouTube 16:9, etc.). Also tailors caption length and hashtag style.
Controls overall intensity: High = bold, attention-grabbing. Low/Minimal = subtle, laid-back. Auto lets the AI decide based on your persona and trending context.
When generating images: choose Image Type (Promo Graphic, Cover Art, Banner, etc.), Size (square, landscape, portrait), and Quality (Standard or HD).
Upload reference images to guide the AI's visual style. The AI will analyze your references and match the look, composition, and mood in generated images/videos.
Add custom instructions the AI should follow. Example: “focus on the new single release” or “dark moody vibe”. These are injected directly into the AI prompt.
Studio Controls
How prominently your artist character appears. 100% = always center-frame. Lower values let the AI focus on scenery, objects, or abstract visuals.
How strictly the AI follows the prompt. Strict = exact match. Loose = more artistic interpretation. Default 70% balances accuracy with creative flair.
How much your Twin's personality shapes the output. On-Brand = heavily persona-driven. Generic = more neutral tone. Affects captions, themes, and visual style.
AI's creative leash. Literal = sticks close to your notes/prompt. Imaginative = the AI takes more creative liberties with unexpected angles and ideas.
Fast = simpler prompts, quicker generation. High Detail = richly descriptive prompts with more visual layers and nuance.
Video only. Calm = slow/minimal movement. Dynamic = fast camera moves, action, energy. Controls how much movement the AI adds to video clips.
Emotional intensity of the output. Neutral = understated. Intense = dramatic lighting, bold colors, strong emotion in prompts.
Music & Lip-Sync
Choose a track from your Media Library. Use the waveform editor to select a clip region. The AI times video scenes to match your music's rhythm and energy.
Enable Music Video Mode to create videos where your character sings/performs to the audio. Choose a lip-sync model — Audio-Driven models animate from a photo, Post-Hoc models sync lips on existing video.
Audio-driven: OmniHuman = highest fidelity, full body. Creatify = singing + speech. Kling Avatar = good value. Post-hoc: Sync Labs = premium lip accuracy. Kling LipSync = budget option.
Set the performance mood: Cinematic, Singing, Hype, Soulful, Aggressive, Intimate, and more. Controls facial expressions and body language in lip-sync videos.
Prominent = close-ups, clear face shots. Balanced = mix of angles. Artistic = partial views, silhouettes, creative framing.
Text Overlay & Workflow
Add text to generated videos/images. Choose style (Neon, Cinematic, Vintage, etc.), position, size, and color. Fade and shadow options for readability.
Preview Prompts lets you see and edit the AI prompts before generating. Tweak the theme, caption, image, or video prompt, then accept or regenerate.
Create Directly skips preview and generates immediately. Results appear below and are auto-saved to your Content Library for rating and reuse.
Pro Tips
- The more data you feed your Twin (media, lyrics, social profiles), the more authentic and on-brand your output becomes.
- Use Preview Prompts to learn how the AI thinks — edit prompts to fine-tune before spending Frost Bites on generation.
- Combine Caption + Image + Video to create a complete content package in a single generation.
- Studio Controls are your creative director — experiment with different slider combos to discover unique visual styles.
- Add reference images when you want the AI to match a specific aesthetic or composition.
- Rate generated content in the Library — ratings train the AI and feed into your Creative DNA.
No images uploaded yet. Upload images in the Media tab.
Click to select/deselect. Blue dot = analyzed. Selected images guide visual style.
Content Library
Browse, rate, and download your generated content. High-rated content helps the AI learn your preferences.
Everything the AI generates lands here. Browse, rate, edit, and download your content. Your ratings are a critical feedback loop — they directly train the AI and shape your Creative DNA.
Controls
Filter by rating (5 stars, 4+, 3+, unrated), content type (images, videos, captions), or platform (TikTok, Instagram, YouTube, X). Combine filters to find exactly what you need.
Rate every piece of content 1-5 stars. This is how the AI learns your taste. High-rated content strengthens your Creative DNA. Low-rated content teaches the AI what to avoid.
Analyze Patterns scans your library for patterns in what you rate high versus low. Insights appear above the grid and help you understand your preferences.
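Combining filters means an item must pass every active filter at once. The sketch below shows that logic under assumed names: the LibraryItem fields and the filter_items function are hypothetical and only illustrate how rating, type, and platform filters stack.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LibraryItem:
    # Hypothetical shape of a library entry, for illustration only.
    title: str
    content_type: str             # "image", "video", or "caption"
    platform: str                 # e.g. "TikTok", "Instagram", "YouTube", "X"
    rating: Optional[int] = None  # 1-5 stars, or None when unrated


def filter_items(
    items: List[LibraryItem],
    min_rating: Optional[int] = None,
    unrated_only: bool = False,
    content_type: Optional[str] = None,
    platform: Optional[str] = None,
) -> List[LibraryItem]:
    """Keep only the items that satisfy every filter that is set."""
    result = []
    for item in items:
        if unrated_only and item.rating is not None:
            continue
        if min_rating is not None and (item.rating is None or item.rating < min_rating):
            continue
        if content_type is not None and item.content_type != content_type:
            continue
        if platform is not None and item.platform != platform:
            continue
        result.append(item)
    return result
```

For example, min_rating=4 with content_type="image" and platform="TikTok" corresponds to picking "4+", "images", and "TikTok" together.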
Pro Tips
- Rate everything — even content you don't like. Low ratings are just as valuable as high ones for training.
- After rating a batch, go to the Twin page and refresh Creative DNA to apply your feedback.
- Click any content card to expand it — you can edit, copy, download, or delete individual items.
- Use pagination at the bottom to browse older content.
Media Library
Upload images, videos, and audio to build your persona database. The Conceptual Twin learns from your media.
Your personal media vault. Upload images, video, and audio — the AI analyzes everything to build a richer understanding of your visual style, sound, and brand. Media also feeds into LoRA training, voice cloning, and Creative DNA.
Controls
Drag-and-drop or click to upload. Supports JPG, PNG, WebP, MP4, MOV, MP3, WAV, M4A, and more. Add comma-separated tags for organization.
Switch between Images, Videos, and Audio tabs to browse your library by media type.
Audio files are automatically transcribed using Whisper AI. Transcriptions feed into your Style Profile and persona analysis.
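If you prepare files in bulk before uploading, a small script can sort them by the tab they will land in and clean up a tag string. This is a sketch under assumptions: the extension sets cover only the formats named in this section (the uploader accepts more), and both function names are hypothetical.

```python
from pathlib import Path
from typing import List, Optional

# Only the formats named in this section; the uploader accepts more.
MEDIA_TABS = {
    "Images": {".jpg", ".png", ".webp"},
    "Videos": {".mp4", ".mov"},
    "Audio": {".mp3", ".wav", ".m4a"},
}


def media_tab_for(filename: str) -> Optional[str]:
    """Return the tab a file would appear under, or None if not listed here."""
    ext = Path(filename).suffix.lower()
    for tab, extensions in MEDIA_TABS.items():
        if ext in extensions:
            return tab
    return None


def parse_tags(raw: str) -> List[str]:
    """Split a comma-separated tag string, dropping blanks and extra spaces."""
    return [tag.strip() for tag in raw.split(",") if tag.strip()]


print(media_tab_for("cover.JPG"))          # Images
print(parse_tags(" live, studio ,, promo "))  # ['live', 'studio', 'promo']
```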
How Media Powers the AI
- Images are analyzed for visual style, color palettes, and aesthetic preferences → shapes your Creative DNA.
- Audio is transcribed and analyzed for vocal patterns, tone, and lyrical themes → feeds your Style Profile.
- Videos are analyzed for visual storytelling patterns → enriches your Creative DNA when processed on the Twin page.
- Audio tracks uploaded here become available as music selections in the Creator and Storyboard pages.
- Upload 10-20 clear photos of yourself to train a LoRA face model on the Twin page.
Drag & drop files here or click to upload
Supported: Images (jpg, png, webp), Video (mp4, mov), Audio (mp3, wav, m4a)
Conceptual Twin
Your AI persona — the Conceptual Twin — learns from your media, lyrics, social presence, and content feedback.
Your AI identity hub. The Conceptual Twin is your digital persona — it learns who you are from every data source and shapes all AI output to match your authentic voice and visual identity.
Core Tools
Re-analyze Persona rebuilds your Conceptual Twin brief from all data — media analysis, lyrics, social profiles, setup info. This is the “brain” that drives all content generation. (~$0.03)
Refresh Creative DNA synthesizes patterns from your content ratings, media, and style into a compact data signature. The button glows when new data is available. (~$0.01)
Train a LoRA face model from 10-20 photos of yourself. Once trained, AI-generated images and video keyframes will look like you. Choose training steps (1000-4000) for speed vs. quality. (~$3.00)
Clone your voice from audio samples. Record a new sample or select existing audio from your library. Used for text-to-speech, narration, and voiceover features. (~$0.68)
Audio Studio
Generate custom sound effects from a text description. Set duration and prompt influence for precise results.
Separate vocals from instrumentals in any audio file. Useful for creating acapellas or backing tracks.
Design a synthetic voice by describing it (age, gender, accent). Preview and save for narration use.
Convert any audio into a different voice. Select a source recording and target voice — great for demos or translations.
Auto-generate narration scripts (social snippets, podcast intros, promo spots) using your cloned voice and persona.
Pro Tips
- Re-analyze Persona after adding new media, social profiles, or lyrics so the brief is rebuilt from all your data.
- Refresh Creative DNA after rating content in the Library — your ratings directly train the AI.
- For LoRA training, use 10-20 clear photos with varied angles, lighting, and expressions for best results.
- The Evolution Timeline at the bottom shows how your Twin has grown over time.
- Use Video Analysis to enrich your DNA from social video content.
Persona Brief
Persona Status
Creative DNA
AI-analyzed creative profile built from all your artist data. Helps generate increasingly authentic, artist-specific prompts. Upload media, analyze content, or rate library entries to enrich your DNA.
No Creative DNA generated yet.
Create Your Clone
Train a custom AI model on the artist's face for perfect character consistency in images and videos. Requires 10-20 high-quality photos. Training takes 10-20 minutes and costs ~$2-4. Once trained and activated, "Include Artist Character" on the Create page will use this model.
Avoid these in your training photos:
- Sunglasses, masks, or anything covering the face
- Hands near or on the face
- Heavy makeup that dramatically changes appearance between shots
- Extreme filters or heavy editing that distorts features
- Low-resolution or heavily compressed images
Use varied backgrounds, angles, and lighting with the face clearly visible in every photo.
No LoRA models trained yet.
Artist Voice Clone (ElevenLabs)
Clone the artist's voice from audio samples to generate speech-based content (narration, voiceovers, etc.). Requires ElevenLabs API key in Settings. Needs 3-5 minutes of clean voice audio for best results.
No voice clone configured yet.
Upload audio files in the Media Library first.
Audio Studio
Create sound effects, design new voices, convert voice styles, and generate automated content.
Generate Sound Effect
Describe the sound you want to create. Be specific about type, mood, and details.
Lower = more variation, Higher = closer to description
Isolate Vocals
Separate vocals from instrumental/background. Great for acapella versions or extracting samples for voice cloning.
Design New Voice
Create a unique AI voice by describing its characteristics. Preview before saving to your library.
Voice Conversion (Speech-to-Speech)
Convert speech from one voice to another while preserving timing and emotion. Great for voice anonymization or character voices.
Auto-Generate Narration
AI writes the script and generates the audio. Perfect for podcast intros, social snippets, and promo spots.
Social Video Analysis (Creative DNA)
Analyze your social media videos to enrich your Creative DNA with speaking style, catchphrases, and content patterns. Upload videos to the Media Library, then analyze them here.
No videos analyzed yet. Upload videos in Media Library.
Evolution Timeline
No evolution snapshots yet.
Content Feedback History
No feedback recorded yet.
Lyrics Analysis
Paste your lyrics below. AI will extract your writing style (tone, themes, motifs, vocabulary) and save it to your style profile.
Feed the AI your original lyrics so it can learn your writing DNA — vocabulary, motifs, emotional patterns, and thematic tendencies. This shapes how the AI writes captions and content in your voice.
Controls
Paste your lyrics into the text area. One song at a time. The AI extracts tone, themes, recurring motifs, vocabulary level, and emotional range. (~$0.01)
Click Analyze Lyrics to process. Results show the extracted writing patterns. Each analysis adds to your cumulative Style Profile.
How Lyrics Power the AI
- Extracted patterns feed directly into your Style Profile, shaping caption tone and word choice.
- Add multiple songs for a deeper, more representative analysis of your writing voice.
- Include lyrics that best represent your artistic identity — your most personal, distinctive work.
- After analyzing lyrics, refresh your Style Profile on the Style page to incorporate the new data.
Analysis Results
Style Profile
Your style profile is auto-built from all available data: lyrics analysis, audio transcriptions, image analysis, social profiles, persona brief, and any files in style_library/. Click Refresh to rebuild it.
Your Style Profile is a compiled summary of how you write, speak, and express yourself. It's auto-built from every data source and directly shapes how the AI crafts captions, bios, and text content.
Controls
Click Refresh Style to rebuild your profile from all available data: lyrics analysis, audio transcriptions, image analysis, social profiles, persona brief, and style library files. (~$0.01)
Data Sources That Feed Your Style
- Lyrics Analysis → vocabulary, motifs, emotional range, thematic patterns
- Audio Transcriptions → speaking style, phrasing, natural language patterns
- Social Profiles → caption style, hashtag usage, posting voice
- Persona Brief → artistic identity, genre, tone, brand values
- Refresh after adding any new data to incorporate it. The more sources you add, the more authentic your profile.
Trending Now
Live trending topics and hashtags filtered for your genre. Used automatically in content generation.
Live trending topics filtered to your genre and niche. Trends are automatically injected into content generation so your posts stay timely and relevant without extra effort.
Controls
Click Refresh Trends to pull the latest trending topics, hashtags, and cultural moments filtered to your genre and platforms. (~$0.01)
How Trends Power the AI
- Trends are automatically used during content generation — no need to copy/paste them.
- The AI weaves relevant trends into captions, hashtags, and visual themes naturally.
- Refresh before generating content for maximum relevance — trends change fast.
- Last refresh timestamp shows at the top so you know how current your data is.
Last updated: never
Storyboard
Auto-generate a visual storyboard of keyframe images, preview and edit scenes, then generate a seamlessly connected video.
Create multi-scene music videos and visual stories. The AI plans scenes, generates keyframe images, then animates each into video segments that flow together seamlessly. Works in 3 phases: Plan → Preview & Edit → Generate Video.
Setup Controls
Total video length: 10s to 4 min. Longer videos = more scenes = higher cost. The AI divides the duration into segments automatically based on scene count.
Number of distinct scenes/segments. Auto calculates the optimal count from duration. Manual: 2-10 scenes. More scenes = more visual variety but each scene is shorter.
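To see how duration and scene count trade off, here is a minimal sketch that splits a video into equal segments. The even split is an assumption made for illustration: the app's Auto mode and its music-aware timing are not documented here and may divide scenes differently. The limits (10s to 4 min, 2-10 manual scenes) come from the text above.

```python
from typing import List, Tuple


def split_into_scenes(total_seconds: float, scene_count: int) -> List[Tuple[float, float]]:
    """Return (start, end) times for equal-length scenes (an assumed even split)."""
    if not 10 <= total_seconds <= 240:
        raise ValueError("total length must be between 10s and 4 min")
    if not 2 <= scene_count <= 10:
        raise ValueError("manual scene count must be between 2 and 10")
    length = total_seconds / scene_count
    return [
        (round(i * length, 3), round((i + 1) * length, 3))
        for i in range(scene_count)
    ]


print(split_into_scenes(30, 3))  # [(0.0, 10.0), (10.0, 20.0), (20.0, 30.0)]
```

Doubling the scene count at the same duration halves each scene's length, which is the variety-versus-length trade-off described above.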
MiniMax Hailuo — affordable, good motion. Kling 3.0 Pro — high fidelity. Veo 3.1 — highest quality. Each model animates your keyframe images into video.
FLUX + LoRA — uses your trained character model so keyframes look like you. Imagen 4 — Google's text-to-image, great for general visuals but no character LoRA support.
16:9 landscape (YouTube). 9:16 vertical (TikTok, Reels, Shorts). 1:1 square (Instagram feed). Sets the frame for all keyframes and video.
Smooth — scenes evolve naturally with continuous visual flow. Dynamic — scenes morph and transform between each other for dramatic effect.
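The aspect ratios above translate into frame sizes once you pick a width. The sketch below is illustrative only: the platform keys and the frame_size function are hypothetical, and the pixel widths are example inputs, not resolutions Frost AI is known to use.

```python
from typing import Tuple

# Aspect ratios as listed above, written as (width, height) parts.
ASPECT_RATIOS = {
    "YouTube": (16, 9),
    "TikTok": (9, 16),
    "Reels": (9, 16),
    "Shorts": (9, 16),
    "Instagram feed": (1, 1),
}


def frame_size(platform: str, width: int) -> Tuple[int, int]:
    """Return (width, height) in pixels for a platform at a chosen width."""
    ratio_w, ratio_h = ASPECT_RATIOS[platform]
    return width, width * ratio_h // ratio_w


print(frame_size("YouTube", 1920))  # (1920, 1080)
print(frame_size("TikTok", 1080))   # (1080, 1920)
```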
Audio & Lip-Sync
Select a song from your Media Library. The waveform editor lets you pick a clip region. Scene timing syncs to your music's beats, energy, and structure.
Preset — pick a fixed duration (30s, 1min, etc.). Clip Selection — use the exact region you selected on the waveform. Full Track — use the entire song.
Enable to make your character sing/perform to the audio. Choose a lip-sync model and expression style (Cinematic, Singing, Hype, Soulful, etc.). Audio-driven models animate from a still photo.
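The three duration modes described above (Preset, Clip Selection, Full Track) each settle on one final video length. A rough sketch of that decision follows; the mode strings, parameter names, and function are hypothetical and only mirror the descriptions in this section.

```python
from typing import Optional, Tuple


def resolve_duration(
    mode: str,
    preset_seconds: Optional[float] = None,
    clip: Optional[Tuple[float, float]] = None,
    track_seconds: Optional[float] = None,
) -> float:
    """Return the video length in seconds for the chosen duration mode."""
    if mode == "preset":
        if preset_seconds is None:
            raise ValueError("preset mode needs a preset duration")
        return float(preset_seconds)
    if mode == "clip":
        if clip is None:
            raise ValueError("clip mode needs a (start, end) region")
        start, end = clip
        if end <= start:
            raise ValueError("clip end must come after clip start")
        return float(end - start)
    if mode == "full":
        if track_seconds is None:
            raise ValueError("full-track mode needs the track length")
        return float(track_seconds)
    raise ValueError(f"unknown mode: {mode}")


print(resolve_duration("clip", clip=(42.5, 72.5)))  # 30.0
```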
Creative Studio Sliders
Emotional intensity of the visuals — from understated to dramatic and intense.
Pacing and action level. Low = slow, contemplative. High = fast-cutting, dynamic.
How stylized vs. realistic the scenes look. Higher = more artistic/abstract treatments.
Location variety across scenes. Low = same location. High = diverse environments.
Scene Flow controls narrative coherence between scenes. Special Effects adds particles, light flares, atmospheric elements.
Camera Focus controls framing (wide vs. close-up). Movement controls camera motion (static vs. tracking/panning). Color Grade and Lighting shape the overall cinematic look.
Editing & Generation
Theme sets the overall concept (e.g., “cyberpunk city at night”). Prompt gives detailed instructions. Use Persona injects your Twin for on-brand visuals.
After generating, click any scene on the timeline to edit its prompt, keyframe image, mood, and setting. Regenerate a single keyframe or upload your own image.
Once you're happy with all keyframes, hit Generate Video. The AI animates each scene, chains them together, and overlays your audio track. Save or share when done.
Pro Tips
- Use FLUX + LoRA as image engine if you have a trained face model — keyframes will look like you.
- Select a music track and use clip selection to create perfectly timed music videos.
- Edit individual scene prompts after generation to fine-tune before committing to video.
- Start with 30s – 1 min videos to experiment, then go longer once you find your style.
- The Creative Studio sliders dramatically change the look — try different combos to discover unique aesthetics.
Artist Setup
Tell the Co-Pilot about yourself so it can generate personalized promo content.
Your artist profile — the foundation for everything the AI generates. The more detail you provide here, the more personalized and on-brand all your content will be.
Fields
Name, Genre, Tone (e.g., confident, introspective, playful), Releases (current singles/albums), and Avoid Phrases (words the AI should never use).
Check the platforms you're active on: TikTok, Instagram, X, YouTube, Spotify. This shapes content formatting and platform-specific recommendations.
Promo Intensity: Low = subtle/organic. Medium = balanced. High = aggressive promotion with strong calls-to-action.
Physical appearance details: gender, ethnicity, skin tone, age, build, hair, features. Used for AI image/video generation so your character looks like you.
Pro Tips
- Be specific about tone — “confident but approachable” gives better results than just “cool”.
- List current releases so the AI can reference them naturally in content.
- The Character Description section is crucial for LoRA training and AI-generated visuals.
- Update this page whenever your artistic direction or releases change.
Settings
API keys are stored locally in memory/config.json — never sent anywhere except the respective API providers.
Configure API keys for the AI services that power Frost AI. Keys are stored securely in your local config and only sent to their respective providers. Use BYOK mode to pay providers directly instead of using Frost Bites.
API Keys
OpenRouter: Required for Content Creator, Lyrics, Style Refresh, Persona, and Viral. Powers all text/LLM generation, so this is the most important key.
OpenAI: Required for audio transcription (Whisper). Needed if you upload audio/video files that need transcription.
fal.ai: Required for image generation (FLUX), video generation (MiniMax, Kling), LoRA training, and long video chaining.
ElevenLabs: Optional. Enables voice cloning, text-to-speech, sound effects, voice conversion, and video transcription.
Optional: Google API key (YouTube data) and X/Twitter Bearer Token (profile scraping). Only needed for social profile analysis.
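In BYOK mode, which features work depends on which keys are present. The sketch below reads a local JSON config and reports what is covered. The file path comes from the note at the top of this section, but the JSON key names (such as openrouter_api_key) are assumptions, since the real layout of the file is not documented here. The provider-to-feature mapping follows the setup guides in this section.

```python
import json
from pathlib import Path
from typing import Dict, List

# Feature -> config key. The key names are hypothetical placeholders.
FEATURE_KEYS: Dict[str, str] = {
    "text generation": "openrouter_api_key",
    "audio transcription": "openai_api_key",
    "image and video generation": "fal_api_key",
    "voice cloning and sound effects": "elevenlabs_api_key",
}


def load_config(path: str = "memory/config.json") -> dict:
    """Load the local config file, or return an empty dict if it is missing."""
    config_path = Path(path)
    if not config_path.exists():
        return {}
    with config_path.open(encoding="utf-8") as handle:
        return json.load(handle)


def available_features(config: dict) -> List[str]:
    """List the features whose key is present and non-empty."""
    return [feature for feature, key in FEATURE_KEYS.items() if config.get(key)]


print(available_features({"openrouter_api_key": "sk-or-example", "fal_api_key": ""}))
# ['text generation']
```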
Pro Tips
- You only need keys for services you actually use — start with OpenRouter and add others as needed.
- In Frost Bites mode (default), you don't need any API keys — we handle everything.
- Switch to BYOK mode in Account settings to use your own keys and pay providers directly.
- Click “How to Get API Keys” below for step-by-step setup guides for each provider.
How to Get API Keys
OpenRouter (AI text generation)
- Go to openrouter.ai
- Sign in with Google or create an account
- Go to openrouter.ai/keys
- Click "Create Key" and copy it (starts with sk-or-...)
- Paste in the field above
OpenAI (Whisper audio transcription)
- Go to platform.openai.com
- Sign in or create an account
- Go to platform.openai.com/api-keys
- Click "+ Create new secret key" and copy it
- Paste in the field above
Needed for audio transcription in the Media Library.
fal.ai (FLUX image generation + MiniMax Hailuo video generation)
- Go to fal.ai
- Sign in or create an account
- Go to fal.ai/dashboard/keys
- Create a new key and copy it
- Paste in the field above
Needed for AI video generation. ~$0.45 per 10s clip. Long videos chain multiple clips.
Google API Key (YouTube social profiles)
- Go to console.cloud.google.com
- Create a project (or select existing)
- Enable "YouTube Data API v3" in APIs & Services
- Go to Credentials, click "Create Credentials" > "API Key"
- Copy the key and paste above
Optional. Enables detailed YouTube profile scraping (subscriber counts, recent videos).
X/Twitter Bearer Token (social profiles)
- Go to developer.x.com and sign up for a developer account
- Create a project and app
- Under "Keys and Tokens", generate a Bearer Token
- Copy and paste above
Optional. Enables detailed X/Twitter profile scraping (bio, followers, tweet count).
ElevenLabs API Key (audio analysis & voice cloning)
- Go to elevenlabs.io and create an account
- Go to elevenlabs.io/app/developers/api-keys
- Generate and copy API key
- Paste in the field above
Optional but recommended. Enables:
- Song Analysis: Extract lyrics, detect song structure (verse/chorus/bridge), word-level timestamps
- Voice Cloning: Create a cloned voice from your audio samples for TTS
- Video Transcription: Analyze social media videos for Creative DNA enrichment
- Lyric Videos: Generate synced subtitles for lyric overlay
Frost Bites
Purchase and track your Frost Bites (FB) — the tokens that power all AI generation.
Frost Bites are the tokens that power all AI generation. 1 FB = $1.00 USD. Every action shows its cost before confirming, so you always know what you're spending.
Controls
Your current Frost Bite balance. Also shown in the sidebar. Deducted automatically as you generate content.
Pick a preset amount ($5, $10, $20, $50, $200) or enter a custom amount ($1-$1000). Payment via Stripe — instant balance update.
Recent usage history shows what you spent Frost Bites on. Transaction history shows all purchases and charges.
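The top-up rules above are simple enough to check in a few lines. This sketch uses the stated limits ($1-$1000 custom range, the listed presets, and 1 FB = $1.00). The function name is hypothetical and the real checkout runs through Stripe, not this code.

```python
PRESET_AMOUNTS_USD = (5, 10, 20, 50, 200)
MIN_CUSTOM_USD = 1
MAX_CUSTOM_USD = 1000
USD_PER_FROST_BITE = 1.00  # 1 FB = $1.00 USD, as stated above


def frost_bites_for(amount_usd: float) -> float:
    """Return how many Frost Bites a purchase adds, after validating the amount."""
    if not MIN_CUSTOM_USD <= amount_usd <= MAX_CUSTOM_USD:
        raise ValueError("amount must be between $1 and $1000")
    return amount_usd / USD_PER_FROST_BITE


print(frost_bites_for(20))  # 20.0
```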
Pro Tips
- Every generation shows cost before confirming — no surprise charges.
- Switch to BYOK mode in Account to use your own API keys instead of Frost Bites.
- Text actions (captions, analysis) are the cheapest, around $0.01 to $0.03 for the analysis tools listed in this guide. Images and video cost more.
- Check usage history to understand which features cost the most and optimize your workflow.
Purchase Frost Bites
Select an amount to add to your balance. Payment processed securely via Stripe.
Recent Usage
Your Frost Bite spending across AI services.
Transaction History
Account
Manage your account, subscription, and view API usage.
Manage your subscription, API key mode, password, and track API usage costs from your connected providers.
Controls
Frost Bites — we handle all API calls, you pay with FB tokens. My Keys (BYOK) — bring your own API keys, pay providers directly, no FB charges.
View your plan status, manage billing, or subscribe. Manage Subscription opens the Stripe billing portal for payment methods and invoices.
Change your account password. Requires current password and a new password (8+ characters).
Estimated monthly costs from OpenRouter, OpenAI, and fal.ai. Only relevant in BYOK mode. Refresh to get latest numbers.
Pro Tips
- New users: Start with Frost Bites mode — no API key setup needed.
- Power users: BYOK mode gives you direct control and often lower costs at high volume.
- You can switch between modes anytime — your data and content are unaffected.
API Key Mode
Choose how to power AI generation
Account Information
Subscription & Billing
Manage your subscription and payment methods.
Change Password
API Usage (This Month)
Estimated costs from your connected API providers.
Usage data is fetched directly from each provider's API (when supported).
My Jobs
Track and manage your content generation jobs.
See every job you've submitted — storyboard creation, video generation, content creation, and more. Track progress in real time and cancel jobs you no longer need.
Status Guide
Job is waiting for admin approval (Frost Bites mode). You can cancel while it's queued.
Approved and waiting to start processing.
Actively processing. AI is generating your content right now.
Completed successfully. Check your Storyboard or Library page for results.
Pro Tips
- Click Details on any job to see full configuration, APIs used, and cost breakdown.
- Cancel queued or running jobs you no longer need to save Frost Bites.
- Use the status filter to quickly find active or completed jobs.
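The status guide describes a one-way progression with a cancel option along the way. Here is a minimal sketch of that lifecycle. The status names are assumptions (the guide describes the states but this text does not name them), and cancellation is allowed only for queued and running jobs because those are the two cases mentioned above.

```python
from enum import Enum


class JobStatus(Enum):
    # Hypothetical names for the four states in the guide, plus cancelled.
    QUEUED = "queued"        # waiting for admin approval
    APPROVED = "approved"    # approved, waiting to start
    RUNNING = "running"      # actively processing
    COMPLETED = "completed"  # finished successfully
    CANCELLED = "cancelled"


NEXT_STATUS = {
    JobStatus.QUEUED: JobStatus.APPROVED,
    JobStatus.APPROVED: JobStatus.RUNNING,
    JobStatus.RUNNING: JobStatus.COMPLETED,
}

# Only the cases the guide mentions explicitly.
CANCELLABLE = {JobStatus.QUEUED, JobStatus.RUNNING}


def advance(status: JobStatus) -> JobStatus:
    """Move a job to its next state in the normal flow."""
    if status not in NEXT_STATUS:
        raise ValueError(f"{status.value} jobs cannot advance")
    return NEXT_STATUS[status]


def cancel(status: JobStatus) -> JobStatus:
    """Cancel a job if its current state allows it."""
    if status not in CANCELLABLE:
        raise ValueError(f"{status.value} jobs cannot be cancelled")
    return JobStatus.CANCELLED
```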
No jobs yet. Generate content from the Creator or Storyboard page to see your jobs here.