TL;DR
This is the literal build behind Jordan on our homepage. You generate one photo of your spokesperson, connect the HeyGen MCP to Claude Code once, paste one prompt, and Claude does the rest: uploads the photo, creates the avatar, renders a talking video in your chosen voice, compresses it for the web, and drops it on your site. One image in, a moving spokesperson out. You approve; the machine assembles.
What you'll have at the end
A 30–45 second photorealistic spokesperson video — a real-looking person who talks, moves, and gestures — playing on your site (click-to-play, captioned, web-light). The same avatar then becomes a reusable asset for UGC ads and, later, a live AI concierge. Jordan is our proof: built with this exact pipeline, in one sitting.
The exact pipeline (what Claude runs for you)
HOW JORDAN WAS ACTUALLY BUILT
══════════════════════════════════════════════════════════════════
YOU                             CLAUDE CODE + HEYGEN MCP
───                             ────────────────────────
1. generate one photoreal       ┌────────────────────────────────┐
   still of your person   ───▶  │ upload image as a HeyGen asset │
   (ChatGPT image / MJ)         │ (create_asset_upload → PUT)    │
                                └────────────────┬───────────────┘
2. connect HeyGen MCP once                       ▼
   (terminal / IDE / desktop)   ┌────────────────────────────────┐
                                │ create_photo_avatar → "Jordan" │
3. paste ONE prompt       ───▶  │ trains in seconds (AI face =   │
   ("build my spokesperson")    │ no consent step)               │
                                └────────────────┬───────────────┘
4. sip tea                                       ▼
                                ┌────────────────────────────────┐
                                │ list_voices → pick warm/upbeat │
                                │ create_video_from_avatar       │
                                │ · Avatar IV, expressiveness    │
                                │   HIGH (real movement)         │
                                │ · 4:5, 1080p, your script      │
                                └────────────────┬───────────────┘
                                                 ▼
                                ┌────────────────────────────────┐
                                │ poll get_video → download MP4  │
                                │ ffmpeg 2-pass: 52MB → ~6.5MB   │
                                │ embed click-to-play on site    │
                                └────────────────────────────────┘
══════════════════════════════════════════════════════════════════
What you need
| Piece | Why | Notes |
|---|---|---|
| HeyGen account, Creator plan | Renders the avatar; removes the watermark | $24/mo. Creator is the sweet spot — watermark-free, 1080p, photo avatars. Pro's 4K is unneeded for web. |
| The HeyGen MCP server | Lets Claude Code drive HeyGen directly | Hosted, OAuth, no API key. Connection steps below. |
| Claude Code | The operator | Terminal, IDE extension, or desktop — all work. |
| One photoreal still | The face | ChatGPT image, Midjourney, or a HeyGen-generated face. Front-on, mouth unobstructed. |
| ffmpeg | Web compression | HeyGen exports ~50MB; the web needs ~6MB. Free. Install with scoop install ffmpeg (Windows) or brew install ffmpeg (macOS). |
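Before starting, it can help to confirm the two local tools from the table are actually installed. The following is a minimal sketch, assuming the Claude Code CLI is on your PATH as claude; the function names have_tool and preflight are made up for this example.

```shell
# Preflight sketch: report which local tools are on PATH.
# Assumption: the Claude Code CLI is installed under the name `claude`.

# Succeeds (exit 0) when the named command is found on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Prints one status line per tool; returns non-zero if any is missing.
preflight() {
  _missing=0
  for _tool in "$@"; do
    if have_tool "$_tool"; then
      echo "ok: $_tool"
    else
      echo "missing: $_tool"
      _missing=1
    fi
  done
  return "$_missing"
}

# Usage:
#   preflight ffmpeg claude
```

The HeyGen account and the MCP connection cannot be checked this way; those are covered in Step 2.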
Step 1 — Generate the face (one still)
Front-facing, evenly lit, eyes open, natural closed-mouth smile, mouth unobstructed (critical — it's what HeyGen lip-syncs). Chest-up, vertical. One good image is enough; Avatar IV builds the whole talking video from it.
The prompt that made Jordan (adapt the person to your brand):
Photorealistic vertical 4:5 studio portrait of a youthful, approachable
[describe your spokesperson]. Looking directly into camera, relaxed
closed-mouth smile. Business-casual. Setting: a warm, softly-lit space that
fits your brand, shallow depth of field. Face fully visible, evenly lit.
Natural skin texture, 85mm lens, true-to-life color. Not airbrushed, not
over-smoothed. No text, no logos.
Generate 3–4, pick the most human. Save it — every future ad reuses the same face.
Step 2 — Connect the HeyGen MCP (once)
The MCP is hosted at https://mcp.heygen.com/mcp/v1/, uses OAuth (no API key), and
works in every Claude Code surface. Pick your environment:
A) Claude Code — terminal (CLI)
claude mcp add --transport http heygen https://mcp.heygen.com/mcp/v1/ -s user
Then run /mcp, select heygen, approve in the browser. (-s user makes it
available in every project.)
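To confirm the server registered exactly once, you can count its entries in the output of claude mcp list. This is a rough sketch: it assumes each configured server is printed on its own line with the name first, which may not match the exact output of your Claude Code version. mcp_count is a hypothetical helper name.

```shell
# Count lines of `claude mcp list` output (read from stdin) that start with
# the given server name.
# Assumption: one line per configured server, with the name at the start.
mcp_count() {
  # grep -c prints 0 and exits non-zero when nothing matches; keep the 0.
  grep -c -- "^[[:space:]]*$1" || true
}

# Usage:
#   n=$(claude mcp list | mcp_count heygen)
#   0 = not registered, 1 = good, 2 or more = duplicate entries to clean up
```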
B) Claude Code — IDE extension (VS Code / Cursor / JetBrains)
Open the MCP servers panel → + Add → choose HTTP → paste
https://mcp.heygen.com/mcp/v1/ → name it heygen. It appears with a "Needs Auth"
button — click it, approve in the browser, then reload the window so the tools load.
(CLI-added servers also appear here after a window reload.)
C) Claude desktop app / claude.ai (Connectors)
Settings → Connectors → Add custom connector → paste
https://mcp.heygen.com/mcp/v1/ → authenticate with OAuth. Start a new chat so the
tools are available.
One rule across all three: MCP tools load when a session starts. After you connect and authenticate, reload/restart so Claude can see the HeyGen tools.
Step 3 — Paste the one-shot prompt
Drop your photo into the project folder, open Claude Code, and paste the prompt from INSTALL.md (ships in this kit). In plain terms it tells Claude:
Here's my spokesperson photo and my 40-second script. Using the HeyGen MCP: upload the image, create a photo avatar, pick a warm upbeat voice, render it on Avatar IV at high expressiveness in 4:5 1080p, poll until done, download it, compress it under 8MB with ffmpeg, and drop it into my site as a click-to-play video.
Then you wait. Claude runs the whole chain — the same calls listed in the diagram above — and reports back with a finished, web-ready file.
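Since the prompt asks for a 40-second script, it is worth checking the script length before you render. The sketch below estimates spoken duration from word count. The 150 words-per-minute default is an assumption, not a HeyGen figure, and the real pace depends on the voice and speed you pick; estimate_seconds is a name invented here.

```shell
# Estimate spoken length in seconds for the text on stdin.
# Assumption: about 150 words per minute; pass another rate as $1 to override.
estimate_seconds() {
  _wpm="${1:-150}"
  # Strip the padding some `wc` implementations add around the count.
  _words=$(wc -w | tr -d '[:space:]')
  echo $(( _words * 60 / _wpm ))
}

# Usage:
#   estimate_seconds < script.txt        # default pace
#   estimate_seconds 130 < script.txt    # slower delivery
```

At the assumed pace, a 40-second read is roughly 100 words.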
Step 4 — What Claude does under the hood (so you can trust it)
1. create_asset_upload → PUT the image bytes to S3 → complete_asset_upload.
2. create_photo_avatar → your avatar trains (AI faces skip the consent gate; real people require create_avatar_consent, a browser approval).
3. list_voices → choose a warm, upbeat young voice (filter by gender/language).
4. create_video_from_avatar → Avatar IV, expressiveness: high, aspectRatio: 4:5, resolution: 1080p, your script + voiceId.
5. get_video polling until completed → grab video_url + thumbnail_url.
The step everyone forgets: HeyGen's MP4 lands at ~10 Mbps (≈50MB for 40s). Two-pass ffmpeg drops it to ~6.5MB with no visible loss on a talking head. Run the two passes as separate commands (on Windows, replace /dev/null with NUL):
ffmpeg -y -i in.mp4 -c:v libx264 -b:v 1150k -pass 1 -preset slow \
  -profile:v high -pix_fmt yuv420p -an -f mp4 /dev/null
ffmpeg -y -i in.mp4 -c:v libx264 -b:v 1150k -pass 2 -preset slow \
  -profile:v high -pix_fmt yuv420p -c:a aac -b:a 128k \
  -movflags +faststart out.mp4
Embed it click-to-play (poster + captions, preload="none", no autoplay) so the page stays fast.
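A click-to-play embed with those properties might look like the markup this sketch prints. It is one possible shape, not the kit's actual embed: the file names are placeholders, the captions are assumed to be a WebVTT file you supply, and playsinline is an addition of this example. The poster could be the thumbnail_url image saved from get_video.

```shell
# Print a click-to-play <video> snippet.
# Args: $1 video file, $2 poster image, $3 captions file (WebVTT).
# All three paths are placeholders; embed_snippet is a hypothetical helper.
embed_snippet() {
  cat <<EOF
<video controls preload="none" playsinline poster="$2">
  <source src="$1" type="video/mp4">
  <track kind="captions" src="$3" srclang="en" label="English" default>
</video>
EOF
}

# Usage:
#   embed_snippet jordan.mp4 jordan-poster.jpg jordan.vtt > embed.html
```

With preload="none" and no autoplay attribute, the browser shows only the poster until the visitor presses play, so the video file does not count against the initial page load.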
Avatar IV vs Avatar V
Avatar IV builds a talking, gesturing avatar from one image, which is what you want here. Avatar V is "cross-reference-driven" and needs video footage or multiple angles, so it suits digital-twin avatars, not a single generated still. For a one-photo spokesperson, Avatar IV at high expressiveness is the right engine.
Voice — and an honest limit
list_voices returns hundreds of voices; filter for a warm, upbeat, conversational young voice (the emotion-tagged ones like "Friendly" land best). Set speed near 1.0 so natural pauses can breathe. The honest catch: HeyGen's stock TTS can read a touch flat and monotone on longer scripts. For a homepage hero it's a fine place to start; for a polished v2, render the voice in ElevenLabs (more lifelike, real intonation, even laughs) and feed that audio to HeyGen as audio_url/audio_asset_id instead of a text script. Same avatar, far better delivery.
Scaling the same avatar (after the homepage video)
- UGC ads: one sentence of intent → a new 9:16 script → a new render. Same face, new message, weekly. Flag AI-generated content in the ad platform.
- Product showcases: a per-SKU template (intro → three features → CTA).
- Live AI concierge: the avatar's face on a chat widget powered by an LLM that knows your catalog and steers customers to high-ticket items: premium option first, but honestly.
Compliance
- Disclose the AI (Jordan's script does it out loud). Never claim to be human.
- AI-content flags ON for Meta/TikTok ads.
- Never clone a real person's face or voice without written consent — law, not policy. AI-generated faces are fine and skip HeyGen's consent gate.
Numbers to watch
| Metric | Healthy | Where |
|---|---|---|
| Web file size | < 8 MB per 40s | post-ffmpeg |
| Homepage perf score | unchanged (click-to-play) | Lighthouse |
| Cost per ad variant | < $10 once the avatar exists | HeyGen credits |
| Welcome video play rate | ≥ 15% of visitors | analytics |
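The file-size row comes down to simple arithmetic: size in kilobits divided by duration gives the total bitrate, and whatever the audio track does not use is left for video. The sketch below works that out and checks a finished file against a byte limit. It treats 1 kB as 1000 bytes and ignores container overhead; video_kbps and under_bytes are names made up for this example.

```shell
# Video bitrate (kbit/s) that fits a size budget.
# Args: $1 target size in kB (1 kB = 1000 bytes), $2 duration in seconds,
#       $3 audio bitrate in kbit/s. Integer math; container overhead ignored.
video_kbps() {
  echo $(( $1 * 8 / $2 - $3 ))
}

# Succeeds when the file ($1) is no larger than the byte limit ($2).
under_bytes() {
  _size=$(wc -c < "$1" | tr -d '[:space:]')
  [ "$_size" -le "$2" ]
}

# Usage:
#   video_kbps 6500 40 128           # 6.5 MB target, 40 s clip, 128k audio
#   under_bytes out.mp4 8000000 && echo "within the 8 MB budget"
```

For a 6.5 MB, 40-second clip with 128k audio this gives 1172 kbit/s, so the 1150k used in the ffmpeg command sits just under the budget, presumably leaving a little room for overhead.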
Week-one checklist
- Photoreal still generated, mouth unobstructed, saved as the canonical face
- HeyGen MCP connected + authenticated in your environment; session reloaded
- One-shot prompt run; avatar created and video rendered on Avatar IV
- MP4 compressed under 8MB (ffmpeg two-pass) and embedded click-to-play
- Captions attached; homepage performance unchanged
- Same face saved for the next ad
Troubleshooting
- MCP tools don't appear: you didn't reload after authenticating. Restart the session.
- "Needs Auth" won't clear: remove duplicate entries (check with claude mcp list), keep one at user scope, re-auth, reload.
- Avatar V unavailable: expected for a single-photo avatar; use Avatar IV.
- Video is 50MB: you skipped the ffmpeg pass. Never ship the raw HeyGen export.
- Voice sounds flat: swap to an ElevenLabs voice fed as audio (see Voice, above).
- Hands/mouth look off: regenerate the still front-on with the mouth fully visible.