AI has collapsed the cost of UGC ads: no casting calls, no creator negotiations, no week-long turnarounds. But teams that go fully synthetic keep hitting the same wall, so this guide covers the hybrid workflow that actually ships converting ads: AI for the talking head, real footage for the product.
What does a UGC ad actually consist of?
Strip any high-performing UGC ad down and you'll find the same skeleton:
- The hook (0-3s). A person looks into the camera and says something that stops the scroll: "I wish I'd found this app sooner."
- The product segment (3-20s). Hands-on proof: the app on a real screen, being scrolled, tapped, and used.
- The close (last 3-5s). The payoff line and the call to action.
AI is excellent at segments 1 and 3, the hook and the close. It is unreliable at segment 2, the product segment, where it warps hands and invents interfaces your app doesn't have. We covered the mechanics of that failure in why AI hands look fake in UGC ads. The workflow below assigns each segment to the tool that's actually good at it.
Step 1: Write the script around one pain point
One ad, one pain point. Write the hook as the first sentence a frustrated user would say out loud, not as marketing copy:
- "I almost gave up on cooking until I found this."
- "This app does my meal planning in 30 seconds."
- "Why did nobody tell me about this?"
Keep the full script under 30 seconds when read aloud. Write three to five hook variations now; you'll want them for testing later.
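To check the 30-second limit before you generate anything, a rough word-count estimate is usually enough. The sketch below assumes a conversational pace of about 150 words per minute. That rate is an assumption, not a measured value for any particular avatar voice, so adjust `WORDS_PER_MINUTE` after timing one real clip. The function names are illustrative.

```python
# Assumed speaking rate (words per minute). Tune this to your avatar's voice.
WORDS_PER_MINUTE = 150


def estimated_seconds(script: str, wpm: int = WORDS_PER_MINUTE) -> float:
    """Rough read-aloud duration of a script, in seconds."""
    word_count = len(script.split())
    return word_count * 60 / wpm


def fits_limit(script: str, limit_seconds: float = 30.0, wpm: int = WORDS_PER_MINUTE) -> bool:
    """True if the estimated duration is strictly under the limit."""
    return estimated_seconds(script, wpm) < limit_seconds


hook = "I almost gave up on cooking until I found this."
print(f"{estimated_seconds(hook):.1f}s, fits: {fits_limit(hook)}")
```

Run it over the full script for each hook variation; anything close to the limit is worth trimming before you spend avatar minutes on it.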
Step 2: Generate the talking head with an avatar tool
Feed your script to an avatar tool like Arcads, HeyGen, or Creatify. A few practical rules:
- Pick an avatar that matches your audience, not the most polished one. Slightly imperfect reads as more authentic.
- Generate in 9:16 if your ad runs on TikTok, Reels, or Shorts.
- Generate each hook variation as its own clip. Avatar minutes are cheap; testing data is not.
This gives you segments 1 and 3. Do not ask the avatar tool to show your app. That's the next step.
Step 3: Get real footage of your real app
For the product segment you need plain, honest footage: a real hand using your actual app on a real device, shot casually enough to pass as user-filmed. You have two options.
Film it yourself. Use a second phone as the camera, natural light, handheld with a bit of shake. Show real interactions: scrolling the feed, tapping through onboarding, completing the core action of your app. Avoid tripods and studio lighting; polish breaks the UGC feel.
Order it. If you don't have the device, the spare hands, or the time, UIHands shoots it for you: you send your App Store or Play Store link with a short brief describing the screens and gestures you want, and edit-ready footage of a real hand using your app lands in your dashboard within 24 hours. Packages start at $2.99 per clip.
Either way, match the aspect ratio of your final edit, and capture more footage than you think you need. Extra B-roll becomes extra ad variations for free.
Step 4: Cut it together
Any editor works; CapCut is the common choice for this format. The assembly:
- Open with the AI hook clip.
- Cut to the real app footage as soon as the script mentions the product. Let it breathe for several seconds; this is your proof.
- Cut back to the avatar for the close, or run the CTA as text over the footage.
- Add auto-captions. Many viewers watch the feed with the sound off.
The hybrid cut feels natural because real UGC ads already jump between talking heads and product demos. Nobody expects a continuous single take.
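If you would rather script the assembly than do it by hand in an editor, the same hook, product, close order can be expressed as an ffmpeg command. The sketch below only builds the command, using ffmpeg's `concat` filter. It assumes all clips share the same resolution and frame rate and that each one has an audio track (silent app footage would need a voiceover or a silent track added first). The file names are placeholders, and captions are not handled here.

```python
import shlex
from typing import List


def build_concat_command(clips: List[str], output: str) -> List[str]:
    """Build an ffmpeg command that joins clips in order with the concat filter."""
    if len(clips) < 2:
        raise ValueError("need at least two clips to concatenate")

    cmd = ["ffmpeg"]
    for clip in clips:
        cmd += ["-i", clip]

    # Each input contributes one video and one audio stream, in playback order.
    inputs = "".join(f"[{i}:v][{i}:a]" for i in range(len(clips)))
    filter_graph = f"{inputs}concat=n={len(clips)}:v=1:a=1[v][a]"

    cmd += ["-filter_complex", filter_graph, "-map", "[v]", "-map", "[a]", output]
    return cmd


# Placeholder file names: AI hook, real app footage, AI close.
command = build_concat_command(["hook.mp4", "app_demo.mp4", "close.mp4"], "ad_v1.mp4")
print(shlex.join(command))
```

Printing the command rather than running it lets you inspect it first; pass the list to `subprocess.run` once it looks right.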
Step 5: Test hooks, not footage
When the ad is live, iterate where iteration is cheap: swap hook variations against the same product footage. The hook decides whether anyone watches; the footage decides whether they believe you. One solid set of real app B-roll typically supports many rounds of hook testing before it fatigues.
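One way to keep a hook test organized is to generate the variant list up front, holding the product footage and close constant so the hook is the only thing that changes. This is a minimal sketch with hypothetical clip names and a made-up naming scheme; it does not talk to any ad platform.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AdVariant:
    name: str
    hook_clip: str
    product_clip: str
    close_clip: str


def build_hook_tests(hook_clips: List[str], product_clip: str, close_clip: str) -> List[AdVariant]:
    """One variant per hook; product footage and close stay fixed."""
    return [
        AdVariant(
            name=f"hook_{i:02d}",
            hook_clip=hook,
            product_clip=product_clip,
            close_clip=close_clip,
        )
        for i, hook in enumerate(hook_clips, start=1)
    ]


# Hypothetical file names for three hook variations.
variants = build_hook_tests(
    ["hook_a.mp4", "hook_b.mp4", "hook_c.mp4"], "app_demo.mp4", "close.mp4"
)
for v in variants:
    print(v.name, v.hook_clip, v.product_clip, v.close_clip)
```

Because only `hook_clip` differs between variants, any difference in results can be attributed to the hook rather than to the footage.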
The short version
Generate the face, film the product, cut them together. You get AI's speed and cost on the segments AI is good at, and you never expose the parts it fails at. If you want the product footage handled for you, send us a brief and it's in your dashboard tomorrow.