Caught in the stands.

One fan in an oversized yellow jersey. A hair tuck, a shy half-smile, a packed stadium crowd behind her. The caption: “Comment ‘Baseball’ for the tutorial.” It looks pulled from a live broadcast. It isn’t.

Picsart recreated this trend in Flow – and the result is already shocking people in the comments.

Two renders make the whole thing – a still, then an animation – both on the same canvas. The viral name is the “Baseball Flow” trend. The finished reel runs 5-10 seconds at 9:16, photoreal, with a fake scoreboard and station logo dropped on top.

What is the Baseball Fan trend?

A short reel that looks like a live stadium broadcast catching a “beautiful fan” in the stands – fully synthetic, made on one canvas inside Picsart Flow.

Generate the master image with a photoreal node. Pipe it into a Kling Image-to-Video node. Drop a scoreboard graphic node on top. One canvas, one export.
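For readers who think in graphs, the canvas described above can be sketched as a small dependency map. This is only an illustration: the step names are made-up labels, not Picsart Flow identifiers, and "export" is the final step rather than a node.

```python
from graphlib import TopologicalSorter

# Hypothetical labels for the canvas steps (not Picsart Flow names).
# Each step maps to the set of steps it needs as input.
CANVAS = {
    "master_image": set(),                     # photoreal still
    "kling_image_to_video": {"master_image"},  # animation driven by the still
    "scoreboard_graphic": set(),               # overlay, generated on its own
    "export": {"kling_image_to_video", "scoreboard_graphic"},  # final step, not a node
}


def run_order(graph):
    """Return the steps in an order where every input exists before it is used."""
    return list(TopologicalSorter(graph).static_order())


order = run_order(CANVAS)
print(order)
```

The point of the sketch is the dependency shape: the still has to exist before the animation, while the scoreboard graphic can be generated at any time before export.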

 


Why it’s hitting right now

  • “Comment ‘Baseball’ for the tutorial” – the DM-bot hook is rocketing comment volume and reach
  • Baseball is dominating sports TikTok – bucket hats, chants, full stadium energy
  • “Caught on camera” is the format of the year – couples, fans, anything candid is winning
  • The debate is the engagement – comment sections fight over whether it’s AI, which keeps the clip alive

The fan moments worth catching

  • The hair tuck – oversized yellow jersey, hair flip, shy half-smile (the trend's signature moment right now)
  • The scream – mouth wide open, beer halfway up
  • The heartbreak – hands on head, team just blew it
  • The chant – row of fans synced, fists up, jerseys matching
  • The bucket hat trio – three friends, matching stadium fits
  • The wide-eyed kid – foam finger, oversized jersey, watching off-frame

How to make a Baseball Fan reel in Picsart Flow

Step 1: Generate the master image

Open Picsart Flow and start a blank canvas. In the Prompt Panel, pick a photoreal image model – Nano Banana 2 handles stadium lighting and skin tones cleanly.

Step 1 prompt direction

Describe these elements in this order:

  • The shot type – “high-resolution candid fan-cam style photo,” broadcast feel
  • The subject – age, hair (color, length, texture), makeup (specific, light-handed – eyeliner wing, soft tones), outfit (oversized jersey, team color, trim color, anything that reads as Korean ballpark)
  • The expression – neutral, glancing off-camera, micro-smile, shy. Avoid “looking directly at camera”
  • The setting – “stands at a Korean baseball game,” stadium lighting overhead, packed crowd softly blurred behind
  • The look – “photorealistic, 8k, shot on a professional broadcast camera, cinematic lighting, shallow depth of field”
  • The crop – portrait orientation, vertical, head and torso framed

Sample prompt:

A high-resolution, candid “fan cam” style photo of a beautiful young woman sitting in the stands at a Korean baseball game. She has long, dark reddish-brown hair, soft makeup with slight eyeliner wings, and is wearing an oversized bright yellow baseball jersey with black trim. She is looking slightly off-camera with a neutral expression. The background is a slightly blurred crowd of people at a stadium with stadium lighting. Photorealistic, 8k, shot on a professional broadcast camera, cinematic lighting, shallow depth of field.
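If you plan to generate several variations, it can help to keep the six elements separate and join them in the recommended order. The sketch below is one way to do that in plain Python; `StillPrompt` is a hypothetical helper, not part of Picsart Flow, and the rendered text still has to be pasted into the Prompt Panel by hand.

```python
from dataclasses import dataclass, fields


@dataclass
class StillPrompt:
    """Hypothetical container; field order mirrors the element order listed above."""
    shot_type: str
    subject: str
    expression: str
    setting: str
    look: str
    crop: str

    def render(self) -> str:
        # Join the non-empty parts in declaration order, one sentence per element.
        parts = [getattr(self, f.name).strip().rstrip(".") for f in fields(self)]
        return " ".join(p + "." for p in parts if p)


prompt = StillPrompt(
    shot_type="A high-resolution, candid fan-cam style photo",
    subject=("A young woman with long, dark reddish-brown hair, soft makeup with "
             "slight eyeliner wings, wearing an oversized bright yellow baseball "
             "jersey with black trim"),
    expression="She is looking slightly off-camera with a neutral expression",
    setting=("She sits in the stands at a Korean baseball game, the crowd behind "
             "her slightly blurred under stadium lighting"),
    look=("Photorealistic, 8k, shot on a professional broadcast camera, "
          "cinematic lighting, shallow depth of field"),
    crop="Portrait orientation, vertical, head and torso framed",
)
print(prompt.render())
```

Swapping one field, such as the jersey color in `subject`, gives a new variation without disturbing the order of the rest.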

Generate 6-8 variations on the canvas. Pick the one where the eyes read alive and the jersey + lighting both feel right. That’s the master image.

Step 2: Animate it with Kling Image-to-Video

Stay on the same canvas. Add a Kling Image-to-Video node and pipe the master image into it. Pick Image-to-Video V3 (best face and hair physics) or V2.6 (faster).

Step 2 motion prompt direction

Describe these elements:

  • Primary motion – one micro-action only (“slowly raises her hand to tuck a strand of hair behind her ear”)
  • Secondary motion – “glances around the stadium curiously, then looks toward the camera, gives a shy small smile”
  • Body – “natural body sway, realistic hair movement, no stiff posture”
  • Background – “stadium atmosphere remains stable, crowd stays soft and out of focus”
  • Camera – “subtle handheld drift, no zoom, no cut, single broadcast-style shot”
  • Length and format – “5-7 seconds, 9:16, photoreal, high-quality broadcast TV aesthetic”

Sample prompt:

The woman in the yellow jersey slowly raises her hand to tuck a strand of hair behind her ear. She glances around the stadium curiously, then looks toward the camera and gives a shy, charming smile and a small scrunch of her nose. Natural body swaying, realistic hair movement, stadium atmosphere in the background, high-quality broadcast TV aesthetic.
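The sample prompt above leaves out the camera and length lines from the direction list. If you want all six elements in every motion prompt, a small check like the one below can catch a missing piece before you render. It is a sketch in plain Python; `build_motion_prompt` and its keys are hypothetical names, not Kling or Picsart parameters.

```python
# Element keys in the order the direction list gives them (hypothetical names).
REQUIRED = ["primary", "secondary", "body", "background", "camera", "format"]


def build_motion_prompt(elements: dict) -> str:
    """Join the six motion elements in order; refuse to build if one is missing."""
    missing = [k for k in REQUIRED if not elements.get(k, "").strip()]
    if missing:
        raise ValueError("missing motion elements: " + ", ".join(missing))
    return " ".join(elements[k].strip().rstrip(".") + "." for k in REQUIRED)


motion = build_motion_prompt({
    "primary": ("The woman in the yellow jersey slowly raises her hand to tuck "
                "a strand of hair behind her ear"),
    "secondary": ("She glances around the stadium curiously, then looks toward "
                  "the camera and gives a shy small smile"),
    "body": "Natural body sway, realistic hair movement, no stiff posture",
    "background": "Stadium atmosphere remains stable, crowd stays soft and out of focus",
    "camera": "Subtle handheld drift, no zoom, no cut, single broadcast-style shot",
    "format": "5-7 seconds, 9:16, photoreal, high-quality broadcast TV aesthetic",
})
print(motion)
```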

Higher realism: use Video Reference

The Kling Image-to-Video node has a Video Reference input. Drop in a short clip of a real person doing the exact movement, and the model maps that motion onto the master image instead of inventing it. The result tracks tighter, especially around hands and faces.

Set Motion Strength to 4-5. Higher melts the fingers into the hair.
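If you keep your settings in notes or a script, the 4-5 window can be written down as a tiny check. This is only a sketch: it assumes Motion Strength is a plain number, and `check_motion_strength` is a hypothetical helper, not a Picsart or Kling function.

```python
def check_motion_strength(value: float) -> str:
    """Classify a Motion Strength value against the 4-5 window recommended here."""
    if value < 4:
        return "too low: the clip loses its candid feel"
    if value > 5:
        return "too high: fingers risk melting into the hair"
    return "ok"


for setting in (3, 4, 5, 7):
    print(setting, "->", check_motion_strength(setting))
```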

Step 3: Add the broadcast overlay

Stay on the canvas. Add a second Prompt Panel node and generate the scoreboard graphic as a still – “broadcast scoreboard graphic, team color blocks, score and inning visible, station logo top corner, clean broadcast layout.” Layer the graphic node over the clip node on the canvas.

Keep it subtle – “end of an inning” energy, not graphic-design energy.

Export the canvas at 9:16 and post. Use the caption “Comment ‘Baseball’ for the tutorial” if you want the engagement hook.
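A 9:16 frame means the height is 16/9 of the width. If you want to double-check an export size outside the app, the arithmetic can be sketched as below; the helper names are hypothetical, and the pixel sizes are common examples rather than values the tutorial specifies.

```python
from fractions import Fraction

TARGET = Fraction(9, 16)  # width : height for a vertical reel


def is_vertical_9_16(width: int, height: int) -> bool:
    """True when width:height reduces exactly to 9:16."""
    return Fraction(width, height) == TARGET


def height_for_width(width: int) -> int:
    """Height that gives an exact 9:16 frame for the given width."""
    height = Fraction(width) * 16 / 9
    if height.denominator != 1:
        raise ValueError(f"width {width} has no whole-pixel 9:16 height")
    return int(height)


print(height_for_width(1080))          # 1920
print(is_vertical_9_16(1080, 1920))    # True
print(is_vertical_9_16(1920, 1080))    # False (that is 16:9, landscape)
```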

Pro tips for realism

  • Motion Strength at 4-5. Higher melts the fingers. Lower kills the candid feel.
  • One micro-motion only. A hair tuck. A glance. A small smile.
  • Video Reference > pure prompt. The reference clip wins for realism every time.
  • Crowd stays soft. Sharp background faces are where the AI tell shows up first.
  • Eye direction sells the candid. Look slightly off-frame for most of the clip, glance once to the lens.

The tells to fix before you post

  • Scoreboard logic – if you invent player names, check the matchup is plausible. Viewers spot a retired player's name in seconds.
  • Text on signs – cheering sticks, banners, jersey numbers often come out as gibberish. Crop or blur.
  • Hands and hair – finger-in-hair is the highest-risk gesture. Keep Motion Strength 4-5 and use Video Reference.
  • Liquids – beer splashes look wrong. Keep cups still in the prompt.
  • Mascot fur – it morphs. Keep mascots in the background only.

Pick your fan. Animate the moment. Press play.

One canvas. Three nodes. One export.

Picsart Flow – photoreal image node for the still, Kling Image-to-Video node for the animation, scoreboard graphic node for the broadcast layer.

Roll the cam.