
Step 1: Sign Up & Choose Your Mode
Create a free Elser AI account. In the video model selector, choose Wan 2.7 and select your generation mode: Text-to-Video, Image-to-Video, Reference-to-Video, or Video Editing.
Wan 2.7 is Alibaba's latest AI video generation suite from Tongyi Wanxiang, released in April 2026. A single model with four generation modes — text-to-video, image-to-video, reference-to-video, and video editing — it pairs a signature Thinking Mode that interprets your intent before rendering with native audio-visual synchronization and up to 5-subject reference tracking. Available now on Elser AI.
Most AI video tools rush into generation the moment you hit the button. Wan 2.7's Thinking Mode takes time to interpret your true intent before rendering — acting more like a co-director than a blind creative machine. You gain greater control, higher creative consistency, and fewer retries.

Wan 2.7 isn't a single-purpose tool but a complete creative workflow in one model — Text-to-Video, Image-to-Video, Reference-to-Video, and Video Editing. Generate, reference, extend, and edit without switching models or leaving your pipeline.
Wan 2.7 generates synchronized video and audio (dialogue, ambient sound, sound effects, and background music) in a single unified pass. Phoneme-level lip sync keeps characters' mouth movements matched to their speech, eliminating the need for post-production dubbing.


Write a descriptive prompt. Wan 2.7's Thinking Mode understands natural language, so there's no need for overly engineered prompts. For multi-subject consistency, switch to Reference-to-Video (R2V) mode and upload up to 5 reference images (for appearance) and, optionally, an audio reference (for voice).

Choose duration (2 to 15 seconds), resolution (720p or 1080p), and aspect ratio (16:9, 9:16, 1:1, 4:3, or 3:4). Enable first/last frame if you need precise endpoints, then generate and export as MP4 with a synchronized audio track.
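The choices in this step can be written down as a small checklist. The sketch below is illustrative only: Elser AI is used through its interface, so `GenerationSettings` and its field names are hypothetical and not part of any Elser AI or Wan 2.7 API. Only the limits themselves (2 to 15 seconds, 720p or 1080p, five aspect ratios) come from the text.

```python
from dataclasses import dataclass
from typing import List

# Limits copied from the step above. The class and field names are
# hypothetical; they are not an Elser AI or Wan 2.7 API.
ALLOWED_RESOLUTIONS = ("720p", "1080p")
ALLOWED_ASPECT_RATIOS = ("16:9", "9:16", "1:1", "4:3", "3:4")
MIN_SECONDS, MAX_SECONDS = 2, 15


@dataclass(frozen=True)
class GenerationSettings:
    duration_s: int
    resolution: str
    aspect_ratio: str
    use_first_last_frame: bool = False  # optional endpoint control

    def problems(self) -> List[str]:
        """Return readable issues; an empty list means the settings fit the stated limits."""
        issues = []
        if not MIN_SECONDS <= self.duration_s <= MAX_SECONDS:
            issues.append(
                f"duration must be {MIN_SECONDS} to {MAX_SECONDS} seconds, got {self.duration_s}"
            )
        if self.resolution not in ALLOWED_RESOLUTIONS:
            issues.append(f"resolution must be one of {ALLOWED_RESOLUTIONS}, got {self.resolution!r}")
        if self.aspect_ratio not in ALLOWED_ASPECT_RATIOS:
            issues.append(f"aspect ratio must be one of {ALLOWED_ASPECT_RATIOS}, got {self.aspect_ratio!r}")
        return issues
```

For example, `GenerationSettings(10, "1080p", "9:16").problems()` returns an empty list, while a 20-second 4K request at 21:9 would report three issues.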
The single feature I'm most excited about is the 3×3 image-to-video mode. It accepts 9 reference images as a nine-grid input — multi-angle references, sequential poses, scene variants. The composition is richer, and the drift is greatly reduced.
Wan 2.7 is what finally made AI video viable for client work. The character consistency across 5 references is insane — no more faces warping between shots. I can deliver multi-character short dramas without a production crew.
I used to spend hours syncing dialogue and searching for ambient tracks. Wan 2.7 does it in one generation. My turnaround time dropped by more than half.
The Thinking Mode is a game-changer. Instead of wrestling with prompts for 20 minutes, I just talk to it like a human. It actually gets what I mean on the first or second try.
Wan 2.7 is Alibaba's latest AI video generation suite from Tongyi Wanxiang, released in April 2026. It is a single model with four generation modes — text-to-video, image-to-video, reference-to-video, and video editing. Its signature Thinking Mode interprets your intent before generation, making AI more like a creative partner than a blind tool.
Four key differentiators. First, Thinking Mode: the model plans your scene before rendering rather than generating blindly. Second, a full creative pipeline: generate, edit, reference, and extend, all in one suite. Third, industry-leading 5-subject reference tracking: consistent appearance and voice across up to 5 characters. Fourth, instruction-based editing: modify existing videos with natural language instead of regenerating from scratch.
Yes. Elser AI offers trial credits for new users. Upgrade to a paid plan for higher resolutions, priority queue, and full commercial rights.
Wan 2.7 supports video durations from 2 to 15 seconds at 24 fps. Resolutions are 720p and 1080p. Aspect ratios include 16:9, 9:16, 1:1, 4:3, and 3:4. For 4K output, use Wan 2.7-Image-Pro (image only).
Yes. Wan 2.7 generates synchronized video and audio — dialogue, ambient sound, sound effects, and background music — in a single pass. Phoneme-level lip sync ensures characters' mouth movements match their speech naturally.
In reference-to-video mode, Wan 2.7 supports up to 5 simultaneous character references — the highest in the industry — locking both appearance and voice. In image-to-video mode, it accepts a 3×3 grid layout of 9 reference images for structured multi-angle composition.
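To make the two limits concrete, here is a minimal sketch of how reference inputs could be organized before upload. The function names are hypothetical, and the row-major ordering of the nine-grid (left to right, top to bottom) is an assumption; the text only states the counts (up to 5 references in reference-to-video, 9 images in a 3×3 grid for image-to-video).

```python
from typing import Dict, List, Tuple

MAX_R2V_REFERENCES = 5  # reference-to-video limit stated above
GRID_SIDE = 3           # 3x3 nine-grid for image-to-video


def check_r2v_references(images: List[str]) -> None:
    """Raise ValueError if more references are supplied than the stated limit."""
    if len(images) > MAX_R2V_REFERENCES:
        raise ValueError(
            f"at most {MAX_R2V_REFERENCES} references are supported, got {len(images)}"
        )


def nine_grid(images: List[str]) -> Dict[Tuple[int, int], str]:
    """Map exactly 9 images to (row, col) cells. Row-major order is an assumption."""
    expected = GRID_SIDE * GRID_SIDE
    if len(images) != expected:
        raise ValueError(f"the nine-grid needs exactly {expected} images, got {len(images)}")
    # divmod(i, 3) yields (row, col) for the i-th image in row-major order.
    return {divmod(i, GRID_SIDE): image for i, image in enumerate(images)}
```

Laying the images out this way makes it easy to check, for instance, that each row holds one angle or one pose sequence before the grid is submitted.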
Wan 2.7 (Video) is for video generation — one model with four generation modes covering text-to-video, image-to-video, reference-to-video, and video editing. Wan 2.7-Image is a separate image generation model with deep personalization, color-palette control, advanced text rendering, and a Pro version with 4K output. Both are available through Elser AI — use Wan 2.7-Image for static visuals and Wan 2.7 (Video) for motion content.
Be descriptive but natural — Thinking Mode understands natural language, so you don't need over-engineered prompts. Include camera movements (tracking shot, dolly zoom, pan), lighting conditions (golden hour, soft diffused light), mood/tone, and an audio description. Wan 2.7 also supports structured multi-shot prompts when you want precise shot-by-shot control.
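One way to keep those ingredients consistent across prompts is a small helper that assembles them into plain sentences. This is a sketch under assumptions: `build_prompt` is a hypothetical helper, and the "Camera:" and "Lighting:" labels are just a readable convention, not a documented Wan 2.7 prompt syntax.

```python
from typing import Optional


def build_prompt(subject: str,
                 camera: Optional[str] = None,
                 lighting: Optional[str] = None,
                 mood: Optional[str] = None,
                 audio: Optional[str] = None) -> str:
    """Join the optional details into one plain-language prompt, skipping any left out."""
    sentences = [subject.strip().rstrip(".") + "."]
    labelled = (("Camera", camera), ("Lighting", lighting), ("Mood", mood), ("Audio", audio))
    for label, value in labelled:
        if value:
            sentences.append(f"{label}: {value.strip()}.")
    return " ".join(sentences)


prompt = build_prompt(
    "A fox crosses a frozen lake",
    camera="slow tracking shot",
    lighting="golden hour",
)
```

The resulting text can be pasted into the prompt box as is; any detail you leave out is simply omitted from the sentence list.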
Pricing varies by mode and resolution. Through Elser AI, we offer simplified usage-based plans — check the platform for current pricing and free trial availability.
Elser AI has integrated Wan 2.7 alongside other leading video models including Seedance, Kling, and Veo. Sign up, select Wan 2.7 from the model selector, choose your generation mode (text-to-video, image-to-video, reference-to-video, or video editing), enter your prompt or upload references, and start generating. No API keys or complex infrastructure are required.
1080p at 24 fps with cinematic camera movement, smooth motion dynamics, native audio-visual sync, and strong character consistency.
Sign up on Elser AI and unlock Wan 2.7 — one model with four generation modes, Thinking Mode, and native audio sync. Generate professional cinematic videos instantly, no skills required, no GPU needed.