There is a corridor in Lost Garden, the dark-fantasy anime series I am building solo, where the main character walks toward a sound she cannot place. Torchlight. Wet stone. Her face was half in shadow. On paper, it is one of the simplest shots in the whole sequence. A character walks down a hall.
I rebuilt that scene three times. Not three takes. Three full rebuilds, across the better part of three weeks. And here is the part nobody tells you when they sell you on AI filmmaking: generating the shots was the fastest, cheapest, least painful thing I did the entire time.
I want to walk through exactly what happened, because the breakdown of where my hours went is the most useful thing I can hand another AI filmmaker. The numbers were not where I expected them.
The corridor sequence is about forty seconds of finished screen time. Six shots. A wide establishing the hall, a medium of her walking, two inserts (the torch, her hand on the wall), a close-up as she stops, and a reverse on whatever she is walking toward.
For the first version, I did what almost everyone does on their first real project. I opened a generator and started prompting. I had the script. I had a mood in my head. I figured I would describe each shot, pick the good takes, and cut them together.
The individual clips came out fast, and they looked great. That is the trap. A single shot from a current model can look like a frame from a real film. The problem is never one-shot. The problem is the second shot and the third.
The first version fell apart not because any clip was bad, but because no two clips agreed on what they were showing.
Her hair changed length between the wide and the medium. The corridor was three torches wide in one shot and a narrow passage in the next. Her eyes drifted from amber to grey. The torch flame jumped sides. Stitched together, it did not read as one place or one person. It read as six beautiful postcards from six different films.
I had spent maybe an afternoon generating and two days trying to cut it into something coherent. I could not. So I threw it out.
Before I describe the rebuilds, here is the honest accounting for that one forty-second scene, version one through version three. I logged it because I did not believe my own gut estimate.
That top line is the whole lesson. The cheap, fun, marketable part of AI filmmaking is about a sixth of the real job. Everything else is decisions and bookkeeping.
Where my time actually went on one 40-second scene, across three rebuilds.
The discard rate alone surprised me, and it turns out it is normal. Working filmmakers using these tools commonly generate twenty to fifty variations per shot before they keep one, and a typical short can mean eighty to a hundred and fifty individual clips to assemble a final cut. I was not doing it wrong. I was doing the normal thing, which is to say I was drowning in selects.
For the second version, I fixed the most obvious failure first. The character.
This is the part of the toolchain that has genuinely matured. Instead of re-describing her face in every prompt and praying, you train a reusable identity from reference images. Higgsfield’s Soul ID is the clearest example I use: you upload a handful of reference photos, including varied angles and at least one full-height shot, it trains a persistent identity in a few minutes, and that character becomes available across your image and video generations with a stable face, skin tone, and proportions.
So, version two had a consistent protagonist. Her face held across all six shots. I thought I had solved it.
I had solved one of four problems. The corridor was still drifting. The torch was still jumping. The palette warmed and cooled depending on which take I kept. The character was an island of consistency floating in a world that kept rebuilding itself shot to shot. Tools like Runway’s Gen-4 have made real progress on holding a look across shots, but if you need the same thing to appear identically across a dozen scenes, you are still iterating and cherry-picking. Consistency is not a setting you flip on. It is a discipline you carry.
Version two was better and still unusable. The seams had just moved.
The third version is the one that worked, and the change was not a better prompt or a better model. It was a sequence. I built every fixed thing before I generated a single final shot.
Then, and only then, I generated. The discard rate dropped because the model had less room to invent. The continuity work dropped because there was less to reconcile. The scene held together on roughly the second assembly instead of never.
The fix was sequence: lock character, world, palette, and grammar before generating.
Generation is the part that got cheap. Continuity is the part that never did, and you pay for it either up front in planning or afterward in fixing. There is no version where you do not pay.
The reason I now treat that planning layer as the actual product is simple. The plan is the only thing in an AI pipeline that has a memory. The models do not. Each clip is generated fresh, with no knowledge of the last one. The continuity has to live somewhere outside the model, and if it does not live in a deliberate, written-down plan, it lives nowhere, which means it lives in your head until your head loses it three weeks in.
In an AI pipeline, the written-down plan is the only thing that remembers. The models do not.
That gap is exactly what I am building ScreenWeaver to close: a place where the character, the world, the palette, and the rules of a project are held as the source of truth, so the plan is a real thing the production runs against instead of a memory that drifts. Version three of the corridor was the first time I ran a scene that way on purpose. It is the only reason it shipped.
If I could hand the three-weeks-ago version of me one note, it would not be about prompting. It would be this:
The promise of AI filmmaking is real. One person genuinely can make something that used to take a crew. But the promise that gets sold and the work that actually fills your week are two different things. The generation is the magic in the demo. The case study is everything around it.
Lost Garden is still in production. The corridor scene is done. It took three rebuilds to learn that the rebuilds were never about the shots.