This is an impressively in-depth look at stereo conversion using several different projects as case studies.
Art of Stereo Conversion: 2D to 3D – 2012
May 8, 2012
Source: FX Guide
Written By: Mike Seymour
Stereo conversion, or dimensionalization as it is sometimes called, is the process of making stereo images from non-stereo traditional 2D images. We originally published an Art of Stereo conversion two years ago, and this is a new updated version of that story covering The Avengers, Titanic, John Carter, Star Wars: Episode I, plus the newest techniques and approaches.
Many people argue that if you want a film in stereo you should shoot it in stereo. Yet many filmmakers do not want the physical size of an on set stereo rig, or they prefer to shoot film (ruling out stereo in camera, effectively) or want to use anamorphic lenses which are extremely difficult for stereo native capture. Even on a stereo film, lenses and situations often render a single camera the only viable solution. Thus, even on films shot in stereo there may well be a need to convert some footage, and high quality conversion is an important tool in the box of any effects house.
Stereo conversion is also needed for converting older films – such as Top Gun or the Star Wars franchise. John Knoll (ILM) oversaw the stereo conversion of the Star Wars films for director George Lucas starting with Star Wars – Episode One: The Phantom Menace and you can watch our detailed chat with John on that process here. Below his comments are included as part of our special case studies on major feature films.
3D animated films, such as Toy Story 3, UP, Brave, Tangled and others find it easy to correctly generate stereo imagery from either stereo renders such as RenderMan or just rendering the entire scene from two similar but offset virtual cameras. For everything else live action, normal stereo production is hard. Even Avatar required tiny amounts of stereo conversion. For example, the opening macro eyeball shot was far too close for a stereo camera rig to film, and this first shot of the film was converted stereo from 2D.
There are several leading companies in this area today such as Prime Focus, who after a battering of critical opinion on the conversion of Clash of the Titans, recently converted Star Wars Episode 1 and the new Wrath of the Titans. Stereo D, another leading company, converted Titanic, while Legend3D worked on Transformers: Dark of the Moon. Yet another significant company is Cinesite in London who not only delivered stunning visuals for John Carter but also had to deal with converting the film from anamorphic 35mm to digital stereo.
The problems with generating a second view or second ‘eye’ for stereo conversion are:
• The parallax effect. That means a second eye will see around things the original eye won’t and thus there is missing background information to be replaced.
• A depth map is needed of the scene to determine the correct distribution of the objects for the second eye. While amazing work has been done with programs such as Ocula by the Foundry, and Mistika by SGO (see below), the process is far from automated.
• Cardboard cutouts and the need for roto. Not only is a roto required for the outline of any character in shot, if they are closer than say a wide shot, internal mattes are also required to generate different depths for different parts of their bodies. A character could easily have 7 rotos in addition to their outline for features such as nose, eyes etc and all of these must be conceptually and logically correctly placed based on z depth.
• Projection. While some shots respond well to re-projecting or ‘camera mapping’ the mono footage over 3D models and then filming the stereo by rendering the 3D scene from two virtual cameras, this rarely works well for people in movement, as the difficulty of generating accurate 3D models to map onto renders the approach extremely expensive. The technique works well for buildings, hallways and other regular and mostly rigid body solutions but most films are about people – normally with soft edges such as hair etc.
• Shot design. Most stereo films are shot with an understanding and consideration of the stereo nature of the experience. A mono film may be poorly composed from a stereo point of view. For example, most staging of a stereo scene would have a distribution of the objects over the immediate foreground, avoiding the clumping that may seem odd when all the props are at the back of a room. In mono this shot may be extremely creatively valid; it is only when the shot is converted that it seems empty or oddly distributed.
• Singular depth resolution. One of the trickiest problems can be, say, glasses on a live action character. While the glasses have a depth from the camera, the eyes behind them are further away. If incorrectly dealt with, the eyes of the character would appear to be printed on the glasses, not behind the glasses. But conversely the reflections on the glasses should not appear to be drawn on the actual eyes of the character. Add to this that one tends to focus on the eyes of a character in a close up, and one could guess that this is perhaps a contributing factor as to why there was a scheduling concern that led to a film starring a certain famous glasses wearing boy not being stereo converted recently. Similarly, hair is extremely complex and in a single close up may require extremely complex and fine roto work.
• Floating windows and stereo budget. As an object can be both simultaneously forward of the cinema screen and cut off as if behind it, a technique was developed some time ago to place objects forward of the screen – closer to the audience than the cinema screen, but with a false or floating ‘fake’ edge window even further forward. This allows some of the depth budget of the space between the audience and screen to be used, but not violate the edge paradox since the object or actor is in the cinema space but the screen appears closer still and thus normal.
Key to good conversion
• A strong working relationship with the overall supervisors.
• A strong relationship with the vendors, so that relevant files such as keys and mattes can be provided.
• Pre-production planning. As with all vfx work, the earlier the conversion company can be involved the better.
• Time. Given the huge volume of work, ensuring the schedule allows time for roto and volumizing is vital.
• Speed of camera moves, framing and staging can all affect conversion. If you can isolate the key stereo moments in the script and shoot those knowing how they will be converted, you can produce a more impactful final conversion.
• Good asset management and asset tracking.
• If Clash of the Titans taught the industry any overall lesson, it is that stereo conversion should not be a process ‘tacked’ on the end of a film’s production. Sometimes a film is converted years after principal photography, but if the conversion team can be involved before principal photography it can aid the creative, schedule and the budget.
• Make a good film in stereo not a good stereo film.
Specialist tools developed by dedicated teams:
There are five main companies with specialist tools, doing large scale feature film conversion. This is not to exclude smaller companies, many of whom do great work, but there are five big players that regularly dominate film credits in stereo conversion.
Below are case studies showing how they work and the quite marked divergence in their solutions.
• Cinesite: John Carter
• Stereo D: Titanic
• Legend3D: Transformers: Dark of the Moon
• Prime Focus: Star Wars Episode 1: Phantom Menace
• In-Three / Digital Domain
(Cinesite is listed here but it is unclear if they will become a general stereo conversion company or if the John Carter stereo work was a specific add on to a much larger effects project, in the same way ILM, Weta and others have very strong in-house stereo teams).
CASE STUDY ONE: John Carter – Cinesite
Director Andrew Stanton decided to shoot John Carter in mono on 35mm, but the film was to be released in stereo. Nearly all stereo releases are filmed digitally and most that are going to be converted at least try and shoot digitally, and for good reason. It is hard enough to successfully convert a film from mono to stereo, but it is made even more complicated if the production was shot on 35mm, and really difficult if it is shot with anamorphic lenses.
Cinesite led the stereo conversion of the film, as well as being one of the major visual effects vendors (along with Double Negative). Cinesite is one of the leading visual effects companies in the world, and John Carter was one of the largest budget films of all time, so there was tremendous pressure to produce and define the state of the art in terms of stereo conversion. And then, not to add any pressure to the project, but while Cinesite had done a few shots and odd pieces of stereo conversion for Harry Potter, they had never stereo converted a whole film.
As it turned out, this was a gift, as Cinesite built an entirely new and accurate stereo conversion pipeline virtually from scratch under the careful supervision from Scott Willman, Cinesite’s conversion supervisor for the project, and thanks to the great contribution from Gregory Keech who developed all key conversion tools.
“We got an opportunity to take a fresh look at how we would structure the pipeline in general and it was a long enough project to give us time to plan ahead and focus on how to produce the best quality we could possibly achieve,” explained Michele Sciolette, head of VFX technology at Cinesite, who walked fxguide through a couple of key shots in micro detail to illustrate Cinesite’s commitment to accuracy and quality in their stereo conversion work.
The basic approach Cinesite took for John Carter was to make sure that every 3D/stereo conversion set-up was really grounded in the correct spatial volume. “For each shot, our process started by tracking the camera and building geometry for all static and moving objects,” Sciolette outlines. “This ensured that our stereo conversion was based on spatially correct data. We also did a traditional 2D rotoscope of every element to make sure we had detailed edge information. Our extensive set of proprietary tools processes all this data to perform the actual conversion and really go the extra mile resulting in a workflow that has all the advantages of being solidly grounded in 3D space together with the accuracy and flexibility that come with image based techniques.”
If there was ever a part of VFX where ‘God is in the detail’ it is in the area of stereo conversion. Below we set out the detailed steps involved, including the challenges and difficulties.
Using this shot of New York is a good example to discuss, since a nearly fully animated 3D shot would open itself to other approaches, including just re-rendering a second eye. For Sciolette, this shot “was one of the first shots, after we were awarded the show, that was handed over to us by production, and when we realized what challenges we were up against – the amount of people, the amount of complexity – including the rain – it does not get more difficult than that!”
1. Lens distortion
The film was shot anamorphically. Every shot was camera tracked (see below) and part of this was computing the lens distortion of the footage based on that exact lens. The team would then undistort the plate. “One of the key requirements from our stereographer Bob Whitehill from Disney, is that we would have no vertical disparity on the shots,” says Sciolette. (Bob Whitehill is one of the greatest stereographers working today, his work on UP, Toy Story, Tangled and other Disney/Pixar films has been covered by fxguide extensively).
In a normal CG pipeline one would undistort the image, add CG created with the ‘pin hole’ perfectly undistorted CG camera and then the comp would be re-distorted back to match the original.
However, a post distort such as this will produce vertical misalignment at the final stage for two stereo cameras. If both left and right eyes are re-distorted, they will no longer have perfect vertical alignment, as the same feature appearing in two different positions in each eye will also be distorted differently.
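The effect can be illustrated with a toy model. This is only a sketch: it assumes a one-term radial distortion on normalized image coordinates (real anamorphic distortion is more complex), and the function name, coefficient and sample positions are made up for illustration.

```python
def distort(x, y, k1=0.1):
    """Apply a one-term radial distortion to normalized image coordinates.

    Hypothetical simplified lens model, not the model used on the show.
    """
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2
    return x * scale, y * scale


# The same feature in both eyes: identical height, different horizontal position.
left_eye_pos = (0.60, 0.40)
right_eye_pos = (0.50, 0.40)

_, y_left = distort(*left_eye_pos)
_, y_right = distort(*right_eye_pos)

# Before distortion the heights match; afterwards they no longer do.
vertical_disparity = y_left - y_right
print(vertical_disparity)
```

Because the radial term depends on the distance from the image center, two points at the same height but different horizontal positions are pushed vertically by different amounts, which is the misalignment described above.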
2. Vertical alignment (VA)
Vertical disparity can come from both a misaligned camera rig (one camera not pointing perfectly compared to the other) and also from how the stereo is derived. It is possible to have two stereo cameras converge on a point and then each side of the screen suffers from vertical parallax – one side one way and the other the opposite – even with the two centers perfectly aligned. For this reason Cinesite, under the guidance of Whitehill, would go on to produce all the final shots as parallel stereo. “Converting a shot is one of the only ways to guarantee absolutely no vertical disparity,” notes Sciolette.
“In our show, all the shots were shot anamorphic, so the lens distortion was significant enough to introduce a noticeable vertical disparity if the two eyes were distorted at the end of the conversion process. Our approach was to make sure that out of the 3D environment we converted everything into images with depth information and we distorted all of those and then we applied image warping operation to do the actual image shifting – but purely horizontally so there was no vertical misalignment.”
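A small pinhole-camera sketch shows why converged (toed-in) cameras produce vertical disparity while parallel cameras do not. All values here (interaxial, toe-in angle, point positions) and the function names are illustrative assumptions, not figures from the production.

```python
import math


def project(point, cam_x, yaw_deg, focal=1.0):
    """Project a 3D point through a pinhole camera at (cam_x, 0, 0).

    The camera is yawed about the vertical axis by yaw_deg
    (positive yaw turns it toward +x).
    """
    px, py, pz = point[0] - cam_x, point[1], point[2]
    a = math.radians(yaw_deg)
    xc = math.cos(a) * px - math.sin(a) * pz   # along the camera's right axis
    zc = math.sin(a) * px + math.cos(a) * pz   # along the camera's view axis
    return focal * xc / zc, focal * py / zc


def vertical_disparity(point, interaxial, toe_in_deg):
    """Difference in image height of one point between left and right eye."""
    half = interaxial / 2.0
    _, v_left = project(point, -half, +toe_in_deg)   # left camera turns inward
    _, v_right = project(point, +half, -toe_in_deg)  # right camera turns inward
    return v_left - v_right


parallel = vertical_disparity((2.0, 1.0, 5.0), 0.065, 0.0)
converged_screen_right = vertical_disparity((2.0, 1.0, 5.0), 0.065, 2.0)
converged_screen_left = vertical_disparity((-2.0, 1.0, 5.0), 0.065, 2.0)
print(parallel, converged_screen_right, converged_screen_left)
```

With parallel cameras the point sits at the same depth along both view axes, so its image height matches in both eyes. With toe-in, the depth along each view axis differs for off-center points, and the sign of the resulting vertical disparity flips from one side of the frame to the other.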
3. Chromatic aberration
One hallmark of anamorphic lenses is the way the image produces chromatic aberrations, especially on the edges of frame. Again one person’s ‘look’ is another person’s ‘fault’ or image ‘error’. Here, Cinesite decided to not remove the aberrations but rather match them in both eyes. “We would treat them just like any other distortion effect,” says Sciolette.
Each shot was then very accurately camera tracked. Camera tracking the New York scene was not easy with the rain, people and horse movement and greenscreen.
The now undistorted and tracked plates were then rotomated, where CG figures and props were placed over the top of the live action. While the team could get close, in the case of the New York shot, there is no way that the animators could line up CG figures with the live action perfectly every time. For example, to do so for the horses would require every muscle to be perfect, and for the people with coats every cloth sim would need to match perfectly. It just is not viable to line up a CG character with the live action 100 per cent, no matter how great the rotomation team is.
If, however, one projects the live action over inaccurate CG environments, while it looks perfect in mono, in stereo the results can be tragic. Early stereo film suffered from this – imagine a 3D model of a head that did not quite cover the actual head of the live action. Once the live action is projected over this geo, part of the character’s head is no longer in the front with the actor – it ‘rubber mat effects’ to the back wall behind them, with horrendous results.
So even though Cinesite relies heavily on geometry to give the correct spatial set-up for the shot, every actual conversion is done on an image level, based on depth maps derived from the geometry.
Relying on geometry, projecting imagery on that geometry and then rendering that from another eye, “would not work, due to geometry that is not accurate in some areas, does not have enough detail or is just missing information,” explains Sciolette. “We did do projection tests and it does not give you enough control, not only for what is correct but also what is pleasing to the eye.”
6. Additional roto
If projection does not provide isolation of objects, then a vast amount of roto is required to isolate all the elements and combine this roto with the disparity maps generated from the rotomation stage with geometry (including the manual additional fine tuning). This roto, together with the rotomation was partly farmed out to other companies. It is also done just with traditional roto tools in a manner now commonly understood by a group of specialist facilities worldwide.
A lot of effort was put into integrating the use of outside vendors and clearly communicating to them the rotoscoping and rotomation that was needed. Most of the roto was done in Silhouette, and interestingly there is no really efficient way to get roto data into Nuke. Mattes are easy but the team wanted the vendors to feed splines not image mattes back into Cinesite’s pipeline. Cinesite used a simple Silhouette exporter that initially ended up with Nuke scripts over a gigabyte in size just based on spline data, so they set about producing optimizations and data reduction programs to clean up the translation. “This greatly, greatly helped,” says Sciolette. “We also had to rely on an old Nuke node called Bézier which is practically hidden, and from the old days of Nuke.” This node is much simpler and more lightweight than the fully featured roto node in the current version of Nuke, but unless the team used it the script size would have been unworkably large. “We shared some of our heaviest scripts and all our findings with The Foundry who is already at work on optimizing the performance of their roto tools based on this data,” adds Sciolette.
Cinesite converts the geometry and rotos into accurate disparity maps and the team can fix any issues with the geometry using some of the specific tools that Cinesite developed or any other trick from Nuke’s set of tools. “That is where having such an amazing team of artists made a big difference,” says Sciolette.
These disparity maps then feed the image processing done on the original mono shot.
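As a rough illustration of how depth relates to a disparity map, the sketch below uses the standard relation for parallel cameras with a chosen zero-parallax (convergence) distance. It is not Cinesite's tool: the function name, the sign convention (negative means in front of the screen) and the numbers are assumptions made for the example.

```python
def disparity_px(depth, interaxial, focal_px, convergence):
    """Horizontal disparity in pixels for a point at a given depth.

    Assumes parallel cameras, with convergence set by a horizontal image shift.
    Points at the convergence distance get zero disparity (screen plane),
    nearer points get negative values, farther points positive values.
    """
    return focal_px * interaxial * (1.0 / convergence - 1.0 / depth)


def disparity_map(depth_map, interaxial, focal_px, convergence):
    """Convert a 2D list of depths into a 2D list of pixel disparities."""
    return [
        [disparity_px(z, interaxial, focal_px, convergence) for z in row]
        for row in depth_map
    ]


# Hypothetical 2x3 depth map in meters: a near figure against a far wall.
depths = [
    [10.0, 2.5, 10.0],
    [10.0, 5.0, 10.0],
]
disparities = disparity_map(depths, interaxial=0.065, focal_px=2000.0, convergence=5.0)
for row in disparities:
    print([round(d, 2) for d in row])
```

Since only the horizontal position changes, a map like this can drive a purely horizontal image shift, which matches the requirement of no vertical misalignment.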
7. Converting to stereo – which eye is which?
Normally with a stereo rig, one eye is the hero and the second eye is adjusted to it. In this case that one hero eye, say the left eye, would then also be the mono version of the film. In Cinesite’s case there were three options:
1. Always assume that the mono version is one of the two eyes, for example, the left eye.
2. From the hero original camera eye, build two new cameras – one to the left and one to the right – and use one of those as the mono version of the film also.
3. Or the third option (which Cinesite did pick and is carried out on a per shot basis, judged by the complexity of the next step) – the image painting replacement, sometimes assume the original hero film camera is the left eye, sometimes the right eye.
Reducing paint replacement outweighed the disadvantage of having just one clean ‘eye’ for outputting for the mono version. So the mono version was built from a shot by shot selection of the best shot; it is not just the left or right eye of the stereo master. (Note: even if it was the mono it will always be graded differently for stereo light loss from the glasses and the removal of the floating window, which is redundant in mono – see below).
9. Image processing conversion
As stated above, the actual conversion is done via image processing.
The key ingredient in the image processing part of the process is an image warping tool in Nuke. Cinesite developed a custom Nuke plugin for this. “I have been in many presentations talking about stereo conversion in the past and they all seem to imply that the minute you have a depth image and your beauty plate it is all solved and there is nothing more to be done,” says Sciolette.
“The assumption is that you can plug them into something like an IDistort node but in reality this does not work. Most of these distortion tools implement a ‘backward’ warping operation, where for every pixel in your output image, say the right eye, you use the pixel intensity of your distortion map in that location to find where to pull pixels from. In our case, if we want to fill the right eye image pulling pixels from the left one, we need a depth map that lines up with the right eye, which is not available.”
“There are different ways to solve this problem,” continues Sciolette, “ranging from writing a full ‘forward’ warping operator to relying entirely on geometry projection techniques instead of image warping. We put a significant effort into writing a custom stereo specific warping operator that gave us the great flexibility and control that comes with image processing approaches together with the reliability that is typical of projection techniques. The final result worked so well that the generation of the second eye was done entirely as an image processing operation for every shot.”
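The difference between backward and forward warping can be seen on a single scanline. This is a minimal sketch, not Cinesite's proprietary operator: the function names, integer disparities and the rule that the smallest disparity is nearest are assumptions for the example.

```python
def backward_warp(row, disparity):
    """For each output pixel, pull from the source at x - disparity[x].

    This is only correct if the disparity map lines up with the OUTPUT eye.
    """
    width = len(row)
    out = []
    for x in range(width):
        sx = min(max(x - disparity[x], 0), width - 1)
        out.append(row[sx])
    return out


def forward_warp(row, disparity):
    """Push each source pixel to x + disparity[x]; the nearest pixel wins.

    Works with a disparity map aligned to the SOURCE eye. Unfilled
    positions stay None: these are the gaps revealed by parallax.
    """
    width = len(row)
    out = [None] * width
    best = [None] * width
    for x, (value, d) in enumerate(zip(row, disparity)):
        tx = x + d
        if 0 <= tx < width and (best[tx] is None or d < best[tx]):
            out[tx] = value
            best[tx] = d
    return out


# Left-eye scanline: background 'b', foreground object 'F' at x = 3..4.
left_row = ["b0", "b1", "b2", "F3", "F4", "b5", "b6", "b7"]
# Disparity aligned to the left eye: the foreground moves 2 px left.
left_aligned_disparity = [0, 0, 0, -2, -2, 0, 0, 0]

print(backward_warp(left_row, left_aligned_disparity))
print(forward_warp(left_row, left_aligned_disparity))
```

In this example the backward warp, fed a map that lines up with the wrong eye, loses the foreground object entirely, while the forward warp moves it to its new position and leaves a hole where background was never photographed.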
10. Painting back missing details
With a depth solution and a set of prepared files, the stereo effect can be applied to produce another view or eye, but as this happens it reveals areas of the film not filmed – only visible to the new eye – and thus this missing piece of the image needs to be replaced.
Cinesite had several solutions for this issue. One was an automated in-paint tool that took image information from the edge pixels beside the gap and filled in a close but only approximate solution. This is an automated solution. “Initially we expected to use it to only fill in extremely small gaps,” says Sciolette, “but actually it worked extremely well even on larger seams.”
In addition there was a large amount of manual paint, but the ‘in-paint technique’ was used both on the disparity maps and the repair work of the actual picture. The process ended up with each shot being automatically in-painted and then these shots would be reviewed and if necessary escalated to a second level of manual painting work.
If there was a paint task the in-painting and stereo process would provide a mask to identify just how much and where the manual paint needed to be done.
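A simplified version of the idea, filling a gap from the pixels on either side and reporting a mask of what was filled, might look like the sketch below. The linear blend and the function name are assumptions; the article does not describe how Cinesite's in-paint tool actually computes its fill.

```python
def inpaint_row(row):
    """Fill None gaps in a scanline by blending the valid pixels on each side.

    Returns (filled_row, mask), where mask[x] is True for every pixel that
    was missing, so a reviewer can see where manual paint may be needed.
    """
    n = len(row)
    filled = list(row)
    mask = [value is None for value in row]
    x = 0
    while x < n:
        if row[x] is not None:
            x += 1
            continue
        start = x
        while x < n and row[x] is None:
            x += 1
        left = row[start - 1] if start > 0 else None
        right = row[x] if x < n else None
        if left is None and right is None:
            continue  # no valid pixels at all; leave the gap untouched
        if left is None:
            left = right
        if right is None:
            right = left
        gap = x - start
        for i in range(gap):
            t = (i + 1) / (gap + 1)
            filled[start + i] = left * (1 - t) + right * t
    return filled, mask


# Hypothetical luminance scanline with a 3 pixel gap revealed by the new eye.
scanline = [10.0, 10.0, None, None, None, 50.0, 50.0]
filled, paint_mask = inpaint_row(scanline)
print(filled)
print(paint_mask)
```

A fill like this is only approximate, which fits the workflow described above: every shot gets the automatic pass first, and the mask shows where a second level of manual paint would have to work.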
11. The rain
For the rain to work it needs to be scattered in 3D-space, but of course roto for every drop is nearly impossible. The approach was broken into different types of rain:
- the bigger drops, and ones coming off umbrellas, were individually rotoscoped, the background painted back in and then the drops were put in the correct 3D position
- CG rain was added, to give depth
- multiple layers of rain, especially in the distance, were not individually isolated
12. The film grain
Grain is a characteristic of the recording medium not of vision. This raises problems since stereo-captured images would have completely independent grain profiles, but that neither helps the stereo effect nor reduces eye strain. The mono grain could not be ignored as anything the same in both eyes will have a depth. If grain was added equally to both left and right eyes, it would appear as a wall of grain at the screen plane, like a shower curtain, since only things at the screen plane resolve to the same position in x and y space. If a different approach is taken and a layer of grain with the same structure is added but offset between the eyes, it would then sit either in front of or behind the screen – but in either case it would appear as a flat ‘shower curtain’ of grain. Cinesite never used this approach.
One could degrain the footage and try and remove all grain, but the director specifically wanted to shoot on film and grain is part of the film experience. Degraining can also cut into things such as distant rain, since small rain on screen has many of the same characteristics to a degrain algorithm as grain.
Cinesite’s solution was more traditional. The footage was degrained, which did remove some fine rain, but then this helped as it could be stereoscopically re-added with CG rain. But the grain was added independently to both eyes, simulating what would happen if the film had been shot with a stereo film rig – in this one minor respect. But also not all grain was removed, so some grain would be moved back in 3D-space with the dimensionality of the object that was moved to that position in 3D-space.
13. Lens flares
Another aspect of anamorphic lenses is lens flares. Many directors shoot with anamorphic lenses specifically for the flares, yet like grain they are a product of the record system and do not have a real world stereo equivalent. The flare is happening in the lens not in front of it, so it does not aid stereo suspension of disbelief.
When there were heavy lens flares in the image, Cinesite would sometimes paint them out, or replace them with 3D stereo spacing creatively, normally positioned in 3-space in front of the main action. But this is entirely a creative choice as flares do not exist in stereo space.
14. Set extensions
When extending shots or adding large 3D it would be possible to create the shots in stereo, but Cinesite produced stereo versions of their effects only when needed. In this shot the set extension was added down the street and then the crowds added and finally the shot was approved and then stereo converted.
15. Cheating stereo cameras
One of the tricks that Disney developed and actually even passed to Pixar was special camera rigs that would produce a different stereo ‘solution’ for the background and foreground. This does not work with a receding floor or ground plane but it does in a close-up with a distant and disjointed background, which is clearly a common shot in any film when an actor has a close-up. While easy to do in CGI, it is also possible to do when compositing of say greenscreen closeups over a distant background (having a different stereo solution for each) and it is also possible in stereo conversion. “We were asked on some shots to say give a little more dimensionality on their face,” notes Sciolette. “Not on every shot but on some we were asked to add some and so cheat the cameras. Even though it was used sparingly it was used on a reasonable amount of shots. We dealt with cheating the actual truth and, purely from a storytelling point of view, enhanced some of the shots.”
16. Floating windows
One technique that Bob Whitehill has used extensively, but was not used on films like Avatar, is ‘floating windows’. This technique aims to get around the limits in one’s stereo ‘budget’, in relationship to the edge of the screen. If a character is closer to camera – as in an over the shoulder shot – it is hard to have them sitting in stereo space closer to the audience (ie. ‘in the space’ between the screen and the audience), since their body is cut off by the bottom of the real cinema screen. As our eyes can see the stereo image is floating closer to us than the edge, yet our minds ‘know’ the person must be on the other side of that screen or window, we mentally reduce the stereo effect.
In other words, the illusion is lessened since we know that someone can’t be close to us AND further away behind the sharp real world edge of the cinema screen (this conflict is known as an edge violation). The trick that Bob Whitehill did not invent but radically perfected at Disney was to float another dummy edge of screen between the viewer and the stereo violating back of the shoulder. This floating window is rendered in the footage, but tricks the audience into thinking the edge of the real world cinema is closer than it is – hence not violating the edge, and thus we regain some of the stereo budget. The stereo budget is the useful working space one has to place things between a sensible close and a sensible far in stereo terms.
Cinesite, under Whitehill’s direction, did add floating windows. “We used floating windows quite extensively,” says Sciolette, “so every time we had a foreground object during an edge violation, we would introduce a floating window – even tilting floating windows were used.” (This is where the floating window is not parallel with the real cinema screen).
It should be noted that for this to work, Cinesite needed to not only add the windows but manage the color space and black levels very closely. The effect is lost if the black levels are tinted or affected adversely during the final grade – they need to be completely black, which they were successfully on John Carter.
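In image terms, a floating window is commonly built by masking the left edge of the left eye and the right edge of the right eye with pure black, which gives the frame edge negative parallax. The sketch below shows that masking on tiny images held as lists of rows; the function name and pixel widths are illustrative assumptions, not Cinesite's set-up.

```python
def apply_floating_window(left_eye, right_eye, left_edge_px, right_edge_px, black=0.0):
    """Mask frame edges so the window appears to float in front of the screen.

    The left eye is masked on its left edge and the right eye on its right
    edge. The mask value must stay fully black through the grade.
    """
    def mask_row(row, start, stop):
        return [black if start <= x < stop else v for x, v in enumerate(row)]

    new_left = [mask_row(row, 0, left_edge_px) for row in left_eye]
    new_right = [
        mask_row(row, len(row) - right_edge_px, len(row)) for row in right_eye
    ]
    return new_left, new_right


# Hypothetical 2x6 images filled with mid-level values.
left_image = [[1.0] * 6 for _ in range(2)]
right_image = [[1.0] * 6 for _ in range(2)]

windowed_left, windowed_right = apply_floating_window(
    left_image, right_image, left_edge_px=2, right_edge_px=1
)
print(windowed_left[0])
print(windowed_right[0])
```

A tilted floating window could be approximated in the same way by varying the mask width from row to row instead of using one width for the whole edge.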
Sciolette has nothing but great things to say about Whitehill’s stereo direction. He clearly admires him both technically but perhaps most for his ability to use stereo as a storytelling tool. “He is someone who, just looking at the initial script of the movie, not even having seen any images from the film, came to our first meeting with a depth script for the whole film, trying to set the tone of how depth would play across the movie.”
Yet Whitehill was also very hands on and gave very specific direction technically. Many people talk in terms of rough generalities – such as on a scale of 1 to 10, where 10 hurts your eyes, ‘I want this to be a 5’. “But not on this show,” says Sciolette, “it was very reassuring to talk to someone who was able to express [in technical terms]. Bob would speak in a way that provided no ambiguity, he would always speak in terms of actual pixel offsets. It may seem like a simple piece of information, but it was really great. We could transfer that to our artists to get it exactly right.”
CASE STUDY TWO: Titanic 3D – Stereo D
(Also see Avengers below)
We turn to the stereo release of James Cameron’s Titanic, and talk to Stereo D President William Sherak and Chief Creative Officer Aaron Parry. Stereo D, now owned by Deluxe, converted 95% of the film for the re-release (VentureD also contributed to the conversion) in 60 weeks. Titanic 3D is already doing incredible box office worldwide, especially in China where it is breaking all previous box office records.
Sherak co-founded Stereo D in 2009 and has guided the company from 15 employees to an international staff of 450 in less than two years. Past film credits include Avatar, Jackass 3D, Gulliver’s Travels, Thor and Captain America. Sherak is currently supervising the full theatrical 3D conversion for Paramount’s Hansel and Gretel: Witch Hunters, Fox’s Abraham Lincoln: Vampire Hunter, and Stereo D were lead conversion company on this month’s mega tentpole film Marvel’s The Avengers.
We asked William Sherak, “In the past, James Cameron hasn’t always been a huge fan of post-conversion films. What do you think changed?”
“Well, I can’t tell you what changed in his mind,” says Sherak, “but I can tell you what got us the movie. And that is, technology can only make an artist more efficient, it can’t replace an artist. Our process is 100 per cent artist-driven, so when he came into our facility and saw 400 artists working and sculpting every frame, he realized this technique and this industry had come far enough where he could get the film he wanted in stereo.”
The work was done in painstaking detail given Cameron’s key role in the whole 3D stereo film community. “It was frame by frame – 297,000 of them,” Sherak laughs. “That’s how it was done, there’s no black box, there’s no magic red button. It is frame by frame”.
In practice the team broke the film into a series of types of shots. Clearly there were large epic effects sequences but there was also a need for carefully crafting the less epic dramatic shots. Some of the most complicated shots involved the dining room/ballroom of Titanic and the departure of Titanic at the docks at the start of the film. “There were only so many locations on the Titanic,” explains Sherak, “so once you’ve established the look and depth you want of every location, then it became easier to know what James would want in the next shot that was in the same location. So you start getting as many looks of depth passes in front of him of individual locations so you can set a depth script for the whole movie. And we would do that, get notes, make revisions.” Certain shots went through hundreds of revisions, and others went through as few as 10. “It all depended on what was in the scene – crowds, making sure not to miniaturize the ship, for example,” he adds.
The team at Stereo D used no projection or automated solutions. The team isolated each of the key elements of the scene with multiple layers of roto, often many per person in the foreground, and these were then used to derive a depth map. From the depth map normally a new left and right eye was created. By creating two new views the team reduced the amount of in-fill or missing information per eye. So rather than move an object in a ‘new’ left eye 80 pixels to create depth, while using the original film as the right eye, the team would make a new left eye 40 pixels left and a new right eye view 40 pixels right. “It was isolating every single image using roto and any other type of tool available to us, then sculpting those images in a depth map that made you feel like it was shot in stereo,” says Sherak. “Then there was also a paint issue – the movie is filled so deep that the occlusion fill was unheard of – I don’t think any movie has had to be painted the way Titanic has been painted, in terms of filling in the missing information. We have proprietary software to manage this, but really it only makes an artist more efficient.” Stereo D also have a proprietary production tracker that helped manage the conversion.
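The arithmetic behind splitting the offset between two new eyes can be sketched as follows. The function name and sign convention (negative means shifted left) are assumptions for illustration; the 80 and 40 pixel figures are the ones quoted above.

```python
def split_offset(total_px, keep_original=None):
    """Return (left_eye_shift, right_eye_shift) in pixels for one object.

    keep_original: "left" or "right" to reuse the original plate as that eye,
    or None to build two new eyes on either side of the original view.
    """
    if keep_original == "right":
        return (-float(total_px), 0.0)
    if keep_original == "left":
        return (0.0, float(total_px))
    half = total_px / 2.0
    return (-half, half)


def worst_case_gap(shifts):
    """Largest shift applied to any single eye, a rough proxy for in-fill."""
    return max(abs(s) for s in shifts)


one_new_eye = split_offset(80, keep_original="right")
two_new_eyes = split_offset(80)

print(one_new_eye, worst_case_gap(one_new_eye))
print(two_new_eyes, worst_case_gap(two_new_eyes))
```

The separation between the two eyes is 80 pixels in both cases, but with two new views no single eye has to reveal more than 40 pixels of missing background, at the cost of neither eye being the untouched original plate.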
Stereo D has a 70,000 square foot facility in Burbank, California, which housed about 400 people for Titanic. “We have an amazing facility in India which does our rotoscoping,” says Sherak. “We annotate all our roto frames here and it comes back here and our artists keep working on it. We work in 3D the whole way.” Stereo D decided to give every artist access to a 3D monitor at their workstation. The leads on the project also had a 47 inch 3D monitor to work from, plus a 20 foot screen in the Stereo D theater where Stereo D’s senior stereographer could review shots.
Stereo D has a Quantel Pablo set-up for final stereo review and adjustment (see below). Says Sherak: “We can do remote sessions where we set up a Pablo PA at the director’s location or house. We can have them come in, of course. With Jim we did a combination of everything. But we’re pipeline agnostic, and we have a full editing pipe with Avid.”
The overriding principle for Sherak and the team was to find ways to ‘enhance the story’ with 3D. Speaking at NAB 2012, one of the things that producer Jon Landau, VFX supervisor (of the original film) Rob Legato and Sherak all agreed on was that the dramatic performances were surprisingly enhanced by the stereo conversion. For example, when Rose’s mother is dressing her daughter early in the film, the panel thought that the dramatic performance of the mother behind Rose was thrown into even sharper relief thanks to the use of stereo, even though the shot is neither an effects sequence nor a major action moment.
“When you’re dealing with a movie that is set in reality,” explains Sherak, “it was ‘set your convergence at point of interest’ and then see the world around that. Jim has that point of view better than any person I’ve ever met. We set out to make the gold standard of conversion and that meant that every other scene was as important as the other. The movie is 194 minutes long, we had 60 weeks and we wanted to make sure every frame was perfect. The thing we also have to remember is that it’s not our movie – it’s the director’s movie and it’s a director’s medium. And Jim is a good captain – he knows exactly what he wants and how to tell everyone.”
The film was also digitally restored to a 1.78 4K master by Reliance Mediaworks. They handled the grain reduction and final timing of the film (The Lowry process, as it was known before its acquisition by Reliance in 2008, is aimed at improving the resolution and dynamic range of motion picture imagery while also removing dirt, noise and scratches). The grade needed to account for stereo light loss in addition to any other grain or restoration aspects. James Cameron closely supervised the entire process. The project to convert Titanic to 3D took about 60 weeks of work, including the restoration of each frame at 4K resolution and conversion to 3D, and a titanic budget of $18 million.
Even before Cameron could begin the conversion process to turn Titanic from a 2D into a 3D movie, he had to restore the original 1997 film. The goal was to create a new, cleaner version of the film in all formats – including 3D and 35 mm. “This is more about Titanic returning to the theaters than just 3D,” Cameron said. The 4K restoration alone took roughly 10 weeks and was completed earlier this year.
It is interesting to contrast the conversion of John Carter and Titanic.
On the face of it, the films are years apart, and yet both were shot on film and both were converted with the latest techniques. Both are big budget conversions with first rate teams, yet the approaches taken by Stereo D under Cameron’s direction and Cinesite’s approach under Bob Whitehill’s supervision and of course Andrew Stanton’s direction, are dramatically different. We spoke to Stereo D’s Chief Creative Officer (CCO) Aaron Parry about the differences.
Converged vs parallel
James Cameron is very much an advocate of having convergence at the point of interest, which is normally also the primary point of focus. As such his native rigs ‘pull convergence’ or literally adjust convergence while a shot is in progress, exactly as a focus puller ‘pulls focus’. The cameras in this rig are ‘toe-in’ or pointed inwards on each other ever so slightly. This is visually pleasing but not the only way to work. It is also possible to shoot with both cameras parallel and then adjust the x ‘interaxial’ distance to place the desired object at screen plane.
Shooting parallel or converged is a fundamental stereo position – it is the Mac vs PC of the stereo world. In this debate Cameron is firmly in the converged camp. By contrast John Carter was converted parallel.
The Stereo D team took their moves from the Cameron / Pace playbook and as such they tended not to religiously avoid edge violation. In the scene where Rose is first seen in her large hat at the docks, the stereo clearly places parts of the hat forward of the screen plane, but Cameron believes people are looking at Rose’s face, explains Parry, and so the hat edge violation can effectively be treated as a non-issue. It is true in a scene like this that Cameron’s original strong framing style and camera work mean a viewer has to make a deliberate decision to look away from Rose during a 3D screening and examine the edge.
Titanic had no floating windows. James Cameron doesn’t believe they are needed, according to the panel discussion on the film at NAB when fxguide asked about their use. In contrast, in shots where there was a need, Cinesite deployed floating windows to buy back some of the stereo budget while avoiding edge violation. Bob Whitehill at Disney has been a strong advocate of floating windows. He has promoted their use to great effect in both animated films and live action conversions. So skilful are Whitehill and the Cinesite team that even with non-parallel floating windows it is impossible to notice the floating of the black ‘false’ window.
Use of projection
Titanic used little or no projection to create the stereo. There was some use of geometry to help generate a z-map but it always needed additional work. By comparison, the Cinesite method relied at its core on the use of geometry. The Stereo D solution was very much built on an artist roto approach. Of course, both are valid and both produced strong dramatic comfortable stereo.
In the case of Titanic there was a lot of grain management done as part of the restoration – as such the team chose not to actively stereo manipulate the grain. The grain fell where it may, in a way very consistent with what would happen if someone shot with two (perfectly aligned) film cameras, i.e. different grain in each eye.
Cinesite was dealing with original source film, scanned directly for them and not the color timed and restored 4K master that Stereo D had to work with. As such the Cinesite team needed to take a more active role in grain management.
Stereo D opted for the audience watching dynamic grain with no extra z placement considerations, while Cinesite positioned their regrain and selectively dealt with the issue on a per shot basis.
Choice of Left, Right or Both to be created
At Stereo D the team found that they needed to provide about a 50-60 pixel average shift in most shots. And in some shots as high as 80 pixel shift. This means a lot of image needed to be replaced or painted back in. Their decision was to make a new left and new right eye image from center most of the time. About 80% of shots were new angles for both left and right. By contrast, John Carter would pick to generate the left from the right or the right from the left based on paint-in favoring whichever would produce the best results and jumping on a shot by shot basis.
Access to anything
The Titanic team had access to nothing from the original source elements. Everything was converted from the re-mastered film alone. While much is made in the popular press of the fact that no effects or new material were added, in reality some volumetrics and some water needed additional new elements to allow correct high level conversion. In all cases the ‘look’ of the shots was matched, but in about 2% to 5% of the water shots, and for some of the smoke and steam, new volumetric elements were used to produce successful stereo conversions, according to Parry. It should be stressed that these new elements are not designed to change the shots, but roto techniques really find it hard to solve complex imagery with volumetrics. Another area that needed new CG was some of the bubbles in the water: in the sinking sequence there is a vast density of bubbles, especially as the ship drags our heroes under the water. About 98% of the smoke, bubbles and steam is original, but the rest was re-done with new digital tools.
On the other hand, for John Carter, the stereo conversion site was also the effects company and the film was not being converted after the fact. The stereo team need only walk down the hall to speak to the team lensing the master visual effects shots, or contact other nearby vendors, and information on cameras, lenses, distances and so on was much more accessible if required.
Click here to read fxguide’s retrospective look at the visual effects of Titanic, with interviews with some of the leading artists on the film.
CASE STUDY THREE: Transformers 3: Dark of the Moon – Legend3D
Legend3D, one of the largest US-based 2D-to-3D conversion studios with some 250 US-based engineers, has served as the primary conversion partner on blockbuster hits and Oscar award-winning films such as Transformers: Dark of the Moon and Hugo. Additionally, Legend3D recently completed converting Tony Scott’s famed classic Top Gun, anticipated to be re-released in 3D later this year.
Dr. Barry Sandrew is founder of Legend3D and the inventor of several proprietary colorization and 3D conversion techniques that have helped advance the craft, but in directions fairly unique to Legend3D. With the growing demand for 3D/stereo films, Legend3D evolved based on colorization technology. The company has its own unique approach to the problem that is neither geometry nor roto-reliant, but rather image processing-based. Their approach is more automated than many others, making their work high-quality, competitively priced and quick. “It all depends what it is,” explains Sandrew, “but we have done a trailer, which was 120 shots in just eight days, we did a Chase spot in 3 days (and that was nominated for a Telly Award), so we are probably fastest and, according to the industry, highest quality as well.”
For feature film production, Sandrew points out that one of the biggest factors for Legend3D on schedules was now who was involved in approvals. The studio, the studio stereographer and perhaps the director can have direct involvement, and that can mean a longer production time, versus some older catalog titles where Legend3D works more autonomously. For those catalog titles, like Top Gun where the director does reel review, rather than shot by shot direction, the film can be converted in just 12 weeks. For new films that are in production when they are being converted, roughly speaking, Sandrew estimates that the schedule can be at least a month longer than that.
Legend3D ended up converting over 77 minutes of footage for Transformers: Dark of the Moon. Most of the VFX shots were converted in a 4-5 month time span. The converted footage was a mix of visual effects shots (from ILM and DD) and non-vfx shots. In most cases these would be shots intercut back and forth with original stereo footage shot on location. “This raised the complexity much more than if the entire film had been converted, just as doing VFX in a live action film is a different task than creating an animated film (or virtual sequence),” commented Scott Squires, who worked at Legend3D on the film, setting up the pipeline. Scott Squires is ex-ILM and one of the most respected visual effects supervisors in the world, with not only a very strong IMDB credit history but a very strong programming and development background dating back to Commotion and beyond. Squires did not do the conversion; he worked directly on the workflow and pipeline involved, under Dr. Sandrew.
While always maintaining a healthy R&D budget, Sandrew told fxguide that for Dark of the Moon they invested an additional million dollars in meta data and shot management R&D, and a further two million in primary image processing depth tools R&D. Legend3D uses its own image processing tools, in concert with Nuke from The Foundry. Their image processing tools are not founded on Ocular and they are not based on optical flow, although they do use some combination of techniques including optical flow for complex participatory volumetric materials such as smoke and steam, of which there was a lot in the third Transformers film.
Legend3D has some proprietary software and techniques for different stages in the process. “I wrote a number of specialty Nuke plugins and scripts to help leverage these proprietary software packages with existing software,” says Squires. “I worked with a number of artists and developers at Legend to review and analyze the shots as they were delivered and worked closely with ILM and DD to get the proper elements that would be required. Both technical and creative issues had to be examined to get the best possible quality. Overall this represented a number of challenges that were unique to the world of stereo conversion and we were able to raise the quality of conversion to a new level.”
Legend3D “don’t use projection and models,” says Sandrew. “That’s one of the most common ways of doing conversion – a lot of studios are doing it that way – we don’t. And probably if I had known about that technique early on when I started to develop my process I probably would have mistakenly gone in that direction, but I didn’t. I took a totally different approach.” While the new stereo conversion tools are built on the back of the colorizing tools, they are not the same tools. “In terms of masking for 3D (stereo), it is a hundred times more complex than colorizing black and white movies,” jokes Sandrew.
Legend3D prefers to produce two new image streams from one source plate, rather than making that source plate one of the two views an audience would see in stereo (this latter technique is called ‘one eye’ or ‘hero eye’ by Legend3D and is popular elsewhere). For Transformers 3, director Michael Bay had wanted to convert by using the source material as ‘one eye’, and then just convert out a second eye by the Legend3D process. That is, until Legend3D showed him tests. “Michael had wanted us to do one eye, he knew about one eye and he did not want to deviate from that, then we showed him what happens when you do that and basically the image is skewed,” explains Sandrew. “If you do left from right and compare you can see it is skewed, if you do it the way we wanted to do it – as two new eyes – nothing is skewed. The result is, if you (do the one eye approach) it is not what Michael shot, he understood that and he let us do it two eyes (from one middle source original).”
Given Legend3D’s image processing approach, they are not reliant on roto or roto tools. “We don’t do roto,” says Sandrew. “We are one of the only companies that doesn’t. We don’t use splines – like virtually every other company. Rather than roto’ing everything, we mask everything.” Legend3D uses masks based on pattern recognition and various other technologies. These masks feed into their system that then allows the shots to be dimensionalized. The process does not rely on chroma keying, since it is based on colorizing black and white. “Although we have color keying in some of our other technologies,” says Sandrew, “our masking technology does not work that way.”
In colorizing a film it is important to ignore or even remove – from an image processing point of view – shadow and shading. The pattern matching of a face should not be affected directly by a side light causing a shadow on one side of, say, a person’s nose. Legend3D removes shadows from the equations (not from the images – just from its calculation of depth). Of course, color is completely connected with light and thus shade, so while shadows may be removed from the masking process, inside one mask Legend3D may put multiple colors of a face to make the colorization look believable, based on shading. So too with stereo conversion – shading is removed for some calculations while being valid in other processes. The Legend3D masking in stereo tends to define a depth level but not a volume. For example, a side shot of a glass on a table may sit with one mask, defining its depth on the table, but the glass may still be stereoscopically converted to have volume and not look like a cutout card on the table. “The depth is dependent on the mask but the volume is dependent on the render,” explains Sandrew.
An interesting side effect of this process of ignoring shadow is that Legend3D gets good volume and dimension even on hand drawn cel animation.
On Dark of the Moon, the conversion would be handled via the following steps:
Step 1: Shots are classified and identified into a very elaborate database and tracking system, ‘a project management and asset management system that groups shots’. Sandrew believes this tracking and control software removes a lot of the human error that can happen in a conversion process.
Step 2: For a given shot, several key frames are done in the San Diego main office, where all the primary creative work is done. At the end of the day this is placed in the network queue for the Indian office. At this stage Legend3D templates can be used if they are on file or new templates created. The preproduction planning is “fairly intense,” notes Sandrew.
Step 3: Legend3D has a Virtual Private Network (VPN) to their Indian operations. Legend3D has operated in India for over 12 years now and the system is well managed and works extremely well with the timezones. As America sleeps, the Indian team work, reloading at the end of their day in time for the start of the next American day. From the key frames the day before, India does the ‘in-betweens’ overnight.
Step 4: Volumetric elements, for example, may get special treatment having been flagged in step 1. The company has specialist tools and pipeline solutions for these things such as smoke and clouds. “There is a completely separate but parallel process for atmospherics,” explains Sandrew. “In Transformers we had a lot of CG smoke combined with practical smoke, fire and sparks. We have image processing processes to handle that. It is not the kind of thing you can roto or mask. We got a lot of those atmospherics from Digital Domain and ILM and while they gave us some Z-depth information we needed more and it had to be a completely different approach which we developed about a year and a half ago.”
Step 5: From these masks the scene is put into stereo space. The Legend3D team neither favor doing this with convergence nor strictly in parallel. A conversion stereographer uses proprietary software or heavily modified Nuke at this stage. The stereographer usually works with a stereo monitor and adjusts and animates the various shapes, objects and planes to move in 3D space. This needs to match the surrounding shots and the 3D relationships need to be correct. In the case of Transformers the volume and depth of the objects had to match the original stereo photography. The show stereographer works with the director as well to provide both creative and technical feedback regarding the depth range and convergence planes. Cory Turner was the show stereographer on Dark of the Moon.
Step 6: The images are rendered. Grain is managed as a part of this. Grain is placed in z space, distributed through the stereo space dependent on subject matter at that point in the frame. “On Transformers we tried to get layers in depth as much as possible to avoid re-extracting anything they added in the composite stage,” says Squires, who worked with Tony Baldridge, Jacqueline Hutchinson and Jared Sandrew as some of the key technical people at Legend3D on this project. “We were also able to leverage the depth information supplied with the effects when possible. This helped given the complexity of the transformers and their transformations. We also worked in real world units to make sure the intercutting was correct.”
Step 7: Clean plates when available are used as part of the solution to gaps produced by the new stereo eye/lens positions. In conversion, there are always issues with glass window, mirror, chrome or other reflective surfaces. Each of these has to be removed into a separate element and then added back in at the right depth. “Transformers of course was full of fast action with glass windows, smoke, explosions, sparks, and lens flares,” states Squires.
Step 8: Shot review. Legend3D has RealD theaters, but Sandrew personally likes to review with a JVC monitor. “The reason why I like the JVC monitor is that I have the entire image in my field of view,” he says. “It is extremely bright, and if I see anything I can zoom in on it.”
Step 9: Final QC is done in the main RealD theater. It is then passed on for final grade (as a stereo grade will need to allow for the light loss of the viewing glasses). Legend3D is very aware of light levels even though they do not do the final grade. “Michael wanted to get all the screens to 6 Lamberts, and when they did get the light levels up it really did look great,” says Sandrew.
Interestingly the process has ‘learning’. Having converted Tom Cruise in one film, Legend3D knows how to do Tom Cruise for the next film. “We have his head, face and body all masked and rendered into depth (after Top Gun),” says Sandrew. “We know how to do it, and we can use that again for the next Tom Cruise movie.” The Legend3D system uses templates and they are asset managed along with the footage, WIPs and shot meta data.
One other aspect of Legend3D is different from almost all other experts one may talk to: they believe converted films are every bit as valid and good as shot stereo – or even more so. They reject the notion that it is better to shoot in stereo if you can, pointing out a vast catalog of issues from live action dual camera rigs: from mirror issues to alignment, ocular rivalry – all of which makes on set stereo difficult and problematic. “You see all these problems and typically we end up fixing a lot of these problems,” suggests Sandrew.
As converted work has no lens or mirror errors, Legend3D believes converted footage can be better than shot footage, with fewer problems and fewer artifacts, and thus more pleasing for filmmakers. “If you have a conversion process that is sophisticated enough and you have clean plates, it is more precise than if you shot it in stereo,” says Sandrew. “All the information is there (from the clean plate), people think that it is less accurate as you are filling in gaps but if you have the correct material and process, it is more accurate than shooting stereo.” And there are other advantages for a director like Michael Bay – a telephoto lens shot can be converted with more volume than would be possible shooting stereo natively.
For Dr Sandrew, “Transformers is probably the most complex film that has ever been converted and it is the most complex film that will be converted for maybe a long time – well, let’s say I think that it will hold that record for a long time!”
CASE STUDY THREE: Episode 1: The Phantom Menace – Prime Focus
Using Prime Focus’ proprietary View-D conversion process, The Phantom Menace was re-released in stereo on February 10, 2012. “It was incredibly important to me that we have the technology, the resources and the time to do this right,” said Star Wars creator George Lucas. “I’m very happy with the results I’ve been seeing on Episode I.”
Over a 10 month period John Knoll supervised the stereo conversion of The Phantom Menace, performed by Prime Focus. The film, while extremely effects heavy, was still primarily a live action film so it was never going to be practical to reload the source files and try and re-render or “do something that would have been done on Toy Story – restoring the scenes and re-rendering,” says Knoll. “Our pipeline has changed dramatically enough since the work was started in 1997 that we don’t have machines that use the same software anymore. It was all animated in Softimage and the RenderMan renders dispatched from iRender – packages we don’t even run anymore.”
Knoll knew the film was long and so intended the conversion to be not only the best conversion ever done, but also a low-strain stereo final product, one without loads of gags and negative space ‘in your face’ stereography. “We’re trying to avoid going too extreme in places where it’s going to give you a headache,” says Knoll. “The biggest difference is that because George has an inclination to do very dense compositions – there’s a lot going on – having that extra dimension makes things spatially clearer, with the characters.”
Prime Focus and Knoll did use floating windows in the conversion. In general, the space was distributed one-third forwards and two-thirds back. “We used floating windows to reduce eye strain,” explains Knoll. “I was a little dubious about floating windows in general, because a lot of my 3D experience came from my stint on Avatar. Jim Cameron has very strong opinions about stereo – he is very strongly against floating windows. I became a convert. I think it really helps that cognitive dissonance you get at the edge of screen.”
“I was cribbing a lot of my stereo style on Phantom Menace on Jim Cameron’s playbook on Avatar,” adds Knoll. “There’s really no stereo arc on Avatar. It’s done on a naturalistic basis that the amount of depth is presented to you based on the subject matter – all the way from the beginning of the movie to the end it’s consistent with that logic. I tried to do the same thing on Episode I.”
On Avatar, Jim Cameron was very comfortable adjusting the stereo in a non-realistic way to accommodate large wide shots with close foregrounds, by having a different stereo solution used on each of the two sections, i.e. the front was different from the back in terms of interaxial.
“Having separate cameras for foreground/background that have different inter-axial distance was something that was quite a familiar process from my stint on Avatar,” recalls Knoll. “We did a lot of cockpit shots where the foreground camera had a particular inter-axial, but Jim had designed coverage of the background that was sometimes shot with different focal lengths than the foreground was and very frequently had very different inter-axial distances, just so it wasn’t so flat. So I was very used to these cheats of what the depth was.”
In the pod race sequences when viewing a foreground cockpit shot, “there’s a nice level of shaping and depth that felt right there,” says Knoll, “but then what you’d see in the background would feel very far away – should that be completely flat or should we break that up a little bit. Sometimes just to see the tiniest bit of depth in it would require the equivalent of an inter-axial that was significantly different than the foreground.”
The best stereo often comes when the director has a lot of objects that are relatively close to the camera. The effect lessens dramatically for distant objects: mountains have little volume, while a face close to camera has a lot. “When you have big spectacle shots with a wide landscape and there’s nothing particularly close to camera, there’s not a lot of stereo possibility there unless there’s a big cheat,” says Knoll. “And that was one of my stylistic guidelines at the beginning that we’re not going to do hyper-stereo. We’re going to look at the sequences more holistically and if you have a big wide epic shot, as George is very wont to do, we’re not going to push it. I’m going to rely on the fact that on either side of the epic shots there’ll be closer shots where we’re in with the characters where there are lots of stereo possibilities.”
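Knoll’s point about distant objects follows from simple geometry. The sketch below assumes an idealized pair of parallel pinhole cameras, where a point at a given distance produces an on-sensor disparity of focal length times interaxial divided by that distance. The function names and the example numbers (a 35 mm lens, a 65 mm interaxial, a face and a mountain range) are illustrative assumptions, not figures from the production.

```python
def disparity_mm(focal_mm, interaxial_mm, distance_mm):
    """On-sensor disparity of a point for two parallel pinhole cameras."""
    return focal_mm * interaxial_mm / distance_mm


def volume_mm(focal_mm, interaxial_mm, near_mm, far_mm):
    """Disparity difference between the front and back of an object.

    This difference is what reads as roundness or volume.
    """
    return (disparity_mm(focal_mm, interaxial_mm, near_mm)
            - disparity_mm(focal_mm, interaxial_mm, far_mm))


def interaxial_for_volume(focal_mm, target_volume_mm, near_mm, far_mm):
    """Interaxial needed to give an object a chosen disparity range."""
    return target_volume_mm / (focal_mm * (1.0 / near_mm - 1.0 / far_mm))


FOCAL = 35.0       # assumed lens, mm
INTERAXIAL = 65.0  # assumed camera separation, mm

# A face 1 m away that is 20 cm deep.
face = volume_mm(FOCAL, INTERAXIAL, 1_000.0, 1_200.0)

# A mountain range whose front is 1 km away and whose back is 2 km away.
mountain = volume_mm(FOCAL, INTERAXIAL, 1_000_000.0, 2_000_000.0)

# Interaxial that would give the mountains the same disparity range as the face.
cheat = interaxial_for_volume(FOCAL, face, 1_000_000.0, 2_000_000.0)

print(f"face volume:     {face:.4f} mm on sensor")
print(f"mountain volume: {mountain:.6f} mm on sensor")
print(f"interaxial to match the face: {cheat / 1000.0:.1f} m")
```

Under these assumptions the mountains would need an interaxial of many meters to show face-like volume, which is the sort of ‘big cheat’ that Knoll describes and that risks the miniaturized hyper-stereo look.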
All the work was formally reviewed in a theater and not on a stereo monitor. Knoll had taken this step on Avatar since the screen size directly affects the stereo perception. While one could argue that everything scales, one thing that does not is the distance between your actual eyes. So the stereo effect is directly impacted by screen size and the viewer’s distance from it. “On small 19 inch monitors we had converted for stereo, you don’t really see anything that’s closer than 4 inches from the screen or deeper than about 3 or 4 inches deep,” he says. “So it all has this very compressed look. So if you’re making aesthetic decisions based on that form, the tendency would be to make much deeper stereo than you could really deal with in the theater. Looking at stereo on a monitor is useful for spotting gross errors. But my favorite technique for looking at what that mistake is was very simple – just popping back and forth between mono – right and left where everything shifts.”
There needed to be two new color timing parts to the show and Knoll used this opportunity to upgrade all the material, since when Episode I was originally finished, it was done on a per shot basis – “done sort of old style, final a shot, film out a shot, look at a print of that negative and that’s what we would final”. It went through a conventional negative cut. An optical timed IP was generated from that and then the master printing negatives were made from that timed IP. So everything audiences saw in the theater was two generations down from the original. “When the original DVD was released,” Knoll says, “it came from scanning in the timed IP because it was the simplest thing to do. But when it came time to do [the conversion], we were going to take the movie and cut it up into 2,000 separate pieces, work on them and re-assemble it, we had an opportunity to go back to the original material. We could go back to the original film-out tapes that are a couple of generations better than what had been seen. So we figured let’s do that. We made a concerted effort to collect all the bits, re-create all the dissolves and pre-wipes. So that was all pre-graded material, so we had to do all new color timing, just to have the new Blu-ray master. Then there is a device-dependent color timing that’s done to compensate for the light loss that comes from stereo.”
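The screen-size effect can be illustrated with the standard similar-triangles model of stereo viewing, in which a point with on-screen parallax p, seen from distance V by eyes a distance e apart, appears at distance V·e / (e − p) from the viewer. The sketch below is a simplified model under stated assumptions: the 65 mm eye separation, the screen widths, the viewing distances and the function names are all illustrative and do not come from Knoll.

```python
import math

EYE_SEPARATION_M = 0.065  # assumed adult interocular distance


def perceived_distance(viewing_distance_m, parallax_m, eye_sep_m=EYE_SEPARATION_M):
    """Distance from the viewer at which a fused point appears.

    parallax_m is the on-screen separation of the left and right images:
    positive places the point behind the screen, negative in front.
    """
    if parallax_m >= eye_sep_m:
        return math.inf  # eyes would have to be parallel or diverge
    return viewing_distance_m * eye_sep_m / (eye_sep_m - parallax_m)


def depth_behind_screen(screen_width_m, viewing_distance_m, parallax_fraction):
    """Apparent depth behind the screen for a parallax given as a
    fraction of the image width."""
    parallax_m = parallax_fraction * screen_width_m
    return perceived_distance(viewing_distance_m, parallax_m) - viewing_distance_m


PARALLAX = 0.005  # the same shot: 0.5% of image width on both displays

monitor = depth_behind_screen(0.42, 0.6, PARALLAX)   # roughly a 19 inch monitor
theater = depth_behind_screen(12.0, 15.0, PARALLAX)  # hypothetical cinema screen

print(f"monitor: {monitor:.3f} m behind the screen")
print(f"theater: {theater:.1f} m behind the screen")
```

Because the eye separation stays fixed while the physical parallax grows with the screen, the same image reads as shallow on the monitor and very deep in the theater, which is why depth judged on a small display tends to be pushed too far.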
John Knoll really pushed hard to make this conversion exceptional and by all accounts Prime Focus rose to the challenge. “I saw real growth in Prime Focus’ ability to do high quality material with fewer iterations over the course of the project,” he says.
Knoll had three areas of concern going into the process:
1. Hyper-stereo: where everything gets this ‘super-deep’ look, even things that are inappropriately that way. “That throws me out of the movie where suddenly I’m looking at a big wide scene and there’s all these little people running around on a tabletop. I’m just thrown out of the narrative. Having worked on Episode I for two and a half years, we put a lot of effort into establishing scale, and to have that all undermined by miniaturising everything.”
2. Over driving stereo: which is extreme roundness and volume. “I don’t like weird, warpy depth, where you see over-driven stereo inside a character where his nose sticks way out or his forehead is bulgy.”
3. Bad in-fill: this is the repair or restoring of missing information now seen from a new camera. “In-fill is probably the single most complicated and labor-intensive part of the post process. It’s often skimped on or skipped entirely. So you have a character that will appear to be at the right depth and the edges will recede back because either they skimped or didn’t bother with in-fill. It looks like a character’s hair extrudes back behind them.”
Prime Focus aimed to address all of these and match Knoll’s vision for good conversion.
Prime Focus’ View-D 2D to stereo 3D conversion tool is a typical example of the specialist tool used after roto for the actual second camera/eye stereo compositing and repair work. This tool was developed in house at Prime Focus from an initial technology concept by Chris Bond (then President of Prime Focus Film VFX, North America). The tool allows artists to take the roto work that has been done to identify the separate elements and then expand and build from that the second eye and stereo effect.
For example, on Dawn Treader, the comp team using View-D converted the roto files of the mono shots to final stereo shots in about 8 weeks (after roto was done). As with most pipelines, general tools are incorporated in the workflow and Prime Focus’ workflow includes Fusion which has been modified and expanded with custom plugins. While the software is evolving, the process is not an automated one. The techniques are still very manual. It is not an automated 2D-in / 3D-out process, so artist skill is as vital as workflow R&D tools.
On Clash of the Titans, the roto was done in Mumbai but the compositing team was small and relatively new to View-D. In fact at the time, View-D was version 1.0. In contrast, on the completion of The Dawn Treader the software had progressed by several major versions and changed and expanded considerably to version 3.0, according to Matthew Bristowe, Joint Managing Director, View-D London. Today the software is effectively at version 10.0, and it has expanded greatly. Key now to the View-D system is the metadata and shot management software that runs alongside the advanced technology.
While Prime Focus still uses Fusion, it is also looking to move large tracts of its work to Nuke, since the company now works more closely with visual effects houses and needs to be able to access vfx assets such as Nuke scripts, Maya scene files and even Lidar scans and on set measurements. Prime Focus is able to reduce the time it takes to produce really high quality shot conversion by opening up the View-D process to import much more data from vfx houses, and it is looking to build on this even further.
When we wrote the first version of this Art of Stereo story here at fxguide.com 24 months ago, Prime Focus had “85 experienced artists in the Los Angeles facility, 65 out of the London facility and the Mumbai facility had 40 experienced compositors plus another 100 artists in training, and this is before you get to the roto team.” Today that has changed to the LA facility being just R&D, London still having about 65 artists, but now the Indian operation has expanded to “500 seasoned A level artists, and another +1500 roto artists,” according to Bristowe. This demonstrates not only the size of Prime Focus but illustrates their ability to handle more than one film at a time.
Prime Focus was the primary facility on not just Star Wars but also Harry Potter and the Deathly Hallows – Part 2, Wrath of the Titans and more. These other two films were subject to enormous critical evaluation, one for its place in film history as the climax of a film and publishing phenomenon, and the other due to the criticism leveled over the first film. Prime Focus knows that the first Titans film was not well reviewed in stereo and it firmly believes that this pushes the studio to do better work, while subjecting it to perhaps even higher standards than most when it comes to stereo conversion. While Clash of the Titans was not a masterpiece of stereo conversion – far from it – the company certainly has creatively and financially bounced back, showing perhaps that, as with all things in VFX, companies constantly improve techniques and do so through the fires of real production. Today Prime Focus is one of the most dominant players in the industry and no one disputes the quality of Phantom Menace, Harry Potter or Wrath of the Titans.
In regards to the combined software and artist workflow approach, Bristowe explains: “This is version 3.0 and we are already working on version 4, 5 and 6 of this technology, it is not finished being improved.”
Meanwhile, Knoll still believes that if someone can make a stereo film natively, shooting in stereo is the preferred method. “I feel like if you’re going to originate new material for stereo exhibition you should shoot in stereo. Varying inter-axial depending on focal length, like Cameron did on Avatar, makes a lot of sense and I would shoot stereo that way.”
With regards to the debate over converged or parallel, Knoll has seen both having worked on Avatar and seen a huge number of technical tests: “I’d shoot completely parallel if you can because then you’re avoiding keystoned mismatching between the eyes. The actual converged cameras on Avatar did tend to create some eyestrain since there were different magnifications on the different sides of frame – some vertical misalignment there,” he says, adding, “If you think about a subjective experience of seeing 3D on the screen – just close your eyes alternately and you’re just seeing objects shift over – you don’t see it. Your retina is a hemispherical surface and there isn’t any keystoning between what you see in your eyes. So when you’re shooting material it should be more like that subjective experience, especially because both your eyes are focused on the same screen, so the keystone of your eyes is the same that way.”
Knoll feels that a better way of doing stereo is to mount the cameras completely in parallel and to shoot with extra image area, which allows the convergence to be set later. He believes that this reduces on set errors, since the cameras are locked straight parallel, and one then has the option to make convergence decisions in post – adjusting using inter-axial.
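To make the idea of setting convergence in post concrete, here is a minimal sketch of horizontal image translation on a parallel-shot pair. It is an illustration only, not a description of any production tool: the function name `converge_in_post`, the list-of-rows image representation and the whole-pixel shift are all assumptions. It also shows why the extra image width matters, because the shifted columns are cropped away.

```python
def converge_in_post(left, right, shift):
    """Re-converge a parallel-shot stereo pair by horizontal image translation.

    left, right: images as lists of rows, each row a list of pixel values.
    shift: whole number of pixels (>= 0). An object whose left-eye position
    is `shift` pixels to the right of its right-eye position ends up at the
    same x in both eyes, i.e. on the screen plane.

    Assumption for this sketch: the cameras were parallel, so nearer objects
    sit further right in the left eye than in the right eye.
    """
    if shift < 0:
        raise ValueError("shift must be non-negative")
    width = len(left[0])
    if shift >= width:
        raise ValueError("shift must be smaller than the image width")
    # Cropping the left side of the left eye moves its content left;
    # cropping the right side of the right eye keeps both eyes the same width.
    new_left = [row[shift:] for row in left]
    new_right = [row[:width - shift] for row in right]
    return new_left, new_right


# One scanline: the object (value 9) is at x=5 in the left eye, x=3 in the right.
left_eye = [[0, 0, 0, 0, 0, 9, 0, 0]]
right_eye = [[0, 0, 0, 9, 0, 0, 0, 0]]
new_left, new_right = converge_in_post(left_eye, right_eye, 2)
```

After the call, the object sits at the same column in both eyes, and each image is two pixels narrower than the original, which is the overscan that has to be shot in the first place.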
The VFX supervisor also had the chance to compare stereo approaches while working on Rango. “There was a time when there was some very serious discussion of Rango being done as a stereo release,” he says. “We screened various films back to back. I’d seen Avatar – we looked at Shrek 4, Up, How to Train Your Dragon and Alice in Wonderland. We had digital prints of these and could watch five minutes at a time in one theater. It was a really great way to form opinions about what you thought worked and what didn’t.”
“The style that Dreamworks was doing in How to Train Your Dragon – they weren’t racking convergence, not the way Jim did in Avatar,” adds Knoll. “In general, convergence followed focus so that if you had a character that walked from the background to the foreground, the convergence would rack with them, so we stayed with convergence at the screen plane. The logic behind that is to reduce eye strain on cuts. That was also the screening that I was convinced floating windows would work well.”
Specialist Tools: Dimensionalization by In-Three, now Digital Domain Stereo Group
It is worth noting that there is one other significant player in high-end stereo conversion – the company formerly known as In-Three which is now part of Digital Domain. Digital Domain’s stereo conversion team also worked on Transformers: Dark of the Moon as well as other films that they were not the visual effects house on, such as The Smurfs 3D.
Historically, In-Three’s stereo process was called Dimensionalization, and was launched in 1997. It uses Imagineer’s Mocha tools to help produce the roto isolation, Nuke for stereoscopic compositing, and their proprietary disparity generation software, Intrigue.
In 2008, Disney and Bruckheimer films turned to the then In-Three to handle the live-action 2D to 3D conversion of G-Force, while Sony Pictures Imageworks (SPI) handled the conversion of the CG animation integrated scenes. “This was our opportunity to go full scale with our newly developed workflow. With all the pieces in place with our proprietary solution, and with mocha integrated nicely into our pipeline, we could confidently take on this challenge,” explained In-Three’s Alex Torres.
The results were a success with the G-Force 3D team recognized by the International 3D Society, winning the 2D to 3D Conversion Project of that year. It was so successful, Disney came back to the then In-Three for Alice In Wonderland. In-Three was assigned the task of converting scenes from the prologue and the epilogue of the film that were primarily live action. Elements created at Matte World Digital and Café FX were integrated into some shots as well.
In November of 2010, Digital Domain Media Group (parent company of Digital Domain Productions, Venice) bought the Thousand Oaks, California-based In-Three Inc. and announced that it was moving it to Port St. Lucie, Florida, where Digital Domain Media Group has its HQ. Michael Bay is an equity partner in the feature film visual effects company but is not an equity partner in the broader Digital Domain Media Group. The stereo group is now part of Digital Domain Media Group.
A team from In-Three shifted to Port St. Lucie in Florida, where it now works on further stereo conversion. This is separate from the other animation office of Digital Domain, Tradition Films, which is working on feature films. The 11-year-old In-Three had most recently worked with Digital Domain on some sequences on TRON: Legacy prior to the sale. The Digital Domain stereo team in Florida is a sizable facility for stereo conversion, but it will almost certainly work as a hub to outsource, perhaps to the co-ventures Digital Domain Holdings has in both China and elsewhere.
“A large part of the stereo conversion team came from In-Three, 48 of the 65 or 70 employees from In-Three made the move all the way to Florida, that was what was interesting, they combined very well with local talent... and they have been very effective with generating high quality stereo conversion,” explained John Textor, Head of Digital Domain Holdings, when he spoke exclusively to fxguide last month. In the last year since moving to Florida the team has expanded and there has been intense R&D. The team in Florida is now several hundred employees with most of the paint and roto sent “out of house”. The stereoscopic compositing and more difficult paint work is done in Florida. The team has really enjoyed expanding and taking advantage of the increased resources that DD allows. The team has extensively hired locally, and done extensive training as there is not a large pool of local senior talent in Florida, with the industry being so young in that part of the USA.
The team now known as Digital Domain Stereo Group (DDSG) worked on Transformers: Dark of the Moon. While having perhaps fewer shots than Legend3D, they certainly believe they “had the more difficult hero shots on the project,” according to Jon Karafin, who is Director of Production Operations at DDSG. They have also worked on The Smurfs 3D for Sony Pictures and they have several projects in the works currently (including a large library conversion) that they cannot yet discuss.
The group has also expanded its tools considerably in the last 24 months. Two examples of this are:
• Decompositing (Decomp)
• Quick Three (Quick3)
Decompositing is a tool for preparing shots. “It allows us to re-construct very efficiently, new eye views and perspectives for the occlusion and transparency information to really provide the visual effects quality result that Digital Domain represents,” says Jeff Barnes, VP/GM DDSG.
The tool builds on DDSG’s close understanding and relationship with DD visual effects. “The strength of this group is that we have DD behind us. Pretty much all the other conversion houses do not have the breadth of visual effects background that DD has, so we have a huge advantage in terms of what we can do with hero big complex vfx shows. The big trailer shots – no one can get the most out of those assets as we can.”
The Decompositing tools and the main Dimensionalization aim to provide complex depth information for each pixel, not a single z depth for each pixel.
Some pixels represent CG elements that have been added and hence there is an ability to access perhaps original geo, XYZ, or z depth information from the DD CGI pipeline, while others are captured as part of the plate photography and thus have no z depth information, in a compositing sense, separate from the background they were filmed over originally. So in a shot from Transformers there may be smoke added as CG smoke and also real smoke from the set that is drifting around an actor.
Starting with the first case where the smoke is added as CG: as DD has such a strong vfx background and was even the birth place of Nuke, the team can pass along complex vfx information and metadata to the Dimensionalization process. For the second example, where the smoke may have been filmed in camera drifting over the face of an actor, DD’s tools isolate the face, completely separating it from the smoke, and allow the smoke either to be rendered back in as a new particle system animation with all the depth information that goes along with that, or to have a volumetric geometry that provides a template for depth or disparity map generation. In both cases, DD builds what it calls Deep Disparity maps that feed the Dimensionalization toolsets.
What is interesting is that the team is currently pushing this even further: in a prototype phase, it has this working with deep compositing, albeit in a proprietary format. The new deep compositing holds not one z-depth value for each pixel but a huge deep array of data that allows for compositing in participating media or volumetric elements like smoke or haze. Unlike normal z depth these vastly bigger files allow for correct compositing and thus disparity at any point inside this volumetric file. While file sizes explode, the level of quality and complex disparity that this makes possible is also vastly improved. DDSG is the first company fxguide knows of that is merging the new deep compositing technology with complex disparity mapping in high quality stereo conversion.
“Whenever you have a single pixel that extends across multiple spatial planes like volumetric smoke, rain, reflections, motion blur, particles – anything of that sort – that is what the decompositing process lends itself to,” says Karafin, “because what we are doing is fully reconstructing every pixel so you have independent alpha and spatial planes to place them correctly in the stereoscopic scene. For example, we can reconstruct smoke seamlessly in the foreground without pulling forward any of the background information.” What this means to a director is substantial: “We can re-position, re-compose, re-lens a scene all as a post process – if the client wants to do that.”
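The value of independent alpha and spatial planes can be sketched in a few lines. The example below is only an illustration of the general idea of shifting separated layers by their own disparity and compositing them back together; it is not DDSG's Decomp or Deep Disparity format, and the names `shift_row` and `render_eye`, the single-scanline layout, the unpremultiplied colors and the whole-pixel disparities are all assumptions.

```python
def shift_row(values, offset, fill=0.0):
    """Shift a scanline horizontally by a whole number of pixels."""
    width = len(values)
    out = [fill] * width
    for x, value in enumerate(values):
        new_x = x + offset
        if 0 <= new_x < width:
            out[new_x] = value
    return out


def render_eye(layers, eye_sign):
    """Build one eye from separated layers.

    layers: back-to-front list of (color_row, alpha_row, disparity) tuples.
    eye_sign: +1 or -1, the direction this eye shifts each layer.
    Each layer moves by its own disparity before a standard "over" composite,
    so a semi-transparent layer does not drag the background along with it.
    """
    width = len(layers[0][0])
    result = [0.0] * width
    for color, alpha, disparity in layers:
        offset = eye_sign * disparity
        shifted_color = shift_row(color, offset)
        shifted_alpha = shift_row(alpha, offset)
        result = [c * a + r * (1.0 - a)
                  for c, a, r in zip(shifted_color, shifted_alpha, result)]
    return result


# A solid background at screen depth and a wisp of half-transparent smoke
# that should sit in front of it (hypothetical disparity of 2 pixels).
background = ([1.0] * 6, [1.0] * 6, 0)
smoke = ([0.5] * 6, [0.0, 0.0, 0.5, 0.0, 0.0, 0.0], 2)
new_view = render_eye([background, smoke], eye_sign=1)
```

In the resulting scanline only the smoke has moved: the background values under its old position are intact, which is what a single z depth per pixel cannot deliver for a pixel that mixes smoke and background.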
Like other companies in the last 24 months, the DDSG team has built up its project management and metadata asset tools. “One of the great advantages of joining Digital Domain is that they already had an amazing pipeline to handle feature films, so that was part of our integration process – being able to tap into their pipeline to track all the meta data and project management tools,” says the former VP of Production and Operations of In-Three, who saw that integration first hand.
Quick3 is a tool set that allows DDSG to produce faster and often times short duration stereo conversion for either games, mobile devices or other non-feature film markets. It is very much part of the brief of the Florida-based DDSG to explore non-feature film projects and this is something that excites the group. After all, both DD as a company and many of the staff have had a long background in commercials, film clips (music videos) and game cinematics. “We love features, but we love all the other genres – the game cinematic we are working on right now is just amazing, the visuals are so compelling and the crew is beyond excited about it,” says Barnes. “I don’t think we are locked in to the specific medium – features tend to be bigger awards, but the shorter projects can be really appealing on a creative level for a bunch of reasons.” For DDSG to convert a major library title the team would take six to eight months to complete. Moving forward and building on Quick3 the team is looking forward to diversifying into these other areas.
General Approaches at DDSG
In terms of house style, and for Transformers: Dark of the Moon, the DDSG team produced left eye from right, right eye from left or two new eyes from a central hero original view on a per shot basis. DDSG explained that Michael Bay was great to work with and even though Legend3D did not take this approach on their work on the same film, DDSG changed this approach as per what they thought was best for the shot, and varied it on the shots they did for Bay and Corey Turner, the film’s stereographer. “They let us run with the show creatively, and pick what was best for the shot,” explains Karafin.
DDSG uses roto, geometry and projection technologies. Karafin summarizes the process as “an extremely hybrid approach, we do work within the disparity space, but we can move between disparity space/screen space and real world co-ordinates seamlessly. Everything we do starts with object isolation and segmentation, and from there our tools generate the geometry that drives the disparity mapping processes. From there within the stereoscopic composite – it is using the geometry to drive the image processing to perform the displacement. So we are really leveraging every form of the stereoscopic creation technologies as possible to provide the most versatile and highest quality results.”
Industry Tools of the trade
Roto is the primary tool used for stereo conversion by volume. While the actual roto itself just prepares the material, it is some of the most time consuming work in any conversion process.
Almost all programs have roto tools but key to the success of roto tools in a stereo pipeline is the ability to export and import the splines so that the roto can be built into a pipeline. Roto is normally a stage in a pipeline, it is unlikely the roto artist will also produce the final stereo.
Once footage is stereo converted, the missing information due to parallax must be replaced. Several approaches to this are used, including in-painting, often a proprietary tool that effectively smears into the gap the colors from the area surrounding the gap. This is only needed for one eye and only on one edge as the other side of the same object would move in the opposite way and fractionally cover some part of original image.
Another approach projects the image over a point cloud calculated from a depth map. If the objects in a scene move or the camera moves, it is often possible to use optical flow style techniques over a set of frames to produce a disparity map. If this process is inaccurate, an edge of a person’s hair, for example, can appear to not be where the character’s face is, but rather projected on the wall or building behind the character.
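The simplest form of this smear can be sketched as follows. This is a deliberately naive illustration, not any vendor's in-painting tool: the function name `smear_fill`, the use of `None` to mark revealed gap pixels and the single-scanline layout are assumptions made for the example.

```python
def smear_fill(row, from_left=True):
    """Fill gap pixels (None) by repeating the nearest valid pixel on one side.

    row: one scanline, with None marking pixels revealed by the parallax shift.
    from_left: if True, colors are smeared in from the left of each gap,
    otherwise from the right. In practice the side to smear from is the
    background side of the gap, away from the foreground object.
    Gaps with no valid pixel on the chosen side are left as None.
    """
    filled = list(row)
    if from_left:
        indices = range(len(filled))
    else:
        indices = range(len(filled) - 1, -1, -1)
    last_valid = None
    for i in indices:
        if filled[i] is None:
            filled[i] = last_valid
        else:
            last_valid = filled[i]
    return filled


# Background (10) on the left, a two-pixel gap, then a foreground object (80).
scanline = [10, None, None, 80, 80]
repaired = smear_fill(scanline, from_left=True)
```

Here the gap is filled with the background value rather than the foreground one, so the object's edge stays clean. Real in-fill has to cope with texture, motion and detail such as hair, which is why it is so labor-intensive.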
Several vendors use projection techniques to re-project imagery over stand-in geometry. This combination of scene reconstruction and roto has proven very popular in Nuke. The Foundry’s Nuke is now the mainstay of feature film compositing for many companies but it has also been one of the most popular tools for doing short form TV show or commercials stereo conversion. Nuke naturally works closely with Ocula, as it is currently the only visual effects compositing tool that supports the software. (See John Carter case study above)
Floating windows, the technique of having a false black frame that allows the film maker to use negative stereo space (i.e. in the theatre) without edge violations, date back to the 1950s. One of the first modern films to deploy floating windows was Meet the Robinsons. Some companies such as Disney and Pixar use it on 95% of shots while Cameron / Pace tends to not use floating windows at all. Some grading companies do not have a pipeline in their grading desk that allows for floating windows.
While someone can perfectly convert a film, they are limited in their ability to stage the action for stereo depth cues to the actual original composition. Some people would argue that native stereo allows for minor art direction and camera composition that will improve the stereo experience, while others point to the fact that a great shot is a great shot: cinematographers will compose for the shot they want, and good shot composition will naturally have depth and will not suffer from the lack of an opportunity to recompose for stereo versus the normal 2D composition.
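The geometry of a floating window is simple enough to sketch. The example below is an illustration under assumptions, not any studio's grading or finishing pipeline: the function name `float_window`, the list-of-rows images and the use of 0 for black are invented for the example. Masking the left edge of the left eye and the right edge of the right eye gives the frame edges themselves negative parallax, so the window appears to sit in front of the screen.

```python
def float_window(left, right, left_edge_px, right_edge_px, black=0):
    """Add black masks so the stereo window floats in front of the screen.

    left, right: images as lists of rows (each row a list of pixel values).
    left_edge_px: width of the mask on the left edge of the LEFT eye.
    right_edge_px: width of the mask on the RIGHT edge of the RIGHT eye.
    The two widths can differ, which tilts the window.
    """
    width = len(left[0])
    if not (0 <= left_edge_px <= width and 0 <= right_edge_px <= width):
        raise ValueError("mask widths must lie between 0 and the image width")
    new_left = [[black] * left_edge_px + row[left_edge_px:] for row in left]
    new_right = [row[:width - right_edge_px] + [black] * right_edge_px
                 for row in right]
    return new_left, new_right


left_eye = [[5, 5, 5, 5, 5, 5]]
right_eye = [[7, 7, 7, 7, 7, 7]]
windowed_left, windowed_right = float_window(left_eye, right_eye, 2, 1)
```

An object that breaks the frame edge in negative space then sits behind the masked window edge rather than in front of it, which removes the edge violation.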
Roto-puff is a term that started floating around studio supervisors when they saw cheaply done conversion. It is used to describe the main approach to stereo conversion, which is rotoscope, blurs, and gradients. The term comes from the early, less nuanced results; in the hands of good and experienced artists the approach can produce phenomenal results, but it requires a lot of work and experience. Roto-puff is now considered inferior to more intelligent approaches that allow for complex “depth sculptures”. Johnathan Banta, Stereo Pipeline Specialist at 3DCG: The Stereo Consulting Group, has consulted on a lot of native stereo projects, as well as stereo conversion. Banta notes that simple roto style approaches have been left behind as companies have moved beyond just roto. Companies that really understand stereo conversions, he argues, have a vastly more nuanced solution to the problem.
“I prefer the method that works best for the problem at hand. I did the first stereo projection maps in 1998, and at first favored that technique. I did roto-puff technique (gradient displacement depth maps) as early as 1996, and refined the technique in 2002. I then started using rendered depth maps, back-filled into the rotoscope shapes to get quicker results that were more consistent. Some pipelines only pick one of these techniques, and I prefer that the best tool be used for the job. A conversion house should be prepared to do any technique that is the quickest and best solution — whatever that may be.”
While he feels the best in the industry have moved beyond ‘roto-puff’, “Roto is not the problem. Roto is a necessity, but (not enough) time spent sculpting is what gives stereo conversion the bad reputation”, he said, pointing out that the sacrifice and hard work of the thousands of artists involved should be rewarded by the industry as a whole.
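For readers who have not seen it, the basic roto-puff recipe (a roto shape, a blur and a resulting depth gradient) can be sketched as below. This is a toy one-scanline illustration of the general idea, not Banta's technique or any studio's tool; the function names, the box blur and the parameters `base_depth`, `puff` and `radius` are assumptions.

```python
def box_blur(values, radius):
    """Simple 1D box blur; the window is clipped at the ends of the row."""
    blurred = []
    count = len(values)
    for i in range(count):
        window = values[max(0, i - radius):min(count, i + radius + 1)]
        blurred.append(sum(window) / len(window))
    return blurred


def roto_puff(mask, base_depth, puff, radius):
    """Turn a roto mask into a rounded displacement for one scanline.

    mask: 1 inside the roto shape, 0 outside.
    base_depth: displacement given to the whole shape (a flat cut-out).
    puff: extra displacement at the shape's center, falling off at the edges.
    radius: blur radius that controls how soft the roundness is.
    Pixels outside the shape get 0.0 (no displacement from this shape).
    """
    soft = box_blur([float(m) for m in mask], radius)
    return [base_depth + puff * s if m else 0.0 for m, s in zip(mask, soft)]


mask = [0, 0, 1, 1, 1, 1, 1, 0, 0]
displacement = roto_puff(mask, base_depth=10.0, puff=2.0, radius=1)
```

The center of the shape is pushed further than its edges, which gives the cut-out some roundness. Every shape gets the same generic bulge, which is exactly the limitation that the sculpting Banta describes is meant to overcome.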
Ocula: The Foundry
The Foundry in London, long time image processing and compositing experts, makes Ocula for stereo post production. Ocula produces some of the highest quality disparity maps allowing for a range of high quality solutions. Ocula was not primarily developed for stereo conversion. It was actually more developed for stereo production correction and adjustment. Filmed stereo footage can be analyzed and the stereo properties very accurately adjusted to aid matching and camera/rig imperfections, as well as accurate stereo roto. This is roto in which an object’s shapes are matched between the left and right eyes, not roto designed for stereo conversion.
Several major effects houses use Ocula. Digital Domain used it on TRON: Legacy. ILM decided to invest in a significant number of Ocula seats after a successful evaluation period using it to tackle common stereo problems and used it on Avatar. MPC, London, has also made a major Ocula pipeline using it on Disney’s Pirates of the Caribbean: On Stranger Tides. They used Ocula to resolve vertical alignment issues and to color match plates.
Ocula is currently at version 3.0.
Problems addressed are primarily from filming in stereo such as:
- Color matching – fixing differences introduced by polarisation and mirror rigs
- Vertical alignment – correcting vertical offsets introduced by mismatched camera rigs and keystoning
- Focus matching – fixing focus differences between cameras by deblurring and view reconstruction
- Interaxial shifting – altering the interaxial distance baked into a stereo pair by rebuilding views
- Retiming – stereo motion vectors for retimes with correct stereo alignment
- Disparity generation – for automating and speeding up day to day compositing tasks including automated roto view generation, convergence manipulation, depth channel generation, material QC and more.
Ocula 3.0 has a new disparity generation algorithm. The algorithm in Ocula 2.X can deliver accurate disparities but also produces noisy results. This often meant a lot of clean-up to get clean disparity vectors and the danger there is that once you process the vectors they no longer correctly match pixels in the left and right views.
The algorithm in Ocula 3.0 is based on a new technique that smoothes the disparity field as part of the process of matching the left and right eye images. The result is that you get cleaner results quicker. The algorithm is also specifically designed to match edges cleanly and to provide a set of intuitive parameters that allow the user to control how much emphasis is given to different constraints such as the vertical alignment of the camera rig.
“We made use of this new algorithm to introduce stereo motion vectors for retiming in Ocula 3. The motion vectors in the left and right eye image are calculated simultaneously to match the disparity vectors using the same set of intuitive parameters. This allows the user to control how similar the motion is in each eye to produce consistent stereo retimes” commented Ian Hall of the Foundry.
Ocula provides a toolset to correct native stereo footage. The main tasks are vertical alignment and color correction, but it also provides tools to match focus, build z-passes, rebuild views, re-time footage as well as review the stereo corrections in a composite. At the moment Ocula does not deal specifically with stereo conversion other than providing the toolset to rebuild one view from another based on disparity.
You can generate disparity if you have a 3D model of the world or z-depth for one camera. The O_DepthToDisparity node in Ocula generates the disparity vectors for a CG stereo rig given z-depth. The problem is producing the z-depth for a single camera.
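The relationship between z-depth and disparity can be illustrated with textbook stereo geometry. The sketch below is not the O_DepthToDisparity implementation; it assumes an idealized parallel pinhole rig whose convergence is set by shifting the images, and the function name, parameter names and sign convention (positive values fall behind the screen) are choices made for this example.

```python
def depth_to_disparity(z, focal_px, interaxial, convergence):
    """Screen parallax in pixels for a point at depth z.

    z: distance from the camera to the point (same units as interaxial).
    focal_px: focal length expressed in pixels.
    interaxial: distance between the two camera centers.
    convergence: distance that should land on the screen plane.
    Returns a positive value for points behind the screen plane, zero at the
    convergence distance, and a negative value for points in front of it.
    """
    if z <= 0 or convergence <= 0:
        raise ValueError("z and convergence must be positive")
    return focal_px * interaxial * (1.0 / convergence - 1.0 / z)


# Hypothetical rig: 2000 px focal length, 65 mm interaxial, converged at 5 m.
at_screen = depth_to_disparity(5.0, 2000.0, 0.065, 5.0)
behind = depth_to_disparity(10.0, 2000.0, 0.065, 5.0)
in_front = depth_to_disparity(2.5, 2000.0, 0.065, 5.0)
```

With these numbers a point at 10 m lands about 13 pixels behind the screen and a point at 2.5 m about 26 pixels in front of it. The arithmetic is trivial once z-depth exists; as the text says, the hard part in conversion is producing that z-depth from a single camera.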
NukeX has a set of tools to calculate and model depth from one view. The DepthGenerator node calculates a depth pass for a camera for the background elements. Unfortunately there is not a magic answer – there is often insufficient parallax in camera moves to be able to recover the depth accurately. We are currently in the process of reviewing the toolset to create 3D geometry in NukeX to generate cleaner results and give the user more freedom in authoring geometry.
There is a huge amount of stereo conversion being done using Nuke. But there are not any specialized tools in Nuke for conversion; some industry sources have commented that the Foundry aimed at native stereo production rather than conversion. That being said, it is key to many conversion pipelines. Roto tools, Furnace, the DepthGenerator and the other geometry tools in NukeX are widely being used.
Each company that uses Nuke tends to use a blended pipeline of standard tools, plus an amount of special secret sauce which helps them to differentiate and to be efficient. As usual, it comes down to establishing the most efficient processes for the particular margins the work offers. Customers use fast layer-based techniques, and they use full match-moved geometry, and they use everything in between.
For more on Ocula as a stereo tool for shot stereo check out this fxguide story on Avatar.
In the area of specialist tools, Mocha from Imagineer is a vital tool for pipelines requiring roto.
April 16th, 2012 saw the release of Mocha v3. The latest version has a number of new features and a huge amount of work went into improving workflow.
New features include:
- New roto tools: Transform Tool and Join Layers (point parenting).
- Layer Management: View, organize, color code and group layers
- Project Management: Merge and share projects between artists
- 3D Camera Solve: Camera export for After Effects, Nuke, C4D & Maya
- Dope Sheet: new keyframe editor
For more on Mocha 3.0 read our first look story here at fxguide.
Pixel Farm: PFDepth
PFDepth performs 2D to 3D conversion by assigning accurate depth cues relative to the camera’s position over time. Dynamic adjustments to the perceived dimensionality of the scene are animated automatically, which makes PFDepth distinctive amongst stereoscopic post production tools.
“Through speaking to our users who do this type of work, we’ve learned a lot about the significant complications encountered when trying to build a workflow around various loosely-connected applications,” says Daryl Shail, VFX Product Manager at The Pixel Farm Ltd. “When we set out to develop PFDepth, we approached it as a total solution to directly address the specific challenges of 2D to 3D conversion, with an emphasis on productivity, intuitiveness, and flexibility. As a result, you can literally drop PFDepth into any existing post production pipeline, and start cranking out shots from day one.”
The unified product approach means that if you “are working in roto you can do that, if you are working in geometry you can do that, Projection – you can do that, depth maps – you can do that… and we are generating all that in PFDepth.”
The product works by assigning a depth to every pixel in the image. It has a number of different ways to modify image planes so that they have actual shape applied to them, whether that is by image modelling brought into PFDepth from PFTrack (contact geometry) or by other methods such as perspective shapes.
The product aims to avoid complex and tedious manual roto. For example, it has an x-spline with a planar tracker tool that helps with object identification by allowing computer assisted object isolation. An artist can paint or draw a line and the product will examine the image under the line and in this bounded region do image processing to determine foreground and background objects. From this, a detailed roto is provided with computer assistance. Roto is one of the largest problems in terms of time for stereo conversion so the Pixel Farm has really focused on providing a range of isolation tools. Another example is, say, a tree with gaps between the branches and leaves that needs to be isolated from a background. PFDepth provides a ‘detail keyer’ which can be used to sample the background and separate the tree without rotoing the individual leaves.
“From our perspective, the greatest advantages of PFDepth stem from working in real-world space – assigning depth values based on world space and offsetting with pre-defined units of measure, and using 3D camera positions (motion or static) to dynamically adjust the perceived dimensionality of the scene. This means I can assign a specific depth value to an object, as being 15′ away from the camera for example (as opposed to some arbitrary shade of grey), and automatically adjust convergence based on the camera’s position at any point in time,” says Shail. While the tool will have roto features, he points out that “the real coolness in PFDepth comes from the way we are working with geometry, depth maps, and depth modifiers (filters that can quickly deform a 2D image plane to add shape and surface texture)”.
As PFDepth works in concert with PFTrack, a solved camera solution from PFTrack can be fed into PFDepth and help to build a disparity depth map based on 3D tracking object point cloud isolation triangulation. As PFTrack understands real world cameras, PFDepth also understands paint strokes in terms of real world co-ordinates and thus real world grey scale distances.
PFDepth was shown as pre-release software at NAB and is currently being expanded upon with some new code. It will not be released for several months, and the release may involve a Beta program. Pixel Farm is keen to not over hype the product while it is still pre-alpha; fxguide will publish more when the product is actually released.
Quantel has built quite a following in 3D stereo workflows with the Quantel Pablo.
Modern VideoFilm used the Quantel Pablo on Avatar for conforming and stereo 3D checking, adjustment and quality control. The Pablos also handled all the stereo 3D English subtitling required for the Na’vi language used by some characters in the film.
Most of this work is in natively shot stereo workflows, but of course converted films and native stereo films look technically the same by the time they reach final DI grading. As both converted films and native films require extensive DI, we include the Quantel here as a key finishing tool.
James Cameron relied on Modern VideoFilm and its three Quantel Pablo systems, with 4K and 3D real-time capabilities. The Quantel Neo control surface was also used. Modern was also involved with the final grading of the stereo re-release of Titanic for Cameron.
The work typically done on Pablo includes conforming and stereo 3D checking, adjustment and quality control. For Avatar, Modern VideoFilm also conformed promotional materials as well as the estimated 10 North American versions on their Pablos. Work ran for nearly seven months, starting in June 2009.
Interestingly, Pablo’s precursor, iQ, was used at Modern VideoFilm to complete James Cameron’s 2003 3D IMAX film Ghosts of the Abyss. Quantel has been working to further refine its toolset as 3D advances. The Pablos at Modern VideoFilm allow the artists to work in stereo 3D in realtime, a major selling point that Quantel built their entire company on, while also providing serious file management and asset management.
“The biggest job was to manage the thousands of shots that made up this movie. There were many different versions and we had to keep the reels and cuts current with editorial,” explains Roger Berger, Modern VideoFilm Supervising Editor and Lead Pablo Artist on Avatar. “We were editing 24/7: updating lists, updating reels, updating versions, visual effects shots were coming in,” adds Mark Smirnoff, President of Modern VideoFilm Studio Services. “It was organic. We did not follow the traditional workflow.”
During the production of Avatar, a Pablo had as many as 12,000 clips online. In the end, the production consumed about 150 TB of SAN storage. In addition to completing the film itself, Modern VideoFilm met the demand for finishing movie footage for promotional events, each typically 10 to 20 minutes in length, all before the shots were finalized for the movie.
Like the Pablo, the SGO Mistika is a key finishing tool in stereo. This remarkable product has one of the most advanced optical flow engines for stereo, adjusting for mis-alignment, disparity, and color differences. Combining much of the real-time performance of the Pablo with the image processing technology of an Ocula pipeline, Mistika offers both automated and user-controlled adjustment tools that are some of the most advanced in the industry.
The Mistika is not cheap. It is a ‘big iron’ item like the Pablo, with dedicated hardware, but as such it delivers stunning results and very quickly processes stereo imagery for alignment and grading. Mistika is at the forefront of higher frame rate and higher image size stereo workflows. Mistika handles a range of formats including SD, HDTV, Film, Stereoscopic 3D, real-time 4K & 5K RED workflows, IMAX and multiple screen projection.
The product is currently being used on The Hobbit in New Zealand. Its roll out into the USA was delayed. The company took the unusual decision to roll the product out in secondary markets first and only recently started a full blown sales and marketing campaign in the US. SGO and Park Road Post Production, the premier New Zealand facility, have formed a new multi-year agreement. A strong bond already exists between Park Road and SGO, due to The Hobbit. Park Road has had Mistika systems since 2010, and as of April this year they have added another three systems.
Mistika is actually a range of products, from on-set tools to full large scale post-production systems, running natively on two large side by side user interface screens.
Technical Capabilities include:
- 16 bit per channel floating point processes
- Linux based OS
- Native support for most image file formats
- 3D LUT support
- HDR image support
- Motion Estimation based processing
- Optical flow correction and VA adjustment
- Auto regional near real time color-matching
- Real-time non-destructive processes for effects, colour grading, and more
- Background processing for any effect that can’t be performed real-time
It is an incredibly impressive product for stereo workflows.
Key Historical Timeline
(Key films of recent times – not exhaustive)
First use of projection mapping for stereo conversion of matte paintings – Sassoon Film Design, Metrolight Studios.
Displacement mapping and projection mixed for stereo conversion of selected scenes – Sassoon Film Design
Conversion of original lunar mission photography as set extension, and as full frame images in a stereo film – Sassoon Film Design, Digital Dimension and others.
First big digital animated 3D stereo film, kicks off the start of the modern digital stereo era.
Stereo conversion of Roar: Lions of the Kalahari to 3D. First full live action feature conversion using hybrid displace/projection methods.
Converted by Industrial Light & Magic.
Sony Pictures Imageworks.
Sea Monsters 3D
Combination of full 3D and conversion shots (by Sassoon Film Design).
In-Three and SPI conversion (plus some original stereo generated animation).
Minor conversion done by companies such as Stereo D
Alice in Wonderland
In-Three conversion for the start and end of the film. The Wonderland section was SPI.
Prime Focus converted the film in just 10 weeks.
Major conversion by Prime Focus.
Stereo consulting by 3D CG.
Pixel Magic Visual Effects
Stereo-D Stereo conversion
Stereo-D Stereo conversion
Gulliver’s Travels (2010)
Major conversion by Stereo D, Rocket Science 3D and I.E.Effects.
Around the start of the decade there was an explosion of stereo films. Many of the films, even those shot in stereo, required some stereo conversion. We have not listed all the films of the last couple of years; below are just a few key films with significant stereo conversion.
Harry Potter and the Deathly Hallows – Part 2 (2011)
Pixel Magic Visual Effects
Sassoon Film Design
Prime Focus View-D
Sassoon Film Design
* note this was supervised by Creative Cartel.
Supervised by ILM’s John Knoll
Conversion primarily by Prime Focus
Transformers of course was full of fast action with glass windows, smoke, explosions, sparks, and lens flares.
Here is Scott Squires’ discussion of the conversion work: http://effectscorner.blogspot.com.au/2011/08/2d-to-3d-conversions.html
John Carter (2012)
Cinesite (see above)
Stereo D (see above)
Stereo D provided all of the 3D conversion, augmentation of visual effects, and full 3D editorial services for this blockbuster mega hit.
Stereo D had a relationship with Marvel from Thor and Captain America, which led them to Whedon and Avengers. Director Joss Whedon and cinematographer Seamus McGarvey, ASC, BSC shot on the Arri Alexa in 2D, and Stereo D converted virtually all of the film. They also leveraged the Deluxe resources to create a workflow that allowed Joss Whedon and his team to see every phase of the post work from one place using the Deluxe Express system. (Stereo D is part of Deluxe Entertainment Services and sometimes known as Deluxe 3D).
Workflow was as follows:
Cameras: 4x Arri Alexa, with Panavision Primo and PCZ Lenses (plus Arriflex 435 with Panavision Primo Lenses for high speed work, and even some Canon EOS 5D Mark II). The frame was composed for the 1.85:1 aspect ratio, a concept suggested early on by Whedon. DOP McGarvey is quoted as saying, “Shooting 1.85:1 is kind of unusual for an epic film like this, but we needed the height in the screen to be able to frame in all the characters like Hulk, Captain America and Black Widow, who is much smaller. We had to give them all precedence and width within the frame. Also, Joss knew the final battle sequence was going to be this extravaganza in Manhattan, so the height and vertical scale of the buildings was going to be really important.”
On set: Production DIT Danny Hernandez managed four Codex Onboard Recorders on set. He recorded CDL (color decision list) values to the Codex along with the media from the cameras. McGarvey supervised color LUTs on set and the looks they established were passed on to Efilm, the post facility handling dailies processing and final digital intermediate work. The Alexas recorded to the Codex recorders in ARRIRAW format.
Edit: Editorial was located ‘near-set’ with a fully equipped Codex Digital Lab. The Editorial team checked and adjusted all the metadata. The shots were backed up to LTO-5 tape. The data-packs from the Codex recorders were then sent to Efilm for dailies processing. The edit was done mono and the plates were sent to the vfx houses for the 2200 visual effects shots. Stereo D converted mainly from the finished comped shots, but some sequences such as the HUD for Tony Stark’s Iron Man were deliberately designed for stereo. The advantage of this RAW workflow is that the footage was extremely clean and free of film artifacts like grain or film float, and of digital noise. The Alexa was rated at ISO 800, but given the huge range of lighting setups, from low light to bright sun, it was key to have very clean, noise-free images to produce the best stereo conversion with the least amount of signal processing or noise reduction.
The film was converted to stereo in postproduction, utilizing the Stereo D process. “I’ve shot a little 3D, but never really worked in it,” says McGarvey, the DOP. “I’ve always approached it with a very highly raised eyebrow and now I must say I’m a convert with a movie like this. It’s really striking to see how well the 3D works.”
The process was fairly standard in terms of the Stereo D process (see Titanic above), especially given that the team had done the films leading up to this, such as Thor and Captain America, both of which were heavy vfx films (although not as heavy as Avengers). But the mandate for this film was to “always be better”. There are several ways the process improved. Aside from technical advances, Stereo D worked very well in concert with its sister Deluxe facilities, from using the Deluxe private 10 gig dark fibre ‘pipeline’ review system that linked Burbank (Stereo D), Hollywood (Efilm) and the production offices of Avengers in Santa Monica, to complex shot management, and dovetailing in with the final detailed DI grading and mastering at Efilm. Stereo D only joined Deluxe in May 2011, so this was an evolution that started with Captain America and really expanded on Avengers, as the company managed not only a stereo conversion but also the coordination of 2200 effects shots from a dozen different vfx houses in a variety of stereo ways.
William Sherak, president of Stereo D, explained that the company, “partnered with our sister Deluxe companies, was able to employ an utterly unique end-to-end solution on The Avengers. Using Deluxe’s private and safe 10 gigabyte pipeline, our workflow enabled Marvel and Joss and Seamus to see every stage of post-production and conversion right through delivery in real time from one location. Being part of Deluxe, and having the best team of artists and stereographers in the business, along with our proprietary software, give Stereo D a singular advantage to filmmakers and studios who want to make the most of every minute in their post timeline.”
Stereo D’s Chief Creative Officer Aaron Parry characterized the style of stereo for Avengers and Marvel films in general as being “focused on giving the (Marvel) characters as much breadth and depth as possible, whereas other films may be more evenly distributed, dealing with all aspects… they (Marvel) really want their characters to sing”.
Stereo D worked with the various effects vendors in one of four primary ways:
A • Full conversion from mono to L and R eyes (also some left from right and right from left).
B • Stereo renders by VFX houses that were received as full L and R pairs, to be just adjusted by Stereo D as part of the edit and the stereo script.
C • Partial stereo done as pre-conversion. Here Stereo D would convert a shot to stereo from the mono camera ‘negative’ and then give that L & R set of clips to an effects house for stereo effects to be added. This is for example how the head-up displays were done for Iron Man. The Robert Downey Jr. shot was converted to stereo and then passed over to Cantina Creative for the stereo HUD to be added.
D • Stereo or mono elements (that were converted) were added back into a shot. Here, Stereo D was tasked with matching in stereo the work done by another facility in mono. Clearly, if a CG object was being added to a shot, it was possible for the facility to render and finish the mono version, but then also supply the CG element re-rendered stereoscopically. Stereo D then needed to match the work in both shots, one finished by a vendor and one finished by them.
Parry is keen to point out what a great job DOP McGarvey did with the original cinematography; the composition and camera work set up the conversion at Stereo D to really bring the film to life in 3D.
The project was extremely complex, and Stereo D had 27 weeks in total to convert the film, which is not a rushed schedule. Stereo D could not wait until all the shots were completed to start on them, so for some shots, once camera work and animation were locked down, the Stereo D team would begin roto and conversion. If the animation was not locked down then there was not much point, since the roto is so specific to the characters and their faces. But if the animation was locked and just the lighting was still being adjusted, the team could begin and then drop in final shots over the top of the temp work, thus dramatically speeding up the pipeline and moving huge chunks of work earlier in the schedule. In all, 57% of the film’s shots had this temporary pre-build approach.
Each week Deluxe/Stereo D would do a full conform of all 16 reels of the film, which allowed a constant work in progress (WIP) version of the film at all times. Coupled with the Deluxe Express system, this was key in maintaining schedules.
The Head of Stereography on Avengers at Stereo D was Graham D. Clark. He headed a team that is experienced in the creative desires of the Marvel studio, but the pipeline is still creatively rebuilt for each film to allow for the director’s expression of how they feel their film should look in stereo. While Stereo D and Clark have strong opinions on all the technical aspects of the film, Whedon’s role is central to the look of the stereo version of the film.
Final color timing took place at EFILM in Los Angeles, with colorist Steve Scott, one of Hollywood’s most respected colorists. Scott handled the mono grade and the special stereo grade, which allows for light loss due to glasses and other technical aspects. Scott led with the mono grade, but the two grades progressed in parallel: ‘it was not a linear process’. Scott could grade in Hollywood, and people could watch the grade in a virtualized version of his grading suite in a balanced full stereo review theater across town at Lantana Complex (ToddAO stages and production offices) in Santa Monica.
The film had the largest opening box office of any converted film in history.