Animating and Rendering Large Data Sets
Photo of eight-foot-tall prints, with detail.
We were recently given the task of producing a series of animations and high resolution still images based on very dense 3D scan data from an archaeological dig site. The site included four graves and their remains along with an artifact. The data from the site came to us in several pieces of varying and overlapping resolutions and source image quality. Surface textures were based on photogrammetry while the geometry was produced using multiple forms of 3D scanning.
The biggest challenge of this project was efficiently managing the large amount of data involved. By the time the project was done, we had mostly filled a dedicated terabyte drive with its resources. The geometry consisted of eight data sets, most of them containing more than ten million polygons each. The majority of texture maps were sixteen thousand pixels square. Even with a very fast local network, plenty of solid-state drive space, 64 GB of RAM, and dual professional video cards in our graphics workstations, careful scene management was necessary.
The source data consisted of a digital elevation map and aerial photography of the surrounding countryside; a point-cloud scan covering a few acres; 3D scans converted to polygonal geometry for the dig site, along with separate, higher-resolution scans of the interiors of each grave; and scans of a reliquary found in one of the graves, both before and after restoration. Each piece of geometry had an associated high-resolution surface texture generated through photogrammetry, which was then manually cleaned and optimized. The fully assembled scene totaled over nine gigabytes of geometry and texture data.
This may seem excessive, but we were dealing with scientifically accurate and historically significant data. The client wanted the resulting imagery to be as detailed and accurate as possible. The animations would go from aerial photography a thousand feet up all the way down to within inches of human remains and artifacts. Still renders of entire individual graves would be printed life-sized and would be scrutinized from inches away. We wanted to keep as much detail as possible, showing grains of sand, twigs, and tiny cracks, while also conveying an understanding of the layout of the site as a whole.
We used Autodesk Maya as our primary animation tool for this job, along with Cinema 4D for the aerial photography element and DEM manipulation. Rendering was done with a mix of Mental Ray and Chaos Group’s V-Ray on our render farm. The farm is a hybrid of in-house OS X machines and a dynamically scaling cloud-based system that runs a combination of Linux and Windows, depending on the required tool. The cloud farm is very small when idle, but can quickly scale up on demand. Our animation team is fortunate to have another team in our company that just happens to specialize in custom cloud computing solutions, networking, and vast data storage and manipulation. They made this setup possible. Thanks guys!
One of the problems we had not anticipated was how quickly even our fast network would become saturated when many render machines came online and simultaneously requested such large amounts of data. This seriously affected render efficiency, with the bulk of the farm figuratively twiddling its thumbs, waiting for the data it needed to get started. The dynamic scaling of our cloud farm is based on how busy the processors of the machines are over a given time period. This led to a yo-yo effect: machines would be added to the farm as the base machines got busy, then pruned out of the farm because they weren’t actually doing anything while they waited for all the data to come in, then added back when the base machines showed they needed help, and so on.
One of the things we did to improve this situation was build an auto synchronizing mirror of our local render server in the same cloud space where the render machines reside. This did introduce a bit of a delay as local data synced to the cloud render server before a render could begin, but it eliminated a significant part of the bottleneck. As segments of a render are completed they are written first to the cloud server, then mirrored back to our local network. Our render farm management tool includes options to throttle the number of machines that can request data at once, but that reduces the speed and efficiency to some extent. A new, higher speed cloud storage system is in testing now, and should completely eliminate the problem once it is more widely available.
Highest vs. lowest resolution geometry.
On the actual animation and render side of things, we built multiple resolution sets of data, from full resolution down to about 1/10 resolution, along with intermediate versions as necessary. The scene was built with simple reference objects that could be easily swapped in and out at different resolutions as required. This made the interactive process of lighting and animation go smoothly, using only low resolution geometry and textures for initial setup. For the most part, the highest resolutions were not necessary for the animation sequences, except when the camera got very close to the surfaces. This also made it possible to send jobs to the render farm in smaller parts, reducing the amount of data sent back and forth and increasing efficiency. For animation, it’s a fairly straightforward process to send sets of individual frames to each machine on the farm for rendering. For the very large still images, each image was broken into 256 tiles and those tiles were individually sent to render on different machines on the farm. After all the tiles were rendered, a separate process assembled them into the final composite image.
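The reassembly step is simple enough to sketch. The following Go program is a minimal illustration, not our production pipeline; the 16x16 grid and the tile_ROW_COL.png naming are assumptions made for the example. It stitches 256 rendered tiles back into one composite image.

package main

import (
	"fmt"
	"image"
	"image/draw"
	"image/png"
	"os"
)

func main() {
	const grid = 16 // 16 x 16 = 256 tiles (hypothetical layout)
	var canvas *image.RGBA

	for row := 0; row < grid; row++ {
		for col := 0; col < grid; col++ {
			f, err := os.Open(fmt.Sprintf("tiles/tile_%02d_%02d.png", row, col))
			if err != nil {
				panic(err)
			}
			tile, err := png.Decode(f)
			f.Close()
			if err != nil {
				panic(err)
			}
			b := tile.Bounds()
			if canvas == nil {
				// Allocate the full-resolution canvas once the tile size is known.
				canvas = image.NewRGBA(image.Rect(0, 0, b.Dx()*grid, b.Dy()*grid))
			}
			// Paste the tile at its grid position.
			dst := image.Rect(col*b.Dx(), row*b.Dy(), (col+1)*b.Dx(), (row+1)*b.Dy())
			draw.Draw(canvas, dst, tile, b.Min, draw.Src)
		}
	}

	out, err := os.Create("composite.png")
	if err != nil {
		panic(err)
	}
	defer out.Close()
	if err := png.Encode(out, canvas); err != nil {
		panic(err)
	}
}

In production the assembly also has to preserve bit depth and color space, which is why the real composite was done in a dedicated imaging tool rather than a one-off script like this.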
The extremely large size of some of the print renders pushed the limits of commercial digital image creation and editing tools. Usually when printing a wall-sized image it is intended to be viewed from at least several feet away, so the amount of information per square inch can be relatively low. In this case, however, we wanted viewers to be able to walk right up to the prints, look closely from inches away, and see all the details as if they were looking at the actual object. Standard file formats like Photoshop’s .psd and the tried and true .tif were not capable of holding all the information in these images. Fortunately Adobe has a less common format, the .psb or Large Document Format, that can hold many times as much information as a standard Photoshop document. In addition to Adobe’s format, we used the open source OpenEXR file type for animation output. This is a format designed for digital effects production by Industrial Light and Magic and updated in 2013 in conjunction with WETA Digital. It can handle multilayered, high dynamic range images, with or without compression, at practically limitless resolution, along with lots of other useful image information. There are plugins available that allow most professional editing tools to read and write OpenEXR.
Lastly, there’s the physical side of all of this. In our current, relatively high-speed internet connected world, we have gotten used to sending information nearly instantly with the click of a mouse, even for large images, videos and animations. That was not possible for this project. Test images and sample renders could be sent for review electronically, but for final delivery the only practical solution was to physically hand deliver a terabyte hard drive.
Ultimately, the client was delighted with the final imagery and animation. Their research announcement got wide distribution and high visibility around the world through many media outlets.
The client attributed at least some of that success to having compelling imagery to go along with the story. This project pushed our technical abilities to new levels, with very satisfying results.
We’ve been coming up to speed with Docker, planning to use it for deployments on AWS and GCE.
I’ve tried it before and gotten a bit frustrated with the disconnect between my daily driver (a MacBook laptop) and the docker server; the cool kids running Linux laptops have no such issues. While boot2docker is of course a huge help, I had problems. It wasn’t running the way the docs said it would, and it asked for a password when I brought it up; something was seriously hosed.
Some of these problems turned out to stem from ancient installations of docker, so I recently used brew to remove them and reinstall current versions. It took me a while to realize that my VM was still running an old ~/.boot2docker/boot2docker.iso, so I removed that too and did the brew reinstall again. Much better.
Background: boot2docker for OS X
Folks running Linux run the docker server natively, but it doesn’t run natively on OS X, so boot2docker was created. It runs a small VM inside VirtualBox that acts as the docker server. On Linux the ‘docker’ command talks to the server over a local UNIX socket; with boot2docker, we reach the VM over TCP instead. This extra distance is what complicates things for OS X users, and docker-1.3 makes much of it more transparent.
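After boot2docker up, it tells you how to point the docker client at the VM. On my machine the settings look something like the following; your IP, port, and cert path will almost certainly differ:

export DOCKER_HOST=tcp://192.168.59.103:2376
export DOCKER_CERT_PATH=/Users/me/.boot2docker/certs/boot2docker-vm
export DOCKER_TLS_VERIFY=1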
That HOST address and port will change if you restart your boot2docker.
So let’s get into the big win caricatured in the release’s graphic.
In the sections below, I’m creating and then running a container, “webvol2”, from the standard “nginx” image on DockerHub. I want to mount a section of my local filesystem in the container so I can easily update the HTML content nginx serves. Finally, I want a way to get into the container and look around to verify the volume is mounted as expected.
Mount local OS X volumes in the container
I’ve been feeling like a second-class citizen, compared with my Linux brethren: they could mount local filesystems in their containers. This made it super-easy to — for example — develop web content locally and test it served by a docker-resident application, without resorting to building new images with ADD or COPY in Dockerfiles.
There’s a great discussion on GitHub about how best to accommodate this on OS X, and happily, it was resolved on October 16 with the docker-1.3 release. This is huge: I no longer covet my neighbor’s laptop. Check it out, “it just works”:
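Here’s the run command. The local path is just an example of a directory under my home; /usr/share/nginx/html is where the stock nginx image keeps its content:

docker run -d --name webvol2 -v ~/docker/nginx/html:/usr/share/nginx/html nginx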
f985f7dc574ce8228c96c64dac769f6123411849330748f3dd2dce4d7daf9ef3
The above mounts a docker-related directory under my home as a volume on the container. In this case it shadows the content directory originally installed by nginx; exactly what I want.
Get a shell in the container
Lots of folks want visibility into their running containers, but until now you had to arrange it manually. Some include an ssh server in their images, but that bloats the image and may pose a security risk. Chris used ‘nsenter’ and a neat shell script to get access. That’s no longer necessary; with the new exec subcommand in docker-1.3, it’s trivial:
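docker exec -it webvol2 /bin/bash

That drops me into a bash shell inside the running container, where I can look around and confirm the mounted volume holds my local content.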
MPEG-4 Part 14, or MP4, is a digital multimedia format that acts as a wrapper for video and audio streams. One huge benefit of MP4 is that the format allows different video codecs, such as H.264, which provide better compression, and therefore smaller file sizes, while still delivering high-quality video and audio. Smaller file sizes in turn mean better results when streaming content over the Internet.
MP4 – it does the trick for all occasions.
Aside from file size, why use .mp4 as the wrapper of choice for video files on the web? The answer is simple. MP4 files do not require proprietary software to be played by an end user. Video files that use the MP4 wrapper can be played cross-platform and can be viewed using any number of popular video players. Another benefit of MP4 files is their ability to play on mobile devices without relying on proprietary video players. Additionally, mainstream media players such as Windows Media Player and QuickTime can play the files natively, without any plugin downloads.
For web designers, using MP4 makes it possible to add an HTML5 player to any website with simple code. Some may ask: what about HTML4? The answer is the future. Again we point to modern browsers, which support the progressive development of HTML5 while still providing support for older standards. By using the <video> tag, a basic player can be added to a page without the old standards’ object ids, and the MP4 file plays natively in modern browsers. Below is an example of the code needed to add video to a web page.
<video width="400" controls>
  <source src="your_video_file.mp4" type="video/mp4">
  Your browser does not support HTML5 video.
</video>
So, why use MP4? Again, the answer is simple. The file format allows for greater viewer access across PC platforms, modern browsers (Internet Explorer 9+, Firefox, Opera, Chrome, and Safari) and mobile devices while making it easier for designers to complete sophisticated web sites and interfaces.
V! Studios received an award from the NASA Extra Vehicular Activity (EVA) Office for creating the EVA Drawing Repository Dashboard (EDRD) Demo. The Office of the CIO commented, “You guys deliver!” The Demo is a Proof of Concept to enable real-time access to EVA data during missions. NASA pursued the project as a result of an incident during a space walk in 2013.
The award reads:
“In recognition of successfully completing the Proof of Concept demonstrating easy access to EVA data. The system will enhance real-time mission decision making which will ensure astronaut safety during EVA’s.”
The patch affixed to the award represents the EVA Team and was flown aboard the Space Shuttle Atlantis on its final servicing mission to the Hubble Space Telescope during STS-125, May 11-24, 2009.
We have been wading into Go on some projects recently. In fact, we have been using it on small and throwaway projects for a while. We first used Go in anger to manage transferring and updating ~500,000 unique files (~1 TB total) from an EBS volume to S3.
It was my first code in Go, so it isn’t pretty. I am also not sure how likely it is to work now, as it used a fork of goamz. That fork was absorbed into the central (hah) fork of goamz, but YMMV. The takeaway is that Go made dealing with a massive number of files during a large-scale migration practicable, and I would definitely choose it again.
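The original script is goamz-specific and has bit-rotted, but the general shape is worth showing. Here is a minimal sketch of the kind of worker-pool approach Go makes easy; the mount point is hypothetical and the actual S3 upload call (via goamz) is stubbed out.

package main

import (
	"log"
	"os"
	"path/filepath"
	"sync"
)

// upload stands in for the real S3 call (e.g. a goamz bucket Put), omitted here.
func upload(path string) error {
	log.Printf("uploading %s", path)
	return nil
}

func main() {
	paths := make(chan string, 1024)
	var wg sync.WaitGroup

	// Start a fixed number of upload workers.
	for i := 0; i < 32; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range paths {
				if err := upload(p); err != nil {
					log.Printf("failed %s: %v", p, err)
				}
			}
		}()
	}

	// Walk the EBS mount point (hypothetical path) and queue every file.
	root := "/mnt/ebs"
	err := filepath.Walk(root, func(p string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		if !info.IsDir() {
			paths <- p
		}
		return nil
	})
	close(paths)
	wg.Wait()
	if err != nil {
		log.Fatal(err)
	}
}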
NB: A first-class AWS SDK for Go would be awesome. This is definitely the missing tooth in a smile.
During that same project we migrated a large number of vanity URL redirects. As part of the move there was a rule: if a redirect hadn’t been reviewed in more than two years, get rid of it. We had no way to know when rules had last been reviewed; they were stored as Apache rules across a dozen servers. So the order was given that redirects had to have a “last reviewed” date. We used Go again to build an in-memory redirect server with custom “Last reviewed” headers.
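To give a feel for how little code that takes, here is a minimal sketch of such a server. The rule table and header value below are made up; the real server loaded its rules (and review dates) from the converted Apache configs.

package main

import (
	"log"
	"net/http"
)

// Hypothetical redirect table keyed by request path.
var redirects = map[string]string{
	"/old-page":  "https://example.com/new-page",
	"/campaign1": "https://example.com/landing",
}

func handler(w http.ResponseWriter, r *http.Request) {
	target, ok := redirects[r.URL.Path]
	if !ok {
		http.NotFound(w, r)
		return
	}
	// Custom header so we can audit when a rule was last reviewed.
	w.Header().Set("X-Last-Reviewed", "2014-10-01")
	http.Redirect(w, r, target, http.StatusMovedPermanently)
}

func main() {
	http.HandleFunc("/", handler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}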
Most recently, we have been using Go to write the backend API for an app powered by AngularJS on the client side. This is our first Go project that leverages GAE and is expected to have sufficient complexity and lifespan to warrant first-class testing. The rest of this post discusses the warts we’ve seen up close and how we have worked around them.
Testing
Testing with Go is a pleasant experience. Go’s standard library ships with a testing package that should feel familiar to most programmers. It is admittedly missing some convenience items like assertions, but that does not have much impact. Many coming from dynamic languages might consider that omission ugly; however, it is easy enough to write a few of your own, as in the sketch below.
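For example, a homemade assertion helper can be just a few lines. This is a generic sketch, not code from our project:

package mypkg

import "testing"

// assertEqual is a tiny homemade assertion; the testing package leaves this to you.
func assertEqual(t *testing.T, got, want interface{}) {
	if got != want {
		t.Errorf("got %v, want %v", got, want)
	}
}

func TestAddition(t *testing.T) {
	assertEqual(t, 2+2, 4)
}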