Pick Our Brain

Open Source Contributions + Knowledge Sharing = Better World

  • V! Studios Wins 2018 Communicator Award of Distinction for Online Video

    V! Studios has been honored with a Communicator Award for its online video series of NASA ScienceCasts. The ScienceCasts series explores space phenomena and explains scientific discoveries to the general public.
    The Communicator Awards are judged and overseen by the Academy of Interactive and Visual Arts (AIVA), a 600+ member organization of leading professionals from various disciplines of the visual arts dedicated to embracing progress and the evolving nature of traditional and interactive media.
    “We’re honored by another Communicator Award for our continuing creativity and efforts for our customer,” says Mike Brody, Creative Services Manager and ScienceCasts Producer for V! Studios. “Although our studio has produced over 250 episodes of ScienceCasts over the past 7 years, receiving this Communicator Award affirms that our team constantly creates engaging content. As NASA’s expectations continue to rise, we enjoy meeting this challenge with shows that are both inspiring and effective.”
    With over 6,000 entries received from across the US and around the world, the Communicator Awards is one of the largest and most competitive awards programs honoring creative excellence for communications professionals. “We are extremely proud to recognize the work received for the 24th Annual Communicator Awards. This year’s class of entries embodies the ever-evolving marketing and communications industry,” noted Derek Howard, director of the Academy of Interactive and Visual Arts.
    The Communicator Awards is the leading international awards program honoring creative excellence for marketing and communications professionals. Founded by passionate communications professionals over two decades ago, The Communicator Awards is an annual competition honoring the best digital, mobile, audio, video, and social content the industry has to offer. The Communicator Awards is widely recognized as one of the largest awards of its kind in the world.
    Headquartered in Tysons Corner, VA, V! Studios is a unique hybrid company, successfully combining left brain and right brain skills to weave technology, information, and the arts into innovative and effective products and services. Learn more about V! Studios services at: V-Studios.com
  • V! STUDIOS WINS 2018 NASA T&I LABS INNOVATION CHALLENGE

    V! Studios, a creative technologies company, proposed for NASA to use artificial intelligence (AI) to transcribe vast amounts of digitally-scanned engineering drawings. NASA Technology & Innovation (T&I) Labs, an innovation program of the Office of the Chief Information Officer (OCIO) within the Technology and Innovation Division, selected V! Studios’ proof of concept as one of 10 projects for inclusion in its innovation challenge.
    The T&I Labs Project Challenge solicits ideas from the greater NASA community and enables research for winning proposals as part of a rapid, low-cost, low-risk process. This year’s T&I Project Challenge Judge Panel of experts reviewed submissions and selected only ten winning proposals.
    The goal of V! Studios’ proof of concept is to demonstrate that existing NASA systems will benefit from using AI to improve project administration. Given the variety of engineering drawing layouts and the decades of custom formats used by multiple NASA contractors, AI will reduce manual cycles as well as personnel hours.
    V! Studios CTO, Chris Shenton, enthuses, “When we get an artificial intelligence or machine learning solution to be able to look at an engineering drawing, zero in on the parts list, transcribe the information into structured data before inserting it into the legacy search system, then the entire system becomes more comprehensive and smarter”.
    Headquartered in Tysons Corner, VA, V! Studios is a unique hybrid company, successfully combining left brain and right brain skills to weave technology, information, and the arts into innovative and effective products and services. Learn more about V! Studios services at: V-Studios.com
  • Serverless Browser Uploads to S3


    We’ve been diggin’ the serverless ecosystem around AWS Lambda, API Gateway and the rest, especially using the Serverless Framework which makes it easy to define interesting applications.

    We frequently need to upload files from a user’s desktop to AWS S3 storage, and prefer to have the browser upload directly rather than passing the file through some API handler. This lets us deploy fewer API instances, which would otherwise be bogged down just acting as a pass-through between the browser and the S3 bucket. It becomes even more important when using Lambdas, which have limited disk space and time limits.

    Happily it’s pretty easy to do using pre-signed S3 URLs. These are URLs that limit the time allowed for an operation (e.g., download, upload) and encode temporary credential information, so that the requester — a user’s browser — does not need the actual credentials required to access the bucket.

    We’re building a “Scaffolding” for Serverless apps that includes stuff like docs, tests, CI/CD, and deployment to multiple AWS environments upon commits. For the sample CRUD app, I’m using an Angular front-end and a Python back-end, so the code here will be in those languages. I’ve seen a number of folks looking for examples of how to do this pre-signed upload, and it took me a while to suss out how to convince Angular to do it, so I thought I’d share what I learned.

    In uploads/uploads.component.html we have a basic file picker:

    <div>
        <label>Upload Filename:
            <input type="file" #file (change)="onChange(file.files[0])" />
        </label>
        <button (click)="doUpload(file.files[0])">
            upload
        </button>
    </div>

    which renders as a simple file picker next to an “upload” button.

    In uploads/uploads.component.ts, we define the invoked function:

    import { UploadService } from '../upload.service';


      doUpload(file0): void {
        // Read file, get presigned URL from API then PUT file to that S3 URL
        var reader: FileReader = new FileReader();

        reader.onloadend = (e) => {   // arrow function doesn't change `this`
          var fileBody = reader.result;

          this.uploadService.getUploadURL(file0)
            .subscribe(urlObj => {
              this.uploadService.putUploadFile(urlObj, file0, fileBody)
                .subscribe(res => this.log(`doUpload done res=${res}`));
            });
        }
        reader.readAsArrayBuffer(file0); // asBinaryString mutilates binaries!
      }

    It took me a while to realize that I needed to handle `onloadend` here, and invoke my services after the file was finished reading, rather than doing the read in the service.

    After the file read completes, it then asks a service to calculate a pre-signed URL for a “PUT” operation; this is in upload.service.ts:

      getUploadURL(file: File): Observable<UploadURL> {
        const apiURL = `${this.baseURL}/upload_url?filename=${file.name}`;
        const headers = { 'Content-Type': file.type };
        const options = { 'headers': headers };

        // API returns: {'url': url [, 'warning': 'if no content-type specified'] }
        return this.http.get<UploadURL>(apiURL, options).pipe(
          tap(uploadURL => this.log(`got upload=${uploadURL.url}`)),
          catchError(this.handleError<UploadURL>('getUploadURL'))
        );
      }

    Then another service implements an HTTP PUT operation:

      putUploadFile(uploadURL: UploadURL, file: File, fileBody): Observable<any> {
        const headers = { 'Content-Type': file.type };
        const options = { 'headers': headers };

        return this.http.put(uploadURL.url, fileBody, options).pipe(
          tap(res => this.log(`putUploadFile got res=${res}`)),
          catchError(this.handleError<UploadURL>('putUploadFile'))
        );
      }

    I discovered the hard way that I had to set the Content-Type; otherwise http.put() assigned one of its own choosing, and that broke the cryptographic signature calculated for the pre-signed URL. So make sure you specify this both in the calculation of the URL and in the HTTP operation. Happily, it’s easy to get that from the HTML File object.

    In my serverless.yml file, I specify a Lambda with an API Gateway endpoint:

      getUploadURL:
        handler: handler.get_upload_url
        events:
          - http:
              method: get
              path: /upload_url
              cors: true

    The Python Lambda function in handler.py calculates the URL; there’s more comment than code:

    def get_upload_url(event, _context):
        """Return a presigned URL to PUT a file to in our S3 bucket, with read access.

        We get the Content-Type from HTTP headers.
        """
        content_type = event['headers'].get('content-type')  # APIG downcases this
        filename = event['queryStringParameters'].get('filename')
        if not filename:
            return {'statusCode': 400,
                    'body': 'Must supply query string "filename=..."'}
        # We need to spec content-type since NG sets this header
        # ContentType is proper boto3 spelling, no dash; value must be lowercase.
        # ACL=public-read so NG can read it and display directly, without API
        params = {'Bucket': UPLOAD_BUCKET_NAME, 'Key': filename, 'ACL': 'public-read'}
        if content_type:
            params['ContentType'] = content_type
        url = s3.generate_presigned_url('put_object', Params=params, ExpiresIn=3600)
        body_json = {'url': url}
        if not content_type:
            body_json['warning'] = 'No Content-Type given; AWS signature calculation may require it'
        return {'statusCode': 200,
                'headers': {'Access-Control-Allow-Origin': '*'},
                'body': dumps(body_json)}

    And that’s it. Pretty straightforward, quick, and easy to run.

    You can use the same technique to offer, say, limited-time downloads; just change the method from PUT to GET.
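
    For example, a download counterpart to the get_upload_url Lambda above might look roughly like this. This is a sketch: the handler name is made up, and it leans on the same s3 client, dumps import, and UPLOAD_BUCKET_NAME as the upload handler.

    def get_download_url(event, _context):
        """Return a presigned URL to GET a file from our S3 bucket (illustrative sketch)."""
        filename = event['queryStringParameters'].get('filename')
        if not filename:
            return {'statusCode': 400,
                    'body': 'Must supply query string "filename=..."'}
        # Same call as the upload case, but for 'get_object'; the link expires after an hour.
        url = s3.generate_presigned_url('get_object',
                                        Params={'Bucket': UPLOAD_BUCKET_NAME, 'Key': filename},
                                        ExpiresIn=3600)
        return {'statusCode': 200,
                'headers': {'Access-Control-Allow-Origin': '*'},
                'body': dumps({'url': url})}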

    One word of warning: standard S3 uploads are limited to 5GB; this caused us a bit of surprise when we were building the video portion of images.nasa.gov — our large video files (frequently over 25GB) cannot be uploaded as a single part. For that, you’ll need to use S3 multipart uploads. You can use Lambdas to calculate the checksums required for each part, but you’ll have to do a bunch more work on the front-end to calculate all the proper headers; it’s probably best to use the AWS SDK in the browser rather than rolling your own, as it’s hairy.
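
    If you do roll your own, the server side is actually the easy part. Here is a rough sketch of the boto3 calls involved; the function and variable names are illustrative, not from our code:

    import boto3

    s3 = boto3.client('s3')

    def presign_multipart(bucket, key, num_parts):
        """Create a multipart upload and presign a URL for each part (sketch only)."""
        upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)['UploadId']
        urls = [s3.generate_presigned_url('upload_part',
                                          Params={'Bucket': bucket, 'Key': key,
                                                  'UploadId': upload_id, 'PartNumber': n},
                                          ExpiresIn=3600)
                for n in range(1, num_parts + 1)]
        return upload_id, urls

    # After the browser PUTs each part and collects the returned ETags:
    # s3.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id,
    #                              MultipartUpload={'Parts': [{'PartNumber': n, 'ETag': etag}
    #                                                         for n, etag in etags]})

    The hairy part is the browser side (slicing the file, issuing the part PUTs, tracking the ETags and headers), which is why the SDK is usually the better bet.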

  • NASA ScienceCasts: Cosmic Bow Shocks


    Check out the latest effort from our Animation Team for NASA’s Science Mission Directorate.

  • Serverless Optical Character Recognition in Support of NASA Astronaut Safety


    Chris Shenton, CTO, V! Studios


    The NASA Extravehicular Activity (EVA) Office at NASA Johnson Space Center (JSC) needed to be able to search and make decisions based upon a huge volume of spacesuit safety and test documentation, much of which was only available as scans of paper reports. Timeliness was critical, especially in the event of a spacewalk mishap. While the EVA Data Integration (EDI) infrastructure was hosted in AWS GovCloud (US), the load to perform optical character recognition (OCR) on 100,000 pages per month overwhelmed their systems, so they had to suspend their OCR activity.

    Our company, V! Studios, had been using AWS Lambda to prototype an OCR product, and we happened to run into our JSC EVA contact at the 2017 AWS Public Sector Summit. After giving him a brief demo of our prototype, he asked us to develop a solution to the EVA OCR integration bottleneck. There was one catch: it had to be completed in about two months, before the end of the fiscal year.
    We knew we could develop a conventional cloud autoscaling solution based on Elastic Load Balancing, Amazon Elastic Compute Cloud (Amazon EC2), and Amazon Simple Queue Service (SQS), but the complexity of building and tuning that infrastructure would eat into our application development time. Instead, we leveraged the event-driven, automatic scaling features of AWS Lambda, driven by S3 events, to deliver a solution that was elegant, fast, scalable, and cost-effective. We used IAM Roles and Policies and a separate VPC to integrate it securely with NASA’s existing system.
    The EDI system is configured with an IAM Role containing a policy that lets it drop PDF scans into a specific S3 prefix, or “upload folder,” within our S3 bucket. This first event starts the OCR process by triggering a Lambda function which splits the document, which could be around 500 pages, into individual pages, and writes each page back to S3 under a second S3 prefix. These events trigger a parallel swarm of Lambda functions, which perform OCR on the individual pages and drop the text into separate files under another S3 prefix. Each of those events triggers a Lambda function that checks whether all pages are done; when they are, it assembles a JSON document associating each page of text with its page number and parent document ID and places it in the bucket’s final prefix. This invokes the final Lambda function, which feeds each page sequentially to EDI’s Elasticsearch API. Using S3 events to trigger Lambda functions allowed us to decompose the functionality into small, well-understood pieces with loose coupling, rather than a tightly coupled monolith.
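    A stripped-down sketch of that first stage, just to make the event wiring concrete (the prefixes and the split_pdf_pages() helper are illustrative placeholders, not our production code):

    import boto3

    s3 = boto3.client('s3')

    def split_document(event, _context):
        """Triggered by an S3 ObjectCreated event on the upload prefix (sketch only)."""
        for record in event['Records']:
            bucket = record['s3']['bucket']['name']
            key = record['s3']['object']['key']                  # e.g. upload/some-document.pdf
            doc_id = key.rsplit('/', 1)[-1]
            body = s3.get_object(Bucket=bucket, Key=key)['Body'].read()
            for page_num, page_bytes in split_pdf_pages(body):   # hypothetical PDF-splitting helper
                # Each page written under the pages/ prefix fires the next round of
                # Lambdas: the per-page OCR workers, one per object created.
                s3.put_object(Bucket=bucket,
                              Key='pages/{}/{:05d}'.format(doc_id, page_num),
                              Body=page_bytes,
                              Metadata={'parent-doc': doc_id, 'page': str(page_num)})
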
    In order to minimize operational costs, V! Studios eschewed the use of a database – even Amazon DynamoDB. Instead, we used S3 prefixes to track progress, and propagated the original document’s name, unique ID, and other information with each processed file using S3’s user-defined metadata fields. We used an S3 Lifecycle Configuration that removes content after 24 hours to reduce costs further and strengthen our security posture. All data is encrypted in transit as well as at rest in S3. We deployed the Lambda functions into a VPC separate from the EDI services in order to provide isolation and give the security team the ability to verify that no data is “leaked” from Lambda. It also provided sufficient address space to scale out a fleet of Lambda workers that could easily launch a thousand functions for a single document. All of this was deployed in the AWS GovCloud (US) Region due to the sensitivity of the information.
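    As a rough boto3 sketch of those two pieces (the bucket, rule, and key names are illustrative): the lifecycle rule is set once on the working bucket, and downstream functions read the propagated metadata off each object.

    import boto3

    s3 = boto3.client('s3')

    # Expire objects roughly a day after they land; S3 lifecycle rules work in
    # whole days, so Days=1 is the closest expression of "24 hours".
    s3.put_bucket_lifecycle_configuration(
        Bucket='ocr-work-bucket',
        LifecycleConfiguration={'Rules': [{
            'ID': 'expire-after-a-day',
            'Status': 'Enabled',
            'Filter': {'Prefix': ''},
            'Expiration': {'Days': 1},
        }]})

    # A downstream Lambda recovers provenance from the user-defined metadata
    # written when the page object was created.
    meta = s3.head_object(Bucket='ocr-work-bucket', Key='text/some-doc/00001')['Metadata']
    parent_doc, page_num = meta['parent-doc'], meta['page']
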
    The Lambda service allowed us to focus on our problem, deliver on time and within budget, and deploy NASA’s first serverless operation running in AWS GovCloud (US). Operational costs are expected to be a small fraction of what would be incurred with conventional cloud architectures. Cuong Q Nguyen of the JSC/NASA EVA Office had this to say: “The work you’ve accomplished is a big step proving out this new technology for NASA.” We hope to leverage Lambda for other NASA, federal, and commercial solutions – it’s a game-changing technology.

    *This post originally appeared on the Amazon Web Services Public Sector blog.

  • ScienceCasts: An Out of This World Research Lab


    The International Space Station is more than just a bright light in the night sky. It is also an out-of-this-world research laboratory.

    ISS: www.nasa.gov/station

  • Automatic Swarm of Ephemeral Servers Extract Text to Improve NASA Astronaut Safety


    V! Studios develops a high-volume solution for extracting text from scanned documents faster and more cheaply for NASA’s Johnson Space Center by utilizing Amazon Web Services (AWS) Lambda serverless technology.

    V! Studios, a creative technologies company, has developed a new high-speed, low-cost, cloud-based solution for optical character recognition (OCR) and text extraction. Standard OCR software extracts text as a page-by-page serial task, while the V! Studios solution dedicates an ephemeral server to each page. Using hundreds of ephemeral servers operating in parallel, the system completes text extraction for a document of hundreds of pages in minutes.
    V! Studios developed its text extraction engine “SOCRATES” (Serverless Optical Character Recognition And Text Extraction System) to assist businesses and enterprises to quickly identify and discover information within document repositories. The product provides a quick and cost-effective service, enabling text extraction in a cloud workflow while maintaining data security. Upon learning of V! Studios’ efforts, NASA realized the potential of the technology to solve an existing problem.
    The Extravehicular Activity (EVA) group at NASA’s Johnson Space Center faced the challenge of converting a massive amount of spacesuit safety and test documentation into an easily searchable database. Although their existing OCR procedure was cloud-based, the serial text extraction process significantly slowed the workflow.
    “When I learned V! Studios was developing a serverless OCR solution, I immediately saw the potential benefit for a current problem in our cloud-based data integration application. By decoupling the OCR text extraction function using Lambda’s serverless framework within GovCloud, we can now extract text from documents in a fraction of the time and cost.” – Cuong Nguyen, NASA EVA Project Manager
    SOCRATES is an automatic, serverless compute system that scales instantly to process a thousand-page document in minutes while avoiding many of the requirements of a conventional cloud computing environment. With V! Studios’ solution, NASA’s text extraction tasks get the exact amount of compute power at exactly the time needed, eliminating the cost of idle time.
    V! Studios CTO, Chris Shenton, enthuses, “The cool thing about SOCRATES is that it’s event-driven and it instantly handles all the scale-out for you; you pay only for active use, so the cost-savings are huge – it’s a game-changer.”
    V! Studios looks to deploy SOCRATES as a SaaS (Software as a Service), with availability coming to the AWS Marketplace in the first quarter of 2018 in all U.S. regions including GovCloud. The API will provide quick and cheap text extraction service for any organization to integrate into their cloud workflow.
  • Accessing NASA Digital Media Now Easier With New Library Search Engine

    By leveraging cloud computing and storage capabilities, V! Studios developed a highly scalable, cost efficient multimedia library for NASA at images.nasa.gov, which officially launched to the public on March 28, 2017.
    NASA, like many other government agencies, is challenged with compiling, archiving, and retrieving massive amounts of digitized content. Prior to this agency-wide media solution, each NASA center was responsible for managing its own multimedia repository. Historically, this meant that users would often have to search multiple repositories to find a particular media asset.
    V! Studios built an Application Program Interface (API) backend that permits curators at each NASA center to upload media assets in bulk, index existing metadata, and preserve the original files. The backend automatically transcodes and resizes all files into popular standards including those for mobile devices. At any time, a curator can add or edit metadata tags on any image, audio, or video file. The site grants visitors the ability to conduct quick searches, find related content, refine search results by date, and discover the most recently uploaded or most popular images.
    To provide flexibility to customers, the application is built as a backend “engine” that exposes an API, with a mobile-friendly, responsive web front-end. Anyone can integrate a customized application with the system by making simple HTTP requests to the API.
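    For example, a quick query against the public search endpoint might look like the following; the endpoint, parameters, and response fields here are assumptions drawn from the public service, shown only to illustrate the idea of simple HTTP requests:

    import requests  # third-party HTTP client

    # Hypothetical example query; check the published API documentation for exact fields.
    resp = requests.get('https://images-api.nasa.gov/search',
                        params={'q': 'apollo 11', 'media_type': 'image'})
    resp.raise_for_status()
    for item in resp.json()['collection']['items'][:5]:
        print(item['data'][0]['title'])
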
    The solution leverages the scalability of cloud storage to provide unlimited storage capacity, which eliminates the need for onsite server management and administration. The economies of scale provided by cloud storage permit NASA to realize declining storage costs while providing built-in durability.
    Moe Benesch, CEO of V! Studios says, “While it is exciting to see all this great NASA content made easily available to the public, the most interesting opportunity will come from how other organizations, or individuals, utilize the API and reach into terabytes of content as part of their own web application or research project.”
    Companies and organizations looking to offer easily retrievable high resolution content can benefit from solutions similar to images.nasa.gov. Practical applications include media organizations needing to make content available from a simple interface, or public institutions that wish to make their content available to the public in multiple resolutions.
  • ScienceCasts: NASA Embraces Small Satellites

    NASA is embracing small satellite designs, from tiny CubeSats to micro-satellites. These miniature marvels are providing many ways to collect science data and to demonstrate new technologies.
  • Design Patterns for Serverless, Lambda, DynamoDB, S3


    Motivation

    We’ve been using AWS load balancers with autoscaling instances for years now and they’re great at handling load, but it’s quite a bit of infrastructure to manage (even with Troposphere + CloudFormation). We also have to manage all the data flow, queue processing and such ourselves: multiple SQS queues, EC2 polling, recording state in databases… More of our code is dedicated to that “plumbing and wiring” than to the actual focus of our application.


    We’ve been looking at “serverless” options since the announcement of AWS Lambda, and have been following the rapid development of the Serverless framework. Recently I started working on a proof of concept, re-grooving our cloudy application for a serverless world. So far, I’m really lovin’ it.


    Instead of implementing data flow,  finite state machines, queuing and such in our software, we describe that wiring in terms of AWS “events” that trigger “functions” running on Lambda infrastructure. Instead of “data flow as code” we now have “data flow as configuration”.  


    We’re comfortable with AWS services — and for this exercise, want to avoid using any 24×7 EC2 servers — so our Lambda functions interact with S3 object storage, DynamoDB databases, and the Elasticsearch Service. Here are some design patterns that have proven helpful as we’ve come up to speed in this brave new world; they’re pretty generic problem-solving approaches, so they should be applicable to your applications as well.


    Sample Application

    The application I’m using for this learning exercise, this proof of concept, is a pretty common pattern. Someone uploads an image to an S3 object store, it gets resized, some data about it is stored in a database, then the data and processed image locations are put into a search engine. A simple query API lets users find images in a variety of sizes. The API is exposed to the public via a responsive Angular web front-end, but that’s a subject for another post.

    We upload an image to S3 and it fires an event to an “extract” lambda which stores info in a database. A “resize” lambda is triggered to resize the image and store it in S3, then updates the database entry. The database emits an event stream which is fed to a “publish” lambda which checks to see if both the resize and metadata information is present, and if so, injects the info into our search engine.
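
    To make that last step concrete, the heart of the “publish” lambda is just a completeness check on the stream record before indexing. This is a sketch: the field names are made up, and es is an Elasticsearch client set up elsewhere, used the same way as in the examples further down.

    def maybe_publish(item):
        """Index into Elasticsearch only once both halves of the record exist."""
        if 'metadata' in item and 'resized_keys' in item:    # hypothetical field names
            es.index(index='images', doc_type='image', id=item['id'], body=item)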

    Below we show some patterns we’ve found useful for doing this. We’re mostly a Python shop, so the code examples are in Python. The Serverless framework configuration is a YAML file, serverless.yml.

    S3 Events

    S3 can emit events on object creation and deletion. Normally we care about creation (uploads) and handle it with a lambda that acts on the creation.
    We can create events in each lambda function definition separately for each event type, so we can have a different function module for each event, like:
    functions:
      extract:
        handler: extract.handler
        events:
          - s3:
              bucket: images-in
              event: s3:ObjectCreated:*
      nuke:
        handler: nuke.handler
        events:
          - s3:
              bucket: images-in
              event: s3:ObjectRemoved:*
    
    This, in my case, would invoke two Python modules, extract.py and nuke.py, each of which has its own handler function. Easy enough, but it could be a bit too fine-grained if there’s a lot of redundant code in the two files.
    Instead, we could have one module that processes both (all) event types, with a different handler function for each:
    functions:
      create:
        handler: s3in.handle_create
        events:
          - s3:
              bucket: images-in
              event: s3:ObjectCreated:*
      delete:
        handler: s3in.handle_delete
        events:
          - s3:
              bucket: images-in
              event: s3:ObjectRemoved:*
    
    Here, we’d have a single s3in.py module with two handler functions, handle_create() and handle_delete(). If both share some code, this reduces repetition.
    Or we could have one module and one handler and let the handler discriminate based on examination of the event:
    functions:
      extract:
        handler: s3event.handler
        events:
          - s3:
              bucket: images-in
              event: s3:*
    
    I expect it’s more reliable to let the Lambda event/handler mapping do the discrimination, since it saves the handler from doing that step itself, as in this last example.
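    For completeness, in-handler discrimination would look something like this (a sketch; handle_create and handle_delete are the same sort of functions as in the s3in.py example above):

    def handler(event, context):
        # One handler subscribed to s3:*, switching on each record's eventName.
        for record in event['Records']:
            name = record['eventName']      # e.g. 'ObjectCreated:Put', 'ObjectRemoved:Delete'
            if name.startswith('ObjectCreated'):
                handle_create(record)
            elif name.startswith('ObjectRemoved'):
                handle_delete(record)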

    DynamoDB Streams

    Unlike S3 events, DynamoDB streams emit information about the changed rows. The record contains an eventName like “INSERT”, “MODIFY”, or “REMOVE”. We don’t get separate events we can discriminate on in the serverless.yml file.
    This means that our handler must handle all event types. We can do something like:
    eventname = record['eventName']
    if eventname == 'REMOVE':
        self.delete()
    ....
    raise Exception('Unimplemented: id={} ignoring eventname={}'.format(self.id, eventname))
    
    This is a pretty straight-forward pattern.
    Below we show how we could switch on these with a minimal top-level handler() function.

    Handler structure

    I’m finding it convenient to have a minimal handler function that loops over incoming S3 events or DynamoDB streams. I frequently see a bunch of DynamoDB records come in at once to my lambda — we can’t simply assume we get a single event and get only records[0].
    So my handler tries to be as dumb as possible, looping over the triggers and calling a class to do the work:
    def handler(event, context):
        try:
            for record in event['Records']:
                AssetDDBRecordHandler(record)
        except Exception as e:
            msg = 'ERROR assetddb.handler: {}'.format(e)
            log.error(msg)
            return {'event': event, 'message': msg}
        return {'event': event,
                'message': 'Function executed successfully: asset.handler'}
    
    This also makes it easy to wrap the record handler in an exception handler that logs the exception — which lets the Lambda complete successfully instead of throwing the exception which causes Lambda to needlessly retry the permanently-failed event (which the Lambda infrastructure presumably does to overcome transient errors, overload conditions, etc).
    In the class, the constructor examines the event type (e.g., ADD, REMOVE, …) and invokes a method specific to the event — a simple dispatcher:
    class AssetDDBRecordHandler:
        def __init__(self, record):
            self.id = record['dynamodb']['Keys']['id']['S']
            eventname = record['eventName']  # INSERT, MODIFY, REMOVE
            if eventname == 'REMOVE':
                self.delete()
            elif eventname == 'INSERT':
                self.insert()
            else:
                raise Exception('Unimplemented: id={} ignoring eventname={}'.format(self.id, eventname))
        def delete(self):
            try:
                res = es.delete(index='images', doc_type='image', id=self.id)
            except Exception as e:
                raise Exception('id={} deleting Elasticsearch index: {}'.format(self.id, e))
    
    Note that the code that actually does the work is free to throw detailed exceptions up to the top-level handler() since it will catch them and log instead of blowing out the lambda.

    DynamoDB Streams Native Protocol Deserialization

    When we get S3 events in Lambda, we get a clean structure we can dissect easily to get the event, bucket, key and whatnot as native Python structures. When we access DynamoDB through the Boto3 library using the Table model, we can also read and write the Python structures easily.
    But the stream records we get from DynamoDB in our Lambda are encoded with a low-level serialization protocol like the one you have to use when you work with Boto3’s DynamoDB Client model. Each datum is a dict indicating type and value, like:
    {u'_dt': {u'S': u'2017-02-08T13:30:38.915580'},
     u'id': {u'S': u'ALEX18'},
     u'metadata': {u'M': {u'description': {u'S': u'12-year-old...'}}} }
    
    So we have to deserialize it to process. It’s tempting to think this is easy to write, but realize that DynamoDB records can have nested elements, like the u’M’ map (Python dict) above, so it becomes a chore to roll your own.
    “There’s got to be a better way!” Happily, there is.
    Boto3 has to do this, and it has a deserializer (and serializer) we can simply import and use. This is how I do it:
    from boto3.dynamodb.types import TypeDeserializer
    
    deserialize = TypeDeserializer().deserialize
    for record in event['Records']:
        data = {}
        new = record['dynamodb'].get('NewImage')
        if new:
            for key in new:
                data[key] = deserialize(new[key])
        id = data['id']
    
    Now we can work with data as native Python objects.
    We could get clever and use a Python dict-comprehension to combine three lines into one:
    data = {k: deserialize(v) for k, v in new.items()}

    Conclusion

    We’re certain to come up with more design patterns that make the resource-event-function wiring easier, and the lambda processing more self-contained, but the above have emerged quickly and naturally as ways to structure our project. It’s been a fun and rewarding exercise which has given us the confidence to go all-out on future projects in a serverless, event-driven manner.