We've been diggin' the serverless ecosystem around AWS Lambda, API Gateway and the rest, especially using the Serverless Framework, which makes it easy to define interesting applications.
We frequently need to upload files from a user's desktop to AWS S3 storage, and prefer to have the browser upload directly rather than passing the file through some API handler. This lets us deploy fewer API instances, which would otherwise be bogged down just acting as a pass-through between the browser and the S3 bucket. It matters even more when using Lambdas, which have limited disk space and execution time.
Happily, it's pretty easy to do using pre-signed S3 URLs. These are URLs that limit the time allowed for an operation (e.g., download, upload) and encode temporary credential information, so that the requester (a user's browser) does not need the actual credentials for the bucket.
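As a taste of the mechanics before we dive into the app, here's a minimal boto3 sketch of a time-limited download URL; the bucket and key are placeholders, and you'll need AWS credentials configured for the signing:

import boto3

s3 = boto3.client('s3')

# Sign a URL allowing a plain GET of this (hypothetical) object for 10 minutes.
url = s3.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'my-example-bucket', 'Key': 'report.pdf'},
    ExpiresIn=600)

# The URL embeds X-Amz-Credential, X-Amz-Expires and X-Amz-Signature
# query parameters, so whoever holds it needs no AWS credentials.
print(url)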
We're building a "Scaffolding" for Serverless apps that includes stuff like docs, tests, CI/CD, and deployment to multiple AWS environments upon commits. For the sample CRUD app, I'm using an Angular front-end and a Python back-end, so the code here will be in those languages. I've seen a number of folks looking for examples of this pre-signed upload, and it took me a while to suss out how to convince Angular to do it, so I thought I'd share what I learned.
In uploads/uploads.component.html we have a basic file picker:
<div>
  <label>Upload Filename:
    <input type="file" #file (change)="onChange(file.files[0])" />
  </label>
  <button (click)="doUpload(file.files[0])">
    upload
  </button>
</div>
which renders as a plain file-input control next to an upload button.
In uploads/uploads.component.ts, we define the invoked function:
import { UploadService } from '../upload.service';

doUpload(file0): void {
  // Read file, get presigned URL from API, then PUT file to that S3 URL
  const reader: FileReader = new FileReader();
  reader.onloadend = (e) => { // arrow function keeps `this` bound to the component
    const fileBody = reader.result;
    this.uploadService.getUploadURL(file0)
      .subscribe(urlObj => {
        this.uploadService.putUploadFile(urlObj, file0, fileBody)
          .subscribe(res => this.log(`doUpload done res=${res}`));
      });
  };
  reader.readAsArrayBuffer(file0); // readAsBinaryString mutilates binaries!
}
It took me a while to realize that I needed to handle onloadend here, and invoke my services after the file had finished reading, rather than doing the read in the service.
After the file read completes, the component asks a service to calculate a pre-signed URL for a PUT operation; this is in upload.service.ts:
getUploadURL(file: File): Observable<UploadURL> {
  const apiURL = `${this.baseURL}/upload_url?filename=${file.name}`;
  const headers = { 'Content-Type': file.type };
  const options = { headers: headers };
  // API returns: {'url': url [, 'warning': 'if no content-type specified'] }
  return this.http.get<UploadURL>(apiURL, options).pipe(
    tap(uploadURL => this.log(`got upload=${uploadURL.url}`)),
    catchError(this.handleError<UploadURL>('getUploadURL'))
  );
}
Another method in the same service then performs the HTTP PUT:
putUploadFile(uploadURL: UploadURL, file: File, fileBody): Observable<any> {
  const headers = { 'Content-Type': file.type };
  const options = { headers: headers };
  return this.http.put(uploadURL.url, fileBody, options).pipe(
    tap(res => this.log(`putUploadFile got res=${res}`)),
    catchError(this.handleError<UploadURL>('putUploadFile'))
  );
}
I discovered the hard way that I had to set the Content-Type myself; otherwise http.put() assigned one of its own choosing, and that broke the cryptographic signature calculated for the pre-signed URL. Make sure you specify it both when calculating the URL and in the HTTP operation. Happily, it's easy to get from the HTML File object.
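If you want to see that signature coupling for yourself outside the browser, here's a hedged Python sketch (hypothetical bucket, and it assumes the requests library is installed): sign the URL with one ContentType, then PUT with a matching header; change the header and S3 answers 403 SignatureDoesNotMatch.

import boto3
import requests

s3 = boto3.client('s3')

# Sign an upload URL that bakes 'text/plain' into the signature.
url = s3.generate_presigned_url(
    'put_object',
    Params={'Bucket': 'my-example-bucket',  # hypothetical bucket
            'Key': 'hello.txt',
            'ContentType': 'text/plain'},
    ExpiresIn=300)

# The PUT must send the same Content-Type, or S3 rejects the request
# with 403 SignatureDoesNotMatch.
resp = requests.put(url, data=b'hello, world',
                    headers={'Content-Type': 'text/plain'})
print(resp.status_code)  # expect 200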
In my serverless.yml file, I specify a Lambda with an API Gateway endpoint:
getUploadURL:
  handler: handler.get_upload_url
  events:
    - http:
        method: get
        path: /upload_url
        cors: true
The Python Lambda function in handler.py calculates the URL; there’s more comment than code:
from json import dumps
from os import environ

import boto3

s3 = boto3.client('s3')
# Assumption: the bucket name arrives via an environment variable set in serverless.yml
UPLOAD_BUCKET_NAME = environ.get('UPLOAD_BUCKET_NAME')

def get_upload_url(event, _context):
    """Return a presigned URL to PUT a file to in our S3 bucket, with read access.
    We get the Content-Type from HTTP headers.
    """
    content_type = event['headers'].get('content-type')  # APIG downcases this
    filename = event['queryStringParameters'].get('filename')
    if not filename:
        return {'statusCode': 400,
                'body': 'Must supply query string "filename=..."'}
    # We need to spec content-type since NG sets this header.
    # ContentType is the proper boto3 spelling, no dash; value must be lowercase.
    # ACL=public-read so NG can read it and display directly, without the API.
    params = {'Bucket': UPLOAD_BUCKET_NAME, 'Key': filename, 'ACL': 'public-read'}
    if content_type:
        params['ContentType'] = content_type
    url = s3.generate_presigned_url('put_object', Params=params, ExpiresIn=3600)
    body_json = {'url': url}
    if not content_type:
        body_json['warning'] = 'No Content-Type given; AWS signature calculation may require it'
    return {'statusCode': 200,
            'headers': {'Access-Control-Allow-Origin': '*'},
            'body': dumps(body_json)}
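Before deploying, you can smoke-test the handler locally with a hand-rolled API-Gateway-style event (values hypothetical; UPLOAD_BUCKET_NAME must be set and AWS credentials configured, but presigning is pure local computation with no round-trip to S3):

# Hypothetical local smoke test for the handler above.
fake_event = {
    'headers': {'content-type': 'image/png'},
    'queryStringParameters': {'filename': 'test.png'},
}
response = get_upload_url(fake_event, None)
print(response['statusCode'])  # expect 200
print(response['body'])        # JSON containing the signed 'url'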
And that's it. Pretty straightforward, quick, and easy to run.
You can use the same technique to offer, say, limited-time downloads: just swap 'put_object' for 'get_object' when generating the URL (as in the first sketch above) and have the client GET instead of PUT.
One word of warning: single-part S3 uploads are limited to 5GB; this caused us a bit of surprise when we were building the video portion of images.nasa.gov, since our large video files (frequently over 25GB) cannot be uploaded as a single part. For those, you'll need to use S3 multipart uploads. You can use Lambdas to calculate the checksums required for each part, but you'll have to do a bunch more work on the front-end side to assemble all the proper headers; it's probably best to use the AWS SDK in your browser rather than rolling your own, as it's hairy. A server-side sketch of the flow follows.
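To give a flavor of the Lambda side (a hedged sketch with placeholder names, not the code we shipped): create the multipart upload, presign one URL per part for the browser to PUT to, and complete the upload with the ETags the browser collects from each part's response.

import boto3

s3 = boto3.client('s3')
BUCKET = 'my-example-bucket'  # hypothetical
KEY = 'big-video.mov'         # hypothetical

# 1. Start the multipart upload and remember its UploadId.
upload_id = s3.create_multipart_upload(Bucket=BUCKET, Key=KEY)['UploadId']

# 2. Presign one URL per part (parts are 1-indexed); the browser
#    PUTs each chunk to its URL and records the ETag response header.
part_urls = [
    s3.generate_presigned_url(
        'upload_part',
        Params={'Bucket': BUCKET, 'Key': KEY,
                'UploadId': upload_id, 'PartNumber': n},
        ExpiresIn=3600)
    for n in range(1, 4)  # e.g., three parts
]

# 3. Once every part is uploaded, stitch them together.
#    The ETags here are placeholders for what the browser reported back.
s3.complete_multipart_upload(
    Bucket=BUCKET, Key=KEY, UploadId=upload_id,
    MultipartUpload={'Parts': [
        {'PartNumber': 1, 'ETag': '"etag-1"'},
        {'PartNumber': 2, 'ETag': '"etag-2"'},
        {'PartNumber': 3, 'ETag': '"etag-3"'},
    ]})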
