Building and deploying AWS Lambda layers for Python on Ubuntu

2020-04-11

Simple docker script to build an AWS 'serverless' Lambda layer

A Lambda "layer" is what AWS calls a zip archive of libraries that you build yourself and that Lambda loads alongside your serverless function at runtime.

[Note that some people have already built many layers and made them public for many use cases. Links to the ARNs of these layers are posted on GitHub at https://github.com/keithrozario/Klayers and links to others at https://github.com/mthenw/awesome-layers. You might want to try their versions first before you try this script.]

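AWS puts limits on both the number of layers (five per function) and their total size (250 MB unzipped, counting the function code and all of its layers together).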

[We are not happy with these limits. Actually much closer to unhappy than happy. Google Cloud has similar limits. Azure has no such limits. See our other article on how we get around these limits {pull hair; bang forehead on screen; wipe blood off screen; grimace in the general direction of Seattle; write own dynamic library loader}]

Update: There is now (June, 2020) a way to attach AWS EFS to Lambda to overcome some of these issues - albeit at additional cost https://aws.amazon.com/blogs/aws/new-a-shared-file-system-for-your-lambda-functions/. We'll cover this in a future post.

Below is the script we use to build a layer for a lambda, in this case for a lambda function that is written in python 3.7.

requirements.txt contains the list of packages that the layer will contain. For example "pandas".

In our application, Automatic.ai, layers are atomic and we build a separate layer for each major library. For example, numpy, which pandas requires, is built as a separate layer. These atomic layers are later combined dynamically by our runtime API on an as-needed basis to support the unique requirements of each user's intelligent service.

In many applications, however, there is some benefit and simplification to just fitting as many libraries as possible into the layer, as long as the total layer size stays within the hellishly small limits relative to the size of the average bloated library circa 2020.
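A quick way to see how close a build is to the limit is to measure the unpacked python directory before zipping it. The following is a minimal sketch, not part of our build: check_layer_size is a hypothetical helper name, and the default of 250 assumes the combined unzipped limit for a function plus all of its layers, so the real budget for any single layer is smaller.

```shell
#!/bin/bash

# Hypothetical helper: succeed if directory $1 is at most $2 MB (default 250).
check_layer_size() {
    local dir="$1"
    local limit_mb="${2:-250}"
    local size_mb
    # du -sk reports the total size in kilobytes; convert to whole megabytes.
    size_mb=$(( $(du -sk "$dir" | cut -f1) / 1024 ))
    echo "${dir}: ${size_mb} MB of ${limit_mb} MB"
    [ "$size_mb" -le "$limit_mb" ]
}
```

It could be called as check_layer_size python || echo "layer too large" right after the build step.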



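#!/bin/bash
# Build the layer contents into ./python using a Lambda-compatible build image.

export PKG_DIR="python"

# Start from an empty package directory on every build.
rm -rf "${PKG_DIR}" && mkdir -p "${PKG_DIR}"

# Mount the current directory at /foo inside the container and install there.
# Quoting $(pwd) keeps the mount working when the path contains spaces.
docker run --rm -v "$(pwd)":/foo -w /foo lambci/lambda:build-python3.7 \
    pip install -r requirements.txt --no-deps --ignore-installed -t "${PKG_DIR}"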


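This tells docker to start a container with an Amazon-Linux-compatible build environment for Python 3.7. Inside the container, pip installs every package listed in requirements.txt into the python directory, without pulling in their dependencies (--no-deps) and without regard to any copies of the same libraries that are already installed (--ignore-installed). The directory has to be named python because Lambda unpacks layers under /opt and the Python runtime looks for packages in /opt/python.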

Simple script to deploy an AWS Lambda layer

Next is the script we use to upload the layer to AWS S3 and register it with AWS lambda.

AWS lambda functions refer to layers by their ARN - a long string of at-first intimidating characters which you may find becomes a friendly and useful sight-aid over time.
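To make the string less intimidating, here is a small sketch that splits a layer version ARN into its parts. It assumes the usual shape arn:aws:lambda:REGION:ACCOUNT_ID:layer:LAYER_NAME:VERSION; describe_layer_arn is a hypothetical helper, and the region and account ID in the example call are made up.

```shell
#!/bin/bash

# Hypothetical helper: print the fields of a layer version ARN, one per line.
# Expected shape: arn:aws:lambda:REGION:ACCOUNT_ID:layer:LAYER_NAME:VERSION
describe_layer_arn() {
    local arn="$1"
    local prefix partition service region account kind name version
    # Split the ARN on colons into its eight fields.
    IFS=':' read -r prefix partition service region account kind name version <<EOF
$arn
EOF
    if [ "$prefix" != "arn" ] || [ "$service" != "lambda" ] || [ "$kind" != "layer" ]; then
        echo "not a Lambda layer ARN: ${arn}" >&2
        return 1
    fi
    echo "region=${region}"
    echo "account=${account}"
    echo "layer=${name}"
    echo "version=${version}"
}

# The account ID and region here are made up for illustration.
describe_layer_arn "arn:aws:lambda:us-east-1:123456789012:layer:pandas_layer_3_7:1"
```

The trailing number is the layer version; every publish creates a new one, so the ARN of a layer changes each time you rebuild it.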

Replace OWNER with the user who is doing the upload and BUCKET with an S3 bucket. The last call to AWS in this script (add-layer-version-permission, commented out by default) makes the layer's ARN publicly available; uncomment it if that is of importance to you, and make the bucket similarly public. [Otherwise, if you want the layer to stay private to your account, leave that line commented out and drop --acl public-read from the upload.]


#!/bin/bash

PKG_NAME="$1"
OWNER="me"
BUCKET="mybucket"

# The docker build writes the files as root; take ownership back before zipping.
sudo chown -R ${OWNER}:${OWNER} python

rm -rf python/*.dist-info python/__pycache__

zip -r ${PKG_NAME}_layer.zip python

export AWS_PROFILE="production"

aws s3 cp ${PKG_NAME}_layer.zip s3://${BUCKET}/layer/  --acl public-read

# https://docs.aws.amazon.com/cli/latest/reference/lambda/publish-layer-version.html
aws lambda publish-layer-version --layer-name ${PKG_NAME}_layer_3_7 --content S3Bucket=${BUCKET},S3Key=layer/${PKG_NAME}_layer.zip --license-info "MIT" --compatible-runtimes python3.7

# https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html
# aws lambda add-layer-version-permission --layer-name ${PKG_NAME}_layer_3_7 --statement-id allow-everyone --version-number 1 --principal '*' --action lambda:GetLayerVersion

export AWS_PROFILE="staging"


Let's see... The line:


rm -rf python/*.dist-info python/__pycache__

Here we try to reduce the size of the library package in a generic manner. YMMV. All I can say here is that the current AWS limit of 250 MB (unzipped, function code plus all layers) for each lambda is ridiculously small when you want to use ridiculously large machine learning libraries.
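Note that the line above only touches the top level of the python directory, while nested packages usually carry their own __pycache__ directories. A hedged sketch of a recursive version follows; trim_layer is a hypothetical helper, not something we ship, and it assumes you are willing to let Python recompile the bytecode at import time.

```shell
#!/bin/bash

# Hypothetical helper: strip bytecode caches from every level of a layer directory.
trim_layer() {
    local dir="$1"
    # -prune stops find from descending into a directory it is about to delete.
    find "$dir" -type d -name "__pycache__" -prune -exec rm -rf {} +
    # Remove any stray compiled files left outside __pycache__ directories.
    find "$dir" -type f -name "*.pyc" -exec rm -f {} +
}
```

Run it as trim_layer python before the zip step. Source files are left alone, so the libraries still import; the saving depends entirely on the package.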


export AWS_PROFILE="production"

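Selects the 'production' profile from the AWS CLI configuration, so the calls that follow use the production credentials rather than staging or test. The AWS_PROFILE ENV variable is set back to staging at the end of the script because we do not want to unwittingly change the production deployment at some future point while we are mucking around. Note that the reset only reaches your terminal if you source the script; when the script runs as a separate process, its exports disappear when it exits. 'staging' is the default ENV setting for every terminal here.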
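An alternative that avoids the set-then-restore dance is to scope the profile to a single command. This is a sketch under the assumption that you are happy wrapping each AWS call; with_profile is a hypothetical helper name.

```shell
#!/bin/bash

# Hypothetical helper: run one command with AWS_PROFILE set only for that command.
with_profile() {
    local profile="$1"
    shift
    # The parentheses start a subshell, so the export cannot leak to the caller.
    ( export AWS_PROFILE="$profile"; "$@" )
}
```

For example, with_profile production aws s3 ls lists the production buckets and leaves the terminal's own AWS_PROFILE untouched, even if the command fails halfway.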


aws s3 cp ${PKG_NAME}_layer.zip s3://${BUCKET}/layer/  --acl public-read

Now we upload the zip file to AWS S3 using the command line and the AWS CLI. We keep the uploaded zips in the 'layer' directory [yup, AWS doesn't call them directories. Or folders. They call them 'keys'. So many implementation details thrown about like sprinkles on a donut. Rain in Seattle. Laughs on a 70's sitcom laugh track.] and make them public with --acl public-read [You may not want to do this] so that every service can have access to them for dynamically loaded layers. At some future date we may publicize the URL of all our layers and their ARN addresses if it appears they will be useful outside our use case.
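The upload destination and the S3Key handed to publish-layer-version have to name the same object, so it can help to derive both from one place. A minimal sketch, with layer_key and layer_s3_uri as hypothetical helper names and mybucket and pandas as placeholder values:

```shell
#!/bin/bash

# Hypothetical helpers: derive the object key and the S3 URI from one place.
layer_key() {
    # $1 = package name
    echo "layer/${1}_layer.zip"
}

layer_s3_uri() {
    # $1 = bucket name, $2 = package name
    echo "s3://${1}/$(layer_key "$2")"
}

layer_key pandas
layer_s3_uri mybucket pandas
```

The upload would then copy to "$(layer_s3_uri "$BUCKET" "$PKG_NAME")" and the publish call would pass S3Key=$(layer_key "$PKG_NAME"), so a typo in one cannot drift away from the other.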


aws lambda publish-layer-version --layer-name ${PKG_NAME}_layer_3_7 --content S3Bucket=${BUCKET},S3Key=layer/${PKG_NAME}_layer.zip --license-info "MIT" --compatible-runtimes python3.7

This line registers the uploaded zip file as a lambda layer. When it is done, the command prints a JSON description of the new layer version, including its ARN, to STDOUT.
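If you only want the ARN, the AWS CLI's --query and --output options can pick it out of that JSON. The wrapper below is a sketch rather than our actual script: publish_layer is a hypothetical name, and it assumes the zip has already been uploaded to layer/<package>_layer.zip in the bucket.

```shell
#!/bin/bash

# Hypothetical wrapper: publish a layer version and print only its ARN.
# $1 = package name, $2 = bucket; the zip must already be at layer/<pkg>_layer.zip.
publish_layer() {
    local pkg="$1"
    local bucket="$2"
    aws lambda publish-layer-version \
        --layer-name "${pkg}_layer_3_7" \
        --content "S3Bucket=${bucket},S3Key=layer/${pkg}_layer.zip" \
        --compatible-runtimes python3.7 \
        --query 'LayerVersionArn' \
        --output text
}
```

Something like LAYER_ARN=$(publish_layer pandas mybucket) then leaves the ARN in a variable for later steps. Querying 'Version' instead would return the version number.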


aws lambda add-layer-version-permission --layer-name ${PKG_NAME}_layer_3_7 --statement-id allow-everyone --version-number 1 --principal '*' --action lambda:GetLayerVersion

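This last call to AWS makes the layer public. Permissions are granted per layer version, so --version-number has to match the version you just published (1 only for the first publish). [You may not want to do this. But this is what keithrozario et al. have done (see links above) and if you want to contribute your lambda layer ARNs to the world, this is how you would go about it].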
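If sharing with the whole world is more than you need, the same call accepts a single AWS account ID as the principal. A hedged sketch follows; share_layer and its default statement id are hypothetical, and the account ID in the usage line is made up.

```shell
#!/bin/bash

# Hypothetical wrapper: grant lambda:GetLayerVersion on one layer version.
# $1 = layer name, $2 = version number, $3 = principal (an account ID, or '*')
# $4 = statement id (optional; must be unique within that version's policy)
share_layer() {
    local layer="$1"
    local version="$2"
    local principal="$3"
    local statement_id="${4:-allow-access}"
    aws lambda add-layer-version-permission \
        --layer-name "$layer" \
        --version-number "$version" \
        --statement-id "$statement_id" \
        --principal "$principal" \
        --action lambda:GetLayerVersion
}
```

For instance, share_layer pandas_layer_3_7 2 123456789012 would share version 2 with that one account only.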

- enjoy.