Intro to AWS Lambda
Sandra Garcia, Jose San Pedro
Data Learning Sessions. 16th March 2017
What is AWS Lambda
- Serverless Computing
  - 100% managed
  - Scales automatically
  - Pay only for execution time
- Lambda Functions
  - Languages: NodeJS / Python / Java / Scala
  - Stateless
  - Suitable for quick event-based operations (<5min)
  - Fast to set up: a basic setup (Lambda + trigger + basic function) can be done in half an hour ;)
Common Use Cases
Execution Model
● Inputs
  ○ Code: Controls what the Lambda does
  ○ RAM: Controls the amount of resources provisioned (CPU, etc.)
  ○ Time: Controls the cost
● When a Lambda function is triggered
  ○ AWS launches a container and executes it
  ○ Container creation is expensive, so AWS tries to reuse it for subsequent invocations
    ■ Global environment of the code is preserved
    ■ /tmp folder persists
  ○ No control over when containers are reused
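The reuse behaviour above suggests caching expensive setup in /tmp so that a warm container can skip it. The following is a minimal sketch in shell, to stay with the deck's command-line examples; the cache directory, file name and `fetch_reference_data` placeholder are all hypothetical, and in a real function the same check would live in the handler code.

```shell
# Hypothetical cache location; /tmp only survives while the container is reused.
CACHE_DIR="${CACHE_DIR:-/tmp/lambda-cache}"

# Placeholder for an expensive step (for example downloading reference data).
fetch_reference_data() {
  echo "reference data"
}

# Prints "cold" when the cache had to be built and "warm" when it was reused.
prepare_cache() {
  cache_file="$CACHE_DIR/reference.dat"
  if [ -f "$cache_file" ]; then
    echo "warm"
  else
    mkdir -p "$CACHE_DIR"
    fetch_reference_data > "$cache_file"
    echo "cold"
  fi
}
```

Since there is no control over when containers are reused, the cold path has to work on its own every time; the cache is only an optimisation.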
Autoscaling Support
● CPU size based on allocated RAM
● Concurrent invocations:
  ○ Stream-based: one per shard
  ○ Event-based
    ■ One invocation per event
    ■ Number of concurrent invocations ~ events_per_second * time_to_process_event
  ○ Default safety limit to avoid massive bills
    ■ 100 concurrent invocations per account and region
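The event-based estimate can be worked through with a few lines of shell. This is only a sketch of the arithmetic on the slide (events_per_second * time_to_process_event); the function names are made up for illustration, and the default of 100 is the limit quoted in the deck.

```shell
# Estimated concurrent invocations = events per second * seconds per event,
# rounded up to a whole invocation (and at least 1).
estimate_concurrency() {
  awk -v eps="$1" -v secs="$2" 'BEGIN {
    c = eps * secs
    n = int(c)
    if (n < c) n++
    if (n < 1) n = 1
    print n
  }'
}

# Succeeds when the estimate fits under the concurrency limit (default 100).
within_limit() {
  [ "$(estimate_concurrency "$1" "$2")" -le "${3:-100}" ]
}

# Example: 50 events/s at 0.5 s each needs about 25 concurrent invocations.
estimate_concurrency 50 0.5
```

For stream-based sources the estimate does not apply, because concurrency follows the number of shards.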
Fault Tolerance
● Stream-based
  ○ Repeated attempts to process failing records until they expire
● Event-based
  ○ Retried twice. If the error persists, a notification is sent to the Dead Letter Queue, if one is defined
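For event-based invocations, a Dead Letter Queue can be attached from the command line. A minimal sketch, assuming an SQS queue or SNS topic already exists; the wrapper name, function name and ARN in the usage comment are placeholders.

```shell
# Attach a Dead Letter Queue (SQS queue or SNS topic ARN) to a function.
set_dead_letter_queue() {
  function_name="$1"
  target_arn="$2"
  aws lambda update-function-configuration \
    --function-name "$function_name" \
    --dead-letter-config "TargetArn=$target_arn"
}

# Usage (placeholder values):
#   set_dead_letter_queue my-function arn:aws:sqs:eu-west-1:123456789012:my-dlq
```

The function's execution role also needs permission to write to the target, otherwise the configuration update is rejected.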
VPC Support
Our setup (personalisation team): Reading content events
- Processing ~300,000 events a day
- Written in Scala
- Consuming ad content events from Kinesis
- Lambda function parses events for different Rocket sites (now aggeliopolis & kufar) and indexes the extracted ads into Elasticsearch indices
Our setup: Reading content events
Put simply: Kinesis Stream -> Lambda -> Elasticsearch (VPC)
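Wiring a Kinesis stream to a Lambda function like this can also be done from the command line through an event source mapping. The sketch below is not the team's actual configuration: the wrapper name, the batch size default and the starting position are assumptions chosen for illustration.

```shell
# Connect a Kinesis stream to a Lambda function.
# Batch size and starting position are illustrative defaults, not the deck's values.
connect_kinesis_stream() {
  function_name="$1"
  stream_arn="$2"
  batch_size="${3:-100}"
  aws lambda create-event-source-mapping \
    --function-name "$function_name" \
    --event-source-arn "$stream_arn" \
    --starting-position TRIM_HORIZON \
    --batch-size "$batch_size"
}

# Usage (placeholder ARN):
#   connect_kinesis_stream ad-content-indexer arn:aws:kinesis:eu-west-1:123456789012:stream/ad-events
```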
Setting up AWS Lambda
1. Configure your trigger
2. Configure your function
3. Logging & Monitoring
4. CD & CI
Configuring Triggers
Configuring your lambda
Configuring your lambda handler
Setting up Lambda with awscli
● Full command line support through awscli
    aws lambda <command>
● Automate deployment
    aws lambda create-function \
      --function-name "${LAMBDA_FUNCTION_NAME}" \
      --runtime java8 \
      --role "${LAMBDA_EXECUTION_ROLE_ARN}" \
      --handler "indexing.lambda.KinesisAdEventsProcessor::processAdEvents" \
      --code "S3Bucket=${S3BUCKET},S3Key=${EXECUTION_ENVIRONMENT}/${VERSION}/${S3KEY}" \
      --description "recsys-ad-content-index:${VERSION}" \
      --environment "Variables={commit_hash=${VERSION}}" \
      --vpc-config "${LAMBDA_VPC_CONFIG}" \
      --memory-size 512 \
      --timeout 30
Setting up Lambda with awscli
    aws lambda update-function-code \
      --function-name "${LAMBDA_FUNCTION_NAME}" \
      --s3-bucket "${S3BUCKET}" \
      --s3-key "${EXECUTION_ENVIRONMENT}/${VERSION}/${S3KEY}"

    aws lambda get-function --function-name "${LAMBDA_FUNCTION_NAME}"

    aws lambda list-functions

    aws lambda update-function-configuration \
      --function-name "${LAMBDA_FUNCTION_NAME}" \
      --description "recsys-ad-content-indexer:${VERSION}"
CD & CI
Versioning and aliases
- An alias can be pointed at any published Lambda version, so you can easily switch your stable version or roll back
- v1: arn:aws:lambda:eu-west-1:3144322:function:ad-content-indexer:v1
- v2
- $LATEST: arn:aws:lambda:eu-west-1:3144322:function:ad-content-indexer:$LATEST
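Switching a stable alias or rolling back can be scripted with awscli. The sketch below assumes the alias already exists (created once with `aws lambda create-alias`); the wrapper names and the alias name `stable` are hypothetical.

```shell
# Publish the current $LATEST code as a new version and point an alias at it.
publish_and_promote() {
  function_name="$1"
  alias_name="$2"
  version="$(aws lambda publish-version \
    --function-name "$function_name" \
    --query Version --output text)" || return 1
  aws lambda update-alias \
    --function-name "$function_name" \
    --name "$alias_name" \
    --function-version "$version"
}

# Roll back by pointing the alias at an earlier version number.
rollback_alias() {
  aws lambda update-alias \
    --function-name "$1" \
    --name "$2" \
    --function-version "$3"
}

# Usage (hypothetical alias name):
#   publish_and_promote ad-content-indexer stable
#   rollback_alias ad-content-indexer stable 1
```

Callers that invoke the alias ARN rather than `$LATEST` then pick up the switch without any change on their side.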
CD & CI
Integration of Spinnaker + Lambda is coming soon...
In the meantime, Travis is your man:
https://docs.travis-ci.com/user/deployment/
Cloudwatch Logging
Logging Alternatives
● Sumologic
● ELK Setup
Elasticsearch + Kibana
Monitoring Cloudwatch
Monitoring Datadog
Practice Time!
Practice time!
Requirements:
- Scala + sbt
- An AWS Account
- Download the lambda-example repo and build it:
    git clone git@github.schibsted.io:sandra-garcia/lambda-example.git
    cd lambda-example
    sbt assembly
Practice time!
You can find some sample Pulse events in /lambda-tutorial/data/
1. Upload your code to S3
  a. Create a new folder under spt-data-tests/lambda-tutorial/<YOUR-NAME>
  b. Upload your lambda code to S3 (note the trailing slash, so the jar is stored inside the folder)
       aws s3 cp lambda-function.jar s3://spt-data-tests/lambda-tutorial/<YOUR-NAME>/
  c. Format of S3 URL:
       https://s3-eu-west-1.amazonaws.com/spt-data-tests/lambda-tutorial/<YOUR-NAME>/lambda-function.jar
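The S3 URL follows the pattern https://s3-<REGION>.amazonaws.com/<S3-BUCKET>/<S3-KEY>. A small helper, with a made-up name, can build it so the URL matches what was uploaded; `sandra` stands in for <YOUR-NAME>.

```shell
# Build the HTTPS URL of an S3 object from region, bucket and key.
s3_https_url() {
  printf 'https://s3-%s.amazonaws.com/%s/%s\n' "$1" "$2" "$3"
}

# Example with "sandra" as the folder name.
s3_https_url eu-west-1 spt-data-tests lambda-tutorial/sandra/lambda-function.jar
```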
2. Define your lambda trigger
Create your lambda function in https://eu-west-1.console.aws.amazon.com/lambda/
  a. Select “blank function”
  b. Select trigger (s3)
    i. Select event type: PUT
    ii. Select bucket: spt-data-tests/ and prefix: lambda-tutorial/<YOUR-NAME>/
http://docs.aws.amazon.com/lambda/latest/dg/intro-permission-model.html#lambda-intro-execution-role
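The same trigger can be expressed outside the console. This is a rough sketch under stated assumptions, not the workshop's procedure: the helper names and the statement id `s3-invoke` are hypothetical, and the JSON is the bucket notification document that `aws s3api put-bucket-notification-configuration` expects.

```shell
# Print an S3 notification configuration that invokes a Lambda on PUT
# for objects under the given key prefix.
s3_trigger_config() {
  function_arn="$1"
  key_prefix="$2"
  printf '{"LambdaFunctionConfigurations":[{"LambdaFunctionArn":"%s","Events":["s3:ObjectCreated:Put"],"Filter":{"Key":{"FilterRules":[{"Name":"prefix","Value":"%s"}]}}}]}\n' \
    "$function_arn" "$key_prefix"
}

# Allow S3 (for one bucket) to invoke the function; "s3-invoke" is a made-up id.
allow_s3_invoke() {
  aws lambda add-permission \
    --function-name "$1" \
    --statement-id s3-invoke \
    --action lambda:InvokeFunction \
    --principal s3.amazonaws.com \
    --source-arn "arn:aws:s3:::$2"
}

# Usage (placeholder ARN and name):
#   allow_s3_invoke lambda-test-sandra spt-data-tests
#   aws s3api put-bucket-notification-configuration --bucket spt-data-tests \
#     --notification-configuration "$(s3_trigger_config "$FUNCTION_ARN" lambda-tutorial/sandra/)"
```

Note that this call replaces the bucket's whole notification configuration, so on a shared bucket the console flow is the safer choice.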
3. Configure your function
  a. Give it a name: lambda-test-<YOUR-NAME>
  b. Handler: lambda.Main::processEvents
  c. Runtime: Java 8
  d. S3 URL (Format: https://s3-<REGION>.amazonaws.com/<S3-BUCKET>/<S3-KEY>)
       https://s3-eu-west-1.amazonaws.com/spt-data-tests/lambda-tutorial/<YOUR-NAME>/lambda-function.jar
  e. Give lambda permission to read from S3 (*).
    i. Choose existing role: service-role/lambda_s3_access
* In this example we have already set up a role in the data-dev AWS account. If you’re running this example in another account you will need to create a new role:
http://docs.aws.amazon.com/lambda/latest/dg/intro-permission-model.html#lambda-intro-execution-role
4. Configure your lambda function permissions
Identity and Access Management (IAM)
https://console.aws.amazon.com/iam/home?region=eu-west-1#/roles/
5. Test your function
Test your lambda with a test event already uploaded here:
s3://spt-data-tests/lambda-tutorial/test-event.json
NOTE: make sure to change the bucket region to the region where the lambda is running! (eu-west-1)
Inspect your Cloudwatch logs.
https://confluence.schibsted.io/pages/viewpage.action?spaceKey=SPTPER&title=Setting+up+lambda+functions
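The logs can also be inspected from the command line. Lambda writes to a CloudWatch log group named /aws/lambda/<function-name>; the helper names and the ERROR filter pattern below are illustrative assumptions about what the function logs.

```shell
# Name of the CloudWatch log group that Lambda creates for a function.
lambda_log_group() {
  printf '/aws/lambda/%s\n' "$1"
}

# List log events containing "ERROR" for a function (pattern is an assumption).
find_lambda_errors() {
  aws logs filter-log-events \
    --log-group-name "$(lambda_log_group "$1")" \
    --filter-pattern "ERROR"
}

# Usage:
#   find_lambda_errors lambda-test-sandra
```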
6. Upload a file to trigger your lambda
  a. Upload an events file to S3 (note the trailing slash, so the file lands under your prefix)
       aws s3 cp lambda-example/data/ad-events.json s3://spt-data-tests/lambda-tutorial/<YOUR-NAME>/
  b. Inspect your Cloudwatch logs
  c. Check the output file:
       aws s3 ls s3://spt-data-tests/results/lambda-tutorial/<YOUR-NAME>/ad-events.json.csv
Try out aws cli
Try out operations with aws cli (aws lambda help)
https://confluence.schibsted.io/pages/viewpage.action?spaceKey=SPTPER&title=Setting+up+lambda+functions
    > aws lambda help
    > aws lambda get-function --function-name lambda-test-sandra
    > aws lambda update-function-code --function-name lambda-test-sandra --s3-bucket spt-data-tests --s3-key lambda-tutorial/sandra/lambda-function.jar
Questions or Comments?
jose.san.pedro@schibsted.com
sandra@schibsted.com
@sandra.garcia
@jose.san.pedro
