
I'm trying to use SageMaker to serve precomputed predictions. The predictions are held in a Python dictionary in the following format:

customer_group  prediction
1               50
2               60
3               25
4               30
...

Currently, the Docker serve API code goes to S3 and downloads the data daily.

The problem is that downloading the data blocks the API from responding to SageMaker's health-check (/ping) calls.

There is a case study of how Zappos did this using Amazon DynamoDB. However, is there a way to do it in SageMaker?

Where and how can I add the S3 download function so it doesn't interrupt the health check?

Could either of these work? -> https://github.com/seomoz/s3po https://blog.miguelgrinberg.com/post/the-flask-mega-tutorial-part-x-email-support

```python
import io

import flask
import pandas as pd

app = flask.Flask(__name__)

@app.route('/ping', methods=['GET'])
def ping():
    """Determine if the container is working and healthy.

    In this sample container, we declare it healthy if we can load the
    model successfully.
    """
    health = ScoringService.get_model() is not None  # You can insert a health check here
    status = 200 if health else 404
    return flask.Response(response='\n', status=status, mimetype='application/json')

@app.route('/invocations', methods=['POST'])
def transformation():
    """Do an inference on a single batch of data.

    In this sample server, we take data as CSV, convert it to a pandas
    data frame for internal use, and then convert the predictions back to
    CSV (which really just means one prediction per line, since there's a
    single column).
    """
    # Convert from CSV to pandas
    if flask.request.content_type == 'text/csv':
        data = flask.request.data.decode('utf-8')
        data = pd.read_csv(io.StringIO(data), header=None)
    else:
        return flask.Response(response='This predictor only supports CSV data',
                              status=415, mimetype='text/plain')

    print('Invoked with {} records'.format(data.shape[0]))

    # Do the prediction
    predictions = ScoringService.predict(data)

    # Convert from numpy back to CSV
    out = io.StringIO()
    pd.DataFrame({'results': predictions}).to_csv(out, header=False, index=False)
    return flask.Response(response=out.getvalue(), status=200, mimetype='text/csv')
```
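One way to keep /ping responsive is to run the S3 download in a background daemon thread and atomically swap the in-memory dictionary when the download completes. Here is a minimal sketch of that pattern; the refresh interval is an assumption, and `load_predictions` is stubbed (in the real container it would be a boto3 `get_object` call) so the threading logic is clear:

```python
import threading
import time

# Shared state: the serving dict is replaced wholesale by the refresher.
_predictions = {}
_lock = threading.Lock()

def load_predictions():
    """Download and parse the precomputed predictions.

    Stubbed here; in the real service this would be a boto3 call, e.g.
    s3.get_object(Bucket=..., Key=...) followed by CSV parsing.
    """
    return {1: 50, 2: 60, 3: 25, 4: 30}

def refresh_loop(interval_seconds):
    """Re-download the predictions forever, outside the request path."""
    global _predictions
    while True:
        fresh = load_predictions()
        with _lock:
            _predictions = fresh  # atomic swap; readers never see a partial dict
        time.sleep(interval_seconds)

def get_prediction(customer_group):
    """Thread-safe lookup used by /invocations."""
    with _lock:
        return _predictions.get(customer_group)

# Daemon thread: never blocks /ping, dies with the process.
refresher = threading.Thread(target=refresh_loop, args=(24 * 3600,), daemon=True)
refresher.start()
```

With this, /ping can report healthy as soon as `_predictions` is non-empty, and /invocations reads through `get_prediction` without ever touching S3 on the request path.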

1 Answer
Why not call SageMaker batch transform instead and let AWS do the heavy lifting?

You can either schedule it to run every day or trigger it manually.

After this, use either API Gateway with a Lambda function or CloudFront to serve the results from S3.
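To sketch the API Gateway + Lambda route: the Lambda loads the batch-transform output from S3 once per cold start and answers lookups from memory. Everything below (the event shape assumes an API Gateway proxy integration, and the S3 read is stubbed as a literal dict) is illustrative, not a definitive implementation:

```python
import json

# Loaded once per Lambda cold start. In the real function this would be a
# boto3 get_object on the batch-transform output file in S3, parsed into a dict.
PREDICTIONS = {1: 50, 2: 60, 3: 25, 4: 30}

def handler(event, context):
    """API Gateway proxy handler: ?customer_group=2 -> {"prediction": 60}."""
    params = event.get('queryStringParameters') or {}
    try:
        group = int(params.get('customer_group', ''))
    except ValueError:
        return {'statusCode': 400,
                'body': json.dumps({'error': 'customer_group must be an integer'})}
    if group not in PREDICTIONS:
        return {'statusCode': 404,
                'body': json.dumps({'error': 'unknown customer_group'})}
    return {'statusCode': 200,
            'body': json.dumps({'prediction': PREDICTIONS[group]})}
```

Because the dictionary lives in the Lambda execution environment, warm invocations never touch S3, and there is no long-running download to block a health check.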


5 Comments

A persistent endpoint is needed. The training and writing to S3 are done. The problem is how to serve the data.
Hi, I have amended the answer to include serving via a persistent endpoint.
Thanks mokugo! Do you have more resources on how to serve an S3 table through API Gateway + Lambda or CloudFront?
Lambda doesn't solve the issue because it just moves things from one bucket to another. Also, a CDN is overkill. Thank you Chris :) The goal is to have a microservice that serves an in-memory dictionary using SageMaker.
