In this article we will focus on invoking a SageMaker model endpoint for real-time predictions, both from a SageMaker notebook instance and from a client outside of AWS (e.g. a mobile app).

The model used in this article is the same as the one built in a previous article, which solves the Kaggle bike sharing competition. Please read that article before continuing:

https://gdcoder.com/complete-guide-to-build-a-machine-learning-model-and-deployed-in-production-using-aws-sagemaker/

Table of Contents:

  • Invoking SageMaker Model EndPoints For Real Time Predictions
  • Invoking SageMaker Model EndPoints From Client Outside of AWS
  • Remove SageMaker endpoints and Shutdown Notebook Instance
  • Creating EndPoint from Existing Model Artifacts
  • Conclusion

Invoking SageMaker Model EndPoints For Real Time Predictions

First Step

Import the 'standard' Python libraries along with boto3 for interacting with AWS. The SageMaker library provides an easy interface for running predictions on SageMaker endpoints.

Second Step

Now all we need to know is the SageMaker endpoint name, which can easily be found by clicking on 'Endpoints' in the SageMaker console.

The SageMaker notebook instance needs permission to communicate with the endpoint. We granted those permissions when launching the notebook instance by attaching them to its IAM role.

All we need now is to create an instance of the real-time predictor class and to specify the format of the data we send for prediction, in this case CSV. We also use a csv_serializer, which automatically converts an array of numeric values to CSV when running predictions.

Third Step

Next, we load the data as a pandas DataFrame and convert it to arrays for the csv_serializer using the .as_matrix() function (deprecated in newer pandas; use .values instead). The datetime column is dropped because the model accepts only numerical values. Each resulting array contains one row's worth of data.
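As a sketch, with a tiny stand-in DataFrame (the real Kaggle test file has more columns and rows) and the modern .values accessor:

```python
import pandas as pd

# A tiny stand-in for the Kaggle bike sharing test set.
df = pd.DataFrame({
    'datetime': pd.to_datetime(['2012-12-01 00:00', '2012-12-01 01:00']),
    'season': [4, 4],
    'temp': [10.66, 10.66],
    'humidity': [56, 56],
})

# Drop the datetime column -- the model accepts only numeric features --
# and convert to a NumPy array (.values replaces the deprecated .as_matrix()).
X = df.drop(columns=['datetime']).values
```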

Fourth Step

Let's now try running predictions. A good practice is to split the data into smaller batches and run predictions batch by batch. This has a significant advantage: if there is a communication failure, we can simply retry the failed batches.

An easy way to split the data is the np.array_split() function, specifying the number of batches we want. The predictions are returned as a byte array, so we convert them to a parsable string using the .decode('utf-8') method, split on the commas separating the predictions, and append the values to a list.
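The batching-and-parsing logic can be sketched as below; `predictor` is assumed to be the real-time predictor created earlier, returning comma-separated predictions as bytes:

```python
import numpy as np

def run_batch_predictions(predictor, data, n_batches=10):
    """Split `data` into batches, invoke the endpoint once per batch,
    and collect all parsed float predictions. A failed batch can be
    retried on its own without re-sending everything."""
    predictions = []
    for batch in np.array_split(data, n_batches):
        result = predictor.predict(batch)   # bytes, e.g. b'3.2,4.7'
        decoded = result.decode('utf-8')    # byte array -> string
        predictions += [float(p) for p in decoded.split(',')]
    return predictions
```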

Fifth Step

Remember that the model was trained on log(count), so we need to convert the predictions back to actual values as follows:
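For example (with illustrative log-scale values standing in for the endpoint's output):

```python
import numpy as np

# Illustrative log-scale predictions returned by the endpoint.
log_predictions = np.array([2.3, 4.1, 5.0])

# The model was trained on log(count), so exponentiate
# to recover the actual rental counts.
predicted_counts = np.exp(log_predictions)
```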

Invoking SageMaker Model EndPoints From Client Outside of AWS

Previously we ran predictions from code running on a SageMaker notebook instance. However, often the client will be running outside of SageMaker, e.g. a mobile app, and will need to run predictions. Now I will show how to launch a Jupyter notebook locally on a laptop and communicate with the SageMaker endpoint. This requires Anaconda, boto3, and the SageMaker library to be installed on the laptop.

To communicate with SageMaker, it needs to verify your identity. You can do that by creating a user account and configuring it on your local computer using the AWS Command Line Interface. You will also need to specify the region where the endpoint is hosted. The account must have the necessary permissions to make prediction calls against the endpoint.

Finally, establish a SageMaker session using a boto3 session:

Once this step is completed, the next steps are exactly the same as when we ran predictions from code running on a SageMaker notebook instance.

So that's how simple it is to interact with SageMaker. When calling from outside, the only extra work is to authenticate your AWS calls and ensure the account you are using has the proper permissions to invoke SageMaker endpoints.

Remove SageMaker endpoints and Shutdown Notebook Instance

Remove EndPoints

When you deploy a model endpoint, it is available for prediction queries 24x7, so you will incur charges for as long as you leave it running. For development and testing, clean up the endpoint when you are done by following the steps below:

  • Log on to the AWS Management Console
  • Open the SageMaker console
  • From the navigation pane, select 'Endpoints'
  • Select the endpoint you would like to delete and remove it from the 'Actions' menu

Stop Notebook instance

  • Log on to the AWS Management Console
  • Open the SageMaker console
  • From the navigation pane, select 'Notebook instances'
  • Select the notebook instance you would like to stop
  • Under 'Actions', select 'Stop' to shut down the instance

Creating EndPoint From Existing Model Artifacts


If you already have a model available in S3, you can deploy it using the SageMaker SDK. From your SageMaker notebook instance:

  • Open the notebook file below
  • Update the algorithm and S3 location to point to your model artifacts
  • Run the code to deploy the endpoint
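A hedged sketch of that flow with SageMaker SDK v2, assuming an XGBoost model and placeholder S3 path, version, and endpoint name (substitute your own):

```python
import sagemaker
from sagemaker import get_execution_role

# Placeholders -- point model_data at your artifacts in S3 and pick
# the built-in container matching your algorithm, region, and version.
model_artifacts = 's3://your-bucket/model/model.tar.gz'
container = sagemaker.image_uris.retrieve('xgboost', 'us-east-1', '1.2-1')

model = sagemaker.model.Model(
    image_uri=container,
    model_data=model_artifacts,
    role=get_execution_role(),  # resolves the role inside a SageMaker notebook
)

# deploy() creates the model, the endpoint configuration, and the endpoint.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.large',
    endpoint_name='xgboost-bikerental-v1',
)
```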

It is straightforward!

Conclusion

Let's summarise the points for Invoking SageMaker Model EndPoints For Real Time Predictions:

  • Use a RealTimePredictor
  • Specify the endPoint
  • Prepare the data in a format that can be understood by the predictor endpoint
  • Invoke the EndPoint
  • Extract the result

We then discussed how to invoke a SageMaker model endpoint from a client outside of AWS, which requires a user account with the necessary permissions to make prediction calls against the endpoint. Finally, we discussed how to create an endpoint from existing model artifacts and how to delete SageMaker endpoints and shut down the notebook instance.

The complete Jupyter notebooks can be found on my GitHub page below:

https://github.com/geodra/AWS-SageMaker/tree/master

Thanks for reading and I am looking forward to hearing your questions :)
Stay tuned and Happy Coding.