How to experiment prediction on housing price using machine learning? Part 5 – Deploy & maintain your prediction service

This post is dedicated to one of my readers, who asked the following question: after experimentation, how do I put my model into production? Feel free to tell me which topics you are interested in, so that we can dig into some machine learning topics in more detail on this blog.

Recap on the housing price prediction

Through the experimentation, we ended up with the best model we could find given a dataset at time T.

Two questions remain to turn the experimentation into a running business:

  • How do I publish my service into production so that end users can consume the prediction service?
  • How do I maintain the performance over time and keep the model relevant?

NASA lab to production ground floor

How to publish your machine learning model into production?

Basically, your notebook is a prototype accessible only to you. You can’t ask a real estate agent to run a notebook. Instead, you can provide a web page that calls your prediction service online. To do so, you need to expose the notebook’s prediction functions as web services (RESTful API), such as:

  • GET input Data
  • REQUEST prediction
  • UPDATE training model
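
As an illustration, here is a minimal sketch of the "REQUEST prediction" service only, using nothing but the Python standard library. The `/predict` route, the `surface_sqm` field and the toy linear `MODEL` are assumptions made for this example; in practice you would plug in the model trained during the experimentation.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for the trained model: price = intercept + slope * surface.
MODEL = {"intercept": 50_000.0, "price_per_sqm": 3_000.0}


def predict(features):
    """Return a price estimate for one house described as a dict."""
    return MODEL["intercept"] + MODEL["price_per_sqm"] * float(features["surface_sqm"])


class PredictionHandler(BaseHTTPRequestHandler):
    def _send_json(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        # Only the (assumed) /predict route is implemented in this sketch.
        if self.path != "/predict":
            self._send_json(404, {"error": "unknown route"})
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            features = json.loads(self.rfile.read(length))
            price = predict(features)
        except (ValueError, KeyError, TypeError) as exc:
            # Malformed JSON or missing/invalid fields end up here.
            self._send_json(400, {"error": str(exc)})
            return
        self._send_json(200, {"predicted_price": price})

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port=0):
    """Build the server; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), PredictionHandler)


if __name__ == "__main__":
    make_server(8000).serve_forever()
```

A web page could then POST `{"surface_sqm": 100}` to `/predict` and display the `predicted_price` field of the JSON answer. The other two services (input data and model update) would follow the same pattern with their own routes.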

Two options are available to implement the machine learning service:

  • DIY (do it yourself):
    • Develop a Python API in the notebook to transform JSON (the standard document format on the web) into a dataframe. You can use the Jupyter Kernel Gateway library to facilitate the API annotation: https://github.com/jupyter/kernel_gateway
    • Run your notebook’s Python code on a production server. A good way to deploy is to dockerize the machine learning service with all its dependencies (machine learning libraries, dataframe library, API libraries, …).
    • Rely on operations teams to design and provision the machine infrastructure.
  • Delegate to a serverless cloud provider to deploy and run the service.
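
For the DIY option, the JSON-to-dataframe step could look like the sketch below. The feature names in `FEATURES` are hypothetical placeholders; the real ones come from the dataset used during the experimentation. The function only relies on the standard library and returns a list of dicts, one per house.

```python
import json

# Hypothetical feature names; the real ones come from the training dataset.
FEATURES = ("surface_sqm", "rooms", "year_built")


def json_to_rows(payload):
    """Parse a JSON payload (one object or a list of objects) into clean rows."""
    data = json.loads(payload)
    records = data if isinstance(data, list) else [data]
    rows = []
    for i, record in enumerate(records):
        # Reject incomplete records early instead of failing inside the model.
        missing = [name for name in FEATURES if name not in record]
        if missing:
            raise ValueError(f"record {i} is missing fields: {missing}")
        # Keep only the expected columns and force numeric values.
        rows.append({name: float(record[name]) for name in FEATURES})
    return rows
```

In the notebook, the returned list can be handed to `pandas.DataFrame(rows)` to get the dataframe expected by the model.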

If you don’t want to manage infrastructure, I strongly recommend going with a cloud provider at an early stage. The advantages of the cloud are:

  • you pay per use (per API call), and
  • the production servers scale automatically.

How to maintain & update your model?

For your machine learning model to remain relevant, you need to monitor the online performance of the model in production. By the way, the cloud providers mentioned above can automate that operations part for you.
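
To make the monitoring idea concrete, here is a minimal sketch that tracks the error of recent predictions once the real prices are known. The class name, the window size and the 15% alert threshold are assumptions chosen for illustration, not recommended values.

```python
from collections import deque


class RollingErrorMonitor:
    """Track the mean absolute percentage error over the last `window` predictions."""

    def __init__(self, window=100, alert_threshold=0.15):
        # deque(maxlen=...) silently drops the oldest error when full.
        self.errors = deque(maxlen=window)
        self.alert_threshold = alert_threshold

    def record(self, predicted_price, actual_price):
        """Store the relative error of one prediction (actual_price must be > 0)."""
        self.errors.append(abs(predicted_price - actual_price) / actual_price)

    def mape(self):
        """Return the rolling error, or None if nothing was recorded yet."""
        return sum(self.errors) / len(self.errors) if self.errors else None

    def needs_retraining(self):
        """Flag the model when the rolling error exceeds the threshold."""
        current = self.mape()
        return current is not None and current > self.alert_threshold
```

Each time a real sale price comes in, the service would call `record(...)` and check `needs_retraining()` to decide whether a model update is due.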

There are two ways to update your model:

  • Update online: the platform’s prediction service captures feedback on its predictions online. For example, the user posts the real price of the house and checks what the market price would be. In that case, we can update the model iteratively with the input data.
    • At each input, we train a new version of the model and benchmark its performance against the previous one, either through online A/B testing or against historical data. We promote the best one while keeping the legacy model in an archive so that we can revert to it.
  • Update offline: if you choose not to update online, you can take a larger dataset every 6 months to update the machine learning prediction model.
  • This fine-tuning and testing is performed either:
    • manually by a data scientist, or
    • automatically, using a machine learning application called AutoML. Basically, the application tests a set of strategies to achieve the best model.
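
The "promote the best, archive the legacy" step can be sketched as follows. This is only an illustration under simple assumptions: a model is any Python callable that maps features to a price, the comparison uses the mean absolute error on a held-out historical dataset, and `ModelRegistry` is a hypothetical helper, not an existing library.

```python
def mean_absolute_error(model, dataset):
    """Average absolute gap between predicted and real prices."""
    return sum(abs(model(x) - y) for x, y in dataset) / len(dataset)


class ModelRegistry:
    """Keep the production model plus an archive of previous ones for rollback."""

    def __init__(self, initial_model):
        self.production = initial_model
        self.archive = []

    def consider(self, candidate, holdout):
        """Promote `candidate` only if it beats production on `holdout`."""
        candidate_error = mean_absolute_error(candidate, holdout)
        production_error = mean_absolute_error(self.production, holdout)
        if candidate_error < production_error:
            self.archive.append(self.production)  # keep the legacy model
            self.production = candidate
            return True
        return False

    def rollback(self):
        """Restore the most recently archived model (the current one is dropped)."""
        if not self.archive:
            raise RuntimeError("no archived model to roll back to")
        self.production = self.archive.pop()
```

The same `consider` call works whether the candidate comes from an online iterative update, a six-monthly offline retraining, or an AutoML run; only the way the candidate is produced changes.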

I personally recommend designing a platform that predicts and learns from user interactions in order to collect the dataset and automatically update the production model. That’s the way Google, Amazon and Facebook improve their respective machine learning applications: search engine, recommendation engine, photo recognition.

 

 
