5.6. Model Service Deployment¶

5.6.1. Overview¶

Paddle Serving aims to help deep-learning researchers to easily deploy online inference services, supporting one-click deployment of industry, high concurrency and efficient communication between client and server and supporting multiple programming languages to develop clients.

Taking HTTP inference service deployment as an example to introduce how to use PaddleServing to deploy model services in PaddleClas.

5.6.2. Serving Install¶

It is recommends to use docker to install and deploy the Serving environment in the Serving official website, first, you need to pull the docker environment and create Serving-based docker.

nvidia-docker pull hub.baidubce.com/paddlepaddle/serving:0.2.0-gpu
nvidia-docker run -p 9292:9292 --name test -dit hub.baidubce.com/paddlepaddle/serving:0.2.0-gpu
nvidia-docker exec -it test bash

In docker, you need to install some packages about Serving

pip install paddlepaddle-gpu
pip install paddle-serving-client
pip install paddle-serving-server-gpu

If the installation speed is too slow, you can add -i https://pypi.tuna.tsinghua.edu.cn/simple following pip to speed up the process.
If you want to deploy CPU service, you can install the cpu version of Serving, the command is as follow.

pip install paddle-serving-server

5.6.2.1. Export Model¶

Exporting the Serving model using tools/export_serving_model.py, taking ResNet50_vd as an example, the command is as follow.

python tools/export_serving_model.py -m ResNet50_vd -p ./pretrained/ResNet50_vd_pretrained/ -o serving

finally, the client configures, model parameters and structure file will be saved in ppcls_client_conf and ppcls_model.

5.6.2.2. Service Deployment and Request¶

Using the following commands to start the Serving.

python tools/serving/image_service_gpu.py serving/ppcls_model workdir 9292

serving/ppcls_model is the address of the Serving model just saved, workdir is the work directory, and 9292 is the port of the service.

Using the following script to send an identification request to the Serving and return the result.

python tools/serving/image_http_client.py  9292 ./docs/images/logo.png

9292 is the port for sending the request, which is consistent with the Serving starting port, and ./docs/images/logo.png is the test image, the final top1 label and probability are returned.

For more Serving deployment, such RPC inference service, you can refer to the Serving official website: https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imagenet