Many friends I have met asked me this question. The inference API is not difficult. It mainly involves four steps: loading the model onto the hardware, inspecting the inputs and outputs, preparing the input, and running inference. The picture below gives a rough overview:

1. Initialize the inference engine with Core()
from openvino.runtime import Core
ie = Core()

The inference engine can load a network onto a device. A device here means a CPU, an Intel GPU, a Neural Compute Stick 2, and so on. The available_devices property lists the devices available on the system. The "FULL_DEVICE_NAME" option of get_property() returns the name of a device.

On my machine I use the CPU device. To use an integrated GPU, just change it to device_name="GPU". Note that loading a network on the GPU is slower than loading it on the CPU, but inference may be faster.
devices = ie.available_devices
for device in devices:
    device_name = ie.get_property(device_name=device, name="FULL_DEVICE_NAME")
    print(f"{device}: {device_name}")
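Choosing between the listed devices can be sketched as a small helper. This is only a sketch, not part of the OpenVINO API: pick_device, pick_installed_device, and the preference order are hypothetical names introduced for illustration. Only Core().available_devices comes from the text above.

```python
def pick_device(available, preferred=("GPU", "CPU")):
    """Return the first preferred device that is present in `available`.

    `available` is a list of device names, such as the one returned by
    Core().available_devices. Indexed names like "GPU.0" count as "GPU".
    """
    for want in preferred:
        for name in available:
            # Match "GPU" as well as indexed variants such as "GPU.0".
            if name == want or name.startswith(want + "."):
                return name
    raise RuntimeError(f"none of {preferred} found in {available}")


def pick_installed_device():
    # Imported lazily so pick_device() can be used without OpenVINO installed.
    from openvino.runtime import Core

    return pick_device(Core().available_devices)
```

The returned name can then be passed as device_name when compiling the model. Preferring the GPU here is an assumption. Since loading on the GPU is slower, the CPU may be the better default for short runs.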

2. Load the model
After initializing the inference engine, first read the model file with read_model(), then compile it for the specified device with compile_model().

IR Model
An IR (Intermediate Representation) model consists of an .xml file and a .bin file. The .xml file contains information about the network topology, and the .bin file contains the weights and biases as binary data. By default, read_model() assumes that the weights (.bin) file is in the same directory as the .xml file and has the same file name with the .bin extension. In that case it does not need to be specified. Otherwise, you need to specify it.
from openvino.runtime import Core
ie = Core()
classification_model_xml = "model/classification.xml"
model = ie.read_model(model=classification_model_xml)
compiled_model = ie.compile_model(model=model, device_name="CPU")
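The default lookup of the weights file, and the case where the weights live somewhere else, can be sketched as follows. The helper names default_weights_path and read_ir are hypothetical. The sketch assumes that read_model() accepts the weights location through its weights argument.

```python
from pathlib import Path


def default_weights_path(xml_path):
    """Return the .bin path that sits next to the given .xml file."""
    return Path(xml_path).with_suffix(".bin")


def read_ir(xml_path, weights_path=None):
    # Imported lazily so default_weights_path() works without OpenVINO.
    from openvino.runtime import Core

    core = Core()
    if weights_path is None:
        # Same directory, same base name: read_model() finds the .bin itself.
        return core.read_model(model=str(xml_path))
    # Weights stored elsewhere: pass the path explicitly.
    return core.read_model(model=str(xml_path), weights=str(weights_path))


print(default_weights_path("model/classification.xml").as_posix())
```

Calling read_ir("model/classification.xml") behaves like the code above, while a second argument covers the case where the .bin file has a different name or directory.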
ONNX Model
An ONNX model is a single file. Reading and loading an ONNX model works the same way as reading and loading an IR model. The model parameter points to the ONNX file name.
from openvino.runtime import Core
ie = Core()
onnx_model_path = "model/segmentation.onnx"
model_onnx = ie.read_model(model=onnx_model_path)
compiled_model_onnx = ie.compile_model(model=model_onnx, device_name="CPU")
To export the ONNX model to IR with the default settings, you can also use serialize().
from openvino.offline_transformations import serialize
serialize(model=model_onnx, model_path="model/exported_onnx_model.xml",
2. Inspect the inputs, outputs, and image
The model instance stores information about the model. Information about the model's inputs and outputs is in model.inputs and model.outputs. Because the model takes a static input shape, and the input image has to be made to match the input layer, the input and output layers need to be inspected first.
from openvino.runtime import Core
ie = Core()
classification_model_xml = "model/classification.xml"
model = ie.read_model(model=classification_model_xml)
model.inputs[0].any_name
The cell above shows that the loaded model expects one input, named input. If you load a different model, you may see a different input layer name, and you may see more inputs.
It is often useful to keep a reference to the first input layer. For models with only one input, next(iter(model.inputs)) returns that input.
input_layer = next(iter(model.inputs))
input_layer
The information about this input layer is stored in input_layer. The next cell prints the input precision and shape.
print(f"input precision: {input_layer.element_type}")
print(f"input shape: {input_layer.shape}")
Note that I changed the name in the picture. The process is the same.

The output of this cell tells us that the model expects an input of shape [1,3,224,224], in NCHW layout. That is, the model expects input data with a batch size (N) of 1, 3 channels (C), and an image height (H) and width (W) of 224. The input data is expected in FP32 (floating-point) precision.
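Reading an NCHW shape into named dimensions can be sketched like this. describe_nchw is a hypothetical helper, and a plain list stands in for input_layer.shape so that the example runs without a model. With a real model, the sketch assumes the shape is static.

```python
def describe_nchw(shape):
    """Map a 4-element NCHW shape to named dimensions."""
    if len(shape) != 4:
        raise ValueError(f"expected 4 dimensions, got {len(shape)}")
    n, c, h, w = (int(d) for d in shape)
    return {"N": n, "C": c, "H": h, "W": w}


# The shape reported for the classification model in the text.
info = describe_nchw([1, 3, 224, 224])
print(info)
```

The H and W values are the ones needed later when resizing the image.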

Model output
from openvino.runtime import Core
ie = Core()
classification_model_xml = "model/classification.xml"
model = ie.read_model(model=classification_model_xml)
model.outputs[0].any_name
Model output information is stored in model.outputs. The cell above shows that the model returns one output, named MobilenetV3/Predictions/Softmax. If you load a different model, you may see a different output layer name, and you may see more outputs.

Because this model has only one output, we get it the same way as the input layer.
output_layer = next(iter(model.outputs))
output_layer
Obtaining output precision and shape is similar to obtaining input precision and shape .
print(f"output precision: {output_layer.element_type}")
print(f"output shape: {output_layer.shape}")

The output of this cell shows that the model returns an output of shape [1,1001], where 1 is the batch size (N) and 1001 is the number of classes (C). The output is returned as 32-bit floating-point numbers.

3. Reshape the input
Here, let's put the previous steps together and run them once to get the input and output layers of the model.
from openvino.runtime import Core
ie = Core()
classification_model_xml = "model/classification.xml"
model = ie.read_model(model=classification_model_xml)
compiled_model = ie.compile_model(model=model, device_name="CPU")
input_layer = next(iter(compiled_model.inputs))
output_layer = next(iter(compiled_model.outputs))
Preparation: load the image and convert it to the input shape
To propagate an image through the network, it needs to be loaded into an array, resized to the size the network expects, and converted to the network's input layout.
import cv2
image_filename = "data/coco_hollywood.jpg"
image = cv2.imread(image_filename)
image.shape
The shape of the image is (663,994,3). It is 663 pixels high and 994 pixels wide, and has 3 color channels. We take the height and width that the network expects and resize the image to that size.
# N,C,H,W = batch size, number of channels, height, width
N, C, H, W = input_layer.shape
# OpenCV resize expects the destination size as (width, height)
resized_image = cv2.resize(src=image, dsize=(W, H))
resized_image.shape
The image is now in H,W,C format. We first call np.transpose() to change it to C,H,W, then add the N dimension (with N=1) by calling np.expand_dims(), which gives the N,C,H,W format. Finally, .astype() converts the data to FP32.
import numpy as np
input_data = np.expand_dims(np.transpose(resized_image, (2, 0, 1)), 0).astype(np.float32)
input_data.shape
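How the two calls change the shape can be traced without any image data. The sketch below only tracks shapes with plain tuples. transpose_shape and expand_dims_shape are hypothetical helpers that mirror what np.transpose() and np.expand_dims() do to an array's shape, not NumPy functions.

```python
def transpose_shape(shape, axes):
    """Shape after np.transpose(array, axes): output axis i is input axis axes[i]."""
    return tuple(shape[a] for a in axes)


def expand_dims_shape(shape, axis):
    """Shape after np.expand_dims(array, axis), for a non-negative axis."""
    return tuple(shape[:axis]) + (1,) + tuple(shape[axis:])


hwc = (224, 224, 3)                     # resized image: height, width, channels
chw = transpose_shape(hwc, (2, 0, 1))   # channels first
nchw = expand_dims_shape(chw, 0)        # add the batch dimension in front
print(hwc, "->", chw, "->", nchw)
```

The final tuple matches the [1,3,224,224] shape that the input layer reported.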

Run inference

Now that the input data has the right shape, run inference.
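# Call the compiled model directly
result = compiled_model([input_data])[output_layer]
# Or use an explicit inference request
request = compiled_model.create_infer_request()
request.infer(inputs={input_layer.any_name: input_data})
result = request.get_output_tensor(output_layer.index).data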
This network returns one output, so we stored the reference to the output layer in output_layer. Using its index property, we can get the data with request.get_output_tensor(output_layer.index). To get a numpy array from the output, append .data.
The output shape is (1,1001), which is the expected output shape. This output shape indicates that the network returns probabilities for 1001 classes.
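Picking the most likely classes out of such an output row can be sketched as follows. top_k is a hypothetical helper, and the short list is a toy stand-in for result[0], which would hold 1001 values for this model. Turning a class index into a label needs the model's label list, which this article does not cover.

```python
def top_k(scores, k=5):
    """Return (class_index, score) pairs for the k highest scores."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy stand-in for one row of the inference result.
toy_scores = [0.05, 0.70, 0.10, 0.15]
best = top_k(toy_scores, k=2)
print(best)
```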

That is the whole inference code from start to finish.
