Practical Guide | OpenVINO Inference Engine SDK Functions Explained

Date: 2021-01-20  Source: OpenVINO Chinese Community

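Introduction

Original article: http://www.j9360.com/article/202101/422243.htm

OpenVINO is a toolkit that Intel developed for its own hardware platforms to build high-performance computer vision and deep learning vision applications. It supports Intel CPUs, GPUs, FPGAs, VPUs, and other Intel hardware. OpenVINO consists of two main modules: the Model Optimizer, which converts models, and the Inference Engine, which runs inference. This article explains the commonly used C++ API functions of the Inference Engine and how to use them; the Inference Engine API also provides C and Python interfaces. The version of OpenVINO installed by the author is 2020.3.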

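Workflow

The Inference Engine workflow generally consists of the following steps:

Create the inference object: create a Core object, which supports different devices and manages all device plugins automatically. Use Core::SetConfig to configure device properties and Core::AddExtension to register extension libraries that implement custom layers.
Read the intermediate representation: call Core::ReadNetwork to read the intermediate representation (IR) files and create a CNNNetwork object. The network resides in host memory.
Configure inputs and outputs: use CNNNetwork::getInputsInfo and CNNNetwork::getOutputsInfo to obtain the input and output layers, then set their precision, data layout, and so on.
Load the network: Core::LoadNetwork compiles the network and loads it onto the device, returning an ExecutableNetwork object.
Set the input data: use the ExecutableNetwork object to create an InferRequest; host memory can then be copied directly into device memory.
Run inference: choose either synchronous inference with InferRequest::Infer or asynchronous inference with InferRequest::StartAsync.
Get the output: read the inference results with InferRequest::GetBlob.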

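API Details

The sections below walk through the API functions named in the workflow above. The declarations come from an installed copy of the OpenVINO libraries; the OpenVINO source code shows how they are implemented. To keep the article short, the focus here is on the function interfaces and how to use them.

Creating the inference object

The Core class is the central management class of InferenceEngine. It is responsible for device management, network loading, and related functions. Its header is openvino/inference_engine/ie_core.hpp and its implementation is in openvino/inference_engine/src/inference_engine/ie_core.cpp. The Core constructor is declared as follows:

explicit Core(const std::string& xmlConfigFile = std::string());
// xmlConfigFile: plugin configuration file; if empty, the default configuration is loaded

A Core object is usually created without a plugin configuration file, in which case the default plugin libraries are used. The default configuration is located at openvino/deployment_tools/inference_engine/lib/intel64/plugins.xml and is shown below. name is the name of a supported device type, and location is the name of the library that implements that device.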

<ie>
    <plugins>
        <plugin name="GNA" location="libGNAPlugin.so">
        </plugin>
        <plugin name="HETERO" location="libHeteroPlugin.so">
        </plugin>
        <plugin name="CPU" location="libMKLDNNPlugin.so">
        </plugin>
        <plugin name="MULTI" location="libMultiDevicePlugin.so">
        </plugin>
        <plugin name="GPU" location="libclDNNPlugin.so">
        </plugin>
        <plugin name="MYRIAD" location="libmyriadPlugin.so">
        </plugin>
        <plugin name="HDDL" location="libHDDLPlugin.so">
        </plugin>
        <plugin name="FPGA" location="libdliaPlugin.so">
        </plugin>
    </plugins>
</ie>

A more detailed device configuration file has the following form:

<ie>
    <plugins>
        <plugin name="" location="">
            <extensions>
                <extension location=""/>
            </extensions>
            <properties>
                <property key="" value=""/>
            </properties>
        </plugin>
    </plugins>
</ie>

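All of the default plugin libraries above except FPGA can be built from source; the source code is under openvino/blob/master/inference-engine/src/inference_engine. SetConfig configures device properties, which correspond to the property elements in the XML above, and AddExtension registers extension libraries for a device, which correspond to the extension elements. SetConfig is declared as follows:

void SetConfig(const std::map<std::string, std::string>& config, const std::string& deviceName = std::string());
// config: names and values of the properties to set
// deviceName: name of the device to configure; optional. If omitted, the setting is applied to all registered devices

SetConfig sets property values on a device. The first argument holds the device properties and their values, and the second is the device name; if the device name is empty, the properties are set on every device. The available device properties are listed in openvino/inference_engine/include/ie_plugin_config.hpp.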
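
The device-name rule described above can be sketched without OpenVINO at all. The following is a minimal standalone model, not the library's implementation: `DeviceRegistry` and `setConfig` are hypothetical names, and any property key passed in is just sample data.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical toy model: device name -> (property key -> value).
using Config = std::map<std::string, std::string>;
using DeviceRegistry = std::map<std::string, Config>;

// Mimics the rule described for SetConfig: a non-empty deviceName updates
// only that device, an empty deviceName updates every registered device.
// Returns false if deviceName is non-empty but not registered.
bool setConfig(DeviceRegistry& registry, const Config& config,
               const std::string& deviceName = std::string()) {
    if (!deviceName.empty()) {
        auto it = registry.find(deviceName);
        if (it == registry.end()) return false;
        for (const auto& kv : config) it->second[kv.first] = kv.second;
        return true;
    }
    for (auto& device : registry) {
        for (const auto& kv : config) device.second[kv.first] = kv.second;
    }
    return true;
}
```

Passing "CPU" as the device name touches only the CPU entry in this model, while omitting the device name updates every registered entry.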

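AddExtension registers an extension library for a device so that layers not implemented in OpenVINO can be supported. Extensions cannot currently be registered for the HETERO and MULTI device names; the second overload can be used to check the device name.

void AddExtension(const IExtensionPtr& extension);
void AddExtension(IExtensionPtr extension, const std::string& deviceName);
// extension: pointer to an already loaded extension
// deviceName: device name

The following sample code shows the steps for creating the inference object: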

// Create a Core object using the default plugins.xml file
Core ie;

// Set device properties
ie.SetConfig({{PluginConfigParams::KEY_CONFIG_FILE, config_file}}, device_name);

// Register an extension library for the device to support user-defined layers
IExtensionPtr extension_ptr = make_so_pointer<IExtension>(extension_name);
ie.AddExtension(extension_ptr, "CPU");

// Register an in-process extension object to support user-defined layers
IExtensionPtr inPlaceExtension = std::make_shared<InPlaceExtension>();
ie.AddExtension(inPlaceExtension);

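Reading the intermediate representation

The intermediate representation is read with Core::ReadNetwork. The function takes the intermediate representation produced by the Model Optimizer and loads it into host memory. It is declared as follows:

CNNNetwork ReadNetwork(const std::string& modelPath, const std::string& binPath = "") const;
// modelPath: the topology (configuration) file of the intermediate representation
// binPath: the weights file of the intermediate representation. If empty, the function tries to load a weights file with the same name as modelPath; if no such file is found, no weights are loaded
CNNNetwork ReadNetwork(const std::string& model, const Blob::CPtr& weights) const;
// model: the intermediate representation passed as a string rather than a file path
// weights: shared pointer to a constant Blob holding the weights

The second overload is rarely used, so only the first is shown here. ReadNetwork creates a CNNNetwork object:

/** Read network model **/
CNNNetwork network = ie.ReadNetwork(modelPath);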
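
The default for an empty binPath amounts to looking for a weights file next to the model file. Below is a minimal sketch of how such a name could be derived, assuming the usual IR convention of a .xml topology file paired with a .bin weights file. `defaultWeightsPath` is a hypothetical helper and not part of the OpenVINO API; the library's own lookup may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: derives the weights file name that shares the model's
// base name by replacing the extension after the last '.' with ".bin".
// If the file name has no extension, ".bin" is appended.
std::string defaultWeightsPath(const std::string& modelPath) {
    assert(!modelPath.empty());
    const std::size_t slash = modelPath.find_last_of("/\\");
    const std::size_t dot = modelPath.find_last_of('.');
    // A dot only counts as an extension separator if it is in the file name,
    // not in one of the parent directory names.
    const bool hasExtension =
        dot != std::string::npos && (slash == std::string::npos || dot > slash);
    const std::string base = hasExtension ? modelPath.substr(0, dot) : modelPath;
    return base + ".bin";
}
```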

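Configuring inputs and outputs

Input and output properties include the data precision, the layout of the input data (NCHW, NHWC, and so on), and the batch size.

The supported layouts are listed in openvino/inference_engine/include/ie_common.h. The layouts currently available for input and output data include: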

NCHW = 1,  //!< NCHW layout for input / output blobs
NHWC = 2,  //!< NHWC layout for input / output blobs
NCDHW = 3,  //!< NCDHW layout for input / output blobs
NDHWC = 4,  //!< NDHWC layout for input / output blobs
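
The difference between these layouts is the order in which the dimensions are laid out in memory. The sketch below uses hypothetical names (`Dims4`, `offsetNCHW`, `offsetNHWC`) and no OpenVINO code; it computes the flat index of one element under the two 4D layouts.

```cpp
#include <cassert>
#include <cstddef>

// Dimensions of a 4D tensor: batch (N), channels (C), height (H), width (W).
struct Dims4 {
    std::size_t n, c, h, w;
};

// Flat offset of element (n, c, h, w) when memory is ordered N, C, H, W:
// the width index varies fastest, then height, then channel, then batch.
std::size_t offsetNCHW(const Dims4& d, std::size_t n, std::size_t c,
                       std::size_t h, std::size_t w) {
    assert(n < d.n && c < d.c && h < d.h && w < d.w);
    return ((n * d.c + c) * d.h + h) * d.w + w;
}

// Flat offset of the same element when memory is ordered N, H, W, C:
// the channel index varies fastest, then width, then height, then batch.
std::size_t offsetNHWC(const Dims4& d, std::size_t n, std::size_t c,
                       std::size_t h, std::size_t w) {
    assert(n < d.n && c < d.c && h < d.h && w < d.w);
    return ((n * d.h + h) * d.w + w) * d.c + c;
}
```

With dimensions N=2, C=3, H=4, W=5, stepping one pixel to the right moves 1 element in NCHW but 3 elements (one per channel) in NHWC.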

The supported precisions are listed in openvino/inference_engine/include/ie_precision.hpp. The precisions currently available are:

enum ePrecision : uint8_t {
    UNSPECIFIED = 255, /**< Unspecified value. Used by default */
    MIXED = 0,         /**< Mixed value. Can be received from network. No applicable for tensors */
    FP32 = 10,         /**< 32bit floating point value */
    FP16 = 11,         /**< 16bit floating point value */
    Q78 = 20,          /**< 16bit specific signed fixed point precision */
    I16 = 30,          /**< 16bit signed integer value */
    U8 = 40,           /**< 8bit unsigned integer value */
    I8 = 50,           /**< 8bit signed integer value */
    U16 = 60,          /**< 16bit unsigned integer value */
    I32 = 70,          /**< 32bit signed integer value */
    I64 = 72,          /**< 64bit signed integer value */
    U64 = 73,          /**< 64bit unsigned integer value */
    BIN = 71,          /**< 1bit integer value */
    BOOL = 41,         /**< 8bit bool type */
    CUSTOM = 80        /**< custom precision has it's own name and size of elements */
};
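
Precision determines how many bytes each element occupies, and therefore how large an input or output buffer is. Here is a rough sketch, assuming a hypothetical enum `Prec` that mirrors only a few of the values above, together with hypothetical helpers `bytesPerElement` and `bufferBytes`; none of these names come from OpenVINO.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mirror of a few entries from the precision enum above.
enum class Prec : std::uint8_t {
    FP32 = 10, FP16 = 11, I16 = 30, U8 = 40, I8 = 50, I32 = 70, I64 = 72
};

// Size in bytes of one element of the given precision.
std::size_t bytesPerElement(Prec p) {
    switch (p) {
        case Prec::FP32:
        case Prec::I32:
            return 4;
        case Prec::FP16:
        case Prec::I16:
            return 2;
        case Prec::U8:
        case Prec::I8:
            return 1;
        case Prec::I64:
            return 8;
    }
    return 0;  // not reached for the values listed in Prec
}

// Total buffer size in bytes for a tensor with the given dimensions.
// An empty dimension list is treated as a single element.
std::size_t bufferBytes(const std::vector<std::size_t>& dims, Prec p) {
    std::size_t count = 1;
    for (std::size_t d : dims) {
        assert(d > 0);  // every dimension must be positive
        count *= d;
    }
    return count * bytesPerElement(p);
}
```

Under these assumptions, a 1x3x224x224 input takes a quarter of the memory in U8 that it takes in FP32.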

The following sample code sets the properties of the input data:

InputsDataMap inputInfo(network.getInputsInfo());

InputInfo::Ptr& input = inputInfo.begin()->second;
auto inputName = inputInfo.begin()->first;

// Set the precision and the data layout
input->setPrecision(Precision::U8);
input->getInputData()->setLayout(Layout::NCHW);

// Set the batch size
ICNNNetwork::InputShapes inputShapes = network.getInputShapes();
SizeVector& inSizeVector = inputShapes.begin()->second;
inSizeVector[0] = 1;  // set batch to 1
network.reshape(inputShapes);

The following sample code sets the properties of the output data:

OutputsDataMap outputInfo(network.getOutputsInfo());
for (auto &output : outputInfo) {
    // Set the precision and the data layout
    output.second->setPrecision(Precision::FP32);
    output.second->setLayout(Layout::NCHW);
}

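Loading the network

Loading the network compiles it and loads it onto the device. The function used is Core::LoadNetwork, declared as follows:

ExecutableNetwork LoadNetwork(
    const CNNNetwork network, const std::string& deviceName,
    const std::map<std::string, std::string>& config = std::map<std::string, std::string>());
// network: the network created in the "Reading the intermediate representation" step
// deviceName: name of the device that runs inference
// config: device configuration properties; optional. These properties can also be set through SetConfig

It is used as follows:

ExecutableNetwork executable_network = ie.LoadNetwork(network, device_name, configure);

The loading work performed by LoadNetwork is implemented in the device plugin, that is, the shared library (.so) configured in plugins.xml. The entry function is CreatePluginEngine.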

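Setting the input data

The network's input data must be set before inference runs. Call GetBlob to obtain the Blob pointer, then copy the input data into device memory. openvino/inference_engine/sample provides a general-purpose copy routine, matU8ToBlob, which first resizes the image on the host and then copies it into device memory. It is called as follows:

// infer_request is covered in "Running inference" below
Blob::Ptr input = infer_request.GetBlob(input_name);
for (size_t b = 0; b < batch_size; b++) {
    matU8ToBlob<uint8_t>(image, input, b);
}

matU8ToBlob is defined in the header openvino/inference_engine/samples/cpp/common/samples/ocv_common.hpp. Its implementation is shown below: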

template <typename T>
void matU8ToBlob(const cv::Mat& orig_image, InferenceEngine::Blob::Ptr& blob, int batchIndex = 0) {
    // orig_image: the original image
    // blob: memory for the input data
    // batchIndex: index within the batch

    // First read the network input dimensions
    InferenceEngine::SizeVector blobSize = blob->getTensorDesc().getDims();
    const size_t width = blobSize[3];
    const size_t height = blobSize[2];
    const size_t channels = blobSize[1];
    if (static_cast<size_t>(orig_image.channels()) != channels) {
        THROW_IE_EXCEPTION << "The number of channels for net input and image must match";
    }
    T* blob_data = blob->buffer().as<T*>();

    // Resize the original image on the CPU
    cv::Mat resized_image(orig_image);
    if (static_cast<int>(width) != orig_image.size().width ||
            static_cast<int>(height) != orig_image.size().height) {
        cv::resize(orig_image, resized_image, cv::Size(width, height));
    }

    // Compute the data offset of this batch item in memory
    int batchOffset = batchIndex * width * height * channels;

    // Copy the data from host to device; only single-channel and three-channel input is supported
    if (channels == 1) {
        for (size_t  h = 0; h < height; h++) {
            for (size_t w = 0; w < width; w++) {
                blob_data[batchOffset + h * width + w] = resized_image.at<uchar>(h, w);
            }
        }
    } else if (channels == 3) {
        for (size_t c = 0; c < channels; c++) {
            for (size_t  h = 0; h < height; h++) {
                for (size_t w = 0; w < width; w++) {
                    blob_data[batchOffset + c * width * height + h * width + w] =
                            resized_image.at<cv::Vec3b>(h, w)[c];
                }
            }
        }
    } else {
        THROW_IE_EXCEPTION << "Unsupported number of channels";
    }
}
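
The core of matU8ToBlob is the conversion from OpenCV's interleaved pixel order (height, width, channel) to the planar order (channel, height, width) that the indexing above writes into the blob. The following dependency-free sketch reproduces only that copy step, with no resizing. `interleavedToPlanar` and the use of std::vector in place of cv::Mat and Blob are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copies one interleaved (H x W x C) 8-bit image into slot batchIndex of a
// planar (N x C x H x W) buffer. No resizing is done here: the image must
// already have the target width and height.
template <typename T>
void interleavedToPlanar(const std::vector<std::uint8_t>& image,
                         std::size_t channels, std::size_t height,
                         std::size_t width, std::vector<T>& blob,
                         std::size_t batchIndex = 0) {
    const std::size_t imageSize = channels * height * width;
    assert(image.size() == imageSize);

    // Start of this batch item inside the destination buffer.
    const std::size_t batchOffset = batchIndex * imageSize;
    assert(blob.size() >= batchOffset + imageSize);

    for (std::size_t c = 0; c < channels; c++) {
        for (std::size_t h = 0; h < height; h++) {
            for (std::size_t w = 0; w < width; w++) {
                // Source: channel varies fastest. Destination: width varies fastest.
                blob[batchOffset + c * height * width + h * width + w] =
                    static_cast<T>(image[(h * width + w) * channels + c]);
            }
        }
    }
}
```

For a 1x2 image with pixels (1, 2, 3) and (4, 5, 6), the destination slot holds 1, 4, 2, 5, 3, 6: all values of the first channel, then the second, then the third.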

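The Blob class is the basic data unit in OpenVINO; its header is openvino/inference_engine/include/ie_blob.h. Network input and output data are passed through Blob objects.

Running inference

OpenVINO provides two modes for running inference. In synchronous mode, Infer blocks until inference finishes. In asynchronous mode, StartAsync returns immediately and a later call to Wait blocks until execution completes. For video analysis or video-based object detection, OpenVINO officially recommends asynchronous mode: it achieves a higher frame rate and better device utilization, because data is copied while the device is running inference, which reduces the time the device spends waiting.

A synchronous call looks like this:

InferRequest infer_request = executable_network.CreateInferRequest();
infer_request.Infer();
// Get the network output

Asynchronous mode only works correctly inside a loop. First, two inference requests must be created. Second, StartAsync and Wait must be used together: StartAsync submits the inference request for the next frame, and Wait retrieves the result for the previous frame.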

InferRequest::Ptr async_infer_request_curr = executable_network.CreateInferRequestPtr();
InferRequest::Ptr async_infer_request_next = executable_network.CreateInferRequestPtr();

while (true)
{
    // Set the inference data for the next frame
    frameToBlob(curr_frame, async_infer_request_next, imageInputName);
    // Submit the inference request for the next frame
    async_infer_request_next->StartAsync();
    // Retrieve the result for the previous frame
    if (OK == async_infer_request_curr->Wait(IInferRequest::WaitMode::RESULT_READY))
    {
        // Get the network output
    }
    // Swap the two request pointers
    async_infer_request_curr.swap(async_infer_request_next);
}
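
The order of calls in this loop is easier to see in isolation. The sketch below is a single-threaded simulation that records only when each frame is submitted and when its result is collected. `FakeRequest` and `runPingPong` are hypothetical, and no real inference or concurrency takes place. Unlike the loop above, it stops after a fixed number of frames and then collects the last pending result.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an inference request. It only tracks which frame
// it holds, so the example stays single-threaded and deterministic.
struct FakeRequest {
    int frame = -1;        // frame currently assigned, -1 if none
    bool started = false;  // true between submission and collection
};

// Returns the sequence of "start N" / "done N" events for frameCount frames.
std::vector<std::string> runPingPong(int frameCount) {
    assert(frameCount >= 0);
    std::vector<std::string> log;
    FakeRequest curr, next;
    for (int frame = 0; frame < frameCount; ++frame) {
        // Submit the new frame on the idle request (models StartAsync).
        next.frame = frame;
        next.started = true;
        log.push_back("start " + std::to_string(frame));

        // Collect the previous frame if one is in flight (models Wait).
        if (curr.started) {
            log.push_back("done " + std::to_string(curr.frame));
            curr.started = false;
        }

        // The request just submitted becomes the one to collect next time.
        std::swap(curr, next);
    }
    // Drain the last request that is still pending after the loop.
    if (curr.started) {
        log.push_back("done " + std::to_string(curr.frame));
    }
    return log;
}
```

For three frames the recorded order is: start 0, start 1, done 0, start 2, done 1, done 2. Each frame's result is collected only after the following frame has been submitted, which is what keeps the device busy in the real asynchronous loop.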

Getting the output

To get the output, call GetBlob to obtain the address of the output data and then parse it. The approach is similar in synchronous and asynchronous mode. Method one below obtains the Blob pointer, whose data can then be parsed; method two reads the output values directly.

// Method one
Blob::Ptr output_blob = infer_request.GetBlob(output_name);
MemoryBlob::CPtr moutput = as<MemoryBlob>(infer_request.GetBlob(output_name));

// Method two
const float *detections = async_infer_request_curr->GetBlob(output_name)->buffer().as<PrecisionTrait<Precision::FP32>::value_type*>();
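
What the float pointer from method two contains depends on the network. As an illustration only, the sketch below assumes an SSD-style detection output in which each candidate occupies 7 consecutive floats (image id, label, confidence, xmin, ymin, xmax, ymax) and a negative image id marks the end of the valid rows. This layout is an assumption that the article does not state, and `Detection` and `parseDetections` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One parsed detection under the assumed 7-float layout.
struct Detection {
    int imageId;
    int label;
    float confidence;
    float xmin, ymin, xmax, ymax;
};

// Assumed layout (not stated in the article): 7 floats per candidate in the
// order image id, label, confidence, xmin, ymin, xmax, ymax. A negative
// image id ends the list. Candidates below the threshold are skipped.
std::vector<Detection> parseDetections(const float* data,
                                       std::size_t maxProposals,
                                       float threshold) {
    assert(data != nullptr || maxProposals == 0);
    constexpr std::size_t kFields = 7;
    std::vector<Detection> kept;
    for (std::size_t i = 0; i < maxProposals; ++i) {
        const float* row = data + i * kFields;
        if (row[0] < 0.0f) break;          // end marker
        if (row[2] < threshold) continue;  // low confidence
        kept.push_back({static_cast<int>(row[0]), static_cast<int>(row[1]),
                        row[2], row[3], row[4], row[5], row[6]});
    }
    return kept;
}
```

If the network's real output has a different shape, only the row width and field order in this sketch would need to change.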

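Closing remarks

This article has described the commonly used API functions of the OpenVINO Inference Engine and how to use them. The sample programs in openvino/inference_engine/sample served as references while writing it.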

References

https://github.com/openvinotoolkit/openvino
https://docs.openvinotoolkit.org/latest/classInferenceEngine_1_1Core.html
https://docs.openvinotoolkit.org/latest/_docs_IE_DG_inference_engine_intro.html
https://docs.openvinotoolkit.org/latest/_docs_IE_DG_Deep_Learning_Inference_Engine_DevGuide.html