Running TensorFlow and OpenVINO Models in C# for Audio Processing

### Introduction

Here’s an article written mostly by ChatGPT based on my code. I fixed some things that were wrong and added some details. Both C# files are included in this post.

In this article, we will explore two distinct model wrapper implementations in C#: `TensorFlowModelWrapper` and `OpenVINOModelWrapper`. These wrappers are designed to facilitate the integration of TensorFlow and OpenVINO models into a C# application, and are used for my audio player. We will examine the key functionalities, the differences in data handling, and the unique requirements of each wrapper.

Both classes implement the `ITensorFlowModelWrapper` interface, so they can be swapped easily during development.
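
For example, the backend can be chosen with a single assignment (a minimal sketch; `useOpenVINO` is a hypothetical flag):

```csharp
// Hypothetical flag; both wrappers implement ITensorFlowModelWrapper.
bool useOpenVINO = true;
ITensorFlowModelWrapper model = useOpenVINO
    ? (ITensorFlowModelWrapper)new OpenVINOModelWrapper()
    : new TensorFlowModelWrapper();
model.Init();
```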

### 1. Overview of TensorFlowModelWrapper

The `TensorFlowModelWrapper` class interfaces with TensorFlow models using ML.NET, providing a structured way to integrate these models into C# applications.

#### Key Characteristics:

– **Model Initialization and Data Handling**: Unlike OpenVINO, TensorFlow models in ML.NET require dedicated classes for input and output data. This means defining a schema that describes the model’s input and output formats, which is then used for data pre- and post-processing (see the sketch after this list). If the model has a dynamic batch size, ML.NET automatically configures it for a batch size of 1.

– **Fitting Process**: While TensorFlow models are generally pre-trained, ML.NET requires an explicit “fit” process to prepare the model for inference. This step configures the input pipeline and ensures that the model is ready to handle the input data effectively.

– **Inference Execution**: Input and output data objects must be created for each inference call.

– **Memory Management**: TensorFlow models in ML.NET may require explicit memory-management measures, such as forcing garbage collection, to keep memory usage under control, particularly with large models or high-frequency inference requests. In my case I had a memory leak after each call, and I switched to OpenVINO for this reason.
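
Putting those pieces together, the ML.NET path looks roughly like this. A condensed sketch of what `TensorFlowModel.cs` (included below) does; the node names, tensor sizes, and model path are the ones from my model and are placeholders for yours:

```csharp
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext();

// Load the SavedModel and describe which graph nodes to score.
var tfModel = mlContext.Model.LoadTensorFlowModel(@"H:\Training\WaveUnet\saves\model529\");
var pipeline = tfModel.ScoreTensorFlowModel(
    outputColumnNames: new[] { "StatefulPartitionedCall" },
    inputColumnNames: new[] { "serving_default_input_4" },
    addBatchDimensionInput: false);

// "Fit" once on a dummy block to build the transformer.
float[] samples = new float[196608 * 2]; // your audio block here
var warmup = mlContext.Data.LoadFromEnumerable(new[] { new Input { Data = samples } });
var transformer = pipeline.Fit(warmup);

// Each inference goes through an IDataView and the typed Output class.
var inputView = mlContext.Data.LoadFromEnumerable(new[] { new Input { Data = samples } });
float[] result = mlContext.Data
    .CreateEnumerable<Output>(transformer.Transform(inputView), reuseRowObject: false)
    .Single().Data;

// ML.NET requires these schema classes; the column names must match the graph nodes.
public class Input
{
    [VectorType(196608 * 2)]
    [ColumnName("serving_default_input_4")]
    public float[] Data { get; set; }
}

public class Output
{
    [VectorType(196608 * 8)]
    [ColumnName("StatefulPartitionedCall")]
    public float[] Data { get; set; }
}
```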

### 2. Overview of OpenVINOModelWrapper

The `OpenVINOModelWrapper` is designed to work with models optimized by the OpenVINO toolkit, which provides accelerated inference on various Intel hardware.

#### Key Characteristics:

– **Simpler Data Handling**: OpenVINO models do not require a separate class to handle input and output data. Instead, the data is written directly into the model’s tensor, which is more straightforward than TensorFlow’s data-schema definitions (see the sketch after this list). However, if your model has a dynamic batch size, you must reshape the model to the desired batch size before compiling it.

– **No Fitting Required**: Unlike TensorFlow models in ML.NET, OpenVINO models do not require a fitting process. The models are pre-trained and optimized for inference directly. This reduces the setup overhead and speeds up the integration process.

– **Optimized Inference**: OpenVINO is designed to optimize model inference across different hardware configurations. The `OpenVINOModelWrapper` handles the model compilation and inference request setup automatically, making it more efficient for running AI models on various Intel platforms, including CPUs, GPUs, VPUs, and FPGAs.
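
For comparison, the OpenVINO path is shorter. A condensed sketch of what `OpenVinoModel.cs` (included below) does with the OpenVINO-CSharp-API; the path and shape are placeholders:

```csharp
using OpenVinoSharp;

var core = new Core();
var model = core.read_model(@"H:\Training\WaveUnet\saves\model529\saved_model.xml");

// The batch dimension is dynamic in this model, so pin the full shape
// (batch 1, NWC layout) before compiling.
model.reshape(new PartialShape(3, new long[] { 1, 196608, 2 }));

CompiledModel compiled = core.compile_model(model, "AUTO");
InferRequest request = compiled.create_infer_request();

// No schema classes: the float[] is written straight into the input tensor.
float[] samples = new float[196608 * 2]; // your audio block here
using Tensor input = request.get_input_tensor(0);
input.set_data(samples);

request.infer();

using Tensor output = request.get_output_tensor(0);
float[] result = output.get_data<float>((int)output.get_size());
```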

### 3. Key Differences in Data Handling and Model Integration

There are several fundamental differences between how `TensorFlowModelWrapper` and `OpenVINOModelWrapper` handle data and model integration:

– **Input and Output Data Classes**: TensorFlow in ML.NET requires defining classes that describe the structure of input and output data. These classes are essential for converting between C# data types and the tensor formats expected by TensorFlow models. In contrast, OpenVINO handles input and output tensors directly without needing separate data classes, streamlining the data processing pipeline.
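
Side by side, the difference shows up at the call site. A sketch reusing the `mlContext`/`transformer` and `request` objects from the two snippets above (so not standalone):

```csharp
// ML.NET: attributed classes plus an IDataView on every call.
var view = mlContext.Data.LoadFromEnumerable(new[] { new Input { Data = samples } });
float[] tfOut = mlContext.Data
    .CreateEnumerable<Output>(transformer.Transform(view), reuseRowObject: false)
    .Single().Data;

// OpenVINO: the raw float[] goes straight into the request's tensor.
request.get_input_tensor(0).set_data(samples);
request.infer();
float[] ovOut = request.get_output_tensor(0).get_data<float>(196608 * 8);
```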

### Conclusion

With the current state of the ML.NET TensorFlow API, it is safer to use OpenVINO through guojin-yan’s OpenVINO-CSharp-API. This, however, requires first converting the model into OpenVINO’s IR format (for example, with OpenVINO’s Model Optimizer tool, `mo`).

The ML.NET TensorFlow API, on the other hand, can load TensorFlow models directly, which would be useful if the model is constantly being updated.

### TensorFlowModel.cs

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;

using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Transforms;

namespace AIAudioPlayer.ML
{

    public enum TensorFormat_1D_Enum : int
    {
        NWC, // tensorflow standard
        NCW  // torch ?
    }

    public interface ITensorFlowModelWrapper
    {
        bool Initialized { get; }
        bool FitCalled { get; }
        int InputChannels { get; }
        int OutputChannels { get; }

        /// <summary>
        /// Formato en memoria del tensor
        /// </summary>
        TensorFormat_1D_Enum TensorFormat { get; }

        int DataSampleRate { get; }
        int DataLengthSamples { get; } // Used for both input and output
        int DataTrimSize { get; } // Samples

        bool Init(string modelPath = "");
        bool Fit(float[] inputTensor);
        bool Run(float[] inputTensor, ref float[] outputTensor);
        bool Run_wLock(float[] inputTensor, ref float[] outputTensor);

        /// <summary>
        /// Returns true if everything was loaded correctly
        /// </summary>
        /// <returns></returns>
        bool IsReady();
    }

    internal class TensorFlowModelWrapper : ITensorFlowModelWrapper
    {
#if DEBUG
        protected string modelPath = "H:\\Training\\WaveUnet\\saves\\model529\\"; // 58 = lindo
        //protected string modelPath = "H:\\Training\\WaveUnet\\saves\\modelBypass\\"; // Para probar comunicacion entre .net y tensorflow
#else
        protected string modelPath = "{AppExe}/modelB"; 
#endif
        protected MLContext mlContext = new MLContext();
        protected TensorFlowModel? model = null;
        protected TensorFlowTransformer? transformer = null;

        protected TensorFlowEstimator? pipeline = null;

        private bool _fitCalled = false;

        public  int tensor_InputChannels = 2;
        public  int tensor_OutputChannels = 8;

        public  int tensor_DataLength = 196608;
        public  int tensor_SampleRate = 44100;
        
        private  TensorFormat_1D_Enum _tensorFormat = TensorFormat_1D_Enum.NWC;
        public TensorFormat_1D_Enum TensorFormat { get { return _tensorFormat; } }

        public bool Initialized { get { return pipeline != null; } }

        public bool FitCalled { get => _fitCalled; }

        public int InputChannels { get => tensor_InputChannels; }
        public int OutputChannels { get => tensor_OutputChannels; }
        public int DataLengthSamples { get => tensor_DataLength; }

        public int DataSampleRate { get => tensor_SampleRate; }
        
        /// <summary>
        /// Trim size in samples (for each side).
        /// -1 = use default
        /// </summary>
        public int DataTrimSize { get { return -1; } }

        public const string modelOutputName = "StatefulPartitionedCall";

#if DEBUG
        public const string modelInputName = "serving_default_input_4";
#else
        public const string modelInputName = "serving_default_input_3"; // ModelA = 9, modelB
#endif
        private string _toNativeSeparators(string path)
        {
            return Path.GetFullPath(path);
        }

        public bool Init(string modelPath = "")
        {
            if (modelPath != "")
                throw new ArgumentException("Custom modelPath not supported in this class");

            string AppExePath = Path.GetDirectoryName(Application.ExecutablePath);

            // Use the class-level path; the parameter exists only to satisfy the interface.
            string wModelPath = _toNativeSeparators(this.modelPath.Replace("{AppExe}", AppExePath));

            // Load the TensorFlow model
            model = mlContext.Model.LoadTensorFlowModel(wModelPath);
            Debug.Print("Model loaded from {0}", wModelPath);

            //var schema = tensorFlowModel.GetInputSchema();
            var schema = model.GetModelSchema();

            foreach (var column in schema)
            {
                Console.WriteLine($"{column.Name}: {column.Type}");
            }

            // Define the schema of the TensorFlow model
            pipeline = model.ScoreTensorFlowModel(
                outputColumnNames: new[] { modelOutputName },
                inputColumnNames: new[] { modelInputName }, // 11 - model  42, 2- model 44, 4 -model 55
                addBatchDimensionInput: false);

            return true;
        }

        public bool Fit(float[] inputTensor)
        {
            if (model == null || pipeline == null)
                throw new Exception("Init not called");

            if (_fitCalled)
                throw new Exception("Fit already called");

            _fitCalled = true;

            // Fit the pipeline once so the transformer is ready for inference
            var input = new Input { Data = inputTensor };
            var inputList = new List<Input>() { input }; // Wrap in a list so it implements IEnumerable

            var dataView = mlContext.Data.LoadFromEnumerable(inputList);

            transformer = pipeline.Fit(dataView);

            return true;
        }

        public bool Run(float[] inputTensor, ref float[] outputTensor)
        {
            if (transformer == null)
            {
                Debug.Print("Fit not called");
                return false;
            }

            // Prepare input data
            var input = new Input { Data = inputTensor };
            var inputList = new List<Input>() { input };

            // Load data into IDataView
            var inputDv = mlContext.Data.LoadFromEnumerable(inputList);

            // Transform data
            var transformedValues = transformer.Transform(inputDv);

            // Extract output
            var output = mlContext.Data.CreateEnumerable<Output>(transformedValues, reuseRowObject: false).Single();
            outputTensor = output.Data;

            // Explicitly nullify objects to aid garbage collection
            inputDv = null;
            transformedValues = null;

            // Optionally force garbage collection
            GC.Collect();
            GC.WaitForPendingFinalizers();

            return true;
        }


        /// <summary>
        /// Runs the model using a sync lock
        /// </summary>
        public bool Run_wLock(float[] inputTensor, ref float[] outputTensor)
        {
            lock (this)
            {
                return Run(inputTensor, ref outputTensor);
            }
        }

        public bool IsReady()
        {
            return _fitCalled;
        }

        public class Input
        {
            [VectorType(196608 * 2)] // The model takes a three-dimensional tensor, but ML.NET expects it as a flattened vector
            //[VectorType(tensorBHWC_height, tensorBHWC_width, tensor_InputChannels)]
            [ColumnName(modelInputName)] // This name must match the input node's name
            public float[] Data { get; set; }
        }

        public class Output
        {
            [VectorType(196608 * 8)] // Again, a three-dimensional tensor, flattened
            //[VectorType(tensorBHWC_height, tensorBHWC_width, tensor_OutputChannels)]
            [ColumnName(modelOutputName)] // This name must match the output node's name
            public float[] Data { get; set; }
        }
    }
}

### OpenVinoModel.cs

using System;
using System.Diagnostics;
using System.IO;
using OpenVinoSharp;

namespace AIAudioPlayer.ML
{
    /// <summary>
    /// Loads an OpenVINO model; implements ITensorFlowModelWrapper
    /// </summary>
    internal class OpenVINOModelWrapper : ITensorFlowModelWrapper
    {
#if DEBUG
        protected string DefaultModelPath = "H:\\Training\\WaveUnet\\saves\\model529\\saved_model.xml";
        //protected string modelPath = "H:\\Training\\WaveUnet\\saves\\modelBypass\\saved_model.xml";
#else
        protected string DefaultModelPath = "{AppExe}/modelB.xml";
#endif
        private static Core? _core;
        private Model _model;
        private CompiledModel _compiledModel;
        private InferRequest _inferRequest;

        private bool _fitCalled = false;

        /// <summary>
        /// The name of the model's input tensor. If empty, get_input() is called to use the model's default input
        /// </summary>
        protected string model_InputTensor = "";

        protected int model_InputChannels = 2;
        protected int model_OutputChannels = 8;

        protected int model_DataLength = 196608;
        protected int model_SampleRate = 44100;

        /// <summary>
        /// Models that don't have the batch dimension set to a fixed size need
        /// this to be TRUE
        /// </summary>
        protected bool model_needsReshape = true;

        // These are for models that take multiple inputs, in case the model can work
        // correctly with just one. Otherwise, use a dedicated model class.
        protected ulong model_inputN = 0;
        protected ulong model_OutputN = 0;

        private TensorFormat_1D_Enum _tensorFormat = TensorFormat_1D_Enum.NWC;
        public TensorFormat_1D_Enum TensorFormat { get { return _tensorFormat; } }

        public bool Initialized { get { return _compiledModel != null; } }

        public bool FitCalled { get => _fitCalled; }

        public int InputChannels { get => model_InputChannels; }
        public int OutputChannels { get => model_OutputChannels; }
        public int DataLengthSamples { get => model_DataLength; }

        public int DataSampleRate { get => model_SampleRate; }

        public int DataTrimSize { get { return -1; } }

        public void SetModelSettings(string _defaultPath = "",
                                     int _model_InputChannels = 2, int _model_OutputChannels = 8, int _model_DataLength = 196608,
                                     int _model_SampleRate = 44100, TensorFormat_1D_Enum _model_TensorFormat = TensorFormat_1D_Enum.NWC,
                                     bool _model_needsReshape = true,
                                     string _model_InputTensor = "",
                                     ulong _model_inputN = 0,
                                     ulong _model_outputN = 0)
        {
            if (_model != null)
                throw new Exception("Init already called");

            if (_defaultPath != "")
                DefaultModelPath = _defaultPath;

            this.model_InputChannels = _model_InputChannels;
            this.model_OutputChannels = _model_OutputChannels;
            this.model_DataLength = _model_DataLength;
            this.model_SampleRate = _model_SampleRate;
            this._tensorFormat = _model_TensorFormat;
            this.model_InputTensor = _model_InputTensor;
            this.model_needsReshape = _model_needsReshape;
            this.model_inputN = _model_inputN;
            this.model_OutputN = _model_outputN;
        }

        public bool Init(string modelPath = "")
        {
            if (modelPath == "")
                modelPath = DefaultModelPath;

            string AppExePath = Path.GetDirectoryName(System.Windows.Forms.Application.ExecutablePath);
            string wModelPath = modelPath.Replace("{AppExe}", AppExePath);

            if (OpenVINOModelWrapper._core == null)
            {
                OpenVINOModelWrapper._core = new Core();
            }

            _model = OpenVINOModelWrapper._core.read_model(wModelPath);
            
            // ---- Get some info (n of inputs, input shape)
            ulong inputSize = _model.get_inputs_size();
            ulong outputSize = _model.get_outputs_size();
            PartialShape inputShape;
            if (model_InputTensor != "")
            {
                inputShape = _model.get_input(model_InputTensor).get_partial_shape();
            }
            else
            { // Use default input (FROM model)
                inputShape = _model.get_input().get_partial_shape();
            }

            Debug.Print("Model input shape: {0}", inputShape.to_string());

            // ------------ Reshape model (set batch dimension to 1)
            if (model_needsReshape)
            {

                PartialShape newShape = new PartialShape(3, new long[] { 1, DataLengthSamples, InputChannels }); // NWC

                if (_tensorFormat == TensorFormat_1D_Enum.NCW)
                    newShape = new PartialShape(3, new long[] { 1, InputChannels, DataLengthSamples }); // NCW

                Debug.Print("Reshaping model input to {0}", newShape.to_string());

                if (inputSize == 1)
                { // Reshape only when the model has a single input
                    _model.reshape(newShape);
                }
                else
                    throw new Exception("Can't reshape a model with multiple inputs");
            }

            // --------- Compile model

            _compiledModel = OpenVINOModelWrapper._core.compile_model(_model, "AUTO");

            // ---- List inputs of compiled model
            // ... List inputs
            for (ulong i = 0; i < inputSize; i++)
            {
                using OpenVinoSharp.Input tmpInput = _compiledModel.input(i);
                Debug.Print("Input {0} Name: {1} Shape: {2} Index {3}", i, tmpInput.get_any_name(), tmpInput.get_shape().to_string(), tmpInput.get_index());
            }
            // ... List outputs
            for (ulong i = 0; i < outputSize; i++)
            {
                using OpenVinoSharp.Output tmpOutput = _compiledModel.output(i);
                Debug.Print("Output {0} Name: {1} Shape: {2} Index {3}", i, tmpOutput.get_any_name(), tmpOutput.get_shape().to_string(), tmpOutput.get_index());
            }

            // Create inference request
            _inferRequest = _compiledModel.create_infer_request();

            Debug.Print("Model loaded and compiled from {0}", wModelPath);

            return true;
        }

        public bool Fit(float[] inputTensor)
        {
            if (_compiledModel == null || _inferRequest == null)
                throw new Exception("Init not called");

            if (_fitCalled)
                throw new Exception("Fit already called");

            _fitCalled = true;
            // No actual fitting necessary, as we're only running inference

            return true;
        }

        public bool Run(float[] inputTensor, ref float[] outputTensor)
        {
            if (_compiledModel == null || _inferRequest == null)
            {
                Debug.Print("Init not called");
                return false;
            }

            // Prepare input tensor
            using Tensor inputTensorObj = _inferRequest.get_input_tensor(model_inputN);
            
            // Shape input_shape = inputTensorObj.get_shape();
            // Debug.Print("Input tensor Size: {0} {1}", (int)inputTensorObj.get_size(), inputTensorObj.ToString());
            inputTensorObj.set_data(inputTensor);

            // Perform inference
            _inferRequest.infer();

            // Retrieve output tensor
            using Tensor outputTensorObj = _inferRequest.get_output_tensor(model_OutputN);
            // Debug.Print("Output tensor Size: {0} ", (int)outputTensorObj.get_size());
            outputTensor = outputTensorObj.get_data<float>((int)outputTensorObj.get_size());

            return true;
        }


        public bool Run_wLock(float[] inputTensor, ref float[] outputTensor)
        {
            lock (this)
            {
                return Run(inputTensor, ref outputTensor);
            }
        }

        public bool IsReady()
        {
            if (_compiledModel == null || _inferRequest == null)
            {
                Debug.Print("IsReady = False");
                return false;
            }
            return true;
        }
    }
}
