tflite-server with custom model?
-
Hi Chad and team,
I am trying to load a custom object detection .tflite model on the VOXL 2 and get voxl-tflite-server to read in the object detection data. Should I
- modify the code in main.cpp and inference_helper.h for voxl-tflite-server, or
- simply go into /etc/modalai/voxl-tflite-server.conf and point it at my model?
I went to TensorFlow Hub, downloaded the 035-128 classification variant of mobilenet_v2, float16-quantized the model, and then converted it with the following code:
```python
import os
import tensorflow as tf

# Set the name for the output .tflite file
tf_lite_model_file_name = "custom_net_01.tflite"

# Provide the path to the directory where the saved model is located
saved_model_dir = "./"

# Convert the model
tflite_converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
print('Path successfully found!')
tflite_model = tflite_converter.convert()

# Write the TFLite model to a file
with open(tf_lite_model_file_name, "wb") as f:
    f.write(tflite_model)
```
Once I had custom_net_01.tflite, I put the model back on the VOXL 2. More specifically, I moved it to /usr/bin/dnn/, where all the other .tflite models are located. I then went to /etc/modalai/voxl-tflite-server.conf and edited the .conf file so that the "model" entry reads `"model": "/usr/bin/dnn/custom_net_01.tflite",`.
I then ran the commands `systemctl stop voxl-tflite-server`, `systemctl disable voxl-tflite-server`, `systemctl enable voxl-tflite-server`, and `systemctl start voxl-tflite-server` as a way to re-source voxl-configure-tflite. When I viewed the camera feed on voxl-portal, the feed came through with no problem, but the portal said that tflite had an unknown source. When I then reconfigured voxl-tflite-server to use the mobilenet option, tflite worked perfectly fine and I was able to see the bounding boxes from that model.
When I went to the Deep Learning with VOXL-TFLite-Server webpage, I tried to run its TFLite conversion and quantization code, but it did not work for me because that code is outdated (TensorFlow 2.8 instead of 2.16 or 2.17). Then I went down to the Implementing your Model in voxl-tflite-server section, clicked through to the InferenceHelper GitLab page, and read the file, but wasn't sure what to do next. (To be fair, I write in Python, not C++.)
On the voxl-tflite-server GitLab page, it looks like main.cpp assigns a specific post-processing function, and in some cases normalization and label handling, to each of the models already loaded on the VOXL 2 in /usr/bin/dnn. After looking at main.cpp, I felt like the correct thing to do was to edit main.cpp, inference_helper.h, and voxl-configure-tflite to include my new model and tell main.cpp which post-processing and normalization to use until the model worked, but I was told that with C++ code you need to recompile.
After having read the Implementing your Model in voxl-tflite-server section on the Deep Learning with VOXL-TFLite-Server webpage, I wasn't entirely sure what to do next.
What should I do with inference_helper.h? What next step can I take to get voxl-tflite-server to use my custom model?
Thanks!
-
Good morning @Chad-Sweet, @modaltb, and team,
Following up on my post from yesterday, I created a back-up of the dnn folder (located in /usr/bin/dnn), went into the new folder, renamed ssdlite_mobilenet_v2_coco.tflite to ssdlite_mobilenet_v2_coco.bak.tflite, renamed custom_net_01.tflite to ssdlite_mobilenet_v2_coco.tflite, and restarted voxl-tflite-server. When I opened voxl-portal, I could not only view the camera feed, but tflite was also showing a camera feed (something that hadn't happened before); however, it wasn't drawing any new bounding boxes.
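(As a next debugging step, the rough sketch below is how I would compare what the server's original model exposes against what my custom model exposes. It is my own snippet, not from the ModalAI docs; the paths assume the renamed files ended up in /usr/bin/dnn, and that TensorFlow or an equivalent TFLite runtime is available wherever it is run.)

```python
# Diagnostic sketch: print the input/output tensors of the original SSDLite
# model next to those of the custom model so any shape or dtype mismatch is obvious.
import tensorflow as tf

def describe(path):
    interp = tf.lite.Interpreter(model_path=path)
    interp.allocate_tensors()
    print(path)
    for d in interp.get_input_details():
        print("  in :", d["shape"], d["dtype"])
    for d in interp.get_output_details():
        print("  out:", d["shape"], d["dtype"])

describe("/usr/bin/dnn/ssdlite_mobilenet_v2_coco.bak.tflite")  # original model
describe("/usr/bin/dnn/ssdlite_mobilenet_v2_coco.tflite")      # my custom model (renamed)
```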
Thankfully, it feels like I'm getting really close to cracking the code on this, but I am still just missing the last step of getting voxl-tflite-server to actually draw bounding boxes using my custom model. I look forward to hearing from you all!
Best wishes,
-
Hello all,
I was able to successfully train a MobileNet v2 set up for object detection, quantize it, and convert it into .tflite format.
Basically, my custom model had to meet 3 criteria:
- The model is not too large (models should probably be below 17 KB)
- The model's input shape is [1 300 300 3] and the input data type is uint8 or numpy.uint8
- The model has four outputs, in THIS ORDER:
- output one shape: [1 10 4]
- output two shape: [1 10]
- output three shape: [1 10]
- output four shape: [1]
All of these outputs are in float32 (numpy.float32) format.
From what I can best understand, the model architecture doesn't have to be an exact match, so long as the inputs and outputs are compatible.
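Before copying a candidate .tflite onto the VOXL 2, I check it against those criteria with a small helper like the one below (my own sketch, not part of voxl-tflite-server; it assumes the order returned by get_output_details() is the order the server sees).

```python
# Verification sketch: check a candidate .tflite against the criteria above.
import os
import numpy as np
import tensorflow as tf

def check_model(path):
    print("size on disk:", os.path.getsize(path), "bytes")

    interp = tf.lite.Interpreter(model_path=path)
    interp.allocate_tensors()

    # Exactly one input: [1 300 300 3], uint8
    (inp,) = interp.get_input_details()
    assert list(inp["shape"]) == [1, 300, 300, 3], inp["shape"]
    assert inp["dtype"] == np.uint8, inp["dtype"]

    # Exactly four outputs with the expected shapes, all float32
    outs = interp.get_output_details()
    assert len(outs) == 4, "expected exactly four outputs"
    expected_shapes = [[1, 10, 4], [1, 10], [1, 10], [1]]
    for out, shape in zip(outs, expected_shapes):
        assert list(out["shape"]) == shape, (out["name"], out["shape"])
        assert out["dtype"] == np.float32, (out["name"], out["dtype"])

    print("all input/output checks passed")

check_model("mobilenet_v2_custom_quantized.tflite")
```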
Here is my code:
```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Load the MobileNetV2 feature vector model directly from TensorFlow
base_model = tf.keras.applications.MobileNetV2(
    input_shape=(300, 300, 3),  # Use 300x300 input shape as required
    include_top=False,
    weights='imagenet')

# Freeze the base model
base_model.trainable = False

# Adjust input shape to 300x300x3 and use uint8 data type
inputs = tf.keras.Input(shape=(300, 300, 3), dtype='uint8')

# Use a Lambda layer to cast inputs to float32
x = layers.Lambda(lambda image: tf.cast(image, tf.float32))(inputs)

# Pass the cast inputs through the base model
x = base_model(x)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(1280, activation='relu')(x)

# Bounding box output (10 detections, 4 coordinates each)
bbox_outputs = layers.Dense(40, activation='sigmoid')(x)
bbox_outputs = layers.Lambda(lambda t: tf.reshape(t, [1, 10, 4]), name="bbox_outputs")(bbox_outputs)

# Class ID output (10 detections)
class_outputs = layers.Dense(10, activation='softmax')(x)
class_outputs = layers.Lambda(lambda t: tf.reshape(t, [1, 10]), name="class_outputs")(class_outputs)

# Confidence score output (10 detections)
confidence_outputs = layers.Dense(10, activation='sigmoid')(x)
confidence_outputs = layers.Lambda(lambda t: tf.reshape(t, [1, 10]), name="confidence_outputs")(confidence_outputs)

# Number of detections (single value)
num_detections = layers.Lambda(lambda t: tf.constant([10], dtype=tf.float32))(x)
num_detections = layers.Lambda(lambda t: tf.reshape(t, [1]), name="num_detections")(num_detections)

# Define the outputs explicitly in the order you want:
#   1. Bounding boxes
#   2. Class IDs
#   3. Confidence scores
#   4. Number of detections
model = tf.keras.Model(inputs, [bbox_outputs, class_outputs, confidence_outputs, num_detections])

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Define a ConcreteFunction for the model with explicit output signatures
@tf.function(input_signature=[tf.TensorSpec([1, 300, 300, 3], tf.uint8)])
def model_signature(input_tensor):
    outputs = model(input_tensor)
    return {
        'bbox_outputs': outputs[0],
        'class_outputs': outputs[1],
        'confidence_outputs': outputs[2],
        'num_detections': outputs[3]
    }

# Convert the model to TensorFlow Lite using signatures
converter = tf.lite.TFLiteConverter.from_concrete_functions([model_signature.get_concrete_function()])

# Apply float16 quantization
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]  # Use float16 quantization

# Convert the model
tflite_model = converter.convert()

# Save the TensorFlow Lite model
with open('/content/mobilenet_v2_custom_quantized.tflite', 'wb') as f:
    f.write(tflite_model)

# Load the TFLite model and check the input/output details to confirm correct mapping
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()

# Get the input and output details to verify the input/output structure
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print("Input Details:", input_details)
print("Output Details:", output_details)
```