I have the following TensorFlow model that I want to quantize:

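# Imports assumed for this snippet (not shown in the original post)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, LSTM, Dropout, Dense
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

model = Sequential([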
    Input(shape=input_shape),
    LSTM(lstm_units_1, return_sequences=True),
    Dropout(dropout_rate),
    LSTM(lstm_units_2, return_sequences=False),
    Dropout(dropout_rate),
    Dense(4, activation='softmax')
])

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

early_stopping = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
model_checkpoint = ModelCheckpoint(model_path, monitor='val_loss', save_best_only=True, save_weights_only=False, mode='min')

history = model.fit(X_train, y_train, 
                    epochs=epochs, 
                    batch_size=batch_size, 
                    validation_split=0.2, 
                    callbacks=[early_stopping],
                    verbose=1)

model.save(model_path)

I am trying to perform the quantization like this:

import tensorflow_model_optimization as tfmot  # conventional alias, assumed here

annotated_model = tfmot.quantization.keras.quantize_annotate_model(model)

with tfmot.quantization.keras.quantize_scope():
    quant_aware_model = tfmot.quantization.keras.quantize_apply(annotated_model)

quant_aware_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

But I am receiving this error: ValueError: `to_annotate` can only be a `keras.Model` instance. Use the `quantize_annotate_layer` API to handle individual layers. You passed an instance of type: Sequential.

Annotating each layer individually, as the error suggests, did not work either: it raised another ValueError about LSTM layers not being accepted as inputs.

annotated_model = tf.keras.Sequential([
    tfmot.quantization.keras.quantize_annotate_layer(layer)
    for layer in model.layers
])

What is the correct way to quantize the particular model I am using here?

1 Answer

Write the model either entirely with TensorFlow's own APIs or entirely with Keras, but do not mix the two, as this can lead to numerous errors. I recommend using Keras for everything.
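
A rough way to spot such a mix is to compare the top-level package that defines each object's class. The sketch below is only an illustration: the helper names (`top_level_package`, `same_origin`) are hypothetical, and the stand-in classes merely imitate objects coming from two different packages, so no real Keras or TensorFlow API is called.

```python
def top_level_package(obj) -> str:
    """Return the top-level package in which obj's class is defined."""
    return type(obj).__module__.split(".")[0]


def same_origin(*objs) -> bool:
    """True if all objects' classes come from the same top-level package."""
    return len({top_level_package(o) for o in objs}) == 1


# Stand-in classes (hypothetical) that imitate a model built with one
# package and layers built with the same or a different package.
ModelA = type("Sequential", (), {"__module__": "package_a.models"})
LayerA = type("Dense", (), {"__module__": "package_a.layers"})
LayerB = type("Dense", (), {"__module__": "package_b.layers"})

print(same_origin(ModelA(), LayerA()))  # True
print(same_origin(ModelA(), LayerB()))  # False
```

On a real model, calling `top_level_package(model)` and `top_level_package(layer)` for each entry in `model.layers` would show where each object comes from. If those differ from the package the quantization toolkit was built against, a type check like the one in the error message could plausibly fail.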

Example in Keras


import keras
from keras import layers
import numpy as np

# Define Sequential model with 3 layers
model = keras.Sequential(
    [
        layers.Dense(2, activation="relu", name="layer1"),
        layers.Dense(3, activation="relu", name="layer2"),
        layers.Dense(4, name="layer3"),
    ]
)

# Compile the model
model.compile(
    optimizer='adam',
    loss='mean_squared_error',
    metrics=['mean_absolute_error']  # accuracy is not meaningful for continuous targets
)

# Example data for training
x_train = np.random.random((100, 3)).astype('float32')
y_train = np.random.random((100, 4)).astype('float32')

# Train the model
model.fit(x_train, y_train, epochs=10, batch_size=32)

import tensorflow as tf

# Convert the model to TensorFlow Lite with post-training float16 quantization
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_model = converter.convert()

# Save the quantized model
with open("model_quantized_f16.tflite", "wb") as f:
    f.write(tflite_model)
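
To get a feel for what float16 quantization does to the weights, here is a small standard-library sketch. It is not how the TFLite converter is implemented; it only shows the effect of storing values in half precision. The weight values are made up for illustration, and `to_float16` is a hypothetical helper.

```python
import struct


def to_float16(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


# Hypothetical weight values, made up for illustration only.
weights = [0.1, -0.5, 0.333333, 1.0]

# Round every weight to half precision and measure the worst-case error.
quantized = [to_float16(w) for w in weights]
max_error = max(abs(w - q) for w, q in zip(weights, quantized))

# Storage needed for the same values as float32 versus float16.
bytes_float32 = len(struct.pack(f"<{len(weights)}f", *weights))
bytes_float16 = len(struct.pack(f"<{len(weights)}e", *weights))

print("quantized weights:", quantized)
print("max absolute rounding error:", max_error)
print("bytes as float32:", bytes_float32, "| bytes as float16:", bytes_float16)
```

The stored size is halved, while values that are not exactly representable in half precision (such as 0.1) pick up a small rounding error. Whether that error matters for a given model has to be checked by evaluating the converted model.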