
Is it possible to quantize the AnimeGANv3 model from FP32 to INT8 #58

Closed
zczjx opened this issue Sep 3, 2024 · 5 comments
Labels: enhancement (New feature or request)


zczjx commented Sep 3, 2024

Is it possible to quantize the AnimeGANv3 model from FP32 to INT8?

TachibanaYoshino (Owner) commented

Quantization is possible, but I haven't tried it yet.
For generative models, quantization has to preserve the visual quality of the output, which may be more demanding than for recognition models. Do you have any good suggestions?
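
One simple way to check how much quality is lost would be to run the same inputs through the FP32 model and the quantized model and compare the outputs with a pixel-level metric such as PSNR. A minimal sketch (the file names are hypothetical, i.e. outputs of the two models on the same test image):

import cv2
import numpy as np

def psnr(a, b):
    # Peak signal-to-noise ratio between two uint8 images of the same shape.
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)

# Hypothetical file names: the same test image pushed through the FP32 model
# and through the quantized model.
fp32_out = cv2.imread("out_fp32.png")
quant_out = cv2.imread("out_int8.png")
print(f"PSNR between FP32 and INT8 outputs: {psnr(fp32_out, quant_out):.2f} dB")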


zczjx commented Sep 3, 2024

I just want to know whether it is feasible, and then try it myself.

TachibanaYoshino added the enhancement (New feature or request) label on Sep 4, 2024
TachibanaYoshino (Owner) commented

import os
import cv2
import numpy as np
from tqdm import tqdm
# tensorflow version: 2.7.0
import tensorflow.compat.v1 as tf


def load_test_data(image_path, input_shape):
    img = cv2.imread(image_path)
    img = cv2.resize(img, (input_shape[1], input_shape[0]), interpolation = cv2.INTER_LINEAR)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = img.astype(np.float32)/127.5 - 1.0
    img = np.expand_dims(img, axis=0)
    return img

def pb_to_tflite_QUANTIZED_INT8(pb_file, input_shape, data_path):
    def representative_dataset_gen():
        files = [os.path.join(data_path, x) for x in os.listdir(data_path) ]
        for f in tqdm(files):
            img = load_test_data(f, input_shape)
            # Get sample input data as a numpy array in a method of your choosing.
            yield [img]

    converter = tf.lite.TFLiteConverter.from_frozen_graph(pb_file,
                                                                    input_arrays = ["AnimeGANv3_input"],  # The name of the model input node
                                                                    input_shapes = {'AnimeGANv3_input': [1, input_shape[0], input_shape[1], 3]},
                                                                    output_arrays = ["generator_1/main/out_layer"])  # The name of the model output node
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.target_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
    converter.allow_custom_ops = True
    # Ensure that if any ops can't be quantized, the converter throws an error
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.target_spec.supported_types = [tf.lite.constants.INT8]    # QUANTIZED_INT8
    converter.inference_input_type = tf.uint8
    converter.inference_output_type = tf.uint8
    converter.representative_dataset = representative_dataset_gen
    tflite_quant_model = converter.convert()
    open("QUANTIZED_int8.tflite", "wb").write(tflite_quant_model)

def pb_to_tflite_QUANTIZED_FLOAT16(pb_file, input_shape, data_path=None):
    converter = tf.lite.TFLiteConverter.from_frozen_graph(pb_file,
                                                                    input_arrays = ["AnimeGANv3_input"],  # The name of the model input node
                                                                    input_shapes = {'AnimeGANv3_input': [1, input_shape[0], input_shape[1], 3]},
                                                                    output_arrays = ["generator_1/main/out_layer"])  # The name of the model output node
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.target_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
    converter.allow_custom_ops = True
    converter.target_spec.supported_types = [tf.lite.constants.FLOAT16]   # QUANTIZED FLOAT16
    tflite_quant_model = converter.convert()
    open("QUANTIZED_fp16.tflite", "wb").write(tflite_quant_model)

if __name__ =="__main__":
    data_path = r"../data/imgs"  # At least 100 images
    pb_file = r'../AnimeGANv3_Hayao_36.pb'
    input_shape = [512, 512]
    pb_to_tflite_QUANTIZED_FLOAT16(pb_file, input_shape)
    pb_to_tflite_QUANTIZED_INT8(pb_file, input_shape, data_path)

As shown above, I wrote a script that converts the TensorFlow frozen .pb model into a quantized TFLite model. The supported quantization formats are INT8 and float16, and the resulting models can be deployed on mobile devices that support TFLite, such as Android phones.
Taking the style AnimeGANv3_Hayao_36.pb mentioned in the paper as an example, the quantized model files are as follows:
QUANTIZED_fp16.tflite
QUANTIZED_int8.tflite

The comparison results before and after quantization are as follows:
[image: side-by-side comparison of the outputs before and after quantization]

As can be seen, the output of the AnimeGANv3 model still maintains high visual quality after quantization, and the model file size is also greatly reduced.
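
For anyone who wants to sanity-check the converted file on a PC before deploying, here is a minimal sketch of running QUANTIZED_int8.tflite with the TFLite interpreter. The test image path and the mapping of the output from [-1, 1] back to [0, 255] are assumptions based on the preprocessing used in the conversion script above.

import cv2
import numpy as np
import tensorflow as tf

# Load the INT8 model produced by the conversion script above.
interpreter = tf.lite.Interpreter(model_path="QUANTIZED_int8.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

# The converter was configured for uint8 inference input, so feed raw RGB pixels.
img = cv2.imread("test.jpg")  # test image path is an assumption
img = cv2.resize(img, (512, 512), interpolation=cv2.INTER_LINEAR)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
interpreter.set_tensor(input_details["index"], np.expand_dims(img, 0).astype(np.uint8))
interpreter.invoke()

# The output is uint8: dequantize with the tensor's scale/zero-point,
# then map the model's [-1, 1] range back to [0, 255].
out = interpreter.get_tensor(output_details["index"])[0].astype(np.float32)
scale, zero_point = output_details["quantization"]
out = (out - zero_point) * scale
out = np.clip((out + 1.0) * 127.5, 0, 255).astype(np.uint8)
cv2.imwrite("out_int8.jpg", cv2.cvtColor(out, cv2.COLOR_RGB2BGR))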

TachibanaYoshino pinned this issue Sep 4, 2024

zczjx commented Sep 5, 2024

Awesome, thanks a lot.
Let me try to migrate it to RKNN on the RK3588 ARM Linux platform.


zczjx commented Sep 11, 2024

After porting the TFLite library to the RK3588, it at least works with CPU inference.
The RKNN model conversion failed, though; more adaptation work is needed to convert the TFLite model to the RKNN format.
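
For reference, this is roughly the conversion path being attempted with rknn-toolkit2 (a sketch only; the config parameters and file names are assumptions, and this is the step that currently fails and needs more adaptation work):

from rknn.api import RKNN

rknn = RKNN(verbose=True)

# The INT8 tflite model already expects raw uint8 RGB input,
# so no extra mean/std normalization is configured here (assumption).
rknn.config(target_platform="rk3588")

# Load the quantized tflite model produced by the script above.
ret = rknn.load_tflite(model="QUANTIZED_int8.tflite")
assert ret == 0, "load_tflite failed"

# The model is already quantized, so RKNN's own quantization is disabled.
ret = rknn.build(do_quantization=False)
assert ret == 0, "build failed"

ret = rknn.export_rknn("AnimeGANv3_Hayao_36.rknn")
assert ret == 0, "export_rknn failed"
rknn.release()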

zczjx closed this as completed Sep 11, 2024