I'm trying to switch a PaddleOCR model to ONNX for better performance. Normally the model takes just an image path or a base64 string and returns results, with preprocessing handled internally. The exported ONNX model, however, expects an extra dimension on the input. Since an image is 3D (height, width, channels), I'm confused about why it asks for a 4D tensor and how to handle that in preprocessing.
import onnxruntime as rt

ort_session = rt.InferenceSession('model.onnx')
so = rt.SessionOptions()
print(f"ort_session.get_inputs()[0].shape: {ort_session.get_inputs()[0].shape}")
Result:
ort_session.get_inputs()[0].shape: ['p2o.DynamicDimension.0', 3, '?', 'p2o.DynamicDimension.1']
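Is the extra dimension just a batch axis, i.e. an NCHW layout? If so, would a correctly shaped dummy input look something like the following? (The 640x640 spatial size is only a placeholder; I don't know what the exported model actually expects.)

import numpy as np
import onnxruntime as rt

ort_session = rt.InferenceSession('model.onnx')
# dummy NCHW input: batch of 1, 3 channels, 640x640 spatial size (assumed, not confirmed)
dummy = np.random.rand(1, 3, 640, 640).astype(np.float32)
outputs = ort_session.run(None, {ort_session.get_inputs()[0].name: dummy})
print(outputs[0].shape)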
I tried a simple dimension expansion, but that just made onnxruntime peg the CPU and freeze my entire computer. This is my preprocessing:
import cv2
import numpy as np

def preprocess_image(image_path):
    # read the raw bytes and decode them into a BGR image
    with open(image_path, 'rb') as file:
        image_data = file.read()
    image_bytes = np.frombuffer(image_data, dtype=np.uint8)
    image = cv2.imdecode(image_bytes, cv2.IMREAD_COLOR)
    print(image.shape)  # (H, W, 3)
    # scale to [0, 1]
    image_array = image.astype(np.float32) / 255.0
    # add a leading dimension, then reorder axes: (1, H, W, 3) -> (H, 3, 1, W)
    image_array = np.expand_dims(image_array, axis=0)
    image_array = np.transpose(image_array, (1, 3, 0, 2))
    return image_array
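For comparison, here is what I suspect the preprocessing is supposed to look like if the layout really is (batch, channel, height, width). The fixed resize target and the simple /255 normalization are assumptions on my part; I haven't verified them against what PaddleOCR does internally:

def preprocess_image_nchw(image_path, target_size=(960, 960)):
    # decode as BGR, same as before
    image = cv2.imread(image_path, cv2.IMREAD_COLOR)
    # resize to a fixed size -- 960x960 is a placeholder, not the confirmed model input size
    image = cv2.resize(image, target_size)
    image_array = image.astype(np.float32) / 255.0
    # HWC -> CHW, then add a leading batch dimension: (1, 3, H, W)
    image_array = np.transpose(image_array, (2, 0, 1))
    return np.expand_dims(image_array, axis=0)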
Execution code:
input_data = preprocess_image(img_path)
ort_outputs = ort_session.run(None, {ort_session.get_inputs()[0].name: input_data.astype(np.float32)})[0]
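In case the runaway CPU usage is related: I also noticed I create SessionOptions but never pass it to the session. Should I be doing something like this to cap the thread count (the value of 4 is arbitrary)?

so = rt.SessionOptions()
so.intra_op_num_threads = 4  # arbitrary cap, just to keep it from taking every core
ort_session = rt.InferenceSession('model.onnx', sess_options=so)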
I'm well aware the problem might be a compatibility issue between ONNX and PaddleOCR. I've tried the conversion tools I know of, paddle2onnx and paddleocr-convert, but they didn't work.