I have a set of medical forms that may or may not contain a 2D DataMatrix in a corner of the page. I need to detect whether the 2D DataMatrix is present. For now, it's not necessary to read the content of the barcode. I've looked at different libraries, but I can't find one with OCR or anything else that can detect the presence of the 2D DataMatrix. I need to do this with Python.

Attached is an example medical form where a 2D DataMatrix is located at the bottom right of the page. In this case, the algorithm should return "True", since the DataMatrix exists on the page.

PS: I've tested AWS Textract and it does not detect the DataMatrix.

Medical form example

  • This looks relevant: stackoverflow.com/questions/67922660/… Commented Oct 15, 2024 at 20:47
  • Thanks for the comment. I tested and it's not working due to a dependency error. Commented Oct 16, 2024 at 19:21
  • I'd retry that after installing pypi.org/project/frontend, or download the files directly from that website if there are version compatibility issues. Commented Oct 16, 2024 at 23:40

2 Answers

Here are the steps to detect the 2D DataMatrix:

  1. Install dynamsoft-capture-vision-bundle and opencv-python:

    pip install dynamsoft-capture-vision-bundle opencv-python
    
  2. Create a Python script as follows, replacing test.png and LICENSE-KEY with your own values. A license key can be obtained from Dynamsoft.

    import sys

    import cv2
    import numpy as np
    from dynamsoft_capture_vision_bundle import *

    image_path = 'test.png'

    # Activate the SDK; a cached license is also accepted.
    error_code, error_message = LicenseManager.init_license("LICENSE-KEY")
    if (error_code != EnumErrorCode.EC_OK
            and error_code != EnumErrorCode.EC_LICENSE_CACHE_USED):
        print("License initialization failed: ErrorCode:",
              error_code, ", ErrorString:", error_message)
        sys.exit()
    
    # Run the barcode-reading preset on the same image that is drawn on later.
    cvr_instance = CaptureVisionRouter()
    result = cvr_instance.capture(
        image_path, EnumPresetTemplate.PT_READ_BARCODES.value)
    if result.get_error_code() != EnumErrorCode.EC_OK:
        print("Error:", result.get_error_code(),
           result.get_error_string())
    
        sys.exit()
    
    cv_image = cv2.imread(image_path)
    
    items = result.get_items()
    print('Found {} barcodes.'.format(len(items)))
    for item in items:
        format_type = item.get_format()
        text = item.get_text()
        print("Barcode Format:", format_type)
        print("Barcode Text:", text)
    
        location = item.get_location()
        x1 = location.points[0].x
        y1 = location.points[0].y
        x2 = location.points[1].x
        y2 = location.points[1].y
        x3 = location.points[2].x
        y3 = location.points[2].y
        x4 = location.points[3].x
        y4 = location.points[3].y
        del location
    
        cv2.drawContours(
         cv_image, [np.intp([(x1, y1), (x2, y2), (x3, y3), (x4, y4)])], 0, (0, 255, 0), 2)
    
        cv2.putText(cv_image, text, (x1, y1 - 10),
                 cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2)
    
    max_width = 1600
    max_height = 1000
    
    original_height, original_width = cv_image.shape[:2]
    
    scaling_factor = min(
     max_width / original_width, max_height / original_height)
    
    new_width = int(original_width * scaling_factor)
    new_height = int(original_height * scaling_factor)
    
    resized_image = cv2.resize(
     cv_image, (new_width, new_height), interpolation=cv2.INTER_AREA)
    
    cv2.imshow(
     "Original Image with Detected Barcodes", resized_image)
    
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    
    
  3. Run the script:

    Found 1 barcodes.
    Barcode Format: 134217728
    Barcode Text: Edu-barcode-45
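
Since the question only asks for a True/False answer, the per-item format values could be reduced to a single boolean. The following is a minimal sketch, not part of the script above: `has_datamatrix` is a hypothetical helper name, and the constant is simply the format value printed in the sample output, which is assumed here to identify DataMatrix in the installed SDK version.

```python
# Format value printed for the sample's barcode in the output above.
# Assumption: this value identifies DataMatrix in the installed SDK version.
DATAMATRIX_FORMAT = 134217728


def has_datamatrix(format_values):
    """Return True if any detected format value equals the DataMatrix value."""
    return any(int(value) == DATAMATRIX_FORMAT for value in format_values)


# In the script above this could be called as (not executed here):
#   has_datamatrix(item.get_format() for item in result.get_items())
print(has_datamatrix([134217728]))
print(has_datamatrix([]))
```

With a helper like this, the drawing and display code is optional; only the license setup and the capture call are needed to answer whether the page contains a DataMatrix.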
    



More recent copies of those forms generally do NOT include a matrix when issued. However, when one is present it can be detected even if, as in the sample, it is not clear what the data is.

Normally a PDF does not use "DPI", and each square would be as sharp as a vector image. However, when the page is reprinted as an image a DPI is required, and that is when anti-aliasing causes issues.

The code should be rendered at roughly the equivalent of 4 or more pixels per square, although one pixel per square will do.
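
As a rough illustration of that rule of thumb, the pixels available per square can be estimated from the width of the cropped symbol and the number of squares along one side. This is only a sketch: the function names are hypothetical, and the 48 px and 12-square figures are illustrative values, not measurements taken from the sample form.

```python
def pixels_per_square(symbol_width_px, squares_per_side):
    """Estimate how many image pixels cover one square of the symbol."""
    return symbol_width_px / squares_per_side


def is_comfortably_sampled(symbol_width_px, squares_per_side, minimum=4.0):
    """Check the estimate against the 4-pixels-per-square rule of thumb."""
    return pixels_per_square(symbol_width_px, squares_per_side) >= minimum


# Illustrative values only: a 48 px wide crop of a 12 x 12 symbol.
print(pixels_per_square(48, 12))
print(is_comfortably_sampled(48, 12))
```

If the estimate falls well below the threshold, rendering the page at a higher DPI before detection is the obvious thing to try.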


Python has PyMuPDF; I don't know how well its barcode support works, but running MuPDF's mutool on the example page verifies that there is a DataMatrix and shows its decoded content.

Call it from the shell:

    mutool barcode -d MedicalForm.png

Response:

    datamatrix: Edu-barcode-45

Thus the presence of the DataMatrix is verified easily with a single command.
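
To get a True/False result in Python, that command could be wrapped with subprocess. This is a minimal sketch under a few assumptions: mutool is on the PATH, its output always uses the `datamatrix:` prefix seen in the single response above, and the result is printed on stdout or stderr. Both function names are hypothetical.

```python
import subprocess


def output_reports_datamatrix(output):
    """Return True if any output line starts with 'datamatrix:'."""
    return any(
        line.strip().lower().startswith("datamatrix:")
        for line in output.splitlines()
    )


def page_has_datamatrix(image_path):
    """Run `mutool barcode -d` on the image and inspect what it prints."""
    completed = subprocess.run(
        ["mutool", "barcode", "-d", image_path],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit is treated as "nothing found"
    )
    # Assumption: the result line may appear on either stream.
    return output_reports_datamatrix(completed.stdout + completed.stderr)


# Example call (requires mutool and the image file, so it is not run here):
#   page_has_datamatrix("MedicalForm.png")
```

If mutool is not installed, `subprocess.run` raises `FileNotFoundError`, which the caller may want to catch and report separately from a page that has no barcode.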

If you need to regenerate it as a crisp, clean image, use:

    mutool barcode -c -o Sample.png -F datamatrix -t -q "Edu-barcode-45"

However, do not be surprised if a different compact format is used; it is still the same type and data.

Both source and replacement are identical in type and data. They are just based on a different Modular Value.
