I am trying to train an object detection model on the CityPersons dataset, following this tutorial: https://neptune.ai/blog/how-to-train-your-own-object-detector-using-tensorflow-object-detection-api

I run the following command:

python model_main_tf2.py --pipeline_config_path=models/MaskCNN/v1/pipeline.config --model_dir=models/MaskCNN/v1/ --checkpoint_every_n=4 --num_workers=2 --alsologtostderr

I get the following error:

File "/Users/Desktop/TensorFlow/tf2_api_env/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 7215, in raise_from_not_ok_status
raise core._status_to_exception(e) from None  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: {{function_node __wrapped__IteratorGetNext_output_types_18_device_/job:localhost/replica:0/task:0/device:CPU:0}} indices[0] = 0 is not in [0, 0)
     [[{{node GatherV2_7}}]]
     [[MultiDeviceIteratorGetNextFromShard]]
     [[RemoteCall]] [Op:IteratorGetNext]

I am fairly new to this and can't figure out the problem. Any help would be greatly appreciated. Thanks.

1 Answer

I hit the same error when training an EfficientDet object detection model on a dataset exported from Roboflow as TFRecord files. Here's how I solved it:

In my case, this error meant that the generated TFRecord files were corrupted. Use the following script to check the state of your TFRecord files:

import tensorflow as tf

def is_tfrecord_corrupted(tfrecord_file):
    try:
        for record in tf.data.TFRecordDataset(tfrecord_file):
            # Attempt to parse the record
            _ = tf.train.Example.FromString(record.numpy())
    except tf.errors.DataLossError as e:
        print(f"DataLossError encountered: {e}")
        return True
    except Exception as e:
        print(f"An error occurred: {e}")
        return True
    return False

# Replace with your TFRecord file paths 
tfrecord_files = ['your_test_record_fname', 'your_train_record_fname']

for tfrecord_file in tfrecord_files:
    if is_tfrecord_corrupted(tfrecord_file):
        print(f"The TFRecord file {tfrecord_file} is corrupted.")
    else:
        print(f"The TFRecord file {tfrecord_file} is fine.")
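If you also want a record count, or want to check a file without loading TensorFlow, here is a rough sketch that walks the TFRecord framing with the standard library only. It relies on the documented on-disk layout (a little-endian uint64 length, a 4-byte CRC of the length, the payload, and a 4-byte CRC of the payload). The function name count_tfrecord_frames is my own, and the sketch deliberately does not verify the CRCs, so it only catches truncated or misframed files, not flipped bytes inside a record.

```python
import os
import struct


def count_tfrecord_frames(path):
    """Count complete records in a TFRecord file by walking its framing.

    Raises ValueError if the file ends in the middle of a record.
    NOTE: the CRC fields are skipped, not verified (assumption of this sketch).
    """
    size = os.path.getsize(path)
    count = 0
    pos = 0
    with open(path, "rb") as f:
        while pos < size:
            # Header: uint64 payload length + uint32 CRC of that length.
            if size - pos < 12:
                raise ValueError(f"truncated header after {count} records")
            f.seek(pos)
            (length,) = struct.unpack("<Q", f.read(8))
            # Full frame: 12-byte header + payload + uint32 CRC of the payload.
            end = pos + 12 + length + 4
            if end > size:
                raise ValueError(f"truncated payload after {count} records")
            pos = end
            count += 1
    return count
```

Calling count_tfrecord_frames('your_train_record_fname') returns the number of records. A result of 0 would mean the file contains no examples at all, which is worth ruling out before training.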

To fix the corrupted TFRecord files, I exported the dataset in Pascal VOC format and wrote a script, hosted on GitHub, that generates new TFRecord files from the Pascal VOC annotations.

Script to generate new tfrecords is here: https://github.com/arrafi-musabbir/license-plate-detection-recognition/blob/main/generate_tfrecord.py

  • Create your own label_map.pbtxt according to your dataset:
label_path = "your label_map.pbtxt path"

# modify according to your dataset class names
labels = [{'name':'license', 'id':1}]

with open(label_path, 'w') as f:
    for label in labels:
        f.write('item { \n')
        f.write('\tname:\'{}\'\n'.format(label['name']))
        f.write('\tid:{}\n'.format(label['id']))
        f.write('}\n')
  • Run the script as follows:
python generate_tfrecord.py -x {train_dir_path} -l {labelmap_path} -o {new_train_record_path}
python generate_tfrecord.py -x {valid_dir_path} -l {labelmap_path} -o {new_valid_record_path}
python generate_tfrecord.py -x {test_dir_path} -l {labelmap_path} -o {new_test_record_path}
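Before regenerating the records, it can help to read the label map back and sanity-check it. The sketch below is only an illustration: parse_label_map and check_label_map are hypothetical helper names, and the regex parser assumes the simple item { name / id } layout written by the snippet above rather than the full protobuf text format. It flags missing items, duplicate ids, and ids below 1 (the Object Detection API reserves id 0 for the background class).

```python
import re


def parse_label_map(text):
    """Return {name: id} from label map text in the simple item { ... } layout."""
    labels = {}
    for body in re.findall(r"item\s*\{(.*?)\}", text, flags=re.DOTALL):
        name = re.search(r"\bname\s*:\s*['\"]([^'\"]+)['\"]", body)
        id_ = re.search(r"\bid\s*:\s*(\d+)", body)
        if not name or not id_:
            raise ValueError(f"item is missing a name or id: {body!r}")
        labels[name.group(1)] = int(id_.group(1))
    return labels


def check_label_map(labels):
    """Return a list of human-readable problems (empty list if none found)."""
    problems = []
    ids = list(labels.values())
    if not ids:
        problems.append("no items found")
    if len(set(ids)) != len(ids):
        problems.append("duplicate ids")
    if any(i < 1 for i in ids):
        problems.append("ids should start at 1 (0 is reserved for background)")
    return problems
```

For example, check_label_map(parse_label_map(open(label_path).read())) should return an empty list for the single-class map above. The class names should also match the names used in your Pascal VOC annotation files.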

Afterwards, run is_tfrecord_corrupted(tfrecord_file) again on the new files; they should now be reported as fine.

2 Comments

Thanks for the answer. I don't think my issue is the same though. I ran your script to check the state of the original TfRecord files and it outputs that those are fine.
Oh, my issue was with corrupted TFRecord files that I had originally exported. Anyway, let me know if you find out anything about your issue.
