Generated image IDs are non-unique #3

msnidal · 2020-03-04T19:24:05Z

Firstly, thanks for creating this script, it was a great help to me.

When I first ran it, it worked almost perfectly, but with one problem - the COCO format image IDs were all over the place, many non-unique (many 0s for example) which breaks the COCO format. I saw how you're generating them as a function of the filename, and given the image IDs have no VOC equivalent, I think it would make more sense to do a strict ordering per image.

I did a hacky solution for now. I'm leaving this issue open so I can come back to it later and open a PR with a fix. If anybody else is having this problem, look for img_id and try incrementing it manually for the moment.

yukkyo · 2020-03-07T07:04:03Z

@msnidal
Thanks for sharing!

I will look into this problem as well.
I would be grateful if you could share steps or data to reproduce it.

davidhuangal · 2020-04-15T07:45:07Z

@msnidal can you please show us your hacky solution?

amitkumar-delhivery · 2020-06-05T10:52:08Z

@davidhuangal, this code assumes that each file name contains a single serial integer, like image1, image2, image3 or any_name1, any_name2, ... So if you have files like a_1.jpg and b_1.jpg, the regex used in the code assigns them the same ID. To solve it, you can build the IDs like this:

img_id_dict = {}
for filename in filename_list:
    # Key: file name up to the first dot; value: the next unused ID, starting at 1
    img_id_dict[filename.split(".")[0]] = len(img_id_dict) + 1

Then replace the regex line with a lookup:

    if extract_num_from_imgid and isinstance(img_id, str):
        img_id = img_id_dict[img_id]

dinis-rodrigues · 2020-06-30T00:46:00Z

Yeah having the same issue.
My images are named like (example):

480_0_36.png
480_0_37.png
...
499_0_5.png
499_0_6.png

And for each filename ("X_Y_Z.png") it assumes the id is always X.
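For anyone who wants to reproduce this, here is a minimal sketch. It assumes the ID is the first run of digits in the file name, as in the `re.findall(r'\d+', img_id)[0]` line quoted elsewhere in this thread. `first_number_id` is a hypothetical helper written for this example, not a function from the script.

```python
import os
import re
from collections import Counter


def first_number_id(filename):
    """Return the first run of digits in the file stem as an int (hypothetical helper)."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    return int(re.findall(r'\d+', stem)[0])


filenames = ["480_0_36.png", "480_0_37.png", "499_0_5.png", "499_0_6.png"]
ids = [first_number_id(name) for name in filenames]

# Count how often each ID occurs and keep only the repeated ones
duplicates = {img_id: n for img_id, n in Counter(ids).items() if n > 1}

print(ids)         # [480, 480, 499, 499]
print(duplicates)  # {480: 2, 499: 2}
```

Only the X part of "X_Y_Z.png" survives, so every image sharing that prefix ends up with the same ID.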

AntonioNuAc · 2020-07-03T11:33:55Z

Is there any solution for this?
Does it also occur when using 'annotation paths list'?

SubramanianKrish · 2020-08-07T23:44:21Z

Yeah, I'm seeing the same here. My test image IDs are J073-xxxxxxxxxx. This fix works:

95: for img_id, a_path in enumerate(tqdm(annotation_paths)):
102: img_info['id'] = img_id
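A self-contained sketch of the same idea, assuming one annotation file per image. `assign_image_ids` and the sample paths are hypothetical. They only illustrate using the position in the list as the ID, and sorting first keeps the IDs stable between runs.

```python
def assign_image_ids(annotation_paths, start=0):
    """Map each annotation path to a unique integer ID based on its sorted position."""
    return {path: img_id
            for img_id, path in enumerate(sorted(annotation_paths), start=start)}


# Hypothetical paths: the numbers in the names collide, but the positions do not.
annotation_paths = ["ann/499_0_5.xml", "ann/480_0_37.xml", "ann/480_0_36.xml"]
image_ids = assign_image_ids(annotation_paths)

for path, img_id in image_ids.items():
    print(img_id, path)
```

Whatever ID is stored in an image entry also has to be written to the `image_id` field of that image's annotations, otherwise the boxes no longer point at the right image.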

karen-gishyan · 2020-08-13T19:38:14Z

I see that the issue is still open, and I ran into it as well. Here is a simple solution that seems to do the job: a global counter generates unique IDs.

```python
import os

count = 0  # module-level counter shared across calls


def get_image_info(annotation_root, extract_num_from_imgid=True):
    global count
    path = annotation_root.findtext('path')
    if path is None:
        filename = annotation_root.findtext('filename')
    else:
        filename = os.path.basename(path)
    img_name = os.path.basename(filename)
    # Use the running count as the ID instead of a number parsed from the file name
    img_id = count
    count += 1

    # if extract_num_from_imgid and isinstance(img_id, str):
    #     img_id = int(re.findall(r'\d+', img_id)[0])

    # (the rest of the function is not shown here)
```

If you guys encounter another issue, let me know so we can take a look.
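To confirm that a fix worked, the converted output can be checked for repeated IDs. This is a minimal sketch, assuming the usual COCO layout with top-level `images` and `annotations` lists. `find_id_problems` is a hypothetical helper and the sample dict is made up for illustration.

```python
from collections import Counter


def find_id_problems(coco):
    """Return (duplicate image IDs, annotation image_ids with no matching image)."""
    counts = Counter(img["id"] for img in coco["images"])
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    known = set(counts)
    orphans = sorted({a["image_id"] for a in coco["annotations"]} - known)
    return duplicates, orphans


# Made-up sample: two images share id 0, and one annotation points at a missing image.
coco = {
    "images": [
        {"id": 0, "file_name": "a_1.jpg"},
        {"id": 0, "file_name": "b_1.jpg"},
        {"id": 2, "file_name": "c_3.jpg"},
    ],
    "annotations": [{"id": 1, "image_id": 0}, {"id": 2, "image_id": 7}],
}
print(find_id_problems(coco))  # ([0], [7])
```

For a file on disk, load it with `json.load` first and pass the resulting dict. Two empty lists mean every image ID is unique and every annotation points at an existing image.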

XudongWang97 · 2020-09-03T08:10:06Z

I have the same issue here. I fixed this issue in my forked repository.
