
Occlusion with multiple objects of the same type? #34

Open
pgaston1 opened this issue Sep 27, 2021 · 11 comments

Comments


pgaston1 commented Sep 27, 2021

I currently have linemod working well detecting one of my custom objects. Thank you!

I now would like to extend this to support multiple objects of the same type using the Occlusion model. However, your approach seems hard-coded to one model type per image, i.e., one 'drill' per image. This manifests itself in the creation of the mask_all/merged_masks images - I need to associate each masked area with a specific object instance.

How would you recommend I implement multiple objects per type? (The obvious idea is to create new 'pseudo types', e.g., drill1, drill2, drill3 - but that fails because the objects are identical and would conflict.)

Here's an example of a single pallet being found in a simulated warehouse, followed by an example of what I would like to detect: the pose of each pallet in a stack of 5 pallets.

[image "10": single detected pallet]

and

[image "stack5": stack of five pallets]

@alishan2040

Dear @pgaston1, can you please tell me how you generated the dataset for your custom object? I'm working on a similar problem but facing issues generating the dataset in the required Linemod format. I'm particularly interested in how to calculate object parameters such as diameter, min_x, min_y, min_z, size_x, size_y, size_z from a .ply file, for example. I've tried using MeshLab/Blender to set the object origin to (0, 0, 0) to get these params, but they seem incorrect. I would like to hear your approach on this.

Any sort of help is appreciated.
Thanks
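For reference, the BOP-style model_info fields asked about here can be derived directly from the mesh vertices. The following is a minimal sketch (not code from this repo, and not anyone's posted solution), assuming the .ply is already centered at the object origin and exported in millimeters:

```python
# Sketch: derive BOP-style model_info fields from a mesh's vertex array.
# Assumes vertices are in millimeters and the mesh origin is already set.
import numpy as np
from scipy.spatial import ConvexHull
from scipy.spatial.distance import pdist


def model_info_from_vertices(vertices):
    """vertices: (N, 3) float array, e.g. loaded from a .ply file."""
    mins = vertices.min(axis=0)            # min_x, min_y, min_z
    sizes = vertices.max(axis=0) - mins    # size_x, size_y, size_z
    # The diameter is the largest distance between any two model points;
    # restricting pdist to convex-hull vertices keeps it tractable.
    hull_points = vertices[ConvexHull(vertices).vertices]
    diameter = pdist(hull_points).max()
    return {
        "diameter": float(diameter),
        "min_x": float(mins[0]), "min_y": float(mins[1]), "min_z": float(mins[2]),
        "size_x": float(sizes[0]), "size_y": float(sizes[1]), "size_z": float(sizes[2]),
    }
```

So min_x/y/z is the lowest corner of the axis-aligned 3D bounding box, size_x/y/z is its extent, and the diameter is the maximum pairwise point distance - which is why values set by eye in MeshLab/Blender can look wrong if the origin or unit scale differs.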

pgaston1 (Author) commented Oct 3, 2021 via email


alishan2040 commented Oct 5, 2021

Hi @pgaston1, thanks for sharing the code and explanation - very helpful!
Can you please tell me what the content of the 0.npy file in the getBaseInfo() method should be? How did you create this file?
We generally don't have such files in the EfficientPose format.
Thanks again!

pgaston1 (Author) commented Oct 5, 2021 via email

ybkscht (Owner) commented Oct 5, 2021

Hi @pgaston1,

very nice results so far, thanks for sharing them!

You are right that the currently provided Linemod and Occlusion generators do not support multiple instances of the same object category.
The simple reason is that these datasets do not contain multiple instances per image.
Nevertheless, EfficientPose itself is fully capable of handling multiple instances of the same object category.
The problem is only how the dataset is stored and how the generator is implemented.

I personally would recommend the following solution:

  • Create an instance segmentation mask in which each object instance gets a unique value (possibly also RGB).
  • Add this unique id to the corresponding object in gt.yml so that every annotated object gets its own mask value.
  • Add an additional mask_values array to the annotations dictionary (see the convert_gt function in generators/occlusion.py) and fill in the unique mask values for every object when loading the dataset.
  • Instead of using the mask values from the hardcoded name_to_mask_value dictionary in the generator, change those code lines to use the mask values of the annotations dict. Also check generators/common.py.

Then you should be fine with multiple object categories as well as multiple instances per object category.

Sincerely,
Yannick
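As an illustration of the last two bullet points above, the per-instance mask lookup could be sketched roughly like this (the function name and the `mask_values` key follow Yannick's description; this is not the actual EfficientPose generator code):

```python
# Sketch: split an instance-segmentation mask into one binary mask per
# annotated object, matched by its unique mask value from gt.yml.
# Names are illustrative, not the real generator API.
import numpy as np


def masks_for_annotations(segmentation_mask, annotations):
    """segmentation_mask: (H, W) integer array where each instance was
    rendered with its own unique value (0 = background).
    annotations: dict with a "mask_values" array parallel to the other
    per-object arrays (labels, bboxes, rotations, translations, ...).
    """
    instance_masks = []
    for mask_value in annotations["mask_values"]:
        instance_masks.append((segmentation_mask == mask_value).astype(np.uint8))
    return instance_masks
```

Because each mask value is stored per annotated object rather than per category, two identical drills (or pallets) in one image keep separate masks and separate pose targets - no pseudo-types needed.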

@nikonnext

Hi @pgaston1, I'm wondering how many images you needed to train on a single object, and whether you ran into any overfitting. I am training on 10k synthetic images with heavily loaded scenes, but it seems that this is not enough for good results.

Did you manage to implement the MiSo or MiMo task for this model? I have just started working on a similar task - can you give me any hints or pointers?

@satpalsr

Hey @pgaston1, how did it turn out? Were you able to make it identify multiple instances of the same object?

pgaston1 (Author) commented Jul 18, 2022 via email

@satpalsr

Thanks @pgaston1 , I'll try it out.

@monajalal

@alishan2040 were you able to figure out how to calculate min_x, min_y, min_z, size_x, size_y, size_z for a given object?

For example,
"1": {"diameter": 102.099, "min_x": -37.9343, "min_y": -38.7996, "min_z": -45.8845, "size_x": 75.8686, "size_y": 77.5992, "size_z": 91.769},
is object 1 of the Linemod dataset in the BOP file ./lm/models/models_info.json.

I have some guesses about min_x/y/z - I assume it is the lowest corner of the 3D bbox - but I'm not sure how to calculate it.

What about size_x? Is there an easy or automated way to do these calculations?

@monajalal

I figured it out - here is my answer: https://gist.github.com/monajalal/ca0eb02bae787bc556a2b17656c7e58e
