Face Rectangle size #76

Open
mousomer opened this issue Jan 25, 2017 · 11 comments

@mousomer

I am trying to evaluate the CLandmark models with different face detectors, which scale their face rectangles differently. For example, one may detect a face at [200 100 60 60] pixels and another at [190 90 80 80].
Should it make a difference which face rectangle I send over to the JOINTMV detectors?
Should I retrain the models for different face detectors?

@uricamic
Owner

uricamic commented Feb 5, 2017

Hi @mousomer,

the input image is rescaled internally to a fixed resolution, the so-called normalized frame. If the face is smaller than this (the precise size is defined by the <bw_width> and <bw_height> tags in the .xml models), the detection will be "more precise"; when the face is bigger, a systematic error is introduced by scaling the image down to this fixed resolution.

In short, if the images are bigger, retraining with a bigger normalized frame would increase the precision. However, the bigger the normalized frame, the slower the detection (and therefore also the training of the model) would be.

If you have any further questions, please do not hesitate to ask, either here or by email.
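
To make the scaling effect concrete, here is a minimal Python sketch (not CLandmark code). It assumes the face box is mapped onto the normalized frame by a plain per-axis resize. The function name `downscale_factor` and the example sizes are hypothetical; the real frame size should be read from the model's .xml file.

```python
def downscale_factor(face_w, face_h, frame_w, frame_h):
    """Return how many input pixels map onto one normalized-frame pixel.

    A value <= 1 means the face is upscaled (or unchanged), so no detail
    is lost; a value > 1 means the face is shrunk.
    Assumption: a plain per-axis resize of the face box to the frame.
    """
    if min(face_w, face_h, frame_w, frame_h) <= 0:
        raise ValueError("all sizes must be positive")
    return max(face_w / frame_w, face_h / frame_h)


# Hypothetical sizes: a 40x40 normalized frame and two detected faces.
small = downscale_factor(30, 30, 40, 40)    # face smaller than the frame
large = downscale_factor(200, 200, 40, 40)  # face larger than the frame
print(small, large)
```

Under this assumption, a factor of 5 means that a one-pixel landmark error in the normalized frame corresponds to roughly five pixels in the original image.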

@mousomer
Author

mousomer commented Feb 5, 2017

I see, thanks. From reviewing the code I had the impression that the NormalizedFrame was a constant size (per model type). Was I wrong?

@uricamic
Owner

uricamic commented Feb 5, 2017

Hi @mousomer,

yes, it is a constant size per model type. But the input image is always rescaled to this size. So it can detect landmarks on faces of "arbitrary" size; however, the detection precision is, among other things, also influenced by the normalized frame size.

@mousomer
Author

mousomer commented Feb 5, 2017

So there is an optimal face size per model?

@uricamic
Owner

uricamic commented Feb 5, 2017

Yep, we could call faces that are the same size as (or smaller than) the model's normalized frame optimal, because there is no precision loss due to downscaling.

However, it is definitely not necessary to have very large normalized frames. Look, for example, at the results of CLandmark in the 300-W and 300-VW challenges, where the face size per example was very big. Our solution, C2F-DPM, used a normalized frame of 80 x 80 px for the coarse detector and 160 x 160 px for the fine one.

@mousomer
Author

mousomer commented Feb 5, 2017

Thanks.
Well, the problem I'm having is with the joint MV models (profiles and half-profiles). Suppose the vertical distance eyes-to-mouth is 100 pixels. What box should I send over to detect_optimized?

@uricamic
Owner

uricamic commented Feb 5, 2017

I would first go with the detected face size and check the results; only if they were not satisfactory would I start thinking about retraining the model.

The learning scripts for the jointmv model are very time-consuming. I have some unpublished improvements that reduce the training time from 2 weeks to 2 days for the current model, but those will require some time before being published. Both variants are also quite heavy on memory (around 20 GB of RAM is needed).

@mousomer
Author

mousomer commented Feb 5, 2017

Ah, but I'm trying to work with third-party detectors. I guess I could run the OpenCV cascade first and gather statistics from there.
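
One possible way to gather such statistics is sketched below in plain Python. It is only an illustration, not part of CLandmark or OpenCV: the helpers `box_mapping` and `convert_box` are hypothetical, boxes are assumed to be in [x y w h] order, and the two detectors are assumed to have been run on the same faces beforehand.

```python
from statistics import median


def box_mapping(pairs):
    """Estimate how detector A boxes relate to detector B boxes.

    pairs: list of ((x, y, w, h), (x, y, w, h)) for the same face,
    first from detector A, then from detector B.
    Returns the median width ratio B/A and the median center offsets,
    expressed as fractions of A's width and height.
    """
    scales, dxs, dys = [], [], []
    for (ax, ay, aw, ah), (bx, by, bw, bh) in pairs:
        scales.append(bw / aw)
        dxs.append(((bx + bw / 2) - (ax + aw / 2)) / aw)
        dys.append(((by + bh / 2) - (ay + ah / 2)) / ah)
    return median(scales), median(dxs), median(dys)


def convert_box(box, scale, dx, dy):
    """Map a detector A box to the size and position detector B would give."""
    x, y, w, h = box
    cx = x + w / 2 + dx * w   # shifted center, x
    cy = y + h / 2 + dy * h   # shifted center, y
    nw, nh = w * scale, h * scale
    return (cx - nw / 2, cy - nh / 2, nw, nh)


# The two boxes quoted at the top of this thread, treated as one pair.
pairs = [((200, 100, 60, 60), (190, 90, 80, 80))]
scale, dx, dy = box_mapping(pairs)
converted = convert_box((200, 100, 60, 60), scale, dx, dy)
print(scale, dx, dy, converted)
```

With many pairs, the medians give a rough, outlier-tolerant estimate of how one detector's rectangles should be rescaled and shifted before being passed to the landmark detector.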

@uricamic
Owner

uricamic commented Feb 5, 2017

Yeah, I haven't tried OpenCV cascades for profiles myself yet, but it should surely be possible.

@mousomer
Author

mousomer commented Feb 5, 2017

That's not what you're using for [pre-model] detection? (I was assuming that's the right thing to do because that's what you use in the static_input.cpp example.)

@uricamic
Owner

uricamic commented Feb 5, 2017

Nope, I was using a commercial face detector (http://www.eyedea.cz/) for the development of the landmark detector. It provides square face boxes for faces at arbitrary yaw angles.
