
Multi-view face landmark extraction #38

Open
nitheeshas opened this issue Feb 4, 2016 · 28 comments

nitheeshas commented Feb 4, 2016

The output screenshots seem impressive, especially the multi-view landmark extraction, but I can't figure out how to run it from the code you have provided. Please help.

uricamic (Owner) commented Feb 4, 2016

Hi @nitheeshas,

do you want a C++, MATLAB, or Python example?

nitheeshas (Author) commented

Thanks for replying. I would like to see an example in C++.

uricamic (Owner) commented Feb 4, 2016

Ok, I will skip the face detector part, since I now have code using only a commercial one. However, it is possible to use a combination of OpenCV Haar cascades for frontal and profile faces, as sketched below.
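For illustration, here is a minimal sketch of such a combination (the cascade file names and the bbox packing as [min_x, min_y, max_x, max_y], per #20, are assumptions; this code is not part of clandmark):

#include <opencv2/objdetect.hpp>
#include <vector>

// Try the frontal cascade first, then the profile one (for right-facing
// profiles, you would additionally run the profile cascade on a flipped image).
// On success, fills bbox = [min_x, min_y, max_x, max_y] (format assumed from #20).
bool detectFaceBbox(const cv::Mat &gray, int *bbox)
{
    static cv::CascadeClassifier frontal("haarcascade_frontalface_alt.xml");
    static cv::CascadeClassifier profile("haarcascade_profileface.xml");

    std::vector<cv::Rect> faces;
    frontal.detectMultiScale(gray, faces);
    if (faces.empty())
        profile.detectMultiScale(gray, faces);
    if (faces.empty())
        return false;

    bbox[0] = faces[0].x;
    bbox[1] = faces[0].y;
    bbox[2] = faces[0].x + faces[0].width;
    bbox[3] = faces[0].y + faces[0].height;
    return true;
}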

Let's assume that we have the bbox of the face in the image (the bbox format is described e.g. here: #20). The function which jointly detects the discretized yaw angle and the landmarks looks like this:

// Runs all view-specific detectors on the same bbox and picks the view with
// the maximal score; on return, *viewID indexes the winning detector in
// flandmarkPool (-1 if no detector produced a score).
void jointmv_detector(Flandmark **flandmarkPool, int *bbox, int *viewID)
{
    const int PHIS = 5;                 // number of discretized yaw angles (views)
    fl_double_t scores[PHIS];
    fl_double_t maximum = -INFINITY;
    *viewID = -1;

    for (int phi = 0; phi < PHIS; ++phi)
    {
        Flandmark *flandmark = flandmarkPool[phi];
        flandmark->detect_optimizedFromPool(bbox);

        // score of this view's detector
        scores[phi] = flandmark->getScore();

        if (scores[phi] > maximum)
        {
            maximum = scores[phi];
            *viewID = phi;
        }
    }
}

The returned viewID serves as an index into flandmarkPool, so we can later extract the landmarks and the view label.
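For instance, once jointmv_detector has returned (a sketch; getLandmarks() follows the clandmark examples, while using getName() as the view label is an assumption):

int viewID = -1;
jointmv_detector(flandmarkPool, bbox, &viewID);
if (viewID >= 0)
{
    // landmarks of the winning view, in image coordinates
    fl_double_t *landmarks = flandmarkPool[viewID]->getLandmarks();
    std::cout << "detected view: " << flandmarkPool[viewID]->getName() << std::endl;
}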

Now, how to initialize the flandmarkPool. Let's assume we have the following .txt file:

./models/PART_fixed_JOINTMV_-PROFILE.xml
./models/PART_fixed_JOINTMV_-HALF-PROFILE.xml
./models/PART_fixed_JOINTMV_FRONTAL.xml
./models/PART_fixed_JOINTMV_HALF-PROFILE.xml
./models/PART_fixed_JOINTMV_PROFILE.xml

Then we can use the following function to parse it:

// Reads one model path per line from a plain text file.
std::vector<std::string> readModelList(const char *file)
{
    std::vector<std::string> out;
    std::ifstream infile(file);
    std::string line;
    while (std::getline(infile, line))
    {
        out.push_back(line);
    }

    return out;
}

So, in the main function we can use it as follows:

// read the model paths from a text file
std::vector<std::string> models = readModelList(argv[3]);
std::vector<Flandmark*> flandmarkPool(models.size());   // pool of Flandmark instances
                                                        // (std::vector instead of a variable-length array, which is not standard C++)
for (size_t i = 0; i < models.size(); ++i)
{
    flandmarkPool[i] = Flandmark::getInstanceOf(models[i].c_str());

    if (!flandmarkPool[i])
    {
        cerr << "Couldn't create instance of flandmark with model " << models[i] << endl;
        return -1;
    }
}

// All models share the same base window; the features pool precomputes the
// sparse LBP features once per face and shares them among all view detectors.
const int *bw_size = flandmarkPool[0]->getBaseWindowSize();
CFeaturePool *featuresPool = new CFeaturePool(bw_size[0], bw_size[1]);
featuresPool->addFeaturesToPool(
        new CSparseLBPFeatures(featuresPool->getWidth(),
                               featuresPool->getHeight(),
                               featuresPool->getPyramidLevels(),
                               featuresPool->getCumulativeWidths()));

for (unsigned int i = 0; i < models.size(); ++i)
{
    flandmarkPool[i]->setNFfeaturesPool(featuresPool);
}

This initializes flandmarkPool (the view-dependent instances of Flandmark, each with its corresponding model loaded) and featuresPool (a helper structure which shares the precomputed features among the Flandmark instances).

Prior to calling the jointmv_detector function, do not forget to do this:

featuresPool->updateNFmipmap(featuresPool->getWidth(), featuresPool->getHeight(), flandmarkPool[0]->getNF(frm_gray, &bbox[0])->data());

where frm_gray (a cimg_library::CImg<unsigned char>*) is supposed to hold the grayscale input image. This initiates the feature computation in featuresPool, a necessary step for the jointmv_detector function to work properly.
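Putting the pieces together, a per-frame driver might look like this (a sketch: the cv::Mat-to-CImg copy is an assumption, and bbox is expected to come from your face detector, e.g. the Haar-cascade sketch above):

// frame is a grayscale cv::Mat; copy it into the CImg expected by getNF()
cimg_library::CImg<unsigned char> frm_gray(frame.cols, frame.rows);
for (int y = 0; y < frame.rows; ++y)
    for (int x = 0; x < frame.cols; ++x)
        frm_gray(x, y) = frame.at<unsigned char>(y, x);

// recompute the shared features for this face, then run all view detectors
featuresPool->updateNFmipmap(featuresPool->getWidth(), featuresPool->getHeight(),
                             flandmarkPool[0]->getNF(&frm_gray, &bbox[0])->data());
int viewID = -1;
jointmv_detector(flandmarkPool.data(), &bbox[0], &viewID);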

nitheeshas (Author) commented

Thanks a lot for the detailed explanation!
I had doubts about the face detector for multi-view too, since OpenCV's profile face detector gave only average results. I saw on the website that you were using the Eyedea face detector; their face detection seems to be almost perfect.
Anyway, I'll try this out right away. Thanks again!

uricamic (Owner) commented Feb 5, 2016

Yeah, the Eyedea face detector performs really well. It implements this paper, if you would like to re-implement it.
I guess another option is to re-train the OpenCV profile detector.

nitheeshas (Author) commented

Wow, WaldBoost? It's actually already implemented by someone; it's available in opencv-contrib. I'll try to train it and check how well it performs.

nitheeshas (Author) commented

Hi @uricamic,
I was able to build the multi-view landmark extraction using dlib's face detector, with the jointly learned landmark pool. But the extracted landmarks are not quite right. Is this a known problem?

uricamic (Owner) commented Feb 9, 2016

Hi @nitheeshas,

the models currently available are learned on a very limited training set. We are currently learning them on a bigger database.
It is also possible that, since the search spaces are shrunk (to make the detector as fast as possible), dlib's face detector output needs to be corrected to match the face detector used in training.
Hard to tell without seeing some examples, though.

nitheeshas (Author) commented

I've uploaded an example demo video of the outputs I got. Please check.
https://www.youtube.com/watch?v=25dbq7KSLsI

Sorry for the poor quality!

uricamic (Owner) commented Feb 9, 2016

It seems that the face detection really suffers from a huge variance in scale and position. On the other hand, when it behaves as one would expect, the result looks quite nice, I would say.

One quick suggestion which should improve the accuracy a lot is to stabilize the face detector output, e.g. by Kalman filtering (sketched below).
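For instance, a minimal sketch with OpenCV's cv::KalmanFilter, using a constant-velocity model over the bbox (the state layout and the noise magnitudes are assumptions to be tuned):

#include <opencv2/video/tracking.hpp>

// State: [x, y, w, h, vx, vy, vw, vh]; measurement: [x, y, w, h].
cv::KalmanFilter initBboxFilter()
{
    cv::KalmanFilter kf(8, 4, 0, CV_32F);
    cv::setIdentity(kf.transitionMatrix);
    for (int i = 0; i < 4; ++i)
        kf.transitionMatrix.at<float>(i, i + 4) = 1.0f;   // x += vx, etc.
    kf.measurementMatrix = cv::Mat::zeros(4, 8, CV_32F);
    for (int i = 0; i < 4; ++i)
        kf.measurementMatrix.at<float>(i, i) = 1.0f;      // we observe x, y, w, h
    cv::setIdentity(kf.processNoiseCov, cv::Scalar::all(1e-3));
    cv::setIdentity(kf.measurementNoiseCov, cv::Scalar::all(1e-1));
    return kf;
}

// Per frame: predict, then correct with the raw detector bbox [x, y, w, h];
// the first four entries of the returned state are the smoothed bbox.
cv::Mat smoothBbox(cv::KalmanFilter &kf, float x, float y, float w, float h)
{
    kf.predict();
    float m[4] = { x, y, w, h };
    return kf.correct(cv::Mat(4, 1, CV_32F, m)).clone();
}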

The new models should also improve the quality a lot; however, they are not yet fully learned.

nitheeshas (Author) commented

Yes, I just started modifying the code for the Kalman filter. Will update on how it works :)

nitheeshas (Author) commented

@uricamic I was not able to add the Kalman filter since I got caught up with some other work.
But the thing is, while testing the previous output, even when I was standing perfectly still and the face detector output was pretty much constant, the detected landmarks kept jumping a lot.

Maybe the best solution for this problem is to build the fully learned models, as you said. Are you still working on creating better models?

uricamic (Owner) commented Feb 9, 2016

@nitheeshas, I think in that case the problem is noise in the webcam input. The new models should help a bit, depending on how severe the noise is.

The new version should be learned within a few days; the biggest benefit should be better yaw estimation precision, and I hope to some extent also better landmark localization accuracy. However, the accuracy is limited by the relatively small normalized frame. The idea is to use this detector as an initial phase and then, for precise landmark detection or tracking, use a better model (either with an increased normalized frame size, or using regression to remove the systematic error introduced by transforming landmarks from the normalized frame back to the image).

nitheeshas (Author) commented

In that case, it will be better to wait for the newly learned models; if it's still shaky, I will add a Kalman filter and check again.

Hope you'll update soon.

nitheeshas (Author) commented

Hi @uricamic, can you share the dataset which you are using to train the multi-view landmark detection?

uricamic (Owner) commented

Hi @nitheeshas,

we are still working on that. Maybe some smaller portion of the examples can be published soon. Sorry for the delay.

mousomer commented Feb 8, 2017

@uricamic: a small question.

You suggest adding a call to updateNFmipmap prior to detect_optimized. But the latter function already includes a call to updateNFmipmap, and the static_input example you supply does not have that independent call to updateNFmipmap, yet it seems to give good results nonetheless.

Also, since the CSparseLBPFeatures object inside the CFeaturePool is protected, this cannot be done on an image-by-image basis, but only during CFeaturePool initialization. If this is indeed a critical stage, then you should add an init_CSparseLBPFeatures function to the CFeaturePool class.

uricamic (Owner) commented Feb 8, 2017

Hi @mousomer,

I think there is some misunderstanding. The updateNFmipmap method of CFeaturePool is needed if you want to run the detection on multiple images: you simply exchange the image on which the detection is performed, without costly re-initialization of the objects. Btw, the static_input example is also using it (see here).

The features are computed automatically inside the CFeaturePool class when you call this updateNFmipmap method (see here); the user is not supposed to interfere with the feature computation at all.

Maybe the names of some methods are a bit confusing; I am sorry if that is the case. However, all the important functionality is there and working. Some methods exist only because of the MATLAB interface, especially for the purpose of model learning, where speed is very important.

mousomer commented Feb 8, 2017

Well, you pointed to the detect_optimized function, which I suppose is the main API for extracting the features. So, am I correct in understanding that I don't need an extra call to updateNFmipmap before I call detect_optimized?

uricamic (Owner) commented Feb 8, 2017

Hi @mousomer,

yes, for detect_optimized you really do not need that extra call to updateNFmipmap.

However, check the post where I suggested this call: it was for the jointmv_detector, which internally calls detect_optimizedFromPool. There, you have to call updateNFmipmap prior to calling the detector, because in that case there is no other way to update the image and have the features computed. The reason is simply that there are multiple detectors to run, and the landmarks of the detector with the maximal response are returned; the features are computed just once per face image and shared by all detectors. In code, the order is sketched below.
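// once per face image: update the shared features...
featuresPool->updateNFmipmap(featuresPool->getWidth(), featuresPool->getHeight(),
                             flandmarkPool[0]->getNF(frm_gray, &bbox[0])->data());
// ...then run all view-specific detectors (detect_optimizedFromPool inside)
jointmv_detector(flandmarkPool, bbox, &viewID);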

mousomer commented Feb 9, 2017

I see. Thanks!
Oh, and if I haven't mentioned it before - this package is really awesome.

uricamic (Owner) commented

@mousomer
No problem, it is always good to ask questions ;-)

Thanks!

mousomer commented Feb 12, 2017

I re-ran my sample set with detect_optimizedFromPool instead of detect_optimized and got exactly the same results.
The score is always biased towards the negative half-profile.
I've run a few thousand examples; these are the statistics I'm seeing:

              Frontal  NegProf  NegHProf  PosHProf  PosProf
Mean score     1.076    1.363    2.995     1.818    -1.523
StdDev score   0.064    0.046    0.065     0.062     0.049

uricamic (Owner) commented

Hi @mousomer,

thank you for reporting this. The values you show seem a bit suspicious; I would expect the highest score for the frontal views, since those have the highest number of landmarks.
Maybe a bias term is missing in the code sample. I will check it soon and come back with an answer.

uricamic self-assigned this Feb 17, 2017

mousomer commented

@uricamic I've tested a few of the images. It seems that when translating scores to z-scores (subtracting the mean, dividing by the standard deviation), the best z-score does yield the best model match. I need to verify this on a larger batch of images.
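For reference, a sketch of that normalization using the per-view statistics from the table above (the constants are sample estimates from my runs, not part of the library):

// z-score per view, indexed in the table's order:
// { Frontal, NegProf, NegHProf, PosHProf, PosProf }
const double MEAN[5]   = { 1.076, 1.363, 2.995, 1.818, -1.523 };
const double STDDEV[5] = { 0.064, 0.046, 0.065, 0.062,  0.049 };

double zscore(int view, double score)
{
    return (score - MEAN[view]) / STDDEV[view];
}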

mousomer commented Jun 19, 2017

@uricamic
I did some testing with NIST face set 18, which has right and left face profiles:
https://catalog.data.gov/dataset/nist-mugshot-identification-database-mid-nist-special-database-18

The scoring is still bad. Even when translating into z-scores or tail scores, the results are not good.
So, basically, I need some external reference software to decide on the right model (frontal, R/L profile, or half-profile).

uricamic (Owner) commented

Hi @mousomer,

no z-score translation should be needed. I will try to check on the database you mention and share the code with you. I hope I can manage it within a week, though I cannot guarantee that.

mousomer commented Jun 25, 2017 via email
