Standard deviation or other measures of error for AUC #31

Open
xgao32 opened this issue Aug 4, 2015 · 1 comment

Comments

xgao32 commented Aug 4, 2015

Is there a way to get some measure of error (e.g., standard deviation or confidence interval) for the AUC after cross-validation? It would be helpful to get a list of the AUC values from the individual folds instead of just a single AUC.

The relevant functions are AUC(test_results=None, multiclass=0, ignore_weights=False) and cross_validation(learners, examples, folds=10, stratified=StratifiedIfPossible, preprocessors=(), random_generator=0, callback=None, store_classifiers=False, store_examples=False).

janezd (Contributor) commented Aug 4, 2015

Orange 2 is maintained, but no new features are being added. We could add this in Orange 3, but there is a general problem with the approach. The folds of cross-validation are correlated, so if you treat the per-fold results (AUCs, in your case) as an independent sample, you will underestimate the variance. See Nadeau and Bengio, Inference for the generalization error, Machine Learning 52(3), 239-281, which describes the problem. I'm not sure the same correction can be applied to the variance of AUC, though: as I recall, it is tied to the t-test and rests on some fairly ad hoc assumptions. In general, estimating variance from cross-validation is an unsolved problem.
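
For readers who want to experiment, here is a minimal sketch of the variance correction from the Nadeau and Bengio paper as it is commonly stated: the sample variance of the per-fold scores is multiplied by (1/J + n_test/n_train) instead of 1/J. The function name and arguments are hypothetical, nothing here is part of Orange, and, as noted above, it is not established that the correction is valid for AUC.

```python
import math
import statistics


def corrected_cv_std_error(fold_scores, n_train, n_test):
    """Return (naive, corrected) standard errors of the mean fold score.

    fold_scores: one score per cross-validation fold (e.g. per-fold AUCs).
    n_train, n_test: number of training and test examples in each fold.
    Hypothetical helper for illustration; not an Orange function.
    """
    j = len(fold_scores)
    if j < 2:
        raise ValueError("need at least two folds to estimate a variance")
    # Unbiased sample variance of the per-fold scores.
    sample_var = statistics.variance(fold_scores)
    # Naive estimate: treats the folds as independent.
    naive = math.sqrt(sample_var / j)
    # Corrected estimate: inflates the variance to account for the
    # overlap between training sets of different folds.
    corrected = math.sqrt((1.0 / j + n_test / n_train) * sample_var)
    return naive, corrected


if __name__ == "__main__":
    # Made-up per-fold AUCs, purely to show the call.
    aucs = [0.81, 0.79, 0.84, 0.77, 0.82, 0.80, 0.85, 0.78, 0.83, 0.80]
    naive, corrected = corrected_cv_std_error(aucs, n_train=900, n_test=100)
    print(naive, corrected)
```

The corrected value is always larger than the naive one, which matches the point that treating folds as independent underestimates the variance.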

If you would like to look into it yourself, here are my two cents. You can get an error estimate from a single AUC. AUC is equivalent to the statistic of the Wilcoxon-Mann-Whitney test, so it has a known distribution. For WMW, the normal approximation is described here: https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test#Normal_approximation. For AUC you have to rescale it (AUC is the U statistic divided by the product of the two class sizes). I forgot the details; I've just seen it done somewhere, years ago.
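
A rough sketch of that rescaling, assuming no tied scores: since AUC = U / (n_pos * n_neg), the standard deviation of U from the normal approximation, sqrt(n_pos * n_neg * (n_pos + n_neg + 1) / 12), becomes sqrt((n_pos + n_neg + 1) / (12 * n_pos * n_neg)) for AUC. Note that this is the spread under the null hypothesis (AUC = 0.5), so it suits a test against chance rather than a confidence interval around an observed AUC. The function names are hypothetical and independent of Orange.

```python
import math


def auc_from_scores(pos_scores, neg_scores):
    """AUC computed as the Mann-Whitney U statistic divided by n_pos * n_neg.

    Counts the pairs in which the positive example is scored higher than
    the negative one; ties count as half a pair.
    """
    u = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                u += 1.0
            elif p == n:
                u += 0.5
    return u / (len(pos_scores) * len(neg_scores))


def auc_null_std(n_pos, n_neg):
    """Standard deviation of AUC under the null hypothesis (AUC = 0.5).

    This is the normal-approximation sigma of U divided by n_pos * n_neg,
    without a tie correction.
    """
    return math.sqrt((n_pos + n_neg + 1) / (12.0 * n_pos * n_neg))


if __name__ == "__main__":
    # Made-up classifier scores, purely to show the calls.
    pos = [0.9, 0.8, 0.4]
    neg = [0.7, 0.3, 0.2]
    auc = auc_from_scores(pos, neg)
    sigma = auc_null_std(len(pos), len(neg))
    z = (auc - 0.5) / sigma  # z-score against chance-level ranking
    print(auc, sigma, z)
```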
