active_learning_benchmarks

Benchmarking methods to select examples to relabel in active learning for data labeled by multiple annotators

Code to reproduce results from the paper:

ActiveLab: Active Learning with Re-Labeling by Multiple Annotators

This repository benchmarks algorithms that compute an active learning score quantifying how valuable it is to collect additional labels for specific examples in a classification dataset. We consider settings with multiple data annotators, where each example can be labeled more than once if needed to ensure a high-quality consensus label.

This repository is intended only for scientific purposes. To apply the ActiveLab algorithm in your own active learning loops with multiannotator data, you should instead use the implementation from the official cleanlab library.
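For reference, below is a minimal sketch of how such active learning scores might be obtained via cleanlab's multiannotator module. The function name, signature, and toy data are assumptions for illustration; consult the cleanlab documentation for the authoritative usage in your installed version.

import numpy as np
import pandas as pd
from cleanlab.multiannotator import get_active_learning_scores  # assumed API

# Toy multiannotator labels: one row per labeled example, one column per
# annotator, NaN where that annotator did not label the example.
labels_multiannotator = pd.DataFrame(
    {
        "annotator_1": [0, 1, np.nan, 1],
        "annotator_2": [0, np.nan, 1, 0],
        "annotator_3": [np.nan, 1, 1, np.nan],
    }
)

# Out-of-sample predicted class probabilities from any classifier trained on
# the current consensus labels (4 labeled + 3 unlabeled examples, 2 classes).
pred_probs = np.array([[0.9, 0.1], [0.3, 0.7], [0.4, 0.6], [0.6, 0.4]])
pred_probs_unlabeled = np.array([[0.5, 0.5], [0.8, 0.2], [0.55, 0.45]])

scores_labeled, scores_unlabeled = get_active_learning_scores(
    labels_multiannotator, pred_probs, pred_probs_unlabeled
)

# Lower scores flag the examples whose (re)labeling is most valuable next.
combined = np.concatenate([scores_labeled, scores_unlabeled])
print(np.argsort(combined)[:3])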

Install Dependencies

To run the model training and benchmarks, you need to install the following dependencies:

pip install -r requirements.txt
pip install cleanlab

Benchmarks

Three sets of benchmarks are conducted with three different datasets (a sketch of a single benchmark round follows the list):

1. CIFAR-10H: image classification with 5000 examples in total; 1000 examples have annotator labels at the start, and 500 new labels are collected each round.
2. Wall Robot: tabular classification with 2000 examples in total; 500 examples have annotator labels at the start, and 100 new labels are collected each round.
3. Wall Robot Complete: tabular classification with 2000 examples in total; all 2000 examples have annotator labels at round 0, and 100 new labels are collected each round.
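Each benchmark follows the same round structure: derive consensus labels from the current annotator labels, train a classifier, score all examples with an active learning method, and collect a fixed batch of new labels for the highest-priority examples. The sketch below illustrates one such round; the get_consensus, train_model, compute_scores, and collect_labels helpers are hypothetical placeholders, not functions from this repository.

import numpy as np

def run_round(X, labels_multiannotator, X_unlabeled, batch_size,
              get_consensus, train_model, compute_scores, collect_labels):
    """One hypothetical active-learning round: consensus, train, score, label."""
    # 1. Derive consensus labels from the current annotator labels.
    consensus = get_consensus(labels_multiannotator)

    # 2. Train a classifier and get out-of-sample predicted probabilities
    #    for both the labeled and unlabeled pools.
    pred_probs, pred_probs_unlabeled = train_model(X, consensus, X_unlabeled)

    # 3. Score every example; lower score = more valuable to (re)label.
    scores, scores_unlabeled = compute_scores(
        labels_multiannotator, pred_probs, pred_probs_unlabeled
    )

    # 4. Pick the batch_size highest-priority examples across both pools.
    all_scores = np.concatenate([scores, scores_unlabeled])
    chosen = np.argsort(all_scores)[:batch_size]

    # 5. Query annotators for new labels on the chosen examples and merge them
    #    into the multiannotator label matrix for the next round.
    return collect_labels(labels_multiannotator, chosen)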

The datasets used in the benchmark are downloaded from:

Additional Benchmarks

Two supplementary benchmarks were conducted on the Wall Robot dataset:

1. Single Annotator vs Multiannotator: compares labeling new data against relabeling existing datapoints.
2. Methods for Single Label: benchmarks the performance of various methods in the scenario where each example has only one label.

Results

The results/ folder for each dataset contains .npy files with the saved results (model accuracy and consensus-label accuracy) from each run of the benchmark. These files are used to visualize the results in the plot_results.ipynb notebooks.
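As a minimal sketch, saved results like these can be loaded and plotted as below. The file name is a hypothetical placeholder; see plot_results.ipynb for the actual file names and plotting code used in this repository.

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical file name; the actual .npy files live under results/ per dataset.
model_accuracy = np.load("results/model_accuracy_activelab.npy")  # shape: (num_rounds,)

rounds = np.arange(len(model_accuracy))
plt.plot(rounds, model_accuracy, marker="o", label="ActiveLab")
plt.xlabel("Round of label collection")
plt.ylabel("Model accuracy")
plt.legend()
plt.show()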