Open-Vocabulary Universal Image Segmentation with MaskCLIP (ICML 2023)

Zheng Ding, Jieke Wang, Zhuowen Tu

arXiv / Project / Video

[Teaser figure]

Data preparation

For COCO and ADE20K data preparation, please refer to Preparing Datasets in Mask2Former.
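
Following the detectron2 convention that Mask2Former uses, the datasets are expected under the directory pointed to by the DETECTRON2_DATASETS environment variable (which defaults to ./datasets). The sketch below is only a rough outline of that layout; the exact subfolders and the extra preparation scripts are described in the Mask2Former dataset instructions.

# Point detectron2 at your dataset root (defaults to ./datasets if unset).
export DETECTRON2_DATASETS=/path/to/datasets
#
# Rough expected layout (see the Mask2Former instructions for the full list):
# $DETECTRON2_DATASETS/
#   coco/
#     annotations/
#     train2017/
#     val2017/
#     panoptic_train2017/
#     panoptic_val2017/
#   ADEChallengeData2016/
#     images/
#     annotations/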

Environment Setup

Please run the following commands to set up the environment.

conda create -n maskclip python=3.9
conda activate maskclip
conda install pytorch=1.10 cudatoolkit=11.3 torchvision=0.11 -c pytorch -c conda-forge
python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
pip install setuptools==59.5.0
pip install timm opencv-python scipy einops
pip install git+https://github.com/openai/CLIP.git
pip install git+https://github.com/cocodataset/panopticapi.git

# Build the multi-scale deformable attention ops used by the pixel decoder.
cd mask2former/modeling/pixel_decoder/ops/
sh make.sh
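
After the build finishes, you can optionally run a quick sanity check (not part of the original instructions) to confirm that the core dependencies import cleanly and that CUDA is visible:

# Optional: verify the main packages and CUDA availability.
python -c "import torch, detectron2, clip, panopticapi, timm, einops; print(torch.__version__, torch.cuda.is_available())"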

Training

Training Class-Agnostic Mask Proposal Network

You can train a class-agnostic mask proposal network by removing the classification head of an existing segmentation model, e.g., Mask2Former or Mask R-CNN. We provide our trained class-agnostic mask proposal network here.

Training MaskCLIP on COCO dataset

With the trained class-agnostic mask proposal network, we can train the MaskCLIP model with the following command. We train our model for 10,000 iterations with a batch size of 8.

python train_net.py --num-gpus 8 --config-file configs/coco/maskformer2_R50_bs16_50ep.yaml
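
The iteration count and batch size are normally set inside the config file. If you need to override them, detectron2-style training scripts accept config key-value pairs appended after the other arguments; the sketch below uses standard detectron2 keys, and the values in the shipped config may already match this schedule. OUTPUT_DIR is an illustrative path, not one from the repository.

# Example command-line overrides (standard detectron2 config keys).
python train_net.py --num-gpus 8 --config-file configs/coco/maskformer2_R50_bs16_50ep.yaml \
  SOLVER.MAX_ITER 10000 SOLVER.IMS_PER_BATCH 8 OUTPUT_DIR output/maskclip_coco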

Testing MaskCLIP on ADE20K dataset

You can evaluate the trained model on the ADE20K dataset. We also provide our trained model here. Set MODEL.WEIGHTS to the checkpoint path, either by editing the yaml config file or by appending it to the command:

python train_net.py --num-gpus 1 --config-file configs/ade20k/maskformer2_R50_bs16_160k.yaml --eval-only MODEL.WEIGHTS model_final.pth

Citation

If you find this work helpful, please consider citing MaskCLIP using the following BibTeX entry.

@inproceedings{ding2023maskclip,
  author    = {Zheng Ding and Jieke Wang and Zhuowen Tu},
  title     = {Open-Vocabulary Universal Image Segmentation with MaskCLIP},
  booktitle = {International Conference on Machine Learning},
  year      = {2023},
}

Please also check out MasQCLIP, our latest work on open-vocabulary segmentation.

Acknowledgement

This codebase was built upon and drew inspiration from CLIP and Mask2Former. We thank the authors for making those repositories public.