problems with Colab #8

Open
laurelrr opened this issue Feb 13, 2024 · 0 comments

Hello,
I have fielded a few questions recently about running AlphaTracker in Colab. The current Colab script is out of date with your most recent updates. Specifically, the version of Miniconda in the Colab notebook installs Python 3.6, which seems to have a threading issue that raises ValueError: signal number 32 out of range when attempting to train the SPPE model. I changed line 230 in AlphaTracker/Tracking/AlphaTracker/trainCOLAB.py from --nThreads 6 \\\n \ to --nThreads 1 \\\n \ but this did not fix it.

I've tried installing other Miniconda versions and updating the installed packages to match those in setup.py, but I have not been able to get the SPPE to train properly. My most recent pairing uses

MINICONDA_INSTALLER_SCRIPT=Miniconda3-latest-Linux-x86_64.sh; MINICONDA_PREFIX=/usr/local; wget https://repo.continuum.io/miniconda/$MINICONDA_INSTALLER_SCRIPT
chmod +x $MINICONDA_INSTALLER_SCRIPT; ./$MINICONDA_INSTALLER_SCRIPT -b -f -p $MINICONDA_PREFIX

and

!conda install -y pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
!conda install -y munkres numpy scipy matplotlib pandas 
!pip install opencv-python tqdm gdown h5py nibabel tensorboardx  visdom scikit-learn seaborn umap requests  "git+https://github.com/ZexinChen/torchsample"
!pip install pycocotools gdown

but this pairing seems to give the following error at the SPPE train step:

*** training sppe ***
training with following setting:
CUDA_VISIBLE_DEVICES=0 python train.py \
             --dataset coco \
             --img_folder_train /gdrive/AlphaTracker/Tracking/AlphaTracker/train_yolo/darknet//data/Trial/color/ \
             --annot_file_train /gdrive/AlphaTracker/Tracking/AlphaTracker/train_sppe//data/Trial/data_newLabeled_01_train.h5 \
             --img_folder_val /gdrive/AlphaTracker/Tracking/AlphaTracker/train_yolo/darknet//data/Trial/color/ \
             --annot_file_val /gdrive/AlphaTracker/Tracking/AlphaTracker/train_sppe//data/Trial/data_newLabeled_01_val.h5 \
             --expID Trial \
             --nClasses 4 \
             --LR 0.0001 --trainBatch 10 \
             --nEpochs 200 \
             --nThreads 1 \
             --loadModel /gdrive/AlphaTracker/Tracking/AlphaTracker/models/sppe/duc_se.pth
Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library.
	Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.
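
The error message itself points at two environment variables, so one possible workaround is to set the threading layer before train.py is launched. The sketch below is untested against AlphaTracker and is only an illustration: the helper names mkl_safe_env and run_with_mkl_layer are hypothetical, and choosing GNU as the layer (to match libgomp) is an assumption rather than a confirmed fix.

```python
import os
import subprocess
import sys


def mkl_safe_env(base_env=None, threading_layer="GNU"):
    """Return a copy of the environment with MKL_THREADING_LAYER set.

    Hypothetical helper: "GNU" is assumed here because the error names
    libgomp; the original environment mapping is left unmodified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MKL_THREADING_LAYER"] = threading_layer
    return env


def run_with_mkl_layer(cmd, threading_layer="GNU"):
    """Run a command in a child process that inherits the adjusted environment."""
    return subprocess.run(
        cmd,
        env=mkl_safe_env(threading_layer=threading_layer),
        capture_output=True,
        text=True,
        check=False,
    )


# Example: confirm that a child Python process sees the variable.
# The same call could wrap the train.py command instead.
check = run_with_mkl_layer(
    [sys.executable, "-c", "import os; print(os.environ['MKL_THREADING_LAYER'])"]
)
```

In a notebook, setting os.environ["MKL_THREADING_LAYER"] = "GNU" in a cell before the training cell should have the same effect, since processes started afterwards inherit the notebook's environment.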

Would you please have a look at it? I think there is a lot of interest in using AlphaTracker on Colab. Thanks.
