Problems in the Evaluation and Submission of segmentation #5

Open
Shixiaomeng7 opened this issue Jan 21, 2024 · 4 comments

Comments

@Shixiaomeng7

Thanks for this excellent piece of work! I encountered the following error: "Warning: could not find environment variable "-x", mpirun was unable to find the specified executable file, and therefore did not launch the job. Warning: could not find environment variable "-x", mpirun was unable to find the specified executable file, and therefore did not launch the job. This error was first reported for process rank 0; it may have occurred for other processes as well". I have already run the chmod +x evaluate.sh command. I would also like to know how to create a symbolic link to the SemanticKITTI dataset if it is stored in a different location, and whether the link is necessary.

@inspirelt
Collaborator

Hi, thanks for your interest. Please check whether mpirun is installed correctly according to the instructions. There are several MPI implementations (Open MPI, Intel MPI, MPICH, etc.), and their mpirun commands work in slightly different ways. The symbolic link can be created with ln -s stored/path/of/semantickitti data/semantickitti. Feel free to contact me if you have more questions.
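
As a rough illustration of the two checks above, here is a minimal Python sketch. The helper names describe_mpirun and link_dataset are hypothetical and not part of this repository. The first helper prints whatever mpirun --version reports, which usually names the MPI implementation; this matters because -x is Open MPI's option for exporting environment variables, while other launchers such as MPICH's Hydra use different options (for example -genv). Whether that mismatch is the cause of the warning here is only an assumption. The second helper does the same job as the ln -s command, using an absolute target path.

```python
import os
import shutil
import subprocess


def describe_mpirun():
    """Return the first line of `mpirun --version`, or None if it cannot be run.

    Hypothetical helper: the first line usually identifies the MPI
    implementation (Open MPI, MPICH/Hydra, Intel MPI, ...).
    """
    exe = shutil.which("mpirun")
    if exe is None:
        return None  # mpirun is not on PATH
    try:
        result = subprocess.run(
            [exe, "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        return None
    lines = (result.stdout or result.stderr).strip().splitlines()
    return lines[0] if lines else None


def link_dataset(stored_path, link_path):
    """Create link_path as a symbolic link to stored_path, like `ln -s`.

    Hypothetical helper: an absolute target is used so the link does not
    depend on the directory it is created from.
    """
    stored_path = os.path.abspath(stored_path)
    if not os.path.isdir(stored_path):
        raise FileNotFoundError(stored_path)
    parent = os.path.dirname(os.path.abspath(link_path))
    os.makedirs(parent, exist_ok=True)
    if not os.path.lexists(link_path):  # leave an existing link or folder alone
        os.symlink(stored_path, link_path)
    return os.path.realpath(link_path)
```

For example, print(describe_mpirun()) shows which launcher is on the PATH, and link_dataset("stored/path/of/semantickitti", "data/semantickitti") mirrors the command above.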

@Shixiaomeng7
Author

Thank you for such a quick reply. I can now run ./evaluate.sh, but it hangs once it reaches the point shown below. It doesn't report an error or exit; it just stops. Have you ever encountered this?
/root/miniconda3/envs/LinK_seg/bin/python evaluate.py --load_path ../checkpoints/max-iou-val.pt
[2024-01-21 08:44:43.634] Experiment started: "runs/run-db770b11".
workers_per_gpu: 2
distributed: True
amp_enabled: False
data:
num_classes: 20
ignore_label: 0
training_size: 19132
train:
seed: 1588147245
deterministic: False
dataset:
name: semantic_kitti
root: ./data/SemanticKITTI/dataset/sequences
num_points: 80000
voxel_size: 0.05
num_epochs: 25
batch_size: 2
model:
cr: 1.0
name: linkunet
base_op: cos_x
r: 2
s: 3
groups: 1
criterion:
name: lovasz_softmax
ignore_index: 0
optimizer:
name: sgd
lr: 0.24
weight_decay: 0.0001
momentum: 0.9
nesterov: True
scheduler:
name: cosine_warmup

@Shixiaomeng7
Author

Also, I have already completed training once, and it did not hang during training.

@inspirelt
Collaborator

I've run into this before, and several things can cause it. You can check: 1. whether the process gets stuck while loading data (for example, because of a wrong data path); 2. whether CUDA_VISIBLE_DEVICES is set correctly.
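
Both checks can be sketched in a few lines of Python. The function check_setup below is a hypothetical helper, not part of this repository. It assumes the dataset root is the one printed in the log above (./data/SemanticKITTI/dataset/sequences), resolved relative to the directory the script is run from. Note that this path is spelled differently from the data/semantickitti link suggested earlier, and on a case-sensitive filesystem the two are not the same; whether that explains the hang is only a guess.

```python
import os


def check_setup(dataset_root, environ=None):
    """Return a list of problems found; an empty list means none were found.

    Hypothetical helper covering the two suggested checks: the data path
    and the CUDA_VISIBLE_DEVICES environment variable.
    """
    environ = os.environ if environ is None else environ
    problems = []

    # Check 1: the dataset root must exist and contain sequence folders.
    if not os.path.isdir(dataset_root):
        problems.append(f"dataset root not found: {os.path.abspath(dataset_root)}")
    else:
        sequences = [
            entry
            for entry in os.listdir(dataset_root)
            if os.path.isdir(os.path.join(dataset_root, entry))
        ]
        if not sequences:
            problems.append(f"dataset root has no sequence folders: {dataset_root}")

    # Check 2: an unset variable leaves all GPUs visible, but an empty
    # value hides every GPU from CUDA.
    devices = environ.get("CUDA_VISIBLE_DEVICES")
    if devices is not None and devices.strip() == "":
        problems.append("CUDA_VISIBLE_DEVICES is set but empty, so no GPU is visible")

    return problems
```

Running print(check_setup("./data/SemanticKITTI/dataset/sequences")) from the same directory as evaluate.py lists anything that looks wrong before the evaluation is started.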
