[Bug] DependencyCRF.log_prob returns a positive value when the input arc score is large. #61
Hmm, this is worrisome. We are pretty careful about precision, but it is possible we are losing some bits somewhere. I will try to replicate. This is on CUDA, right? Just for sanity, would you mind printing the values?
That is strange. If possible, could you send me a minimal example, perhaps as a Colab? Is it plausible that your gold tree is invalid somehow (either non-projective, or with multiple heads)? I am not sure why it would be so different from the partition.
My dataset contains a tiny proportion of non-projective trees. If this issue leads to wrong results, I think I should try to convert these trees into projective ones.
I see, so I think there are two things happening here. I would recommend just dropping the non-projective trees; they are not correctly scored by the model, and I don't check for them. If you want to add a check, I would be happy to take a PR (it's a non-trivial check to write fast; you need to check all arc pairs, as in the sketch below).
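Roughly, the all-pairs test I have in mind looks like this (an untested sketch; it assumes a heads list where heads[m] is the head of word m+1, with 0 standing for the root):

```python
def is_projective(heads):
    """O(n^2) projectivity check over all arc pairs.

    heads[m] is the head of word m+1 (words are 1-indexed, 0 is the root).
    Two arcs cross iff exactly one endpoint of one arc lies strictly
    between the endpoints of the other.
    """
    arcs = [(min(h, m), max(h, m)) for m, h in enumerate(heads, start=1)]
    for i, (l1, r1) in enumerate(arcs):
        for l2, r2 in arcs[i + 1:]:
            if l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1:
                return False
    return True

assert is_projective([0, 1, 2])         # a simple chain is projective
assert not is_projective([2, 4, 0, 3])  # arc (2,4) crosses arc (0,3)
```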
Thanks for your suggestions. I may try dropping the non-projective trees in the dataset. By the way, can I use NonProjectiveDependencyCRF instead?
I wouldn't recommend switching between the two. NonProj will be faster and more general, but less accurate in cases where the language is 99% projective. Unfortunately, there is currently no easy way to get an argmax tree! You can try taking the max for each arc, but that will not guarantee it is a tree. If you are interested in implementing the argmax, the algorithm is the Chu-Liu/Edmonds algorithm (rough sketch below). It shouldn't be too hard to implement; it's just a bit different, so I didn't include it.
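For reference, an untested sketch of Chu-Liu/Edmonds over a dense score matrix (scores[h, m] = score of arc h -> m, node 0 = root). This is the simple recursive O(n^3) textbook version, not the optimized Tarjan variant, and it does not enforce a single root attachment:

```python
import numpy as np

def _find_cycle(head):
    """Return one cycle in the head assignment as a list of nodes, or None."""
    color = [0] * len(head)              # 0 = unvisited, 1 = on path, 2 = done
    for start in range(1, len(head)):
        if color[start]:
            continue
        path, v = [], start
        while v != -1 and color[v] == 0:
            color[v] = 1
            path.append(v)
            v = head[v]
        if v != -1 and color[v] == 1:    # walked back into the current path
            cycle, u = [v], head[v]
            while u != v:
                cycle.append(u)
                u = head[u]
            return cycle
        for u in path:
            color[u] = 2
    return None

def chu_liu_edmonds(scores):
    """Maximum spanning arborescence rooted at node 0; returns head[m] per node."""
    scores = np.array(scores, dtype=float)
    n = scores.shape[0]
    np.fill_diagonal(scores, -np.inf)    # no self-loops
    scores[:, 0] = -np.inf               # no arcs into the root

    head = scores.argmax(axis=0)         # greedy best head per node
    head[0] = -1
    cycle = _find_cycle(head)
    if cycle is None:
        return head

    in_cycle = np.zeros(n, dtype=bool)
    in_cycle[cycle] = True
    rest = [v for v in range(n) if not in_cycle[v]]   # rest[0] == 0, the root
    c = len(rest)                                     # index of the contracted node

    # Contract the cycle into one node and build the smaller score matrix.
    sub = np.full((c + 1, c + 1), -np.inf)
    for i, u in enumerate(rest):
        for j, v in enumerate(rest):
            sub[i, j] = scores[u, v]
    out_arc, in_arc = {}, {}
    cycle_score = sum(scores[head[u], u] for u in cycle)
    for j, v in enumerate(rest):
        u = max(cycle, key=lambda x: scores[x, v])    # best arc leaving the cycle
        sub[c, j] = scores[u, v]
        out_arc[v] = u
    for i, u in enumerate(rest):
        v = max(cycle, key=lambda x: scores[u, x] - scores[head[x], x])
        sub[i, c] = cycle_score + scores[u, v] - scores[head[v], v]
        in_arc[u] = v                                 # where u would break the cycle

    sub_head = chu_liu_edmonds(sub)

    # Expand the contracted solution back to the original node set.
    new_head = head.copy()
    for j, v in enumerate(rest[1:], start=1):
        h = sub_head[j]
        new_head[v] = out_arc[v] if h == c else rest[h]
    u = rest[sub_head[c]]                # node chosen to point into the cycle
    new_head[in_arc[u]] = u              # break the cycle at that entry
    return new_head
```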
Well, now I'm trying cases for NonProjectiveDependencyCRF. When the sentence length is long (I tried 50 here), the output looks wrong as well.
No ideas yet, but I will look into both issues. |
Hi, sorry for jumping in, but this issue seems to be caused by this line: torch_struct/deptree.py, line 196 (at 9f5a0f5). It should've used a different operation there. Slightly off topic, but there also seems to be a problem with variable-length inputs.
After setting a breakpoint to find out where the problem occurs in my training code, I found that there is probably something wrong with the partition computation.
I have uploaded the files to OneDrive; I hope this will help us solve the problem quickly.
@kmkurn ah fantastic, I changed that line and made another issue about variable length. |
@wangxinyu0922 thanks for the repro, I will see what I can do to fix it.
@wangxinyu0922 I think your code had a bug in it. Your mask is zeroing out the potentials for the 0th head word. |
In my code, there is a redundant placeholder word at position 0. Why does the third example (which masks the score of this placeholder to 0 for all words, i.e. sets the diagonal of the arc scores to 0) still give the wrong log_prob?
Sorry, I'm still having trouble understanding. I'm really trying to figure it out, but I need a clearer example.
I still do not understand exactly where you think the problem is. Do you think the problem is with the score function? We could try using torch.where instead of torch.mul and see if the value is different. If you think the partition calculation is the issue, can you please send me a minimal working example where the partition is less than the score of a single tree?
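Concretely, here is the difference I mean (the shapes and the masked position are made up for illustration):

```python
import torch

# Multiplying log-potentials by a 0/1 mask sets the masked scores to 0,
# which still counts as a legal zero-score arc inside the partition.
# torch.where can instead push the masked arcs to effectively -inf.
arc_scores = torch.randn(2, 6, 6)                    # [batch, head, modifier]
mask = torch.ones_like(arc_scores, dtype=torch.bool)
mask[:, :, 0] = False                                # e.g. forbid arcs into position 0

masked_mul = arc_scores * mask                       # masked arcs get score 0
masked_where = torch.where(mask, arc_scores, torch.full_like(arc_scores, -1e9))
```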
debug.zip |
Hi, I found that if I input a log_potential into the DependencyCRF with some large values, the marginal distribution will be slightly over 1, and this also results in log_prob returning a very large positive value in training. My code is something like this:
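(Reconstructed sketch of the setup; the shapes, magnitudes, and the gold-tree tensor are illustrative, not the original snippet.)

```python
import torch
from torch_struct import DependencyCRF

batch, N = 4, 50                                    # made-up sizes
arc_scores = torch.randn(batch, N, N) * 1e5         # large-magnitude scores, as described
lengths = torch.full((batch,), N, dtype=torch.long)

dist = DependencyCRF(arc_scores, lengths=lengths)
print(dist.marginals.max())                         # creeps above 1.0 with scores this large
# gold is a 0/1 event tensor with the same shape as arc_scores:
# print(dist.log_prob(gold))                        # comes out large and positive
```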
I found my arc_scores contains a large value like 807687.0625. This results in the marginals containing values like 1.002, and the final log_prob returns a very large positive value of 750633.4375. I think this problem is something like floating-point precision; can you suggest something to me?
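(One untested workaround for precision problems like this, assuming the same dist setup as above, is to build the distribution in double precision:)

```python
# Untested precision workaround: run the CRF in float64.
dist64 = DependencyCRF(arc_scores.double(), lengths=lengths)
```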