[Bug?] Positive probabilities in Alignment CRF when length nontrivial #46
Comments
Yup, seems like a bug. It seems like it is not finding any valid solution at all when you set length < t. Is there any way you could avoid using lengths on alignment? The Alignment code is a bit gnarly, so I am guessing I just messed up the length constraint. An alternative method is just to pad your sequences with matches on the boundaries.
Hello Dr. Rush, Many thanks for your rapid response and diagnosis of the problem. My data is intrinsically variable length, so I appreciate your help thinking through alternatives to `lengths`. If I understand your suggestion, you are recommending I create fake boundary matches to pad my sequences. Thank you again!
Dear Dr. Rush, First, thank you for writing the excellent arXiv explainer on this library! It very much helped my understanding. I have experimented with padding with matches, but I am not confident in my choice of log potentials, as I still do not fully understand how this will affect the partition function and argmax. I wonder if you might confirm that 0 is the proper log potential for a certain match? Or perhaps it would be possible to write a wrapper which does this padding automatically for a given length constraint. Either way, thank you, and I hope to hear from you soon.
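The padding idea under discussion can be sketched in NumPy. This is a minimal illustration, not pytorch-struct's actual API: the `(T, T, 3)` grid shape and the (insert, delete, match) channel order are assumptions made for the sketch. The key point is that a certain match carries log potential 0 (probability 1), while the other operations at a pad position get an effectively impossible score.

```python
import numpy as np

NEG_INF = -1e4  # stands in for log(0); large negative keeps arithmetic finite

def pad_with_matches(log_potentials, length):
    """Given a (T, T, 3) grid of alignment log potentials whose last axis
    is assumed to be (insert, delete, match), keep the first `length`
    rows/columns and force every remaining diagonal cell to be a certain
    match: log potential 0, i.e. probability 1, so the pad positions
    contribute nothing to the partition function or the argmax score."""
    T = log_potentials.shape[0]
    out = log_potentials.copy()
    out[length:, :, :] = NEG_INF  # rows past the true length: disallowed
    out[:, length:, :] = NEG_INF  # columns past the true length: disallowed
    for i in range(length, T):
        out[i, i, 2] = 0.0  # certain match: log(1) = 0
    return out

# Example: a length-2 sequence padded up to a grid of size 4.
padded = pad_with_matches(np.zeros((4, 4, 3)), length=2)
```

Whether channel 2 really is "match" in pytorch-struct would need to be checked against the library; only the log-potential-0 reasoning is taken from the thread.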
Hi Tim, Sorry for the slow reply. I started looking at the issue, but I am still not happy with the speed of this particular function and am worried it might not be useful. Are you using it in a production system or a demo? Either way, I will try to get a padded demo up this week, but I want to make sure you are not waiting on a function that will not be directly applicable.
Hi Dr. Rush, I look forward to the padding demo, and once again I am very grateful for your help!
Hi Dr. Rush and struct team, Hope all is well with the virus and what not. I know it must be very hectic for you, but I wanted to check in on this. I'm now basically stuck at home and have a lot of time for this project and was hoping I could get a prototype of the alignment working. Let me know if you have any updates or could point me in the right direction. |
I'm so sorry, things have been a bit crazy. I will try to get to this this week. |
Hi, I finally got around to making you a full example: https://github.com/harvardnlp/pytorch-struct/blob/master/notebooks/CTC_with_padding.ipynb Let me know if it is clear. I will eventually get this checked in under the `lengths` argument.
This looks like exactly what I needed! Thank you SO much Dr. Rush, best wishes and stay safe |
Awesome. I am going to leave it open so I remember to build it in. |
Hello!
First thanks for this awesome project. I think it will make a big difference in the NLP community and I am already excited to use it in my work.
Problem TLDR: AlignmentCRF.log_prob returns values greater than 0 (marginal probabilities greater than 1). I think this might be a bug, unless I'm misusing the API.
More background
I'm attempting to use an Alignment CRF (Smith-Waterman) as the loss function for a generative model. The "true" labels are sequences of tokens of varying length, and I like the alignment because it will still work if my model accidentally adds an extra token. Since my data is variable length, I provide the `lengths` tensor argument, which holds the length of my true labels for each batch element. When I do this and compute

```python
dist.log_prob(dist.argmax)
```
I get some batches whose log likelihood is very positive. I checked the partition function and it looks very negative, so my guess is that the error has to do with masking in the partition but not the score, or vice versa.

To reproduce this error

I get the same error in the CTC.ipynb example in the notebooks folder when I initialize the distribution with

```python
dist = torch_struct.AlignmentCRF(log_potentials, lengths=torch.Tensor([t] * 2 + [t-1] * 3).long())
```

and then call

```python
dist.log_prob(dist.argmax)
```

An image of the error is below. Here is a Colab notebook which reproduces the error: https://colab.research.google.com/drive/1C1uDWNe8IcXB-Re6WcnwRsif-q-JQySe.
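The masking mismatch guessed at above can be illustrated with a toy example. Since `log_prob = score - log Z`, a positive log probability can only arise if the max-scoring structure is drawn from a different (larger) set than the one the partition function sums over. The alignment scores below are made up for illustration; this is not pytorch-struct's internals.

```python
import numpy as np

def logsumexp(x):
    """Numerically stable log of the sum of exponentials."""
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

# Toy model: four candidate alignments with these unnormalized log scores.
scores = np.array([2.0, 1.0, 0.5, -1.0])

# Consistent masking: the partition sums over the same set that produced
# the max, so log_prob = max score - log Z is always <= 0.
log_Z = logsumexp(scores)
log_prob_ok = scores.max() - log_Z

# Inconsistent masking (the suspected bug): the partition is computed over
# a masked subset while the score comes from the full set, so the
# "probability" can exceed 1.
log_Z_masked = logsumexp(scores[2:])  # mask drops the high-scoring paths
log_prob_bad = scores.max() - log_Z_masked
```

Here `log_prob_ok` is negative while `log_prob_bad` is positive, which matches the symptom in the notebook: a very negative partition function alongside log likelihoods above zero.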
Do you agree this is a bug, or is this the intended behavior? Any suggestions for how I might debug this or correctly use your API would be much appreciated. I am not familiar with semirings, but if you explain what might fix the issue I can try to submit a pull request (I'm still fairly new to PyTorch, so I'm not sure how much help I can be).
Thanks for your attention and again for the great library!
Tim Y.