
Segmentation Fault in TFRecurrentLanguageModel.cc #12

Closed

mattiadg opened this issue Dec 15, 2021 · 4 comments

Comments

mattiadg commented Dec 15, 2021

I worked around issue #11 by commenting out the line that searches for the bias tensor, but now I'm getting another error that I'm not sure is related.
The problem is that this loop can keep following parents until one of them is a nullptr, at which point it crashes:

while (parent->state.empty()) {

In my case it runs the body of the while loop once in full and then crashes the second time it evaluates the condition.

I get there the first time TFRecurrentLanguageModel::forward is called; it is entered from this line:

request_graph.add_cache(const_cast<ScoresWithContext*>(sc));
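
For context, a minimal sketch of the failure mode as I read it, assuming the loop walks up a linked history of ScoresWithContext nodes; apart from state, every member name here is an assumption made for illustration:

    #include <vector>

    // Sketch only: the parent link and the root's nullptr parent are assumptions.
    struct ScoresWithContext {
        std::vector<float>       state;   // empty until this node has been evaluated
        ScoresWithContext const* parent;  // nullptr at the root of the history
    };

    // sc is the node passed in via add_cache above.
    ScoresWithContext const* parent = sc->parent;
    while (parent->state.empty())  // once parent == nullptr, this check segfaults
        parent = parent->parent;

If no ancestor carries a non-empty state, the walk steps past the root, and the next evaluation of the condition dereferences a null pointer.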

mattiadg (Author) commented Dec 15, 2021

If I change the while condition to
while (parent != nullptr && parent->state.empty())
then it crashes at

require(initial_cache != nullptr);

Now I'll try to work out whether the first error is linked to the absence of a bias, though I currently don't see how it could be.
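
A guess at why that require fires, sketched here reusing the ScoresWithContext struct from the sketch above; the helper function and the fallback to an initial cache are assumptions, not the actual RASR code:

    // Hedged sketch: find_cached is a hypothetical helper standing in for
    // the guarded walk; only the loop condition is taken from the real code.
    ScoresWithContext const* find_cached(ScoresWithContext const* sc) {
        ScoresWithContext const* node = sc->parent;
        while (node != nullptr && node->state.empty())  // guarded walk
            node = node->parent;
        return node;  // nullptr when no ancestor carries any state
    }

    // At the call site, presumably something like:
    //   ScoresWithContext const* initial_cache = find_cached(sc);
    //   require(initial_cache != nullptr);  // fires: the chain had no state

If that reading is right, both crashes have the same underlying cause: no node in the history carries any state, so the null guard only moves the failure from a segfault to a precondition check.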

mattiadg (Author) commented

I think the problem is related to #11, because BlasNceSoftmaxAdapter::get_score contains this line, which assumes the bias tensor exists:

result += tensors_[1].data<float>()[output_idx];
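
A defensive sketch of how that access could be guarded, assuming tensors_ is a std::vector-like container with the weights at index 0 and the bias at index 1 (only the tensors_[1] bias access itself is taken from the real code):

    // Hedged sketch: add the bias only when the exported graph actually has one.
    if (tensors_.size() > 1 && tensors_[1].data<float>() != nullptr)
        result += tensors_[1].data<float>()[output_idx];
    // else: score without a bias, matching a graph exported without that tensor

With a guard like this, a graph exported without the bias (as in #11) would score without it instead of reading a tensor that was never loaded.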

mattiadg (Author) commented

Update: the output layer wrapped by BlasNceSoftmaxAdapter now has its bias tensor, but I still get the precondition failure. Something odd happens in the history when using this layer.

mattiadg (Author) commented

The error doesn't occur when the LSTM in RETURNN is compiled with the option "initial_state": "keep_over_epoch_no_init". This is probably because the network was trained with that option and it was missing from my compiled graph.
