Fixing the non-deterministic behaviour of multi-thread FA. #38
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
I noticed that in Linux the code is not thread-safe, and produces different models and alignments in different runs. In Mac, however, it behaves deterministically. I assume this is due to different thread handling mechanisms of Linux and Mac.
The issue is due to the Increment function in ttables.h where there is a race between the threads in updating the table. The safest and most efficient way to handle this situation is to use "atomic" declaration which just locks the current cell for the given thread and leaves the rest of the table free for the other threads to write on.
By this fix we can solve the issue and obtain identical models in different runs. However, I noticed that in very few cases the final alignment might slightly vary. This is another issue which doesn't have anything to do the with the thread safety of the code. I believe it happens only in the very special cases where the probability of aligning a source word to target words is very low (almost zero). In those cases the code might select one of the candidate by chance. In my test set this situation happens only in 0.005% of the cases.