Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing the non-deterministic behaviour of multi-thread FA. #38

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

afarajian
Copy link

I noticed that in Linux the code is not thread-safe, and produces different models and alignments in different runs. In Mac, however, it behaves deterministically. I assume this is due to different thread handling mechanisms of Linux and Mac.
The issue is due to the Increment function in ttables.h where there is a race between the threads in updating the table. The safest and most efficient way to handle this situation is to use "atomic" declaration which just locks the current cell for the given thread and leaves the rest of the table free for the other threads to write on.
By this fix we can solve the issue and obtain identical models in different runs. However, I noticed that in very few cases the final alignment might slightly vary. This is another issue which doesn't have anything to do the with the thread safety of the code. I believe it happens only in the very special cases where the probability of aligning a source word to target words is very low (almost zero). In those cases the code might select one of the candidate by chance. In my test set this situation happens only in 0.005% of the cases.

Copy link

@janetzki janetzki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @PersianNLPer, thank you so much for finding and fixing this non-determinism. I had the same issue, and it works on my machine if you remove the . (see below).

counts[e].find(f)->second += x; // Ignore race conditions here.
// Each thread locks only the part of the table which is writing on. Other threads have access to the rest of the table to update!
#pragma omp atomic
counts[e].find(f)->second += x;.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the . at the end intended? It does not compile on my machine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants