How can I write training code for the simple Skip-gram model? I tried to fix your code, please see this issue! #23

Open
young-hun-jo opened this issue Dec 5, 2021 · 1 comment

young-hun-jo commented Dec 5, 2021

I want to write code that trains the simple Skip-gram model. When I tried it, unlike with the CBOW training code, I ran into an error.
First, I used the provided trainer.py and simple_skip_gram.py and ran the training code below, modeled on train.py (the train.py that works with the SimpleCBOW model).

# Train the simple Skip-gram model
import numpy as np
from common.util import preprocess, create_contexts_target, convert_one_hot
from common.optimizer import Adam
from common.trainer import Trainer
from simple_skipgram import SimpleSkipGram

# 1. Preprocess the corpus
window_size = 1

text = 'You say goodbye and I say Hello.'
corpus, word_to_id, id_to_word = preprocess(text)
contexts, target = create_contexts_target(corpus, window_size)

vocab_size = len(word_to_id)
contexts_ohe = convert_one_hot(contexts, vocab_size)
target_ohe = convert_one_hot(target, vocab_size)

# 2. Set hyperparameters
hidden_size = 5
batch_size = 3
epochs = 1000

# 3. Build the Skip-gram model
model = SimpleSkipGram(vocab_size, hidden_size)
optimizer = Adam()
trainer = Trainer(model, optimizer)

# 4. Train
trainer.fit(x=target_ohe, 
            t=contexts_ohe, max_epochs=epochs, batch_size=batch_size)

But this code raises the following error:
[Screenshot of the error traceback, 2021-12-05 2:50 PM]

So I found that this error is caused by the Matmul class in layer.py, and I modified the original Matmul class in layer.py as shown below.

class Matmul:
    def __init__(self, W):
        self.params = [W]
        self.grads = [np.zeros_like(W)]
        self.x = None
        
    def forward(self, x):
        W, = self.params 
        out = np.matmul(x, W)
        self.x = x
        return out
    
    def backward(self, dout):
        W, = self.params
        # I added the two if statements below: sum over the context axis when the arrays are 3-D
        if dout.ndim == 3:
            dout = np.sum(dout, axis=1)
        if self.x.ndim == 3:
            self.x = np.sum(self.x, axis=1)
        dx = np.matmul(dout, W.T)
        dW = np.matmul(self.x.T, dout)
        self.grads[0][...] = dW
        return dx
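
For reference, a quick shape check (a minimal sketch, assuming the preprocessing code above has already run, and that SimpleSkipGram feeds its second forward argument into the input Matmul layer as in the book's simple_skip_gram.py) shows where the 3-D array comes from:

# contexts_ohe stacks the two context words per sample, so it is 3-D,
# while target_ohe is the usual 2-D one-hot matrix.
print(contexts_ohe.shape)  # (6, 2, 7): 6 samples, 2 context words, vocab of 7
print(target_ohe.shape)    # (6, 7)
# Because I passed x=target_ohe and t=contexts_ohe, the input Matmul layer
# ends up receiving the 3-D contexts_ohe array.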

After running this fixed code, the Skip-gram training runs successfully. But compared with SimpleCBOW, the loss value is higher and does not decrease. I want to check whether my code is correct. Under these circumstances, is the reason my simple Skip-gram model has a high loss simply that the corpus is very small?
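
One more thought on the loss value (a rough estimate, assuming SimpleSkipGram returns the sum of the two context-word losses, l1 + l2): because two softmax cross-entropy losses are summed, the Skip-gram loss should naturally be about twice the CBOW loss on the same data.

import numpy as np

# With vocab_size = 7, an untrained softmax assigns roughly 1/7 to the correct word,
# so each cross-entropy term starts near ln(7).
print(np.log(7))      # ~1.95 -> roughly the initial SimpleCBOW loss
print(2 * np.log(7))  # ~3.89 -> roughly the initial SimpleSkipGram loss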

If my fixed code is wrong, how should the original code be revised? Please reply. I am learning a lot from your book. Thanks!

haithink commented Jan 9, 2024

vocab_size = len(word_to_id)
contexts, target = create_contexts_target(corpus, window_size)
target = convert_one_hot(target, vocab_size)
contexts = convert_one_hot(contexts, vocab_size)

model = SimpleSkipGram(vocab_size, hidden_size)
optimizer = Adam()
trainer = Trainer(model, optimizer)

trainer.fit(contexts, target, max_epoch, batch_size)
trainer.plot()

This code will work. The key point is to pass contexts as the first argument (x) and target as the second (t), which matches the argument order expected by SimpleSkipGram.forward(contexts, target), so no change to the Matmul class is needed.
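
After training, the learned word vectors can be inspected the same way as in the CBOW train.py (a small sketch, assuming SimpleSkipGram stores W_in as word_vecs like SimpleCBOW does):

# Print the distributed representation of each word
word_vecs = model.word_vecs
for word_id, word in id_to_word.items():
    print(word, word_vecs[word_id])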
