Training the downstream age and gender models #4

Open
kebinC opened this issue Dec 23, 2020 · 6 comments


kebinC commented Dec 23, 2020

Hello, a quick question: are the downstream age and gender classification models run with PeterRec_noncau_parallel_classifier.py?

@yuan2961634811

Yes.


kebinC commented Jan 4, 2021

Yes.

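import numpy as np

def random_negs(l, r, no, s):
    # Draw `no` negative labels uniformly from [l, r), rejecting only the
    # positive label `s`. Draws are independent, so the same negative label
    # can appear more than once in the result.
    # set_s = set(s)
    negs = []
    for i in range(no):
        t = np.random.randint(l, r)
        # while (t in set_s):
        while t == s:
            t = np.random.randint(l, r)
        negs.append(t)
    return negs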

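PeterRec_noncau_parallel_classifier.py uses this code to sample negatives for the classification test. It can draw the same class as a negative more than once (duplicate negatives). Wouldn't that inflate the classification metrics?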
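
For comparison, here is a minimal sketch of a duplicate-free sampler. It is an illustration only, not code from the repository: the name `random_negs_unique` and the `seed` parameter are hypothetical, and it swaps `np.random.randint` for the standard library's `random.sample`, which draws without replacement.

```python
import random

def random_negs_unique(l, r, no, s, seed=None):
    """Hypothetical duplicate-free variant of random_negs (not from the repository)."""
    rng = random.Random(seed)
    # Candidate labels: every integer in [l, r) except the positive label s.
    candidates = [t for t in range(l, r) if t != s]
    # sample() draws without replacement, so no label can appear twice;
    # it raises ValueError if `no` exceeds the number of candidates.
    return rng.sample(candidates, no)

# Example: 5 distinct negatives from labels 0..9, with 3 as the positive label.
negs = random_negs_unique(0, 10, 5, s=3, seed=0)
print(negs)
```

With only a few classes, `no` cannot exceed the number of non-positive labels, so the caller has to choose it accordingly.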


yuan2961634811 commented Jan 4, 2021 via email


kebinC commented Jan 4, 2021

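Hello, from a statistical point of view this does not affect the results. Of course, you can also remove the duplicates if you prefer.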

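For downstream tasks such as age and gender, there are only a few classes, so testing against randomly sampled negatives statistically inflates the metrics. In effect, the test is run over fewer classes.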

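In my actual runs, without deduplication the metrics are close to those in your paper, only slightly lower; with deduplication they are about 10 points lower.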
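
The "fewer classes" effect can be estimated with a short calculation. The sketch below assumes negatives are drawn independently and uniformly from the non-positive labels, as `random_negs` does. The function name and the example numbers (5 candidate labels, 5 draws) are hypothetical and are not taken from the paper or the repository.

```python
def expected_distinct_negs(num_candidates, num_draws):
    """Expected number of distinct labels among `num_draws` uniform draws
    with replacement from `num_candidates` labels."""
    m, n = num_candidates, num_draws
    # A given label is missed by every draw with probability (1 - 1/m)**n,
    # so it appears at least once with probability 1 - (1 - 1/m)**n.
    return m * (1.0 - (1.0 - 1.0 / m) ** n)

# Hypothetical setting: 6 classes (5 possible negatives) and 5 sampled negatives.
print(expected_distinct_negs(5, 5))
```

Under these assumptions, 5 draws cover roughly 3.36 distinct negative classes on average, so each test example competes against fewer classes than the nominal 5.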


yuan2961634811 commented Jan 4, 2021 via email


kebinC commented Jan 4, 2021

Hello, you only need to keep the evaluation consistent across all baselines.

OK
