During TDM 3.0 training, samples are obtained via beam search, and each node's target value is then computed as the max over that node's children in the sample. But suppose a single sample takes 0.3s; with 2048 samples per batch, wouldn't one batch take nearly 600s? Isn't that extremely expensive?
I don't quite follow the concern. Both the beam search and the max over children can be performed batch-wise, so there is no per-sample serial overhead. For the details, you can refer to our TF implementation: http://proceedings.mlr.press/v119/zhuo20a.html
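To make the batch-wise point concrete, here is a minimal NumPy sketch of beam search over a complete binary tree where all 2048 samples advance one tree level per batched model call; `scores_fn` is a hypothetical stand-in for the model's batched forward pass, not the actual TDM API:

```python
import numpy as np

def batch_beam_search(scores_fn, batch_size, depth, k):
    """Batch-wise beam search over a complete binary tree.

    scores_fn(nodes) -> scores: scores a (batch, n_candidates) array of
    node ids in one batched model call (hypothetical interface).
    Returns the surviving beam of node ids at the leaf level, (batch, k).
    """
    # Every sample's beam starts at the root (node id 0).
    beam = np.zeros((batch_size, 1), dtype=np.int64)
    for _ in range(depth):
        # Expand each beam node to its two children: 2*i+1 and 2*i+2.
        cand = np.concatenate([2 * beam + 1, 2 * beam + 2], axis=1)
        scores = scores_fn(cand)              # one batched forward pass
        # Keep the top-k children per sample -- no per-sample loop,
        # and the max over children is likewise a batched reduction.
        top = np.argsort(-scores, axis=1)[:, :k]
        beam = np.take_along_axis(cand, top, axis=1)
    return beam
```

The whole batch thus costs `depth` batched model calls rather than `batch_size * depth` sequential ones, which is why the per-sample 0.3s figure does not multiply across the batch.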
Piggybacking on this issue with a question: as shown in the figure, doing retrieval for one user top-down requires O(2 · logN · K) deep-model inferences in total. With N = 1,000,000 we have logN = 20, and with K = 10 (K is usually larger than 10), that comes to 2 × 20 × 10 = 400 deep-model forward passes.
This is roughly 400× the compute of the usual YouTube DNN + ANN approach, and that is the worst-case order of magnitude.
Even if all of these run in parallel, the overall p99 latency effectively becomes p(99 × 400).
Could you explain how this is handled in the production engineering implementation?
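A back-of-the-envelope sketch of the arithmetic in the question above (N = 1,000,000 items, beam size K = 10), together with the batching point from the earlier reply; the variable names are illustrative, not from the TDM codebase:

```python
import math

# Numbers taken from the question above.
N = 1_000_000            # corpus size
K = 10                   # beam size
depth = math.ceil(math.log2(N))   # ~20 tree levels

# At each level, the K beam nodes expand to 2K child candidates,
# so the total number of candidate scorings per request is:
total_scorings = depth * 2 * K    # 20 * 2 * 10 = 400

# However, the 2K candidates at a given level can all be scored in a
# single batched forward pass, so the serial depth of the computation
# is only `depth` model calls, not `total_scorings`.
serial_calls = depth              # 20 batched passes

print(total_scorings, serial_calls)   # 400 20
```

So the 400 scorings are not 400 sequential inferences: the latency-critical path is ~20 batched passes of 20 candidates each.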
Where can I find version 3.0? The highest version I can see is 1.2.