It seems the inference is very slow on my Linux server? #9
Comments
Hello, xiaoxiong:
Hi, NJU-Jet, so you mean that this model is suited to running on a mobile device's GPU (not CPU)? Do you mean that it is fast on a mobile device's GPU and that I had better test it on your AI Benchmark app (using the mobile device's GPU)?
I am also confused about why tflite models run so slowly on a desktop CPU (not only this model, but all other models too). They are fast on mobile devices.
The desktop CPU is not optimized for integer operations, so TFLite runs very slowly there. TFLite can run fast on mobile devices such as ARM devices. If you want to improve the speed and your TensorFlow version is 2.x, you can use the following method:
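For example (a minimal sketch; the import, the placeholder model path, and the allocate_tensors() call are added here for completeness):

```python
import tensorflow as tf

quantized_model_path = 'model_quant.tflite'  # placeholder: path to your quantized .tflite model
num_threads = 16                             # adjust to the number of CPU cores

# Create the TFLite interpreter with multi-threaded CPU execution (TensorFlow 2.x).
interpreter = tf.lite.Interpreter(model_path=quantized_model_path, num_threads=num_threads)
interpreter.allocate_tensors()
```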
where 'num_threads' can be set according to the number of CPU cores, for example num_threads = 16. I hope my answer is helpful to you guys.
Thank you so much!
Hi, dear NJU-Jet,
My Linux server has several 2.6 GHz CPUs and several V100 GPUs, and I ran generate_tflite.py to get a quantized model.
Then, in the evaluate function, I added the code below to measure the inference time:
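(The snippet itself was not included above; as an illustrative sketch, timing around interpreter.invoke() would look roughly like this, with a placeholder model path:)

```python
import time
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='model_quant.tflite')  # placeholder path
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Dummy low-resolution input with the shape/dtype the quantized model expects.
lr = np.zeros(input_details[0]['shape'], dtype=input_details[0]['dtype'])
interpreter.set_tensor(input_details[0]['index'], lr)

start = time.time()
interpreter.invoke()  # inference runs on the CPU via the TFLite runtime
print('inference time: %.2f s' % (time.time() - start))

sr = interpreter.get_tensor(output_details[0]['index'])
```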
It seems the inference is very slow: it costs about 70 seconds per image.
I wonder whether this inference runs on the CPU or the GPU, and why it is so slow?
Thank you very much!