
it seems the inference is very slow on my linux server? #9

Open
xiaoxiongli opened this issue Jun 4, 2021 · 5 comments

Comments


xiaoxiongli commented Jun 4, 2021

Hi NJU-Jet,

My Linux server has several 2.6 GHz CPUs and several V100 GPUs, and I ran generate_tflite.py to get a quantized model.

Then, in the evaluate function, I added code to measure the inference time (the actual snippet was attached as a screenshot):
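A minimal sketch of what such timing code typically looks like (the model path and variable names here are assumptions, not the original snippet from the screenshot):

```python
import time

import numpy as np
import tensorflow as tf

# Load the quantized model produced by generate_tflite.py (path is a placeholder).
interpreter = tf.lite.Interpreter(model_path='quantized_model.tflite')
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Dummy input with the shape/dtype the model expects.
dummy = np.zeros(input_details[0]['shape'], dtype=input_details[0]['dtype'])
interpreter.set_tensor(input_details[0]['index'], dummy)

start = time.time()
interpreter.invoke()
print('Inference time: %.2f s' % (time.time() - start))
output = interpreter.get_tensor(output_details[0]['index'])
```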

The inference seems very slow: it takes about 70 seconds per image.

Does this inference run on the CPU or the GPU, and why is it so slow?

thank you very much!


NJU-Jet commented Jun 4, 2021

Hello, xiaoxiong:
It runs on the CPU, so it's really slow. Other participants and I have also encountered the same problem.


xiaoxiongli commented Jun 4, 2021

Hi NJU-Jet,
Thank you for your reply!

So you mean this model is suitable for running on a mobile device's GPU (not the CPU)?

Do you mean that it is fast on a mobile device's GPU, and that I'd better test it with your AI benchmark app (using the mobile device's GPU)?


NJU-Jet commented Jun 4, 2021

I am also confused about why TFLite models run so slowly on a desktop CPU (not only this model, but all other models as well). They're fast on mobile devices.

@duanshengliu commented (quoting the reply above):

> I am also confused about why TFLite models run so slowly on a desktop CPU (not only this model, but all other models as well). They're fast on mobile devices.

The desktop CPU is not optimized for integer operations, so TFLite runs very slowly there. TFLite can run fast on mobile devices, such as ARM devices. If you want to improve the speed and your TensorFlow version is 2.x, you can use the following:

interpreter = tf.lite.Interpreter(model_path=quantized_model_path, num_threads=num_threads)

where num_threads can be adjusted according to the number of CPU cores, for example num_threads=16. I hope my answer is helpful to you guys.
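For reference, a minimal end-to-end sketch of this setup (the model path and thread count are placeholders, not values from the repository):

```python
import tensorflow as tf

quantized_model_path = 'quantized_model.tflite'  # placeholder path

# Let the TFLite interpreter use multiple CPU threads (TensorFlow 2.x);
# set num_threads to roughly the number of physical cores on the machine.
interpreter = tf.lite.Interpreter(model_path=quantized_model_path, num_threads=16)
interpreter.allocate_tensors()
```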


NJU-Jet commented May 25, 2022 via email
