add interpolate_like for cpu #10544

woaixiaoxiao · 2024-07-09T08:23:26Z

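While running SD inference on NPU, we found that the interpolate_like operator is used, but the open-source version does not yet provide a CPU implementation of it.

This PR therefore ports the forward computation of the interpolate_like operator from the closed-source version to the open-source version.

After building oneflow from this branch, the interpolate_like operator can be called on both CPU and NPU.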
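For readers unfamiliar with the operator, here is a minimal pure-Python sketch of what a nearest-mode forward pass could look like. It assumes, based only on the operator name and the `(input, like, mode, align_corners)` signature in this PR, that the output takes its spatial size from `like`. The function name `nearest_interpolate_like` is hypothetical, and this is not the OneFlow kernel added here.

```python
def nearest_interpolate_like(inp, like):
    """Resize the 2D grid `inp` to the height and width of `like`.

    Illustrative only: uses nearest-neighbor sampling, where each output
    index maps to floor(dst_index * in_size / out_size) in the input.
    Only the shape of `like` is read; its values are ignored.
    """
    in_h, in_w = len(inp), len(inp[0])
    out_h, out_w = len(like), len(like[0])
    out = []
    for y in range(out_h):
        # Integer floor division implements floor(y * in_h / out_h).
        src_y = min(y * in_h // out_h, in_h - 1)
        row = []
        for x in range(out_w):
            src_x = min(x * in_w // out_w, in_w - 1)
            row.append(inp[src_y][src_x])
        out.append(row)
    return out


# Usage: upsample a 2x2 grid to the size of a 4x4 reference grid.
small = [[1, 2], [3, 4]]
reference = [[0] * 4 for _ in range(4)]
resized = nearest_interpolate_like(small, reference)
```

With the 2x2 input above, each source value is repeated in a 2x2 block of the 4x4 result.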

github-actions · 2024-07-09T08:24:35Z

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

github-actions · 2024-07-09T09:37:08Z

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

Flowingsun007 · 2024-07-10T10:28:34Z

python/oneflow/nn/modules/interpolate_like.py

+def interpolate_like(
+    input, like, mode="nearest", align_corners=None,
+):
+    """The interface is consistent with PyTorch.


Following the existing interpolate entry, interpolate_like needs to be added to docs/source/nn.functional.rst.

github-actions · 2024-07-29T03:18:28Z

View latest API docs preview at: https://oneflow-staging.oss-cn-beijing.aliyuncs.com/docs/Oneflow-Inc/oneflow/pr/10544/

github-actions · 2024-07-29T03:39:30Z

Speed stats:

github-actions · 2024-08-02T09:57:26Z

CI failed when running job: cuda-misc. PR label automerge has been removed

github-actions · 2024-08-04T01:46:00Z

View latest API docs preview at: https://oneflow-staging.oss-cn-beijing.aliyuncs.com/docs/Oneflow-Inc/oneflow/pr/10544/

github-actions · 2024-08-04T02:26:11Z

Speed stats:

GPU Name: NVIDIA GeForce RTX 3080 Ti 

❌ OneFlow resnet50 time: 43.3ms (= 4326.7ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 57.4ms (= 5743.8ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.33 (= 57.4ms / 43.3ms)

OneFlow resnet50 time: 26.4ms (= 2635.4ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 37.3ms (= 3729.2ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.42 (= 37.3ms / 26.4ms)

OneFlow resnet50 time: 18.3ms (= 3664.8ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 35.0ms (= 7000.3ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.91 (= 35.0ms / 18.3ms)

OneFlow resnet50 time: 17.2ms (= 3439.3ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 31.5ms (= 6304.8ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.83 (= 31.5ms / 17.2ms)

OneFlow resnet50 time: 17.0ms (= 3398.1ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 29.2ms (= 5844.5ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.72 (= 29.2ms / 17.0ms)

OneFlow swin dataloader time: 0.200s (= 40.028s / 200, num_workers=1)
PyTorch swin dataloader time: 0.129s (= 25.747s / 200, num_workers=1)
Relative speed: 0.643 (= 0.129s / 0.200s)

OneFlow swin dataloader time: 0.054s (= 10.721s / 200, num_workers=4)
PyTorch swin dataloader time: 0.033s (= 6.686s / 200, num_workers=4)
Relative speed: 0.624 (= 0.033s / 0.054s)

OneFlow swin dataloader time: 0.030s (= 6.047s / 200, num_workers=8)
PyTorch swin dataloader time: 0.017s (= 3.340s / 200, num_workers=8)
Relative speed: 0.552 (= 0.017s / 0.030s)

❌ OneFlow resnet50 time: 49.1ms (= 4905.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 64.0ms (= 6400.6ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.30 (= 64.0ms / 49.1ms)

OneFlow resnet50 time: 36.9ms (= 3693.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 46.1ms (= 4609.2ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.25 (= 46.1ms / 36.9ms)

OneFlow resnet50 time: 27.7ms (= 5540.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 40.8ms (= 8152.4ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.47 (= 40.8ms / 27.7ms)

OneFlow resnet50 time: 25.3ms (= 5051.7ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 38.6ms (= 7722.2ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.53 (= 38.6ms / 25.3ms)

OneFlow resnet50 time: 24.8ms (= 4969.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 35.8ms (= 7169.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.44 (= 35.8ms / 24.8ms)

github-actions · 2024-08-08T02:16:49Z

Speed stats:

GPU Name: NVIDIA GeForce RTX 3080 Ti 

❌ OneFlow resnet50 time: 43.3ms (= 4327.0ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 57.3ms (= 5733.0ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.32 (= 57.3ms / 43.3ms)

OneFlow resnet50 time: 26.1ms (= 2605.7ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 38.1ms (= 3806.4ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.46 (= 38.1ms / 26.1ms)

OneFlow resnet50 time: 17.6ms (= 3526.9ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 37.1ms (= 7429.9ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 2.11 (= 37.1ms / 17.6ms)

OneFlow resnet50 time: 16.9ms (= 3384.8ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 31.6ms (= 6314.3ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.87 (= 31.6ms / 16.9ms)

OneFlow resnet50 time: 17.3ms (= 3463.1ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 31.5ms (= 6291.4ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.82 (= 31.5ms / 17.3ms)

OneFlow swin dataloader time: 0.200s (= 40.004s / 200, num_workers=1)
PyTorch swin dataloader time: 0.130s (= 25.939s / 200, num_workers=1)
Relative speed: 0.648 (= 0.130s / 0.200s)

OneFlow swin dataloader time: 0.053s (= 10.569s / 200, num_workers=4)
PyTorch swin dataloader time: 0.032s (= 6.431s / 200, num_workers=4)
Relative speed: 0.609 (= 0.032s / 0.053s)

OneFlow swin dataloader time: 0.030s (= 5.965s / 200, num_workers=8)
PyTorch swin dataloader time: 0.017s (= 3.382s / 200, num_workers=8)
Relative speed: 0.567 (= 0.017s / 0.030s)

❌ OneFlow resnet50 time: 49.3ms (= 4928.0ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 64.0ms (= 6397.0ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.30 (= 64.0ms / 49.3ms)

OneFlow resnet50 time: 36.1ms (= 3608.1ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 45.6ms (= 4559.9ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.26 (= 45.6ms / 36.1ms)

OneFlow resnet50 time: 27.8ms (= 5550.9ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 39.8ms (= 7964.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.43 (= 39.8ms / 27.8ms)

OneFlow resnet50 time: 25.2ms (= 5035.4ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 40.4ms (= 8070.6ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.60 (= 40.4ms / 25.2ms)

OneFlow resnet50 time: 24.7ms (= 4945.3ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 35.8ms (= 7158.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.45 (= 35.8ms / 24.7ms)

woaixiaoxiao added 2 commits July 9, 2024 07:47

success version

82c7543

clean

326cde9

woaixiaoxiao requested a review from hjchen2 as a code owner July 9, 2024 08:23

auto format by CI

0615dd3

woaixiaoxiao added 4 commits July 9, 2024 08:26

clean

73f8a6c

Merge branch 'add_interpolate_like' of github.com:Oneflow-Inc/oneflow…

a615764

… into add_interpolate_like

clean

3569303

clean

daefdb2

woaixiaoxiao requested review from ShawnXuan and Flowingsun007 July 9, 2024 08:39

woaixiaoxiao and others added 2 commits July 9, 2024 09:35

add test

169829b

auto format by CI

c745ab6

woaixiaoxiao added 2 commits July 9, 2024 09:38

clean

fc3633c

merge

1bc2807

ShawnXuan added enhancement op labels Jul 9, 2024

Flowingsun007 reviewed Jul 10, 2024

add rst

630707d

woaixiaoxiao requested a review from doombeaker as a code owner July 11, 2024 01:58

ShawnXuan requested a review from oneflow-ci-bot July 11, 2024 02:05

ShawnXuan and others added 4 commits July 11, 2024 11:03

Merge branch 'master' into add_interpolate_like

7212508

merge origin

f01d542

change docs

7abcbcb

merge

d025ab6

Flowingsun007 approved these changes Jul 15, 2024

Flowingsun007 enabled auto-merge (squash) July 15, 2024 04:00

Flowingsun007 added the automerge label Jul 15, 2024

ShawnXuan added the need-test-distributed label Jul 29, 2024

ShawnXuan removed the need-test-distributed label Jul 29, 2024

ShawnXuan approved these changes Jul 29, 2024

ShawnXuan self-requested a review August 2, 2024 09:47

github-actions bot removed the automerge label Aug 2, 2024

fix (#10549)

a575d22

Merge branch 'master' into add_interpolate_like

edc22fa
