Followup to #492: Enable C function wrapping #505

Merged 1 commit on Dec 24, 2024

christiangnrd

The added functions are not especially useful, since they are essentially constructors, but this change removes the type piracy of `Generators.skip_check`.

Ignore MPS functions since MPS does not seem to have a dylib. If anyone ever needs those functions I can look into it.

This also includes a few improvements to `wrap.jl` to make development easier.

@github-actions github-actions bot left a comment

Metal Benchmarks

| Benchmark suite | Current: 19da2bf | Previous: ea1d6ad | Ratio |
| --- | --- | --- | --- |
| private array/construct | 27000 ns | 27217.214285714286 ns | 0.99 |
| private array/broadcast | 465583 ns | 464500 ns | 1.00 |
| private array/random/randn/Float32 | 825041.5 ns | 826000 ns | 1.00 |
| private array/random/randn!/Float32 | 678000 ns | 658334 ns | 1.03 |
| private array/random/rand!/Int64 | 575667 ns | 572854 ns | 1.00 |
| private array/random/rand!/Float32 | 610042 ns | 598542 ns | 1.02 |
| private array/random/rand/Int64 | 765708 ns | 771187.5 ns | 0.99 |
| private array/random/rand/Float32 | 617208 ns | 585687.5 ns | 1.05 |
| private array/copyto!/gpu_to_gpu | 689895.5 ns | 662042 ns | 1.04 |
| private array/copyto!/cpu_to_gpu | 621708 ns | 683709 ns | 0.91 |
| private array/copyto!/gpu_to_cpu | 828833 ns | 804500 ns | 1.03 |
| private array/accumulate/1d | 1310042 ns | 1317083 ns | 0.99 |
| private array/accumulate/2d | 1379375 ns | 1373541 ns | 1.00 |
| private array/iteration/findall/int | 2061063 ns | 2028709 ns | 1.02 |
| private array/iteration/findall/bool | 1781166 ns | 1812688 ns | 0.98 |
| private array/iteration/findfirst/int | 1691041 ns | 1707834 ns | 0.99 |
| private array/iteration/findfirst/bool | 1665667 ns | 1660229 ns | 1.00 |
| private array/iteration/scalar | 3606833.5 ns | 3568021 ns | 1.01 |
| private array/iteration/logical | 3165125 ns | 3163792 ns | 1.00 |
| private array/iteration/findmin/1d | 1750625 ns | 1739750 ns | 1.01 |
| private array/iteration/findmin/2d | 1345146 ns | 1349604 ns | 1.00 |
| private array/reductions/reduce/1d | 1047291 ns | 1035312.5 ns | 1.01 |
| private array/reductions/reduce/2d | 652000 ns | 654666 ns | 1.00 |
| private array/reductions/mapreduce/1d | 1045667 ns | 1034083 ns | 1.01 |
| private array/reductions/mapreduce/2d | 662458 ns | 661125 ns | 1.00 |
| private array/permutedims/4d | 2480354 ns | 2484604 ns | 1.00 |
| private array/permutedims/2d | 1021250 ns | 1024083 ns | 1.00 |
| private array/permutedims/3d | 1589041 ns | 1571500 ns | 1.01 |
| private array/copy | 578416.5 ns | 577000 ns | 1.00 |
| latency/precompile | 5788059541.5 ns | 5769911666.5 ns | 1.00 |
| latency/ttfp | 6616711062.5 ns | 6647448292 ns | 1.00 |
| latency/import | 1169882708 ns | 1167766604 ns | 1.00 |
| integration/metaldevrt | 726895.5 ns | 719229 ns | 1.01 |
| integration/byval/slices=1 | 1589833 ns | 1521583.5 ns | 1.04 |
| integration/byval/slices=3 | 10783021 ns | 9443084 ns | 1.14 |
| integration/byval/reference | 1615708 ns | 1487604 ns | 1.09 |
| integration/byval/slices=2 | 2608458 ns | 2653771 ns | 0.98 |
| kernel/indexing | 456792 ns | 531041 ns | 0.86 |
| kernel/indexing_checked | 453916.5 ns | 472333 ns | 0.96 |
| kernel/launch | 8500 ns | 10201.5 ns | 0.83 |
| metal/synchronization/stream | 14500 ns | 13917 ns | 1.04 |
| metal/synchronization/context | 14708 ns | 14625 ns | 1.01 |
| shared array/construct | 25548.666666666668 ns | 26330.357142857145 ns | 0.97 |
| shared array/broadcast | 477000 ns | 476083 ns | 1.00 |
| shared array/random/randn/Float32 | 827770.5 ns | 768750 ns | 1.08 |
| shared array/random/randn!/Float32 | 678792 ns | 657000 ns | 1.03 |
| shared array/random/rand!/Int64 | 568666 ns | 554959 ns | 1.02 |
| shared array/random/rand!/Float32 | 609625 ns | 599625 ns | 1.02 |
| shared array/random/rand/Int64 | 772500 ns | 735208 ns | 1.05 |
| shared array/random/rand/Float32 | 621875 ns | 626959 ns | 0.99 |
| shared array/copyto!/gpu_to_gpu | 87917 ns | 87500 ns | 1.00 |
| shared array/copyto!/cpu_to_gpu | 87584 ns | 87125 ns | 1.01 |
| shared array/copyto!/gpu_to_cpu | 78833 ns | 82292 ns | 0.96 |
| shared array/accumulate/1d | 1349042 ns | 1329375 ns | 1.01 |
| shared array/accumulate/2d | 1387750 ns | 1383292 ns | 1.00 |
| shared array/iteration/findall/int | 1821917 ns | 1790458 ns | 1.02 |
| shared array/iteration/findall/bool | 1585666.5 ns | 1556709 ns | 1.02 |
| shared array/iteration/findfirst/int | 1389312.5 ns | 1376958 ns | 1.01 |
| shared array/iteration/findfirst/bool | 1363709 ns | 1355917 ns | 1.01 |
| shared array/iteration/scalar | 155417 ns | 152208 ns | 1.02 |
| shared array/iteration/logical | 2947354.5 ns | 2949792 ns | 1.00 |
| shared array/iteration/findmin/1d | 1454708 ns | 1454250 ns | 1.00 |
| shared array/iteration/findmin/2d | 1362041.5 ns | 1354125 ns | 1.01 |
| shared array/reductions/reduce/1d | 733791.5 ns | 728312.5 ns | 1.01 |
| shared array/reductions/reduce/2d | 669959 ns | 662854.5 ns | 1.01 |
| shared array/reductions/mapreduce/1d | 741584 ns | 726833.5 ns | 1.02 |
| shared array/reductions/mapreduce/2d | 672000 ns | 657667 ns | 1.02 |
| shared array/permutedims/4d | 2539229.5 ns | 2555792 ns | 0.99 |
| shared array/permutedims/2d | 1023042 ns | 1009333 ns | 1.01 |
| shared array/permutedims/3d | 1589666 ns | 1585666 ns | 1.00 |
| shared array/copy | 241500.5 ns | 248500 ns | 0.97 |

This comment was automatically generated by workflow using github-action-benchmark.

@maleadt merged commit 6a760a6 into main on Dec 24, 2024
2 checks passed
@maleadt deleted the wrapperfollowup branch on December 24, 2024 at 10:01