AWS GPU Configuration #868

Draft
hameerabbasi wants to merge 2 commits into main from aws-gpu

hameerabbasi (Collaborator) commented May 5, 2025

This PR adds support for the following features (COO format only):

  • Element-wise operations (where supported by CuPy).
  • Sum along an axis.
  • Matmul.
  • Converting to/from cupyx.scipy.sparse matrices.
  • to_device with "cpu" and CuPy devices, only with stream=None.
  • Constructing from CuPy arrays.
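
For readers less familiar with the COO layout, here is a minimal CPU-only sketch of what "sum along an axis" and "matmul" mean for coordinate-format data. It uses plain Python lists and dicts; `coo_sum` and `coo_matmul` are hypothetical helper names made up for illustration, not functions from this PR or from CuPy, and nothing here runs on a GPU.

```python
from collections import defaultdict

# Illustration only: coo_sum and coo_matmul are hypothetical helpers,
# not part of this PR's API. A COO array is modeled as a list of index
# tuples (coords), a parallel list of values (data), and a shape tuple.


def coo_sum(coords, data, shape, axis):
    """Sum a COO array along one axis by dropping that index and accumulating."""
    acc = defaultdict(float)
    for idx, value in zip(coords, data):
        reduced = idx[:axis] + idx[axis + 1:]  # remove the summed-over index
        acc[reduced] += value
    keys = sorted(acc)
    return keys, [acc[k] for k in keys], shape[:axis] + shape[axis + 1:]


def coo_matmul(a_coords, a_data, a_shape, b_coords, b_data, b_shape):
    """Multiply two 2-D COO matrices, touching only stored entries."""
    if a_shape[1] != b_shape[0]:
        raise ValueError("inner dimensions do not match")
    # Group the entries of B by row so each entry of A finds its partners quickly.
    b_rows = defaultdict(list)
    for (k, j), value in zip(b_coords, b_data):
        b_rows[k].append((j, value))
    acc = defaultdict(float)
    for (i, k), a_value in zip(a_coords, a_data):
        for j, b_value in b_rows.get(k, ()):
            acc[(i, j)] += a_value * b_value
    keys = sorted(k for k, v in acc.items() if v != 0)  # drop explicit zeros
    return keys, [acc[k] for k in keys], (a_shape[0], b_shape[1])


# A = [[1, 0, 2],
#      [0, 3, 0]]
a_coords, a_data, a_shape = [(0, 0), (0, 2), (1, 1)], [1.0, 2.0, 3.0], (2, 3)
# B = [[1, 0],
#      [0, 1],
#      [4, 0]]
b_coords, b_data, b_shape = [(0, 0), (1, 1), (2, 0)], [1.0, 1.0, 4.0], (3, 2)

col_sums = coo_sum(a_coords, a_data, a_shape, axis=0)
product = coo_matmul(a_coords, a_data, a_shape, b_coords, b_data, b_shape)
print(col_sums)  # ([(0,), (1,), (2,)], [1.0, 3.0, 2.0], (3,))
print(product)   # ([(0, 0), (1, 1)], [9.0, 3.0], (2, 2))
```

Both helpers iterate only over stored entries, which is the property that makes COO attractive for sparse data; an actual GPU implementation would presumably rely on vectorized array operations rather than Python loops.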

hameerabbasi force-pushed the aws-gpu branch 2 times, most recently from b99e7a5 to 185c956 (May 5, 2025 08:14)


codspeed-hq bot commented May 5, 2025

CodSpeed Performance Report

Merging #868 will degrade performance by 97.81%

Comparing aws-gpu (55ae3bc) with main (afb5212)

Summary

⚡ 10 improvements
❌ 151 regressions
✅ 179 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| Benchmark | BASE | HEAD | Change |
|---|---|---|---|
| test_elemwise[side=100-rank=1-format='coo'-add] | 2.9 ms | 3.7 ms | -21.3% |
| test_elemwise[side=100-rank=1-format='coo'-mul] | 2.2 ms | 2.7 ms | -20.45% |
| test_elemwise[side=100-rank=1-format='gcxs'-add] | 3.4 ms | 4.6 ms | -26.65% |
| test_elemwise[side=100-rank=1-format='gcxs'-mul] | 2.7 ms | 3.7 ms | -27.45% |
| test_elemwise[side=100-rank=2-format='coo'-add] | 3.3 ms | 4.1 ms | -19.78% |
| test_elemwise[side=100-rank=2-format='coo'-mul] | 2.4 ms | 2.9 ms | -17.56% |
| test_elemwise[side=100-rank=2-format='gcxs'-add] | 6.7 ms | 7.7 ms | -13.02% |
| test_elemwise[side=100-rank=2-format='gcxs'-mul] | 5.8 ms | 6.5 ms | -10.95% |
| test_elemwise[side=1000-rank=1-format='coo'-add] | 2.9 ms | 3.8 ms | -22.4% |
| test_elemwise[side=1000-rank=1-format='coo'-mul] | 2.2 ms | 2.7 ms | -20.34% |
| test_elemwise[side=1000-rank=1-format='gcxs'-add] | 3.4 ms | 4.7 ms | -27.39% |
| test_elemwise[side=1000-rank=1-format='gcxs'-mul] | 2.7 ms | 3.7 ms | -27.25% |
| test_elemwise[side=500-rank=1-format='coo'-add] | 2.9 ms | 3.8 ms | -22.45% |
| test_elemwise[side=500-rank=1-format='coo'-mul] | 2.2 ms | 2.7 ms | -20.39% |
| test_elemwise[side=500-rank=1-format='gcxs'-add] | 3.4 ms | 4.7 ms | -27.43% |
| test_elemwise[side=500-rank=1-format='gcxs'-mul] | 2.7 ms | 3.7 ms | -27.29% |
| test_elemwise[side=500-rank=2-format='coo'-add] | 7.1 ms | 8 ms | -10.33% |
| test_elemwise[side=500-rank=2-format='coo'-mul] | 3.9 ms | 4.4 ms | -11.76% |
| test_elemwise_broadcast[side=100-format='coo'-mul] | 2.6 ms | 3.2 ms | -18.38% |
| test_elemwise_broadcast[side=100-format='gcxs'-mul] | 6.4 ms | 7.4 ms | -13.71% |
| ... | ... | ... | ... |

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

hameerabbasi force-pushed the aws-gpu branch 4 times, most recently from 4e69a90 to ae8e04f (May 7, 2025 11:51)