Improve z3 trace management #1916

Merged
tjruwase merged 33 commits into master from
olruwase/zero_inference_type_mismatch
May 6, 2022

@tjruwase tjruwase commented Apr 26, 2022

  • Trace cache invalidation when needed
  • Separate nvme prefetch from all-gather prefetch
  • Handle back-to-back execution of a submodule
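As a rough illustration of the first and third items, here is a minimal sketch of a trace cache that records the order in which submodules run, validates later passes against that recording, and invalidates itself when the order diverges. The `TraceCache` class and its method names are hypothetical and are not taken from the DeepSpeed code in this PR; they only show the general idea. A submodule that runs twice in a row simply appears twice in the recorded trace.

```python
from typing import List


class TraceCache:
    """Hypothetical sketch, not DeepSpeed's implementation."""

    def __init__(self) -> None:
        self.trace: List[int] = []  # submodule ids in recorded execution order
        self.complete = False       # True once a full pass has been recorded
        self.step = 0               # position within the current replayed pass

    def record_or_check(self, module_id: int) -> bool:
        """Return True if the cached trace is still valid after this call."""
        if not self.complete:
            # Still recording: append, which also covers back-to-back repeats.
            self.trace.append(module_id)
            return True
        if self.step < len(self.trace) and self.trace[self.step] == module_id:
            self.step += 1
            return True
        # Order diverged: keep the prefix that matched in this pass,
        # append the new module, and go back to recording mode.
        self.trace = self.trace[:self.step] + [module_id]
        self.complete = False
        self.step = 0
        return False

    def end_of_pass(self) -> None:
        """Mark the end of a forward/backward pass."""
        if self.complete and self.step != len(self.trace):
            # The pass ended early: keep only what actually ran.
            self.trace = self.trace[:self.step]
        self.complete = True
        self.step = 0
```

A prefetcher built on such a cache would only look ahead in `trace` while `complete` is true, and would fall back to fetching on demand after an invalidation.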
tjruwase and others added 23 commits April 7, 2022 04:50
…eepSpeed into olruwase/zero_inference_type_mismatch
…crosoft/DeepSpeed into olruwase/zero_inference_type_mismatch
Separate nvme prefetch from all-gather prefetch

@jeffra jeffra left a comment

Excellent, thanks @tjruwase this is really great.

stas00 commented May 5, 2022

Please don't merge this yet. The functionality is correct, but we haven't yet validated that there is no performance regression.

tjruwase commented May 6, 2022

Compared to v0.6.0, I don't see any performance degradation on 2 nodes of A100-40GB (16 GPUs). I think this is fine to merge.
image

@stas00, @jeffra, @SeanNaren
