jfc4050/dlcalc


🚀 dlcalc


command-line tools for deep learning optimization

Installation • Tools • Contributing


📋 Overview

dlcalc is a collection of command-line calculators and utilities for deep learning practitioners, covering:

  • 🧮 Performance Modeling - Estimate training throughput, memory usage, and MFU
  • Topology Analysis - Analyze and optimize network topology for distributed training
  • 📊 Metrics Conversion - Convert between different performance metrics
  • Checkpoint Analysis - Inspect and summarize model checkpoints

🔧 Installation

Via pip (recommended)

pip install dlcalc

or

From source

git clone https://github.com/jfc4050/dlcalc
cd dlcalc
pip install -e .

After either method, the command-line tools described below should be available. On some systems you may need to add --user to the pip install command so that the tools are installed to a location on your $PATH.

🛠 Tools

Performance Modeling

3D Training Calculator (3dtrn)

Calculator for estimating performance characteristics of ND parallel transformer training:

3dtrn examples/llama3_70b.yaml

We recommend using this alongside profilers such as NVIDIA Nsight Systems or PyTorch Profiler, to give your performance profiling a theoretical grounding.
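As a rough illustration of the kind of estimate such a calculator produces, the sketch below computes model-state memory per accelerator. It is not dlcalc's implementation. It assumes mixed-precision Adam at 16 bytes per parameter (2 for half-precision weights, 2 for half-precision gradients, 12 for fp32 master weights, momentum, and variance), shards only across tensor and pipeline parallelism, and ignores activations. The function name and the tp=8, pp=4 values are hypothetical and are not taken from examples/llama3_70b.yaml.

```python
def model_state_gib_per_accelerator(n_params, tp, pp, bytes_per_param=16):
    """Weights + gradients + Adam states per accelerator, in GiB.

    Assumption: parameters are split evenly across tp * pp accelerators and
    optimizer states are not further sharded across data-parallel ranks.
    """
    params_per_accelerator = n_params / (tp * pp)
    return params_per_accelerator * bytes_per_param / 2**30


# Hypothetical 70B-parameter model with tp=8, pp=4.
estimate = model_state_gib_per_accelerator(70e9, tp=8, pp=4)
print(f"{estimate:.1f} GiB")  # 32.6 GiB
```

Comparing a number like this with the memory a profiler actually reports is one way to spot unexpected overheads.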

๐ŸŒ Topology Optimization

Tool Command Purpose
Visualizer topoviz Generate network topology graphs from Kubernetes clusters
Evaluator topoeval Analyze topology optimality for DP rings
Scheduler topoassign Compute topology-aware rank assignments
# Visualize cluster topology topoviz -h # Evaluate training job topology topoeval -h # Generate optimal rank assignments topoassign -h
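To make the idea of "topology optimality for DP rings" concrete, here is a minimal sketch of one possible metric: the number of neighboring pairs in a data-parallel ring whose nodes sit under different switches. This is an assumed, simplified metric and not necessarily what topoeval computes. The function name, node names, and switch names are all hypothetical.

```python
def cross_switch_hops(ring, switch_of):
    """Count ring neighbor pairs whose nodes are attached to different switches.

    ring:      list of node names in ring order (the last wraps to the first)
    switch_of: dict mapping node name -> switch name
    """
    n = len(ring)
    return sum(
        switch_of[ring[i]] != switch_of[ring[(i + 1) % n]]
        for i in range(n)
    )


# Hypothetical 4-node cluster with two switches.
switch_of = {"node-a": "sw0", "node-b": "sw0", "node-c": "sw1", "node-d": "sw1"}
interleaved = ["node-a", "node-c", "node-b", "node-d"]
grouped = ["node-a", "node-b", "node-c", "node-d"]

print(cross_switch_hops(interleaved, switch_of))  # 4
print(cross_switch_hops(grouped, switch_of))      # 2
```

Under this metric, a topology-aware rank assignment would prefer the grouped ordering, since fewer ring hops leave their local switch.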

📊 Metrics & KPIs

Samples/Sec → MFU Converter (sps2mfu)

Convert training throughput to Model FLOPs Utilization (MFU):

sps2mfu --samples-per-sec 100 --seqlen 2048 --model-size 70b \
    --n-accelerators 512 --tflops-per-accelerator 312
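The arithmetic behind this kind of conversion can be sketched as follows. The sketch uses the common approximation of 6 FLOPs per parameter per token for a combined forward and backward pass, and ignores attention FLOPs. That approximation is an assumption here: sps2mfu may count FLOPs in more detail, so its output can differ. The function name approx_mfu is hypothetical and not part of dlcalc.

```python
def approx_mfu(samples_per_sec, seqlen, n_params, n_accelerators, tflops_per_accelerator):
    """Approximate MFU using the 6 * N FLOPs-per-token rule (an assumption)."""
    tokens_per_sec = samples_per_sec * seqlen
    achieved_flops_per_sec = 6 * n_params * tokens_per_sec
    peak_flops_per_sec = n_accelerators * tflops_per_accelerator * 1e12
    return achieved_flops_per_sec / peak_flops_per_sec


# Same inputs as the command above, with "70b" read as 70e9 parameters.
mfu = approx_mfu(100, 2048, 70e9, 512, 312)
print(f"approximate MFU: {mfu:.1%}")  # approximate MFU: 53.8%
```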

Samples/Sec → Tokens/Day Converter (sps2tpd)

Calculate daily token throughput:

sps2tpd --samples-per-sec 100 --seqlen 2048
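For reference, the conversion is presumably just a multiplication by sequence length and seconds per day. The sketch below shows it under that assumption; the function name tokens_per_day is hypothetical and not part of dlcalc.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tokens_per_day(samples_per_sec, seqlen):
    """Tokens processed per day, assuming every sample has exactly seqlen tokens."""
    return samples_per_sec * seqlen * SECONDS_PER_DAY


# Same inputs as the command above.
print(f"{tokens_per_day(100, 2048):,}")  # 17,694,720,000
```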

๐Ÿ” Utilities

Checkpoint Summarizer (ckpt-summarize)

Analyze PyTorch checkpoint contents:

ckpt-summarize model.pt
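A minimal sketch of what a checkpoint summary can look like in plain PyTorch is shown below. It is not ckpt-summarize's implementation. It assumes the checkpoint is a possibly nested dict whose leaves are tensors, and the names summarize_state and summarize_checkpoint are hypothetical.

```python
def summarize_state(state, prefix=""):
    """Yield (name, shape, dtype) for every tensor-like leaf in a nested dict.

    Non-tensor leaves (step counters, config values, ...) are skipped.
    """
    for key, value in state.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            yield from summarize_state(value, prefix=f"{name}.")
        elif hasattr(value, "shape") and hasattr(value, "dtype"):
            yield name, tuple(value.shape), str(value.dtype)


def summarize_checkpoint(path):
    """Load a checkpoint onto the CPU and print one line per tensor."""
    import torch  # imported lazily so summarize_state works without PyTorch

    state = torch.load(path, map_location="cpu")
    for name, shape, dtype in summarize_state(state):
        print(f"{name:60s} {str(shape):24s} {dtype}")
```

Calling summarize_checkpoint("model.pt") would print one row per tensor. Note that torch.load unpickles the file, so only load checkpoints from sources you trust.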

๐Ÿง‘โ€๐Ÿ’ป Development

Setup Development Environment

# Install with development dependencies
pip install -e .[dev]
# Install pre-commit hooks
pre-commit install

Run Quality Checks

# Run all checks (formatting, linting, type checking, tests)
bash checks

Testing

# Run full test suite
pytest tests/ -v
# Run with coverage
pytest tests/ --cov=dlcalc --cov-report=term-missing

๐Ÿค Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

📮 Support


Made with ❤️ for the deep learning community
