Final project for the course "Large Scale AI Engineering" at ETH Zürich, FS2025. Implements and benchmarks pretokenization and Distributed Data Parallel (DDP) for efficient LLM training on the CSCS Alps supercomputer.
transformers tokenization hpc-cluster distributed-data-parallel cscs llm pretokenization padding-free
Updated May 19, 2025 - Python