lugimzzz/PaddleFormers

PaddleFormers is a Transformer model library built on the PaddlePaddle deep learning framework, delivering both ease of use and high-performance capabilities. It provides a unified model definition interface, modular training components, and comprehensive distributed training strategies specifically designed for large language model development pipelines. This enables developers to train large models efficiently with minimal complexity, making it suitable for diverse scenarios ranging from academic research to industrial applications.

News

[2025/06/28] 🎉 PaddleFormers 0.1 is officially released! This initial version supports the SFT and DPO training paradigms, offers configurable distributed training via a unified Trainer API, and integrates PEFT, MergeKit, and Quantization APIs for diverse LLM applications.

Highlights

โš™๏ธ Simplified Distributed Training

Implements 4D parallel strategies through unified Trainer API, lowering the barrier to distributed LLM training.
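To make the idea of a 4D layout concrete, here is a minimal, framework-independent sketch of how a flat worker rank could map to a coordinate in a four-axis device mesh. The axis names (`data`, `sharding`, `pipeline`, `tensor`), their ordering, and the helper `rank_to_coords` are assumptions made for illustration; they are not taken from the PaddleFormers API, which configures parallelism through the Trainer.

```python
from typing import Dict


def rank_to_coords(rank: int, degrees: Dict[str, int]) -> Dict[str, int]:
    """Map a flat rank to per-axis coordinates; the last axis varies fastest."""
    coords: Dict[str, int] = {}
    for axis in reversed(list(degrees)):
        # Peel off the fastest-varying axis, then continue with the quotient.
        rank, coords[axis] = divmod(rank, degrees[axis])
    # Return the coordinates in the same axis order as `degrees`.
    return {axis: coords[axis] for axis in degrees}


# Hypothetical degrees for a 32-device job (2 * 2 * 2 * 4 = 32).
degrees = {"data": 2, "sharding": 2, "pipeline": 2, "tensor": 4}

world_size = 1
for degree in degrees.values():
    world_size *= degree

coords = rank_to_coords(13, degrees)
print(world_size, coords)
```

The product of the four degrees must equal the number of devices, which is why changing one degree usually forces a change in another.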

🛠 Efficient Post-Training

Integrates Packing dataflow and FlashMask operators for SFT/DPO training, eliminating padding waste and boosting throughput.
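The padding saving from packing can be illustrated with a short, library-independent sketch. It assumes a greedy first-fit-decreasing strategy and hypothetical helper names (`pack_sequences`, `padding_fraction`); the actual packing algorithm used by PaddleFormers may differ. In a real pipeline, packed samples also need an attention mask that keeps them from attending to each other.

```python
from typing import List


def pack_sequences(lengths: List[int], max_len: int) -> List[List[int]]:
    """Greedy first-fit-decreasing packing; returns rows of sequence indices."""
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    rows: List[List[int]] = []   # sequence indices placed in each packed row
    free: List[int] = []         # remaining token capacity of each row
    for i in order:
        n = lengths[i]
        if n > max_len:
            raise ValueError(f"sequence {i} is longer than max_len")
        for r, room in enumerate(free):
            if n <= room:
                rows[r].append(i)
                free[r] -= n
                break
        else:
            # No existing row has room, so open a new one.
            rows.append([i])
            free.append(max_len - n)
    return rows


def padding_fraction(lengths: List[int], max_len: int, num_rows: int) -> float:
    """Fraction of token slots that hold padding rather than real tokens."""
    return 1 - sum(lengths) / (num_rows * max_len)


lengths = [120, 500, 30, 260, 90, 400]   # made-up token counts
max_len = 512

packed = pack_sequences(lengths, max_len)
unpacked_waste = padding_fraction(lengths, max_len, len(lengths))
packed_waste = padding_fraction(lengths, max_len, len(packed))
print(len(packed), round(unpacked_waste, 3), round(packed_waste, 3))
```

With these made-up lengths, six padded rows shrink to three packed rows, and most of the padding disappears.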

💾 Industrial Storage Solution

Features Unified Checkpoint storage tools for LLMs, enabling training resumption and dynamic resource scaling. It also implements asynchronous storage (up to 95% faster) and Optimizer State Quantization (78% storage reduction), so that industrial training meets both efficiency and stability requirements.
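The general idea behind asynchronous storage is that training continues while a checkpoint is written in the background. The following is a minimal sketch of that pattern using only the Python standard library; the helper `save_async`, the pickle format, and the file name are assumptions for illustration and do not reflect how Unified Checkpoint is implemented.

```python
import copy
import os
import pickle
import tempfile
import threading


def save_async(state: dict, path: str) -> threading.Thread:
    """Snapshot `state` now, then write it to `path` on a background thread."""
    # Copy on the caller's thread so later training updates do not leak in.
    snapshot = copy.deepcopy(state)

    def _write() -> None:
        tmp_path = path + ".tmp"
        with open(tmp_path, "wb") as f:
            pickle.dump(snapshot, f)
        # Atomic rename: readers never observe a half-written checkpoint.
        os.replace(tmp_path, path)

    writer = threading.Thread(target=_write, daemon=True)
    writer.start()
    return writer


state = {"step": 100, "weights": [0.1, 0.2, 0.3]}   # toy training state
ckpt_path = os.path.join(tempfile.mkdtemp(), "step_100.ckpt")

writer = save_async(state, ckpt_path)
state["step"] = 101   # training keeps going while the write is in flight
writer.join()         # wait only when the file is actually needed

with open(ckpt_path, "rb") as f:
    restored = pickle.load(f)
print(restored["step"])
```

The training loop only pays for the in-memory snapshot; the slower disk write overlaps with subsequent steps.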

Installation

Requires Python 3.8+ and PaddlePaddle 3.1+.

    # Install via pip
    pip install paddleformers

    # Install development version
    git clone https://github.com/PaddlePaddle/PaddleFormers.git
    cd PaddleFormers
    pip install -e .

Quickstart

Text Generation

This example shows how to load a Qwen model for text generation with the PaddleFormers Auto API:

    from paddleformers.transformers import AutoTokenizer, AutoModelForCausalLM

    # Load the tokenizer and the model weights in bfloat16
    tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B")
    model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B", dtype="bfloat16")

    # Tokenize the prompt into Paddle tensors
    input_features = tokenizer("Give me a short introduction to large language model.", return_tensors="pd")

    # Generate up to 128 new tokens and decode them back to text
    outputs = model.generate(**input_features, max_new_tokens=128)
    print(tokenizer.batch_decode(outputs[0], skip_special_tokens=True))

SFT Training

To get started with supervised fine-tuning (SFT) using PaddleFormers:

    from paddleformers.trl import SFTConfig, SFTTrainer
    from datasets import load_dataset

    # Load a small demo instruction dataset
    dataset = load_dataset("ZHUI/alpaca_demo", split="train")

    # Configure the output directory and the training device
    training_args = SFTConfig(output_dir="Qwen/Qwen2.5-0.5B-SFT", device="gpu")

    trainer = SFTTrainer(
        args=training_args,
        model="Qwen/Qwen2.5-0.5B-Instruct",
        train_dataset=dataset,
    )
    trainer.train()

Community

We welcome all contributions! See CONTRIBUTING.md for guidelines.

License

This repository's source code is available under the Apache 2.0 License.
