libai/projects/Qwen at main · dumpmemory/libai

Inference

cuda PASS

python projects/Qwen/pipeline.py --model_path=/root/models/Qwen1.5-7B-Chat --mode=huggingface

npu PASS

python projects/Qwen/pipeline.py --model_path=/data0/hf_models/qwen2/Qwen1.5-7B-Chat --mode=huggingface --device=npu

xpu PASS

python projects/Qwen/pipeline.py --model_path=/root/models/Qwen1.5-7B-Chat --mode=huggingface --device=xpu
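
The three inference commands above differ only in the model path and the `--device` flag. A minimal sketch of a helper that assembles them, assuming nothing beyond the flags shown above; the function name `build_inference_cmd` is hypothetical and not part of this project, and `--device` is left out for cuda only because the cuda command above does not pass it:

```python
import shlex


def build_inference_cmd(model_path, device="cuda", mode="huggingface"):
    """Return the pipeline.py invocation as an argument list (hypothetical helper)."""
    cmd = [
        "python",
        "projects/Qwen/pipeline.py",
        f"--model_path={model_path}",
        f"--mode={mode}",
    ]
    # The cuda command in this README passes no --device flag,
    # so the flag is only appended for the other backends.
    if device != "cuda":
        cmd.append(f"--device={device}")
    return cmd


if __name__ == "__main__":
    # Print one shell-ready command per backend listed above.
    for device in ("cuda", "npu", "xpu"):
        print(shlex.join(build_inference_cmd("/root/models/Qwen1.5-7B-Chat", device)))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` from the repository root.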

Training

data preparation

python projects/Qwen/utils/data_prepare.py

cuda PASS

export NUM_GPUS=8
python3 -m oneflow.distributed.launch \
    --nproc_per_node ${NUM_GPUS} \
    --nnodes 1 \
    --node_rank 0 \
    --master_addr 127.0.0.1 \
    --master_port 12345 \
    tools/train_net.py --config-file=projects/Qwen/configs/qwen_sft.py \
    graph.enabled=True \
    train.input_placement_device="cuda" \
    train.dist.device_type="cuda" \
    train.dist.pipeline_parallel_size=${NUM_GPUS}

A100-PCIE-40GB x 4: OOM (out of memory)

xpu OOM

export NUM_GPUS=1
python3 -m oneflow.distributed.launch \
    --nproc_per_node ${NUM_GPUS} \
    --nnodes 1 \
    --node_rank 0 \
    --master_addr 127.0.0.1 \
    --master_port 12345 \
    tools/train_net.py --config-file=projects/Qwen/configs/qwen_sft.py \
    graph.enabled=False \
    train.input_placement_device="xpu" \
    train.dist.device_type="xpu" \
    train.dist.pipeline_parallel_size=${NUM_GPUS}

npu: not tested; probably does not work
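
The cuda and xpu training commands above share the same launcher arguments and differ only in the GPU count, the `graph.enabled` value, and the device name. A minimal sketch that builds either command from those three values, using only the flags and overrides that appear above; the function name `build_train_cmd` is hypothetical and not part of this project:

```python
import shlex


def build_train_cmd(device, num_gpus, graph_enabled):
    """Return the training launch command as an argument list (hypothetical helper)."""
    return [
        "python3", "-m", "oneflow.distributed.launch",
        "--nproc_per_node", str(num_gpus),
        "--nnodes", "1",
        "--node_rank", "0",
        "--master_addr", "127.0.0.1",
        "--master_port", "12345",
        "tools/train_net.py",
        "--config-file=projects/Qwen/configs/qwen_sft.py",
        # str(True) / str(False) gives the "True" / "False" spelling used above.
        f"graph.enabled={bool(graph_enabled)}",
        # The shell strips the double quotes around the device name in the
        # commands above, so the bare value is passed in the argument list.
        f"train.input_placement_device={device}",
        f"train.dist.device_type={device}",
        f"train.dist.pipeline_parallel_size={num_gpus}",
    ]


if __name__ == "__main__":
    # The two configurations reported in this README.
    print(shlex.join(build_train_cmd("cuda", 8, True)))
    print(shlex.join(build_train_cmd("xpu", 1, False)))
```

As in the commands above, the pipeline parallel size is set equal to the number of processes per node; whether other combinations are valid depends on the config file and is not covered here.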
