Unverified commit 8c7be1f3, authored by i-robot, committed by Gitee

!2941 Fix GPT Stand alone Running Commands

Merge pull request !2941 from huangxinjing/fix_script_error
parents cfb335bb ece8f5e9
@@ -30,4 +30,5 @@ python train.py \
   --epoch_size=$EPOCH_SIZE \
   --device_id=$DEVICE_ID \
   --data_path=$DATA_DIR \
+  --model_parallel=1 \
   --optimizer="adam" > training_log.txt 2>&1 &
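For context, the corrected standalone launch with the added `--model_parallel=1` flag could be driven as sketched below. The variable values are placeholders for illustration only; the real script derives `$EPOCH_SIZE`, `$DEVICE_ID`, and `$DATA_DIR` from its own arguments, which are elided from this hunk.

```bash
#!/bin/bash
# Hypothetical values; the actual script sets these from its command-line arguments.
EPOCH_SIZE=1
DEVICE_ID=0
DATA_DIR=/path/dataset

# Launch standalone training in the background, capturing stdout and stderr.
python train.py \
  --epoch_size=$EPOCH_SIZE \
  --device_id=$DEVICE_ID \
  --data_path=$DATA_DIR \
  --model_parallel=1 \
  --optimizer="adam" > training_log.txt 2>&1 &
```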
@@ -241,7 +241,7 @@ Training 60B model using 8 NPU in one server requires that the server has at lea…
```diff
 # run distributed training example in one ascend machine
-bash run_distributed_train_moe_host_device.sh /path/dataset /path/hccl.json 8 fp32 2.6B 1 1 1 0 8 36 0
+bash run_distributed_train_moe_host_device.sh /path/dataset /path/hccl.json 8 fp32 2.6B 1 1 2 0 8 36 0
```
#### Training on homogeneous
......