Commit 5806e2ba authored by daquexian, committed by GitHub

multi client launch (#5372)


* add changes for multi dev demo

Signed-off-by: daquexian <daquexian566@gmail.com>

* add part of backward hook

Signed-off-by: daquexian <daquexian566@gmail.com>

* update

Signed-off-by: daquexian <daquexian566@gmail.com>

* add naive init_with_env

Signed-off-by: daquexian <daquexian566@gmail.com>

* update

Signed-off-by: daquexian <daquexian566@gmail.com>

* update

Signed-off-by: daquexian <daquexian566@gmail.com>

* support_multi_client

* update

Signed-off-by: daquexian <daquexian566@gmail.com>

* Remove unused code

* Fix multi client launch

* fix __main__ bug

* update abcd op

Signed-off-by: daquexian <daquexian566@gmail.com>

* fix multi client sync, make nccl instr ordered

Signed-off-by: daquexian <daquexian566@gmail.com>

* temp changes

Signed-off-by: daquexian <daquexian566@gmail.com>

* Use functional api instead of op_expr_helper::XXXOp.

* align with latest master, remove unused code

Signed-off-by: daquexian <daquexian566@gmail.com>

* local rank returns 0 when no env var, save is_multi_client in EnvDesc

Signed-off-by: daquexian <daquexian566@gmail.com>

* move is_multi_client to ProcessCtx, rename cuda_d2d device to nccl, remove unused code

Signed-off-by: daquexian <daquexian566@gmail.com>

* abcd -> return_first_input op

Signed-off-by: daquexian <daquexian566@gmail.com>

* remove launch.py for now

Signed-off-by: daquexian <daquexian566@gmail.com>

* refine

Signed-off-by: daquexian <daquexian566@gmail.com>

* update IsMultiClient in env_util.py

Signed-off-by: daquexian <daquexian566@gmail.com>

* rm multi_dev_demo.py

Signed-off-by: daquexian <daquexian566@gmail.com>

* remove exported functions in env_util.py

Signed-off-by: daquexian <daquexian566@gmail.com>

* remove unused op expr helper func

Signed-off-by: daquexian <daquexian566@gmail.com>

* fix bug

Signed-off-by: daquexian <daquexian566@gmail.com>

* remove ddp code

Signed-off-by: daquexian <daquexian566@gmail.com>

* refine env.init. only call env.init in init.py when multi client

Signed-off-by: daquexian <daquexian566@gmail.com>

* revert unrelated changes

Signed-off-by: daquexian <daquexian566@gmail.com>

* add missing parameter

Signed-off-by: daquexian <daquexian566@gmail.com>

* fix python api bug

Signed-off-by: daquexian <daquexian566@gmail.com>

* address comments

Signed-off-by: daquexian <daquexian566@gmail.com>

Co-authored-by: clackhan <han_binbin@163.com>
Co-authored-by: hjchen2 <chenhoujiangcug@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
parent 49b95c0d
155 additions and 73 deletions