multi client launch (#5372)

* add changes for multi dev demo
* add part of backward hook
* update
* add naive init_with_env
* update
* update
* support_multi_client
* update
* Remove unused code
* Fix multi client launch
* fix __main__ bug
* update abcd op
* fix multi client sync, make nccl instr ordered
* temp changes
* Use functional api instead of op_expr_helper::XXXOp.
* align with latest master, remove unused code
* local rank returns 0 when no env var, save is_multi_client in EnvDesc
* move is_multi_client to ProcessCtx, rename cuda_d2d device to nccl, remove unused code
* abcd -> return_first_input op
* remove launch.py for now
* refine
* update IsMultiClient in env_util.py
* rm multi_dev_demo.py
* remove exported functions in env_util.py
* remove unused op expr helper func
* fix bug
* remove ddp code
* refine env.init. only call env.init in init.py when multi client
* revert unrelated changes
* add missing parameter
* fix python api bug
* address comments

Signed-off-by: daquexian <daquexian566@gmail.com>
Co-authored-by: clackhan <han_binbin@163.com>
Co-authored-by: hjchen2 <chenhoujiangcug@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
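
Note on "local rank returns 0 when no env var": the commit message does not show the implementation, so the following is only a minimal sketch of that fallback behavior. The function name `get_local_rank` and the variable name `LOCAL_RANK` are assumptions made for illustration, not names taken from this change.

```python
import os


def get_local_rank(env=None, var_name="LOCAL_RANK"):
    """Return the local rank read from the environment, or 0 if it is unset.

    `var_name` is a hypothetical variable name; the actual name used by
    the commit is not visible in the commit message.
    """
    # Allow a plain dict to be passed in so the function is easy to test.
    env = os.environ if env is None else env
    value = env.get(var_name)
    if value is None:
        # No launcher set the variable: treat the process as local rank 0.
        return 0
    return int(value)
```

With a fallback like this, a process started without a launcher behaves as local rank 0 instead of failing on the missing variable.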
Showing 19 changed files:
- oneflow/__main__.py (1 addition, 1 deletion)
- oneflow/api/python/eager/multi_client.cpp (26 additions, 0 deletions)
- oneflow/api/python/eager/single_client.cpp (0 additions, 0 deletions)
- oneflow/api/python/env/env.cpp (3 additions, 0 deletions)
- oneflow/api/python/env/env.h (12 additions, 5 deletions)
- oneflow/api/python/env/env_api.h (8 additions, 2 deletions)
- oneflow/core/control/ctrl_bootstrap.proto (1 addition, 1 deletion)
- oneflow/core/job/env_global_objects_scope.cpp (2 additions, 1 deletion)
- oneflow/core/job/env_global_objects_scope.h (1 addition, 1 deletion)
- oneflow/core/rpc/include/global_process_ctx.h (2 additions, 0 deletions)
- oneflow/core/rpc/lib/global_process_ctx.cpp (14 additions, 0 deletions)
- oneflow/core/vm/vm_util.cpp (13 additions, 0 deletions)
- oneflow/core/vm/vm_util.h (1 addition, 0 deletions)
- oneflow/init.py (12 additions, 4 deletions)
- oneflow/python/framework/c_api_util.py (2 additions, 2 deletions)
- oneflow/python/framework/distribute.py (10 additions, 0 deletions)
- oneflow/python/framework/env_util.py (45 additions, 25 deletions)
- oneflow/python/framework/session_util.py (1 addition, 27 deletions)
- oneflow/python/framework/unittest.py (1 addition, 4 deletions)