multi client multi machine test (#5685)
* add multi client multi machine test

Signed-off-by: daquexian <daquexian566@gmail.com>

* remove copy cores

Signed-off-by: daquexian <daquexian566@gmail.com>

* use discover in bash

Signed-off-by: daquexian <daquexian566@gmail.com>

* add tests in test.yml and refine

Signed-off-by: daquexian <daquexian566@gmail.com>

* remove multi_client test files into test dir to reuse code

Signed-off-by: daquexian <daquexian566@gmail.com>

* delete distributed_run_multi_client.py and move impl in distributed_run.py

Signed-off-by: daquexian <daquexian566@gmail.com>

* if -> elif

Signed-off-by: daquexian <daquexian566@gmail.com>

* try three times and upload log

Signed-off-by: daquexian <daquexian566@gmail.com>

* add 'mode' arg in py

Signed-off-by: daquexian <daquexian566@gmail.com>

* auto format by CI

* remove --multi_client in yml

Signed-off-by: daquexian <daquexian566@gmail.com>

* skip distributed test in cpu

Signed-off-by: daquexian <daquexian566@gmail.com>

* use new test container

Signed-off-by: daquexian <daquexian566@gmail.com>

* add host key to all machines

Signed-off-by: daquexian <daquexian566@gmail.com>

* auto format by CI

* fix python version

Signed-off-by: daquexian <daquexian566@gmail.com>

* fix python version

Signed-off-by: daquexian <daquexian566@gmail.com>

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Changed files:
- .github/workflows/test.yml: 42 additions, 24 deletions
- ci/test/2node_op_test_multi_client.sh: 23 additions, 0 deletions
- ci/test/distributed_run.py: 103 additions, 35 deletions
- ci/test/generic_test_multi_client.sh: 0 additions, 0 deletions
- ci/test/test_speed_multi_client.sh: 0 additions, 0 deletions
- python/oneflow/framework/unittest.py: 4 additions, 1 deletion
- python/oneflow/test/modules/test_allreduce.py: 18 additions, 3 deletions