  1. Jul 19, 2021
  2. May 18, 2021
  3. May 14, 2021
  4. May 09, 2021
  5. Apr 27, 2021
  6. Apr 26, 2021
  7. Apr 24, 2021
  8. Apr 19, 2021
  9. Apr 16, 2021
  10. Apr 01, 2021
  11. Mar 31, 2021
  12. Sep 10, 2020
    • split BlobObject and EagerBlobObject (#3485) · 4131ea97
      daquexian authored
      * split BaseObject and EagerBaseObject
      
      * Use BlobObject in FeedOrFetch, rename functions
      
      * ForEachIbnAndEagerBlobObject->ForEachIbnAndBlobObject
      
      * address reviews
      
      * give lazy ref blob a full blob desc
      
      * rename ForEachObnAndEagerBlobObject->ForEachMutBnAndBlobObject, ForEachIbnAndBlobObject->ForEachConstBnAndBlobObject
      
      * fix bug
  13. Aug 13, 2020
    • manipulate lazy interface blobs in eager (#3226) · b5b48550
      daquexian authored
      
      * wip
      
      * wip
      
      * remove instruction2, make Run() method empty
      
      * use HostStreamType instead of CpuStreamType
      
      * get blob by lbn and parallel_id, separate cpu and gpu instructions
      
      * fix wrong initialization order
      
      * set parallel_ctx field of regst_desc in task::NewProducedRegst
      
      * refine code
      
      * rename MaybeRun -> Run
      
      * reformat
      
      * keep track of regst_desc_id and parallel_ctx in register manager
      
      * add license
      
      * use lbi instead of lbn
      
      * variable op -> interface op
      
      * set is_mutable to true in other interface ops
      
      * commit debug code
      
      * add api in flow.experimental, record job_name for interface ops, pass job_name to BlobDef and EagerConsistentBlob
      
      * add lock on map concurrent write
      
      * clean
      
      * keep track of lbi & parallel_id and blob* in register manager
      
      * remove hardcoded 'out'
      
      * update comments
      
      * use right dtype, add test
      
      * add missing sync
      
      * add type annotations in test job
      
      * support mirrored and tensor list blob
      
      * wrap cuda_blob_instruction_type.cpp with #ifdef WITH_CUDA, skip the test in cpu only build
      
      Co-authored-by: Li Xinqi <lixinqi2010@gmail.com>
      Co-authored-by: oneflow-bot <69100618+oneflow-bot@users.noreply.github.com>
  14. Jul 25, 2020
  15. Jul 24, 2020
  16. Jul 23, 2020
    • Dev apache2 license (#3266) · d0bdbd5d
      Shenghang Tsai authored
      
      * add license at root dir
      
      * check in empty files
      
      * rm space
      
      * check in script
      
      * update script
      
      * fix bug
      
      * add print
      
      * fix
      
      * add exit
      
      * add to of_format
      
      * add CI task
      
      * fix license
      
      * Revert "fix license"
      
      This reverts commit 818b6d7691d3a8b4a25dd41a47ff2c5922b8ec57.
      
      * only add once
      
      * quick fix
      
      * fix script
      
      * dont fmt empty file
      
      * fix
      
      * quick fix
      
      * fix py
      
      * add license
      
      * fix exit
      
      * add license for hpp
      
      * add license
      
      * license new vm files
      
      Co-authored-by: tsai <caishenghang@oneflow.org>
  17. Jul 15, 2020
    • Dev eager (#2966) · 996ebd45
      Li Xinqi authored
      
      * rename Async2sync to Await
      
      * oneflow.eager_fixed_placement
      
      * fix a bug in InstructionBuilder._StatelessCall
      
      * do not panic in VirtualMachine::ForEachMutMirroredObject
      
      * GetMirroredObject
      
      * add instruction ReplaceMirrored
      
      * hob.is_current_machine_master
      
      * rename ForeignWorkerWatcher to ForeignWorkerCallback (#2935)
      
      * EagerJobBuildAndInferCtx
      
      * refactor api_oneflow_function
      
      * eager_oneflow_global_function
      
      * fix a bug in Session.Init()
      
      * interpreter.EagerRun
      
      * JobBuildAndInferCtx::GetMirroredOpParallelConf
      
      * JobBuildAndInferCtx::GetMirroredOpName
      
      * change the JobBuildAndInferCtx::Complete function to pure virtual
      
      * refactor EagerJobBuildAndInferCtx::Complete()
      
      * LazyConsistentBlob/LazyMirroredBlob
      
      * EagerRemoteBlob
      
      * eager logical blob
      
      * rename symbol_dict to symbol_cache; rename object_dict to object_cache
      
      * refactor EagerLogicalBlob with instruction ReplaceMirror
      
      * test_eager_logical_blob
      
      * get_eager_variable
      
      * free stashed variable blob before close session
      
      * Quick fix eager (#2964)
      
      * fix EagerPhysicalBlob
      
      * fix lazy
      
      Co-authored-by: tsai <caishenghang@oneflow.org>
      
      * DeprecatedStatelessCallOpKernel
      
      * python wrapper for DeprecatedStatelessCall
      
      * Fix compile error (#2967)
      
      * fix compile error
      
      * modify by comment
      
      * rename unused cuda copy instruction
      
      * remove OneflowVM<kMaster>
      
      * stream_tag for StatelessCallOpKernel/DeprecatedStatelessCallOpKernel
      
      * quick fix is train in hob (#2971)
      
      Co-authored-by: tsai <caishenghang@oneflow.org>
      
      * CudaHostRegisterBlob (#2972)
      
      * CudaHostRegisterBlob
      
      * CudaHostUnRegisterBlob
      
      * 1) gpu.copy_h2d.*; 2) gpu.copy_d2h.*
      
      * amend CudaHostUnregisterBlob
      
      Co-authored-by: ouyangyu <xuanjiuye@gmail.com>
      
      * CudaHostAllocator for CudaCopyD2HStreamType
      
      * eager_copy
      
      * fix bugs in eager oneflow.copy
      
      * oneflow.copy
      
      * autograd for CopyOp
      
      * move MakeCopyInstructionBuilderFunction from copy.py to eager/vm_util.py
      
      * auto copy_hd
      
      * oneflow.system.assign
      
      * single device version model_init
      
      * more assert in CurJobAddMirroredOp
      
      * BroadcastReference
      
      * rename BroadcastReference to BroadcastObjectReference
      
      * InstructionBuilder.BroadcastBlobReference
      
      * remove RAIIBlobObject; add class BlobObject
      
      * oneflow.python.eager.boxing_util
      
      * rename DeprecatedXXX to SystemXXX
      
      * vm_util.InstructionBuilder cares nothing about logical_blob_name
      
      * refactor eager oneflow.get_variable and eager oneflow.system.assign
      
      * OneToManyBroadcastBlobReference
      
      * no panic in GenerateBackwardOpConfIf
      
      * ConsumedByGradientOp
      
      * fix Global<ResourceDesc> bug and gradient_function_not_found panic
      
      * implement gradient_util.*
      
      * fix misuse bug of CHECK_NOTNULL
      
      * add pass AutoTrainStep and AutoLearningRate for eager execution
      
      * replace ibn with ibn_prefix in StatelessCallOpKernelInstruction
      
      * delete unused bn_in_op index
      
      * object_cache.BnInOp2BlobObjectScope
      
      * refactor ModelInitOpConf
      
      * refactor InstructionBuilder.StatelessCall
      
      * refactor InstructionBuilder._SystemStatelessCall
      
      * fuse UserStatelessCall and SystemStatelessCall
      
      * Operator::GetOpAttributeWithoutOpNameAndLbn
      
      * always pass ParallelConf to InstructionBuilder.StatelessCall
      
      * refactor eager_get_variable
      
      * refactor python functions about OpAttribute
      
      * EagerCastToMirrored
      
      * OpArgAttribute
      
      * refactor interpreter_callback.Interpret
      
      * refactor FetchDelegateBlob
      
      * eager backward interpreter
      
      * BlobRegister
      
      * refactor interpret_callback
      
      * refactor gradient_util
      
      * put eager variable blob object into backward blob register
      
      * refactor interpreter_callback.Interpret
      
      * gradient_util.ReleaseUnusedBlobObject
      
      * Fix opkernel_instruction_type_test and remove machine_id2dev_phy_ids_ (#3047)
      
      * EagerRunBackwardOps
      
      * eager train demo
      
      * refactor interpreter_callback
      
      * remove unused foreign callback apis
      
      * interpret_callback.FindOrCreateVarBlobObject
      
      * rename OpArgAttribute to OpArgParallelAttribute
      
      * refactor EagerConsistentBlob/EagerMirroredBlob
      
      * interpret completed variable op
      
      * bugfix: mv oneflow.function to oneflow.global_function
      
      * fix the bug about global_function being called twice
      
      * refactor NormalModelUpdateKernel::Forward so it can be called by eager execution
      
      * refactor Kernel::Forward to the public one
      
      * refactor system kernel to be compatible with eager execution
      
      * 1) boxing_util.TryBroadcastOneToMany; 2) vm_util.BoxingStatelessCall
      
      * 1) refactor class Maybe and class Error; 2) fix boxing_util.TryBroadcastOneToMany
      
      * more test case for eager executed model_init
      
      * fix CastToMirrored::InferSbpSignature and CastFromMirrored::InferSbpSignature
      
      * merge develop
      
      * merge develop
      
      * boxing_util.TrySingleDeviceBoxing
      
      * op_executor
      
      * refactor boxing_util with boxing_hob
      
      * more boxing methods
      
      * GetEnvDefaultParallelConf
      
      * boxing_util.NcclAllReduce
      
      * OneflowVm::TryReceiveAndRun
      
      * Remove class ForeignWorkerCallback (#3097)
      
      * Remove class ForeignWorkerCallback
      
      * Add test_2d_gpu_variable
      
      * framework/register_python_callback.py
      
      * rename EagerModelForward to EagerForward
      
      * boxing_hob.MasterMachineOnly
      
      * boxing_util.ComposeBoxing
      
      * HobContextAttr
      
      * boxing_util.BroadcastManyToOne
      
      * boxing_util.NoBoxing
      
      * merge develop
      
      * c_api_util.InferOpConf
      
      * ReplaceBlobParallelDesc
      
      * rename local variables
      
      * refine boxing_util.BoxingTo
      
      * boxing_util.NcclAllReduce
      
      * support non broadcast paralleled variable
      
      * FillLogicalBlobDescSignature in InferOpConf
      
      * NaiveCpuConcatSplit
      
      * fix SystemOpKernelObject::ResetKernel
      
      * RwMutexedObject::Get returns Maybe<const T*> instead of const T&
      
      * CHECK_OK
      
      * 1) boxing_middle; 2) boxing_util.RefBlobObjectWithParallelDesc
      
      * more boxing methods composed with NaiveCpuConcatSplit
      
      * boxing_util.CpuManyOneToOne
      
      * Scope
      
      * 1) refactor vm::SymbolStorage; 2) Scope
      
      * refactor GetOpAttribute4OpConf
      
      * update session.Scope when new name_scope constructed
      
      * refactor c_util.InferOpConf
      
      * OperatorConf.scope_symbol_id
      
      * refactor AddAndInferConsistentOp/AddAndInferMirroredOp
      
      * fix get_variable
      
      * global function input output (#3065)
      
      * eager return
      
      * update test
      
      * update output
      
      * global function input base test pass
      
      * update test
      
      * fix some issues
      
      * EagerConsistentBlob return
      
      * merge dev_eager
      
      * refactor EagerConsistentBlob.numpy(...)
      
      * minor update
      
      * refactor.ModeScope
      
      * refactor GetOpAttribute4OpConf
      
      * fix unittest using numpy_mirrored_list
      
      Co-authored-by: lixinqi <lixinqi0703106@163.com>
      
      * eager oneflow.watch
      
      * Rename symbol_cache.py to symbol_storage.py (#3138)
      
      * eager watch_diff
      
      * more eager tests
      
      * Remove duplicate function GetParallelContext
      
      * 1) add FeedContext; 2) remove LocalFixedTensor
      
      * add instruction FeedBlob
      
      * rename: WatchBlob => FetchBlob
      
      * oneflow.env.enable_eager_environment
      
      * IsCpuOnly
      
      * fix eager push_util bugs
      
      * code format
      
      * reformat
      
      * ArgBlobDef.SetBatchAxisAndSplitAxis
      
      * CallOpkernel instruction family add argument SbpSignature
      
      * refactor remote eager blob
      
      * refactor InitGlobalCudaDeviceProp
      
      * fix InterfaceOpUtil
      
      * recursively call MakeEagerInputBlobs
      
      * copy returned blob in train job
      
      * 1) refactor UserStatelessCallOpKernel; 2) replace Global<ThreadMgr>::Get()->compute_thread_pool() with Global<ThreadPool>::Get()
      
      * fix gpu argwhere
      
      * refactor BlobObject::header_buffer_
      
      * interpret_util.ConsistentInterpret
      
      * return more debug messages when encountering mixed consistent/mirrored error
      
      * TryMirroredCastTotalLossInstanceNum
      
      * replace compile_context.CurJobAddOp with interpret_util.Forward
      
      * check backward timeline
      
      * add scope to return (#3156)
      
      * add scope to return
      
      * more elegant
      
      * Dev eager merge develop (#3157)
      
      * skip empty stream (#3141)
      
      * skip empty stream
      
      * skip empty stream
      
      * add tbs for gdb in docker (#3139)
      
      * add tbs for gdb in docker
      
      * add more desc
      
      * Fix CUDNN_STATUS_NOT_SUPPORTED error for bn (#3147)
      
      * Fix CUDNN_STATUS_NOT_SUPPORTED error for bn
      
      * always use nchw when training
      
      * fix xla cmake arg (#3144)
      
      Co-authored-by: guoran <guoran@oneflow.org>
      Co-authored-by: Juncheng <liujuncheng1022@gmail.com>
      
      * Autograd use user op (#3151)
      
      * Use scalar_mul user op
      
      * IndexedSlicesOptimizerRewritePass use scalar_mul user op
      
      * scalar_sub_by_tensor
      
      * broadcast_div=>scalar_div
      
      * fix name
      
      * fix int_operand
      
      * fix diff_lbi
      
      * install python pkgs from dev-requirements.txt when running CI (#3121)
      
      * Hotfix multi machine vm panic (#3153)
      
      * fix multi machine vm panic
      
      * fix compile bug
      
      * fix vm unittest bug (#3155)
      
      * Fix test cases
      
      Co-authored-by: Li Xinqi <lixinqi2010@gmail.com>
      Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com>
      Co-authored-by: Juncheng <liujuncheng1022@gmail.com>
      Co-authored-by: guo ran <360112263@qq.com>
      Co-authored-by: guoran <guoran@oneflow.org>
      Co-authored-by: daquexian <daquexian566@gmail.com>
      
      * ParallelSignature
      
      * fix normalization grad ops timeline
      
      * Dev eager fix assign op (#3160)
      
      * Fix assign op
      
      * Add enable_if to assign api
      
      * Dev eager merge develop branch (#3164)
      
      * fix vm unittest bug (#3155)
      
      * Support BN Ex Operation (#3154)
      
      * Hotfix CUDNN_STATUS_NOT_SUPPORTED error (#3162)
      
      * xrt support user op (#3152)
      
      * xrt support user op
      
      * xla add Sole func
      
      * tensorrt Sole func
      
      * fix
      
      Co-authored-by: Juncheng <liujuncheng1022@gmail.com>
      
      Co-authored-by: Li Xinqi <lixinqi2010@gmail.com>
      Co-authored-by: Juncheng <liujuncheng1022@gmail.com>
      Co-authored-by: guo ran <360112263@qq.com>
      
      * refactor BoxingHobContext
      
      * fix FixedTensorDef
      
      * refactor boxing_util.BroadcastManyToOne and boxing_util.BroadcastOneToMany
      
      * refactor eager boxing
      
      * replace compile_context.CurJobAddOp with interpret_util.Forward
      
      * boxing verbose
      
      * Call TryClearObject4BlobName in EagerConsistentBlob.__del__
      
      * fix instruction CudaHostRegisterBlob
      
      * new boxing method B -> S
      
      * blob_register.RegisteredBlobAccess
      
      * fix the use of enable_if (#3179)
      
      * add boxing P->B and P->S
      
      * fix eager oneflow.assign
      
      * Tensor list input and output of eager global function (#3181)
      
      * input tensor list
      
      * test input
      
      * output tensor list and update test
      
      * fix op watch (#3182)
      
      * fix op watch
      
      * optimized code
      
      * Math binary elementwise ops (#3169) (#3184)
      
      * math binary elementwise ops
      
      * implement of math binary elementwise gpu floating kernel
      
      * implement math binary elementwise cpu kernel; add test scripts
      
      * rm note
      
      Co-authored-by: guo ran <360112263@qq.com>
      
      Co-authored-by: cheng cheng <472491134@qq.com>
      Co-authored-by: guo ran <360112263@qq.com>
      
      * rename test_gather* to test_agather* (#3191)
      
      * Dev eager merge develop (#3192)
      
      * Math binary elementwise ops (#3169)
      
      * math binary elementwise ops
      
      * implement of math binary elementwise gpu floating kernel
      
      * implement math binary elementwise cpu kernel; add test scripts
      
      * rm note
      
      Co-authored-by: guo ran <360112263@qq.com>
      
      * Remove multiply/gelu/tanh system op (#3183)
      
      * Remove multiply system op
      
      * Remove gelu/tanh system op
      
      * fix
      
      * Remove layer_norm/slice system op (#3180)
      
      * Remove layer_norm system op
      
      * Remove slice system op
      
      * Remove scalar_add/scalar_mul system op (#3189)
      
      * Remove useless system op (#3190)
      
      * Remove axpy system op
      
      * Remove print system op
      
      * Remove reduce_mean system op
      
      * Remove local_response_normalization system op
      
      * cleanup kernel.proto
      
      * Remove dot system op
      
      * Remove maximum system op
      
      * cleanup TopKOpConf
      
      * cleanup op_conf
      
      * remove TryUpdtBnVal4SepcialOpConf
      
      * fix xrt print
      
      Co-authored-by: cheng cheng <472491134@qq.com>
      Co-authored-by: guo ran <360112263@qq.com>
      Co-authored-by: Juncheng <liujuncheng1022@gmail.com>
      
      * remove test_eager.py (#3193)
      
      Co-authored-by: ouyangyu <xuanjiuye@gmail.com>
      
      Co-authored-by: OuYang Yu <xuanjiuye@gmail.com>
      Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com>
      Co-authored-by: tsai <caishenghang@oneflow.org>
      Co-authored-by: leaves-zwx <kunta0932@gmail.com>
      Co-authored-by: Juncheng <liujuncheng1022@gmail.com>
      Co-authored-by: guo ran <360112263@qq.com>
      Co-authored-by: guoran <guoran@oneflow.org>
      Co-authored-by: daquexian <daquexian566@gmail.com>
      Co-authored-by: cheng cheng <472491134@qq.com>
  18. Jun 26, 2020