fix memory-related issues to enable ASAN tests #14223
eric-haibin-lin merged 10 commits into apache:master
4846bba to fb43b71
|
Can you make them blocking as part of this PR? |
|
@mxnet-label-bot add [Memory, pr-awaiting-review] |
|
There still seems to be some problem with shutdown order http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fcentos-gpu/detail/PR-14223/3/pipeline#step-102-log-1650 |
b5bc2da to 4f2549c
|
@szha After digging into the CI crash, two more problems are addressed:
@marcoabreu ASAN tests are blocking in CI now. |
14ad85e to 4e724c0
bfba55e to 9d20133
|
Why can't I trigger the full CI process? |
|
@arcadiaphy there seems to be some problem with the CI right now. |
|
@szha Everything seems OK now. The only remaining problem is that I have changed code in the mshadow and dmlc-core submodules. @marcoabreu The ASAN log looks clean too. |
|
@arcadiaphy thanks! Feel free to PR those changes to the respective repos. Once merged, you can change the submodules to point to the new commits there. |
|
@szha Submodules are merged and pointed to new commits. |
Thanks for the fix! I noticed that the patch is made to the mxnet-stable branch in dmlc-core. I don't think that is a good sign: we do not want to diverge from dmlc-core master. @szha what do you think?
|
I agree that the mxnet-stable branch should be merged back to master ASAP. @hcho3 informed me that he's taking a look now. For now, I think that effort can be handled separately from this PR. |
* fix heap overflow
* fix memory leak of optimizer and executer
* uncomment memory pool free
* run cleanup in engine shutdown phase
* make asan tests blocking
* fix abort in mxnet shutdown, use forked submodules temporally for tests
* trigger CI
* change submodule mshadow
* change submodule dmlc-core
Description
Continuing the discussion in #14176, this PR fixes memory-related issues detected by ASAN:
Currently the ASAN tests are non-blocking; with this PR, the checks are green.
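For context on the class of bug involved, ASAN reports a heap-buffer-overflow when code writes past the end of a heap allocation. The sketch below shows the usual shape of a fix, clamping a copy to the destination size. It is an illustration under assumptions, not the change made in this PR, and `BoundedCopy` is a hypothetical function name.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of MXNet): copies at most dst->size()
// elements from src, so the write can never run past the end of the
// destination buffer. Returns the number of elements actually copied.
std::size_t BoundedCopy(const std::vector<int>& src, std::vector<int>* dst) {
  const std::size_t n = std::min(src.size(), dst->size());
  std::copy(src.begin(), src.begin() + n, dst->begin());
  return n;
}
```

An unclamped version that copied all of `src` into a smaller `dst` would be flagged by a build compiled with `-fsanitize=address` (GCC or Clang); the clamped version above stays within bounds.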