Fixing the leak. #1
Closed
Hi. I'm from the Monadfix consultancy.
My preferred way to profile Haskell programs looks something like
so I'll be using this.
Running the original version of the code results in
`ARR_WORDS` is stuff related to `ByteString`, so a lot of `ByteString`s are loaded into memory. The culprit is `foldChunks` (I cannot say right now what exactly the problem with `foldChunks` is; my intuition just told me it looks suspicious. I'll think about it), so if we remove (note also that `runResourceT` is completely redundant here)
and write this instead:
we'll get:
Which is the actual leak. But it's a simple one: you just allocate a lot of `[]` things, i.e. lists. Namely, `x` and `y` in `\x y -> force $ mappend x y` are of the same type: `Fix TestType`. The monoid instance for this type is
which in the `mappend` case uses the monoid instance of `TestType`:
which itself in the `mappend` case uses the monoid instance of `[]`. I.e. `valueToTypes` generates some lists inside `Fix TestType` and then you append (and fully force each time!) these lists over and over again just to discard the resulting value via `someStuff $ unfix $ ...` later. The now obvious solution is to discard lists immediately while folding:
which results in
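To make the shape of this leak concrete, here is a minimal sketch that is not the code from this PR. `Collected` is a hypothetical stand-in for `Fix TestType`, its `Monoid` instance simply appends the wrapped lists, and "the part that is actually needed" is assumed to be just a count. The first fold appends and fully forces a growing list at every step; the second reduces each item before combining, so no list accumulates.

```haskell
import Control.DeepSeq (NFData (..), force)
import Data.List (foldl')

-- Hypothetical stand-in for the real type: a wrapper whose Monoid
-- instance just appends the underlying lists.
newtype Collected = Collected [Int]

instance NFData Collected where
  rnf (Collected xs) = rnf xs

instance Semigroup Collected where
  Collected xs <> Collected ys = Collected (xs ++ ys)

instance Monoid Collected where
  mempty = Collected []

-- Leaky shape: every step appends to, and fully forces, an ever-growing
-- list, only for the list itself to be thrown away at the end.
sizeViaAppend :: [Collected] -> Int
sizeViaAppend cs =
  case foldl' (\x y -> force (mappend x y)) mempty cs of
    Collected xs -> length xs

-- Small-accumulator shape: reduce each item to what is needed (assumed
-- here to be a count) before combining, so no list is retained.
sizeViaDiscard :: [Collected] -> Int
sizeViaDiscard = foldl' (\acc (Collected xs) -> acc + length xs) 0
```

Both functions return the same number; only the second keeps the accumulator at a fixed size, which is the point of discarding the lists inside the fold.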
By the way, I cannot review your `conduit` code as I'm not very familiar with `conduit`, but one thing that looks overcomplicated is
What `readFile` does is allocate a handle, read the file and release the handle, i.e. it reads the file strictly. But it does everything related to `ResourceT` inside this single function, so you could as well write
But `B.ByteString` is already strict, so there is no reason to use `ResourceT` here. Hence
is sufficient.
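As a rough sketch of that last point, assuming `B` is `Data.ByteString` (the strict variant): the function names below are hypothetical and this is not the code from the PR. The first function spells out the acquire, read, release steps with `withFile`; the second relies on `B.readFile`, which reads the whole file strictly on its own, so no resource-management layer is needed around it.

```haskell
import qualified Data.ByteString as B
import System.IO (IOMode (ReadMode), withFile)

-- Explicit version: acquire a handle, read everything strictly,
-- and release the handle when the action finishes (or throws).
readFileBracketed :: FilePath -> IO B.ByteString
readFileBracketed path = withFile path ReadMode B.hGetContents

-- Plain version: strict B.readFile returns the full contents and does
-- not leave the file open afterwards.
readFileStrict :: FilePath -> IO B.ByteString
readFileStrict = B.readFile
```

Because the result is a strict `ByteString`, the contents are fully in memory by the time either function returns, and the two are interchangeable for reading a whole file.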