Conversation
Neat! This can be used to implement …
Would be nice to have this. Why is it not merged yet?
Still not merged?
I pulled this locally and resolved the conflicts (just a matter of fixing the param IDs in caffe.proto). It works well except for the backward pass: the code seems mathematically fine, yet I keep getting the same gradient-check failure every time:
The reason is that an input value sometimes lands near a clipping boundary, where the function's derivative is discontinuous (it is always the same element because we keep the random seed fixed). There, the finite-difference gradient estimate differs from the analytically computed value by a rather large factor (see the sketch after this comment). I can think of two hacky ways around this:
I will submit a new PR, cherry-picking @harm-nedap's commit and adding this fix to it.
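For illustration, here is a minimal standalone sketch of why a centered-difference estimate disagrees with the analytic gradient when an input lands near a clip boundary. This is plain C++, not Caffe's gradient checker; the bounds, step size, and input value are made up for the example:

```cpp
#include <cstdio>

// clip(x) = min(max(x, lo), hi). The function is continuous, but its
// derivative jumps from 1 to 0 at the boundaries lo and hi.
double clip(double x, double lo, double hi) {
  return x < lo ? lo : (x > hi ? hi : x);
}

int main() {
  const double lo = 0.0, hi = 1.0;
  const double eps = 1e-2;     // hypothetical finite-difference step size
  const double x = hi - 1e-3;  // input lands just inside the upper boundary

  // Analytic gradient: x is strictly inside (lo, hi), so dclip/dx = 1.
  const double analytic = 1.0;

  // The centered difference straddles the kink at hi, so it averages the
  // slopes on either side (1 and 0) instead of measuring either one.
  const double numeric = (clip(x + eps, lo, hi) - clip(x - eps, lo, hi)) / (2 * eps);

  printf("analytic: %.3f  numeric: %.3f\n", analytic, numeric);
  return 0;
}
```

With these numbers the centered difference reports roughly 0.55 against an analytic value of 1, which matches the "rather large factor" described above.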
This PR adds a clipping layer that clips the bottom data to a given [min, max] range. I used it to ensure that the output of the network stays within a certain range after it was trained.
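For reference, a minimal sketch of what such a layer computes. This is plain C++, not the actual Caffe layer code; the struct and member names are made up. Forward clips each element to the range, and backward passes the gradient through only where the input was not clipped:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical clipping layer. Buffers are assumed to be pre-sized to
// match the bottom blob.
struct ClipLayer {
  float min_, max_;  // clipping range, taken from the layer parameters

  // Forward: y_i = min(max(x_i, min_), max_).
  void Forward(const std::vector<float>& bottom, std::vector<float>* top) {
    for (size_t i = 0; i < bottom.size(); ++i)
      (*top)[i] = std::min(std::max(bottom[i], min_), max_);
  }

  // Backward: dE/dx_i = dE/dy_i where the input was inside the range,
  // and 0 where the input was clipped (y is constant there).
  void Backward(const std::vector<float>& bottom,
                const std::vector<float>& top_diff,
                std::vector<float>* bottom_diff) {
    for (size_t i = 0; i < bottom.size(); ++i) {
      const bool clipped = bottom[i] < min_ || bottom[i] > max_;
      (*bottom_diff)[i] = clipped ? 0.0f : top_diff[i];
    }
  }
};
```

The jump between the pass-through gradient inside the range and the zero gradient outside it is exactly the derivative discontinuity that the gradient checker trips over in the comments above.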