How Should I Use Blobs In A Caffe Python Layer, And When Does Their Training Take Place?
Solution 1:
You can add as many internal parameters as you wish, and these parameters (Blobs) may have whatever shape you want them to be.
To add Blobs (in your layer's class):
def setup(self, bottom, top):
    self.blobs.add_blob(2)       # add two blobs
    self.blobs[0].reshape(3, 4)  # first blob is 2D
    self.blobs[0].data[...] = 0  # init to zero
    self.blobs[1].reshape(10)    # second blob is 1D with 10 elements
    self.blobs[1].data[...] = 1  # init to 1
The "meaning" of each parameter, and how you organize them in self.blobs, is entirely up to you.
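For instance, here is a minimal sketch (my own illustration, not part of the original answer) of how reshape and forward might use the internal blobs added above, treating the first blob as an additive bias; the bottom shape (N, 3, 4) is assumed only so that broadcasting works:

def reshape(self, bottom, top):
    # output has the same shape as the input
    top[0].reshape(*bottom[0].data.shape)

def forward(self, bottom, top):
    # illustrative use only: add the first internal blob (shape (3, 4))
    # to every sample of a bottom shaped (N, 3, 4)
    top[0].data[...] = bottom[0].data + self.blobs[0].data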
How are trainable parameters being "trained"?
This is one of the cool things about caffe (and other DNN toolkits as well): you don't need to worry about it!
What do you need to do? All you have to do is compute the gradient of the loss w.r.t. the parameters and store it in self.blobs[i].diff. Once the diffs are filled in, caffe's internals take care of updating the parameters according to the gradients, learning rate, momentum, update policy, etc.
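As a rough illustration of that division of labour, here is a conceptual sketch of the plain SGD-with-momentum policy; this is not caffe's actual code (the real update lives in the C++ solvers), and base_lr and momentum come from the solver prototxt, not from your layer:

import numpy as np

def sgd_momentum_step(data, diff, history, base_lr=0.01, momentum=0.9):
    """One SGD-with-momentum update, applied in place to a parameter array.

    history holds the previous update; the solver keeps one such array per blob.
    """
    history[...] = momentum * history - base_lr * diff
    data[...] += history

# toy usage: a single 3x4 parameter with a gradient of all ones
w = np.zeros((3, 4)); grad = np.ones((3, 4)); hist = np.zeros((3, 4))
sgd_momentum_step(w, grad, hist)   # w is now -0.01 everywhere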
So, you must implement a non-trivial backward method for your layer:
def backward(self, top, propagate_down, bottom):
    self.blobs[0].diff[...] = ...  # diff of the first parameter blob
    self.blobs[1].diff[...] = ...  # and likewise for all the other blobs
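To make the stub concrete, here is a hedged example (my own illustration, not from the answer): suppose setup created a single parameter blob of shape (C,) that scales each channel of an (N, C) bottom, i.e. forward computed top[0].data = bottom[0].data * self.blobs[0].data. The backward then fills in the parameter diff and, when requested, the bottom diff:

def backward(self, top, propagate_down, bottom):
    # dL/dw_c = sum_n dL/dtop[n, c] * bottom[n, c]
    self.blobs[0].diff[...] = (top[0].diff * bottom[0].data).sum(axis=0)
    if propagate_down[0]:
        # dL/dbottom[n, c] = dL/dtop[n, c] * w_c
        bottom[0].diff[...] = top[0].diff * self.blobs[0].data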
Once you complete the layer, you might want to test your implementation. Have a look at this PR for a numerical test of the gradients.
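Alternatively, a crude numerical check can be done by hand from pycaffe with central differences. In the sketch below, 'layer_test.prototxt', the layer name 'my_py_layer' and the output blob name 'loss' are all placeholders, and the net is assumed to feed the same batch on every forward pass:

import numpy as np
import caffe

caffe.set_mode_cpu()
net = caffe.Net('layer_test.prototxt', caffe.TRAIN)  # your layer + a loss on top
param = net.params['my_py_layer'][0]                  # first internal blob

# analytic gradient, as computed by your backward()
param.diff[...] = 0
net.forward()
net.backward()
analytic = param.diff.copy()

# numeric gradient of the loss w.r.t. one parameter entry
idx = np.unravel_index(0, param.data.shape)
eps = 1e-4
orig = float(param.data[idx])
param.data[idx] = orig + eps
loss_plus = float(net.forward()['loss'])
param.data[idx] = orig - eps
loss_minus = float(net.forward()['loss'])
param.data[idx] = orig
numeric = (loss_plus - loss_minus) / (2 * eps)

print('analytic %g vs. numeric %g' % (analytic[idx], numeric))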