I wanted to add an image captioning model to Fabrik. It seemed pretty easy: all you have to do is find the model you want in JSON format, and there you go, all done. In reality, it wasn’t that easy.
First, I had to find an image captioning model. My first choice was NeuralTalk, since it’s pretty popular. On its GitHub page, though, it looked like NeuralTalk had been superseded by NeuralTalk2, so I went with NeuralTalk2. After a hard time trying to install it and trying to make the JSON file, I realized something: NeuralTalk2 uses Torch as its framework, and Fabrik doesn’t support Torch. Fabrik only supports Caffe, Keras, and TensorFlow. (I never did make the JSON file, by the way.)
I had to try another model. I ended up trying Neural Image Captioning by oarriaga on GitHub. Unlike with NeuralTalk2, making the JSON file went pretty smoothly.
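For anyone wondering what "the JSON file" means here: assuming the model is a Keras one, Keras can dump a model's architecture as a JSON string with `model.to_json()`. The sketch below is not the actual captioning model and doesn't touch Fabrik at all. It just uses a tiny hand-written JSON in that general shape (the `SAMPLE_MODEL_JSON` contents and the `list_layers` helper are made up for illustration) and lists the layers inside it, which is a quick way to sanity-check a file before trying to import it anywhere.

```python
import json

# Hand-written stand-in for a Keras-style architecture file.
# This is NOT the real captioning model, just an illustrative toy.
SAMPLE_MODEL_JSON = """
{
  "class_name": "Sequential",
  "config": {
    "name": "toy_model",
    "layers": [
      {"class_name": "Dense", "config": {"name": "dense_1", "units": 8}},
      {"class_name": "Activation", "config": {"name": "relu_1", "activation": "relu"}}
    ]
  }
}
"""


def list_layers(model_json):
    """Return (layer_name, layer_type) pairs from a Keras-style JSON string."""
    model = json.loads(model_json)
    config = model["config"]
    # Some older Keras versions store a Sequential config as a bare list
    # of layers instead of a dict with a "layers" key, so handle both.
    layers = config if isinstance(config, list) else config["layers"]
    return [(layer["config"]["name"], layer["class_name"]) for layer in layers]


if __name__ == "__main__":
    for name, kind in list_layers(SAMPLE_MODEL_JSON):
        print(name, kind)
```

If the layer list comes back looking sensible, the file is at least well-formed JSON in the expected shape. Whether a given tool supports every layer type in it is a separate question.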
So I tried to find another model. Show and Tell looked good. It uses TensorFlow as its framework, but unfortunately Fabrik <><><><, so I had to find another model.