The process of learning good features for machine learning applications can be very computationally expensive and may prove difficult in cases where little data is available. A prototypical example of this is the one-shot learning setting, in which we must correctly make predictions given only a single example of each new class.
Here, I explore one-shot learning with a popular architecture known as the Siamese Neural Network.
Fire up your favorite command-line utility (e.g., Terminal, iTerm, or Command Prompt) and run the following commands to clone the project:
```bash
$ git clone https://github.com/victor-iyiola/few-shot-learning.git
$ cd few-shot-learning && ls
LICENSE README.md datasets images omniglot one-shot.ipynb utils.py
```
Or simply download this repository, and change your working directory to the downloaded project.
```bash
$ cd path/to/few-shot-learning
$ ls
LICENSE README.md datasets images omniglot one-shot.ipynb utils.py
```
**Note:** Windows users should replace the `ls` command with `dir`.
This project was developed with Python v3.6.5; however, any later version of Python 3 should work fine.
- Jupyter >= v4.4.0
- NumPy >= v1.14.3
- scikit-learn >= v0.19.1
- Keras >= v2.2.0
- TensorFlow >= v1.9.0
```bash
$ pip3 install --upgrade -r requirements.txt
```
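For reference, the pinned dependencies in `requirements.txt` should line up with the versions listed above; a minimal file consistent with that list might look like this (the repository ships its own, so treat this as a sketch):

```
jupyter>=4.4.0
numpy>=1.14.3
scikit-learn>=0.19.1
Keras>=2.2.0
tensorflow>=1.9.0
```

Once the dependencies are installed, start the Jupyter notebook server: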
```bash
$ jupyter notebook
[I 08:36:52.271 LabApp] The Jupyter Notebook is running at:
[I 08:36:52.271 LabApp] http://localhost:8888/?token=cb246f438ca40a1a319d12c877d2e825c923fc0525c9d136
[I 08:36:52.271 LabApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 08:36:52.273 LabApp]

    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://localhost:8888/?token=cb246f438ca40a1a319d12c877d2e825c923fc0525c9d136
```
A standard Siamese convolutional neural network has $L$ layers, each with $N_l$ units, where $h_{1,l}$ denotes the hidden vector in layer $l$ for the first twin and $h_{2,l}$ the same for the second twin. The model consists of a sequence of convolutional layers, each of which uses a single channel with filters of varying size and a fixed stride of 1. The number of convolutional filters is specified as a multiple of 16 to optimize performance. The network applies a ReLU activation function to the output feature maps, optionally followed by max-pooling with a filter size and stride of 2. Thus the $k$-th filter map in each layer takes the following form:
$$ a^{(k)}_{1,m} = \textrm{max-pool}\left(\max\left(0,\, W^{(k)}_{l-1} \star h_{1,(l-1)} + b_l\right),\, 2\right) $$

$$ a^{(k)}_{2,m} = \textrm{max-pool}\left(\max\left(0,\, W^{(k)}_{l-1} \star h_{2,(l-1)} + b_l\right),\, 2\right) $$
where $W^{(k)}_{l-1}$ is the convolutional filter for the $k$-th feature map in layer $l$, $\star$ denotes the valid convolution operation, and $b_l$ is the shared bias for the units in layer $l$.
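To make the architecture concrete, here is a minimal Keras sketch of a single twin along the lines described above (an illustrative reconstruction, not the repository's exact code; the filter sizes and counts follow the standard Koch et al. setup, and `build_twin` is a hypothetical helper name):

```python
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense

def build_twin(input_shape=(105, 105, 1)):
    """One twin of the Siamese network; both twins share these weights."""
    inputs = Input(shape=input_shape)
    # Single-channel filters of varying size, stride 1, ReLU activations;
    # filter counts are multiples of 16, as described above.
    x = Conv2D(64, (10, 10), strides=1, activation='relu')(inputs)
    x = MaxPooling2D(pool_size=2, strides=2)(x)  # max-pool(., 2)
    x = Conv2D(128, (7, 7), strides=1, activation='relu')(x)
    x = MaxPooling2D(pool_size=2, strides=2)(x)
    x = Conv2D(128, (4, 4), strides=1, activation='relu')(x)
    x = MaxPooling2D(pool_size=2, strides=2)(x)
    x = Conv2D(256, (4, 4), strides=1, activation='relu')(x)
    x = Flatten()(x)
    # Fully-connected embedding on which the two twins are compared.
    x = Dense(4096, activation='sigmoid')(x)
    return Model(inputs, x)
```

Each `Conv2D`/`MaxPooling2D` pair implements one application of the feature-map equation above: a valid convolution plus bias, a ReLU, then max-pooling with filter size and stride of 2.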
Loss function. Let $M$ denote the minibatch size, where $i$ indexes the $i$-th minibatch, and let $y(x_1^{(i)}, x_2^{(i)})$ be the label for a pair of images: $y = 1$ when $x_1$ and $x_2$ come from the same class and $y = 0$ otherwise. The network is trained on a regularized binary cross-entropy objective of the form

$$ \mathcal{L}(x_1^{(i)}, x_2^{(i)}) = y(x_1^{(i)}, x_2^{(i)}) \log p(x_1^{(i)}, x_2^{(i)}) + \left(1 - y(x_1^{(i)}, x_2^{(i)})\right) \log\left(1 - p(x_1^{(i)}, x_2^{(i)})\right) + \boldsymbol{\lambda}^\top |\mathbf{w}|^2 $$
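As a sketch of how this objective is typically wired up, the two twins share weights, their embeddings are compared with a component-wise L1 distance, and a single sigmoid unit produces $p(x_1, x_2)$, trained with binary cross-entropy (again an illustrative reconstruction, not the repository's API; `build_siamese` is a hypothetical name and reuses `build_twin` from the sketch above):

```python
from keras.models import Model
from keras.layers import Input, Dense, Lambda
import keras.backend as K

def build_siamese(twin, input_shape=(105, 105, 1)):
    left = Input(shape=input_shape)
    right = Input(shape=input_shape)
    # Weight sharing: the same twin model encodes both images.
    h1, h2 = twin(left), twin(right)
    # Component-wise L1 distance between the two embeddings.
    l1 = Lambda(lambda t: K.abs(t[0] - t[1]))([h1, h2])
    # p(x1, x2): probability that the pair belongs to the same class.
    p = Dense(1, activation='sigmoid')(l1)
    model = Model(inputs=[left, right], outputs=p)
    # Binary cross-entropy matches the objective above; the weight-decay
    # term (lambda^T |w|^2) could be added via kernel regularizers.
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model

siamese = build_siamese(build_twin())
```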
You are very welcome to modify this code and use it in your own projects.
Please keep a link to the original repository. If you have made a fork with substantial modifications that you feel may be useful, then please open a new issue on GitHub with a link and short description.
This project is released under the MIT License, which allows very broad use for both academic and commercial purposes.
A few of the images used for demonstration purposes may be under copyright; they are included here under fair use.