In Keras, training an ML model should feel similar to using a scikit-learn classifier. The API provides methods to design, compile, fit and evaluate both simple and complex machine learning architectures. The beauty of Keras lies in its simplicity. The typical structure of a training process is:
- Design a model architecture
- Compile the model with an optimizer, loss and a choice of metrics
- Train the model for a given number of epochs
Let's have a look at some real code to see this structure in action (adapted from the TensorFlow website):
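The following is a minimal sketch in the spirit of the TensorFlow beginner tutorial (MNIST classification; the layer sizes and epoch count here follow that tutorial, but the exact original listing may differ):

```python
import tensorflow as tf

# Load and normalize the MNIST dataset
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# 1. Design a model architecture
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

# 2. Compile the model with an optimizer, loss and a choice of metrics
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# 3. Train the model for a given number of epochs, then evaluate
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
```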
In the example above, a very simple model architecture was used. Depending on the needed level of complexity, there are three ways of designing the model (so-called API styles):
- The sequential API
- The functional API
- Subclassing
In the sequential approach, we first instantiate the tf.keras.models.Sequential class, then simply add, one by one, all the layers we need to create the desired architecture. See the example below for a simple convolutional architecture.
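Here is a minimal sketch of this style (the specific filter counts and kernel sizes are illustrative assumptions):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Sequential style: instantiate the class, then add layers one by one
model = keras.models.Sequential()
model.add(keras.Input(shape=(28, 28, 1)))  # input size: 28x28 grayscale images

model.add(layers.Conv2D(32, kernel_size=3, activation='relu'))  # no padding: 28x28 -> 26x26
model.add(layers.MaxPooling2D(pool_size=2))                     # halves the size: 26x26 -> 13x13
model.add(layers.Conv2D(64, kernel_size=3, activation='relu'))  # 13x13 -> 11x11
model.add(layers.MaxPooling2D(pool_size=2))                     # 11x11 -> 5x5 (floor division)

model.add(layers.Flatten())
model.add(layers.Dense(10, activation='softmax'))

model.summary()
```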
After defining the input size, all subsequent sizes are inferred from the applied operations. In this example, the max-pooling layers halve the image size, whilst the convolutional layers shrink each spatial dimension by kernel_size - 1 in total (roughly half the kernel size on each side), because no padding is applied.
Although this style is very convenient to use, it cannot express more complex architectures such as residual networks. For more control, the functional API can be used.
Layers can be seen as functions or callables, which take a tensor and return a tensor. Following this logic, a model can also be created by linking layers, feeding each layer's return value into the next. By doing so, we create a directed graph along which the tensor data flows.
This allows the model designer to build residual connections, shared layers, and even multiple inputs or outputs. In the example below, a simple residual network is defined.
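Below is a sketch of a minimal residual block in the functional style (the filter counts are illustrative assumptions):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Functional style: layers are callables that take and return tensors
inputs = keras.Input(shape=(32, 32, 3))

# Block 1: plain convolution
x = layers.Conv2D(64, kernel_size=3, padding='same', activation='relu')(inputs)

# Block 2: two convolutions plus a residual (skip) connection
y = layers.Conv2D(64, kernel_size=3, padding='same', activation='relu')(x)
y = layers.Conv2D(64, kernel_size=3, padding='same')(y)
x = layers.add([x, y])  # residual addition: output = x + conv(conv(x))
x = layers.Activation('relu')(x)

# Classification head
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(10, activation='softmax')(x)

# The model is constructed from the input and the output of the graph
model = keras.Model(inputs=inputs, outputs=outputs)
model.summary()
```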
As you can see at the layers.add call, the output of block 2 is the sum of its input x and the result of the two convolutional layers. This would not be possible to implement with the sequential approach.
Note how you define a keras.Input item, which plays a role very similar to tf.placeholder in TensorFlow 1. In the end, the model instance is constructed from the input layer and the output layer, which sits at the leaf node of the defined graph.
The most flexible, and most complex, approach is subclassing. This OOP pattern allows you to implement your own layers and customize every aspect of your model's behaviour. This is done by writing a class which extends keras.layers.Layer and implements the necessary methods (typically __init__, build and call).
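A minimal sketch of such a custom layer, a simplified re-implementation of a dense layer (the name MyDense is chosen here for illustration):

```python
import tensorflow as tf
from tensorflow import keras

class MyDense(keras.layers.Layer):
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        # Weights are created lazily, once the input shape is known
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer='glorot_uniform',
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,),
            initializer='zeros',
            trainable=True,
        )

    def call(self, inputs):
        # The forward pass: a plain affine transformation
        return tf.matmul(inputs, self.w) + self.b
```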
You can combine subclassing with a sequential or functional API style!
After implementing a custom layer, you can simply chain it into your functional/sequential style model and use the default methods for convenience!
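For instance, the hypothetical MyDense layer sketched above drops straight into a sequential model:

```python
from tensorflow import keras

# The custom MyDense layer from the sketch above, used like any built-in layer
model = keras.models.Sequential([
    keras.Input(shape=(784,)),
    MyDense(128),
    keras.layers.Activation('relu'),
    MyDense(10),
    keras.layers.Activation('softmax'),
])

# All the default conveniences (compile/fit/evaluate) still work
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```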