Simple Keras-inspired deep learning framework implemented in Python with a NumPy backend (hand-written gradients) and Matplotlib plotting. For efficient (multithreaded) Einstein summation between tensors it can use the einsum2 repo.
As with all my other repos, this is more an exercise for me to make sure I understand the main deep learning architectures and algorithms than useful code for fitting models, as well as a way to think about (relatively) efficient implementations of them. I hope this (super) simplified "Keras" re-implementation helps you understand them too!
It lets you build, train and assess a modular Multi-Layer Perceptron Sequential architecture much as you would with Keras. The model (for now) provides the following features:
- Layers:
  - Trainable: Dense, Conv2D, VanillaRNN
  - Activation: Relu, Softmax
  - Regularization: Dropout, MaxPool2D
- Losses:
  - CrossEntropy
  - CategoricalHinge
- Optimization: Minibatch SGD BackProp Training (sketched below) with customizable:
  - Batch Size
  - Epochs / Iterations
  - Momentum
  - L2 Regularization Term
- Callbacks:
  - Learning Rate Scheduler: Constant, Linear, Cyclic
  - Loss & Metrics tracker
  - Early Stopper
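To make the Optimization and Learning Rate Scheduler items above concrete, here is a minimal sketch of a minibatch SGD update with momentum and L2 regularization, together with a triangular cyclic learning rate. This only illustrates the update rules, not the repo's exact implementation; the function names and the `half_cycle` parameter are made up for the example.

```python
# Illustrative only (not the repo's exact code): one minibatch SGD update with
# momentum and L2 regularization, matching the options exposed by model.fit(...).
def sgd_momentum_step(w, grad, velocity, lr, momentum=0.8, l2_reg=0.01):
    grad = grad + 2 * l2_reg * w                 # gradient of the L2 penalty l2_reg * ||w||^2
    velocity = momentum * velocity - lr * grad   # decaying accumulation of past gradients
    return w + velocity, velocity

# Triangular cyclic learning rate between lr_min and lr_max, roughly what
# LearningRateScheduler(evolution="cyclic", lr_min=..., lr_max=...) produces.
# `half_cycle` (steps per half cycle) is a hypothetical parameter for this sketch.
def cyclic_lr(step, lr_min=1e-4, lr_max=1e-1, half_cycle=500):
    pos = abs((step % (2 * half_cycle)) - half_cycle) / half_cycle  # goes 1 -> 0 -> 1
    return lr_max - (lr_max - lr_min) * pos

# Tiny usage example (works with scalars or NumPy arrays alike):
w, v = 0.5, 0.0
for step in range(3):
    lr = cyclic_lr(step, half_cycle=2)
    w, v = sgd_momentum_step(w, grad=0.1, velocity=v, lr=lr)
```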
Code Example:
```python
# Imports
from mlp.callbacks import MetricTracker, LearningRateScheduler
from mlp.layers import Conv2D, Dense, Flatten, MaxPool2D, Softmax, Relu, Dropout
from mlp.losses import CrossEntropy
from mlp.models import Sequential
from mlp.metrics import Accuracy

# Define model
model = Sequential(loss=CrossEntropy(), metric=Accuracy())
model.add(Conv2D(num_filters=32, kernel_shape=(3, 3), stride=2, input_shape=(32, 32, 3)))
model.add(Relu())
model.add(Conv2D(num_filters=64, kernel_shape=(3, 3)))
model.add(Relu())
model.add(MaxPool2D(kernel_shape=(2, 2), stride=2))
model.add(Conv2D(num_filters=128, kernel_shape=(2, 2)))
model.add(Relu())
model.add(MaxPool2D(kernel_shape=(2, 2)))
model.add(Flatten())
model.add(Dense(nodes=200))
model.add(Relu())
model.add(Dropout(0.8))
model.add(Dense(nodes=10))
model.add(Softmax())

# Define callbacks
mt = MetricTracker()  # Stores training evolution info (losses and metrics)
lrs = LearningRateScheduler(evolution="cyclic", lr_min=1e-4, lr_max=1e-1)
callbacks = [mt, lrs]

# Fit model
model.fit(X=x_train, Y=y_train, X_val=x_val, Y_val=y_val,
          batch_size=100, epochs=100, l2_reg=0.01, momentum=0.8,
          callbacks=callbacks)
mt.plot_training_progress()

# Test model
test_acc, test_loss = model.get_metric_loss(x_test, y_test)
print("Test accuracy:", test_acc)
```

Example of metrics tracked during training:
NOTE: More architectures, layers and features (LSTM, RBF, SOM, DBF) coming soon.
Metaparameter optimization is commonly needed when training these kinds of models. To ease the process I implemented a MetaParamOptimizer class with methods such as Grid Search; additionally, together with Federico Taschin, I wrote a wrapper around scikit-optimize to perform Bayesian Optimization (here).
- Define the search space and the fixed arguments of your model in two different dictionaries.
- Define an evaluator function which trains and evaluates your model on the joined arguments. This function should return a dictionary with at least the key "value" (which MetaParamOptimizer will optimize).
Code example:
```python
from mpo.metaparamoptimizer import MetaParamOptimizer
from util.misc import dict_to_string
from mlp.layers import Dense, Softmax
from mlp.losses import CategoricalHinge
from mlp.models import Sequential

search_space = {  # Optimization will be performed on all combinations of these
    "batch_size": [100, 200, 400],  # Batch sizes
    "lr": [0.001, 0.01, 0.1],       # Learning rates
    "l2_reg": [0.01, 0.1]           # L2 regularization terms
}
fixed_args = {  # These will be kept constant
    "x_train": x_train,
    "y_train": y_train,
    "x_val": x_val,
    "y_val": y_val,
    "epochs": 100,
    "momentum": 0.1,
}

def evaluator(x_train, y_train, x_val, y_val, **kwargs):
    # Define model (ex: SVM)
    model = Sequential(loss=CategoricalHinge())
    model.add(Dense(nodes=10, input_dim=x_train.shape[0]))
    model.add(Softmax())

    # Fit model
    model.fit(X=x_train, Y=y_train, X_val=x_val, Y_val=y_val, **kwargs)
    model.plot_training_progress(show=False, save=True,
                                 name="figures/" + dict_to_string(kwargs))
    model.save("models/" + dict_to_string(kwargs))

    # Evaluator result (add model to retain best)
    value = model.get_classification_metrics(x_val, y_val)[0]  # Get accuracy
    result = {"value": value, "model": model}  # MetaParamOptimizer will maximize value
    return result

# Get best model and best params
mpo = MetaParamOptimizer(save_path="models/")
best_model = mpo.grid_search(evaluator=evaluator,
                             search_space=search_space,
                             fixed_args=fixed_args)
# This will run your evaluator function on all 3x3x2 = 18 combinations of search_space params
```

Example of Gaussian Process Regression Optimizer hyperparameter analysis:
Clone repo and install requirements:
```bash
git clone https://github.com/OleguerCanal/Toy-DeepLearning-Framework.git
cd Toy-DeepLearning-Framework
pip install -r requirements.txt
```
[OPTIONAL] To parallelize Einstein summations between tensors, install einsum2; if it is not found, the single-threaded NumPy version is used instead (SLOWER).
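A common pattern for this kind of optional dependency is to try importing einsum2 and fall back to NumPy otherwise. The snippet below is only a sketch of such a fallback (it assumes the einsum2 package exposes an `einsum2` function taking a two-operand subscript string); the repo's actual import logic may differ.

```python
import numpy as np

try:
    # Assumed API: einsum2 exposes a multithreaded, two-operand einsum
    from einsum2 import einsum2 as einsum
except ImportError:
    einsum = np.einsum  # Single-threaded NumPy fallback (slower)

# Example contraction (batched matrix multiply) of the kind used in backprop code:
a = np.random.rand(8, 3, 4)
b = np.random.rand(8, 4, 5)
out = einsum("bij,bjk->bik", a, b)  # shape (8, 3, 5)
```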
