{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Software Session -- Deep Learning Reading Group"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is the IPython Notebook for the software session in the Deep Learning Reading Group at department D3, MPI-Inf.\n",
"\n",
"We are using Keras for this session. Keras is a Python NN library. Keras can run with TensorFlow, Theano and CNTK (The Microsoft Cognitive Toolkit) as the background. We are using TensorFlow for this session.\n",
"\n",
"First up, install TensorFlow. Then install Keras (All information at keras.io). There can be some issues with the version compatibilities, but they should be mostly resolvable. For example, TensorFlow issues can be resolved using already provided solutions on their Github repo. For example, see: https://github.com/tensorflow/tensorflow/issues/647"
]
},
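{
"cell_type": "markdown",
"metadata": {},
"source": [
"To verify the installation, one can print the Keras version and the active backend (a quick sanity check; the exact output depends on your setup):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Sanity check: confirm that Keras imports and report the active backend\n",
"import keras\n",
"from keras import backend as K\n",
"print(keras.__version__)\n",
"print(K.backend())  # should print 'tensorflow' for this session"
]
},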
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"from keras.models import Sequential"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"from keras.layers import Dense, Activation"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"model = Sequential([\n",
" Dense(32, input_shape=(784,)),\n",
" Activation('relu'),\n",
" Dense(10),\n",
" Activation('softmax'),\n",
"])"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"____________________________________________________________________________________________________\n",
"Layer (type) Output Shape Param # Connected to \n",
"====================================================================================================\n",
"dense_1 (Dense) (None, 32) 25120 dense_input_1[0][0] \n",
"____________________________________________________________________________________________________\n",
"activation_1 (Activation) (None, 32) 0 dense_1[0][0] \n",
"____________________________________________________________________________________________________\n",
"dense_2 (Dense) (None, 10) 330 activation_1[0][0] \n",
"____________________________________________________________________________________________________\n",
"activation_2 (Activation) (None, 10) 0 dense_2[0][0] \n",
"====================================================================================================\n",
"Total params: 25,450\n",
"Trainable params: 25,450\n",
"Non-trainable params: 0\n",
"____________________________________________________________________________________________________\n"
]
}
],
"source": [
"model.summary()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Or, one can simply add layers as follows"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"model2 = Sequential()\n",
"model2.add(Dense(output_dim=32, input_dim=784, init=\"glorot_uniform\"))\n",
"model2.add(Activation(\"relu\"))\n",
"model2.add(Dense(output_dim=10, init=\"glorot_uniform\"))\n",
"model2.add(Activation(\"softmax\"))"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"____________________________________________________________________________________________________\n",
"Layer (type) Output Shape Param # Connected to \n",
"====================================================================================================\n",
"dense_3 (Dense) (None, 32) 25120 dense_input_2[0][0] \n",
"____________________________________________________________________________________________________\n",
"activation_3 (Activation) (None, 32) 0 dense_3[0][0] \n",
"____________________________________________________________________________________________________\n",
"dense_4 (Dense) (None, 10) 330 activation_3[0][0] \n",
"____________________________________________________________________________________________________\n",
"activation_4 (Activation) (None, 10) 0 dense_4[0][0] \n",
"====================================================================================================\n",
"Total params: 25,450\n",
"Trainable params: 25,450\n",
"Non-trainable params: 0\n",
"____________________________________________________________________________________________________\n"
]
}
],
"source": [
"model2.summary()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We see that both model and model2 are same! To understand more, use help(model) on the python cmd prompt"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Help on Sequential in module keras.models object:\n",
"\n",
"class Sequential(keras.engine.training.Model)\n",
" | Linear stack of layers.\n",
" | \n",
" | # Arguments\n",
" | layers: list of layers to add to the model.\n",
" | \n",
" | # Note\n",
" | The first layer passed to a Sequential model\n",
" | should have a defined input shape. What that\n",
" | means is that it should have received an `input_shape`\n",
" | or `batch_input_shape` argument,\n",
" | or for some type of layers (recurrent, Dense...)\n",
" | an `input_dim` argument.\n",
" | \n",
" | # Example\n",
" | \n",
" | ```python\n",
" | model = Sequential()\n",
" | # first layer must have a defined input shape\n",
" | model.add(Dense(32, input_dim=500))\n",
" | # afterwards, Keras does automatic shape inference\n",
" | model.add(Dense(32))\n",
" | \n",
" | # also possible (equivalent to the above):\n",
" | model = Sequential()\n",
" | model.add(Dense(32, input_shape=(500,)))\n",
" | model.add(Dense(32))\n",
" | \n",
" | # also possible (equivalent to the above):\n",
" | model = Sequential()\n",
" | # here the batch dimension is None,\n",
" | # which means any batch size will be accepted by the model.\n",
" | model.add(Dense(32, batch_input_shape=(None, 500)))\n",
" | model.add(Dense(32))\n",
" | ```\n",
" | \n",
" | Method resolution order:\n",
" | Sequential\n",
" | keras.engine.training.Model\n",
" | keras.engine.topology.Container\n",
" | keras.engine.topology.Layer\n",
" | __builtin__.object\n",
" | \n",
" | Methods defined here:\n",
" | \n",
" | __init__(self, layers=None, name=None)\n",
" | \n",
" | add(self, layer)\n",
" | Adds a layer instance on top of the layer stack.\n",
" | \n",
" | # Arguments\n",
" | layer: layer instance.\n",
" | \n",
" | build(self, input_shape=None)\n",
" | \n",
" | call(self, x, mask=None)\n",
" | \n",
" | compile(self, optimizer, loss, metrics=None, sample_weight_mode=None, **kwargs)\n",
" | Configures the learning process.\n",
" | \n",
" | # Arguments\n",
" | optimizer: str (name of optimizer) or optimizer object.\n",
" | See [optimizers](/optimizers).\n",
" | loss: str (name of objective function) or objective function.\n",
" | See [objectives](/objectives).\n",
" | metrics: list of metrics to be evaluated by the model\n",
" | during training and testing.\n",
" | Typically you will use `metrics=['accuracy']`.\n",
" | See [metrics](/metrics).\n",
" | sample_weight_mode: if you need to do timestep-wise\n",
" | sample weighting (2D weights), set this to \"temporal\".\n",
" | \"None\" defaults to sample-wise weights (1D).\n",
" | kwargs: for Theano backend, these are passed into K.function.\n",
" | Ignored for Tensorflow backend.\n",
" | \n",
" | # Example\n",
" | ```python\n",
" | model = Sequential()\n",
" | model.add(Dense(32, input_shape=(500,)))\n",
" | model.add(Dense(10, activation='softmax'))\n",
" | model.compile(optimizer='rmsprop',\n",
" | loss='categorical_crossentropy',\n",
" | metrics=['accuracy'])\n",
" | ```\n",
" | \n",
" | evaluate(self, x, y, batch_size=32, verbose=1, sample_weight=None, **kwargs)\n",
" | Computes the loss on some input data, batch by batch.\n",
" | \n",
" | # Arguments\n",
" | x: input data, as a Numpy array or list of Numpy arrays\n",
" | (if the model has multiple inputs).\n",
" | y: labels, as a Numpy array.\n",
" | batch_size: integer. Number of samples per gradient update.\n",
" | verbose: verbosity mode, 0 or 1.\n",
" | sample_weight: sample weights, as a Numpy array.\n",
" | \n",
" | # Returns\n",
" | Scalar test loss (if the model has no metrics)\n",
" | or list of scalars (if the model computes other metrics).\n",
" | The attribute `model.metrics_names` will give you\n",
" | the display labels for the scalar outputs.\n",
" | \n",
" | evaluate_generator(self, generator, val_samples, max_q_size=10, nb_worker=1, pickle_safe=False, **kwargs)\n",
" | Evaluates the model on a data generator. The generator should\n",
" | return the same kind of data as accepted by `test_on_batch`.\n",
" | \n",
" | # Arguments\n",
" | generator:\n",
" | generator yielding tuples (inputs, targets)\n",
" | or (inputs, targets, sample_weights)\n",
" | val_samples:\n",
" | total number of samples to generate from `generator`\n",
" | before returning.\n",
" | max_q_size: maximum size for the generator queue\n",
" | nb_worker: maximum number of processes to spin up\n",
" | pickle_safe: if True, use process based threading. Note that because\n",
" | this implementation relies on multiprocessing, you should not pass non\n",
" | non picklable arguments to the generator as they can't be passed\n",
" | easily to children processes.\n",
" | \n",
" | fit(self, x, y, batch_size=32, nb_epoch=10, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, **kwargs)\n",
" | Trains the model for a fixed number of epochs.\n",
" | \n",
" | # Arguments\n",
" | x: input data, as a Numpy array or list of Numpy arrays\n",
" | (if the model has multiple inputs).\n",
" | y: labels, as a Numpy array.\n",
" | batch_size: integer. Number of samples per gradient update.\n",
" | nb_epoch: integer, the number of epochs to train the model.\n",
" | verbose: 0 for no logging to stdout,\n",
" | 1 for progress bar logging, 2 for one log line per epoch.\n",
" | callbacks: list of `keras.callbacks.Callback` instances.\n",
" | List of callbacks to apply during training.\n",
" | See [callbacks](/callbacks).\n",
" | validation_split: float (0. < x < 1).\n",
" | Fraction of the data to use as held-out validation data.\n",
" | validation_data: tuple (x_val, y_val) or tuple\n",
" | (x_val, y_val, val_sample_weights) to be used as held-out\n",
" | validation data. Will override validation_split.\n",
" | shuffle: boolean or str (for 'batch').\n",
" | Whether to shuffle the samples at each epoch.\n",
" | 'batch' is a special option for dealing with the\n",
" | limitations of HDF5 data; it shuffles in batch-sized chunks.\n",
" | class_weight: dictionary mapping classes to a weight value,\n",
" | used for scaling the loss function (during training only).\n",
" | sample_weight: Numpy array of weights for\n",
" | the training samples, used for scaling the loss function\n",
" | (during training only). You can either pass a flat (1D)\n",
" | Numpy array with the same length as the input samples\n",
" | (1:1 mapping between weights and samples),\n",
" | or in the case of temporal data,\n",
" | you can pass a 2D array with shape (samples, sequence_length),\n",
" | to apply a different weight to every timestep of every sample.\n",
" | In this case you should make sure to specify\n",
" | sample_weight_mode=\"temporal\" in compile().\n",
" | initial_epoch: epoch at which to start training\n",
" | (useful for resuming a previous training run)\n",
" | \n",
" | # Returns\n",
" | A `History` object. Its `History.history` attribute is\n",
" | a record of training loss values and metrics values\n",
" | at successive epochs, as well as validation loss values\n",
" | and validation metrics values (if applicable).\n",
" | \n",
" | fit_generator(self, generator, samples_per_epoch, nb_epoch, verbose=1, callbacks=None, validation_data=None, nb_val_samples=None, class_weight=None, max_q_size=10, nb_worker=1, pickle_safe=False, initial_epoch=0, **kwargs)\n",
" | Fits the model on data generated batch-by-batch by\n",
" | a Python generator.\n",
" | The generator is run in parallel to the model, for efficiency.\n",
" | For instance, this allows you to do real-time data augmentation\n",
" | on images on CPU in parallel to training your model on GPU.\n",
" | \n",
" | # Arguments\n",
" | generator: a generator.\n",
" | The output of the generator must be either\n",
" | - a tuple (inputs, targets)\n",
" | - a tuple (inputs, targets, sample_weights).\n",
" | All arrays should contain the same number of samples.\n",
" | The generator is expected to loop over its data\n",
" | indefinitely. An epoch finishes when `samples_per_epoch`\n",
" | samples have been seen by the model.\n",
" | samples_per_epoch: integer, number of samples to process before\n",
" | going to the next epoch.\n",
" | nb_epoch: integer, total number of iterations on the data.\n",
" | verbose: verbosity mode, 0, 1, or 2.\n",
" | callbacks: list of callbacks to be called during training.\n",
" | validation_data: this can be either\n",
" | - a generator for the validation data\n",
" | - a tuple (inputs, targets)\n",
" | - a tuple (inputs, targets, sample_weights).\n",
" | nb_val_samples: only relevant if `validation_data` is a generator.\n",
" | number of samples to use from validation generator\n",
" | at the end of every epoch.\n",
" | class_weight: dictionary mapping class indices to a weight\n",
" | for the class.\n",
" | max_q_size: maximum size for the generator queue\n",
" | nb_worker: maximum number of processes to spin up\n",
" | pickle_safe: if True, use process based threading. Note that because\n",
" | this implementation relies on multiprocessing, you should not pass\n",
" | non picklable arguments to the generator as they can't be passed\n",
" | easily to children processes.\n",
" | initial_epoch: epoch at which to start training\n",
" | (useful for resuming a previous training run)\n",
" | \n",
" | # Returns\n",
" | A `History` object.\n",
" | \n",
" | # Example\n",
" | \n",
" | ```python\n",
" | def generate_arrays_from_file(path):\n",
" | while 1:\n",
" | f = open(path)\n",
" | for line in f:\n",
" | # create Numpy arrays of input data\n",
" | # and labels, from each line in the file\n",
" | x, y = process_line(line)\n",
" | yield (x, y)\n",
" | f.close()\n",
" | \n",
" | model.fit_generator(generate_arrays_from_file('/my_file.txt'),\n",
" | samples_per_epoch=10000, nb_epoch=10)\n",
" | ```\n",
" | \n",
" | get_config(self)\n",
" | Returns the model configuration\n",
" | as a Python list.\n",
" | \n",
" | get_layer(self, name=None, index=None)\n",
" | Returns a layer based on either its name (unique)\n",
" | or its index in the graph. Indices are based on\n",
" | order of horizontal graph traversal (bottom-up).\n",
" | \n",
" | # Arguments\n",
" | name: string, name of layer.\n",
" | index: integer, index of layer.\n",
" | \n",
" | # Returns\n",
" | A layer instance.\n",
" | \n",
" | get_losses_for(self, inputs)\n",
" | \n",
" | get_updates_for(self, inputs)\n",
" | \n",
" | get_weights(self)\n",
" | Returns the weights of the model,\n",
" | as a flat list of Numpy arrays.\n",
" | \n",
" | pop(self)\n",
" | Removes the last layer in the model.\n",
" | \n",
" | predict(self, x, batch_size=32, verbose=0)\n",
" | Generates output predictions for the input samples,\n",
" | processing the samples in a batched way.\n",
" | \n",
" | # Arguments\n",
" | x: the input data, as a Numpy array.\n",
" | batch_size: integer.\n",
" | verbose: verbosity mode, 0 or 1.\n",
" | \n",
" | # Returns\n",
" | A Numpy array of predictions.\n",
" | \n",
" | predict_classes(self, x, batch_size=32, verbose=1)\n",
" | Generate class predictions for the input samples\n",
" | batch by batch.\n",
" | \n",
" | # Arguments\n",
" | x: input data, as a Numpy array or list of Numpy arrays\n",
" | (if the model has multiple inputs).\n",
" | batch_size: integer.\n",
" | verbose: verbosity mode, 0 or 1.\n",
" | \n",
" | # Returns\n",
" | A numpy array of class predictions.\n",
" | \n",
" | predict_generator(self, generator, val_samples, max_q_size=10, nb_worker=1, pickle_safe=False)\n",
" | Generates predictions for the input samples from a data generator.\n",
" | The generator should return the same kind of data as accepted by\n",
" | `predict_on_batch`.\n",
" | \n",
" | # Arguments\n",
" | generator: generator yielding batches of input samples.\n",
" | val_samples: total number of samples to generate from `generator`\n",
" | before returning.\n",
" | max_q_size: maximum size for the generator queue\n",
" | nb_worker: maximum number of processes to spin up\n",
" | pickle_safe: if True, use process based threading. Note that because\n",
" | this implementation relies on multiprocessing, you should not pass non\n",
" | non picklable arguments to the generator as they can't be passed\n",
" | easily to children processes.\n",
" | \n",
" | # Returns\n",
" | A Numpy array of predictions.\n",
" | \n",
" | predict_on_batch(self, x)\n",
" | Returns predictions for a single batch of samples.\n",
" | \n",
" | predict_proba(self, x, batch_size=32, verbose=1)\n",
" | Generates class probability predictions for the input samples\n",
" | batch by batch.\n",
" | \n",
" | # Arguments\n",
" | x: input data, as a Numpy array or list of Numpy arrays\n",
" | (if the model has multiple inputs).\n",
" | batch_size: integer.\n",
" | verbose: verbosity mode, 0 or 1.\n",
" | \n",
" | # Returns\n",
" | A Numpy array of probability predictions.\n",
" | \n",
" | set_weights(self, weights)\n",
" | Sets the weights of the model.\n",
" | The `weights` argument should be a list\n",
" | of Numpy arrays with shapes and types matching\n",
" | the output of `model.get_weights()`.\n",
" | \n",
" | test_on_batch(self, x, y, sample_weight=None, **kwargs)\n",
" | Evaluates the model over a single batch of samples.\n",
" | \n",
" | # Arguments\n",
" | x: input data, as a Numpy array or list of Numpy arrays\n",
" | (if the model has multiple inputs).\n",
" | y: labels, as a Numpy array.\n",
" | sample_weight: sample weights, as a Numpy array.\n",
" | \n",
" | # Returns\n",
" | Scalar test loss (if the model has no metrics)\n",
" | or list of scalars (if the model computes other metrics).\n",
" | The attribute `model.metrics_names` will give you\n",
" | the display labels for the scalar outputs.\n",
" | \n",
" | train_on_batch(self, x, y, class_weight=None, sample_weight=None, **kwargs)\n",
" | Single gradient update over one batch of samples.\n",
" | \n",
" | # Arguments\n",
" | x: input data, as a Numpy array or list of Numpy arrays\n",
" | (if the model has multiple inputs).\n",
" | y: labels, as a Numpy array.\n",
" | class_weight: dictionary mapping classes to a weight value,\n",
" | used for scaling the loss function (during training only).\n",
" | sample_weight: sample weights, as a Numpy array.\n",
" | \n",
" | # Returns\n",
" | Scalar training loss (if the model has no metrics)\n",
" | or list of scalars (if the model computes other metrics).\n",
" | The attribute `model.metrics_names` will give you\n",
" | the display labels for the scalar outputs.\n",
" | \n",
" | ----------------------------------------------------------------------\n",
" | Class methods defined here:\n",
" | \n",
" | from_config(cls, config, layer_cache=None) from __builtin__.type\n",
" | Supports legacy formats\n",
" | \n",
" | ----------------------------------------------------------------------\n",
" | Data descriptors defined here:\n",
" | \n",
" | constraints\n",
" | \n",
" | flattened_layers\n",
" | \n",
" | losses\n",
" | \n",
" | non_trainable_weights\n",
" | \n",
" | regularizers\n",
" | \n",
" | state_updates\n",
" | \n",
" | trainable\n",
" | \n",
" | trainable_weights\n",
" | \n",
" | training_data\n",
" | \n",
" | updates\n",
" | \n",
" | uses_learning_phase\n",
" | \n",
" | validation_data\n",
" | \n",
" | ----------------------------------------------------------------------\n",
" | Methods inherited from keras.engine.topology.Container:\n",
" | \n",
" | compute_mask(self, input, mask)\n",
" | \n",
" | get_output_shape_for(self, input_shape)\n",
" | \n",
" | load_weights(self, filepath, by_name=False)\n",
" | Loads all layer weights from a HDF5 save file.\n",
" | \n",
" | If `by_name` is False (default) weights are loaded\n",
" | based on the network's topology, meaning the architecture\n",
" | should be the same as when the weights were saved.\n",
" | Note that layers that don't have weights are not taken\n",
" | into account in the topological ordering, so adding or\n",
" | removing layers is fine as long as they don't have weights.\n",
" | \n",
" | If `by_name` is True, weights are loaded into layers\n",
" | only if they share the same name. This is useful\n",
" | for fine-tuning or transfer-learning models where\n",
" | some of the layers have changed.\n",
" | \n",
" | load_weights_from_hdf5_group(self, f)\n",
" | Weight loading is based on layer order in a list\n",
" | (matching model.flattened_layers for Sequential models,\n",
" | and model.layers for Model class instances), not\n",
" | on layer names.\n",
" | Layers that have no weights are skipped.\n",
" | \n",
" | load_weights_from_hdf5_group_by_name(self, f)\n",
" | Name-based weight loading\n",
" | (instead of topological weight loading).\n",
" | Layers that have no matching name are skipped.\n",
" | \n",
" | reset_states(self)\n",
" | \n",
" | run_internal_graph(self, inputs, masks=None)\n",
" | Computes output tensors for new inputs.\n",
" | \n",
" | # Note:\n",
" | - Expects `inputs` to be a list (potentially with 1 element).\n",
" | - Can be run on non-Keras tensors.\n",
" | \n",
" | # Arguments\n",
" | inputs: List of tensors\n",
" | masks: List of masks (tensors or None).\n",
" | \n",
" | # Returns\n",
" | Three lists: output_tensors, output_masks, output_shapes\n",
" | \n",
" | save(self, filepath, overwrite=True)\n",
" | Save into a single HDF5 file:\n",
" | - The model architecture, allowing to re-instantiate the model.\n",
" | - The model weights.\n",
" | - The state of the optimizer, allowing to resume training\n",
" | exactly where you left off.\n",
" | \n",
" | This allows you to save the entirety of the state of a model\n",
" | in a single file.\n",
" | \n",
" | Saved models can be reinstantiated via `keras.models.load_model`.\n",
" | The model returned by `load_model`\n",
" | is a compiled model ready to be used (unless the saved model\n",
" | was never compiled in the first place).\n",
" | \n",
" | # Example\n",
" | \n",
" | ```python\n",
" | from keras.models import load_model\n",
" | \n",
" | model.save('my_model.h5') # creates a HDF5 file 'my_model.h5'\n",
" | del model # deletes the existing model\n",
" | \n",
" | # returns a compiled model\n",
" | # identical to the previous one\n",
" | model = load_model('my_model.h5')\n",
" | ```\n",
" | \n",
" | save_weights(self, filepath, overwrite=True)\n",
" | Dumps all layer weights to a HDF5 file.\n",
" | \n",
" | The weight file has:\n",
" | - `layer_names` (attribute), a list of strings\n",
" | (ordered names of model layers).\n",
" | - For every layer, a `group` named `layer.name`\n",
" | - For every such layer group, a group attribute `weight_names`,\n",
" | a list of strings\n",
" | (ordered names of weights tensor of the layer).\n",
" | - For every weight in the layer, a dataset\n",
" | storing the weight value, named after the weight tensor.\n",
" | \n",
" | save_weights_to_hdf5_group(self, f)\n",
" | \n",
" | summary(self, line_length=100, positions=[0.33, 0.55, 0.67, 1.0])\n",
" | \n",
" | to_json(self, **kwargs)\n",
" | Returns a JSON string containing the network configuration.\n",
" | \n",
" | To load a network from a JSON save file, use\n",
" | `keras.models.model_from_json(json_string, custom_objects={})`.\n",
" | \n",
" | to_yaml(self, **kwargs)\n",
" | Returns a yaml string containing the network configuration.\n",
" | \n",
" | To load a network from a yaml save file, use\n",
" | `keras.models.model_from_yaml(yaml_string, custom_objects={})`.\n",
" | \n",
" | `custom_objects` should be a dictionary mapping\n",
" | the names of custom losses / layers / etc to the corresponding\n",
" | functions / classes.\n",
" | \n",
" | ----------------------------------------------------------------------\n",
" | Data descriptors inherited from keras.engine.topology.Container:\n",
" | \n",
" | input_spec\n",
" | \n",
" | stateful\n",
" | \n",
" | ----------------------------------------------------------------------\n",
" | Methods inherited from keras.engine.topology.Layer:\n",
" | \n",
" | __call__(self, x, mask=None)\n",
" | Wrapper around self.call(), for handling\n",
" | internal Keras references.\n",
" | \n",
" | If a Keras tensor is passed:\n",
" | - We call self.add_inbound_node().\n",
" | - If necessary, we `build` the layer to match\n",
" | the _keras_shape of the input(s).\n",
" | - We update the _keras_shape of every input tensor with\n",
" | its new shape (obtained via self.get_output_shape_for).\n",
" | This is done as part of add_inbound_node().\n",
" | - We update the _keras_history of the output tensor(s)\n",
" | with the current layer.\n",
" | This is done as part of add_inbound_node().\n",
" | \n",
" | # Arguments\n",
" | x: Can be a tensor or list/tuple of tensors.\n",
" | mask: Tensor or list/tuple of tensors.\n",
" | \n",
" | add_inbound_node(self, inbound_layers, node_indices=None, tensor_indices=None)\n",
" | # Arguments\n",
" | inbound_layers: Can be a layer instance\n",
" | or a list/tuple of layer instances.\n",
" | node_indices: Integer (or list of integers).\n",
" | The input layer might have a number of\n",
" | parallel output streams;\n",
" | this is the index of the stream (in the input layer)\n",
" | where to connect the current layer.\n",
" | tensor_indices: Integer or list of integers.\n",
" | The output of the inbound node might be a list/tuple\n",
" | of tensor, and we might only be interested in\n",
" | one specific entry.\n",
" | This index allows you to specify the index of\n",
" | the entry in the output list\n",
" | (if applicable). \"None\" means that we take all outputs\n",
" | (as a list).\n",
" | \n",
" | add_loss(self, losses, inputs=None)\n",
" | \n",
" | add_update(self, updates, inputs=None)\n",
" | \n",
" | add_weight(self, shape, initializer, name=None, trainable=True, regularizer=None, constraint=None)\n",
" | Adds a weight variable to the layer.\n",
" | \n",
" | # Arguments\n",
" | shape: The shape tuple of the weight.\n",
" | initializer: An Initializer instance (callable).\n",
" | trainable: A boolean, whether the weight should\n",
" | be trained via backprop or not (assuming\n",
" | that the layer itself is also trainable).\n",
" | regularizer: An optional Regularizer instance.\n",
" | \n",
" | assert_input_compatibility(self, input)\n",
" | This checks that the tensor(s) `input`\n",
" | verify the input assumptions of the layer\n",
" | (if any). If not, exceptions are raised.\n",
" | \n",
" | count_params(self)\n",
" | Returns the total number of floats (or ints)\n",
" | composing the weights of the layer.\n",
" | \n",
" | create_input_layer(self, batch_input_shape, input_dtype=None, name=None)\n",
" | \n",
" | get_input_at(self, node_index)\n",
" | Retrieves the input tensor(s) of a layer at a given node.\n",
" | \n",
" | get_input_mask_at(self, node_index)\n",
" | Retrieves the input mask tensor(s) of a layer at a given node.\n",
" | \n",
" | get_input_shape_at(self, node_index)\n",
" | Retrieves the input shape(s) of a layer at a given node.\n",
" | \n",
" | get_output_at(self, node_index)\n",
" | Retrieves the output tensor(s) of a layer at a given node.\n",
" | \n",
" | get_output_mask_at(self, node_index)\n",
" | Retrieves the output mask tensor(s) of a layer at a given node.\n",
" | \n",
" | get_output_shape_at(self, node_index)\n",
" | Retrieves the output shape(s) of a layer at a given node.\n",
" | \n",
" | ----------------------------------------------------------------------\n",
" | Data descriptors inherited from keras.engine.topology.Layer:\n",
" | \n",
" | __dict__\n",
" | dictionary for instance variables (if defined)\n",
" | \n",
" | __weakref__\n",
" | list of weak references to the object (if defined)\n",
" | \n",
" | input\n",
" | Retrieves the input tensor(s) of a layer (only applicable if\n",
" | the layer has exactly one inbound node, i.e. if it is connected\n",
" | to one incoming layer).\n",
" | \n",
" | input_mask\n",
" | Retrieves the input mask tensor(s) of a layer (only applicable if\n",
" | the layer has exactly one inbound node, i.e. if it is connected\n",
" | to one incoming layer).\n",
" | \n",
" | input_shape\n",
" | Retrieves the input shape tuple(s) of a layer. Only applicable\n",
" | if the layer has one inbound node,\n",
" | or if all inbound nodes have the same input shape.\n",
" | \n",
" | output\n",
" | Retrieves the output tensor(s) of a layer (only applicable if\n",
" | the layer has exactly one inbound node, i.e. if it is connected\n",
" | to one incoming layer).\n",
" | \n",
" | output_mask\n",
" | Retrieves the output mask tensor(s) of a layer (only applicable if\n",
" | the layer has exactly one inbound node, i.e. if it is connected\n",
" | to one incoming layer).\n",
" | \n",
" | output_shape\n",
" | Retrieves the output shape tuple(s) of a layer. Only applicable\n",
" | if the layer has one inbound node,\n",
" | or if all inbound nodes have the same output shape.\n",
" | \n",
" | weights\n",
"\n"
]
}
],
"source": [
"help(model2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We next compile the model, which means to configure its learning process."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"model2.compile(loss='categorical_crossentropy', optimizer='sgd')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"One could also use 'binary_crossentropy' instead. "
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"model2.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"One could also use custom metrics as follows"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# For custom metrics\n",
"#import keras.backend as K\n",
"\n",
"#def mean_pred(y_true, y_pred):\n",
"# return K.mean(y_pred)\n",
"\n",
"#model.compile(optimizer='rmsprop',\n",
"# loss='binary_crossentropy',\n",
"# metrics=['accuracy', mean_pred])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We could also more specifically configure the optimizer as follows"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"from keras.optimizers import SGD #More on Optimizers here -- https://keras.io/optimizers/"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"model2.compile(loss='binary_crossentropy', optimizer=SGD(lr=0.01, momentum=0.9, nesterov=True))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Some available optimizers are 'rmsprop', 'adagrad' (adaptive subgradient), 'Adam', 'Nadam' (Nesterov Adam), 'Adamax' etc."
]
},
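{
"cell_type": "markdown",
"metadata": {},
"source": [
"These optimizers can likewise be passed as configured objects. A minimal sketch using Adam with an explicit learning rate:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Passing a configured optimizer object instead of a string name\n",
"from keras.optimizers import Adam\n",
"\n",
"model2.compile(loss='binary_crossentropy', optimizer=Adam(lr=0.001))"
]
},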
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Coming back to the model, the configuration of the model can be looked as a Python list using the .get_config()"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'class_name': 'Dense',\n",
" 'config': {'W_constraint': None,\n",
" 'W_regularizer': None,\n",
" 'activation': 'linear',\n",
" 'activity_regularizer': None,\n",
" 'b_constraint': None,\n",
" 'b_regularizer': None,\n",
" 'batch_input_shape': (None, 784),\n",
" 'bias': True,\n",
" 'init': 'glorot_uniform',\n",
" 'input_dim': 784,\n",
" 'input_dtype': 'float32',\n",
" 'name': 'dense_3',\n",
" 'output_dim': 32,\n",
" 'trainable': True}},\n",
" {'class_name': 'Activation',\n",
" 'config': {'activation': 'relu', 'name': 'activation_3', 'trainable': True}},\n",
" {'class_name': 'Dense',\n",
" 'config': {'W_constraint': None,\n",
" 'W_regularizer': None,\n",
" 'activation': 'linear',\n",
" 'activity_regularizer': None,\n",
" 'b_constraint': None,\n",
" 'b_regularizer': None,\n",
" 'bias': True,\n",
" 'init': 'glorot_uniform',\n",
" 'input_dim': 32,\n",
" 'name': 'dense_4',\n",
" 'output_dim': 10,\n",
" 'trainable': True}},\n",
" {'class_name': 'Activation',\n",
" 'config': {'activation': 'softmax',\n",
" 'name': 'activation_4',\n",
" 'trainable': True}}]"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model2.get_config()"
]
},
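{
"cell_type": "markdown",
"metadata": {},
"source": [
"As mentioned in the help text above, the architecture can also be serialized with .to_json() and rebuilt with keras.models.model_from_json. A minimal sketch (this saves only the architecture, not the weights):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Serialize the architecture to a JSON string and rebuild a fresh model from it\n",
"from keras.models import model_from_json\n",
"\n",
"json_string = model2.to_json()\n",
"model2_clone = model_from_json(json_string)"
]
},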
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So, we saw how to configure the model itself, set and configure a specific optimizer for our model. Model training can be done as follows"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch 1/10\n",
"1000/1000 [==============================] - 0s - loss: 1.2369 \n",
"Epoch 2/10\n",
"1000/1000 [==============================] - 0s - loss: 1.2163 \n",
"Epoch 3/10\n",
"1000/1000 [==============================] - 0s - loss: 1.2145 \n",
"Epoch 4/10\n",
"1000/1000 [==============================] - 0s - loss: 1.2142 \n",
"Epoch 5/10\n",
"1000/1000 [==============================] - 0s - loss: 1.2137 \n",
"Epoch 6/10\n",
"1000/1000 [==============================] - 0s - loss: 1.2135 \n",
"Epoch 7/10\n",
"1000/1000 [==============================] - 0s - loss: 1.2131 \n",
"Epoch 8/10\n",
"1000/1000 [==============================] - 0s - loss: 1.2126 \n",
"Epoch 9/10\n",
"1000/1000 [==============================] - 0s - loss: 1.2124 \n",
"Epoch 10/10\n",
"1000/1000 [==============================] - 0s - loss: 1.2119 \n"
]
},
{
"data": {
"text/plain": [
"<keras.callbacks.History at 0x7f6df9551a50>"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import numpy as np\n",
"data = np.random.random((1000, 784))\n",
"labels = np.random.randint(2, size=(1000, 10))\n",
"\n",
"model2.fit(data, labels, nb_epoch=10, batch_size=32)"
]
},
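{
"cell_type": "markdown",
"metadata": {},
"source": [
"fit returns a History object; its history attribute records the loss (and metric) values at each epoch. A minimal sketch, re-fitting briefly on the same dummy data:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Capture the History object returned by fit and inspect the recorded per-epoch losses\n",
"history = model2.fit(data, labels, nb_epoch=2, batch_size=32, verbose=0)\n",
"print(history.history['loss'])"
]
},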
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<bound method Sequential.summary of <keras.models.Sequential object at 0x7f6dfd950550>>"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model2.summary"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A fitted model can now be evaluated on test data using"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# model2.evaluate(x_test, y_test, batch_size=32)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's see a complete example in one go (albeit a dummy example):"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch 1/20\n",
"1000/1000 [==============================] - 0s - loss: 0.7066 - acc: 0.5090 \n",
"Epoch 2/20\n",
"1000/1000 [==============================] - 0s - loss: 0.7007 - acc: 0.5150 \n",
"Epoch 3/20\n",
"1000/1000 [==============================] - 0s - loss: 0.6992 - acc: 0.5230 \n",
"Epoch 4/20\n",
"1000/1000 [==============================] - 0s - loss: 0.6892 - acc: 0.5490 \n",
"Epoch 5/20\n",
"1000/1000 [==============================] - 0s - loss: 0.7040 - acc: 0.5250 \n",
"Epoch 6/20\n",
"1000/1000 [==============================] - 0s - loss: 0.6927 - acc: 0.5550 \n",
"Epoch 7/20\n",
"1000/1000 [==============================] - 0s - loss: 0.6975 - acc: 0.5050 \n",
"Epoch 8/20\n",
"1000/1000 [==============================] - 0s - loss: 0.6959 - acc: 0.5100 \n",
"Epoch 9/20\n",
"1000/1000 [==============================] - 0s - loss: 0.6948 - acc: 0.5090 \n",
"Epoch 10/20\n",
"1000/1000 [==============================] - 0s - loss: 0.6994 - acc: 0.5000 \n",
"Epoch 11/20\n",
"1000/1000 [==============================] - 0s - loss: 0.6887 - acc: 0.5340 \n",
"Epoch 12/20\n",
"1000/1000 [==============================] - 0s - loss: 0.6932 - acc: 0.5240 \n",
"Epoch 13/20\n",
"1000/1000 [==============================] - 0s - loss: 0.6946 - acc: 0.5160 \n",
"Epoch 14/20\n",
"1000/1000 [==============================] - 0s - loss: 0.6902 - acc: 0.5280 \n",
"Epoch 15/20\n",
"1000/1000 [==============================] - 0s - loss: 0.6925 - acc: 0.5280 \n",
"Epoch 16/20\n",
"1000/1000 [==============================] - 0s - loss: 0.6865 - acc: 0.5370 \n",
"Epoch 17/20\n",
"1000/1000 [==============================] - 0s - loss: 0.6884 - acc: 0.5480 \n",
"Epoch 18/20\n",
"1000/1000 [==============================] - 0s - loss: 0.6905 - acc: 0.5340 \n",
"Epoch 19/20\n",
"1000/1000 [==============================] - 0s - loss: 0.6885 - acc: 0.5500 \n",
"Epoch 20/20\n",
"1000/1000 [==============================] - 0s - loss: 0.6829 - acc: 0.5580 \n",
"100/100 [==============================] - 0s\n"
]
}
],
"source": [
"# Multilayer Perceptron -- Binary classification example\n",
"import numpy as np\n",
"from keras.models import Sequential\n",
"from keras.layers import Dense, Dropout\n",
"# Info. on layers in Keras -- https://keras.io/layers/core/\n",
"\n",
"# Generate dummy data\n",
"x_train = np.random.random((1000, 20)) #1000 samples, 20 feature dimensions\n",
"y_train = np.random.randint(2, size=(1000, 1))\n",
"x_test = np.random.random((100, 20))\n",
"y_test = np.random.randint(2, size=(100, 1))\n",
"\n",
"model = Sequential()\n",
"model.add(Dense(64, input_dim=20, activation='relu')) \n",
"# first layer, thus output and input dimensions to be supplied\n",
"# if activation is not specified, defaults to linear a(x) = x\n",
"\n",
"model.add(Dropout(0.5)) # dropout layer with rate 0.5\n",
"model.add(Dense(64, activation='relu')) \n",
"# other possibilities of activations -- 'ELU', 'SELU', 'tanh', 'linear'\n",
"model.add(Dropout(0.5))\n",
"model.add(Dense(1, activation='sigmoid'))\n",
"# NN with the required layered architecure is ready at this point\n",
"\n",
"# Now, configure few more things\n",
"model.compile(loss='binary_crossentropy',\n",
" optimizer='rmsprop',\n",
" metrics=['accuracy'])\n",
"\n",
"# Model ready, data ready, we can fit now\n",
"\n",
"model.fit(x_train, y_train,\n",
" nb_epoch=20,\n",
" batch_size=128)\n",
"\n",
"score = model.evaluate(x_test, y_test, batch_size=128)"
]
},
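{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since the model was compiled with metrics=['accuracy'], evaluate returns a list of scalars; model.metrics_names gives the matching labels:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# score holds [loss, accuracy], in the order given by metrics_names\n",
"print(model.metrics_names)\n",
"print(score)"
]
},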
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"____________________________________________________________________________________________________\n",
"Layer (type) Output Shape Param # Connected to \n",
"====================================================================================================\n",
"dense_5 (Dense) (None, 64) 1344 dense_input_3[0][0] \n",
"____________________________________________________________________________________________________\n",
"dropout_1 (Dropout) (None, 64) 0 dense_5[0][0] \n",
"____________________________________________________________________________________________________\n",
"dense_6 (Dense) (None, 64) 4160 dropout_1[0][0] \n",
"____________________________________________________________________________________________________\n",
"dropout_2 (Dropout) (None, 64) 0 dense_6[0][0] \n",
"____________________________________________________________________________________________________\n",
"dense_7 (Dense) (None, 1) 65 dropout_2[0][0] \n",
"====================================================================================================\n",
"Total params: 5,569\n",
"Trainable params: 5,569\n",
"Non-trainable params: 0\n",
"____________________________________________________________________________________________________\n"
]
}
],
"source": [
"model.summary()"
]
},
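{
"cell_type": "markdown",
"metadata": {},
"source": [
"After fitting, predictions can be generated with .predict() (per-sample sigmoid probabilities here) or .predict_classes() (thresholded class labels). A quick sketch on the dummy test data:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Predicted probabilities and class labels for the first 5 dummy test samples\n",
"probs = model.predict(x_test[:5])\n",
"classes = model.predict_classes(x_test[:5], verbose=0)\n",
"print(probs)\n",
"print(classes)"
]
},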
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Another important feature or set of functions is the Callbacks https://keras.io/callbacks/"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Some key callback functions\n",
"\n",
"-- keras.callbacks.LearningRateScheduler # \n",
"\n",
"-- keras.callbacks.EarlyStopping # When a monitored quantity has stopped imporoving\n",
"\n",
"-- keras.callbacks.ModelCheckpoint # Save model after every epoch\n",
"\n",
"Interesting one -- visualization tool with TensorFlow\n",
"\n",
"-- keras.callbacks.TensorBoard \n",
"\n",
"While training, a log is created about events \n",
"which can be later viewed using TensorBoard\n"
]
},
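{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of passing callbacks to fit, reusing the model and dummy data from the example above (the file paths are hypothetical placeholders):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Hypothetical callback setup: stop early, checkpoint the best weights, log for TensorBoard\n",
"from keras.callbacks import EarlyStopping, ModelCheckpoint, TensorBoard\n",
"\n",
"callbacks = [\n",
"    EarlyStopping(monitor='loss', patience=2),\n",
"    ModelCheckpoint(filepath='weights.h5', monitor='loss', save_best_only=True),\n",
"    TensorBoard(log_dir='./logs')  # afterwards: tensorboard --logdir=./logs\n",
"]\n",
"\n",
"model.fit(x_train, y_train,\n",
"          nb_epoch=20,\n",
"          batch_size=128,\n",
"          callbacks=callbacks)"
]
},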
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can already look at example projects (comp.bio.):\n",
"1. DeepCpG : Predicting single cell DNA methylation states\n",
"https://github.com/PMBio/deepcpg/blob/master/scripts/dcpg_train.py\n",
"They also use and recommend TensorBoard for visualizations\n",
"\n",
"2. CODA : A convolutional denoising algorithm for ChIP-Seq data\n",
"https://github.com/kundajelab/coda/blob/master/models.py\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That's it! \n",
"\n",
"Thanks.\n",
"\n",
"-Sarvesh"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 2
}