{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "269afbea-bb5a-4464-9581-34a1bfabb3c1",
   "metadata": {},
   "source": [
    "# CNN starter kit\n",
    "\n",
    "Let us tackle this problem using a CNN model.\n",
    "\n",
     "First, note that in my setup all the images are in the `unzip` directory and the data files in the `data` directory. The parameters below assume that layout but may need to be customised to your environment. My local environment looks like this:\n",
    "- unzip\n",
    "  - planet-dec17\n",
    "    - ... lots of `png` images\n",
    "  - planet-dec18\n",
    "    - ... lots of `png` images\n",
     "  - planet-jun17\n",
    "    - ... lots of `png` images\n",
    "  - planet-jun18\n",
    "    - ... lots of `png` images\n",
     "  - sentinel\n",
    "    - ... lots of `tif` images\n",
    "- data\n",
    "  - auxilary_data_unique.csv\n",
    "  - extra_train.csv\n",
    "  - test.csv\n",
    "  - train-unique.csv\n",
    "\n",
     "Placing all the images into these directories makes them easy to access without any gymnastics around the many different sets that were published over time.\n",
    "\n",
    "## License\n",
    "\n",
    "MG Ferreira<br/>\n",
     "&copy; Ferra Solutions (Pty) Ltd https://www.ferrasolutions.com<br/>\n",
    "Note that this is *copyrighted* material.\n",
    "\n",
     "You are welcome to use it to compete, but you cannot publish this notebook elsewhere as if you created it, nor use it in e.g. a lecture, a paper or some other setting without written permission from me.\n",
    "\n",
    "## Disclaimer\n",
    "\n",
    "This is a fairly elaborate example meant to get you going.\n",
    "\n",
     "I created this because some of the other contestants asked me to, and to help newcomers get going really quickly.\n",
    "\n",
    "I wish I had this at the outset but, you know what, I am also glad I did not because it forced me to learn all these things.\n",
    "\n",
     "Being an example, it is not meant for production use, nor are there *any* guarantees of usability or correctness included with it.\n",
    "\n",
    "## Warning\n",
    "\n",
    "This will estimate a CNN model and then write the outputs in a submission ready file to\n",
    "\n",
    "`cnn.csv`\n",
    "\n",
     "without asking for permission. If you already have such a file, you will lose it if you run this.\n",
    "\n",
    "This is also covered by the disclaimer - if this notebook formats your hard disk you have been warned. But don't worry, it has good intentions, although things may heat up a bit during the model estimation process later :-)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cac9d2c7-47d2-4e0a-aae3-bdb81b5de9fe",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Data paths - customise for your environment\n",
    "data_path = 'data'\n",
    "img_path  = 'unzip'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5e6afdb1-69ac-47f1-b864-1d96e6a1d87b",
   "metadata": {},
   "source": [
    "# Load libraries\n",
    "\n",
    "Now let us load all the required libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0aed548b-9dd7-479f-a2f7-cff98c8f73b2",
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "import os\n",
    "import random\n",
    "\n",
    "import skimage.io\n",
    "\n",
    "import pandas     as pd\n",
    "import numpy      as np\n",
    "import tensorflow as tf\n",
    "\n",
    "from tensorflow.keras        import layers\n",
    "from tensorflow.keras.models import Model"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "365b09dc-232a-4236-9c32-0a0ebb932a76",
   "metadata": {},
   "source": [
    "# Utilities\n",
    "\n",
     "Next let us define a few utility functions, including functions to load the RGB images. Note especially the `load_rgb_stack` function, which loads the four RGB images and returns them as a single stack with 12 channels (RGB x 4 photos)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f919522d-b0cf-48d2-88d7-b6a92ddded9e",
   "metadata": {},
   "outputs": [],
   "source": [
     "# Load the training and auxiliary data\n",
    "def load_data ():\n",
    "\n",
    "    train = pd.read_csv ( f'{ data_path }/train-unique.csv' )\n",
     "    aux   = pd.read_csv ( f'{ data_path }/auxilary_data_unique.csv' )\n",
    "    extra = pd.read_csv ( f'{ data_path }/extra_train.csv' )\n",
    "\n",
    "    return pd.concat ( [ train, aux, extra ] )\n",
    "\n",
    "# Load the test data\n",
    "def load_test ():\n",
    "\n",
    "    return pd.read_csv ( f'{ data_path }/test.csv' )\n",
    "\n",
    "# Supplied - load RGB image\n",
    "def load_RGB_images ( ID ):\n",
    "\n",
     "    # e.g. id_0b242e06 -> 0b242e06\n",
    "    name      = ID.split('_')[1]\n",
    "\n",
    "    img_jun17 = skimage.io.imread ( f'{img_path}/planet-jun17/{name}.png' )\n",
    "    img_dec17 = skimage.io.imread ( f'{img_path}/planet-dec17/{name}.png' )\n",
    "    img_jun18 = skimage.io.imread ( f'{img_path}/planet-jun18/{name}.png' )\n",
    "    img_dec18 = skimage.io.imread ( f'{img_path}/planet-dec18/{name}.png' )\n",
    "\n",
    "    return img_jun17, img_dec17, img_jun18, img_dec18\n",
    "\n",
     "# Load the RGB images and return them as a stack\n",
    "def load_rgb_stack ( ID ):\n",
    "\n",
    "    rgbs = load_RGB_images ( ID )\n",
    "    return np.concatenate ( [ rgbs [ 0 ], rgbs [ 1 ], rgbs [ 2 ], rgbs [ 3 ] ], axis = 2 )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b6d485dd-361b-42b4-968f-7a8fba8653c8",
   "metadata": {},
   "source": [
    "# Spectral images\n",
    "\n",
     "The spectral images are next and are treated similarly. Note that there are 192 channels: 16 channels for each of 12 photos. Channels 14, 15 and 16 of each photo are not important and are discarded below. Since the channels in the code are zero-based, the removed channels become 13, 14 and 15.\n",
    "\n",
    "## Interesting channels\n",
    "\n",
     "Some preliminary investigation revealed that channels 3, 13, 49, 50, 66, 117, 119, 121, 122, 133, 136 and 139 are *interesting*: they show a significant difference between the pixels inside the plot and those outside. For that reason the code below defines this array of channels, so they can easily be extracted later."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d23a7f60-42c1-4404-96ab-8dcba62e6dc8",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Supplied - load spectral image\n",
    "def load_Spectral_image ( ID ):\n",
    "\n",
     "    # e.g. id_0b242e06 -> 0b242e06\n",
    "    name         = ID.split ( '_' ) [ 1 ]\n",
    "    img_sentinel = skimage.io.imread ( f'{img_path}/sentinel/{name}.tif' )\n",
    "\n",
    "    return img_sentinel\n",
    "\n",
    "# Channels we want to remove from spectral image\n",
    "# We are not interested in channel 13, 14 or 15\n",
    "sr13 = [ i for i in range ( 13, 192, 16 ) ]\n",
    "sr14 = [ i for i in range ( 14, 192, 16 ) ]\n",
    "sr15 = [ i for i in range ( 15, 192, 16 ) ]\n",
    "\n",
     "# Concatenate into a new list so that sr13 itself is not modified\n",
     "sral = sr13 + sr14 + sr15\n",
    "\n",
    "# Spectral channels we find interesting\n",
    "sintch = [ 3, 13, 49, 50, 66, 117, 119, 121, 122, 133, 136, 139 ] \n",
    "\n",
    "# Read the spectral image and remove the unused channels from it\n",
    "def load_spectral_stack ( ID ):\n",
    "\n",
    "    return np.delete ( load_Spectral_image ( ID ), sral, axis = 2 )"
   ]
  },
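  {
   "cell_type": "markdown",
   "id": "1a2b3c4d-0001-4000-8000-000000000001",
   "metadata": {},
   "source": [
    "As a quick sanity check on the channel bookkeeping: 12 photos x 16 channels gives 192 channels, removing 3 channels per photo drops 3 x 12 = 36 of them, and 12 x 13 = 156 remain. The cell below is a standalone sketch of this arithmetic - it rebuilds the removal list locally rather than reusing the variables above, and the names in it are purely illustrative."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1a2b3c4d-0002-4000-8000-000000000002",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Standalone sketch - verify the spectral channel bookkeeping\n",
    "check = [ i for i in range ( 13, 192, 16 ) ] + [ i for i in range ( 14, 192, 16 ) ] + [ i for i in range ( 15, 192, 16 ) ]\n",
    "\n",
    "# 3 channels removed from each of the 12 photos\n",
    "assert len ( check ) == 36\n",
    "\n",
    "# Deleting them from a dummy 192-channel stack leaves 12 x 13 = 156 channels\n",
    "assert np.delete ( np.zeros ( [ 40, 40, 192 ] ), check, axis = 2 ).shape == ( 40, 40, 156 )"
   ]
  },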
  {
   "cell_type": "markdown",
   "id": "5b561bc2-7d76-48d7-b498-ffed69c4b020",
   "metadata": {},
   "source": [
    "# Review\n",
    "\n",
    "Let us build a CNN with three legs.\n",
     " - One leg will handle the RGB satellite images, converted to black and white.\n",
    " - One leg will handle the spectral images, specifically the channels we find interesting.\n",
    " - Finally a leg to handle meta data such as the year and the size of the plot.\n",
    "\n",
     "Later we will combine all three of these legs into a dense network and then use that to predict the target location.\n",
    "\n",
     "## Preprocessing\n",
    "\n",
    "Based on the review, we need to do the following.\n",
    " - Convert the 4 RGB images to B&W and collate them into 4 channels\n",
     "   - When we convert the images to grayscale they will also be scaled to between 0 and 1\n",
    " - Extract the spectral channels we are interested in\n",
     "   - We need to rescale them; let us calculate and use z-scores\n",
    " - Scale and relocate the metadata so that it is between 0 and 1\n",
    "   - Note that the average yield is 0.33 (you can easily verify this using a spreadsheet)\n",
     "   - We will also use the yield itself, substituting this average value when the yield is missing\n",
    "\n",
    "We also need to resize the images so that they are all the same size. Let us use 80x80 for the RGB and 40x40 for the spectral images.\n",
    "\n",
     "Then we need to rescale the target as well. We will use a `tanh` activation, which lies between -1 and +1. Since the target values are between -2 and 2, we also need to preprocess the targets.\n",
    "\n",
    "Importantly, we also need to create a validation sample so that we can test how well our model works.\n",
    "\n",
    "This leads to the utilities below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5af9b50a-376e-46ee-8941-2bcc7ba99481",
   "metadata": {},
   "outputs": [],
   "source": [
    "from skimage.color import rgb2gray\n",
    "from skimage.transform import resize\n",
    "\n",
    "# Preprocess the RGB\n",
    "def preprocess_rgb ( row ):\n",
    "\n",
    "    # Load the RGB stack\n",
    "    rgb = load_rgb_stack ( row.ID )\n",
    "\n",
    "    # Convert to B&W\n",
    "    # and rescale to 0 .. 1\n",
    "    bw1 = rgb2gray ( rgb [ :, :, 0:3 ] )\n",
    "    bw2 = rgb2gray ( rgb [ :, :, 3:6 ] )\n",
    "    bw3 = rgb2gray ( rgb [ :, :, 6:9 ] )\n",
    "    bw4 = rgb2gray ( rgb [ :, :, 9:12 ] )\n",
    "\n",
    "    # Stack\n",
    "    bw_stack = np.dstack ( [ bw1, bw2, bw3, bw4 ] )\n",
    "\n",
    "    # Rescale and return\n",
    "    return resize ( bw_stack, [ 80, 80 ] ).flatten ()\n",
    "\n",
    "# Preprocess the spectral\n",
    "def preprocess_spc ( row ):\n",
    "\n",
     "    # Load the spectral stack - here channels 13, 14 and 15 have already been removed\n",
     "    # Cast to float so the in-place z-score below does not truncate integer pixel values\n",
     "    spc = load_spectral_stack ( row.ID ).astype ( float )\n",
    "\n",
    "    # Retain only the ones we are interested in\n",
    "    spc = spc [ :, :, sintch ]\n",
    "\n",
    "    # Convert to z-score\n",
    "    for ch in range ( spc.shape [ 2 ] ):\n",
    "        m = spc [ :, :, ch ].mean ()\n",
    "        s = spc [ :, :, ch ].std ()\n",
    "        spc [ :, :, ch ] = ( spc [ :, :, ch ] - m ) / s\n",
    "    \n",
    "    # Resize and return\n",
    "    return resize ( spc, [ 40, 40 ] ).flatten ()\n",
    "\n",
    "# Preprocess the metadata\n",
    "def preprocess_meta ( row ):\n",
    "\n",
    "    year = int ( row.Year )\n",
    "    size = float ( row.PlotSize_acres )\n",
    "    yld  = float ( row.Yield )\n",
    "\n",
    "    # Replace the yield with the average if not available\n",
    "    if np.isnan ( yld ):\n",
    "        yld = 0.33\n",
    "\n",
    "    # Scale the year\n",
    "    year = ( year - 2015 ) / 5\n",
    "\n",
    "    # Done\n",
    "    return np.array ( [ year, size, yld ] )\n",
    "\n",
    "# Preprocess the target\n",
    "def preprocess_target ( row ):\n",
    "\n",
    "    return np.array ( [ float ( row.x ) / 2, float ( row.y ) / 2 ] )\n",
    "\n",
    "# Preprocess a whole row\n",
    "def preprocess_row ( row ):\n",
    "\n",
    "    return np.concatenate ( [ preprocess_rgb ( row ), preprocess_spc ( row ), preprocess_meta ( row ), preprocess_target ( row ) ] )\n",
    "\n",
    "# Later we will also preprocess a row from the test set where there is no knowledge of the target\n",
    "def preprocess_test ( row ):\n",
    "\n",
    "    return np.concatenate ( [ preprocess_rgb ( row ), preprocess_spc ( row ), preprocess_meta ( row ) ] )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9bfbf7c5-c3b0-4c72-8abe-975634e6cff2",
   "metadata": {},
   "source": [
    "# Random seed\n",
    "\n",
    "Since we must be able to reproduce the results, we also need a utility to set the random seed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "af347420-1588-411a-8106-60c8056d7860",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Set the random seed\n",
    "def set_random_seed ( seed ):\n",
    "\n",
    "    os.environ [ 'PYTHONHASHSEED' ] = str ( seed )\n",
    "    np.random.seed ( seed )\n",
    "\n",
     "    # Compare major versions - a plain string comparison would also match e.g. '1.15'\n",
     "    if int ( tf.__version__.split ( '.' ) [ 0 ] ) >= 2:\n",
    "        tf.random.set_seed ( seed )\n",
    "    else:\n",
    "        tf.set_random_seed ( seed )\n",
    "\n",
    "    random.seed ( seed )\n",
    "\n",
    "    return"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "995eee8f-1f8b-4b32-943d-548bf4ee9e4e",
   "metadata": {},
   "source": [
    "# Train and validation samples\n",
    "\n",
    "Now we are ready and we can extract the data.\n",
    "\n",
    "Each row in our dataset will have the following structure.\n",
    " - `80 x 80 x 4` for the B&W images\n",
    " - `40 x 40 x len ( sintch )` for the interesting spectral images\n",
    " - `len ( meta )` for the metadata\n",
    " - `len ( target ) = 2` for the target\n",
    "   - this means that the target `x` sits at `[ :, -3 ]` and the target `y` sits at `[ :, -2 ]`\n",
    "   - this is important for later on when we slice up the store to train our network\n",
    " - 1 additional column\n",
    "   - this column will be 1 if the row is in the training and 0 if in the validation sample\n",
    "   - we will mark this randomly so that roughly 20% of the data ends up in the validation sample\n",
    "   - a lot of slicing will be performed on this last column, which, importantly, now finds itself at `[ :, -1 ]`\n",
    "\n",
     "We also time this section as it will take a while to complete."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7ef0b7ae-6047-41e4-8402-aeaa7167cb5e",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "\n",
    "# Some counts\n",
    "n_rgb    = 80 * 80 * 4\n",
    "n_spc    = 40 * 40 * len ( sintch )\n",
    "n_meta   = 3\n",
    "n_target = 2\n",
    "n_valid  = 1\n",
    "\n",
    "n_inputs = n_rgb + n_spc + n_meta\n",
    "n_all    = n_inputs + n_target + n_valid\n",
    "\n",
    "# Set the random seed\n",
    "set_random_seed ( 20210618 )\n",
    "\n",
    "# Load the training data\n",
    "data  = load_data ()\n",
    "\n",
    "# Prepare an array where we will store our data\n",
    "store = np.zeros ( [ data.shape [ 0 ], n_all ], dtype = float )\n",
    "\n",
    "# Now load the data into this\n",
    "i = 0\n",
    "for irow, row in data.iterrows ():\n",
    "\n",
    "    # Load the preprocessed data\n",
    "    store [ i, :n_all - 1 ] = preprocess_row ( row )\n",
    "\n",
     "    # Mark as training with roughly 80% probability - the rest form the validation sample\n",
    "    if random.randint ( 0, 100 ) > 20:\n",
    "\n",
    "        store [ i, -1 ] = 1\n",
    "\n",
    "    # Move to next row\n",
    "    i = i + 1\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2770f517-616d-44d3-8392-fee500898846",
   "metadata": {},
   "source": [
    "# Model\n",
    "\n",
    "It is time to create our model.\n",
    "\n",
     "It has an input layer that contains the full row. This is then cropped into the three parts (RGB, spectral and metadata). Each of these is then modelled separately.\n",
    "\n",
    "The RGB is pooled until it is 20x20, as is the spectral. These two 20x20 images are then combined into a single, multichannel image which is processed further.\n",
    "\n",
     "Finally the image and the metadata are concatenated and fed into a dense network which has two `tanh` outputs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f04c26c4-6c44-4425-b5fd-5000829940bd",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Some parameters for what follows\n",
    "padding = \"same\"\n",
    "\n",
    "# Utility to create and return a CNN 2d convolution layer\n",
    "#   features     - int giving number of features\n",
    "#   kernel_size  - int giving size of kernel\n",
    "#   activation   - string giving name of activation e.g. tanh or relu\n",
    "#   batch_normal - boolean True to add a batch normalization layer\n",
    "#   max_pool     - int 0 for no or a number to add a max pooling layer\n",
    "#   avg_pool     - int 0 for no or a number to add an average pooling layer\n",
    "#   input_layer  - the input layer\n",
    "# Return the output layer\n",
    "def make_c2d_layer ( features, kernel_size, batch_normal, activation, max_pool, avg_pool, input_layer ):\n",
    "\n",
    "    next_layer = layers.Conv2D ( features, kernel_size, activation = activation, padding = padding ) ( input_layer )\n",
    "\n",
    "    if batch_normal:\n",
    "\n",
    "        next_layer = layers.BatchNormalization () ( next_layer )\n",
    "\n",
    "    if max_pool > 0:\n",
    "\n",
    "        next_layer = layers.MaxPooling2D ( max_pool, padding = padding ) ( next_layer )\n",
    "\n",
    "    if avg_pool > 0:\n",
    "\n",
    "        next_layer = layers.AveragePooling2D ( avg_pool, padding = padding ) ( next_layer )\n",
    "\n",
    "    return next_layer\n",
    "\n",
    "# Start with the input layer\n",
     "input_layer = layers.Input ( shape = ( n_inputs, ) )\n",
    "\n",
    "# Reshape so that we can crop it\n",
    "flat_layer = layers.Reshape ( target_shape = ( n_inputs, 1 ) ) ( input_layer )\n",
    "\n",
    "# Chop into the various sections\n",
    "rgb_flat_layer = layers.Cropping1D ( cropping = ( 0, n_inputs - n_rgb ) ) ( flat_layer )\n",
    "rgb_layer = layers.Reshape ( target_shape = ( 80, 80, 4 ) ) ( rgb_flat_layer )\n",
    "\n",
    "spc_flat_layer = layers.Cropping1D ( cropping = ( n_rgb, n_inputs - n_rgb - n_spc ) ) ( flat_layer )\n",
    "spc_layer = layers.Reshape ( target_shape = ( 40, 40, len ( sintch ) ) ) ( spc_flat_layer )\n",
    "\n",
    "meta_layer = layers.Cropping1D ( cropping = ( n_rgb + n_spc, 0 ) ) ( flat_layer )\n",
    "meta_flat_layer = layers.Flatten () ( meta_layer )\n",
    "\n",
    "# The heart of the CNN is below\n",
    "\n",
    "# Convolve the RGB\n",
    "\n",
    "# Here it goes from 80x80x4 to 80x80x8\n",
    "rgb_layer1 = make_c2d_layer (  8, 1, True, \"relu\", 0, 0, rgb_layer )\n",
    "\n",
    "# Now from 80x80x8 to 40x40x16 using max pooling\n",
    "rgb_layer2 = make_c2d_layer ( 16, 3, True, \"relu\", 2, 0, rgb_layer1 )\n",
    "\n",
    "# Now from 40x40x16 to 20x20x32\n",
    "rgb_layer3 = make_c2d_layer ( 32, 3, True, \"relu\", 2, 0, rgb_layer2 )\n",
    "\n",
    "# Convolve the spectral side\n",
    "\n",
    "# Here it goes from 40x40xlen(sintch) to 40x40x16\n",
    "spc_layer1 = make_c2d_layer ( 16, 1, True, \"relu\", 0, 0, spc_layer )\n",
    "\n",
    "# Next from 40x40x16 to 20x20x32\n",
    "spc_layer2 = make_c2d_layer ( 32, 3, True, \"relu\", 2, 0, spc_layer1 )\n",
    "\n",
    "# Now we combine the two images and we have 20x20x64\n",
    "img_layer = layers.Concatenate () ( [ rgb_layer3, spc_layer2 ] )\n",
    "\n",
    "# Convolve this combined layer\n",
    "img_layer1 = make_c2d_layer ( 64, 1, True, \"relu\", 0, 0, img_layer )\n",
    "\n",
    "# Convolve again into 10x10 with 128 features\n",
    "img_layer2 = make_c2d_layer ( 128, 3, True, \"relu\", 2, 0, img_layer1 )\n",
    "\n",
    "# Now we flatten this and add the meta data\n",
    "img_flat_layer = layers.Flatten () ( img_layer2 )\n",
    "\n",
    "hidden_layer1 = layers.Concatenate () ( [ img_flat_layer, meta_flat_layer ] )\n",
    "\n",
    "# Build a dense network on these\n",
    "hidden_layer2 = layers.Dense ( 20, activation = \"relu\" ) ( hidden_layer1 )\n",
    "hidden_layer3 = layers.Dense (  5, activation = \"relu\" ) ( hidden_layer2 )\n",
    "\n",
    "# Now we can construct the output layer\n",
    "output_layer = layers.Dense ( 2, activation = \"tanh\" ) ( hidden_layer3 )\n",
    "\n",
    "# Construct the model\n",
    "model = Model ( input_layer, output_layer )\n",
    "model.summary ()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f06e5014-edce-401b-a452-c9215180424c",
   "metadata": {},
   "source": [
    "# Training time\n",
    "\n",
     "The model is now built and we can train it on the training set. The elaborate slicing in the `fit` below limits the data to the rows that were marked as part of the training sample."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a2943d0b-3e39-4c57-bfc1-0be944fe9380",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "\n",
    "# Now we can compile and train the model on the training set\n",
    "model.compile ( optimizer = \"SGD\", loss = \"mae\" )\n",
    "model.fit ( store [ store [ :, -1 ] == 1, :n_inputs ], store [ store [ :, -1 ] == 1, -3:-1 ], epochs = 10 )"
   ]
  },
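  {
   "cell_type": "markdown",
   "id": "1a2b3c4d-0003-4000-8000-000000000003",
   "metadata": {},
   "source": [
    "As an aside, Keras can also track the validation sample for you via the `validation_data` argument to `fit`, which reports an out-of-sample loss after every epoch. The cell below is a self-contained sketch of that pattern on synthetic data - the array layout (inputs, two targets, a trailing train flag) mirrors `store`, but all names and numbers in it are illustrative and do not come from the competition data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1a2b3c4d-0004-4000-8000-000000000004",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Toy sketch of fit with validation_data - synthetic data only\n",
    "rng  = np.random.default_rng ( 0 )\n",
    "toy  = rng.normal ( size = ( 100, 6 ) )      # 3 inputs, 2 targets, 1 train flag\n",
    "toy [ :, -1 ] = rng.random ( 100 ) < 0.8     # roughly 80% training\n",
    "mask = toy [ :, -1 ] == 1\n",
    "\n",
    "toy_input  = layers.Input ( shape = ( 3, ) )\n",
    "toy_output = layers.Dense ( 2, activation = \"tanh\" ) ( toy_input )\n",
    "toy_model  = Model ( toy_input, toy_output )\n",
    "toy_model.compile ( optimizer = \"SGD\", loss = \"mae\" )\n",
    "\n",
    "history = toy_model.fit ( toy [ mask, :3 ], toy [ mask, 3:5 ],\n",
    "                          validation_data = ( toy [ ~mask, :3 ], toy [ ~mask, 3:5 ] ),\n",
    "                          epochs = 2, verbose = 0 )\n",
    "\n",
    "# val_loss is now tracked once per epoch\n",
    "assert len ( history.history [ 'val_loss' ] ) == 2"
   ]
  },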
  {
   "cell_type": "markdown",
   "id": "8bdbe80a-69ee-4d97-b585-338a64b756e4",
   "metadata": {},
   "source": [
    "# Testing time\n",
    "\n",
    "With the model trained the next step is to test it.\n",
    "\n",
     "Here we calculate the mean absolute error (MAE), as that is the metric used in the competition. In the calculation below I add `abs x` to `abs y` and should therefore really divide the result by 2. But, since the targets have been scaled by 2, omitting that division allows me to compare these results directly with the leaderboard. So a case of two wrongs making a right.\n",
    "\n",
     "The slicing in the square brackets limits the sample to the training and validation sections."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "51a474fc-da52-4f62-9860-e75ddd5a94cf",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Time to test the model\n",
    "pred = model.predict ( store [ :, :n_inputs ] )\n",
    "\n",
    "mae_0     = ( abs ( store [ :, -3 ] ) + abs ( store [ :, -2 ] ) ).mean ()\n",
    "mae_full  = ( abs ( pred [                    :, 0 ] - store [                    :, -3 ] ) + abs ( pred [                    :, 1 ] - store [                    :, -2 ] ) ).mean ()\n",
    "mae_train = ( abs ( pred [ store [ :, -1 ] == 1, 0 ] - store [ store [ :, -1 ] == 1, -3 ] ) + abs ( pred [ store [ :, -1 ] == 1, 1 ] - store [ store [ :, -1 ] == 1, -2 ] ) ).mean ()\n",
    "mae_valid = ( abs ( pred [ store [ :, -1 ] == 0, 0 ] - store [ store [ :, -1 ] == 0, -3 ] ) + abs ( pred [ store [ :, -1 ] == 0, 1 ] - store [ store [ :, -1 ] == 0, -2 ] ) ).mean ()\n",
    "\n",
    "print ( f\"MAE zeros { mae_0 } full sample { mae_full } in-sample { mae_train } validation sample { mae_valid }\" )"
   ]
  },
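  {
   "cell_type": "markdown",
   "id": "1a2b3c4d-0005-4000-8000-000000000005",
   "metadata": {},
   "source": [
    "To make the two-wrongs remark above concrete: on the original scale the metric averages the two absolute errors, i.e. `( abs ( dx ) + abs ( dy ) ) / 2`, while the calculation above sums the absolute errors on the half-scaled targets, i.e. `abs ( dx / 2 ) + abs ( dy / 2 )` - the same number. The cell below is a tiny standalone check with made-up errors."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1a2b3c4d-0006-4000-8000-000000000006",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Standalone check that the factor-2 target scaling cancels the missing division by 2\n",
    "dx, dy = 0.8, -0.3\n",
    "\n",
    "metric_on_original_scale = ( abs ( dx ) + abs ( dy ) ) / 2\n",
    "metric_as_computed_above = abs ( dx / 2 ) + abs ( dy / 2 )\n",
    "\n",
    "assert abs ( metric_on_original_scale - metric_as_computed_above ) < 1e-12"
   ]
  },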
  {
   "cell_type": "markdown",
   "id": "568b942e-a93e-4b85-8927-2928e9c3c152",
   "metadata": {},
   "source": [
    "# Test and submit\n",
    "\n",
    "The next step is to apply the model to the test data and create a submission file, as is done below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1c337624-1b6e-414b-be56-8765284d7cf7",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Now we can run it on the test data and create a submission file at the same time\n",
    "\n",
    "# Output file and file header\n",
    "out = open ( \"cnn.csv\", \"w\" )\n",
    "out.write ( \"ID,x,y\\n\" )\n",
    "\n",
    "# Test data\n",
    "test = load_test ()\n",
    "\n",
    "for irow, row in test.iterrows ():\n",
    "\n",
    "    x = preprocess_test ( row )\n",
    "    y = model ( x.reshape ( 1, -1 ) ).numpy ().flatten ()\n",
    "\n",
    "    # Scale back to target and write result\n",
    "    out.write ( str ( row.ID ) )\n",
    "    out.write ( \",\" )\n",
    "    out.write ( str ( y [ 0 ] * 2 ) )\n",
    "    out.write ( \",\" )\n",
    "    out.write ( str ( y [ 1 ] * 2 ) )\n",
    "    out.write ( \"\\n\" )\n",
    "\n",
    "# Done\n",
    "out.close ()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9ecec482-61b8-4c63-a81e-e74be35cb65b",
   "metadata": {},
   "source": [
    "# Next steps\n",
    "\n",
    "Well, do you feel lucky?\n",
    "\n",
    "If so, then you can simply change the random seed and try again, hoping to come up with a better solution. Good luck with that.\n",
    "\n",
    "Alternatively, you can try and train this differently or make minor changes here and there and see how it goes. Good luck with that as well.\n",
    "\n",
    "More advanced would be to add a type of ResNet layer where you hook up earlier layers to the output. I discarded that idea here, although I use it in my own work, as it would simply blow up an already bloated example and because this really is to get newcomers going.\n",
    "\n",
    "The best alternative, I think, is to really try to understand the problem and based on that understanding change this network into something better. This is more or less what I am doing. I had something like this roughly a month ago and since have changed it quite a bit based on my understanding of the problem.\n",
    "\n",
     "That said, the data here is so dirty that you could easily be better off just changing the seed. The complexity I have added since pales in comparison to the simplicity here.\n",
    "\n",
    "# My approach\n",
    "\n",
    "Yes - I do want to win and I hope by publishing this I am not endangering my chances.\n",
    "\n",
    "As said, I started with this and this type of model remains one of my best thus far.\n",
    "\n",
    "It scores 0.238 on the leaderboard and would put you, at the time of writing, in position 80 or so.\n",
    "\n",
     "However, this does not *fully* solve the problem. To go into details at this stage would be premature, but I believe I have a better grasp on this problem than is exhibited in this model.\n",
    "\n",
     "This model is a fairly standard CNN and it can obviously be improved by adding a ResNet-type layer, but there are also some other refinements possible that are pretty problem specific. Those I will not divulge at this stage.\n",
    "\n",
     "The advantage of the approach in this notebook is that it is simple, standard and works well for this problem. The disadvantage is that it does not really solve the problem as it should and disregards important pieces of information which, again, I do not want to divulge right now.\n",
    "\n",
    "I am working on another approach that I believe to be much better than this one, even though it looks pretty similar to this. The difference is based on customising this model more to the problem using some of the properties of CNNs that lend themselves to this type of customisation.\n",
    "\n",
     "This other approach's advantages are that it is much more customised for the problem at hand and should solve it better if the data quality is better. Its disadvantages are that it is more complex, requires more work to transpose its outputs into the format required by the competition and that, given the data, it may not win this competition.\n",
    "\n",
    "So this notebook gives you a real fighting chance, although I hope to be able to work my approach into a winning solution.\n",
    "\n",
    "---\n",
    "\n",
    "Regards<br/>\n",
    "MG Ferreira<br/>\n",
    "2021"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3f60b39a-c1df-4ec8-a5d8-d523e4424f03",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
