{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# The `ParameterMapping`\n",
    "In a heterogeneous `Dataset` that contains many `Replicate`s, one often needs to share some parameters globally across replicates while keeping others local, or shared to a subset of replicates.\n",
    "With `murefi`, the rules for this mapping of parameters are defined in a `ParameterMapping`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy\n",
    "import pandas\n",
    "\n",
    "import murefi"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Creating a `ParameterMapping`\n",
    "The `ParameterMapping` object is created from a `pandas.DataFrame`.\n",
    "This dataframe must be indexed by the replicate ID (`\"rid\"`) and have the generic model parameter names as column headers.\n",
    "\n",
    "The values in the `DataFrame` may be fixed to a specific value, or left variable by passing an arbitrary placeholder name.\n",
    "Names of placeholders may occur multiple times to share them across respective replicates, but they must only appear within the same column.\n",
    "In the following example, there are 6 free parameters ($substrate_c$, $P_0$, $v_{max,A/B/C}$ and $K_S$).\n",
    "Here, only $K_S$ is truly *global*, whereas all other parameters apply to only a subset of the replicates."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>S_0</th>\n",
       "      <th>P_0</th>\n",
       "      <th>v_max</th>\n",
       "      <th>K_S</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>rid</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>A</th>\n",
       "      <td>8.0</td>\n",
       "      <td>P_0</td>\n",
       "      <td>v_max_A</td>\n",
       "      <td>K_S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>B</th>\n",
       "      <td>10.0</td>\n",
       "      <td>P_0</td>\n",
       "      <td>v_max_B</td>\n",
       "      <td>K_S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C</th>\n",
       "      <td>substrate_c</td>\n",
       "      <td>0</td>\n",
       "      <td>v_max_C</td>\n",
       "      <td>K_S</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             S_0  P_0    v_max  K_S\n",
       "rid                                \n",
       "A            8.0  P_0  v_max_A  K_S\n",
       "B           10.0  P_0  v_max_B  K_S\n",
       "C    substrate_c    0  v_max_C  K_S"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_mapping = pandas.DataFrame(\n",
    "    columns='rid,S_0,P_0,v_max,K_S'.split(',')\n",
    ").set_index('rid')\n",
    "df_mapping.loc['A'] = (8.0, 'P_0', 'v_max_A', 'K_S')\n",
    "df_mapping.loc['B'] = (10.0, 'P_0', 'v_max_B', 'K_S')\n",
    "df_mapping.loc['C'] = ('substrate_c', '0', 'v_max_C', 'K_S')\n",
    "\n",
    "df_mapping"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The constructor of the `ParameterMapping` takes override-dictionaries for `bounds` and initial `guesses`.\n",
    "These dictionaries should contain bounds & guesses for all model dimensions.\n",
    "\n",
    "The entries in `bounds` & `guesses` *can* refer to the unique names of free parameters, but the more common use case is to use the same values for all parameters of the same kind:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "ParameterMapping(3 replicates, 4 inputs, 6 free parameters)"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# create the ParameterMapping object\n",
    "pm = murefi.ParameterMapping(\n",
    "    df_mapping,\n",
    "    guesses=dict(\n",
    "        S_0=5,\n",
    "        P_0=0,\n",
    "        v_max=2,\n",
    "        K_S=0.1,\n",
    "        # special guess for v_max_C:\n",
    "        v_max_C=3,\n",
    "    ),\n",
    "    bounds=dict(\n",
    "        S_0=(5, 15),\n",
    "        P_0=(0, 5),\n",
    "        v_max=(0, 10),\n",
    "        K_S=(0.01, 1),\n",
    "    )\n",
    ")\n",
    "pm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Properties of the `ParameterMapping`\n",
    "The object exposes several properties that are useful in various contexts.\n",
    "This includes:\n",
    "+ `order`: names of model parameters (matches the column order from the DataFrame)\n",
    "+ `parameters`: dictionary of all unique parameters in the mapping and their corresponding model parameter name\n",
    "+ `bounds` and `guesses` with `ndim` entries for all the `parameters`\n",
    "+ `mapping` is a dictionary indicating the names or values of parameters that will end up in replicate-wise parameter vectors"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Order of parameters in the model:   ('S_0', 'P_0', 'v_max', 'K_S')\n",
      "Number of free parameters:          6\n",
      "Names of free parameters:           ('K_S', 'P_0', 'substrate_c', 'v_max_A', 'v_max_B', 'v_max_C')\n",
      "\n",
      "Initial guesses:    (0.1, 0, 5, 2, 2, 3)\n",
      "Parameter bounds:   ((0.01, 1), (0, 5), (5, 15), (0, 10), (0, 10), (0, 10))\n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(f\"\"\"\n",
    "Order of parameters in the model:   {pm.order}\n",
    "Number of free parameters:          {pm.ndim}\n",
    "Names of free parameters:           {pm.theta_names}\n",
    "\n",
    "Initial guesses:    {pm.guesses}\n",
    "Parameter bounds:   {pm.bounds}\n",
    "\"\"\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The association between replicate ID and parameter values or placeholders is available through the `pm.mapping` property:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'A': (8.0, 'P_0', 'v_max_A', 'K_S'),\n",
       " 'B': (10.0, 'P_0', 'v_max_B', 'K_S'),\n",
       " 'C': ('substrate_c', 0.0, 'v_max_C', 'K_S')}"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pm.mapping"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The `repmap` Method\n",
    "The `repmap` method is the most important feature - it maps a global parameter vector (same order as `theta_names`) to `Replicate`-wise parameter vectors.\n",
    "\n",
    "For example, it can be used to map the full-length vector of initial guesses to short vectors for each replicate.\n",
    "Note that for replicate \"C\", the 3rd parameter ($v_{max}$) has the 3, because it was given a special initial guess earlier."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'A': (8.0, 0, 2, 0.1), 'B': (10.0, 0, 2, 0.1), 'C': (5, 0.0, 3, 0.1)}"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pm.repmap(pm.guesses)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Last updated: Mon Mar 29 2021\n",
      "\n",
      "Python implementation: CPython\n",
      "Python version       : 3.7.9\n",
      "IPython version      : 7.19.0\n",
      "\n",
      "murefi: 5.0.0\n",
      "numpy : 1.19.2\n",
      "pandas: 1.2.1\n",
      "\n",
      "Watermark: 2.1.0\n",
      "\n"
     ]
    }
   ],
   "source": [
    "%load_ext watermark\n",
    "%watermark -n -u -v -iv -w"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}