{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Calculating the Clustering Ensemble Similarity between ensembles\n", "\n", "Here we compare the conformational ensembles of proteins in three trajectories, using the clustering ensemble similarity method.\n", "\n", "**Last executed:** Feb 06, 2020 with MDAnalysis 0.20.2-dev0\n", "\n", "**Last updated:** January 2020\n", "\n", "**Minimum version of MDAnalysis:** 0.20.1\n", "\n", "**Packages required:**\n", " \n", "* MDAnalysis (Michaud-Agrawal *et al.*, 2011, Gowers *et al.*, 2016)\n", "* MDAnalysisTests\n", "* [scikit-learn](https://scikit-learn.org/stable/)\n", " \n", "**Optional packages for visualisation:**\n", "\n", "* [matplotlib](https://matplotlib.org)\n", "\n", "\n", "
\n", " \n", "**Note**\n", "\n", "The metrics and methods in the `encore` module are from (Tiberti *et al.*, 2015). Please cite them when using the ``MDAnalysis.analysis.encore`` module in published work.\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import MDAnalysis as mda\n", "from MDAnalysis.tests.datafiles import (PSF, DCD, DCD2, GRO, XTC, \n", " PSF_NAMD_GBIS, DCD_NAMD_GBIS)\n", "from MDAnalysis.analysis import encore\n", "from MDAnalysis.analysis.encore.clustering import ClusteringMethod as clm\n", "\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The test files we will be working with here feature adenylate kinase (AdK), a phosophotransferase enzyme. (Beckstein *et al.*, 2009)\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "u1 = mda.Universe(PSF, DCD)\n", "u2 = mda.Universe(PSF, DCD2)\n", "u3 = mda.Universe(PSF_NAMD_GBIS, DCD_NAMD_GBIS)\n", "\n", "labels = ['DCD', 'DCD2', 'NAMD']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The trajectories can have different lengths, as seen below." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "98 102 100\n" ] } ], "source": [ "print(len(u1.trajectory), len(u2.trajectory), len(u3.trajectory))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Calculating clustering similarity with default settings" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The clustering ensemble similarity method combines every trajectory into a whole space of conformations, and then uses a user-specified `clustering_method` to partition this into clusters. The population of each trajectory ensemble within each cluster is taken as a probability density function.\n", "\n", "The similarity of each probability density function is compared using the Jensen-Shannon divergence. This divergence has an upper bound of $\\ln{(2)}$, representing no similarity between the ensembles, and a lower bound of 0.0, representing identical conformational ensembles.\n", "\n", "You do not need to align your trajectories, as the function will align it for you (along your `selection` atoms, which are `selection='name CA'` by default). " ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "ces0, details0 = encore.ces([u1, u2, u3])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`encore.ces` returns two outputs. `ces0` is the similarity matrix for the ensemble of trajectories." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0. , 0.68070702, 0.69314718],\n", " [0.68070702, 0. , 0.69314718],\n", " [0.69314718, 0.69314718, 0. ]])" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ces0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`details0` contains the calculated clusters as a `encore.clustering.ClusterCollection.ClusterCollection`." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "We have found 49 clusters\n" ] } ], "source": [ "cluster_collection = details0['clustering'][0]\n", "print(type(cluster_collection))\n", "print('We have found {} clusters'.format(len(cluster_collection)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can access each Cluster at `cluster_collection.clusters`. For example, the first one has these elements:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 0, 1, 2, 3, 98])" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "first_cluster = cluster_collection.clusters[0]\n", "first_cluster" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 0, 1, 2, 3, 98])" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "first_cluster.elements" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each cluster has an ID number and a centroid conformation." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The ID of this cluster is: 0\n", "The centroid is 1\n" ] } ], "source": [ "print('The ID of this cluster is:', first_cluster.id)\n", "print('The centroid is', first_cluster.centroid)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plotting" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAU4AAAEICAYAAAAwUh0YAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8li6FKAAAgAElEQVR4nO3deZwdVZ338c83YY8gaECQHUFWkSUGncEFFQ2MI64DiA64YZToOD7y6IzzuIyzuKADDmCMjIK4ADKgzBgNyKOiAwwJTkADAiEiCSAQUMImkM53/qhqLC63u+t2bnVXd77v16teuXWrzqlT1Z1fn1Pn1CnZJiIi6psy3gWIiJhoEjgjInqUwBkR0aMEzoiIHiVwRkT0KIEzIqJHCZzDkPRxSV8f73IASHpA0i7jXY7xIMmSdh1i23GSftbn4x0j6eJRpn2hpBsq67dIevlalGWd/bm32TofOCW9SdKi8hf0Dknfl3RwH/PfqfyPv97a5GP7KbaX9atcMTTb37D9ilGm/ant3ftYlsd/7pLOlPQP/co7Rm+dDpySPgCcDPwT8AxgB+B04IjxLFfV2gbcmJjyc2+3dTZwSnoq8PfACbYvsP2g7cds/4ftE7vs/xJJKzq+e7wZJmlmWXNdJelOSZ8vd7us/Pf3Za32BeX+b5N0vaTfSVogacdKvpZ0gqSbgJsq3+1afj5T0mmSvifpfkn/LelZlfSvkHSDpPsknS7pJ5LeMcR1mCLpw5JulnSPpPMkPa3cNlhbPlbSrZJWSvpIJe1Q54yk50u6XNLvJV0j6SWVbT+W9A/l9gck/Yekp0v6RpnXQkk7dRT1cEnLyjJ8VlLX311Je0i6RNK95TX4i277lfseV+Z5v6RfSzqm8v3PKvtZ0nsk3VTu+0lJz5J0RVne8yRtUO77pN+Tjut1RXlN7pB06mC6ynG6/twlHQ8cA/zfyjU7UdK/dxzjXyWdPNQ5R5/YXicXYBawGlhvmH0+Dny9/PwSYEXH9luAl5efrwDeUn5+CvD88vNOgKvHAV4DLAX2BNYD/g64vLLdwCXA04CNK9/tWn4+E7gXmFmm/wZwTrltOrAKeF257a+Ax4B3DHGO7weuBLYDNgS+BHyro+xfBjYGngs8Auw5wjlvC9wDHE7xx/nQcn3LcvuPy/N/FvBU4DrgRuDlZZm/Bny143r8qLweO5T7vqPcdhzws/LzNGA58NYynwOAlcDeXc57Wnmddi/Xtxncr5pn5fgXAZsBe5fX4FJgl0r5j+32e8ITf0cOBJ5flm0n4Hrg/T3+3P+hsv82wIPA5uX6esBdwIHj/f9rsi/rbI0TeDqw0vbqPuX3GLCrpOm2H7B95TD7vgv4Z9vXl8f/J2C/aq2z3H6v7YeHyOMC21eV6b8B7Fd+fziwxEUtejXwBeC3I5TlI7ZX2H6E4o/FGzqaip+w/bDta4BrKALocOf8ZmC+7fm219i+BFhUlm3QV23fbPs+4PvAzbZ/WJb528D+HeX8dHk9bqW4vXJ0l3N5FXCL7a/aXm3758C/A28Y4tzXAPtI2tj2HbaXDHOdPm17VbnPL4GLbS+rlL+zvE9i+2rbV5Zlu4Xij9SLO3Yb6edeze8OihbNG8uvZlH8Tl89UtpYO+ty4LwHmN7He0lvB54N/Kpsar5qmH13BE4pm2y/p6g9iqKmNmj5CMerBsOHKGp8AM+sprVtoGvTsVKWCytluR4YoLjnO9KxhjrnHYE3DuZZ5nswRQ1p0J2Vzw93WX8KT1S9Hr8pz7PbuRzUcdxjgK07d7T9IHAkMBu4o7ztsUeXPEdb3ieR9GxJ/ynpt5JWUfzBnN6x20g/905nUfyhovz37B7Txyisy4HzCuAPFM3mOh4ENhlckTQV2HJw3fZNto8GtgI+DZwvaRpFU6vTcuBdtjevLBvbvryyz2inrbqDotk9WE5V14coy2EdZdnI9m0jHWiYc14OnN2R5zTbnxrlOQFsX/m8A3D7EOfyk47jPsX2u4co/wLbh1IE9F9R3JJo0hfL4+xmezPgbyn+YD6hWMOk77btO8C+kvahqHF/ox8FjeGts4GzbGJ9FDhN0mskbSJpfUmHSfpMlyQ3AhtJ+jNJ61Pcl9xwcKOkN0va0vYa4Pfl1wPA3RRNwupYvLnA30jau0z7VElvpD++BzynPKf1gBPoUuPqKMs/Dt4mkLSlpFqjCoY5568Dfy7plZKmStqo7DQZLoCP5ERJW0januK+7bld9vlP4NmS3lL+LNeX9DxJe3Yp+zMkvboM9I8AD5Rlb9KmFPdVHyhrt10D+jDu5Im/R9j+A3A+8E3gqvJWRjRsnQ2cALY/D3yAIgjeTVFjmUPxV7xz3/uA9wBnALdR1ECrTeBZwBJJDwCnAEfZ/oPth4B/BP6rbD4+3/aFFDW0c8om2y+Bw/p0Tisp7nl9huJ2xF4U9xcfGSLJKRQdHxdLup+io+igmocb6pyXUwzp+lv+eF1PZO1+374LXA0spvjj8G+dO9i+H3gFcBRFjfS3FNd5w859y7L8n3K/eynuNb5nLcpXxweBNwH3U9RuuwX/4fwbsFf5e1T9HT0LeA5ppo8ZFbfAYrIqh+2sAI6x/aPxLk/0n6QdKG4BbG171XiXZ12wTtc4J6uyiby5pA3543204Xr5Y4Iq/zB+gGI4WoLmGMnTCZPTCyjueW1AMcbwNXWGt8TEUt6fvZNilMGscS7OOiVN9YiIHqWpHhHRownZVN9AG3ojpo13MVpr130fHO8itN6UJw2fjKpblj/GynsH1uoivfKQab7n3nojvK6+9pEFtifM7YYJGTg3YhoH6WXjXYzWuvD7V413EVpvkykbjLzTOmzmK3t9gOnJ7rl3gKsW7FBr36nb3NT5BFWrTcjAGRHtZ2ANa8a7GI1I4IyIRhjzmJt+GGt8JHBGRGNS44yI6IExA5N0uGMCZ0Q0Zs2oJ/lqtwTOiGiEgYEEzoiI3qTGGRHRAwOP5R5nRER9xmmqR0T0xDAwOeNmAmdENKN4cmhySuCMiIaIgUk6mUoCZ0Q0ougcSuCMiKitGMeZwBkR0ZM1qXFGRNSXGmdERI+MGJikb+dJ4IyIxqSpHhHRAyMe9dTxLkYjEjgjohHFAPg01SMiepLOoYiIHthiwKlxRkT0ZE1qnBER9RWdQ5MzxEzOenREjLvBzqE6Sx2SZkm6QdJSSR8eYp+XSFosaYmkn/TzfKom55+DiGiFgT6N45Q0FTgNOBRYASyUdJHt6yr7bA6cDsyyfaukrfpy8C4SOCOiEX1+cmgmsNT2MgBJ5wBHANdV9nkTcIHtWwFs39Wvg3dKUz0iGrPGU2otNWwLLK+sryi/q3o2sIWkH0u6WtJf9uk0niQ1zohoRDHJR+262XRJiyrr82zPq6x3a/N3vphjPeBA4GXAxsAVkq60fWPdQtSVwBkRjTDisfqPXK60PWOY7SuA7Svr2wG3d9lnpe0HgQclXQY8F+h74ExTPSIaYcOAp9RaalgI7CZpZ0kbAEcBF3Xs813ghZLWk7QJcBBwfV9PqtSXwClpoDIE4BpJH5A0pbJ9pqTLyqEEv5J0hqRNJB0n6W5J/yPpJkkLJP1JP8oUEeNNrKm5jMT2amAOsIAiGJ5ne4mk2ZJml/tcD/wAuBa4CjjD9i+bOLN+NdUftr0fQDkE4JvAU4GPSXoG8G3gKNtXSBLwemDTMu25tueUaQ8BLpB0SHkRImKCMvT1kUvb84H5Hd/N7Vj/LPDZvh10CH1vqpdDAI4H5pRB8gTgLNtXlNtt+3zbd3ZJ+yNgXpk+Iia4AabUWiaaRkpcjrWaAmwF7ANc3UPynwN7NFGuiBg7RqxxvWWiabJXfbRXo2s6ScdT1kQ3YpPRlikixkjxeuDJOXCnkRqnpF2AAeAuYAnF2Kq69qdLT5jtebZn2J6xPhv2p6AR0SAxUHOZaPoeOCVtCcwFTrVt4FTgWEkHVfZ5s6Stu6R9MUWt8sv9LldEjC3T1yeHWqVf9eiNJS0G1gdWA2cDnwewfaeko4CTyh73NcBlwAVl2iMlHQxsAvwaeH161CMmh4lYm6yjL4HTHv7xgLJH/YVdNp1ZLhExydhqfW1S0rTySaOetPusImLCKjqHptZaxpqkP5F0HWV/iqTnSjq9bvoEzohoiPr5yGW//QvwSuAeANvXAC+qm3hyjhWIiHFXdA619x6n7eXFMzqPG6ibNoEzIhrT4qeClpfzYricNOR99DAhSAJnRDRi8MmhlpoNnEIxGfIK4GKKx8NrSeCMiMbUfRHbWLO9EjhmtOnbeVYRMeHZ8NiaKbWWsSbprPLlboPrW0j6St30qXFGRCOKpnpr62b72v794Irt30nav27iBM6IaEyLnxyaImkL278DkPQ0eoiHCZwR0YiWD0f6HHC5pPPL9TcC/1g3cQJnRDSkvU1121+TdDVwCMVUlq+zfd0IyR6XwBkRjanzPqFx9Cvgd5RxUNIOtm+tkzCBMyIaUfSqj/1z6HVIei/wMeBOiieGRHF3Yd866RM4I6IRLR8A/1fA7rbvGU3iBM6IaEyLm+rLgftGmziBMyIa0fJe9WXAjyV9D3hk8Evbn6+TuJ1dXhExKfTz1RmSZkm6QdJSSR/usv0lku6TtLhcPjpMdrcClwAbAJtWllpS44yIRthidZ+GI0maCpwGHEoxKcdCSRd1GUL0U9uvGrls/kSZb2aAj4h26eN71WcCS20vs/0ocA5wxGjLJekFmQE+Ilpn8B5nzcA5XdKiynJ8R3bbUnToDFpRftfpBZKukfR9SXsPU7yTyQzwEdFGPXQOrbQ9Y5jt3TJyx/rPgR1tPyDpcOA7wG5DZbg2M8CnxhkRjRgcx9mnpvoKYPvK+nbA7U84nr3K9gPl5/nA+pKmD5HfE2aAl/RBepgBPoEzIhqzBtVaalgI7CZp5/JVF0cBF1V3kLS1yiqkpJkU8W2oAe6zKWZ8H5wBfj8yA3xEjDcbVvdpkmLbqyXNARYAU4Gv2F4iaXa5fS7wBuDdklYDDwNH2e5szg/mt1YzwCdwRkRj+jkAvmx+z+/4bm7l86nAqXXykvSFLl/fByyy/d2R0qepHhGN6PM9zn7biKJ5flO57As8DXi7pJNHSpwaZ0Q0xu195HJX4KW2VwNI+iLFmy4PBX4xUuIEzohoTIsn+dgWmMYfJ/qYBjzT9oCkR4ZOVkjgjIhG2K2e5OMzwGJJP6YYI/oi4J8kTQN+OFLiBM6IaIgYGIdX/46kHLJ0MUVH00yKwPm3tgfHhZ44Uh4JnBHRmDbe47RtSd+xfSAwYg96NxMycO6674Nc+P2rxrsYrfXa7WaOdxFab8Hti8e7CJNey+fjvFLS82wvHE3iCRk4I2ICcHGfs6UOAWZLugV4kPKdQ7bzzqGIGF8t7lU/bG0St+/ObURMCi47h+osY142+zcUk4a8tPz8ED3Ew9Q4I6IxbW2qS/oYMAPYHfgqsD7wdeBP66RP4IyIxrSxV730WmB/ijk8sX27pLxzKCLGl93qwPloOSzJULx7qJfECZwR0ZgWD0c6T9KXgM0lvRN4G/DluokTOCOiMW29x2n7JEmHAqso7nN+1PYlddMncEZEI4xY08JHLgEk/TXw7V6CZVU7zyoiJgXXXMbBZsACST+VdIKkZ/SSOIEzIppRdg7VWca8aPYnbO9N8Z6hZwI/kTTirEiD0lSPiOa09B5nxV3Abyle6rZV3USpcUZEY9pa45T07nIuzkuB6cA76z6nDqlxRkRDDKxZ09rhSDsC77c9qmmyEjgjohkGWjaOU9JmtldRzACPpKdVt9u+t04+aapHRGPseksdkmZJukHSUkkfHma/50kakPSGLpu/Wf57NbCo/PfqynotqXFGRHP61DkkaSpwGsVbKFcACyVdZPu6Lvt9GljQtTj2q8p/d16b8iRwRkRD+trxMxNYansZgKRzgCOA6zr2ey/w78DzupZIOmC4g9j+eZ3CJHBGRHPq1zinS6o2lefZnldZ3xZYXllfARxUzUDSthSzHr2UIQIn8Lny340oppW7hmL2932B/wYOrlPYBM6IaIbB9XvVV9qeMcz2bhl1huWTgQ+V70bvXiT7EHi8xnq87V+U6/sAH6xb2ATOiGhQ35rqKyhmbB+0HXB7xz4zgHPKoDkdOFzSatvf6ZLfHoNBE8D2LyXtV7cwCZwR0Zz+PTm0ENhN0s7AbcBRwJuecKhKh4+kM4H/HCJoAlwv6QyKWd8NvBm4vm5hEjgjojl9Cpy2V0uaQ9FbPhX4iu0lkmaX2+f2mOVbgXcDf1WuXwZ8sW7iBM6IaEafB8Dbng/M7/iua8C0fdwIef0B+Jdy6VkCZ0Q0pq0TGa+tBM6IaE57n1VfKwmcEdEYpcYZEdGDcZzefSSSng2cSDFL0uNx0PZL66RP4IyIhqh1syNVfBuYS/Fmy4FeEydwRkRzWlrjBFbbrj38qFOmlYuI5qypuYy9/5D0HknbSHra4FI3cWqcEdGMFk5kXHFs+e+Jle8M7FInce3AKWkA+AWwPrAaOAs42faacvtM4CTgGWUBfga8D/gL4LMUz5o+BVgGfML25WW6zwJ/DjwK3Ay81fbv65YrItqrrb3qazsfZy9N9Ydt71e+UvNQ4HDgYwDlO4m/TTEzye7AnsAPgE3LtOfa3t/2bsCngAsk7VluuwTYp3xR0o3A36zNCUVEi7T0xeqS1pf0Pknnl8scSevXTT+qe5y27wKOB+aomIrkBOAs21eU2237fNt3dkn7I2BemR7bF9teXW6+kmLWk4iIJn0ROBA4vVwOZCyeVbe9TNIUincR70PRdK/r58C7unz/NuDcbgkkHU8ZbLffdmpvhY2IcdHWpjrwPNvPraz/f0nX1E28tr3qo73z+6R0kj5Cce/0G90S2J5ne4btGdOfnsAZ0XqmeOSyzjL2BiQ9a3BF0i70MJ5z1DXOyoHuApZQVHW/WzP5/lTmvpN0LPAq4GX2ZJ0WIGId1N7/zScCP5K0jKIityPFVHO1jCpwStqSYtT9qbYt6VTgKknfs/3f5T5vBn7YJe2LKZrcg1PYzwI+BLzY9kOjKU9EtFNbm+q2L5W0G7A7ReD8le1H6qbvJXBuLGkxfxyOdDbw+bIQd0o6CjhJ0lYUQ1ovAy4o0x4p6WBgE+DXwOttD9Y4TwU2BC4pp7y/0vbsHsoVEW3V0sBZOhDYiSIOPlcStr9WJ2HtwGl72BuLZY/6C7tsOrNchkq3a90yRMQE09LAKels4FnAYv54b9NAfwNnREQv5PY21Sle7LbXaPtUEjgjojntncj4l8DWwB2jSZzAGRGNaXGNczpwnaSrgMc7hWy/uk7iBM6IaE57A+fH1yZxAmdENKPF9zht/2Rt0mc+zohoTnsn+XidpJsk3SdplaT7Ja2qmz6BMyIaozX1llp5SbMk3SBpqaQPd9l+hKRrJS2WtKgcOz6UzwCvtv1U25vZ3tT2ZnXPK031iGg9SVOB0yimtFwBLJR0ke3rKrtdClxUPs24L3AesMcQWd5ZeQinZwmcEdGc/jXDZwJLbS8DkHQOcATweOC0/UBl/2kjHH2RpHOB7/DEXvULhk7yRwmcEdGM3jqHpktaVFmfZ3teZX1bYHllfQVwUGcmkl4L/DPFdJd/NszxNgMeAl7xxBKTwBkR46x+4Fxpe8Yw27uNpH9S7rYvBC6U9CLgk8DLuxbLrj0TUjcJnBHRnP411VcA21fWtwNuH/Kw9mWSniVpuu2VndslbQS8Hdgb2KiS7m11CpNe9YhohOhrr/pCYDdJO0vaADgKuOgJx5N2LV/lg6QDgA2Ae4bI72yKRy5fCfyEIhDfX/fcUuOMiGb0cQC87dWS5gALgKnAV2wvkTS73D4XeD3wl5IeAx4GjhxmEo9dbb9R0hG2z5L0zTLvWhI4I6I5fRzcbns+ML/ju7mVz58GPl0zu8fKf38vaR/gtxRzc9aSwBkRzWnpI5fAPElbAP+Posn/FOCjdRMncEZEY1r8rPoZ5cefALv0mj6BMyKa09LAKWlDinuiO1GJg7b/vk76BM6IaIbrP4c+Dr4L3AdcTeXJoboSOCOiOS2tcQLb2Z412sQZxxkRjRl879BIyzi4XNJzRps4Nc6IaE7LapySfkFRqvWAt0paRtFUF2Db+9bJJ4EzIpoxTpMUj+BV/cgkgTMiGiFaORzpbuAx248BSNodOBz4Td0p5SD3OCOiQS28x/kDyieEJO0KXEExjvMESf9cN5MEzohoTvveObSF7ZvKz8cC37L9XuAwemjGJ3BGRHPaFzirR3spcAmA7UeB2qNOc48zIprRztcDXyvpJOA2YFfgYgBJm/eSSWqcEdGc9tU43wmspLjP+QrbD5Xf7wWcVDeT1DgjojFte+TS9sPAp6rfSTrA9uXA5XXzmZCBcwpikykbjHcxWmvB7YvHuwit98pn7jfeRWi1Gz3UxOm9aWFTvZszgAN6STAhA2dETADtHADfTbcXwQ0rgTMimjMxAucnek2QwBkRjWjpk0OPk7QtsCNwb/k6YWxfVidtAmdENEZr2hk5JX0aOBK4DhgovzaQwBkR46jd9zhfA+xuu+dJjCGBMyIa1OKm+jJgfUYx+zskcEZEk/oYOCXNAk6heK/6GbY7x2MeA3yoXH0AeLfta4bI7iFgsaRLqQRP2++rU5YEzohoTL9qnJKmAqcBhwIrgIWSLrJ9XWW3XwMvtv07SYcB84CDhsjyonIZlQTOiGhO/2qcM4GltpcBSDoHOIKic6c4VPH0z6Arge2GLJZ9lqSNgR1s39BrYfKsekQ0o3zLZZ0FmC5pUWU5viO3bYHllfUV5XdDeTvw/aE2SvpzYDHF/JxI2k9S7RpoapwR0Ygex3GutD1jhOw6dc1d0iEUgfPgYfL7OEUt9scAthdL2rlWSUngjIgmuW9t9RXA9pX17YDbO3eStC/Fs+eH2cM+cL/a9n3SE+Jx7cKmqR4RjenjqzMWArtJ2lnSBsBRdHTuSNoBuAB4i+0bR8jvl5LeBEyVtJukf6WH2ZESOCOiGXXn4qwROG2vBuYAC4DrgfNsL5E0W9LscrePAk8HTpe0WNKiYbJ8L7A3xVCkbwGrgPfXPbU01SOiMf2cj9P2fGB+x3dzK5/fAbyjZl4PAR8BPlIOdZpm+w91y5IaZ0Q0pode9bEtl/RNSZtJmgYsAW6QdGLd9AmcEdEMU3QO1VnG3l62V1E8sz4f2AF4S93ECZwR0ZgWvld90PqS1qcInN+1/RjpVY+IVmjfy9oGfQm4BZgGXCZpR4oOolrSORQRjWjzRMa2vwB8ofLVb8qB87UkcEZEM+w2T2S8IfB6itcEV+Pg39dJn8AZEc1pZ9wE+C5wH3A1o5iTM4EzIhrT1qY6sJ3tWaNNnM6hiGiGgTWut4y9yyU9Z7SJU+OMiOa0t8Z5MHCcpF9TNNUF2Pa+dRIncEZEY1rcVD9sbRKnqR4RjdEa11rGmu3fUExT99Ly80P0EA9T44yIZrT49cCSPgbMAHYHvkrxxsuvA39aJ30CZ0Q0ohgA39LICa8F9gd+DmD7dkmb1k2cwBkRzRmHmY9qetS2peIubDlLUm25xxkRjZFdaxkH50n6ErC5pHcCl1K8cqOW1Dgjohktvsdp+yRJh1JM7PFs4O9s/7Bu+hFrnJIs6XOV9Q9K+njHPtdI+lbHd2dKeqh630DSKWV+08v1gXKK+yVlHh+QlFpwxKRQr0d9LHvVJd0vaZWkVRTvJ5pdLhdKulvSlZJeNlI+dYLUI8DrBoNdl4LsWebzoi73CZZSvDSeMiAeAtxW2f6w7f1s7w0cChwOfKxGmSJiImjZRMa2N7W9WblsWl2ArYF3AaeMlE+dwLkamAf89RDb3wScDVwMvLpj27eAI8vPLwH+q8zvSWzfBRwPzFHHOzsjYgJye1+d0Y3tAdvXAP860r51m8WnAcdIemqXbUcC51IEyaM7tt0EbClpi3LbOcMdxPayskxb1SxXRLRZy2qcddj+0kj71Aqc5bs5vga8r/q9pOcBd5cj7y8FDiiDZNUFFO9APgj4aY3Dda1tSjpe0iJJi+6+Z6BOsSNivLV3Bvi10ktHzMnA2ymmmh90NLCHpFuAm4HNKCYHrToH+CRwie1hK+WSdgEGgLs6t9meZ3uG7RlbPn1qD8WOiPGiNWtqLbXykmZJukHSUkkf7rJ9D0lXSHpE0gf7fjIVtQOn7XuB8yiC52BnzxuBfW3vZHsnio6gozvS3Urx/uLTh8tf0pbAXOBUu2V194jonSkGwNdZRlC++/w0isk59gKOlrRXx273UrSKT+pL+YfR69CfzwGDvesvAm6zXe0lvwzYS9I21US2v2T75i75bTw4HAn4IUUH0yd6LFNEtJCoN/i95gD4mcBS28tsP0rRkj2iuoPtu2wvBB7r/9k80YgD4G0/pfL5TmCTyubnd+w7AAwGzeOGyG+nyue0uSMms/qNx+mSFlXW59meV1nfFlheWV9B0W8yLvLkUEQ0p37gXGl7xjDbu3Uaj9stvQTOiGjG4D3O/lhBMX/moO2A2/uWe48SOCOiMXV7zGtYCOwmaWeKpw+Ponj4ZlwkcEZEQ/o3uN32aklzgAXAVOArtpdIml1unytpa2ARxbDINZLeD+xVjkPvqwTOiGiG6etTQbbnA/M7vptb+fxbiiZ84xI4I6I5LXkOvd8SOCOiMS1+dcZaSeCMiOYkcEZE9MCGgcnZVk/gjIjmpMYZEdGjBM6IiB4YGMP3CY2lBM6IaIhh+Cl4J6wEzohohknnUEREz3KPMyKiRwmcERG9aN8bLPslgTMimmGgf9PKtUoCZ0Q0JzXOiIhe5JHLiIjeGJxxnBERPcqTQxERPco9zoiIHtjpVY+I6FlqnBERvTAeGBjvQjQigTMimpFp5SIiRmGSDkeaMt4FiIjJyYDXuNZSh6RZkm6QtFTSh7tsl6QvlNuvlXRAv89pUAJnRDTD5UTGdZYRSJoKnAYcBuwFHC1pr47dDgN2K5fjgS/294T+KIEzIhrjgYFaSw0zgaW2l9l+FDgHOKJjnyOAr7lwJbC5pG36e0aFCXmP8+prH1k5dZulvxnvclRMB1aOdyFarmXXaOl4F6BTy64PO65tBg6hWkQAAAOiSURBVPfzuwU/9PnTa+6+kaRFlfV5tudV1rcFllfWVwAHdeTRbZ9tgTtqlqG2CRk4bW853mWokrTI9ozxLkeb5RoNbzJeH9uz+piduh1iFPv0RZrqETERrAC2r6xvB9w+in36IoEzIiaChcBuknaWtAFwFHBRxz4XAX9Z9q4/H7jPdt+b6TBBm+otNG/kXdZ5uUbDy/UZhu3VkuYAC4CpwFdsL5E0u9w+F5gPHE5xA/sh4K1NlUeepM+SRkQ0JU31iIgeJXBGRPQogXMYkgYkLZa0RNI1kj4gaUpl+0xJl5WPgf1K0hmSNpF0nKS7Jf2PpJskLZD0J+N5Lv3S1DWR9Nly/2slXShp8/E5w7UjyZI+V1n/oKSPd+xzjaRvdXx3pqSHJG1a+e6UMr/p5fqw1z7GTi768B62vZ/tvYFDKW48fwxA0jOAbwMfsr07sCfwA2DwF/9c2/vb3g34FHCBpD3H/Az6r6lrcgmwj+19gRuBvxmzM+qvR4DXDQa7TuX5TgFeJGlax+allE/DlAHxEOC2yvYhr32MrQTOmmzfRfH86xxJAk4AzrJ9Rbndts+3fWeXtD+i6DU9fizL3LR+XhPbF9teXW6+kmIM3kS0muK8/nqI7W8CzgYuBl7dse1bwJHl55cA/1Xm9yRdrn2MoQTOHtheRnHNtgL2Aa7uIfnPgT2aKNd4auiavA34/tqXbtycBhwj6aldth0JnEsRJI/u2HYTsKWkLcpt5wx3kI5rH2MogbN3o/3rPplrBX27JpI+QlHL+sZalWgc2V4FfA14X/V7Sc8D7rb9G+BS4IAySFZdQDG4+yDgpzUON5l/r1orgbMHknYBBoC7gCXAgT0k3x+4volyjad+XhNJxwKvAo7xxB9gfDLwdqB6H/NoYA9JtwA3A5sBr+9Idw7wSeASj/BS8o5rH2MogbMmSVsCc4FTy//UpwLHSjqoss+bJW3dJe2LKe5HfXmsyjsW+nlNJM0CPgS82vZDY1H+Jtm+FziPIngOdva8EdjX9k62d6LoCDq6I92twEeA04fLv8u1jzGURy6Ht7GkxcD6FM3Hs4HPA9i+U9JRwEmStgLWAJdRNLUAjpR0MLAJ8Gvg9bYnQ42zqWtyKrAhcEnZ13Gl7dljdE5N+Rwwp/z8IuA229Ve8suAvTrnjLT9pSHyG/Lax9jKI5cRET1KUz0iokcJnBERPUrgjIjoUQJnRESPEjgjInqUwBkR0aMEzoiIHv0vu/0EdcnfmaEAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fig0, ax0 = plt.subplots()\n", "im0 = plt.imshow(ces0, vmax=np.log(2), vmin=0)\n", "plt.xticks(np.arange(3), labels)\n", "plt.yticks(np.arange(3), labels)\n", "plt.title('Clustering ensemble similarity')\n", "cbar0 = fig0.colorbar(im0)\n", "cbar0.set_label('Jensen-Shannon divergence')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Calculating clustering similarity with one method" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Clustering methods should be subclasses of `analysis.encore.clustering.ClusteringMethod`, initialised with your chosen parameters. Below, we set up an affinity progragation scheme, which uses message-passing to choose a number of 'exemplar' points to represent the data and updates these points until they converge. The `preference` parameter controls how many exemplars are used -- a higher value results in more clusters, while a lower value results in fewer clusters. The `damping` factor damps the message passing to avoid numerical oscillations. [(See the scikit-learn user guide for more information.)](https://scikit-learn.org/stable/modules/clustering.html#affinity-propagation)\n", "\n", "The other keyword arguments control when to stop clustering. Adding noise to the data can also avoid numerical oscillations." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "clustering_method = clm.AffinityPropagationNative(preference=-1.0,\n", " damping=0.9,\n", " max_iter=200,\n", " convergence_iter=30,\n", " add_noise=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default, MDAnalysis will run the job on one core. If it is taking too long and you have the resources, you can increase the number of cores used." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "ces1, details1 = encore.ces([u1, u2, u3],\n", " selection='name CA',\n", " clustering_method=clustering_method,\n", " ncores=4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plotting" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAU4AAAEICAYAAAAwUh0YAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8li6FKAAAgAElEQVR4nO3deZwdVZ338c83YY8gaECQHUFWkSUGncEFFQ2MI64DiA64YZToOD7y6IzzuIyzuKADDmCMjIK4ADKgzBgNyKOiAwwJTkADAiEiCSAQUMImkM53/qhqLC63u+t2bnVXd77v16teuXWrzqlT1Z1fn1Pn1CnZJiIi6psy3gWIiJhoEjgjInqUwBkR0aMEzoiIHiVwRkT0KIEzIqJHCZzDkPRxSV8f73IASHpA0i7jXY7xIMmSdh1i23GSftbn4x0j6eJRpn2hpBsq67dIevlalGWd/bm32TofOCW9SdKi8hf0Dknfl3RwH/PfqfyPv97a5GP7KbaX9atcMTTb37D9ilGm/ant3ftYlsd/7pLOlPQP/co7Rm+dDpySPgCcDPwT8AxgB+B04IjxLFfV2gbcmJjyc2+3dTZwSnoq8PfACbYvsP2g7cds/4ftE7vs/xJJKzq+e7wZJmlmWXNdJelOSZ8vd7us/Pf3Za32BeX+b5N0vaTfSVogacdKvpZ0gqSbgJsq3+1afj5T0mmSvifpfkn/LelZlfSvkHSDpPsknS7pJ5LeMcR1mCLpw5JulnSPpPMkPa3cNlhbPlbSrZJWSvpIJe1Q54yk50u6XNLvJV0j6SWVbT+W9A/l9gck/Yekp0v6RpnXQkk7dRT1cEnLyjJ8VlLX311Je0i6RNK95TX4i277lfseV+Z5v6RfSzqm8v3PKvtZ0nsk3VTu+0lJz5J0RVne8yRtUO77pN+Tjut1RXlN7pB06mC6ynG6/twlHQ8cA/zfyjU7UdK/dxzjXyWdPNQ5R5/YXicXYBawGlhvmH0+Dny9/PwSYEXH9luAl5efrwDeUn5+CvD88vNOgKvHAV4DLAX2BNYD/g64vLLdwCXA04CNK9/tWn4+E7gXmFmm/wZwTrltOrAKeF257a+Ax4B3DHGO7weuBLYDNgS+BHyro+xfBjYGngs8Auw5wjlvC9wDHE7xx/nQcn3LcvuPy/N/FvBU4DrgRuDlZZm/Bny143r8qLweO5T7vqPcdhzws/LzNGA58NYynwOAlcDeXc57Wnmddi/Xtxncr5pn5fgXAZsBe5fX4FJgl0r5j+32e8ITf0cOBJ5flm0n4Hrg/T3+3P+hsv82wIPA5uX6esBdwIHj/f9rsi/rbI0TeDqw0vbqPuX3GLCrpOm2H7B95TD7vgv4Z9vXl8f/J2C/aq2z3H6v7YeHyOMC21eV6b8B7Fd+fziwxEUtejXwBeC3I5TlI7ZX2H6E4o/FGzqaip+w/bDta4BrKALocOf8ZmC+7fm219i+BFhUlm3QV23fbPs+4PvAzbZ/WJb528D+HeX8dHk9bqW4vXJ0l3N5FXCL7a/aXm3758C/A28Y4tzXAPtI2tj2HbaXDHOdPm17VbnPL4GLbS+rlL+zvE9i+2rbV5Zlu4Xij9SLO3Yb6edeze8OihbNG8uvZlH8Tl89UtpYO+ty4LwHmN7He0lvB54N/Kpsar5qmH13BE4pm2y/p6g9iqKmNmj5CMerBsOHKGp8AM+sprVtoGvTsVKWCytluR4YoLjnO9KxhjrnHYE3DuZZ5nswRQ1p0J2Vzw93WX8KT1S9Hr8pz7PbuRzUcdxjgK07d7T9IHAkMBu4o7ztsUeXPEdb3ieR9GxJ/ynpt5JWUfzBnN6x20g/905nUfyhovz37B7Txyisy4HzCuAPFM3mOh4ENhlckTQV2HJw3fZNto8GtgI+DZwvaRpFU6vTcuBdtjevLBvbvryyz2inrbqDotk9WE5V14coy2EdZdnI9m0jHWiYc14OnN2R5zTbnxrlOQFsX/m8A3D7EOfyk47jPsX2u4co/wLbh1IE9F9R3JJo0hfL4+xmezPgbyn+YD6hWMOk77btO8C+kvahqHF/ox8FjeGts4GzbGJ9FDhN0mskbSJpfUmHSfpMlyQ3AhtJ+jNJ61Pcl9xwcKOkN0va0vYa4Pfl1wPA3RRNwupYvLnA30jau0z7VElvpD++BzynPKf1gBPoUuPqKMs/Dt4mkLSlpFqjCoY5568Dfy7plZKmStqo7DQZLoCP5ERJW0januK+7bld9vlP4NmS3lL+LNeX9DxJe3Yp+zMkvboM9I8AD5Rlb9KmFPdVHyhrt10D+jDu5Im/R9j+A3A+8E3gqvJWRjRsnQ2cALY/D3yAIgjeTVFjmUPxV7xz3/uA9wBnALdR1ECrTeBZwBJJDwCnAEfZ/oPth4B/BP6rbD4+3/aFFDW0c8om2y+Bw/p0Tisp7nl9huJ2xF4U9xcfGSLJKRQdHxdLup+io+igmocb6pyXUwzp+lv+eF1PZO1+374LXA0spvjj8G+dO9i+H3gFcBRFjfS3FNd5w859y7L8n3K/eynuNb5nLcpXxweBNwH3U9RuuwX/4fwbsFf5e1T9HT0LeA5ppo8ZFbfAYrIqh+2sAI6x/aPxLk/0n6QdKG4BbG171XiXZ12wTtc4J6uyiby5pA3543204Xr5Y4Iq/zB+gGI4WoLmGMnTCZPTCyjueW1AMcbwNXWGt8TEUt6fvZNilMGscS7OOiVN9YiIHqWpHhHRownZVN9AG3ojpo13MVpr130fHO8itN6UJw2fjKpblj/GynsH1uoivfKQab7n3nojvK6+9pEFtifM7YYJGTg3YhoH6WXjXYzWuvD7V413EVpvkykbjLzTOmzmK3t9gOnJ7rl3gKsW7FBr36nb3NT5BFWrTcjAGRHtZ2ANa8a7GI1I4IyIRhjzmJt+GGt8JHBGRGNS44yI6IExA5N0uGMCZ0Q0Zs2oJ/lqtwTOiGiEgYEEzoiI3qTGGRHRAwOP5R5nRER9xmmqR0T0xDAwOeNmAmdENKN4cmhySuCMiIaIgUk6mUoCZ0Q0ougcSuCMiKitGMeZwBkR0ZM1qXFGRNSXGmdERI+MGJikb+dJ4IyIxqSpHhHRAyMe9dTxLkYjEjgjohHFAPg01SMiepLOoYiIHthiwKlxRkT0ZE1qnBER9RWdQ5MzxEzOenREjLvBzqE6Sx2SZkm6QdJSSR8eYp+XSFosaYmkn/TzfKom55+DiGiFgT6N45Q0FTgNOBRYASyUdJHt6yr7bA6cDsyyfaukrfpy8C4SOCOiEX1+cmgmsNT2MgBJ5wBHANdV9nkTcIHtWwFs39Wvg3dKUz0iGrPGU2otNWwLLK+sryi/q3o2sIWkH0u6WtJf9uk0niQ1zohoRDHJR+262XRJiyrr82zPq6x3a/N3vphjPeBA4GXAxsAVkq60fWPdQtSVwBkRjTDisfqPXK60PWOY7SuA7Svr2wG3d9lnpe0HgQclXQY8F+h74ExTPSIaYcOAp9RaalgI7CZpZ0kbAEcBF3Xs813ghZLWk7QJcBBwfV9PqtSXwClpoDIE4BpJH5A0pbJ9pqTLyqEEv5J0hqRNJB0n6W5J/yPpJkkLJP1JP8oUEeNNrKm5jMT2amAOsIAiGJ5ne4mk2ZJml/tcD/wAuBa4CjjD9i+bOLN+NdUftr0fQDkE4JvAU4GPSXoG8G3gKNtXSBLwemDTMu25tueUaQ8BLpB0SHkRImKCMvT1kUvb84H5Hd/N7Vj/LPDZvh10CH1vqpdDAI4H5pRB8gTgLNtXlNtt+3zbd3ZJ+yNgXpk+Iia4AabUWiaaRkpcjrWaAmwF7ANc3UPynwN7NFGuiBg7RqxxvWWiabJXfbRXo2s6ScdT1kQ3YpPRlikixkjxeuDJOXCnkRqnpF2AAeAuYAnF2Kq69qdLT5jtebZn2J6xPhv2p6AR0SAxUHOZaPoeOCVtCcwFTrVt4FTgWEkHVfZ5s6Stu6R9MUWt8sv9LldEjC3T1yeHWqVf9eiNJS0G1gdWA2cDnwewfaeko4CTyh73NcBlwAVl2iMlHQxsAvwaeH161CMmh4lYm6yjL4HTHv7xgLJH/YVdNp1ZLhExydhqfW1S0rTySaOetPusImLCKjqHptZaxpqkP5F0HWV/iqTnSjq9bvoEzohoiPr5yGW//QvwSuAeANvXAC+qm3hyjhWIiHFXdA619x6n7eXFMzqPG6ibNoEzIhrT4qeClpfzYricNOR99DAhSAJnRDRi8MmhlpoNnEIxGfIK4GKKx8NrSeCMiMbUfRHbWLO9EjhmtOnbeVYRMeHZ8NiaKbWWsSbprPLlboPrW0j6St30qXFGRCOKpnpr62b72v794Irt30nav27iBM6IaEyLnxyaImkL278DkPQ0eoiHCZwR0YiWD0f6HHC5pPPL9TcC/1g3cQJnRDSkvU1121+TdDVwCMVUlq+zfd0IyR6XwBkRjanzPqFx9Cvgd5RxUNIOtm+tkzCBMyIaUfSqj/1z6HVIei/wMeBOiieGRHF3Yd866RM4I6IRLR8A/1fA7rbvGU3iBM6IaEyLm+rLgftGmziBMyIa0fJe9WXAjyV9D3hk8Evbn6+TuJ1dXhExKfTz1RmSZkm6QdJSSR/usv0lku6TtLhcPjpMdrcClwAbAJtWllpS44yIRthidZ+GI0maCpwGHEoxKcdCSRd1GUL0U9uvGrls/kSZb2aAj4h26eN71WcCS20vs/0ocA5wxGjLJekFmQE+Ilpn8B5nzcA5XdKiynJ8R3bbUnToDFpRftfpBZKukfR9SXsPU7yTyQzwEdFGPXQOrbQ9Y5jt3TJyx/rPgR1tPyDpcOA7wG5DZbg2M8CnxhkRjRgcx9mnpvoKYPvK+nbA7U84nr3K9gPl5/nA+pKmD5HfE2aAl/RBepgBPoEzIhqzBtVaalgI7CZp5/JVF0cBF1V3kLS1yiqkpJkU8W2oAe6zKWZ8H5wBfj8yA3xEjDcbVvdpkmLbqyXNARYAU4Gv2F4iaXa5fS7wBuDdklYDDwNH2e5szg/mt1YzwCdwRkRj+jkAvmx+z+/4bm7l86nAqXXykvSFLl/fByyy/d2R0qepHhGN6PM9zn7biKJ5flO57As8DXi7pJNHSpwaZ0Q0xu195HJX4KW2VwNI+iLFmy4PBX4xUuIEzohoTIsn+dgWmMYfJ/qYBjzT9oCkR4ZOVkjgjIhG2K2e5OMzwGJJP6YYI/oi4J8kTQN+OFLiBM6IaIgYGIdX/46kHLJ0MUVH00yKwPm3tgfHhZ44Uh4JnBHRmDbe47RtSd+xfSAwYg96NxMycO6674Nc+P2rxrsYrfXa7WaOdxFab8Hti8e7CJNey+fjvFLS82wvHE3iCRk4I2ICcHGfs6UOAWZLugV4kPKdQ7bzzqGIGF8t7lU/bG0St+/ObURMCi47h+osY142+zcUk4a8tPz8ED3Ew9Q4I6IxbW2qS/oYMAPYHfgqsD7wdeBP66RP4IyIxrSxV730WmB/ijk8sX27pLxzKCLGl93qwPloOSzJULx7qJfECZwR0ZgWD0c6T9KXgM0lvRN4G/DluokTOCOiMW29x2n7JEmHAqso7nN+1PYlddMncEZEI4xY08JHLgEk/TXw7V6CZVU7zyoiJgXXXMbBZsACST+VdIKkZ/SSOIEzIppRdg7VWca8aPYnbO9N8Z6hZwI/kTTirEiD0lSPiOa09B5nxV3Abyle6rZV3USpcUZEY9pa45T07nIuzkuB6cA76z6nDqlxRkRDDKxZ09rhSDsC77c9qmmyEjgjohkGWjaOU9JmtldRzACPpKdVt9u+t04+aapHRGPseksdkmZJukHSUkkfHma/50kakPSGLpu/Wf57NbCo/PfqynotqXFGRHP61DkkaSpwGsVbKFcACyVdZPu6Lvt9GljQtTj2q8p/d16b8iRwRkRD+trxMxNYansZgKRzgCOA6zr2ey/w78DzupZIOmC4g9j+eZ3CJHBGRHPq1zinS6o2lefZnldZ3xZYXllfARxUzUDSthSzHr2UIQIn8Lny340oppW7hmL2932B/wYOrlPYBM6IaIbB9XvVV9qeMcz2bhl1huWTgQ+V70bvXiT7EHi8xnq87V+U6/sAH6xb2ATOiGhQ35rqKyhmbB+0HXB7xz4zgHPKoDkdOFzSatvf6ZLfHoNBE8D2LyXtV7cwCZwR0Zz+PTm0ENhN0s7AbcBRwJuecKhKh4+kM4H/HCJoAlwv6QyKWd8NvBm4vm5hEjgjojl9Cpy2V0uaQ9FbPhX4iu0lkmaX2+f2mOVbgXcDf1WuXwZ8sW7iBM6IaEafB8Dbng/M7/iua8C0fdwIef0B+Jdy6VkCZ0Q0pq0TGa+tBM6IaE57n1VfKwmcEdEYpcYZEdGDcZzefSSSng2cSDFL0uNx0PZL66RP4IyIhqh1syNVfBuYS/Fmy4FeEydwRkRzWlrjBFbbrj38qFOmlYuI5qypuYy9/5D0HknbSHra4FI3cWqcEdGMFk5kXHFs+e+Jle8M7FInce3AKWkA+AWwPrAaOAs42faacvtM4CTgGWUBfga8D/gL4LMUz5o+BVgGfML25WW6zwJ/DjwK3Ay81fbv65YrItqrrb3qazsfZy9N9Ydt71e+UvNQ4HDgYwDlO4m/TTEzye7AnsAPgE3LtOfa3t/2bsCngAsk7VluuwTYp3xR0o3A36zNCUVEi7T0xeqS1pf0Pknnl8scSevXTT+qe5y27wKOB+aomIrkBOAs21eU2237fNt3dkn7I2BemR7bF9teXW6+kmLWk4iIJn0ROBA4vVwOZCyeVbe9TNIUincR70PRdK/r58C7unz/NuDcbgkkHU8ZbLffdmpvhY2IcdHWpjrwPNvPraz/f0nX1E28tr3qo73z+6R0kj5Cce/0G90S2J5ne4btGdOfnsAZ0XqmeOSyzjL2BiQ9a3BF0i70MJ5z1DXOyoHuApZQVHW/WzP5/lTmvpN0LPAq4GX2ZJ0WIGId1N7/zScCP5K0jKIityPFVHO1jCpwStqSYtT9qbYt6VTgKknfs/3f5T5vBn7YJe2LKZrcg1PYzwI+BLzY9kOjKU9EtFNbm+q2L5W0G7A7ReD8le1H6qbvJXBuLGkxfxyOdDbw+bIQd0o6CjhJ0lYUQ1ovAy4o0x4p6WBgE+DXwOttD9Y4TwU2BC4pp7y/0vbsHsoVEW3V0sBZOhDYiSIOPlcStr9WJ2HtwGl72BuLZY/6C7tsOrNchkq3a90yRMQE09LAKels4FnAYv54b9NAfwNnREQv5PY21Sle7LbXaPtUEjgjojntncj4l8DWwB2jSZzAGRGNaXGNczpwnaSrgMc7hWy/uk7iBM6IaE57A+fH1yZxAmdENKPF9zht/2Rt0mc+zohoTnsn+XidpJsk3SdplaT7Ja2qmz6BMyIaozX1llp5SbMk3SBpqaQPd9l+hKRrJS2WtKgcOz6UzwCvtv1U25vZ3tT2ZnXPK031iGg9SVOB0yimtFwBLJR0ke3rKrtdClxUPs24L3AesMcQWd5ZeQinZwmcEdGc/jXDZwJLbS8DkHQOcATweOC0/UBl/2kjHH2RpHOB7/DEXvULhk7yRwmcEdGM3jqHpktaVFmfZ3teZX1bYHllfQVwUGcmkl4L/DPFdJd/NszxNgMeAl7xxBKTwBkR46x+4Fxpe8Yw27uNpH9S7rYvBC6U9CLgk8DLuxbLrj0TUjcJnBHRnP411VcA21fWtwNuH/Kw9mWSniVpuu2VndslbQS8Hdgb2KiS7m11CpNe9YhohOhrr/pCYDdJO0vaADgKuOgJx5N2LV/lg6QDgA2Ae4bI72yKRy5fCfyEIhDfX/fcUuOMiGb0cQC87dWS5gALgKnAV2wvkTS73D4XeD3wl5IeAx4GjhxmEo9dbb9R0hG2z5L0zTLvWhI4I6I5fRzcbns+ML/ju7mVz58GPl0zu8fKf38vaR/gtxRzc9aSwBkRzWnpI5fAPElbAP+Posn/FOCjdRMncEZEY1r8rPoZ5cefALv0mj6BMyKa09LAKWlDinuiO1GJg7b/vk76BM6IaIbrP4c+Dr4L3AdcTeXJoboSOCOiOS2tcQLb2Z412sQZxxkRjRl879BIyzi4XNJzRps4Nc6IaE7LapySfkFRqvWAt0paRtFUF2Db+9bJJ4EzIpoxTpMUj+BV/cgkgTMiGiFaORzpbuAx248BSNodOBz4Td0p5SD3OCOiQS28x/kDyieEJO0KXEExjvMESf9cN5MEzohoTvveObSF7ZvKz8cC37L9XuAwemjGJ3BGRHPaFzirR3spcAmA7UeB2qNOc48zIprRztcDXyvpJOA2YFfgYgBJm/eSSWqcEdGc9tU43wmspLjP+QrbD5Xf7wWcVDeT1DgjojFte+TS9sPAp6rfSTrA9uXA5XXzmZCBcwpikykbjHcxWmvB7YvHuwit98pn7jfeRWi1Gz3UxOm9aWFTvZszgAN6STAhA2dETADtHADfTbcXwQ0rgTMimjMxAucnek2QwBkRjWjpk0OPk7QtsCNwb/k6YWxfVidtAmdENEZr2hk5JX0aOBK4DhgovzaQwBkR46jd9zhfA+xuu+dJjCGBMyIa1OKm+jJgfUYx+zskcEZEk/oYOCXNAk6heK/6GbY7x2MeA3yoXH0AeLfta4bI7iFgsaRLqQRP2++rU5YEzohoTL9qnJKmAqcBhwIrgIWSLrJ9XWW3XwMvtv07SYcB84CDhsjyonIZlQTOiGhO/2qcM4GltpcBSDoHOIKic6c4VPH0z6Arge2GLJZ9lqSNgR1s39BrYfKsekQ0o3zLZZ0FmC5pUWU5viO3bYHllfUV5XdDeTvw/aE2SvpzYDHF/JxI2k9S7RpoapwR0Ygex3GutD1jhOw6dc1d0iEUgfPgYfL7OEUt9scAthdL2rlWSUngjIgmuW9t9RXA9pX17YDbO3eStC/Fs+eH2cM+cL/a9n3SE+Jx7cKmqR4RjenjqzMWArtJ2lnSBsBRdHTuSNoBuAB4i+0bR8jvl5LeBEyVtJukf6WH2ZESOCOiGXXn4qwROG2vBuYAC4DrgfNsL5E0W9LscrePAk8HTpe0WNKiYbJ8L7A3xVCkbwGrgPfXPbU01SOiMf2cj9P2fGB+x3dzK5/fAbyjZl4PAR8BPlIOdZpm+w91y5IaZ0Q0pode9bEtl/RNSZtJmgYsAW6QdGLd9AmcEdEMU3QO1VnG3l62V1E8sz4f2AF4S93ECZwR0ZgWvld90PqS1qcInN+1/RjpVY+IVmjfy9oGfQm4BZgGXCZpR4oOolrSORQRjWjzRMa2vwB8ofLVb8qB87UkcEZEM+w2T2S8IfB6itcEV+Pg39dJn8AZEc1pZ9wE+C5wH3A1o5iTM4EzIhrT1qY6sJ3tWaNNnM6hiGiGgTWut4y9yyU9Z7SJU+OMiOa0t8Z5MHCcpF9TNNUF2Pa+dRIncEZEY1rcVD9sbRKnqR4RjdEa11rGmu3fUExT99Ly80P0EA9T44yIZrT49cCSPgbMAHYHvkrxxsuvA39aJ30CZ0Q0ohgA39LICa8F9gd+DmD7dkmb1k2cwBkRzRmHmY9qetS2peIubDlLUm25xxkRjZFdaxkH50n6ErC5pHcCl1K8cqOW1Dgjohktvsdp+yRJh1JM7PFs4O9s/7Bu+hFrnJIs6XOV9Q9K+njHPtdI+lbHd2dKeqh630DSKWV+08v1gXKK+yVlHh+QlFpwxKRQr0d9LHvVJd0vaZWkVRTvJ5pdLhdKulvSlZJeNlI+dYLUI8DrBoNdl4LsWebzoi73CZZSvDSeMiAeAtxW2f6w7f1s7w0cChwOfKxGmSJiImjZRMa2N7W9WblsWl2ArYF3AaeMlE+dwLkamAf89RDb3wScDVwMvLpj27eAI8vPLwH+q8zvSWzfBRwPzFHHOzsjYgJye1+d0Y3tAdvXAP860r51m8WnAcdIemqXbUcC51IEyaM7tt0EbClpi3LbOcMdxPayskxb1SxXRLRZy2qcddj+0kj71Aqc5bs5vga8r/q9pOcBd5cj7y8FDiiDZNUFFO9APgj4aY3Dda1tSjpe0iJJi+6+Z6BOsSNivLV3Bvi10ktHzMnA2ymmmh90NLCHpFuAm4HNKCYHrToH+CRwie1hK+WSdgEGgLs6t9meZ3uG7RlbPn1qD8WOiPGiNWtqLbXykmZJukHSUkkf7rJ9D0lXSHpE0gf7fjIVtQOn7XuB8yiC52BnzxuBfW3vZHsnio6gozvS3Urx/uLTh8tf0pbAXOBUu2V194jonSkGwNdZRlC++/w0isk59gKOlrRXx273UrSKT+pL+YfR69CfzwGDvesvAm6zXe0lvwzYS9I21US2v2T75i75bTw4HAn4IUUH0yd6LFNEtJCoN/i95gD4mcBS28tsP0rRkj2iuoPtu2wvBB7r/9k80YgD4G0/pfL5TmCTyubnd+w7AAwGzeOGyG+nyue0uSMms/qNx+mSFlXW59meV1nfFlheWV9B0W8yLvLkUEQ0p37gXGl7xjDbu3Uaj9stvQTOiGjG4D3O/lhBMX/moO2A2/uWe48SOCOiMXV7zGtYCOwmaWeKpw+Ponj4ZlwkcEZEQ/o3uN32aklzgAXAVOArtpdIml1unytpa2ARxbDINZLeD+xVjkPvqwTOiGiG6etTQbbnA/M7vptb+fxbiiZ84xI4I6I5LXkOvd8SOCOiMS1+dcZaSeCMiOYkcEZE9MCGgcnZVk/gjIjmpMYZEdGjBM6IiB4YGMP3CY2lBM6IaIhh+Cl4J6wEzohohknnUEREz3KPMyKiRwmcERG9aN8bLPslgTMimmGgf9PKtUoCZ0Q0JzXOiIhe5JHLiIjeGJxxnBERPcqTQxERPco9zoiIHtjpVY+I6FlqnBERvTAeGBjvQjQigTMimpFp5SIiRmGSDkeaMt4FiIjJyYDXuNZSh6RZkm6QtFTSh7tsl6QvlNuvlXRAv89pUAJnRDTD5UTGdZYRSJoKnAYcBuwFHC1pr47dDgN2K5fjgS/294T+KIEzIhrjgYFaSw0zgaW2l9l+FDgHOKJjnyOAr7lwJbC5pG36e0aFCXmP8+prH1k5dZulvxnvclRMB1aOdyFarmXXaOl4F6BTy64PO65tBg6hWkQAAAOiSURBVPfzuwU/9PnTa+6+kaRFlfV5tudV1rcFllfWVwAHdeTRbZ9tgTtqlqG2CRk4bW853mWokrTI9ozxLkeb5RoNbzJeH9uz+piduh1iFPv0RZrqETERrAC2r6xvB9w+in36IoEzIiaChcBuknaWtAFwFHBRxz4XAX9Z9q4/H7jPdt+b6TBBm+otNG/kXdZ5uUbDy/UZhu3VkuYAC4CpwFdsL5E0u9w+F5gPHE5xA/sh4K1NlUeepM+SRkQ0JU31iIgeJXBGRPQogXMYkgYkLZa0RNI1kj4gaUpl+0xJl5WPgf1K0hmSNpF0nKS7Jf2PpJskLZD0J+N5Lv3S1DWR9Nly/2slXShp8/E5w7UjyZI+V1n/oKSPd+xzjaRvdXx3pqSHJG1a+e6UMr/p5fqw1z7GTi768B62vZ/tvYFDKW48fwxA0jOAbwMfsr07sCfwA2DwF/9c2/vb3g34FHCBpD3H/Az6r6lrcgmwj+19gRuBvxmzM+qvR4DXDQa7TuX5TgFeJGlax+allE/DlAHxEOC2yvYhr32MrQTOmmzfRfH86xxJAk4AzrJ9Rbndts+3fWeXtD+i6DU9fizL3LR+XhPbF9teXW6+kmIM3kS0muK8/nqI7W8CzgYuBl7dse1bwJHl55cA/1Xm9yRdrn2MoQTOHtheRnHNtgL2Aa7uIfnPgT2aKNd4auiavA34/tqXbtycBhwj6aldth0JnEsRJI/u2HYTsKWkLcpt5wx3kI5rH2MogbN3o/3rPplrBX27JpI+QlHL+sZalWgc2V4FfA14X/V7Sc8D7rb9G+BS4IAySFZdQDG4+yDgpzUON5l/r1orgbMHknYBBoC7gCXAgT0k3x+4volyjad+XhNJxwKvAo7xxB9gfDLwdqB6H/NoYA9JtwA3A5sBr+9Idw7wSeASj/BS8o5rH2MogbMmSVsCc4FTy//UpwLHSjqoss+bJW3dJe2LKe5HfXmsyjsW+nlNJM0CPgS82vZDY1H+Jtm+FziPIngOdva8EdjX9k62d6LoCDq6I92twEeA04fLv8u1jzGURy6Ht7GkxcD6FM3Hs4HPA9i+U9JRwEmStgLWAJdRNLUAjpR0MLAJ8Gvg9bYnQ42zqWtyKrAhcEnZ13Gl7dljdE5N+Rwwp/z8IuA229Ve8suAvTrnjLT9pSHyG/Lax9jKI5cRET1KUz0iokcJnBERPUrgjIjoUQJnRESPEjgjInqUwBkR0aMEzoiIHv0vu/0EdcnfmaEAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fig1, ax1 = plt.subplots()\n", "im1 = plt.imshow(ces1, vmax=np.log(2), vmin=0)\n", "plt.xticks(np.arange(3), labels)\n", "plt.yticks(np.arange(3), labels)\n", "plt.title('Clustering ensemble similarity')\n", "cbar1 = fig1.colorbar(im1)\n", "cbar1.set_label('Jensen-Shannon divergence')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Calculating clustering similarity with multiple methods" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You may want to try different clustering methods, or use different parameters within the methods. `encore.ces` allows you to pass a list of `clustering_method`s to be applied.\n", "\n", "
\n", " \n", "**Note**\n", "\n", "To use the other ENCORE methods available, you need to install [scikit-learn](https://scikit-learn.org/stable/).\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Trying out different clustering parameters" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The KMeans clustering algorithm separates samples into $n$ groups of equal variance, with centroids that minimise the inertia. You must choose how many clusters to partition. [(See the scikit-learn user guide for more information.)](https://scikit-learn.org/stable/modules/clustering.html#k-means)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "km1 = clm.KMeans(12, # no. clusters\n", " init = 'k-means++', # default\n", " algorithm=\"auto\") # default\n", "\n", "km2 = clm.KMeans(6, # no. clusters\n", " init = 'k-means++', # default\n", " algorithm=\"auto\") # default" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The DBSCAN algorithm is a density-based clustering method that defines clusters as 'high density' areas, separated by low density areas. The parameters `min_samples` and `eps` define how dense an area should be to form a cluster. Clusters are defined around core points which have at least `min_samples` neighbours within a distance of `eps`. Points that are at least `eps` in distance from any core point are considered outliers.\n", "[(See the scikit-learn user guide for more information.)](https://scikit-learn.org/stable/modules/clustering.html#dbscan)\n", "\n", "A higher `min_samples` or lower `eps` mean that data points must be more dense to form a cluster. You should consider your `eps` carefully. In MDAnalysis, `eps` can be interpreted as the distance between two points in Angstrom.\n", "\n", "
\n", " \n", "**Note**\n", "\n", "DBSCAN is an algorithm that can identify outliers, or data points that don't fit into any cluster. ``dres()`` and ``dres_convergence()`` treat the outliers as their own cluster. This means that the Jensen-Shannon divergence will be lower than it should be for trajectories that have outliers. Do not use this clustering method unless you are certain that your trajectories will not have outliers.\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "db1 = clm.DBSCAN(eps=0.5,\n", " min_samples=5,\n", " algorithm='auto',\n", " leaf_size=30)\n", "\n", "db2 = clm.DBSCAN(eps=1,\n", " min_samples=5,\n", " algorithm='auto',\n", " leaf_size=30)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we pass a list of clustering methods to `encore.ces`, the results get saved in `ces2` and `details2` in order." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4 4\n" ] } ], "source": [ "ces2, details2 = encore.ces([u1, u2, u3],\n", " selection='name CA',\n", " clustering_method=[km1, km2, db1, db2],\n", " ncores=4)\n", "print(len(ces2), len(details2['clustering']))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plotting" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAygAAADACAYAAADvC0eDAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8li6FKAAAgAElEQVR4nO3debgcVZnH8e8vISEQVkkACRBAIqthC0GRLSiyiAOKSgAVFMUoyKijIw6jwjg6I6IiAxgigyAoQREVRyQsKojAmIABDIjEsCQwJIQtLAGSm3f+qHNJpdNL3dv3dnfd+/s8z3nStZzqU31fmn7r1DmliMDMzMzMzKwTDGl3A8zMzMzMzLo5QTEzMzMzs47hBMXMzMzMzDqGExQzMzMzM+sYTlDMzMzMzKxjOEExMzMzM7OO4QTFGpIUkrZtdzvMGpF0iaR/b3c7zMzMrPcGVYIi6WFJb88tT5b0jKT929mu3pI0SdLvJD0n6eGKbRtLukLS42n7HyXt1aam+odjHxtosQwgaXdJt0h6QdJCSf/YpnYcIGlBO957sElxvFTS85KelXSbpCmShuT2uUTSqykunpd0Zz7OJQ2X9C1JC9I+D0n6TsX7HCtpVtr+f5J+I2mfin1OSBdj3l+x/oC0/vyK9bdKOqFPP5A+ImnX9Dm9lP7dtc6++c+3uwxtZXsHCsdz/5A0TdIDklZ0ahut7w2qBCVP0vHA+cA7I+Lmdrenl14ELgY+X2XbOsBMYA/gdcClwK8lrdO65vUdSWu0uw2daiDEsqRRwHXAhcBGwLbA9W1tVC85VnvsXRGxLjAW+E/gC8B/V+xzVkSsA6wPfA+4Ovcj+ovABGAisC4wCfhzd0VJnwXOAb4ObAJsCVwAHFHxHscDT6d/K70IfEjSVr06wxaSNBz4JXA5sCHZd/8v0/pazoqIdXKlqxVtHaAcz33vbuCTwF3tboi1UEQMmgI8DLwdOAlYDEzIbdsKCODDwHzgGWAKsCdwD/AscF7F8T4C3J/2nQGMzW37bjrOEuBOYN/ctjOAnwA/BJ4H5lS05QvAY2nbA8DbGpzX24GHC5z/EmCPGtuGAv8C/D29753AFmlbANum178HPpqrdwJwa3ot4DvAIuC59LntnD7vZcCrwAvAr9L+mwE/A54EHgJOrfiMriL7n+wS4KNkX9iz0vJC4NvtjinHct/EMtn/bC/rwfnvA9yWzmU+cEJafwnw75WxmauXj+XDgPtS2x4DPgeMBJYCK1KsvpDidAhwWvrv46l0zq+r+LxPBB4FbgFGpNh9KrVxJrBJu+Om00p3HFesm5g+/50r/6Zpee30eW+Wlv8H+HSN46+f/obva9COsek9jwKW5/9WwAHAAuC/gB/k1t/aHXdVjlckXk4CHgf+D/inivPv9fcc8I4Uz8qtexQ4pMb+q3y+Lo7nKsdrWzxXtKNmG10GXhmMPSifAL5K9kNpVpXtewHjgKPJrlKcTvZDcCfg/d1dsZKOJPtB/x5gNPAH4IrccWYCu5L1XvwY+KmkEbnt/wBMBzYArgHOS8fdDjgF2DOyqzAHk33pNSV18Q8H5tbY5bPAMWQ/2tYj+8H6Ug/f5h3AfsAbyc7raOCpiJgG/IiVV+nelbq8f0V2ZWQM8Dbg05IOzh3vCLIkZYNU/7vAdyNiPeANZF+Sg9lAiuU3A0+nWyIWSfqVpC2r7ZjW/4bsf7CjU9tm1zhuPf8NfDy1bWfgtxHxInAo8HisvJr8OHAqcCSwP1nC8gxZr1Xe/sAO6TyPJ/sxsQVZj9AUssTHGoiIP5H9gNq3clu6yvwhsgsaC9PqO4DPSvqkpDdJUq7KW8iSxZ83eNsPAbMi4mdkifpxVfb5GnBUiutGisTLJLL/Pt8BnJa7ZbPm91y6bahWOS3tthNwT0T2iy65J62v5ZOSnk63Gx1V4PysIMdz0/Fsg1W7M6RWFrIfR0vIur+HVGzbiuwqwJjcuqeAo3PLPyNd2SD7gXRibtsQsh/0Y2u89zPALun1GcCNuW07AkvT623JeiDeDgwreF51e1DIEo57gS/W2ecB4Iga24r2oBwI/I3sx2bl53sJq1412gt4tGKfL5Ku6KTP6JaK7bcAZwKj2h1L7S4DLZZT3DxL1sszAjgX+GONfb8I/LzGttfijMY9KI8CHwfWq9jnAGBBxbr7yfX+AK8n6xVcI/d5b5Pb/hGyHp7x7Y6VTi5UueKc1t8BnJ77m76c4uPlVI7L7TsUOBn4I/AK2VXc49O244AnCrTjwdx/D18E7q4WD8BZwJXpdb0rzkXiZfvc9rOA/06vm/qeA74ETK9Y9yPgjBr7706WRK9BdoHqeeCt7Y6NMhbHc9/Hc0U73IMyiMpg7EGZQnaF/6KKKxPdFuZeL62y3D2GYyzw3e5sn+xeT5H1BiDpnyTdr2yA+rNkV1NH5Y71RO71S8AISWtExFzg02Q//BZJmi5ps16eK5LWIuupuCMi/qPOrluQdd/2WkT8luzq+fnAwjSwbb0au48FNstfMSG7ir9Jbp/5FXVOJPvb/VXSTEmHN9PeAWAgxfJSsqRjZkS8TPY/tL0lrV9l36ZjNTmK7AfZI5JulvSWOvuOBX6e+4zuB7qoHa+Xkd0qN13ZRBVnSRrWB20eLMaQxWG3syNiA2AtsvvzvynpUICI6IqI8yPirWS9eF8DLpa0A1liPqreuCBJbwW2JusFhKyX8E2qPrD8G8DBknZp0P6exssjZFemofnvuRfILkrlrUeWeKwmIu6KiKciYnlEXEuWzLynh+9p9Tme/f/tfnfwpJExYZcRdYuk69rdzqIGY4KyiOx2on3JBpb11nyy20M2yJW1IuI2SfuS3Xv/fmDD9EX0HNmPvoYi4scRsQ/Zl0KQfYn0mKQ1gV+Q3Y/88QLn84YCh32R7J7ZbpvmN0bEuRGxB9ntBG9k5QD+/O0G3e/3UMXnt25EHJY/XMWxH4yIY4CNyT6TqySNLNDmgWogxfI9rPr37n5d7X16FauSKmN1ZkQcQRZPv2DlrQeVsdr9nodWfEYjIuKxKm0mIpZFxJkRsSOwN3A42W0X1oCkPcl+0N1auS0yfyG7uvzOKtuXRsT5ZL18OwK3k12hPrLOWx5PFmezJT0B/G9av9rfKyKeIrtd8qsNTqNIvGyRe70l2ZXyut9zWnWmrcryL+lYc4DxFRctxqf1RQQF//u2xhzPTcezFfTU0138acaWdQurXlzsaIMxQSGye8oPBA5RxfR9PTAV+KKknQAkrS/pfWnbumQD054E1pD0ZVa/olWVpO0kHZiSi5fJrixXnVFF0hBlYwGGZYsaoTRTS7pae1Wq/6GIWNHgrS8CvippnDLjJW1UZb/ZwHskra3s2Sgn5tqzp6S90nu/mNrf3faFwDa54/wJWCLpC5LWkjRU0s7py7zWZ/MBSaPTuTybVg/q2WYGSiwDPwDerWx61GFkt6ncGhHPVtn3R8DbJb1f0hqSNqpxdfBuYKd0zBFkPTndbRsu6ThJ60fEMrLb5fKxulFF781U4GuSxqb6oyVVzpqTP/dJyu4fH5qOvazOuRsgab10dXU6cHlE3Ftjv+3JJkmYk5Y/rWzq1LVSPBxPFrd/jojngC8D50s6Mn1vDZN0aOrVGkGWfJ9ENpapu3wKOK7GlepvkyWdO9Q5nSLx8qXUnp3IJrS4Mu1b83suVp1pq7J8Pe37+7T/qZLWlHRKWv/bGp/neyWtk/5/8g7gA2RjyawJjuc+i+fu7+sRZInXsPRbZ1D+fq0nCJbF8rqlVJq9R6xMhYr7Q8m6QecD/8HK+yjXyG1fAByQW74c+Nfc8gfJxnYsSce5OK0fSjYAdwnZjBb/nH9vsh9Kl+eO89p7k13p+hNZd/zTZDN6bFbjfA5I9fLl92nb/mn5JVbORvQCuRmYKo41FPhXssF6z5MNjN48bcvftz+KbPrX58mu+pzByjEobyO7Ev4C2cxSPwLWSdvGkSU3zwK/SOs2IxuM/QTZFaI7an1Guc9/UTr+HODIdseUY7lvYjnV/QRZb98zZLclblFn333Jrgx2t/f4tP4SVh3rdHqKxflkP7yCbGzMcLJpjZ9Jx5gJ7JOrdzErZ+DqnsXrs2RjtZ4nu8Xs65XnnKt/TNr3RbKE59z8dpdV4nhp+kyfI7tCfDIwNLfPJaycAfBFsrFDXyeNvSLrHb4z1X82xdzhFe9zHNlMQi+Sfd/8muxH2eQU18Mq9h+R4uZwqo9J+uf0Nz+hxnkViZfuWY+eAP654r/Npr7ngN3SZ7KUbGrW3So+izm55T+kz24JWVI/ud1xUdbieO63eP49q//WOaDdf+9OK7uNHx4vPb5V3UI2eULb21qkKP3xzczMrJ8pe/bEQ2Q/Ikt2SdNsVY7nzrHbLsPj5t9sWnef9cfMvzMiJrSoSU3xA8XMzMzMzEosgGU0upu/PJygmJmZmZmV3Iqqc7yUkxMUMzOzFomIh/EsWTZAOJ47RwDLBtCwDScoZmZmZmYlFgRd7kFpr+FaM0ZQ7sdfvHH8S+1uwqB35z2vLI6I0a18T8euNevh+ctY/HRXS69YDoS4Xb5xuds/EH53LH1ygb9zrZSe55mWx25PRcCyAfA90a2UCcoIRrKX3tbuZjRlxozZ7W7CoDf09XMfafV7OnatWRMPnt94pz42EOJ24TF7t7sJTdEAeIrOPed91t+5Vko3xlUtj92eE10D6G67UiYoZmZmZmaWycagOEExMzMzM7MOkCUoQ9rdjD7jBMXMzMzMrMQCfIuXmZmZmZl1hkAsi6HtbsZqJI2MiBd7Wm/g9AWZmZmZmQ1C3T0o9UorSdpb0n3A/Wl5F0kXFK3vHhQzMzMzsxLLelA66mf9d4CDgWsAIuJuSfsVrdxRZ2JmZmZmZj3XaWNQImK+tEqbCk+Y7gTFzMzMzKzEIjpuDMp8SXsDIWk4cCrpdq8iPAbFzMzMzKzEAvFqrFG3NCLpEEkPSJor6bQa+xwgabakOZJurnO4KcDJwBhgAbBrWi7EPShmZmZmZiUWwIom+h0kDQXOBw4iSyhmSromIu7L7bMBcAFwSEQ8Kmnjmu2JWAwc19v2uAfFzMzMzKzEsh6UoXVLAxOBuRExLyJeBaYDR1TscyxwdUQ8ChARi2odTNKlKaHpXt5Q0sVFz8cJipmZmZlZya2IIXULMErSrFw5KVd9DDA/t7wgrct7I7ChpN9LulPSh+o0Z3xEPNu9EBHPALsVPRff4mVmZmZmVmIrUg9KA4sjYkKNbdWmAIuK5TWAPYC3AWsBt0u6IyL+VqXuEEkbpsQESa+jB3mHExQzMzMzs5JrZgwKWY/JFrnlzYHHq+yzOD0Z/kVJtwC7ANUSlG8Bt0m6Ki2/D/ha0cb4Fi8zMzMzsxLrnma4XmlgJjBO0tZpWuDJpIcs5vwS2FfSGpLWBvaixtTBEfFD4L3AQmAR8J6IuKzo+bgHxczMzMysxAIKTSVcs37EckmnADOAocDFETFH0pS0fWpE3C/pOuAeYAVwUUT8pc5h/wo8Q8o3JG3ZPcC+EScoZmZmZmYlFogV0dyT5CPiWuDainVTK5a/CXyz0bEkfQr4ClkPShfZGJcAxhdpixMUMzMzM7MSC2BZEz0o/eAfge0i4qneVO6oMzEzMzMzs54SXVUn4mqb+cBzva3cJwmKpC7gXmAYsBy4FDgnIlak7ROBs4FNyJK8W4FTgfeTdRMtANYB5gFnRsRtfdEuMzMzM7OBLutBaTgQvpXmAb+X9Gvgle6VEfHtIpX7qgdlaUTsCpAee/9jYH3gK5I2AX4KTI6I2yUJOApYN9W9MiJOSXUnAVdLmhQRVWcFMDMzMzOzlSLU/TDGTvFoKsNT6ZE+v8UrIhalJ1POlHQGcDJwaUTcnrYHcBVAlqusUvd3kqYBJwGf6eu2mZmZmZkNNJ3WgxIRZwJIGpmem9Ij/ZJqRcS8dOyNgZ2BO3tQ/S5g+8qVkk6SNEvSrGUre4rMOp5j18rIcWtl5di1wSgQy1YMrVtaSdJbJN1Hek6KpF0kXVC0fn/2BfV2pE7VehExLSImRMSEYazZRLPMWsuxa2XkuLWycuzaYNXFkLqlxc4BDgaeAoiIu4H9ilbul9ZK2oZszuNFwBxgjx5U340aT6U0MzMzM7NVBWJ5DK1bWt6miPkVq7qK1u3zBEXSaGAqcF4ab3IecLykvXL7fEDSplXq7k82/uT7fd0uMzMzM7OBKAK6QnVLi82XtDcQkoZL+hw96IDoq0Hya0mazcpphi8Dvg0QEQslTQbOTjN8rQBuAa5OdY+WtA+wNvAQcJRn8DIzMzMzKyYQy1s8zqSBKcB3gTFkjxO5nmzirEL6JEGJqN9vlGbw2rfKpktSMTMzMzOzXuqkBzVGxGLguN7W95PkzczMzMxKrNN6UCSdW2X1c8CsiPhlo/od9UQXMzMzMzPrmQhYFkPqlhYbAewKPJjKeOB1wImSzmlU2T0oZmZmZmYl12FPkt8WODAilgNI+h7ZOJSDgHsbVXaCYmZmZmZWYtk0wx2VoIwBRpLd1kV6vVlEdElq+ATVjjoTMzMzMzPrmQBWhOqWRiQdIukBSXMlnVZl+wGSnpM0O5Uv1zncWcBsST+QdAnwZ7IZfUcCNzZqi3tQzMzMzMzKLJobJC9pKHA+2S1YC4CZkq6JiPsqdv1DRBze4Fgiu53rWmAiIOBfIuLxtMvnG7XHCYqZmZmZWYkFsKK5aYYnAnMjYh6ApOnAEUBlgtK4LREh6RcRsQfQcMauanyLl5mZmZlZiQWwfMWQugUYJWlWrpyUO8QYYH5ueUFaV+ktku6W9BtJO9Vp0h2S9uzt+bgHxczMzMysxLLnoDTsd1gcERNqbKvW/RIVy3cBYyPiBUmHAb8AxtU43iRgiqSHgRfT8SMixjdqJDhBMTMzMzMrvSZv8VoAbJFb3hx4PL9DRCzJvb5W0gWSRqWnxlc6tJnG+BYvMzMzM7MSiyh0i1c9M4FxkraWNByYDFyT30HSpmkAPJImkuURT1VvTzxClvAcmF6/RA/yDvegmJmZmZmVXJGphGuJiOWSTgFmAEOBiyNijqQpaftU4L3AJyQtB5YCkyOi8jYwACR9BZgAbAf8ABgGXA68tUh7nKCYmZmZmZVYILoa95LUP0bEtWRTA+fXTc29Pg84r+Dh3g3sRjZuhYh4XNK6RdviBMXMzMzMrOSaHIPS115N0w0HQHpAY2FOUMzMzMzMSiyCpntQ+thPJF0IbCDpY8BHgO8XrewExczMzMys1Jq/xasvRcTZkg4ClpCNQ/lyRNxQtL4TFDMzMzOzEguaGyTf1yR9BvhpT5KSPCcoZmZmZmZlFtDVQQkKsB4wQ9LTwHTgqohYWLRy5/QFmZmZmZlZjwUQobqlpe2JODMidgJOBjYDbpZ0Y9H6pexBeeP4l5gxY3a7m9GUgzfbtd1NaMr3H7213U0oJcdu+716w9h2N6EpD7x8ecvfc/nGI1l4zN4tf9++tMm5t7W7CU156qNvaXcTzKyjia4VHdWD0m0R8ATZAx03LlrJPShmZmZmZiXXST0okj4h6ffATcAo4GMRMb5o/VL2oJiZmZmZWaYDpxkeC3w6Inp124gTFDMzMzOzklvRAbd4SVovIpYAZ6Xl1+W3R8TTRY7jBMXMzMzMrMSC1t/GVcOPgcOBO8nG7ucbFcA2RQ7iBMXMzMzMrMyiM56DEhGHp3+3buY4TlDMzMzMzMou2t0AkLR7ve0RcVeR4zhBMTMzMzMruU4YgwJ8K/07ApgA3E12m9d44H+BfYocpKOG+5uZmZmZWc/0xYMaJR0i6QFJcyWdVme/PSV1SXrvau2ImBQRk4BHgN0jYkJE7AHsBswtej5OUMzMzMzMyiwgVqhuqUfSUOB84FBgR+AYSTvW2O8bwIwGLdo+Iu59rXkRfwEKP+nZCYqZmZmZWdlFg1LfRGBuRMyLiFeB6cARVfb7FPAzsifE13O/pIskHSBpf0nfB+4veioeg2JmZmZmVmqNe0mAUZJm5ZanRcS09HoMMD+3bQGw1yrvII0B3g0cCOzZ4L0+DHwC+Me0fAvwvUYN7OYExczMzMyszNItXg0sjogJNbZVq1zZ73IO8IWI6JLqv1dEvAx8J5Uec4JiZmZmZlZ6Tc3itQDYIre8OfB4xT4TgOkpORkFHCZpeUT8opk3rsYJipmZmZlZ2a1oqvZMYJykrYHHgMnAsfkd8g9flHQJ8D/9kZyAExQzMzMzs3ILoIknyUfEckmnkM3ONRS4OCLmSJqStk/tk3YW5ATFzMzMzKzkorkeFCLiWuDainVVE5OIOKHesSS9Efg8MJZcvhERBxZpixMUMzMzM7Oya6IHpR/8FJgKfB/o6mllJyhmZmZmZmUWoCZ7UPrY8ogoPK1wJScoZmZmZmalJmg8zXAr/UrSJ4GfA690r4yIp4tUdoJiZmZmZlZ2jZ8W30rHp38/n1sXwDZFKhdOUCR1AfcCw4DlwKXAORHZkBxJE4GzgU1SA24FTgXeD3yTbH7ldYB5wJkRcVuq903gXcCrwN+BD0fEs0XbZWZmZmY2qAUd1YOSn5K4N4b0YN+lEbFrROwEHAQcBnwFQNImZINhvhAR2wE7ANcB66a6V0bEbhExDvhP4GpJO6RtNwA7R8R44G/AF5s5ITMzMzOzwUZRv7S0LdIwSadKuiqVUyQNK1q/JwnKayJiEXAScIqyx0meDFwaEben7RERV0XEwip1fwdMS/WJiOsjYnnafAfZkyvNzMzMzKyoaFBa63vAHsAFqeyR1hXS6zEoETFP0hBgY2Bnslu+iroL+HiV9R8BrqxWQdJJpKRmyzEeOmPl4di1MsrH7bB1N2xza8yKy8fuCNZuc2vMWqfVvSQN7BkRu+SWfyvp7qKVe9WDktPbm91WqyfpdLKxLT+qViEipkXEhIiYMHqjob18W7PWc+xaGeXjdo21Rra7OWaF5WN3GGu2uzlmrdE9BqVeaa0uSW/oXpC0DT14HkqvL+fm3mgRMIes6+aXBavvBtyfO9bxwOHA2yKis/I/MzMzM7MO12HPQfk88DtJ88g6JsYCHy5auVcJiqTRZE+HPC8iQtJ5wJ8k/Toi/jft8wHgxip19yfrep2Ulg8BvgDsHxEv9aY9ZmZmZmaDWgdd4o+ImySNA7YjS1D+GhGvNKj2mp4kKGtJms3KaYYvA76dGrFQ0mTgbEkbAyuAW4CrU92jJe0DrA08BBwVEd09KOcBawI3ZOPtuSMipvSgXWZmZmZmg5Y670nykN1dtRVZvrGLJCLih0UqFk5QIqLuzfNpBq99q2y6JJVa9bYt2gYzMzMzM6siOuc5KJIuA94AzGbl2JMA+jZBMTMzMzOzztRhPSgTgB17O7a82Vm8zMzMzMys3TrrOSh/ATbtbWUnKGZmZmZmZZbGoNQrjUg6RNIDkuZKOq3K9iMk3SNptqRZaXx5LaOA+yTNkHRNdyl6Or7Fy8zMzMys5Jq5xUvSUOB84CBgATBT0jURcV9ut5uAa9IMvuOBnwDb1zjkGb1vjRMUMzMzM7PBbiIwNyLmAUiaDhwBvJagRMQLuf1HUufGsYi4uZnG+BYvMzMzM7MyK3aL16h0a1Z3OSl3hDHA/NzygrRuFZLeLemvwK+Bj9RqjqT3SHpQ0nOSlkh6XtKSoqfjHhQzMzMzs7JrPBB+cURMqLGt2hzFqx0xIn4O/FzSfsBXgbfXON5ZwLtyzz3sEfegmJmZmZmVmGh6kPwCYIvc8ubA47V2johbgDdIGlVjl4W9TU7APShmZmZmZuXX3FTCM4FxkrYGHgMmA8fmd5C0LfD3NEh+d2A48FSN482SdCXwC+CV15oYcXWRxjhBMTMzMzMrs2huFq+IWC7pFGAGMBS4OCLmSJqStk8FjgI+JGkZsBQ4us6DGNcDXgLesWorcYJiZmZmZjYYNPsk+Yi4Fri2Yt3U3OtvAN8oeKwPN9MWJyhmZmZmZmXX+qfF1yRpBHAisBMwont9RNSc+SvPg+TNzMzMzMqsD54k38cuAzYFDgZuJht0/3zRyk5QzMzMzMzKLhqU1to2Ir4EvBgRlwLvBN5UtLJv8TIzMzMzK7k29JLUsyz9+6yknYEngK2KVnaCYmZmZmZWZu3pJalnmqQNgS8B1wDrAF8uWtkJipmZmZlZiQlQByUoEXFRenkzsE1P6ztBMTMzMzMruU66xUvSmmTPTdmKXL4REf9WpL4TlDb5/qO3trsJTfnYlvu0uwl94Kp2N6CUXr1hbLub0JThBz3S7iY0RfFq6980QF2tf9u+9NRH39LuJjRlo4tub3cTzKzTdVAPCvBL4DngTnJPki/KCYqZmZmZWZk1+ST5frB5RBzS28qeZtjMzMzMrOQU9UuL3Sap8LTCldyDYmZmZmZWcp3QgyLpXrKbzdYAPixpHtktXgIiIsYXOY4TFDMzMzOzMuucaYYP74uDOEExMzMzMysx0Rk9KMCTwLKIWAYgaTvgMOCRiLi66EE8BsXMzMzMrOS0IuqWFrmO9MR4SdsCt5M9B+VkSf9R9CBOUMzMzMzMyiwKlAYkHSLpAUlzJZ1WZftxku5J5TZJu1Q5zIYR8WB6fTxwRUR8CjiUHtz+5QTFzMzMzKzktKJ+qVtXGgqcT5ZI7AgcI2nHit0eAvZPA92/Ckyrcqh8KnQgcANARLwKFL4JzWNQzMzMzMxKrsmphCcCcyNiHoCk6cARwH3dO0TEbbn97wA2r3KceySdDTwGbAtcn463QU8a4x4UMzMzM7Myi+Z6UIAxwPzc8oK0rpYTgd9UWf8xYDHZOJR3RMRLaf2OwNkFzgRwD4qZmZmZWfk17kEZJWlWbnlaRHTfpqWiR5Q0iSxB2We1ChFLgf+s2H/31PtyW+X+tThBMTMzMzMrsWya4YYZyuKImFBj2wJgi9zy5sDjq72PNB64CDg0Ip4q2LyLgN0L7gv4Fi8zMzMzs3Jr/havmcA4SVtLGg5MBq7J7yBpS+Bq4IMR8bcetK5a70xd7kExMzMzMyu5Zh7UGBHLJZ0CzACGAhdHxBxJU9L2qcCXgY2ACyQBLK/TI5N3Zk/b4wTFzMzMzKzkmn2SfERcC1xbsW5q7vVHgY8Wbo80BnBjWbwAAAt1SURBVBgLPC1pv3SMW4rUdYJiZmZmZlZmAUTLnhbfkKRvAEeTTVPclVYH4ATFzMzMzGwwaLYHpY8dCWwXEa/0prITFDMzMzOzEhNNP6ixr80DhgFOUMzMzMzMBp2IItMMt9JLwGxJN5FLUiLi1CKVnaCYmZmZmZVch93idQ0V0xT3hBMUMzMzM7MyC6CDelAi4lJJawFbRsQDPa3f8EGNkkLSt3LLn5N0RsU+d0u6omLdJZJekrRubt130/FGpeUuSbMlzUnH+KwkPzzSzMzMzKwHmnxQY9+2RXoXMBu4Li3vKqlwj0qRZOAV4D3dSUWVBuyQjrOfpJEVm+cCR6T9hgCTgMdy25dGxK4RsRNwEHAY8JWijTczMzMzM7JphuuV1joDmAg8mzUtZgNbF61cJEFZDkwDPlNj+7HAZcD1wD9UbLuCbA5kgAOAP6bjrSYiFgEnAacoPZ7SzMzMzMwaiM7qQSF7yvxzq7eymKK3U50PHCdp/SrbjgauJEtGjqnY9iAwWtKGadv0em8SEfNSmzau3CbpJEmzJM168qmu1SubdSjHrpVRPm6XL32x3c0xKywfu8t6N8OpWelk0wxH3dJif5F0LDBU0jhJ/wXcVrRyoQQlIpYAPwRWmRpM0p7AkxHxCHATsHtKRvKuBiYDewF/KPB2VXtPImJaREyIiAmjNxpapNlmHcGxa2WUj9s11qq8e9esc+Vjdxhrtrs5Zi2jrqhbWuxTwE5kQ0WuAJYAny5auSezeJ0D3AX8ILfuGGB7SQ+n5fWAo4CLcvtMT/UujYgV9e7ekrQN0AUs6kG7zMzMzMwGr4hOm8XrJeB04HRJQ4GREfFy0fqFZ8yKiKeBnwAnwmuD3t8HjI+IrSJiK7IB8cdU1Hs0NfCCeseXNBqYCpwX0fp+KDMzMzOzslLULy1ti/RjSeulCbTmAA9I+nzR+j2d0vdbQPdsXvsBj0VEflauW4AdJb0+XykiLoyIv1c53lrd0wwDN5INtD+zh20yMzMzMxu8ouNu8doxDRE5ErgW2BL4YNHKDW/xioh1cq8XAmvnNr+5Yt8uoDs5OaHG8bbKvfYN+WZmZmZmzeqsG5CGSRpGlqCcFxHLpOL9OH4oopmZmZlZyWlF1C0N60uHSHpA0lxJp1XZvr2k2yW9IulzDQ53IfAwMBK4RdJYsoHyhfRkkLyZmZmZmXWiJnpQ0kD288kenL4AmCnpmoi4L7fb02Qz+h7ZuClxLnBubtUjkiYVbY8TFDMzMzOzElM0Pc5kIjA3PZMQSdPJJr96LUFJD1VfJOmdDdsjrUk2s+9WrJpv/FuRxjhBMTMzMzMruxUNHxc/StKs3PK0iJiWXo8B5ue2LSB7hmFv/RJ4DrgTev7EVCcoZmZmZmZlFkDD/ITFETGhxrZqDypspktm84g4pLeVPUjezMzMzKzktGJF3dLAAmCL3PLmwONNNOc2SW/qbWX3oJiZmZmZlVo0O83wTGCcpK2Bx4DJwLFNHG8f4ARJD5Hd4iUgImJ8kcpOUMzMzMzMyiyAJgbJR8RySacAM4ChwMURMUfSlLR9qqRNgVnAesAKSZ9m5QMZKx3a68bgBMXMzMzMrPTU5IMaI+Jasqe+59dNzb1+guzWryLHekTSPsC4iPiBpNHAOo3qdXOCYmZmZmZWZgF0NR4l3yqSvgJMALYDfgAMAy4H3lqkvhMUMzMzM7NSiyLTDLfSu4HdgLsAIuJxSesWrewExczMzMys7Jq8xauPvRoRISkAJI3sSWUnKGZmZmZmZRYBXV3tbkXeTyRdCGwg6WPAicBFRSs7QTEzMzMzK7sO6kGJiLMlHQQsAd4I/GtE3Fi0vhMUMzMzM7My65BB8pKeZ+UT6PNPp58i6WXg78DpEXFTveM4QTEzMzMzK7sO6EGJiJoD4SUNBXYGfpT+rckJipmZmZlZmXXeGJTVREQXcLek/2q0rxMUMzMzM7Oy66xphmuKiAsb7aPogO6gnpL0JPBIP77FKGBxPx6/Fcp+Dq1o/9iIGN3P77EKx25DZW8/9P85DMS4hfL/7cvefnDs9ob/7u03IH8v9NT6w0bH3hscVXef6xZfeGdETGhRk5pSyh6U/g4SSbPK8gespeznUPb21+LYra/s7YeBcQ6VWvE/5rJ/bmVvPwyMc6jk79zGyn4OZW9/nwmIDr/FqydKmaCYmZmZmVlOCe+KqsUJipmZmZlZmZVgkHxPOEGpblq7G9AHyn4OZW9/u5T9cyt7+2FgnEM7lP1zK3v7YWCcQ6sNhM+s7OdQ9vb3mSjJIPkiSjlI3szMzMzMMusP2SjevOZhdfe5/uXLSzNIfki7G2BmZmZmZk2KFfVLA5IOkfSApLmSTquyXZLOTdvvkbR7v5wHvsXLzMzMzKzUIqKpWbzSU97PBw4CFgAzJV0TEffldjsUGJfKXsD30r99blD1oEjqkjRb0hxJd0v6rKQhue0TJd2Ssse/SrpI0tqSTpD0pKQ/S3pQ0gxJe5e5zZK+mfa/R9LPJW3QT+0PSd/KLX9O0hkV+9wt6YqKdZdIeknSurl1303HG5WW6342A4lj17FbRmWM2/5sdyti13HbN8oYu2WO2/Q+jt0mRVdX3dLARGBuRMyLiFeB6cARFfscAfwwMncAG0h6fd+fCSnjGiQFeCH3emPgRuDMtLwJ2cOc3pKWBbw3rT8BOC9XdxLwBLBDWdsMvANYI73+BvCNfmr/y8BDwKi0/DngjNz2HYB7gceAkbn1lwD3AB9Iy0PS8oLcsWp+NgOtOHYdu2UsZYzbsseu47azY6CMbW5F3Dp2++Tzuw6Y1aD8pWL5pFz99wIX5ZY/mI+LtO5/gH1yyzcBE/rjfAZc9lhURCwCTgJOkSTgZODSiLg9bY+IuCoiFlap+zuyWSNOKmubI+L6iFieNt8BbN5PzV6e3vczNbYfC1wGXA/8Q8W2K4Cj0+sDgD+m462mymczYDl2HbtlVMa4Te9dtth13PaxMsZuCeMWHLtNiYhDImJCg7JzxXJ+BrRqn0XlTFpF9ukTgzZBAYiIeWSfwcbAzsCdPah+F7B9f7Srnn5q80eA3zTfuprOB46TtH6VbUcDV5J9uRxTse1BYLSkDdO26fXepOKzGdAcu69x7JZIGeMWShm7jts+VsbYLWHcgmO3nRYAW+SWNwce78U+fWJQJyhJb7PndmbdfdZmSaeTXWX4UVMtqiMilgA/BE6teO89gScj4hGybsLd05dL3tXAZLJBWH8o8HYD5mpIAY5dx24ZlTFum3n/lseu47bflDF2SxO34Nhts5nAOElbSxpO9lleU7HPNcCHlHkz8FxE/F9/NGZQJyiStgG6gEXAHGCPHlTfDbi/P9pVT1+2WdLxwOHAcRHR3w/EOQc4ERiZW3cMsL2kh4G/A+sBR1XUmw58Fbghov4ceRWfzYDm2HXsllEZ4xZKG7uO2z5UxtgtadyCY7ct0m18pwAzyP72P4mIOZKmSJqSdrsWmAfMBb4PfLI/GzRoCqsOkhpNdh9j5QCyvXL7fADYlNUHkO1Pewa99VmbgUOA+4DRLWz/WcCjwBlkyfF8YExu+yTgpvT6EuC96fXHgTek1w9TfdDbKp/NQCuOXcduGUsZ47bsseu47ewYKGObWxG3jl2XyjLYnoOylqTZwDCybsrLgG8DRMRCSZOBsyVtDKwAbiHrMgQ4WtI+wNpks0wcFRGtuCLSX20+D1gTuCGNEbsjIroz5P7yLbLsHGA/4LGIeCy3/RZgx8op6yLiwhrHq/nZDECOXcduGZUxbvuz3a2OXcdt75UxdgdK3IJjd9BTRH/31JmZmZmZmRUzqMegmJmZmZlZZ3GCYmZmZmZmHcMJipmZmZmZdQwnKGZmZmZm1jGcoJiZmZmZWcdwgmJmZmZmZh3DCYqZmZmZmXWM/wdgPIGUDGvWAwAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "titles = ['Kmeans 12 clusters', 'Kmeans 6 clusters', 'DBSCAN eps=0.5', 'DBSCAN eps=1']\n", "fig2, axes = plt.subplots(1, 4, sharey=True, figsize=(15, 3))\n", "for i, (data, title) in enumerate(zip(ces2, titles)):\n", " imi = axes[i].imshow(data, vmax=np.log(2), vmin=0)\n", " axes[i].set_xticks(np.arange(3))\n", " axes[i].set_xticklabels(labels)\n", " axes[i].set_title(title)\n", "plt.yticks(np.arange(3), labels)\n", "cbar2 = fig2.colorbar(imi, ax=axes.ravel().tolist())\n", "cbar2.set_label('Jensen-Shannon divergence')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As can be seen, reducing the number of clusters in the K-means method emphasises that DCD2 is more similar to the NAMD trajectory than DCD. Meanwhile, increasing `eps` in DBSCAN clearly lowered the density required to form a cluster so much that every trajectory is in the same cluster, and therefore they have identical probability distributions." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of clusters in DBSCAN eps=1: 1\n" ] } ], "source": [ "n_db = len(details2['clustering'][-1])\n", "\n", "print('Number of clusters in DBSCAN eps=1: {}'.format(n_db))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Estimating the error in a clustering ensemble similarity analysis" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`encore.ces` also allows for error estimation using a bootstrapping method. This returns the average Jensen-Shannon divergence, and standard deviation over the samples. " ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "avgs, stds = encore.ces([u1, u2, u3],\n", " selection='name CA',\n", " clustering_method=clustering_method,\n", " estimate_error=True,\n", " ncores=4)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0. , 0.68394378, 0.69314718],\n", " [0.68394378, 0. , 0.68695471],\n", " [0.69314718, 0.68695471, 0. ]])" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "avgs" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0.00000000e+00, 7.79984746e-03, 8.59975057e-17],\n", " [7.79984746e-03, 0.00000000e+00, 7.58419318e-03],\n", " [8.59975057e-17, 7.58419318e-03, 0.00000000e+00]])" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "stds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References\n", "\n", "[1] Oliver Beckstein, Elizabeth J. Denning, Juan R. Perilla, and Thomas B. Woolf.\n", "Zipping and Unzipping of Adenylate Kinase: Atomistic Insights into the Ensemble of OpenClosed Transitions.\n", "Journal of Molecular Biology, 394(1):160–176, November 2009.\n", "00107.\n", "URL: https://linkinghub.elsevier.com/retrieve/pii/S0022283609011164, doi:10.1016/j.jmb.2009.09.009.\n", "\n", "[2] Richard J. Gowers, Max Linke, Jonathan Barnoud, Tyler J. E. Reddy, Manuel N. Melo, Sean L. Seyler, Jan Domański, David L. Dotson, Sébastien Buchoux, Ian M. Kenney, and Oliver Beckstein.\n", "MDAnalysis: A Python Package for the Rapid Analysis of Molecular Dynamics Simulations.\n", "Proceedings of the 15th Python in Science Conference, pages 98–105, 2016.\n", "00152.\n", "URL: https://conference.scipy.org/proceedings/scipy2016/oliver_beckstein.html, doi:10.25080/Majora-629e541a-00e.\n", "\n", "[3] Naveen Michaud-Agrawal, Elizabeth J. Denning, Thomas B. Woolf, and Oliver Beckstein.\n", "MDAnalysis: A toolkit for the analysis of molecular dynamics simulations.\n", "Journal of Computational Chemistry, 32(10):2319–2327, July 2011.\n", "00778.\n", "URL: http://doi.wiley.com/10.1002/jcc.21787, doi:10.1002/jcc.21787.\n", "\n", "[4] Matteo Tiberti, Elena Papaleo, Tone Bengtsen, Wouter Boomsma, and Kresten Lindorff-Larsen.\n", "ENCORE: Software for Quantitative Ensemble Comparison.\n", "PLOS Computational Biology, 11(10):e1004415, October 2015.\n", "00031.\n", "URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004415, doi:10.1371/journal.pcbi.1004415." ] } ], "metadata": { "kernelspec": { "display_name": "Python (mdanalysis)", "language": "python", "name": "mdanalysis" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }