ML-Kurs-SS2023/notebooks/03_ml_basics_tf_binary_classification_example.ipynb

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "344b183c",
   "metadata": {},
   "outputs": [],
   "source": [
    "#\n",
    "# train a simple TensorFlow model to perform binary classification on a generated\n",
    "# 2-dimensional dataset \n",
    "# 02/2023\n",
    "# "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "92c9d0a1",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import tensorflow as tf"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3814ea1d",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Generate toy data\n",
    "np.random.seed(4321)\n",
    "n_samples = 1000"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "84d8bc5e",
   "metadata": {},
   "source": [
    "Machine learning algorithms generally train best when the input features are of order 1. The data generated here already has that scale, so it does not need to be normalized."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a1d437df",
   "metadata": {},
   "outputs": [],
   "source": [
    "class1_data = np.random.multivariate_normal([-1., -1.], [[1., 0.], [0., 1.]], n_samples)\n",
    "class2_data = np.random.multivariate_normal([1.0, 1.0], [[1., 0.], [0., 1.]], n_samples)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6bbccf56",
   "metadata": {},
   "outputs": [],
   "source": [
    "# The two classes are merged and one-hot labels are assigned:\n",
    "# [1, 0] for class 1 and [0, 1] for class 2\n",
    "train_data = np.concatenate([class1_data, class2_data])\n",
    "# one label column per class (train_data.shape would only work because there are also 2 features)\n",
    "toy_labels = np.zeros((len(train_data), 2))\n",
    "toy_labels[:n_samples, 0] = 1\n",
    "toy_labels[n_samples:, 1] = 1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f8dd0511",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Plot the input data with points colored according to their labels\n",
    "plt.scatter(class1_data[:, 0], class1_data[:, 1], color='red')\n",
    "plt.scatter(class2_data[:, 0], class2_data[:, 1], color='blue')\n",
    "plt.title(\"Input data with points colored according to their labels\")\n",
    "plt.xlabel(\"Feature 1\")\n",
    "plt.ylabel(\"Feature 2\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "75fb1af9",
   "metadata": {},
   "source": [
    "Build the model. The Sequential model is a linear stack of pre-made layers. In a Dense layer every node is connected to every output of the previous layer. The two hidden layers have 32 nodes each, and `input_shape=(2,)` on the first layer means that the input data is 2-dimensional. The activation of the hidden layers is 'relu' (rectified linear unit).\n",
    "Softmax maps the output of the model to a probability distribution over the 2 classes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ad6dcef9",
   "metadata": {},
   "outputs": [],
   "source": [
    "model = tf.keras.models.Sequential([\n",
    "    tf.keras.layers.Dense(32, activation='relu', input_shape=(2,)),\n",
    "    tf.keras.layers.Dense(32, activation='relu'),\n",
    "    tf.keras.layers.Dense(2, activation='softmax')\n",
    "])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a9c38493",
   "metadata": {},
   "source": [
    "The Adam optimizer is a gradient-based optimization algorithm for updating the weights in\n",
    "a neural network. The step size for each weight is adapted using running estimates of the\n",
    "first and second moments of the gradients of the loss function with respect to the weights.\n",
    "\n",
    "The loss function is BinaryCrossentropy since we have 2 classes of data. With two softmax\n",
    "outputs and one-hot labels it yields the same value as CategoricalCrossentropy, because the\n",
    "two outputs sum to 1.\n",
    "\n",
    "The accuracy metric measures the fraction of instances for which the model\n",
    "correctly predicted the class label. It is computed as the number of correct\n",
    "predictions divided by the total number of instances evaluated; here that is the\n",
    "training data, since no separate test set is used."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7f6dc4ef",
   "metadata": {},
   "outputs": [],
   "source": [
    "model.compile(optimizer='adam',\n",
    "#             loss=tf.keras.losses.CategoricalCrossentropy(),\n",
    "              loss=tf.keras.losses.BinaryCrossentropy(),\n",
    "              metrics=['accuracy'])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a6397c0f",
   "metadata": {},
   "source": [
    "The `history` object returned by `model.fit` contains the loss and accuracy for each epoch of the training process.\n",
    "The model is trained by dividing the entire training data into smaller batches\n",
    "of a specified size and updating the model's parameters after each batch.\n",
    "The batch_size parameter determines the number of samples used in each batch,\n",
    "and epochs is the number of passes over the full training data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b98e46ba",
   "metadata": {},
   "outputs": [],
   "source": [
    "history = model.fit(train_data, toy_labels, epochs=20, batch_size=32, verbose=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "806497d2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Plot loss and accuracy\n",
    "plt.figure(figsize=(12, 4))\n",
    "\n",
    "plt.subplot(1, 2, 1)\n",
    "plt.plot(history.history['loss'])\n",
    "plt.title('Loss')\n",
    "plt.xlabel('Epoch')\n",
    "\n",
    "plt.subplot(1, 2, 2)\n",
    "plt.plot(history.history['accuracy'])\n",
    "plt.title('Accuracy')\n",
    "plt.xlabel('Epoch')\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ca6da322",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Plot data points and decision boundary\n",
    "x_min, x_max = train_data[:, 0].min() - .5, train_data[:, 0].max() + .5\n",
    "y_min, y_max = train_data[:, 1].min() - .5, train_data[:, 1].max() + .5\n",
    "# build a regular 2D grid (step 0.02) covering the area of our inputs\n",
    "xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02), np.arange(y_min, y_max, 0.02))\n",
    "# flatten the grid into an array of (x, y) points with shape (n_points, 2)\n",
    "grid = np.c_[xx.ravel(), yy.ravel()]\n",
    "# Get the predicted class probabilities for each point in the grid.\n",
    "# The result Z has shape (n_points, n_classes), where n_points is the number\n",
    "# of grid points and n_classes is the number of classes in toy_labels.\n",
    "Z = model.predict(grid)\n",
    "# convert the predicted probabilities into class labels (index of the larger probability)\n",
    "Z = np.argmax(Z, axis=1)\n",
    "# reshape Z to the shape of the grid so it can be used for the contour plot\n",
    "# of the model's predictions\n",
    "Z = Z.reshape(xx.shape)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "54c02602",
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure(figsize=(8, 8))\n",
    "plt.contourf(xx, yy, Z, cmap=plt.cm.RdBu, alpha=.8)\n",
    "plt.scatter(train_data[:, 0], train_data[:, 1], c=np.argmax(toy_labels, axis=1), cmap=plt.cm.RdBu)\n",
    "\n",
    "plt.show()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}