{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Visualising time series data" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Step 1: Import emergency department reattendance data.** \n", "\n", "This is a time series from a hospital that measures the number of patients per month who reattended an ED within 7 days of a previous attendance.\n", "\n", "The data can be found in **\"data/ed_reattend.csv\"**\n", "\n", "* Hint 1: Look back at the lecture notes and see how `pd.read_csv()` was used. \n", "\n", "* Hint 2: The 'date' column uses the UK standard format dd/mm/yyyy. You will need to pass `dayfirst=True` to `pd.read_csv()` to make sure pandas interprets the dates correctly.\n", "\n", "* Hint 3: The data is monthly and every date is the first day of its month. This frequency is called month start, and its pandas shorthand is 'MS'." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "url = 'https://raw.githubusercontent.com/hsma-master/hsma/master/12_forecasting/data/ed_reattend.csv'\n", "reattends = pd.read_csv(url, index_col='date', \n", " parse_dates=True, dayfirst=True)\n", "reattends.index.freq = 'MS'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Step 2: Check the shape of the `DataFrame` and print out the first 5 observations**" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(43, 1)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "reattends.shape" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
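<table>\n", "<thead>\n", "<tr><th></th><th>reattends</th></tr>\n", "<tr><th>date</th><th></th></tr>\n", "</thead>\n", "<tbody>\n",
"<tr><th>2014-04-01</th><td>1094</td></tr>\n",
"<tr><th>2014-05-01</th><td>1266</td></tr>\n",
"<tr><th>2014-06-01</th><td>1170</td></tr>\n",
"<tr><th>2014-07-01</th><td>1239</td></tr>\n",
"<tr><th>2014-08-01</th><td>1197</td></tr>\n",
"</tbody>\n", "</table>\n", "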