Data Visualisation
track 01

EDA with Pandas

Systematic exploration before modelling. Understand shape, types, missing values, distributions, correlations, and outliers.

The EDA Checklist

| Step | Code | What you learn |
| --- | --- | --- |
| Shape | df.shape | Number of rows and columns |
| Dtypes | df.dtypes | Numeric, categorical, datetime |
| Missing | df.isnull().sum() | Null count per column |
| Stats | df.describe() | Min, max, mean, std, quartiles |
| Unique vals | df['col'].value_counts() | Category frequencies |
| Correlations | df.corr(numeric_only=True) | Linear relationships between numeric columns |
| Outliers | IQR / Z-score | Extreme values |
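
The checklist can be run top to bottom on any DataFrame. The following is a minimal sketch on a toy dataset; the column names and values are hypothetical and invented purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame({
    "region": ["North", "South", "South", "East", "North", "South"],
    "units": [12, 30, 28, np.nan, 15, 31],
    "price": [9.5, 10.0, 10.5, 9.0, np.nan, 11.0],
    "order_date": pd.to_datetime(
        ["2024-01-03", "2024-01-04", "2024-01-04",
         "2024-01-05", "2024-01-06", "2024-01-07"]
    ),
})

n_rows, n_cols = df.shape                      # Shape
dtypes = df.dtypes                             # Dtypes: one entry per column
missing = df.isnull().sum()                    # Missing: null count per column
stats = df.describe()                          # Stats: min, max, mean, std, quartiles
region_counts = df["region"].value_counts()    # Unique vals: category frequencies
corr = df.corr(numeric_only=True)              # Correlations: numeric columns only

print(f"{n_rows} rows x {n_cols} columns")
print(dtypes)
print(missing)
print(region_counts)
print(corr.round(2))
```

Passing numeric_only=True keeps the correlation step from failing on text or datetime columns in recent pandas versions.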

Outlier Detection

outliers.py
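# Assumes df is an existing DataFrame with a numeric column 'col'.

# IQR method: flag values more than 1.5 * IQR outside the quartiles
q1, q3 = df['col'].quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
iqr_outliers = df[(df['col'] < lower) | (df['col'] > upper)]

# Z-score method: flag values more than 3 standard deviations from the mean
z = (df['col'] - df['col'].mean()) / df['col'].std()
z_outliers = df[z.abs() > 3]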
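
The two methods do not always agree. As a rough sketch, the same rules can be wrapped in two helper functions and compared on a small sample; the function names and the sample values are hypothetical, chosen only to illustrate the difference.

```python
import pandas as pd


def iqr_outliers(s: pd.Series, k: float = 1.5) -> pd.Series:
    """Return values lying more than k * IQR outside the quartiles."""
    q1, q3 = s.quantile([0.25, 0.75])
    iqr = q3 - q1
    return s[(s < q1 - k * iqr) | (s > q3 + k * iqr)]


def zscore_outliers(s: pd.Series, threshold: float = 3.0) -> pd.Series:
    """Return values whose absolute z-score exceeds the threshold."""
    z = (s - s.mean()) / s.std()
    return s[z.abs() > threshold]


# Hypothetical sample: six ordinary readings and one extreme value.
values = pd.Series([10, 12, 11, 13, 12, 11, 100])

print("IQR:", iqr_outliers(values).tolist())
print("Z-score:", zscore_outliers(values).tolist())
```

On this sample the IQR rule flags 100 while the z-score rule flags nothing: the extreme value inflates both the mean and the standard deviation, so its own z-score stays below 3. On small samples the IQR rule is usually the more reliable of the two.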

Interactive Notebook

Notebook: EDA with Pandas
shape, dtypes, missing values, distributions, correlations, outlier detection
First load ~30-60s · Saves automatically

Quiz

Test your understanding of EDA with Pandas -- 10 questions, 70% to pass.

track 02

Matplotlib Deep Dive

Full control over every visual element. Publication-quality charts with the object-oriented API.

Object-Oriented API

matplotlib_oo.py
import matplotlib.pyplot as plt

# Assumes x and y are existing sequences of equal length.
fig, ax = plt.subplots(figsize=(9, 5))
ax.plot(x, y, color='#6b21a8', lw=2, label='data')
ax.set_title('Title', fontweight='bold')
ax.set_xlabel('X Label')
ax.legend()
ax.grid(alpha=0.3)
# Hide the top and right spines for a cleaner frame
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
fig.tight_layout()
plt.show()

Subplots with Gridspec

gridspec.py
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec

fig = plt.figure(figsize=(12, 6))
gs = gridspec.GridSpec(2, 3, figure=fig)  # 2 rows x 3 columns
ax1 = fig.add_subplot(gs[:, 0:2])  # both rows, first two columns
ax2 = fig.add_subplot(gs[0, 2])    # top-right
ax3 = fig.add_subplot(gs[1, 2])    # bottom-right
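
To see the layout in action, each axis can be filled with a different chart. This is a small sketch using synthetic data; the output filename is hypothetical, and the Agg backend is chosen only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import numpy as np

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
y = np.sin(x)
samples = rng.normal(size=500)

fig = plt.figure(figsize=(12, 6))
gs = gridspec.GridSpec(2, 3, figure=fig)

ax_main = fig.add_subplot(gs[:, 0:2])   # both rows, first two columns
ax_top = fig.add_subplot(gs[0, 2])      # top-right cell
ax_bottom = fig.add_subplot(gs[1, 2])   # bottom-right cell

ax_main.plot(x, y, lw=2)
ax_main.set_title("Main panel")
ax_top.hist(samples, bins=30)
ax_top.set_title("Histogram")
ax_bottom.scatter(x[::10], y[::10], s=12)
ax_bottom.set_title("Scatter")

fig.tight_layout()
fig.savefig("gridspec_demo.png", dpi=100)  # hypothetical output path
```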

Interactive Notebook

Notebook: Matplotlib Deep Dive
line charts, subplots, gridspec, annotations, custom formatters

Quiz

Test your understanding of Matplotlib Deep Dive -- 10 questions, 70% to pass.

track 03

Seaborn Statistical Charts

Built-in statistical summaries, beautiful defaults, and FacetGrid for multi-panel plots.

Key Seaborn Plots

| Function | Use | Key params |
| --- | --- | --- |
| histplot() | Distribution + optional KDE | kde=True, hue= |
| boxplot() | Quartiles + outliers by group | x=, y=, hue= |
| violinplot() | Full distribution shape | inner='quartile' |
| regplot() | Scatter + regression line + CI | line_kws=, ci= |
| heatmap() | Matrix of values | annot=True, fmt='.2f' |
| FacetGrid() | Multi-panel by category | col=, row=, hue= |

Interactive Notebook

Notebook: Seaborn Statistical
histplot, boxplot, violin, regplot, heatmap, FacetGrid

Quiz

Test your understanding of Seaborn Statistical -- 10 questions, 70% to pass.

track 04

Plotly Interactive Charts

Hover, zoom, filter, animate. Charts that respond to the user in the browser.

Plotly Express Quick Reference

| Function | Chart type |
| --- | --- |
| px.scatter() | Scatter with optional colour, size, animation |
| px.bar() | Bar chart, supports animation_frame |
| px.line() | Line chart |
| px.histogram() | Histogram |
| px.box() | Box plot |
| px.choropleth() | World map coloured by value |
| px.treemap() | Hierarchical treemap |

Save Interactive Chart

save.py
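# Assumes fig is an existing Plotly figure.
fig.write_html('chart.html')    # interactive chart, shareable as a single file
fig.write_image('chart.png')    # static PNG (requires the kaleido package)
fig.write_image('chart.pdf')    # PDF for reports (requires the kaleido package)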

Interactive Notebook

Notebook: Plotly Interactive
scatter, animated bar, multi-panel subplots, hover, choropleth

Quiz

Test your understanding of Plotly Interactive -- 10 questions, 70% to pass.

track 05

Real-World EDA Project

End-to-end analysis from raw data to actionable business recommendations.

EDA Project Structure

| Phase | Output |
| --- | --- |
| 1. Business question | What decision will this analysis inform? |
| 2. Data understanding | Shape, dtypes, missing values, quality issues |
| 3. Univariate analysis | Distribution of each variable |
| 4. Bivariate analysis | Relationships between pairs of variables |
| 5. Multivariate | Group comparisons, correlation heatmap |
| 6. Insights | Patterns in plain English with numbers |
| 7. Recommendations | Specific, prioritised, actionable next steps |
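
A compressed sketch of phases 2 to 6 on a tiny dataset, ending in an insight written in plain English with numbers. The data is hypothetical, and a real project would spend far longer on each phase.

```python
import pandas as pd

# Hypothetical data, invented for illustration.
df = pd.DataFrame({
    "region": ["North", "South", "East", "West"],
    "revenue": [250, 380, 200, 170],
    "headcount": [30, 22, 28, 20],
})

# Phase 2: data understanding
print("shape:", df.shape, "| total nulls:", int(df.isnull().sum().sum()))

# Phase 3: univariate analysis
print(df["revenue"].describe())

# Phases 4-5: compare each region's share of revenue with its share of headcount
shares = df.set_index("region")[["revenue", "headcount"]]
shares = shares / shares.sum() * 100

# Phase 6: state the pattern in plain English, with numbers
top = shares["revenue"].idxmax()
insight = (
    f"{top} drives {shares.loc[top, 'revenue']:.0f}% of revenue "
    f"but only {shares.loc[top, 'headcount']:.0f}% of headcount"
)
print(insight)
```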

Interactive Notebook

Notebook: Real-World EDA Project
full EDA pipeline from raw data to actionable insights and recommendations

Quiz

Test your understanding of Real-World EDA Project -- 10 questions, 70% to pass.

track 06

Storytelling with Data

Charts exist to communicate insights, not display data. Every design choice should serve the message.

The 5 Principles

| Principle | In practice |
| --- | --- |
| Right chart type | Bar for comparison, line for trend, scatter for correlation |
| Remove chart junk | No 3D, no heavy borders, minimal gridlines |
| Pre-attentive attrs | Use colour to highlight ONE thing only |
| Direct labelling | Label lines directly instead of using a legend |
| One story per chart | Split complex charts into focused panels |
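
Several of these principles can be applied in a few lines of Matplotlib. The sketch below uses hypothetical revenue figures and a hypothetical output filename: one bar is highlighted in colour, values are labelled directly, and the frame is stripped back.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical values, invented for illustration.
regions = ["North", "South", "East", "West"]
revenue = [250, 380, 200, 170]

# Pre-attentive attribute: colour ONE bar, grey out the rest
highlight = "South"
colors = ["#6b21a8" if r == highlight else "#c4c4c4" for r in regions]

fig, ax = plt.subplots(figsize=(7, 4))
bars = ax.bar(regions, revenue, color=colors)  # bar chart for comparison

# Direct labelling: write each value on its bar instead of using a y-axis
labels = ax.bar_label(bars, padding=3)

# Remove chart junk: no redundant spines, axis or gridlines
for side in ("top", "right", "left"):
    ax.spines[side].set_visible(False)
ax.yaxis.set_visible(False)
ax.grid(False)

ax.set_title("South leads revenue", loc="left", fontweight="bold")
fig.tight_layout()
fig.savefig("highlight_bar.png", dpi=100)  # hypothetical output path
```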

Title Examples

| Bad | Good |
| --- | --- |
| Monthly Revenue | Revenue peaked in June -- H1 target exceeded by 9% |
| Student Scores | Students studying less than 3 hrs/week fail 42% more often |
| Sales by Region | South drives 38% of revenue but only 22% of headcount |
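
One way to keep a "good" title honest is to compute its numbers from the data rather than typing them by hand. A small sketch, assuming hypothetical monthly figures and a hypothetical H1 target:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical H1 revenue and target, invented for illustration.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
revenue = [150, 160, 170, 180, 200, 230]
h1_target = 1000

# Derive the headline facts from the data
peak_month = months[revenue.index(max(revenue))]
pct_over = (sum(revenue) / h1_target - 1) * 100

title = f"Revenue peaked in {peak_month}: H1 target exceeded by {pct_over:.0f}%"

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(months, revenue, marker="o", color="#6b21a8")
ax.set_title(title, loc="left", fontweight="bold")
fig.tight_layout()
```

If the underlying data changes, the title changes with it, so the chart never claims something the numbers no longer support.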

Interactive Notebook

Notebook: Storytelling with Data
bad vs good charts, chart selection guide, IBCS colour rules

Quiz

Test your understanding of Storytelling with Data -- 10 questions, 70% to pass.

track 07

Tableau / Power BI Basics

BI tools for non-technical stakeholders. When to use Python vs when to use a BI tool.

Python vs BI Tools

| Use Python when | Use BI when |
| --- | --- |
| Custom statistical analysis | Executives need self-service |
| Data cleaning and wrangling | Connecting live databases |
| ML model building | Scheduled refresh dashboards |
| Complex custom charts | Simple interactive filters |
| Reproducible pipelines | Non-technical users |

Tableau vs Power BI

| Feature | Tableau | Power BI |
| --- | --- | --- |
| Formula language | Tableau calc fields | DAX (Excel-like) |
| Best ecosystem | Any data source | Microsoft / Azure |
| Visualisation | Richer, more flexible | Good, improving fast |
| Price | Higher ($70/user/mo) | Lower ($10/user/mo) |
| Free tier | Tableau Public | Power BI Desktop |

Interactive Notebook

Notebook: Tableau / Power BI
Concepts and when to use BI tools (notebook focuses on Python EDA principles)

Quiz

Test your understanding of Tableau / Power BI -- 10 questions, 70% to pass.


Data Visualisation Complete!

Pass all topic quizzes to earn your certificate.