Prerequisites

Be prepared

As prerequisites for the course, we recommend becoming familiar with the following:

  • Read about the SciLifeLab Data-Driven Life Science (DDLS) initiative to understand its national priorities and the concept of the data life cycle, which is central to this course.
  • Refresh core Python basics (variables, data types, control flow, functions, modules, simple plotting, reading/writing files). See the resources below.

Technical setup for labs (all online):

  • A computer with reliable internet access
  • A modern browser (e.g. Chrome)
  • A Google account (for Google Colab and Drive storage)
  • (Optional but encouraged) Accounts for AI coding/assistant tools (e.g. ChatGPT); free tiers are sufficient
  • A GitHub account for versioning and sharing notebooks/code

Build a foundation in Python

As a warm-up, make sure you are comfortable coding in Python. If you are new or rusty, use either of the options below. There is no submission requirement; this is for your own preparation.

Option A: Guided practice with AI

Use curated prompts to accelerate review: Learn Python with ChatGPT. Treat AI output critically—run code, fix errors, and keep notes of what you clarified.

Recommended minimal competency checklist:

  • Running cells in Colab / Jupyter
  • Using print, f-strings, and basic input/output
  • Lists, tuples, dictionaries, sets (creation, indexing, iteration)
  • Control flow (if, for, while, list comprehensions)
  • Defining functions; understanding scope and return values
  • Importing standard libraries (math, random, json, pathlib)
  • Basic plotting with matplotlib or seaborn
  • Reading CSV/TSV data with pandas
  • Simple error handling (try/except)
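As a rough self-test, the sketch below touches several checklist items at once: a dictionary, a loop, a function, try/except, json, and pathlib. The sample names and values are made up for illustration, and the file is written to a temporary folder so nothing is left behind.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical sample data: sample name -> list of measurements.
measurements = {
    "sample_a": [1.0, 2.0, 3.0],
    "sample_b": [],
    "sample_c": [4.0, 6.0],
}


def summarize(data):
    """Return {name: mean} for every entry that has at least one value."""
    summary = {}
    for name, values in data.items():
        try:
            summary[name] = sum(values) / len(values)
        except ZeroDivisionError:
            # An empty list has no mean; report it and move on.
            print(f"Skipping {name}: no values")
    return summary


summary = summarize(measurements)

# Write the result as JSON into a temporary folder, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    out_file = Path(tmp) / "summary.json"
    out_file.write_text(json.dumps(summary, indent=2))
    loaded = json.loads(out_file.read_text())

print(loaded)
```

If you can predict what this prints before running it, the checklist is probably in good shape.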

Option B: Traditional refresher

Use the interactive notebook below or other beginner tutorials (e.g. Python docs, Software Carpentry). Progress until the checklist above feels easy.

Interactive Notebook

Quick Python basics:

Or load locally in Jupyter Lite:


If you are new to programming, you may also watch this introductory video:

Quick Quiz (self-check)

What is the difference between lists and tuples?

Lists

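  • Lists are mutable: items can be changed, added, or removed in place
  • Typically a little slower to create and larger in memory than tuples (in CPython), though the difference rarely matters in practice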
  • Syntax: a_list = [1, 2.0, 'Hello world']

Tuples

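  • Tuples are immutable: once created, their items cannot be reassigned, added, or removed
  • Typically a little faster to create and smaller in memory than lists (in CPython), though the difference rarely matters in practice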
  • Syntax: a_tuple = (1, 2.0, 'Hello world')
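One practical consequence of immutability, not listed above, is that tuples can serve as dictionary keys or set members while lists cannot. A minimal sketch, with made-up coordinate labels:

```python
# Tuples are hashable (as long as their items are), so they work as dict keys.
coordinates = {(0, 0): "origin", (1, 2): "point A"}
label = coordinates[(1, 2)]

# Lists are not hashable, so using one as a key raises TypeError.
try:
    bad = {[1, 2]: "point A"}
    list_key_error = None
except TypeError as err:
    list_key_error = str(err)

print(label)
print(list_key_error)
```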

Is Python case-sensitive?

Yes

How do you execute a single cell in Colab/Jupyter and restart the kernel?

Run the cell with Shift+Enter (or the play icon). Restart the kernel from the menu: Runtime > Restart runtime in Colab (labelled Restart session in recent Colab versions), or Kernel > Restart Kernel in Jupyter.

Give an f-string that prints variable x=5 as ‘Value: 5’

x = 5
print(f"Value: {x}")
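Beyond plain substitution, f-strings accept format specifiers after a colon. The variants below are a small sketch of the ones that tend to come up most often; the variable names are arbitrary.

```python
x = 5
ratio = 2 / 3

plain = f"Value: {x}"          # simple substitution
rounded = f"Ratio: {ratio:.2f}"  # fixed-point, two decimal places
padded = f"ID: {x:03d}"        # integer zero-padded to width 3
debug = f"{x=}"                # Python 3.8+: shows the name and the value

for line in (plain, rounded, padded, debug):
    print(line)
```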

Show one operation that differs between a list and a tuple

Lists are mutable: a = [1, 2]; a.append(3) works. Tuples are immutable: t = (1, 2); t.append(3) raises AttributeError, because tuples have no append method.
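The same contrast can be run directly. This sketch also shows item assignment, which fails on a tuple with a different exception (TypeError), and the usual workaround of building a new tuple.

```python
errors = []

a = [1, 2]
a.append(3)   # modifies the list in place
a[0] = 10     # item assignment works too

t = (1, 2)
try:
    t.append(3)          # tuples have no append method
except AttributeError as err:
    errors.append(type(err).__name__)

try:
    t[0] = 10            # tuples do not support item assignment
except TypeError as err:
    errors.append(type(err).__name__)

# "Changing" a tuple means building a new one; t itself is untouched.
t2 = t + (3,)

print(a, t, t2, errors)
```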

List comprehension: create a list of squares for numbers 0-4

squares = [i * i for i in range(5)]  # [0, 1, 4, 9, 16]
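For comparison, here is the same result written as an explicit loop, followed by two common variations: a comprehension with a filter and a dict comprehension. The variable names are illustrative only.

```python
squares = [i * i for i in range(5)]

# Equivalent explicit loop.
squares_loop = []
for i in range(5):
    squares_loop.append(i * i)

# With a condition: squares of the even numbers only.
even_squares = [i * i for i in range(5) if i % 2 == 0]

# Dict comprehension: map each number to its square.
square_of = {i: i * i for i in range(5)}

print(squares, even_squares, square_of)
```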

Write a function ‘mean_or_none(values)’ returning the mean or None if empty

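def mean_or_none(values):
    """Return the arithmetic mean of values, or None if values is empty."""
    if not values:  # an empty list or tuple is falsy
        return None
    return sum(values) / len(values)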
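A quick way to check the function is to call it on a few inputs. Note that the version above relies on len(), so it works for lists and tuples but not for generators; the second function, mean_or_none_any, is a hypothetical variant (not part of the quiz) that accepts any iterable by materializing it first.

```python
def mean_or_none(values):
    if not values:
        return None
    return sum(values) / len(values)


def mean_or_none_any(values):
    """Hypothetical variant: accepts any iterable, including generators."""
    items = list(values)  # materialize so len() is available
    return sum(items) / len(items) if items else None


print(mean_or_none([1, 2, 3, 4]))                   # 2.5
print(mean_or_none([]))                             # None
print(mean_or_none_any(x for x in range(4)))        # 1.5
```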

What does ‘from pathlib import Path’ enable?

It imports the Path class for object-oriented filesystem paths (joining, reading, iterating) in a cross‑platform way.
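A small sketch of what that looks like in practice. The folder and file names are invented for the example, and everything happens inside a temporary directory that is removed afterwards.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Join path parts with the / operator instead of string concatenation.
    data_dir = Path(tmp) / "data"
    data_dir.mkdir()

    csv_path = data_dir / "example.csv"
    csv_path.write_text("gene,count\nTP53,12\n")  # write a small file

    text = csv_path.read_text()                   # read it back
    names = sorted(p.name for p in data_dir.iterdir())  # list folder contents
    stem, suffix = csv_path.stem, csv_path.suffix       # name parts
    exists = csv_path.exists()

print(names, stem, suffix, exists)
```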

Minimal matplotlib example plotting y = x^2 for x=0..4

import matplotlib.pyplot as plt

x = list(range(5))      # 0, 1, 2, 3, 4
y = [i**2 for i in x]   # the square of each x
plt.plot(x, y)
plt.xlabel('x')
plt.ylabel('x^2')
plt.show()

Read a CSV ‘data.csv’ into a pandas DataFrame and show first 3 rows

import pandas as pd
df = pd.read_csv('data.csv')
print(df.head(3))

Wrap code to catch a ValueError when converting input to int

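user_input = "12a"  # example value; in practice this might come from input()
try:
    n = int(user_input)
except ValueError:
    n = None  # conversion failed, so fall back to None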
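The same pattern is often wrapped in a small helper so it can be reused. The function name to_int_or_none is made up for this sketch. It catches only ValueError, so passing something that is not a string or number (for example None) would still raise TypeError.

```python
def to_int_or_none(text):
    """Convert text to int; return None if it is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return None


samples = ["42", " 7 ", "3.5", "abc", ""]
results = [to_int_or_none(s) for s in samples]
print(results)  # [42, 7, None, None, None]
```

Surrounding whitespace is accepted by int(), while "3.5" is rejected because it is not an integer literal.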

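Explain scope: why does calling f fail, given ‘def f(): x += 1’ and a global x = 0?

Because x is assigned inside the function, Python treats x as a local variable throughout f. The statement x += 1 then tries to read the local x before it has a value, so calling f() raises UnboundLocalError. Declare global x inside f or, usually better, pass x in as an argument and return the new value.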
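The failure and the pass/return fix can be reproduced as follows. The function names broken and increment are illustrative; the global variant is left out because passing values in and out is usually easier to follow and test.

```python
x = 0


def broken():
    # x is assigned here, so Python treats it as local to broken();
    # reading it before assignment raises UnboundLocalError.
    x += 1


try:
    broken()
    message = None
except UnboundLocalError as err:
    message = str(err)


def increment(value):
    """Fix: take the value as an argument and return the result."""
    return value + 1


x = increment(x)

print(message)
print(x)
```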