🧬 KTH Royal Institute of Technology  ·  SciLifeLab

Data-Driven
Life Sciences

A hands-on course where you master the AI tools, data science workflows, and biological reasoning skills that are reshaping how science is done — and how researchers stay competitive.

🤖 AI Agents & LLMs 🔬 Microscopy & Imaging 🧬 Single-cell Genomics 🧪 Protein Structure 💊 Precision Medicine 🔭 Systems Biology
7.5
ECTS Credits
6
Modules
100+
Students Trained
Years Running
⚡ Why This Matters Now

You're Graduating Into a Different World

Every decade, a shift in tools changes how knowledge workers operate. The current shift is the biggest yet — and it's happening right now.

📚 How we find information
📚
Pre-2000
Library & Textbooks
🔍
2000s
Google Search
💬
2022+
ChatGPT & LLMs
🤖
Today
AI Agents & Autonomous Research
💻 How we write code
⌨️
Pre-2020
Type every line in an IDE
2021
GitHub Copilot suggests lines
2023
Cursor writes entire functions
🚀
Today
AI agents code, test & debug entire projects
💡
The right question isn't "will AI replace scientists?" It's: will scientists who use AI effectively replace those who don't? This course gives you that edge — how to prompt correctly, direct AI agents for data analysis, train models, build tools, and critically evaluate AI outputs in a scientific context.
🔬 The Changing Nature of Science

How Biological Research Has Evolved

The scientific method itself is being transformed — from a purely human-driven process to one where AI plays an increasingly central role in every step.

Traditional Science
🧫
Pre-2000s → ongoing
Hypothesis-Driven Research
A scientist forms a hypothesis based on prior knowledge and intuition, designs a targeted experiment to test it, and interprets results manually. Powerful — but slow, narrow, and limited by human capacity to process data.
Observe phenomenon
Form hypothesis
Design & run experiment
Collect & analyze data manually
Publish & repeat
e.g. Koch's postulates, Mendel's genetics, classical drug trials — one hypothesis at a time.
Computational Biology
💻
2000s → present
Data-Driven Hypothesis Generation
Large-scale biological datasets (genomes, proteomes, single-cell profiles) are analyzed computationally. Machine learning reveals patterns that generate new hypotheses — often ones a human would never have considered.
Generate massive dataset (omics, imaging)
Apply ML / statistical models
Discover unexpected patterns
Form data-driven hypothesis
Validate experimentally
e.g. GWAS reveals disease-linked variants across 500,000 genomes; scRNA-seq uncovers unknown cell types; AlphaFold predicts protein structure from sequence alone.
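To make the workflow above concrete, here is a minimal sketch in plain Python. Everything in it is an illustrative assumption rather than course material: the expression data is synthetic, the gene names (`gene_0` … `gene_49`), the planted signal in `gene_7`, and the helper functions are all hypothetical. It screens every gene with Welch's t-statistic and surfaces the strongest difference as a candidate hypothesis.

```python
import random
import statistics


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (a minus b)."""
    var_a = statistics.variance(a)
    var_b = statistics.variance(b)
    se = (var_a / len(a) + var_b / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se


def make_synthetic_expression(n_genes=50, n_per_group=20,
                              planted="gene_7", shift=8.0, seed=0):
    """Return {gene: (control, treated)} with one planted group difference.

    Hypothetical stand-in for a real omics dataset: every value is
    Gaussian noise, except that the planted gene is shifted upward
    in the treated group.
    """
    rng = random.Random(seed)
    data = {}
    for i in range(n_genes):
        name = f"gene_{i}"
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        offset = shift if name == planted else 0.0
        treated = [rng.gauss(offset, 1.0) for _ in range(n_per_group)]
        data[name] = (control, treated)
    return data


def rank_genes(data):
    """Rank genes by absolute t-statistic, strongest difference first."""
    scores = {gene: welch_t(treated, control)
              for gene, (control, treated) in data.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


data = make_synthetic_expression()
ranked = rank_genes(data)
top_gene, top_t = ranked[0]
print(f"Candidate hypothesis: {top_gene} differs between groups (t = {top_t:.1f})")
```

The screen does not start from a hypothesis about any particular gene; the hypothesis comes out of the data. A real analysis would also correct for multiple testing and, as the last step of the workflow says, validate the candidate experimentally.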
🚀 Emerging Now
🤖
2024 → future
The AI Scientist — Autonomous Research
AI agents autonomously review literature, generate hypotheses, design experiments, write and execute code, interpret results, and iterate — completing in hours what took researchers months. Humans set the goals; AI drives the loop.
AI reviews entire literature corpus
Generates & ranks hypotheses
Designs & simulates experiments
Analyzes results, updates model
Drafts manuscript autonomously
e.g. Sakana AI's "AI Scientist" writes & reviews its own papers; AI discovers novel antibiotics (MIT, 2023); automated lab robots close the experiment loop end-to-end.
🔬 What is Data-Driven Life Sciences?

Where Biology Meets Computation

Modern biology generates more data than any human can process. Data-driven life sciences is the discipline of using machine learning, AI, and computational methods to extract meaning from this data — and to make discoveries that would be impossible otherwise.

🧬
AlphaFold
DeepMind's AI reached near-experimental accuracy in protein structure prediction, a roughly 50-year grand challenge, at CASP14 in 2020. Predicted structures for nearly the entire human proteome followed in 2021.
Structural Biology
🔬
Single-Cell Sequencing
We can now profile the gene expression of individual cells, mapping every cell type in the human body at unprecedented resolution.
Genomics
👁️
AI Microscopy
Deep learning can now detect cancer cells, classify organisms, and measure cellular dynamics from microscopy images, on some tasks matching or exceeding human expert accuracy.
Imaging
💊
Precision Medicine
By integrating genomic, clinical, and environmental data, AI is enabling treatments tailored to individual patients rather than disease averages.
Systems Biology
Six Modules, Six Fields

Each week brings a new domain, taught by leading researchers from SciLifeLab and KTH.

1
🤖
Intro to Data-Driven Life Sciences
Foundations of AI in biology, using AI agents and LLMs for scientific exploration.
2
🔬
Image Analysis & Microscopy
Deep learning for biological images: cell segmentation, classification, super-resolution.
3
🧬
Protein Structure & Molecular Biology
AlphaFold, protein language models, molecular simulation, and structure prediction.
4
🧫
Single-cell Transcriptomics & Genomics
scRNA-seq analysis, dimensionality reduction, trajectory inference, spatial transcriptomics.
5
💊
Precision Medicine & Systems Biology
Multi-omics integration, network biology, patient stratification, drug target discovery.
6
🚀
Automated Scientific Discovery & AI Agents
AI agents that design experiments, analyze results, and drive autonomous scientific workflows.
🎓
Weekly Lecture by Leading Researcher
💻
Hands-on AI-Augmented Lab (Google Colab)
📖
Journal Club & Peer Discussion
🏗️
Final Project: Build Something Real
🎯 How You'll Learn

Learning, Augmented by AI

Every week combines expert knowledge with AI-assisted hands-on practice — the same workflow you'll use in your research career.

🎓
Expert Lectures
Weekly talks by DDLS Fellows, SciLifeLab facility leaders, and KTH researchers using cutting-edge methods.
2 hrs/week
💻
AI-Augmented Labs
Hands-on Google Colab notebooks with AI coding assistants. You direct the analysis; AI handles the boilerplate.
4 hrs/week
📖
Journal Club
Critically read, present, and debate landmark papers in data-driven life sciences. AI helps you prepare.
2 hrs/week
🏗️
Real Final Project
Build a working tool, web app, or analysis pipeline. Past students built apps that got deployed in real labs.
3 weeks
🎬 Student Projects

What Past Students Built

The final project is 3 weeks of building something real — AI-powered tools, web apps, and research demos used by actual labs.

🔬
🧬
🌊 Environmental Science · SciLifeLab
Karin Garefelt
Built a full-stack web app for classifying plankton from IFCB flow cytometry images — deployed at SciLifeLab and now used by researchers.
Watch Presentation
🤖
💻
⚡ AI Coding Workflows · KTH PhD
Lasse Stahnke
Developed an AI-assisted coding workflow tool that uses agents to automate data analysis pipelines — from raw data to publishable figures.
Watch Presentation
🧪
🔭
🔬 Research Automation · SciLifeLab
Augusta Jensen
Created an AI-driven research demo showing how an AI agent can search literature, summarize findings, and propose next experimental steps.
Watch Presentation

Explore top courses


Prerequisites

Be prepared
As prerequisites for the course, we recommend becoming familiar with the following: Browse the SciLifeLab Data-Driven Life Science (DDLS) initiative to understand national priorities and the concept of the data life cycle, which is central in this course.

Module 2 - Image Analysis and Microscopy

Start date: Week 36 (2nd September 2025)
Lecturer: Estibaliz Gómez de Mariscal (Postdoc at Instituto de Tecnologia Química e Biológica António Xavier (ITQB NOVA), NOVA University of Lisbon, Portugal)
Lecture: Tuesday 2nd September, 13:00-15:00 CEST
Estibaliz Gómez de Mariscal completed her PhD in Mathematical Engineering in 2021 and is currently an EMBO Postdoctoral Fellow in the Optical Cell Biology Group at ITQB NOVA.

Course Schedule

Important Dates:
Application closes: Aug 25, 2025, 24:00 CEST
First lecture: Aug 26, 2025, 13:00 CEST
Final Project plan submission deadline: Oct 6, 2025
Final Project: Oct 6-29, 2025
Final Project consultation 1: Oct 8, 2025, 13:00-14:00
Final Project consultation 2: Oct 20, 2025, 13:00-14:00
Final Project report submission deadline: Oct 29, 2025, 23:59
Oral Presentation (only for master’s students): Oct 31, 2025, 9:00-12:00
Lectures, computer labs and seminars will be held online via Zoom.

📊 HT 2025: Data-Driven Life Sciences

Course content for the 2025 Data-Driven Life Sciences (DDLS) course.

Final Project — Exploring and Modeling Life Science Data with AI Agents

⚠️ Note: This is a draft project plan. Details (including deadlines, deliverables, and evaluation criteria) may change. Updates will be announced on the course website and in class. Welcome to the final project of DDLS 2025.

Module 4 - Single-cell Transcriptomics and Genomics

Start date: Week 38 (16th September 2025)
Lecturer: Xiaojie Qiu
Lecture: Tuesday, 16 September 2025, 08:00 CEST (Pacific Time: Monday, 15 September, 11:00 PM)
Module 4 covers genomics and single-cell analysis, featuring scRNA-seq processing pipelines, pathway analysis, and genomic workflows.

Module 1 - Introduction to Data-Driven Life Sciences

Start date: Week 35 (26th August 2025)
Lecturer: Wei Ouyang (DDLS Fellow)
Lecture: Tuesday 26th August, 13:00-15:00 CEST
Module 1 serves as the foundational introduction to the Data-Driven Life Sciences (DDLS) course.

Module 6 - Automated Scientific Discovery and AI Agents

Start date: Week 40 (30th September 2025)
Module 6 focuses on automated scientific discovery and AI agents. You will learn how to design, build, and evaluate agentic workflows that plan experiments, orchestrate tools and data pipelines, and interface with lab automation.

Module 3 - Protein Structure and Molecular Biology

Start date: Week 37 (9th September 2025)
Speaker 1: Patrick Bryant (Ass. Prof. at SU & DDLS Fellow)
Patrick Bryant’s research seeks to answer questions about the evolution of proteins and how this information can be used to create a new range of AI tools.

Module 5 - Precision Medicine and Systems Biology

Start date: Week 39 (23rd September 2025)
Module 5 focuses on clinical applications of AI and machine learning with an emphasis on multi-omics integration, biomarker discovery, and clinical data analysis. The module highlights how biologically informed models and novel machine learning methods can advance personalized medicine, cancer genomics, and systems biology modeling.

Course Schedule

Important Dates:
Application opens: June 5, 2024
Application closes: Aug 19, 2024
First lecture: Aug 27, 2024, 08:00 CEST
Hackathon: Oct 9-11, 2024
Final Project: Oct 7-27, 2024
Final Project consultation: Oct 17, 08:00-11:00 CEST
Final Project Report submission deadline: Sunday, Oct 27, 2024, 23:59 CEST
Oral Presentation (batch 1, only for Master’s students): Tuesday, Oct 29, 2024, 8:00-10:00 CEST
Oral Presentation (batch 2, only for Master’s students): Wednesday, Oct 30, 2024, 8:00-10:00 CEST
Lectures, Computer Labs, and Journal Clubs will be held online via Zoom.

Module 6 [1 ECTS] - Computational Modeling and Automated Scientific Discovery

Start date: Week 40 (1st October 2024)

Final Project

Start date: Week 41 (7th October 2024)
The final project allows students to showcase their learning by proposing their own dataset and project idea. This open-ended assignment encourages participants to explore topics of interest within data-driven life science and apply the skills gained throughout the course individually or in a pair (both members should be at the same level, e.

Module 3 [1 ECTS] - Bioinformatics and Metagenomics

Start date: Week 37 (10th September 2024)

Module 1 [0.5 ECTS] - Introduction to DDLS, AI and ChatGPT

NOTE: Module 1 is a prerequisite for all subsequent modules if you would like to obtain a certificate of participation.

DDLS Hackathon - A Code Festival for the SciLifeLab/DDLS Community

Join us at the DDLS Hackathon, a vibrant code festival for the SciLifeLab/DDLS community! This hackathon is part of the 2024 DDLS course. Participants will collaborate on their projects in the dynamic field of data-driven life science, supported by teaching assistants in a hands-on environment.

Module 2 [1 ECTS] - Advanced AI Techniques in Biological Systems

Start date: Week 36 (3rd September 2024)

Module 5 [1 ECTS] - Single Molecule Analysis and Tissue Atlases

Start date: Week 39 (24th September 2024)

Module 4 [1 ECTS] - AI in Bioimage Analysis

Start date: Week 38 (17th September 2024)

📊 HT2024: Data-driven Life Sciences


Course content of the Data-driven Life Sciences course offered in 2024.

Module 5: Multi-omics Data Analysis and BioImage Analysis and Deep Learning

Lecture by Wen Zhong (DDLS Fellow) and Estibaliz Gómez de Mariscal (PhD)

Module 3: Super-Resolution Microscopy and Analytics to Investigate Complex Cellular Processes and Diseases

Lecture by Juliette Griffié (DDLS Fellow) on 12th Sept, at 08:00-10:00.

Module 2: Generative AI and its Applications in Life Sciences

We will introduce the basics of generative AI and its potential in the life sciences. We will also cover recent trends in generative large language models and practical skills for using ChatGPT for reading, writing, planning and code generation. Lecture by Wei Ouyang (DDLS Fellow) on 5th Sept.

Module 6: Model Construction and Features of AlphaFold2

Lecture by Darko Mitrovic (PhD) on 3rd Oct.

Module 4: Computational Methods in Evolution and Biodiversity

We will join the Computational Methods in Evolution and Biodiversity workshop at Stockholm University and attend lectures by Serge Belongie, Katie E. Lotterhos, Tobias Andermann (DDLS Fellow) et al. on 21st Sept. The computer lab will take place the same afternoon.

Module 1: Introduction to DDLS and machine learning

We will introduce the concept of data-driven life sciences and the data life cycle, the course and the different modules, and some basics about using ChatGPT. Lecture by Wei Ouyang on 29th August.

📊 HT2023: Data-driven Life Sciences


Course content of the Data-driven Life Sciences course offered in 2023.

Prerequisites

Be prepared
As prerequisites for the course, we recommend the following resources: Visit the SciLifeLab Data-Driven Life Science (DDLS) initiative website to understand what data-driven life sciences are and how Sweden is investing in this area.

Course Schedule

Important dates
Course start: 29 August 2023
Course registration deadline: 4 September 2023
Course end: 6 October 2023
Exam: 11 October 2023 & 25 October 2023
NOTE: The schedule is subject to change. Please keep an eye on the email announcements and the updated schedule HERE (which may differ from the schedule on the KTH course web).


📊 HT2022: Data-driven Life Sciences


Course content of the Data-driven Life Sciences course offered in 2022.

Final DDLS Project

Final project plan and report for the DDLS course.

What Students Say

The DDLS course broadened my general knowledge in data-driven life sciences. Throughout the course, we were learning how to use AI tools effectively. I was surprised by how quickly I could build my first web app!
KG
Karin Garefelt
SciLifeLab · Plankton Classification App
The course provided a solid introduction to applying agentic AI in both research and coding. The final project was especially useful as a chance to put the AI-supported coding workflows into practice.
LS
Lasse Stahnke
KTH PhD Student · AI-Supported Coding
This course was a great introduction to the world of data-driven life science research. It contained interesting lectures as well as hands-on experience working with Google Colab and Gemini CLI as an AI agent.
AJ
Augusta Jensen
SciLifeLab · AI-Assisted Research Demo

Meet the Team


Wei Ouyang

Teacher / Examiner


Nils Mechtel

Teaching Assistant


Songtao Cheng

Teaching Assistant

Ready to learn?