ooresAnalytics Training

Machine Learning for Survival Analysis in R

Advanced Methods for Time-to-Event Modeling in Public Health & Biomedical Research

14 Days

Mar 14 – Apr 26, 2026

Live via Zoom

Machine Learning for Survival Analysis in R (14-Day Weekend Program)

Duration: 14 Days (8–10 weeks)
Schedule: Saturday/Sunday, 12–2 PM ET (New York Time)
Registration Closes: March 12, 2026
Format: Live via Zoom

Payment Options: Bank Transfer, Venmo, or Zelle accepted.

Course Fee: $250 (limited scholarships and variable pricing available based on the participant's country)

About the Training

This course provides a modern, applied introduction to machine learning approaches for time-to-event (survival) data, using R and real public-health and biomedical datasets.

Participants begin with classical survival models (Kaplan–Meier, Cox proportional hazards) and progress to penalized survival models, random survival forests, gradient-boosted survival models, deep learning approaches, and explainability tools for survival ML.
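The classical starting points can be sketched in a few lines of R (a minimal illustration using the survival package's built-in lung dataset, not course material):

```r
library(survival)    # Kaplan–Meier estimation and Cox PH models
library(survminer)   # publication-ready survival plots

# NCCTG lung cancer data: time in days, status (1 = censored, 2 = dead)
km <- survfit(Surv(time, status) ~ sex, data = lung)
ggsurvplot(km, conf.int = TRUE)          # Kaplan–Meier curves by sex

# Cox proportional hazards model; exp(coef) gives hazard ratios
cox <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)
summary(cox)
```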

The course emphasizes interpretation, diagnostics, prediction accuracy, and clear communication of findings in epidemiology, clinical research, and health policy contexts.

Who Should Register

Designed for advanced undergraduates, graduate students, faculty, and applied researchers aiming to apply ML in their work.

Learning Objectives

Foundational Survival Skills

- Understand censoring, truncation, survival & hazard functions
- Fit Kaplan–Meier curves and Cox PH models

Machine Learning Skills

- Penalized Cox models (lasso, ridge, elastic net)

- Random Survival Forests & Gradient Boosting

- Deep learning survival models (DeepSurv-style)
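As a taste of the penalized approach, a lasso-penalized Cox model can be fit with glmnet (a sketch on the survival package's lung data; the variable choices are illustrative):

```r
library(glmnet)
library(survival)

d <- na.omit(lung[, c("time", "status", "age", "sex",
                      "ph.ecog", "ph.karno", "wt.loss")])
x <- as.matrix(d[, -(1:2)])        # predictor matrix
y <- Surv(d$time, d$status)        # censored outcome

# Cross-validated lasso (alpha = 1; alpha = 0 is ridge,
# values in between give the elastic net)
cvfit <- cv.glmnet(x, y, family = "cox", alpha = 1)
coef(cvfit, s = "lambda.min")      # nonzero rows = selected predictors
```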

R & Framework Skills

- Use survival, survminer, tidymodels, glmnet, ranger, gbm

- Cross-validation and tuning for censored outcomes

Interpretation & Communication

- Hazard ratios, variable importance, SHAP/LIME

- Publication-ready plots & ML explainability graphics

Weekly Breakdown

Week 1 — Survival Data & R Foundations
  • Censoring, truncation, and why time-to-event data needs special methods
  • Survival and hazard functions
  • Building Surv objects and survival data structures in R
  • Fitting and plotting Kaplan–Meier curves
  • Practical examples with the survival and survminer packages
Week 2 — Cox Proportional Hazards
  • The Cox proportional hazards model
  • Fitting models with coxph() and interpreting hazard ratios
  • Checking the proportional hazards assumption
  • Real public-health dataset practice
  • Assignment: fit and interpret a Cox model
Week 3 — Tidymodels for Survival
  • tidymodels workflows for censored regression
  • Recipes and preprocessing for survival data
  • Resampling and evaluation for censored outcomes (concordance index, Brier score)
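For the tidymodels portion, survival engines live in the censored extension package; a minimal sketch (assuming censored and its parsnip dependency are installed) looks like:

```r
library(censored)   # survival engines for parsnip/tidymodels
library(survival)

# Cox PH specified as a parsnip model in "censored regression" mode
cox_spec <- proportional_hazards() |>
  set_engine("survival") |>
  set_mode("censored regression")

fit <- fit(cox_spec, Surv(time, status) ~ age + sex + ph.ecog, data = lung)
predict(fit, new_data = head(lung), type = "time")   # predicted survival times
```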

Week 4 — Penalized Cox Models
  • Lasso, ridge, and elastic net penalties for the Cox model
  • Cross-validated penalty selection with glmnet
  • Variable selection and coefficient interpretation
  • Visualization of coefficient paths with ggplot2

Week 5 — Tree-Based Survival Models
  • Survival trees and random survival forests
  • Fitting forests with ranger; bagging intuition
  • Out-of-bag error, performance evaluation, and tuning
  • Bonus: variable importance visualization using vip and DALEX
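A random survival forest of the kind covered here can be fit with ranger (a sketch on the survival package's lung data; predictor choices are illustrative):

```r
library(ranger)
library(survival)

d <- na.omit(lung[, c("time", "status", "age", "sex", "ph.ecog", "ph.karno")])

# ranger switches to survival-forest mode when given a Surv() response
rsf <- ranger(Surv(time, status) ~ ., data = d,
              num.trees = 500, importance = "permutation")

rsf$prediction.error                               # out-of-bag error (1 - Harrell's C)
sort(rsf$variable.importance, decreasing = TRUE)   # permutation importance
```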
Week 6 — Boosted Survival Models
  • Gradient boosting for censored outcomes
  • Bagging vs boosting: theory and trade-offs
  • Fitting boosted Cox models with gbm
  • Tuning shrinkage, tree depth, and number of trees
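A boosted Cox model can be sketched with gbm's "coxph" loss (illustrative settings on the survival package's lung data; note gbm expects a 0/1 event indicator):

```r
library(gbm)
library(survival)

d <- na.omit(lung[, c("time", "status", "age", "sex", "ph.ecog", "ph.karno")])
d$status <- d$status - 1     # recode 1/2 (censored/dead) to 0/1

boost <- gbm(Surv(time, status) ~ ., data = d,
             distribution = "coxph",
             n.trees = 1000, interaction.depth = 2,
             shrinkage = 0.01, cv.folds = 5)

best_iter <- gbm.perf(boost, method = "cv")   # CV-chosen number of trees
summary(boost, n.trees = best_iter)           # relative influence per predictor
```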
Week 7 — Deep Learning Survival Models
  • Neural networks for survival data (DeepSurv-style models)
  • Understanding overfitting and underfitting
  • Cross-validation and hyperparameter tuning for censored outcomes
  • Building end-to-end survival ML pipelines

Week 8 — Explainability & Communication
  • Feature importance and post-hoc model interpretation
  • Partial Dependence Plots (PDP) and DALEX explainers
  • Local explanations using LIME and SHAP
  • Ethical implications and responsible ML transparency
Week 9 — Project Workshop
  • Deploying models using Shiny and plumber
  • Model serialization with saveRDS() / readRDS()
  • Building interactive prediction interfaces
  • REST API creation for programmatic model access
  • Best practices for versioning and reproducibility
Week 10 — Final Presentations
  • Full end-to-end ML workflow integration
  • Real survival-analysis project selection & execution
  • Evaluation, interpretation, and optional deployment
  • Deliverables: R scripts, report, and presentation
  • Grading rubric & next steps for professional growth

Each week includes:

🎓 Lecture slides (PowerPoint)
💻 R lab exercises (survival + tidymodels)
📊 Real open datasets

Computing & Tools

All analyses are done in R using packages such as survival, survminer, tidymodels, glmnet, ranger, gbm, ggplot2, and DALEX.
All example files and syntax are downloadable from the course portal.

Join professionals from across the world mastering AI and ML in R. Your data science journey starts here.