scikit-survival

Comprehensive toolkit for survival analysis and time-to-event modeling in Python using scikit-survival. Use this skill when working with censored survival data, performing time-to-event analysis, fitting Cox models, Random Survival Forests, Gradient Boosting models, or Survival SVMs, evaluating survival predictions with concordance index or Brier score, handling competing risks, or implementing any survival analysis workflow with the scikit-survival library.

Install

Download and extract to your skills directory

Copy command and send to OpenClaw for auto-install:

Download and install this skill https://openskills.cc/api/download?slug=k-dense-ai-scientific-skills-scikit-survival&locale=en&source=copy

scikit-survival: Complete Python Survival Analysis Toolkit

Overview

scikit-survival is a Python library for survival analysis built on top of scikit-learn. It covers the workflow from data preprocessing to model evaluation and includes Cox models, random survival forests, gradient boosting, and survival SVMs.

Use Cases

1. Medicine and Biomedical Research

Handles clinical trial data, patient survival time analysis, and disease prognosis modeling. The library's models and metrics are built around right-censored data, the most common form of censoring in cancer research, cardiovascular studies, and other time-to-event analyses.

2. High-dimensional Feature Selection and Modeling

When the number of features exceeds the number of samples, use CoxnetSurvivalAnalysis, a Cox model with an elastic net penalty, to fit the model and select features at the same time. This is suitable for gene expression data analysis, molecular biomarker screening, and similar scenarios.

3. Modeling Complex Nonlinear Relationships

Use random survival forests (RandomSurvivalForest) or gradient boosting (GradientBoostingSurvivalAnalysis) to capture nonlinear relationships and interactions between features and survival time. These models fit cases where predictive performance matters more than interpretability.

Core Features

1. Diverse Survival Models

Provides Cox proportional hazards models (including regularized versions), random survival forests, gradient boosting survival analysis, survival support vector machines, and other algorithms, covering needs from interpretable modeling to high-performance prediction.

2. Specialized Evaluation Metrics

Includes survival-specific evaluation metrics that account for censoring: the concordance index (C-index, in Harrell's and Uno's variants), time-dependent AUC, and the Brier score.

3. Competing Risks and Nonparametric Estimation

Supports the cumulative incidence function (CIF) for competing risks analysis and provides the Kaplan-Meier and Nelson-Aalen nonparametric estimators of the survival function and the cumulative hazard, respectively.

Frequently Asked Questions

What skill level is scikit-survival suitable for?

It is suitable for users with basic Python and pandas/numpy experience. If you are already familiar with scikit-learn's API style, scikit-survival has a gentle learning curve, because its estimators follow the same fit/predict conventions. The package includes documentation ranging from introductory to advanced topics, with detailed references for Cox models, ensemble methods, SVMs, and other models.

How to choose the appropriate survival model?

If interpretability is required, choose CoxPHSurvivalAnalysis; if the data are high-dimensional (number of features > number of samples), choose CoxnetSurvivalAnalysis; if you seek the highest predictive performance and have sufficient sample size, choose GradientBoostingSurvivalAnalysis or RandomSurvivalForest; for medium-sized datasets consider FastSurvivalSVM.

How are censored data handled?

scikit-survival represents survival outcomes as NumPy structured arrays that hold an event indicator and an observed time, created via sksurv.util.Surv. Right-censoring is the most common case and is handled automatically by the models. For high censoring rates (>40%), prefer Uno's C-index over Harrell's C-index for evaluation, since Harrell's estimate tends to be optimistically biased as censoring increases.