Skip to content

R pkg for estimating counterfactual survival curves using state-of-the-art causal inference methods

Notifications You must be signed in to change notification settings

nishimura-zeger-lab/CausalSurvival

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

CausalSurvival: causal survival methods for large-scale observational data

The CausalSurvival R package implements several causal survival methods for analyzing large-scale observational data with point treatment, baseline covariates and discrete time-to-event outcome with informative right-censoring. The estimation methods include inverse probability weighting (IPW, weighting by the inverse of propensity score and censoring probability), propensity score stratification (stratified Cox model), Targeted Maximum Likelihood Estimation (TMLE), and augmented IPW (AIPW) in estimating the difference in survival probability and restricted mean survival time at one or more pre-specified time points.

Install package from GitHub

remotes::install_github("nishimura-zeger-lab/CausalSurvival")

Estimation methods

This section shows how to run the algorithms with the breast cancer dataset in the survival package.

Load dataset

To run the algorithms, we need to organize data into several data frames: id, treatment, outcome, time and covariates. We then further coarsen survival data into time intervals with each interval containing 10 events.

library(survival)
library(tidyr)
library(dplyr)
library(CausalSurvival)

data(cancer, package="survival")

id <- 1:dim(gbsg)[1]
treatment <- gbsg$hormon
outcome <- gbsg$status
time <- gbsg$rfstime

gbsg$size <- (gbsg$size-mean(gbsg$size))/sd(gbsg$size)
gbsg$pgr <- (gbsg$pgr-mean(gbsg$pgr))/sd(gbsg$pgr)
gbsg$er <- (gbsg$er-mean(gbsg$er))/sd(gbsg$er)
gbsg$grade <- ifelse(gbsg$grade == 1, 1, 0)
gbsg$nodes <- ifelse(gbsg$nodes == 1, 1, 0)
covariates <- gbsg %>% mutate(rowId = id) %>% 
                          gather(covariateId, covariateValue, -c(hormon, status, rfstime, rowId)) %>% 
                          select(-c(hormon, status, rfstime))
covariates$covariateId <- as.numeric(as.character(factor(covariates$covariateId, labels=1:8)))

coarsenedData <- coarsenData(time=time, outcome=outcome)

Estimate nuisance parameters

First, estimate the nuisance parameters (propensity score, survival and censoring hazards) with ridge logistic regression.

## propensity score
estimatedPS <- estimateTreatProb(id=id, treatment=treatment, covariates=covariates)

## survival hazards
estimatedSurvHazard <- estimateHazards(coarsenedData=coarsenedData, outcome=outcome, 
                                       treatment=treatment, covariates=covariates, 
                                       hazEstimate="survival", hazMethod="ns")
                                       
## censoring hazards
estimatedCenHazard <- estimateHazards(coarsenedData=coarsenedData, outcome=outcome, 
                                      treatment=treatment, covariates=covariates, 
                                      hazEstimate="censoring", hazMethod="ns")
                                      

IPW

ipw <- computeEstimator(treatment=treatment, outcome=outcome, time=coarsenedData$timeInt,
                       initial_survHaz=NULL, cenHaz=estimatedCenHazard$haz, treatProb=treatProb,
                       coarseningParam=coarsenedData, method="IPW")

Stratified Cox model

cox <- CausalSurvival::computeEstimator(treatment=treatment, outcome=outcome, time=coarsenedData$timeInt,
                                        initial_survHaz=estimatedSurvHazard$haz, cenHaz=estimatedCenHazard$haz, 
                                        treatProb=treatProb, coarseningParam=coarsenedData_sim, 
                                        method="cox", hazMethod="ns")

weightedCox <- CausalSurvival::computeEstimator(treatment=treatment, outcome=outcome, time=coarsenedData$timeInt,
                                                initial_survHaz=estimatedSurvHazard$haz, cenHaz=estimatedCenHazard$haz, 
                                                treatProb=treatProb, coarseningParam=coarsenedData_sim, 
                                                method="weightedCox", hazMethod="ns")

TMLE

tmle <- computeEstimator(treatment=treatment, outcome=outcome, time=coarsenedData$timeInt,
                         initial_survHaz=estimatedSurvHazard$haz, cenHaz=estimatedCenHazard$haz, treatProb=treatProb,
                         coarseningParam=coarsenedData, method="TMLE")

AIPW

aipw <- computeEstimator(treatment=treatment, outcome=outcome, time=coarsenedData$timeInt,
                         initial_survHaz=estimatedSurvHazard$haz, cenHaz=estimatedCenHazard$haz, treatProb=treatProb,
                         coarseningParam=coarsenedData, method="AIPW")

More information on using estimation methods with a simulated dataset to perform a simulation study and output the two types of causal estimands of interest specified in the protocol can be found in Vignette.

References

About

R pkg for estimating counterfactual survival curves using state-of-the-art causal inference methods

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published