The headlines:

  • predictms now supports the Cox model as a transition model, estimated using
    merlin or stmerlin
  • Predictions from a multi-state Cox model are implemented using a
    simulation approach
  • Supported predictions from a multi-state Cox model include transition
    probabilities, probability, and length of stay, los

Let’s take a look at what we can now do with multistate and in particular,
the predictms command.

We’ll use the breast cancer dataset that is included in the multistate package. We load the data,

. use http://fmwww.bc.edu/repec/bocode/m/multistate_example, clear
(Rotterdam breast cancer data, truncated at 10 years)

which comes in wide format,

. list pid rf rfi os osi if pid==1 | pid==1371, sepby(pid) noobs
  +-------------------------------------+
  |  pid     rf   rfi     os        osi |
  |-------------------------------------|
  |    1   59.1     0   59.1      alive |
  |-------------------------------------|
  | 1371   16.6     1   24.3   deceased |
  +-------------------------------------+

where pid is our patient identifier, rf contains the time of recurrence, rfi the recurrence event indicator, os overall survival time and osi our survival indicator. We can use msset to reshape our wide dataset into the stacked format, with a row for each transition of which a patient is at risk for.

. msset, id(pid) states(rfi osi) times(rf os)

msset creates internal variables for use in subsequent analyses, similar to stset,

. list pid _start _stop _from _to _status _trans if pid==1 | pid==1371, noobs
  +---------------------------------------------------------------+
  |  pid      _start       _stop   _from   _to   _status   _trans |
  |---------------------------------------------------------------|
  |    1           0   59.104721       1     2         0        1 |
  |    1           0   59.104721       1     3         0        2 |
  | 1371           0   16.558521       1     2         1        1 |
  | 1371           0   16.558521       1     3         0        2 |
  | 1371   16.558521   24.344969       2     3         1        3 |
  +---------------------------------------------------------------+

and also returns a default transition matrix, if one was not provided. We need to store this for later use. In this case it’s the illness-death transition matrix, as it will assume an upper triangular transition matrix, with a common initial state.

. mat tmat = r(transmatrix)
. mat list tmat
tmat[3,3]
               to:    to:    to:
            start    rfi    osi
from:start      .      1      2
  from:rfi      .      .      3
  from:osi      .      .      .

We can now stset our dataset, using the msset created variables,

. stset _stop, enter(_start) failure(_status==1) scale(12) 

Survival-time data settings

         Failure event: _status==1
Observed time interval: (0, _stop]
     Enter on or after: time _start
     Exit on or before: failure
     Time for analysis: time/12

--------------------------------------------------------------------------
      7,482  total observations
          0  exclusions
--------------------------------------------------------------------------
      7,482  observations remaining, representing
      2,790  failures in single-record/single-failure data
 38,474.539  total analysis time at risk and under observation
                                                At risk from t =         0
                                     Earliest observed entry t =         0
                                          Last observed exit t =  19.28268

We also changed the timescale from months into years by using scale(12). Tumour size at diagnosis, size, is a three-level factor variable, which we now create dummy indicator variables by,

. tab size, gen(sz)

     Tumour |
    size, 3 |
classes (t) |      Freq.     Percent        Cum.
------------+-----------------------------------
    <=20 mm |      3,339       44.63       44.63
  >20-50mmm |      3,327       44.47       89.09
     >50 mm |        816       10.91      100.00
------------+-----------------------------------
      Total |      7,482      100.00

Neither merlin or predictms support factor variables so you must create your own dummies. We can now fit our multi-state model, in this case fitting transition-specific Cox models. For transition 1,

. stmerlin age sz2 sz3 nodes pr_1 hormon if _trans==1, dist(cox) 
Obtaining initial values

Fitting full model:

Iteration 0:   log likelihood = -11210.638  
Iteration 1:   log likelihood = -11209.949  
Iteration 2:   log likelihood = -11209.948  

Survival model                                           Number of obs = 2,982
Log likelihood = -11209.948
------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
_t:          |            
         age |  -.0065188   .0020942    -3.11   0.002    -.0106234   -.0024141
         sz2 |   .3699927   .0579848     6.38   0.000     .2563447    .4836407
         sz3 |   .6465575   .0871242     7.42   0.000     .4757972    .8173178
       nodes |   .0787774   .0045086    17.47   0.000     .0699407    .0876141
        pr_1 |  -.0440715   .0115572    -3.81   0.000    -.0667232   -.0214198
      hormon |  -.0545278   .0823277    -0.66   0.508    -.2158871    .1068315
------------------------------------------------------------------------------

. estimates store m1

Note I use stmerlin, which is a wrapper for the more complex merlin command, but it uses the st variables created by stset so is much more convenient for use when fitting standard survival models. We can reassure ourselves that stmerlin‘s Cox model agrees with official Stata’s stcox,

. stcox age sz2 sz3 nodes pr_1 hormon if _trans==1

         Failure _d: _status==1
   Analysis time _t: _stop/12
  Enter on or after: time _start

Iteration 0:   log likelihood = -11429.625
Iteration 1:   log likelihood =   -11225.2
Iteration 2:   log likelihood = -11209.961
Iteration 3:   log likelihood = -11209.948
Refining estimates:
Iteration 0:   log likelihood = -11209.948

Cox regression with Breslow method for ties

No. of subjects =       2,982                           Number of obs =  2,982
No. of failures =       1,518
Time at risk    = 17,203.8001
                                                        LR chi2(6)    = 439.35
Log likelihood = -11209.948                             Prob > chi2   = 0.0000

------------------------------------------------------------------------------
          _t | Haz. ratio   Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         age |   .9935024   .0020806    -3.11   0.002     .9894329    .9975888
         sz2 |   1.447724   .0839459     6.38   0.000     1.292198    1.621969
         sz3 |   1.908958   .1663164     7.42   0.000     1.609296    2.264418
       nodes |   1.081963   .0048782    17.47   0.000     1.072445    1.091567
        pr_1 |   .9568855   .0110589    -3.81   0.000     .9354541    .9788079
      hormon |   .9469315   .0779587    -0.66   0.508     .8058257    1.112746
------------------------------------------------------------------------------

which indeed it does. Phew. For transition 2,

. stmerlin age sz2 sz3 nodes pr_1 hormon if _trans==2, dist(cox)
Obtaining initial values

Fitting full model:

Iteration 0:   log likelihood = -1187.0791  
Iteration 1:   log likelihood = -1186.9616  
Iteration 2:   log likelihood = -1186.9616  

Survival model                                           Number of obs = 2,982
Log likelihood = -1186.9616
------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
_t:          |            
         age |   .1276223   .0080824    15.79   0.000      .111781    .1434635
         sz2 |   .1803381   .1613346     1.12   0.264     -.135872    .4965482
         sz3 |   .4201997   .2333692     1.80   0.072    -.0371956     .877595
       nodes |   .0438643   .0184724     2.37   0.018     .0076591    .0800694
        pr_1 |   .0306077   .0335758     0.91   0.362    -.0351997    .0964151
      hormon |  -.0934567   .2313061    -0.40   0.686    -.5468083    .3598949
------------------------------------------------------------------------------

. estimates store m2

and for transition 3,

. stmerlin age sz2 sz3 nodes pr_1 hormon if _trans==3, dist(cox)
note; a delayed entry model is being fitted
Obtaining initial values

Fitting full model:

Iteration 0:   log likelihood = -6277.4464  
Iteration 1:   log likelihood = -6277.4409  
Iteration 2:   log likelihood = -6277.4409  

Survival model                                           Number of obs = 1,518
Log likelihood = -6277.4409
------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
_t:          |            
         age |   .0047152   .0024199     1.95   0.051    -.0000277    .0094581
         sz2 |   .1612421   .0712509     2.26   0.024      .021593    .3008913
         sz3 |   .3141532   .0992226     3.17   0.002     .1196805     .508626
       nodes |   .0288918   .0057254     5.05   0.000     .0176702    .0401134
        pr_1 |  -.1028742   .0139701    -7.36   0.000    -.1302551   -.0754934
      hormon |   .0879435   .0966455     0.91   0.363    -.1014783    .2773653
------------------------------------------------------------------------------

. estimates store m3

Each time we store the model results using estimates store. We can then pass the model objects to predictms to obtain a huge range of predictions. It’s that simple.

First we create a time variable, at which to calculate predictions,

. range tvar 0 15 100
(7,382 missing values generated)

We request transition probabilities with the probability option, which by default assume you start in state 1 at time 0. I’m also predicting for a patient with age 50 through use of the at1() option (variables not specified in the at#() statement(s) will be set to 0),

. predictms,      transmatrix(tmat)       ///
>                 models(m1 m2 m3)        ///
>                 probability             ///
>                 at1(age 50)             ///
>                 timevar(tvar)

which gives us some new variables,

. list _prob_* tvar if inlist(_n,34,67,100), noobs ab(12)

  +---------------------------------------------------+
  | _prob_at1_~1   _prob_at1_~2   _prob_at1_~3   tvar |
  |---------------------------------------------------|
  |       .67081         .14757         .18162      5 |
  |       .51839         .13569         .34592     10 |
  |        .3952         .12417         .48063     15 |
  +---------------------------------------------------+

We can use the graphms command to obtain a stacked plot of the transition probabilities,

. graphms

As well as transition probabilities, we can calculate the mean length of time spent in each state, as a function of follow-up time, by adding the los option,

. predictms,      transmatrix(tmat)       ///
>                 models(m1 m2 m3)        ///
>                 probability             ///
>                 los                     ///
>                 at1(age 50)             ///
>                 timevar(tvar)
. list _los_* tvar if inlist(_n,34,67,100)

      +------------------------------------------+
      | _los_at~1   _los_at~2   _los_at~3   tvar |
      |------------------------------------------|
  34. | 4.1014629   .51887885   .37965824      5 |
  67. | 7.0596484   1.2160731   1.7242785     10 |
 100. |  9.374813   1.7923428   3.8328442     15 |
      +------------------------------------------+

We can get useful contrasts between covariate patterns through use of the at2() option (you can use more at#()s if you wish). Here, we’ll calculate the difference in transition probabilities for a patient aged 60, compared to a patient aged 50 (at1() is the default reference group – you can change that with atreference()),

. predictms,      transmatrix(tmat)       ///
>                 models(m1 m2 m3)        ///
>                 probability             ///
>                 los                     ///
>                 at1(age 50)             ///
>                 at2(age 60)             ///
>                 timevar(tvar)           ///
>                 difference
. list _diff_prob_* tvar if inlist(_n,34,67,100), ab(15)

      +------------------------------------------------------------+
      | _diff_prob_at~1   _diff_prob_at~2   _diff_prob_at~3   tvar |
      |------------------------------------------------------------|
  34. |          .00441           -.01051             .0061      5 |
  67. |         -.00306           -.01338            .01644     10 |
 100. |         -.02906           -.01822            .04728     15 |
      +------------------------------------------------------------+

Instead of the difference, we could get the ratio,

. predictms,      transmatrix(tmat)       ///
>                 models(m1 m2 m3)        ///
>                 probability             ///
>                 los                     ///
>                 at1(age 50)             ///
>                 at2(age 60)             ///
>                 timevar(tvar)           ///
>                 ratio

The above is brief (we know), and shows the main new additions to functionality. Currently, confidence intervals through the ci option are not supported, but we’re working on this, along with many more developments. Drop us an email to [email protected], for any feedback, feature requests and bug reports.

Latest Resources

Specialist subjects

Real-world evidence (RWE)

Real-world evidence (RWE) Data and information that, unlike data generated in clinical trials conducted in controlled environments, has been obtained from everyday clinical practice, patient registers, or other sources outside the clinical trial setting.   RWE plays a crucial role in complementing traditional clinical trial data, providing insights into the safety, effectiveness, and overall performance […]
Read more

Videos

State-of-the-art statistical models for modern HTA

At @RedDoorAnalytics, we develop methodology and software for efficient modelling of biomarkers, measured repeatedly over time, jointly with survival outcomes, which are being increasingly used in cancer settings. We have also developed methods and software for general non-Markov multi-state survival analysis, allowing for the development of more plausible natural history models, where patient history can […]
Read more

Videos

Multilevel (hierarchical) survival models: Estimation, prediction, interpretation

Hierarchical time-to-event data is common across various research domains. In the medical field, for instance, patients are often nested within hospitals and regions, while in education, students are nested within schools. In these settings, the outcome is typically measured at the individual level, with covariates recorded at any level of the hierarchy. This hierarchical structure […]
Read more

Statistical Primers

What are competing risks?

Competing risks In survival analysis, competing risks refer to the situation when an individual is at risk of experiencing an event that precludes the event under study to occur. Competing risks commonly occur in studies of cause-specific mortality, as all other causes of death than the one under study might happen before the individuals “have […]
Read more

Statistical Primers

What is immortal time bias?

Immortal time bias Immortal time bias is a type of bias that can occur in observational research when the study design allows for a period of time during which the outcome of interest cannot occur, often referred to as “immortal time”. Simply put, immortal time bias occurs when information from a future event is incorporated into the […]
Read more

Statistical Primers

What is the proportional hazards assumption?

Proportional hazards Proportional hazards in survival analysis means that the rate at which an event of interest occurs over time for two or more groups or individuals is proportional over time. Specifically, it assumes that the hazard ratio, which represents the relative rate of an event occurring between two groups or individuals, is constant over […]
Read more

Statistical Primers

What is censoring?

Censoring refers to a situation in survival analysis where the event of interest is not observed for some of the individuals under study. In this Statistical Primer, we’ll define three types of censoring often seen in survival analysis studies. Censoring occurs when the information on the survival time is incomplete or only partially observed. Censoring […]
Read more

Statistical Primers

What is the Cox model?

The Cox model The Cox model, also known as the proportional hazards model, is a popular statistical tool used to analyse survival data. It was developed by British statistician Sir David Cox, and published in 1972. It has gained popularity largely by avoiding making parametric assumptions about the shape of the baseline rate in a […]
Read more

Statistical Primers

What is survival analysis?

Survival analysis is a statistical method used to analyse the time until an event of interest occurs. The key feature of survival analysis is that the outcome has two dimensions: – an event indicator (yes/no), and – the time spent at risk for the event All survival analyses require precise definitions of start and end of […]
Read more

Tutorials

Multivariate joint longitudinal-survival models

Joint longitudinal-survival models have been widely developed, but there are many avenues of research where they are lacking in terms of methodological development, and importantly, accessible implementations. We think merlin fills a few gaps. In this post, we’ll take a look at the extension to modelling multiple continuous longitudinal outcomes, jointly with survival. For simplicity, I’ll concentrate […]
Read more

Tutorials

Simulation and estimation of three-level survival models: IPD meta-analysis of recurrent event data

In this example I’ll look at the analysis of clustered survival data with three levels. This kind of data arises in the meta-analysis of recurrent event times, where we have observations (events or censored), k (level 1), nested within patients, j (level 2), nested within trials, i (level 3). Random intercepts The first example will […]
Read more

Tutorials

Probabilistic sensitivity analysis and survival models

Today we’re going to take a little look into probabilistic sensitivity analysis (PSA), and how it can be implemented within the context of survival analysis. Now PSA is used extensively in health economic modelling, where a particular parameter (or parameters) of interest, are altered or varied, to represent different scenarios and levels of variation. We […]
Read more
All Resources