DJ_laundry_list t1_j2q6fzs wrote on January 3, 2023 at 4:07 AM

When you say "has any influence", I'm assuming you mean causal influence, rather than just being correlated with a particular outcome. This puts us in the domain of causal inference. I suggest you go through a causal inference tutorial or two to get some domain knowledge. See https://causalinference.gitlab.io/kdd-tutorial/ and https://economics.mit.edu/sites/default/files/inline-files/causal_tutorial_1.pdf. Econometric modeling revolves heavily around this, so you're probably going to find more sources that are econometric rather than medical.

My personal approach: train an XGBoost model (or really any ML model) with an appropriate hazard-based objective, using Bayesian optimization for hyperparameter tuning, then compare the model's log likelihood with the variable in question at its actual values vs. counterfactual values. If the counterfactual values give a similar fit, you're looking at something that is not likely causal.
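
A rough sketch of that comparison, under loud assumptions: the data are simulated, a plain exponential-hazard model fit by gradient ascent stands in for XGBoost (so there is no hyperparameter tuning here), and "counterfactual values" are produced by shuffling one column across subjects, which is only one of several ways to build them. All function and variable names are made up for illustration.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def simulate(n=500, censor_at=3.0):
    """Simulated survival data: x1 drives the hazard, x2 is pure noise."""
    rows = []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        hazard = math.exp(1.0 * x1)        # only x1 has a true effect
        t = rng.expovariate(hazard)        # event time with that rate
        # Store (observed time, event indicator, [intercept, x1, x2]).
        rows.append((min(t, censor_at), t <= censor_at, [1.0, x1, x2]))
    return rows

def log_lik(beta, rows):
    """Exponential-hazard log likelihood with right censoring."""
    total = 0.0
    for time, event, x in rows:
        eta = sum(b * v for b, v in zip(beta, x))
        total += (eta if event else 0.0) - time * math.exp(eta)
    return total

def fit(rows, steps=500, lr=0.1):
    """Gradient ascent on the log likelihood (stand-in for a real learner)."""
    beta = [0.0, 0.0, 0.0]
    n = len(rows)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for time, event, x in rows:
            eta = sum(b * v for b, v in zip(beta, x))
            resid = (1.0 if event else 0.0) - time * math.exp(eta)
            for j in range(3):
                grad[j] += resid * x[j] / n
        beta = [b + lr * g for b, g in zip(beta, grad)]
    return beta

def permute_column(rows, j):
    """Counterfactual data: shuffle feature j across subjects."""
    col = [x[j] for _, _, x in rows]
    rng.shuffle(col)
    return [(t, e, x[:j] + [v] + x[j + 1:]) for (t, e, x), v in zip(rows, col)]

rows = simulate()
beta = fit(rows)
actual = log_lik(beta, rows)

# How much worse does the fit get when each feature is replaced
# by counterfactual (shuffled) values?
drop = {name: actual - log_lik(beta, permute_column(rows, j))
        for name, j in (("x1", 1), ("x2", 2))}
print(drop)
```

A large drop in log likelihood means the fit depends on the actual values of that variable; a drop near zero means the counterfactual values fit about as well. On its own this shows what the model relies on, so the causal reading still rests on the assumptions covered in the tutorials above.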

cantfindaname2take t1_j2rch7q wrote on January 3, 2023 at 12:08 PM

On that note, there is a library, xgbse (XGBoost Survival Embeddings), that extends XGBoost specifically with survival analysis capabilities. Here is a tutorial: https://loft-br.github.io/xgboost-survival-embeddings/how_xgbse_works.html

DJ_laundry_list t1_j2uzlay wrote on January 4, 2023 at 3:09 AM

Good call, didn't know about that