Simpson's Paradox and Kidney Stone
What a coincidence. I was reading up on Simpson's Paradox a few days ago, and when I logged into my Datacamp account yesterday, I noticed that Datacamp had just uploaded a new project on Simpson's Paradox, based on a medical study of kidney stone treatments.
So what is Simpson's Paradox?
Simpson's Paradox occurs when a trend that appears in every subgroup disappears or reverses once the subgroups are combined at the population level.
The Simpson's Paradox and Kidney Stone project compiled by Datacamp is based on a medical dataset published in 1986 in "The British Medical Journal". The dataset simulated by Datacamp records which of two kidney stone treatments each patient received, the size of the stone, and whether the treatment succeeded. The goal of the medical study was to compare the success rates of the two treatments.
In the end, the published study showed that the "lurking" variable (or confounding variable) is the severity of the case, reflected here in the size of the stone, which in turn influences the doctors' choice of treatment.
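To make the reversal concrete, here is a small sketch in Python (not the R code from the Datacamp project). It assumes the success counts commonly cited from the 1986 study, not the simulated Datacamp data, and the labels "A" and "B" and the variable names are my own. It compares the success rates within each stone size with the pooled success rates.

```python
# Successes and totals per treatment and stone size.
# Assumption: counts as commonly cited from the 1986 BMJ study,
# not the simulated Datacamp dataset.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

# Success rate within each stone-size subgroup.
stratified = {
    treatment: {size: s / n for size, (s, n) in by_size.items()}
    for treatment, by_size in counts.items()
}

# Success rate after pooling both stone sizes together.
overall = {
    treatment: sum(s for s, _ in by_size.values())
    / sum(n for _, n in by_size.values())
    for treatment, by_size in counts.items()
}

for treatment in counts:
    for size, rate in stratified[treatment].items():
        print(f"Treatment {treatment}, {size} stones: {rate:.1%}")
    print(f"Treatment {treatment}, all stones: {overall[treatment]:.1%}")
```

With these counts, treatment A has the higher success rate for small stones and for large stones, yet treatment B has the higher pooled rate. The reason is that A was used mostly on large stones (263 of its 350 cases), which are harder to treat, while B was used mostly on small stones (270 of 350).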
The following results and code are inspired by the Datacamp project.
a. The Logistic Regression Result
b. The Coefficient Estimate Plot from the model result
c. The R code is below. (I'll edit and repost after I recompile the code with knitr.)