Brace yourselves, this one is going to be intense! I selected the next paper due to its use of an unusual modeling technique, the random forest model (RFM). What, you say, is that? Exactly!
The RFM is a machine learning method that is relatively new to trauma research (about five years old in that field). It tries to tease out relationships in data that may not otherwise be apparent. It is different from its better-known cousin, the neural network: rather than layers of artificial neurons, it builds a large collection of decision trees and combines their votes, striking a balance between flexibility and structure so it can deduce rules from data sets.
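For readers who like to see the gears turn, here is a toy Python sketch of the random forest idea: many weak learners (single-split "stumps" standing in for full decision trees), each fit to a bootstrap resample of the data, vote on the final prediction. The patient data and variable names below are invented purely for illustration; this is a sketch of the general technique, not the authors' actual model.

```python
import random

# Toy sketch of the random-forest idea: an ensemble of weak learners
# (one-split "stumps" standing in for full decision trees), each fit
# to a bootstrap resample, votes on the final class. Data are invented.

def fit_stump(rows):
    """Find the single (feature, threshold) split that classifies best."""
    best = None
    n_features = len(rows[0][0])
    for f in range(n_features):
        for x, _ in rows:
            t = x[f]
            left = [y for xv, y in rows if xv[f] <= t]
            right = [y for xv, y in rows if xv[f] > t]
            if not left or not right:
                continue
            lmaj = max(set(left), key=left.count)    # majority label, left side
            rmaj = max(set(right), key=right.count)  # majority label, right side
            correct = sum((lmaj if xv[f] <= t else rmaj) == y for xv, y in rows)
            if best is None or correct > best[0]:
                best = (correct, f, t, lmaj, rmaj)
    if best is None:                        # degenerate sample: constant vote
        const = rows[0][1]
        return lambda x: const
    _, f, t, lmaj, rmaj = best
    return lambda x: lmaj if x[f] <= t else rmaj

def random_forest(rows, n_trees=25, seed=0):
    """Fit n_trees stumps, each on a bootstrap sample; predict by vote."""
    rng = random.Random(seed)
    stumps = [fit_stump([rng.choice(rows) for _ in rows])   # bootstrap sample
              for _ in range(n_trees)]
    def predict(x):
        votes = [s(x) for s in stumps]
        return max(set(votes), key=votes.count)             # majority vote
    return predict

# Hypothetical patients: (biomarker_level, sofa_score) -> AKI (1) or not (0)
data = [((2.1, 3), 0), ((1.8, 2), 0), ((2.0, 4), 0), ((2.4, 5), 0),
        ((5.5, 9), 1), ((6.0, 8), 1), ((4.9, 10), 1), ((5.1, 7), 1)]
predict_aki = random_forest(data)
```

The "requires few statistical assumptions" selling point shows up here: nothing about this procedure assumes normal distributions or linear relationships, and each individual stump can be quite weak because the vote smooths out their errors.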
The authors from the trauma program at Emory in Atlanta wanted to develop a predictive model to identify factors leading to acute kidney injury (AKI) in trauma patients. They assembled a small data set from 145 patients culled over a four-year period. Some esoteric lab tests were collected on these patients (including serum vascular endothelial growth factor and serum monocyte chemoattractant protein-1), the Sequential Organ Failure Assessment (SOFA) score was calculated, and then all of it was fed to the machine learning system.
The authors go into some detail about how they accomplished this work. The main results are the sensitivity and specificity of the RFM analysis. The RFM output was also converted to a logistic regression equation and similarly examined. The area under the receiver operating characteristic curve (AUROC) was calculated for both.
Here are the factoids when using SOFA and the two biomarkers above:
- For RFM: sensitivity 0.82, specificity 0.61, AUROC 0.74
- For the resulting logistic regression: sensitivity 0.77, specificity 0.64, AUROC 0.72
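To make those three numbers concrete, here's a short Python sketch of how each is computed from a model's predicted probabilities. The probabilities below are invented to roughly mimic the reported ballpark; they are not the study's data.

```python
# Sensitivity: fraction of true AKI cases the model flags.
# Specificity: fraction of non-AKI patients it correctly clears.
# AUROC: probability that a randomly chosen AKI case scores higher than
# a randomly chosen non-AKI patient (Mann-Whitney formulation).
# All numbers below are invented for illustration.

def sens_spec(y_true, y_score, threshold=0.5):
    tp = sum(y == 1 and s >= threshold for y, s in zip(y_true, y_score))
    fn = sum(y == 1 and s < threshold for y, s in zip(y_true, y_score))
    tn = sum(y == 0 and s < threshold for y, s in zip(y_true, y_score))
    fp = sum(y == 0 and s >= threshold for y, s in zip(y_true, y_score))
    return tp / (tp + fn), tn / (tn + fp)

def auroc(y_true, y_score):
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)  # ties count half
    return wins / (len(pos) * len(neg))

y_true  = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]                 # 1 = developed AKI
y_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2, 0.3, 0.1, 0.6, 0.7]

sens, spec = sens_spec(y_true, y_score)  # 0.8, 0.6
auc = auroc(y_true, y_score)             # 0.84
```

Note that sensitivity and specificity depend on where you set the probability threshold, while AUROC summarizes performance across all thresholds, which is why it's the usual headline number for models like this.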
The authors conclude that the biomarkers “may have diagnostic utility” in the early identification of patients who go on to develop AKI and that “further refinement and validation” could be helpful.
I’ll say! First, RFM is a very esoteric analysis tool, especially in the trauma world. Typically, its strengths are the following:
- Requires few statistical assumptions, such as normally distributed data
- Combines many lower-quality models (individual decision trees) into a stronger result
- Shows the relative importance of each prediction feature, unlike the opacity of neural networks

And its weaknesses:

- It’s complicated
- Doesn’t extrapolate well to data outside the ranges found in the training set
- May be difficult to interpret
But the real problem here is with the results. At this point, they are weak at best. The algorithm correctly predicts only about 4 of 5 actual cases of AKI and correctly identifies barely more than half of the patients who don’t develop it. The latter is close to a coin toss. A good AUROC number is better than 0.8. The ones obtained here are fair to poor at best.
I understand that this is probably a pilot study. But it seems unlikely that adding more data points will help, especially if the same input parameters are to be used in the future. I think this is an interesting exercise, but I need help seeing any future clinical applicability!
Here are my questions for the presenter and authors:
- Why did it occur to you to try this technique? Who thought to use it? Your statisticians? What was the rationale, aside from not being able to collect any more data for the study? The origin story should be very interesting!
- Given the lackluster results, how are you planning to “refine and validate” to make them better?
- What future do you see for using RFM in other trauma-related studies?
I’m intrigued! Can’t wait to hear the punch lines!
Reference: Random forest model predicts acute kidney injury after trauma laparotomy. AAST Oral Abstract #11.