It is amazing how many hours one can spend with data and get... well... only 50 cells along in an ENTIRE day! But I did learn a lot about what one CAN and CANNOT do to sweet-talk decision trees and random forest models. I tried and tried to get data that would give me more exciting plots, but it looks like 85.5% accuracy against a validation set is about the best one can do, at least when dealing with the rain in Australia.
Just blame it on the data, right?
Here are my favorite cells from my work today! I am quite proud of my precious little functions to get some training vs validation numbers out of all my lovely trees and then to plot those findings so that I can come up with the perfect combinations of parameter values for the best possible results! This is a FANTASTIC way to TRULY know a machine learning model! And those are some mighty gorgeous plots, if I do say so myself!
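For anyone who wants to try the same idea, here is a minimal sketch of that kind of helper pair. It is not the exact code from my notebook: the function names `accuracy_by_param` and `plot_accuracy` are made up for illustration, the data is a synthetic stand-in from scikit-learn's `make_classification` rather than the Australian rain data, and I am assuming a plain `DecisionTreeClassifier` scored on accuracy.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def accuracy_by_param(param_name, values, X_train, y_train, X_val, y_val, **fixed):
    """Fit one tree per parameter value; return (value, train_acc, val_acc) rows.

    Hypothetical helper: `fixed` holds any other tree parameters to keep constant.
    """
    rows = []
    for value in values:
        model = DecisionTreeClassifier(random_state=42, **{param_name: value}, **fixed)
        model.fit(X_train, y_train)
        rows.append((value,
                     model.score(X_train, y_train),   # accuracy on training data
                     model.score(X_val, y_val)))      # accuracy on validation data
    return rows


def plot_accuracy(rows, param_name):
    """Plot training vs validation accuracy against the parameter values."""
    import matplotlib.pyplot as plt  # imported here so the helper above works without it

    values, train_acc, val_acc = zip(*rows)
    plt.plot(values, train_acc, marker="o", label="training")
    plt.plot(values, val_acc, marker="o", label="validation")
    plt.xlabel(param_name)
    plt.ylabel("accuracy")
    plt.legend()
    plt.show()


# Synthetic stand-in data (an assumption, NOT the rain dataset).
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=42
)

rows = accuracy_by_param("max_depth", range(1, 11), X_train, y_train, X_val, y_val)
for value, train_acc, val_acc in rows:
    print(f"max_depth={value:2d}  train={train_acc:.3f}  val={val_acc:.3f}")
```

Calling `plot_accuracy(rows, "max_depth")` afterwards draws the two curves. The usual thing to look for is the point where training accuracy keeps climbing while validation accuracy flattens or dips, since that is where the tree starts to overfit. The same loop should work for other parameters such as `max_leaf_nodes` or `min_samples_leaf` by changing `param_name` and `values`.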