Subject: Re: ML for MI
FC & Bob, I think the Kaggle evaluation is a bit misguided. R and Python are increasingly convergent. In fact, I went to Wes McKinney's website: he now works at Posit, i.e. the former RStudio. Posit is clearly aiming for massive convergence and fungibility between these two platforms.
Most tools are common to both, but there is typically one key differentiator: Python is a development tool, while R is primarily research oriented (Shiny being the notable exception). Because both work in memory, computing architecture is critical, and this is where R has overtaken Python in the last few years. A lot of this may not be very relevant for the ML exercise on MI-type attributes (the data is at most daily, and most firm-related attributes change only quarterly), but if the universe is all stocks (like CRSP), computing efficiency will start to matter.
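To put a rough number on the in-memory point, here is a back-of-the-envelope sketch. The stock, day and feature counts are made-up illustrations (not actual CRSP figures), and it assumes a dense panel of 8-byte floats with no overhead from the language or library.

```python
def panel_memory_gib(n_stocks, n_days, n_features, bytes_per_value=8):
    """Approximate size of a dense stock-day panel held in memory, in GiB."""
    n_values = n_stocks * n_days * n_features
    return n_values * bytes_per_value / 1024**3


# Hypothetical sizes, chosen only for illustration.
small = panel_memory_gib(n_stocks=500, n_days=252 * 10, n_features=50)
large = panel_memory_gib(n_stocks=8000, n_days=252 * 30, n_features=50)

print(f"500 stocks, 10 years of daily data: {small:.2f} GiB")
print(f"8000 stocks, 30 years of daily data: {large:.2f} GiB")
```

Under these assumptions the small universe fits comfortably on a laptop, while the all-stocks case is already in the tens of GiB before any model copies the data.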
If you are writing your own code, pick whichever you are comfortable with.
But if you are after something that involves minimal code, or you want a GUI look and feel, my suggestions would be:
H2O.ai: That team is primarily responsible for most of the big-data breakthroughs in R, and the platform is also cross-platform; in fact, their Python version is a bit more feature rich.
They use distributed computing, so you can pretty much throw a lot at it.
In fact, once you set up an ML pipeline in H2O, it will practically work across algorithms/methods. They also now have deep learners.
mlr3: R-specific ML pipelines.
Rattle: This is practically a defunct R package, but it does basic ML modelling in a GUI-driven interface. However, most of its dependencies are no longer supported on CRAN, and the size of the data will be an issue. I don't think it can efficiently process something like CRSP.
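On the H2O point about one pipeline working across algorithms, here is a minimal sketch in Python. It relies on H2O estimators sharing the same train(x=..., y=..., training_frame=...) call. The helper names fit_all and run_h2o_demo are my own, the CSV path and target column are placeholders, and the demo assumes a numeric target plus an installed h2o package with a Java runtime.

```python
def fit_all(estimators, x, y, training_frame):
    """Train every estimator on the same inputs and return them keyed by name."""
    fitted = {}
    for name, estimator in estimators.items():
        estimator.train(x=x, y=y, training_frame=training_frame)
        fitted[name] = estimator
    return fitted


def run_h2o_demo(csv_path, target):
    """Hypothetical driver: csv_path and target are placeholders for your data."""
    import h2o  # imported lazily: needs the h2o package and a Java runtime
    from h2o.estimators import (
        H2ODeepLearningEstimator,
        H2OGradientBoostingEstimator,
        H2ORandomForestEstimator,
    )

    h2o.init()  # starts or connects to a local H2O cluster
    frame = h2o.import_file(csv_path)
    train, valid = frame.split_frame(ratios=[0.8], seed=1)
    features = [col for col in frame.columns if col != target]

    # Same data, same call, three different algorithms.
    models = fit_all(
        {
            "gbm": H2OGradientBoostingEstimator(seed=1),
            "random_forest": H2ORandomForestEstimator(seed=1),
            "deep_learning": H2ODeepLearningEstimator(seed=1),
        },
        x=features,
        y=target,
        training_frame=train,
    )
    for name, model in models.items():
        print(name, model.model_performance(valid).rmse())
```

Calling something like run_h2o_demo("panel.csv", "ret_fwd") (both names hypothetical) would fit all three and print a validation RMSE for each; swapping in another method is just one more entry in the dictionary.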
Best of luck!