How do you choose which Machine Learning algorithm to use for a problem?
It is hard to know which machine learning algorithm will work best on a particular problem without trying several of them, together with hyperparameter optimisation. Still, a few pointers help narrow down the candidates:
1. Time Series Data: For data with one dependent variable observed as a time sequence, classical algorithms like ARIMA and sequence models like LSTM can be benchmarked against each other to find the best solution.
2. Speech/Text Analytics: A deep learning based approach built on sequence models like RNN and LSTM, for example in a sequence-to-sequence setup, is probably a good start.
3. Text Classification: Sequence models like HMM and CRF (which assign a label to each token, as in part-of-speech tagging) and LSTM can be tried for this problem.
4. Structured Data (Regression): Linear regression can be used as a baseline, followed by SVM regression, first with a linear kernel and then with non-linear kernels like RBF. Tree based ensemble models like Random Forest and XGBoost should also be tried, since they can capture non-linear relationships and feature interactions.
5. Structured Data (Classification): We can start with Logistic Regression as a baseline. Its coefficients also indicate the importance of each independent variable, provided the features are on comparable scales. Furthermore, SVM, Random Forest and XGBoost can be tried. If we have a large number of training examples and better hardware, then a deep learning based solution with an appropriate output activation, such as softmax (multiclass) or sigmoid (binary), can be used.
6. Image/Video based Data: For image/video based data a pre-trained DL based network is a good starting point. In particular, tried and tested architectures like VGG and ResNet-50 trained on the ImageNet dataset can be used, and the final few layers can be retrained to tailor the network to our particular problem. For real-time object detection, YOLO is a very elegant solution and can be given a try.
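As a rough illustration of the "start with a baseline, then try stronger models" advice for structured classification, the sketch below compares a few candidates with cross-validation. It assumes scikit-learn is installed. The synthetic dataset from make_classification, the helper name benchmark_classifiers and the hyperparameter values are illustrative assumptions, not tuned recommendations. XGBoost is left out only to avoid an extra dependency.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def benchmark_classifiers(X, y, cv=5):
    """Return the mean cross-validated accuracy of each candidate model.

    Hypothetical helper for illustration; the candidate list and the
    hyperparameters are assumptions, not tuned values.
    """
    candidates = {
        # Baseline: scaling makes the coefficients comparable across features.
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        # Non-linear kernel SVM; it is sensitive to feature scale, so scale too.
        "svm_rbf": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        # Tree ensemble; no scaling needed.
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    }
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
        for name, model in candidates.items()
    }


# Synthetic stand-in for a real structured dataset (an assumption).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)
scores = benchmark_classifiers(X, y)

# Print the candidates from best to worst mean accuracy.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Which model ranks first depends on the data, which is the point of benchmarking. On a real problem, each candidate would normally also get its own hyperparameter search before the comparison is taken seriously.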