Data analysis and text mining have become steadily more valuable over time, in large part because a wide range of tools and algorithms now makes them much easier to apply.

Some of our top picks are below.

Decision Trees – Well suited to both regression and classification. Popular decision tree algorithms include C4.5, CART (Classification and Regression Trees) and ID3.
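To give a feel for how a tree picks a split, here is a minimal sketch of the Gini-impurity criterion that CART uses for classification. It handles a single numeric feature and a single split only, so it is a simplification rather than a full tree builder. The function names and the toy message-length data are made up for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' split."""
    best_threshold, best_score = None, float("inf")
    distinct = sorted(set(values))
    # Candidate thresholds are the midpoints between consecutive distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        # Weight each side's impurity by the share of samples it receives.
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical toy data: message length in words and a spam/ham label.
lengths = [3, 5, 8, 20, 25, 30]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

threshold, score = best_split(lengths, labels)
print(threshold, score)
```

A real tree repeats this search over every feature and then recurses into each side of the chosen split until a stopping rule is met.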

Naive Bayes – A popular text mining technique that performs probabilistic classification of content into categories. It is often used for two-class problems such as positive versus negative, open versus closed, or spam versus not spam, although it also handles more than two classes.
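The probabilistic idea can be sketched in a few lines. The example below is a bare-bones multinomial Naive Bayes with add-one (Laplace) smoothing, written with only the standard library. The `train` and `predict` helpers and the four training messages are assumptions for illustration, not a production classifier.

```python
import math
from collections import Counter, defaultdict


def train(documents):
    """documents is a list of (text, label) pairs; returns the counts used to predict."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in documents:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return label_counts, word_counts, vocabulary


def predict(text, model):
    """Return the label with the highest log-probability score for the text."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label, doc_count in label_counts.items():
        # Start from the log prior, log P(label).
        score = math.log(doc_count / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            # Add log P(word | label) with add-one smoothing.
            smoothed = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(smoothed)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training messages.
model = train([
    ("win a free prize now", "spam"),
    ("free money offer", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
])

print(predict("free prize offer", model))
print(predict("monday meeting", model))
```

The "naive" part is the assumption that words are independent given the label, which is why the per-word log-probabilities can simply be added up.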

Artificial Neural Networks

Neural networks can learn non-linear relationships between inputs and outputs, and networks with many layers are the basis of deep learning. They are commonly used in stock analysis, handwriting recognition and predictive analytics.
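To illustrate what "non-linear" buys you, consider XOR: no single linear unit can compute it, but a network with one hidden layer can. The sketch below uses hand-picked weights and a step activation purely for illustration. A real network would learn its weights from data, and `xor_network` is a hypothetical name.

```python
def step(z):
    """Step activation: fire (1) when the weighted input is positive, else 0."""
    return 1 if z > 0 else 0


def xor_network(x1, x2):
    """Two-layer network with hand-picked (not learned) weights that computes XOR."""
    # Hidden layer: one unit behaves like OR, the other like AND.
    h_or = step(1.0 * x1 + 1.0 * x2 - 0.5)
    h_and = step(1.0 * x1 + 1.0 * x2 - 1.5)
    # Output layer: fire when OR is on but AND is off.
    return step(1.0 * h_or - 1.0 * h_and - 0.5)


for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, xor_network(a, b))
```

Without the hidden layer and its non-linear activation, the output would be a single weighted sum of the inputs, which cannot separate the XOR cases.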

Support Vector Machines

SVMs are supervised machine learning algorithms that apply to both regression and classification. They are often used to classify visual data, for example in facial recognition, image classification and labeling images (such as for ALT attributes).

K Nearest Neighbors

k-NN is used to search for items that are similar to a given one. You determine similarity by creating a vector representation of each item and then comparing the vectors using a distance metric such as Euclidean distance.

A familiar example of k-NN in practice is an e-commerce site’s product recommendation feature. You can also use k-NN for concept search (finding semantically similar documents).
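The vector-and-distance idea described above can be sketched as follows. The product names and their three-number feature vectors are invented for illustration. In practice the features would usually be scaled first so that no single one dominates the distance.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(query, items, k):
    """Return the names of the k items whose vectors lie closest to the query."""
    ranked = sorted(items, key=lambda name: euclidean(query, items[name]))
    return ranked[:k]


# Hypothetical product vectors: (price in hundreds, weight in kg, average rating).
products = {
    "laptop_a": (12.0, 1.4, 4.5),
    "laptop_b": (11.5, 1.5, 4.4),
    "tablet": (5.0, 0.5, 4.2),
    "phone": (8.0, 0.2, 4.6),
    "desktop": (15.0, 8.0, 4.1),
}

# Recommend the two products most similar to laptop_a, excluding itself.
query = products["laptop_a"]
others = {name: vec for name, vec in products.items() if name != "laptop_a"}
similar = nearest_neighbors(query, others, k=2)
print(similar)
```

For concept search the same code applies once documents are turned into vectors, and cosine distance is a common alternative to Euclidean distance in that setting.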

Latent Dirichlet Allocation (LDA)

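Latent Dirichlet Allocation is a probabilistic topic model. It treats each document as a mixture of hidden topics and each topic as a distribution over words, which helps uncover thematic connections between documents in a collection. It should not be confused with Linear Discriminant Analysis, which shares the abbreviation LDA and finds linear combinations of features that distinguish between classes.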

Which of these is your favourite?