Interactive Naive Bayes using Shiny: Text Retrieval, Classification, Quantification
Interactive Machine Learning (IML) is a relatively new area of ML where focused interactions between algorithms and humans allow for faster and more accurate model updates with respect to classical ML algorithms. By involving users directly in the process of optimizing the parameters of the ML model, it is possible to quickly improve the effectiveness of the model and also to understand why some values of the parameters of the model work better than others through low-cost trial and error and experimentation with inputs and outputs. In this talk, we show three interactive applications developed with the Shiny package on the problems of text retrieval, text classification and text quantification. These applications implement a probabilistic model that use the Naïve Bayes (NB) assumption which has been widely recognised as a good trade-off between efficiency and efficacy, but it achieves satisfactory results only when optimized properly. All these three applications provide a two-dimensional representation of probabilities that has been inspired by the approach named Likelihood Spaces. This representation provides an adequate data visualization to understand how parameters and costs optimization affects the performance of the retrieval/classification/quantification application in a real machine learning setting on standard text collections. We will show that this particular geometrical interpretation of the probabilistic model together with the interaction significantly improves not only the performance but also the understanding of the models and opens new perspectives for new research studies.