When we have a large body of unstructured text, it is very difficult to understand its content and context, because reading through it takes a lot of time and effort. Hence there is a need for a technique that can find the key topics discussed in the text without reading it in full. This technique is known as topic modelling.
A lot of research has been done on topic modelling. Latent Dirichlet Allocation (LDA) is one of the most widely used algorithms for finding hidden topics in text data. I have used Gensim, which is one of the most popular Python libraries for topic modelling.
Using LDA, I tried to find the key topics discussed in a New York Times dataset. To visualize the topic-word distributions, I used pyLDAvis, word clouds and co-occurrence graphs.
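To give an idea of what sits behind a co-occurrence graph, the sketch below counts how often two words appear in the same document. Each pair becomes an edge and the count becomes its weight. This is only an assumed approach using the standard library: the helper name cooccurrence_edges, the min_count threshold and the sample documents are hypothetical and may differ from what the notebook does.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(docs, min_count=1):
    """Count how often each unordered word pair shares a document.

    Hypothetical helper for illustration; returns {(word_a, word_b): count}.
    """
    counts = Counter()
    for tokens in docs:
        # Use distinct words only, so a pair is counted once per document.
        # Sorting makes (a, b) and (b, a) map to the same key.
        for pair in combinations(sorted(set(tokens)), 2):
            counts[pair] += 1
    # Keep only the pairs seen in at least min_count documents.
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Toy tokenized documents (invented for illustration).
docs = [
    ["election", "vote", "senate"],
    ["vote", "senate", "bill"],
    ["market", "stocks", "bank"],
]

edges = cooccurrence_edges(docs, min_count=2)
print(edges)
```

The resulting dictionary can be passed to a graph library as a weighted edge list for drawing.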
Please refer to the notebook for more details.