Question-and-answer websites (e.g., the Stack Exchange communities) are among the main online venues where developers discuss technical issues related to software development. Extracting the topics that recur in these discussions leads to a better understanding of users’ needs and of discussion trends. In this project, we analyze the textual content of these websites using text mining techniques. We use Latent Dirichlet Allocation (LDA), a statistical topic modeling technique, to automatically identify common topics in several domains of software development, including Android application development, cryptography, security, and software engineering. We evaluate our model on several performance measures. We also report the most common topics in each domain and discuss the findings.
sinadabiri/Software_Engineering_GibbsLDA
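
The repository name suggests that LDA is fitted with Gibbs sampling. As a rough illustration of that idea, and not the project's actual code, the following minimal sketch runs collapsed Gibbs sampling on a handful of toy documents. The function names (`gibbs_lda`, `top_words`), the hyperparameter values, and the toy tokens are all assumptions made for this example; they are not taken from the project or its data.

```python
import random
from collections import Counter


def gibbs_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents.

    Hypothetical helper for illustration only; alpha and beta are
    assumed symmetric Dirichlet priors, not values from the project.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]         # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                       # tokens per topic
    assignments = []                                     # topic of each token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]

                # Record the newly sampled assignment.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return topic_word, doc_topic


def top_words(topic_word, n=3):
    """Return up to n most frequent words for each topic."""
    return [
        [w for w, c in counts.most_common() if c > 0][:n]
        for counts in topic_word
    ]


# Toy corpus (made up for this sketch, not real Stack Exchange posts).
toy_docs = [
    ["android", "activity", "intent", "android", "layout"],
    ["cipher", "key", "aes", "cipher", "hash"],
    ["activity", "layout", "intent", "android"],
    ["hash", "key", "aes", "cipher"],
]
topic_word, doc_topic = gibbs_lda(toy_docs, num_topics=2)
for k, words in enumerate(top_words(topic_word)):
    print(f"topic {k}: {words}")
```

On real posts, the documents would first need tokenization and stop-word removal, and the number of topics would have to be chosen per domain; those steps are outside this sketch.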