Visual exploration of software evolution via topic modeling

Software projects evolve for many reasons, such as new requirements, architecture refactoring, and bug fixing, usually with the aim of improving quality and performance. All changes made during development are reflected in the source code, which provides an opportunity to explore software evolution. In this paper, we propose a visual analytics system that supports evolution analysis based on topic modeling. We focus on three aspects: (1) when significant changes to the source code occur, (2) how software features evolve, and (3) why software evolution occurs. Each source file is regarded as a document and represented by its topic vector. To quantify version differences, the files in each pair of successive versions are classified into four types. To characterize feature evolution, the number of files associated with a topic is defined as that topic's assignment. Finally, we inspect the causes of software evolution through visual comparison between versions. Two case studies on JavaScript libraries demonstrate the usefulness and effectiveness of our system.
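The two quantities described above, the per-version file classification and the topic assignment, can be illustrated with a minimal sketch. The abstract does not name the four file types or say how a file is judged to be associated with a topic, so the labels used here (added, removed, modified, unchanged), the content-hash comparison, and the weight threshold of 0.3 are all assumptions made for illustration, and the function names are hypothetical.

```python
from typing import Dict, List


def classify_files(prev: Dict[str, str], curr: Dict[str, str]) -> Dict[str, List[str]]:
    """Group the files of two successive versions into four types.

    `prev` and `curr` map file paths to a content hash. The four labels
    are an assumption; the abstract does not name them.
    """
    groups: Dict[str, List[str]] = {
        "added": [], "removed": [], "modified": [], "unchanged": [],
    }
    for path in sorted(set(prev) | set(curr)):
        if path not in prev:
            groups["added"].append(path)
        elif path not in curr:
            groups["removed"].append(path)
        elif prev[path] != curr[path]:
            groups["modified"].append(path)
        else:
            groups["unchanged"].append(path)
    return groups


def topic_assignment(topic_vectors: Dict[str, List[float]],
                     threshold: float = 0.3) -> List[int]:
    """Count, for each topic, the files whose weight reaches `threshold`.

    The threshold rule is an assumed definition of "topic-associated".
    """
    if not topic_vectors:
        return []
    n_topics = len(next(iter(topic_vectors.values())))
    counts = [0] * n_topics
    for vector in topic_vectors.values():
        for k, weight in enumerate(vector):
            if weight >= threshold:
                counts[k] += 1
    return counts


# Hypothetical data: two versions of a small library, three topics.
v1 = {"core.js": "h1", "ajax.js": "h2", "old.js": "h3"}
v2 = {"core.js": "h1", "ajax.js": "h2b", "new.js": "h4"}
v2_topics = {
    "core.js": [0.7, 0.2, 0.1],
    "ajax.js": [0.1, 0.8, 0.1],
    "new.js": [0.4, 0.5, 0.1],
}

groups = classify_files(v1, v2)
counts = topic_assignment(v2_topics)
```

Under these assumptions, the size of each group gives one number per version pair for locating significant changes, and tracking `counts` across versions gives one series per topic for following feature evolution.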