Clustering questions in community question answering systems

In a Community Question Answering (CQA) service, every user interaction is different and the questions asked are varied and complex, so identifying similar questions whose answers can be reused is difficult. This is mainly because of the lexical mismatch problem: questions with the same meaning often share few words. This paper develops a quadripartite graph-based clustering (QGC) approach that harnesses the relationships of a question with common answers and associated users. The QGC approach outperformed the baseline clustering techniques in identifying similar questions in CQA corpora. We believe that these findings can guide future developments in the reuse of similar questions in CQA services.