Evaluating Pre-Trained Models for User Feedback Analysis in Software Engineering: A Study on Classification of App-Reviews
Context: Mobile app reviews written by users on app stores or social media are a significant resource for app developers. Analyzing app reviews has proved useful in many areas of software engineering (e.g., requirements engineering, testing). Automatic classification of app reviews requires extensive effort to manually curate a labeled dataset. When the classification purpose changes (e.g., identifying bugs versus usability issues or sentiment), new datasets must be labeled, which limits the extensibility of the developed models to new desired classes/tasks in practice. Recent pre-trained neural language models (PTMs) are trained on large corpora in an unsupervised manner and have found success in solving similar Natural Language Processing problems. However, the applicability of PTMs has not been explored for app review classification.
Objective: We investigate the benefits of PTMs for app review classification compared to existing models, as well as the transferability of PTMs across multiple settings.
Method: We empirically study the accuracy and time efficiency of PTMs compared to prior approaches, using six datasets from the literature. In addition, we investigate the performance of PTMs trained on app reviews (i.e., domain-specific PTMs). We set up different studies to evaluate PTMs in multiple settings: binary vs. multi-class classification, zero-shot classification (when new labels are introduced to the model), multi-task learning, and classification of reviews from different sources. The datasets are manually labeled app review datasets from the Google Play Store, the Apple App Store, and Twitter. In all cases, we will report Micro and Macro Precision, Recall, and F1-scores, along with the time required for training and prediction with each model.
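The micro- and macro-averaged metrics named above can be computed, for example, with scikit-learn's `precision_recall_fscore_support`; this is a minimal sketch of the evaluation step, and the review classes shown are illustrative placeholders, not the actual labels used in the study's datasets:

```python
from sklearn.metrics import precision_recall_fscore_support

# Illustrative ground-truth and predicted labels for a multi-class
# app-review classification task (placeholder classes, not the
# study's actual label set).
y_true = ["bug", "feature", "bug", "rating", "feature", "bug"]
y_pred = ["bug", "bug", "bug", "rating", "feature", "feature"]

# Micro averaging pools all individual decisions before computing the
# metric; macro averaging computes the metric per class and then takes
# the unweighted mean, so rare classes count as much as frequent ones.
micro = precision_recall_fscore_support(y_true, y_pred, average="micro")
macro = precision_recall_fscore_support(y_true, y_pred, average="macro")

print("micro P/R/F1:", micro[:3])  # 4 of 6 reviews correct -> 0.667 each
print("macro P/R/F1:", macro[:3])  # mean of per-class scores -> 0.722 each
```

Reporting both averages matters for app-review data because class distributions are typically skewed: micro scores are dominated by the frequent classes, while macro scores reveal how a model handles the rare ones.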
Wed 19 May (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)

02:00 - 02:50

02:01 (4m Talk) Automatic Part-of-Speech Tagging for Security Vulnerability Descriptions. Technical Papers.
Sofonias Yitagesu (Tianjin University), Xiaowang Zhang (Tianjin University), Zhiyong Feng (Tianjin University), Xiaohong Li (Tianjin University), Zhenchang Xing (Australian National University). Pre-print.

02:05 (4m Talk) Attention-based model for predicting question relatedness on Stack Overflow. Technical Papers.
Jiayan Pei (South China University of Technology), Yimin Wu (South China University of Technology; Research Institute of SCUT in Yangjiang), Zishan Qin (South China University of Technology), Yao Cong (South China University of Technology), Jingtao Guan (Research Institute of SCUT in Yangjiang). Pre-print.

02:09 (4m Talk) Characterising the Knowledge about Primitive Variables in Java Code Comments. Technical Papers.
Mahfouth Alghamdi (The University of Adelaide), Shinpei Hayashi (Tokyo Institute of Technology), Takashi Kobayashi (Tokyo Institute of Technology), Christoph Treude (University of Adelaide). Pre-print.

02:13 (4m Talk) Googling for Software Development: What Developers Search For and What They Find. Technical Papers.
Andre Hora (UFMG). Pre-print. Media attached.

02:17 (3m Talk) Evaluating Pre-Trained Models for User Feedback Analysis in Software Engineering: A Study on Classification of App-Reviews. Registered Reports.
Mohammad Abdul Hadi (University of British Columbia), Fatemeh Hendijani Fard (University of British Columbia). Pre-print.

02:20 (3m Talk) Cross-status Communication and Project Outcomes in OSS Development: A Language Style Matching Perspective. Registered Reports.
Yisi Han (Nanjing University), Zhendong Wang (University of California, Irvine), Yang Feng (State Key Laboratory for Novel Software Technology, Nanjing University), Zhihong Zhao (Nanjing Tech University), Yi Wang (Beijing University of Posts and Telecommunications). Pre-print.

02:23 (27m Live Q&A) Discussions and Q&A. Technical Papers.