MSR 2021
Mon 17 - Wed 19 May 2021
co-located with ICSE 2021

Call for Papers

Data Showcase papers should describe data sets that are curated by their authors and made available for use by others. Ideally, these data sets should be of value to others in the community, should be preprocessed or filtered in some way, and should provide an easy-to-understand schema. Data showcase papers are expected to include:

  • a description of the data source,
  • a description of the methodology used to gather the data (including provenance and the tool used to create/generate/gather the data, if any),
  • a description of the storage mechanism, including a schema if applicable,
  • if the data has been used by the authors or others, a description of how this was done including references to previously published papers,
  • a description of the originality of the data set (that is, even if the data set has been used in a published paper, its complete description must be unpublished) and of similar existing data sets (if any),
  • ideas for future research questions that could be answered using the data set,
  • ideas for further improvements that could be made to the data set, and
  • any limitations and/or challenges in creating or using the data set.

The data set should be made available at the time of submission of the paper for review but will be considered confidential until publication of the paper. The data set should include detailed instructions about how to set up the data set environment, how to import the data, and how to access the data once it has been imported.

At the latest upon publication of the paper, the authors should archive the data on a persistent repository that can provide a digital object identifier (DOI), such as zenodo.org, figshare.com, Archive.org, or institutional repositories. In this way the data will become citable; the DOI-based citation of the data set should be included in the camera-ready version of the paper.

Data showcase papers are not:

  • empirical studies
  • tool demos
  • or data sets that are: 1) based on poorly explained or untrustworthy heuristics for data collection, or 2) the result of a trivial application of generic tools.

If custom tools have been used to create the data set, we expect the paper to be accompanied by the source code of the tools, along with clear documentation on how to run the tools to recreate the data set. The tools should be open source, accompanied by an appropriate license; the source code should be citable, i.e., refer to a specific release and have a DOI. GitHub provides an easy way to make source code citable. If you cannot provide the source code or the source code clause is not applicable (e.g., because the data set consists of qualitative data), please provide a short explanation of why this is not possible.


This program is tentative and subject to change.


Mon 17 May
Times are displayed in time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

02:00 - 02:50
03:10 - 04:00
10:00 - 10:50
Resources for MSR Research Technical Papers / Data Showcase at MSR Room 1
10:00
8m
Talk
PSIMiner: A Tool for Mining Rich Abstract Syntax Trees from Code
Technical Papers
Egor Spirin (JetBrains Research; National Research University Higher School of Economics), Egor Bogomolov (JetBrains Research), Vladimir Kovalenko (JetBrains Research), Timofey Bryksin (JetBrains Research, Saint Petersburg State University)
10:08
8m
Talk
Mining DEV for social and technical insights about software development
Technical Papers
Maria Papoutsoglou (Aristotle University of Thessaloniki), Johannes Wachs (Vienna University of Economics and Business & Complexity Science Hub Vienna), Georgia Kapitsaki (University of Cyprus)
Pre-print
10:16
8m
Talk
TNM: A Tool for Mining of Socio-Technical Data from Git Repositories
Technical Papers
Nikolai Sviridov (ITMO University), Mikhail Evtikhiev (JetBrains Research), Vladimir Kovalenko (JetBrains Research)
10:25
8m
Talk
Identifying Versions of Libraries used in Stack Overflow Code Snippets
Technical Papers
Ahmed Zerouali (Vrije Universiteit Brussel), Camilo Velázquez-Rodríguez (Vrije Universiteit Brussel), Coen De Roover (Vrije Universiteit Brussel)
Pre-print
10:33
8m
Talk
Sampling Projects in GitHub for MSR Studies
Data Showcase
Ozren Dabic (Software Institute, Università della Svizzera italiana (USI), Switzerland), Emad Aghajani (Software Institute, USI Università della Svizzera italiana), Gabriele Bavota (Software Institute, USI Università della Svizzera italiana)
Pre-print
10:41
8m
Talk
gambit – An Open Source Name Disambiguation Tool for Version Control Systems
Technical Papers
Christoph Gote (Chair of Systems Design, ETH Zurich), Christian Zingg (Chair of Systems Design, ETH Zurich)
Pre-print
10:00 - 10:50
10:00
8m
Talk
A Traceability Dataset for Open Source Systems
Data Showcase
Mouna Hammoudi (Johannes Kepler University Linz), Christoph Mayr-Dorn (Johannes Kepler University, Linz), Atif Mashkoor (Johannes Kepler University Linz), Alexander Egyed (Johannes Kepler University)
10:08
8m
Talk
How Java Programmers Test Exceptional Behavior
Technical Papers
Diego Marcilio (USI Università della Svizzera italiana), Carlo A. Furia (Università della Svizzera italiana (USI))
Pre-print
10:16
8m
Talk
An Exploratory Study of Log Placement Recommendation in an Enterprise System
Technical Papers
Jeanderson Cândido (Delft University of Technology), Jan Haesen (Adyen N.V.), Maurício Aniche (Delft University of Technology), Arie van Deursen (Delft University of Technology, Netherlands)
Pre-print
10:25
8m
Talk
Does Code Review Promote Conformance? A Study of OpenStack Patches
Technical Papers
Panyawut Sri-iesaranusorn (Nara Institute of Science and Technology), Raula Gaikovina Kula (NAIST), Takashi Ishio (Nara Institute of Science and Technology)
Pre-print
10:33
8m
Talk
A Replication Study on the Usability of Code Vocabulary in Predicting Flaky Tests
Technical Papers
Guillaume Haben (University of Luxembourg), Sarra Habchi (University of Lille), Mike Papadakis (University of Luxembourg, Luxembourg), Maxime Cordy (University of Luxembourg, Luxembourg), Yves Le Traon (University of Luxembourg, Luxembourg)
Pre-print
10:41
8m
Talk
On the Use of Mutation in Injecting Test Order-Dependency
Registered Reports
Sarra Habchi (University of Lille), Maxime Cordy (University of Luxembourg, Luxembourg), Mike Papadakis (University of Luxembourg, Luxembourg), Yves Le Traon (University of Luxembourg, Luxembourg)
11:10 - 12:00
17:00 - 17:50
Mining Challenge Session Technical Papers at MSR Room 1
17:50 - 18:10
Break / Discussion Rooms Technical Papers at MSR Room 1

Tue 18 May

02:00 - 02:50
03:10 - 04:00
03:10
10m
Talk
AndroCT: Ten Years of App Call Traces in Android
Data Showcase
Wen Li, Xiaoqin Fu (Washington State University), Haipeng Cai (Washington State University, USA)
Pre-print Media Attached
03:20
10m
Talk
Mining Workflows for Anomalous Data Transfers
Technical Papers
Huy Tu (North Carolina State University, USA), George Papadimitriou (University of Southern California), Mariam Kiran (ESnet, LBNL), Cong Wang (Renaissance Computing Institute), Anirban Mandal (Renaissance Computing Institute), Ewa Deelman (University of Southern California), Tim Menzies (North Carolina State University, USA)
Pre-print
03:30
10m
Talk
Escaping the Time Pit: Pitfalls and Guidelines for Using Time-Based Git Data
Technical Papers
Samuel W. Flint (University of Nebraska-Lincoln), Jigyasa Chauhan (University of Nebraska-Lincoln), Robert Dyer (University of Nebraska-Lincoln)
Pre-print
03:40
10m
Paper
On the Naturalness and Localness of Software Logs
Technical Papers
Sina Gholamian (University of Waterloo), Paul A. S. Ward (University of Waterloo)
Pre-print
03:50
10m
Talk
How Do Software Developers Use GitHub Actions to Automate Their Workflows?
Technical Papers
Timothy Kinsman (University of Adelaide), Mairieli Wessel (University of Sao Paulo), Marco Gerosa (Northern Arizona University, USA), Christoph Treude (University of Adelaide)
Pre-print
10:00 - 10:50
Developer communications Technical Papers / Data Showcase at MSR Room 1
10:00
10m
Talk
Waiting around or job half-done? Sentiment in self-admitted technical debt
Technical Papers
Gianmarco Fucci (University of Sannio), Nathan Cassee (Eindhoven University of Technology), Fiorella Zampetti (University of Sannio, Italy), Nicole Novielli (University of Bari), Alexander Serebrenik (Eindhoven University of Technology), Massimiliano Di Penta (University of Sannio, Italy)
Pre-print
10:10
10m
Research paper
Automatically Selecting Follow-up Questions for Deficient Bug Reports
Technical Papers
Mia Mohammad Imran (Virginia Commonwealth University), Agnieszka Ciborowska (Virginia Commonwealth University), Kostadin Damevski (Virginia Commonwealth University)
Pre-print
10:20
10m
Full-paper
An Empirical Study of Developer Discussions on Low Code Software Development Challenges
Technical Papers
Md Abdullah Al Alamin (University of Calgary), Sanjay Malakar (Bangladesh University of Engineering and Technology), Gias Uddin (University of Calgary, Canada), Sadia Afroz (Bangladesh University of Engineering and Technology), Tameem Bin Haider (Bangladesh University of Engineering and Technology), Anindya Iqbal (Bangladesh University of Engineering and Technology, Dhaka, Bangladesh)
Pre-print
10:30
10m
Talk
Challenges in Developing Desktop Web Apps: a Study of Stack Overflow and GitHub
Technical Papers
Gian Luca Scoccia (University of L'Aquila), Patrizio Migliarini (DISIM, University of L'Aquila), Marco Autili (University of L'Aquila, Italy)
Pre-print
10:40
10m
Talk
Search4Code: Code Search Intent Classification Using Weak Supervision
Data Showcase
Nikitha Rao (Microsoft Research), Chetan Bansal (Microsoft Research), Joe Guan (Microsoft)
Pre-print
10:00 - 10:50
10:00
8m
Talk
Fast and Memory-Efficient Neural Code Completion
Technical Papers
Alexey Svyatkovskiy (Microsoft), Sebastian Lee (University of Oxford), Anna Hadjitofi (Alan Turing Institute), Maik Riechert (Microsoft Research), Juliana Vicente Franco (Microsoft Research), Miltiadis Allamanis (Microsoft Research, UK)
Pre-print
10:08
8m
Research paper
Comparative Study of Feature Reduction Techniques in Software Change Prediction
Technical Papers
Ruchika Malhotra (Delhi Technological University), Ritvik Kapoor (Delhi Technological University), Deepti Aggarwal (Delhi Technological University), Priya Garg (Delhi Technological University)
Pre-print
10:16
8m
Talk
An Empirical Study on the Usage of BERT Models for Code Completion
Technical Papers
Matteo Ciniselli (Università della Svizzera italiana), Nathan Cooper (William & Mary), Luca Pascarella (Università della Svizzera italiana), Denys Poshyvanyk (College of William & Mary), Massimiliano Di Penta (University of Sannio, Italy), Gabriele Bavota (Software Institute, USI Università della Svizzera italiana)
Pre-print
10:25
8m
Talk
ManyTypes4Py: A benchmark Python dataset for machine learning-based type inference
Data Showcase
Amir Mir (Delft University of Technology), Evaldas Latoskinas (Delft University of Technology), Georgios Gousios (Facebook & Delft University of Technology)
10:33
8m
Talk
KGTorrent: A Dataset of Python Jupyter Notebooks from Kaggle
Data Showcase
Luigi Quaranta (University of Bari, Italy), Fabio Calefato (University of Bari), Filippo Lanubile (University of Bari)
10:41
8m
Talk
Exploring the relationship between performance metrics and cost saving potential of defect prediction models
Registered Reports
Steffen Herbold (University of Göttingen)
Pre-print
11:10 - 12:00
17:00 - 17:50
17:00
8m
Talk
An Exploratory Study of Project Activity Changepoints in Open Source Software Evolution
Hackathon
James Walden (Northern Kentucky University), NoahBurgin, Kuljit Kaur ChahalKaur
17:08
8m
Paper
The Diversity-Innovation Paradox in Open-Source Software
Hackathon
Mengchen Sam Yong (Carnegie Mellon University, Pittsburgh, Pennsylvania, United States), Lavinia Francesca Paganini (Federal University of Pernambuco), Huilian Sophie Qiu (Carnegie Mellon University, Pittsburgh, Pennsylvania, United States), José Bayoán Santiago Calderón (University of Virginia, USA)
DOI Pre-print
17:16
8m
Talk
The Secret Life of Hackathon Code
Technical Papers
Ahmed Samir Imam Mahmoud (University of Tartu), Tapajit Dey (Lero - The Irish Software Research Centre and University of Limerick), Alexander Nolte (University of Tartu), Audris Mockus (The University of Tennessee), Jim Herbsleb (Carnegie Mellon University)
Pre-print
17:25
8m
Talk
The Secret Life of Hackathon Code
Hackathon
Ahmed Samir Imam Mahmoud (University of Tartu), Tapajit Dey (Lero - The Irish Software Research Centre and University of Limerick)
Pre-print
17:33
8m
Talk
Tracing Vulnerable Code Lineage
Hackathon
David Reid (University of Tennessee), Kalvin Eng (University of Alberta), Chris Bogart (Carnegie Mellon University), Adam Tutko (University of Tennessee - Knoxville)
Pre-print
17:41
8m
Talk
Building the Collaboration Graph of Open-Source Software Ecosystem
Hackathon
Pre-print
17:00 - 17:50
17:00
8m
Talk
What Code Is Deliberately Excluded from Test Coverage and Why?
Technical Papers
Pre-print
17:08
8m
Talk
AndroR2: A Dataset of Manually-Reproduced Bug Reports for Android apps
Data Showcase
Tyler Wendland (University of Minnesota), Jingyang Sun (University of British Columbia), Junayed Mahmud (George Mason University), S M Hasan Mansur (George Mason University), Steven Huang (University of British Columbia), Kevin Moran (George Mason University), Julia Rubin (University of British Columbia, Canada), Mattia Fazzini (University of Minnesota)
17:16
8m
Talk
Apache Software Foundation Incubator Project Sustainability Dataset
Data Showcase
Likang Yin (University of California, Davis), Zhiyuan Zhang (University of California, Davis), Qi Xuan (Institute of Cyberspace Security, Zhejiang University of Technology, Hangzhou 310023, China), Vladimir Filkov (University of California at Davis, USA)
17:25
8m
Talk
Leveraging Models to Reduce Test Cases in Software Repositories
Technical Papers
Golnaz Gharachorlu (Simon Fraser University), Nick Sumner (Simon Fraser University)
Pre-print
17:33
8m
Talk
Which contributions count? Analysis of attribution in open source
Technical Papers
Jean-Gabriel Young (University of Vermont), Amanda Casari (Open Source Programs Office, Google), Katie McLaughlin (Open Source Programs Office, Google), Milo Trujillo (University of Vermont), Laurent Hébert-Dufresne (University of Vermont), James P. Bagrow (University of Vermont)
Pre-print
17:41
8m
Talk
On Improving Deep Learning Trace Analysis with System Call Arguments
Technical Papers
Quentin Fournier (Polytechnique Montréal), Daniel Aloise (Polytechnique Montréal), Seyed Vahid Azhari (Ciena), François Tetreault (Ciena)
Pre-print
17:50 - 18:10
Break / Discussion Rooms Technical Papers at MSR Room 1
18:10 - 19:00

Wed 19 May

02:00 - 02:50
02:00
10m
Talk
Practitioners' Perceptions of the Goals and Visual Explanations of Defect Prediction Models
Technical Papers
Jirayus Jiarpakdee (Monash University, Australia), Chakkrit Tantithamthavorn (Monash University), John Grundy (Monash University)
Pre-print
02:10
10m
Talk
On the Effectiveness of Deep Vulnerability Detectors to Simple Stupid Bug Detection
Mining Challenge
Jiayi Hua (Beijing University of Posts and Telecommunications), Haoyu Wang (Beijing University of Posts and Telecommunications)
Pre-print
02:20
10m
Talk
An Empirical Study of OSS-Fuzz Bugs
Technical Papers
Zhen Yu Ding (Motional), Claire Le Goues (Carnegie Mellon University)
Pre-print
02:30
10m
Talk
Denchmark: A Bug Benchmark of Deep Learning-related Software
Data Showcase
Misoo Kim (Sungkyunkwan University), Youngkyoung Kim (Sungkyunkwan University), Eunseok Lee (Sungkyunkwan University)
02:40
10m
Talk
JITLine: A Simpler, Better, Faster, Finer-grained Just-In-Time Defect Prediction
Technical Papers
Chanathip Pornprasit (Monash University), Chakkrit Tantithamthavorn (Monash University)
Pre-print
02:00 - 02:50
02:00
8m
Talk
Automatic Part-of-Speech Tagging for Security Vulnerability Descriptions
Technical Papers
Sofonias Yitagesu (Tianjin University), Xiaowang Zhang (Tianjin University), Zhiyong Feng (Tianjin University), Li Xiaohong (Tianjin University), Zhenchang Xing (Australian National University)
Pre-print
02:08
8m
Talk
Attention-based model for predicting question relatedness on Stack Overflow
Technical Papers
Jiayan Pei (South China University of Technology), Yimin Wu (South China University of Technology, Research Institute of SCUT in Yangjiang), Zishan Qin (South China University of Technology), Yao Cong (South China University of Technology), Jingtao Guan (Research Institute of SCUT in Yangjiang)
Pre-print
02:16
8m
Talk
Characterising the Knowledge about Primitive Variables in Java Code Comments
Technical Papers
Mahfouth Alghamdi (The University of Adelaide), Shinpei Hayashi (Tokyo Institute of Technology), Takashi Kobayashi (Tokyo Institute of Technology), Christoph Treude (University of Adelaide)
Pre-print
02:25
8m
Talk
Googling for Software Development: What Developers Search For and What They Find
Technical Papers
Pre-print
02:33
8m
Talk
Evaluating Pre-Trained Models for User Feedback Analysis in Software Engineering: A Study on Classification of App-Reviews
Registered Reports
Mohammad Abdul Hadi (University of British Columbia), Fatemeh Hendijani Fard (University of British Columbia)
02:41
8m
Talk
Cross-status Communication and Project Outcomes in OSS Development–A Language Style Matching Perspective
Registered Reports
Yisi Han (Nanjing University), Zhendong Wang (University of California, Irvine), Yang Feng (State Key Laboratory for Novel Software Technology, Nanjing University), Zhihong Zhao (Nanjing Tech University), Yi Wang (Beijing University of Posts and Telecommunications)
02:50 - 03:10
Break / Discussion Rooms Technical Papers at MSR Room 1
03:10 - 04:00
10:00 - 10:50
10:00
8m
Talk
AndroidCompass: A Dataset of Android Compatibility Checks in Code Repositories
Data Showcase
Sebastian Nielebock (Otto-von-Guericke University Magdeburg, Germany), Paul Blockhaus (Otto-von-Guericke-University Magdeburg, Germany), Jacob Krüger (Otto von Guericke University Magdeburg), Frank Ortmeier (Otto-von-Guericke-University Magdeburg, Faculty of Computer Science, Chair of Software Engineering)
Pre-print Media Attached
10:08
8m
Talk
GE526: A Dataset of Open Source Game Engines
Data Showcase
Dheeraj Vagavolu (Indian Institute of Technology Tirupati), Vartika Agrahari (Indian Institute of Technology Tirupati), Sridhar Chimalakonda (Indian Institute of Technology Tirupati), Akhila Sri Manasa Venigalla (IIT Tirupati, India)
10:16
8m
Talk
Andromeda: A Dataset of Ansible Galaxy Roles and Their Evolution
Data Showcase
Ruben Opdebeeck (Vrije Universiteit Brussel), Ahmed Zerouali (Vrije Universiteit Brussel), Coen De Roover (Vrije Universiteit Brussel)
10:25
8m
Talk
The Wonderless Dataset for Serverless Computing
Data Showcase
Nafise Eskandani (TU Darmstadt), Guido Salvaneschi (University of St. Gallen)
10:33
8m
Talk
DUETS: A Dataset of Reproducible Pairs of Java Library-Clients
Data Showcase
Thomas Durieux (KTH Royal Institute of Technology, Sweden), César Soto-Valero (KTH Royal Institute of Technology), Benoit Baudry (KTH Royal Institute of Technology)
Pre-print
10:41
8m
Talk
EQBENCH: A Dataset of Equivalent and Non-equivalent Program Pairs
Data Showcase
Sahar Badihi (University of British Columbia, Canada), Yi Li (Nanyang Technological University, Singapore), Julia Rubin (University of British Columbia, Canada)
10:00 - 10:50
10:00
10m
Talk
Identifying Critical Projects via PageRank and Truck Factor
Technical Papers
Rolf-Helge Pfeiffer (IT University of Copenhagen)
10:10
10m
Talk
Revisiting Dockerfiles in Open Source Software Over Time
Technical Papers
Kalvin Eng (University of Alberta), Abram Hindle (University of Alberta)
Pre-print
10:20
10m
Talk
Can I Solve it? Identifying the APIs required to complete OSS tasks
Technical Papers
Fabio Marcos De Abreu Santos (Northern Arizona University, USA), Igor Scaliante Wiese (Federal University of Technology – Paraná - UTFPR), Bianca Trinkenreich (Northern Arizona University), Igor Steinmacher (Universidade Tecnológica Federal do Paraná), Anita Sarma (Oregon State University), Marco Gerosa (Northern Arizona University, USA)
Pre-print
10:30
10m
Talk
On the Use of Dependabot Security Pull Requests
Technical Papers
Mahmoud Alfadel (Concordia University), Diego Costa (Concordia University, Canada), Emad Shihab (Concordia University), Mouafak Mkhallalati (Concordia University)
Pre-print
10:40
10m
Talk
Does the First-Response Matter for Future Contributions? A Study of First Contributions
Registered Reports
Noppadol Assavakamhaenghan (Nara Institute of Science and Technology), Supatsara Wattanakriengkrai (Nara Institute of Science and Technology), Naomichi Shimada (Nara Institute of Science and Technology), Raula Gaikovina Kula (NAIST), Takashi Ishio (Nara Institute of Science and Technology), Kenichi Matsumoto (Nara Institute of Science and Technology)
10:50 - 11:10
Break / Discussion Rooms Technical Papers at MSR Room 1
11:10 - 12:00
Mini-Keynotes Technical Papers at MSR Room 1
17:00 - 17:50
Energy and Logging Technical Papers at MSR Room 1
17:00
12m
Talk
S3M: Siamese Stack (Trace) Similarity Measure
Technical Papers
Aleksandr Khvorov (JetBrains, ITMO University), Roman Vasiliev (JetBrains), George Chernishev (Saint-Petersburg State University), Irving Muller Rodrigues (Polytechnique Montreal, Montreal, Canada), Dmitrij Koznov (Saint-Petersburg State University), Nikita Povarov (JetBrains)
Pre-print
17:12
12m
Talk
Mining the ROS ecosystem for Green Architectural Tactics in Robotics and an Empirical Evaluation
Technical Papers
Ivano Malavolta (Vrije Universiteit Amsterdam), Katerina Chinnappan (Vrije Universiteit Amsterdam), Stan Swanborn (Vrije Universiteit Amsterdam, The Netherlands), Grace Lewis (Carnegie Mellon Software Engineering Institute), Patricia Lago (Vrije Universiteit Amsterdam)
17:25
12m
Talk
Mining Energy-Related Practices in Robotics Software
Technical Papers
Michel Albonico (UTFPR), Ivano Malavolta, Gustavo Pinto (Federal University of Pará), Emitzá Guzmán, Katerina Chinnappan (Vrije Universiteit Amsterdam), Patricia Lago (Vrije Universiteit Amsterdam)
Pre-print
17:37
12m
Talk
Mining API Interactions to Analyze Software Revisions for the Evolution of Energy Consumption
Technical Papers
Andreas Schuler (University of Applied Sciences Upper Austria), Gabriele Anderst-Kotsis (Johannes Kepler University, Linz, Austria)
Pre-print
17:00 - 17:50
Change Management and Analysis Technical Papers / Registered Reports at MSR Room 2
17:00
10m
Talk
Studying the Change Histories of Stack Overflow and GitHub Snippets
Technical Papers
Saraj Singh Manes (Carleton University), Olga Baysal (Carleton University)
Pre-print
17:10
10m
Talk
Learning Off-By-One Mistakes: An Empirical Study
Technical Papers
Hendrig Sellik (Delft University of Technology), Onno van Paridon (Adyen N.V.), Georgios Gousios (Facebook & Delft University of Technology), Maurício Aniche (Delft University of Technology)
Pre-print
17:20
10m
Talk
Predicting Design Impactful Changes in Modern Code Review: A Large-Scale Empirical Study
Technical Papers
Anderson Uchôa (Pontifical Catholic University of Rio de Janeiro (PUC-Rio)), Caio Barbosa (Pontifical Catholic University of Rio de Janeiro (PUC-Rio)), Daniel Coutinho (Pontifical Catholic University of Rio de Janeiro (PUC-Rio)), Willian Oizumi (Pontifical Catholic University of Rio de Janeiro (PUC-Rio)), Wesley Assunção (Pontifical Catholic University of Rio de Janeiro (PUC-Rio)), Silvia Regina Vergilio (Federal University of Paraná), Juliana Alves Pereira (PUC-Rio), Anderson Oliveira (PUC-Rio), Alessandro Garcia (PUC-Rio)
Pre-print
17:30
10m
Talk
Rollback Edit Inconsistencies in Developer Forum
Technical Papers
Saikat Mondal (University of Saskatchewan), Gias Uddin (University of Calgary, Canada), Chanchal K. Roy (University of Saskatchewan)
Pre-print
17:40
10m
Talk
Assessing the Exposure of Software Changes: The DiPiDi Approach
Registered Reports
Mehran Meidani (University of Waterloo), Maxime Lamothe (University of Waterloo), Shane McIntosh
Pre-print
17:50 - 18:10
Break / Discussion Rooms Technical Papers at MSR Room 1
18:10 - 19:00
18:10
7m
Talk
A large-scale study on human-cloned changes for automated program repair
Mining Challenge
Fernanda Madeiral (KTH Royal Institute of Technology), Thomas Durieux (KTH Royal Institute of Technology, Sweden)
Pre-print
18:17
7m
Talk
Applying CodeBERT for Automated Program Repair of Java Simple Bugs
Mining Challenge
Ehsan Mashhadi (University of Calgary), Hadi Hemmati (University of Calgary)
Pre-print
18:24
7m
Talk
PySStuBs: Characterizing Single-Statement Bugs in Popular Open-Source Python Projects
Mining Challenge
Arthur Veloso Kamienski (University of Alberta), Luisa Palechor Anacona (University of Alberta), Abram Hindle (University of Alberta), Cor-Paul Bezemer (University of Alberta)
18:31
7m
Talk
How Effective is Continuous Integration in Indicating Single-Statement Bugs?
Mining Challenge
Jasmine Latendresse (Concordia University), Rabe Abdalkareem (Queen's University, Kingston, Canada), Diego Costa (Concordia University, Canada), Emad Shihab (Concordia University)
Pre-print
18:38
7m
Talk
Mea culpa: How developers fix their own simple bugs differently from other developers
Mining Challenge
Wenhan Zhu (University of Waterloo), Michael W. Godfrey (University of Waterloo, Canada)
Pre-print
18:45
7m
Talk
On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Exploratory Study
Mining Challenge
Anthony Peruma (Rochester Institute of Technology), Christian D. Newman (Rochester Institute of Technology)
Pre-print Media Attached
18:52
7m
Talk
On the Rise and Fall of Simple Stupid Bugs: a Life-Cycle Analysis of SStuBs
Mining Challenge
Balázs Mosolygó (University of Szeged), Norbert Vándor (University of Szeged), Gabor Antal (University of Szeged), Peter Hegedus (University of Szeged)
Pre-print

Submission

Submit your data paper (maximum 4 pages, plus 1 additional page of references) via the HotCRP submission site on or before January 29th, 2021.

Submitted papers will undergo single-blind peer review. We opt for single-blind peer review (as opposed to the double-blind peer review of the main track) because of the requirement above to describe how the data has been used in previous studies, including bibliographic references to those studies. Such references are likely to disclose the authors' identity.

To make research data sets and research software accessible and citable, we further encourage authors to follow the FAIR principles, i.e., data should be Findable, Accessible, Interoperable, and Reusable.

Submissions must conform to the IEEE Conference Proceedings Formatting Guidelines (title in 24pt font and full text in 10pt type; LaTeX users must use \documentclass[10pt,conference]{IEEEtran} without including the compsoc or compsocconf options).
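
As a rough illustration of the formatting requirement above, a LaTeX submission might start from a skeleton like the one below. This is only a sketch: the title, author block, section names, and the Zenodo DOI are placeholders invented for the example, not values prescribed by this call. The section names loosely follow the expected contents listed in the Call for Papers, and the inline bibliography shows one way to cite the archived data set by its DOI.

```latex
% Minimal sketch of a Data Showcase submission skeleton.
% All names, section titles, and the DOI are hypothetical placeholders.
\documentclass[10pt,conference]{IEEEtran} % do not add compsoc or compsocconf

\begin{document}

\title{ExampleSet: A Hypothetical Data Set Title}

% Review is single-blind, so the author block is included.
\author{\IEEEauthorblockN{First Author}
\IEEEauthorblockA{Example University\\
first.author@example.org}}

\maketitle

\begin{abstract}
One-paragraph summary of the data set and its intended use.
\end{abstract}

\section{Data Source and Collection Methodology}
Describe the data source, provenance, and tooling here.

\section{Storage and Schema}
Describe the storage mechanism and schema here.

\section{Limitations and Future Improvements}
The data set is archived in a persistent repository~\cite{exampleset}.

% Inline bibliography keeps the sketch self-contained;
% a separate .bib file works equally well.
\begin{thebibliography}{1}
\bibitem{exampleset}
F.~Author, ``ExampleSet (version 1.0),'' Zenodo, 2021.
DOI: 10.5281/zenodo.XXXXXXX % placeholder DOI
\end{thebibliography}

\end{document}
```

The body must still fit the limit of 4 pages plus 1 additional page of references stated under Submission.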

Papers submitted for consideration should not have been published elsewhere and should not be under review or submitted for review elsewhere for the duration of consideration. ACM plagiarism policies and procedures shall be followed for cases of double submission. The submission must also comply with the IEEE Policy on Authorship. Please read the ACM Policy and Procedures on Plagiarism and the IEEE Plagiarism FAQ before submitting.

Upon notification of acceptance, all authors of accepted papers will be asked to complete a copyright form and will receive further instructions for preparing their camera-ready versions. At least one author of each paper is expected to register and present the results at the MSR 2021 conference. All accepted contributions will be published in the conference electronic proceedings.

Accepted Papers

Title
A Traceability Dataset for Open Source Systems
Data Showcase
AndroCT: Ten Years of App Call Traces in Android
Data Showcase
Pre-print Media Attached
AndroR2: A Dataset of Manually-Reproduced Bug Reports for Android apps
Data Showcase
AndroidCompass: A Dataset of Android Compatibility Checks in Code Repositories
Data Showcase
Pre-print Media Attached
Andromeda: A Dataset of Ansible Galaxy Roles and Their Evolution
Data Showcase
Apache Software Foundation Incubator Project Sustainability Dataset
Data Showcase
DUETS: A Dataset of Reproducible Pairs of Java Library-Clients
Data Showcase
Pre-print
Denchmark: A Bug Benchmark of Deep Learning-related Software
Data Showcase
EQBENCH: A Dataset of Equivalent and Non-equivalent Program Pairs
Data Showcase
GE526: A Dataset of Open Source Game Engines
Data Showcase
KGTorrent: A Dataset of Python Jupyter Notebooks from Kaggle
Data Showcase
ManyTypes4Py: A benchmark Python dataset for machine learning-based type inference
Data Showcase
QScored: A Large Dataset of Code Smells and Quality Metrics
Data Showcase
Pre-print
Sampling Projects in GitHub for MSR Studies
Data Showcase
Pre-print
Search4Code: Code Search Intent Classification Using Weak Supervision
Data Showcase
Pre-print
The Wonderless Dataset for Serverless Computing
Data Showcase