DUETS: A Dataset of Reproducible Pairs of Java Library-Clients
Software engineering researchers look for software artifacts to study their characteristics or to evaluate new techniques. In this paper, we introduce DUETS, a new dataset of software libraries and their clients. This dataset can be exploited to gain many different insights, such as API usage, usage inputs, or novel observations about the test suites of clients and libraries. DUETS is meant to support both static and dynamic analysis. This means that the libraries and the clients compile correctly, they are executable, and their test suites pass. The dataset is composed of open-source projects that have more than five stars on GitHub. The final dataset contains 395 libraries and 2,874 clients. Additionally, we provide the raw data that we use to create this dataset, such as 34,560 pom.xml files or the complete file list from 34,560 projects. This dataset can be used to study how libraries are used by their clients or as a list of software projects that successfully build. The client’s test suite can be used as an additional verification step for code transformation techniques that modify the libraries.
Wed 19 May (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
10:00 - 10:50 | Datasets | Data Showcase / Technical Papers at MSR Room 1 | Chair(s): Sridhar Chimalakonda (Indian Institute of Technology Tirupati)
10:01 | 3m | Talk | AndroidCompass: A Dataset of Android Compatibility Checks in Code Repositories | Data Showcase | Sebastian Nielebock (Otto-von-Guericke University Magdeburg, Germany); Paul Blockhaus (Otto-von-Guericke-University Magdeburg, Germany); Jacob Krüger (Otto von Guericke University Magdeburg); Frank Ortmeier (Otto-von-Guericke-University Magdeburg, Faculty of Computer Science, Chair of Software Engineering)
10:04 | 3m | Talk | GE526: A Dataset of Open Source Game Engines | Data Showcase | Dheeraj Vagavolu (Indian Institute of Technology Tirupati); Vartika Agrahari (Indian Institute of Technology Tirupati); Sridhar Chimalakonda (Indian Institute of Technology Tirupati); Akhila Sri Manasa Venigalla (IIT Tirupati, India)
10:07 | 3m | Talk | Andromeda: A Dataset of Ansible Galaxy Roles and Their Evolution | Data Showcase | Ruben Opdebeeck (Vrije Universiteit Brussel); Ahmed Zerouali (Vrije Universiteit Brussel); Coen De Roover (Vrije Universiteit Brussel)
10:10 | 3m | Talk | The Wonderless Dataset for Serverless Computing | Data Showcase
10:13 | 3m | Talk | DUETS: A Dataset of Reproducible Pairs of Java Library-Clients | Data Showcase | Thomas Durieux (KTH Royal Institute of Technology, Sweden); César Soto-Valero (KTH Royal Institute of Technology); Benoit Baudry (KTH Royal Institute of Technology)
10:16 | 3m | Talk | EQBENCH: A Dataset of Equivalent and Non-equivalent Program Pairs | Data Showcase | Sahar Badihi (University of British Columbia, Canada); Yi Li (Nanyang Technological University); Julia Rubin (University of British Columbia, Canada)
10:19 | 31m | Live Q&A | Discussions and Q&A | Technical Papers