Andromeda: A Dataset of Ansible Galaxy Roles and Their Evolution
Cloud-native applications increasingly provision infrastructure resources programmatically through Infrastructure as Code (IaC) scripts. These scripts have in turn become the subject of empirical software engineering research. However, the software ecosystems that have grown around IaC languages are often overlooked. For example, Galaxy is an ecosystem for the popular Ansible IaC language. Galaxy features a large number of so-called “roles”, which are reusable collections of Ansible code akin to libraries for general-purpose languages. Despite these similarities, such IaC ecosystems have received far less attention in the literature than library ecosystems for general-purpose languages.
In this data showcase paper, we present Andromeda, the first dataset capturing the Ansible Galaxy ecosystem, its roles, and their evolution. Andromeda provides structural representations of more than 125,000 role versions and upwards of 800,000 concrete changes between such versions, extracted from the underlying Git repositories. Andromeda aims to provide an extensive view of the contributor side of the Galaxy ecosystem, which we hope will stimulate additional research on IaC ecosystems.
Wed 19 May (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
10:00 - 10:50 | Datasets | Data Showcase / Technical Papers at MSR Room 1 | Chair(s): Sridhar Chimalakonda (Indian Institute of Technology Tirupati)
10:01 | 3m | Talk | AndroidCompass: A Dataset of Android Compatibility Checks in Code Repositories | Data Showcase | Sebastian Nielebock (Otto-von-Guericke University Magdeburg, Germany); Paul Blockhaus (Otto-von-Guericke-University Magdeburg, Germany); Jacob Krüger (Otto von Guericke University Magdeburg); Frank Ortmeier (Otto-von-Guericke-University Magdeburg, Faculty of Computer Science, Chair of Software Engineering)
10:04 | 3m | Talk | GE526: A Dataset of Open Source Game Engines | Data Showcase | Dheeraj Vagavolu (Indian Institute of Technology Tirupati); Vartika Agrahari (Indian Institute of Technology Tirupati); Sridhar Chimalakonda (Indian Institute of Technology Tirupati); Akhila Sri Manasa Venigalla (IIT Tirupati, India)
10:07 | 3m | Talk | Andromeda: A Dataset of Ansible Galaxy Roles and Their Evolution | Data Showcase | Ruben Opdebeeck (Vrije Universiteit Brussel); Ahmed Zerouali (Vrije Universiteit Brussel); Coen De Roover (Vrije Universiteit Brussel)
10:10 | 3m | Talk | The Wonderless Dataset for Serverless Computing | Data Showcase
10:13 | 3m | Talk | DUETS: A Dataset of Reproducible Pairs of Java Library-Clients | Data Showcase | Thomas Durieux (KTH Royal Institute of Technology, Sweden); César Soto-Valero (KTH Royal Institute of Technology); Benoit Baudry (KTH Royal Institute of Technology)
10:16 | 3m | Talk | EQBENCH: A Dataset of Equivalent and Non-equivalent Program Pairs | Data Showcase | Sahar Badihi (University of British Columbia, Canada); Yi Li (Nanyang Technological University); Julia Rubin (University of British Columbia, Canada)
10:19 | 31m | Live Q&A | Discussions and Q&A | Technical Papers