AndroCT: Ten Years of App Call Traces in Android
Data-driven approaches have proven promising in mobile software analysis, yet they rely on sizable, high-quality datasets. For Android app analysis in particular, several well-known datasets are widely used by the community. However, datasets that represent the run-time behaviors of apps are still lacking: existing datasets are largely static, whereas run-time datasets are essential for data-driven dynamic and hybrid analysis of apps. In this paper, we present AndroCT, a large-scale dataset of run-time function call traces from 35,974 benign and malicious Android apps spanning ten years (2010 through 2019). These call traces were produced by running each sample app against automatically generated test inputs for ten minutes. Moreover, each app was exercised both on an emulator and on a real device, and the two sets of traces were curated separately. AndroCT has been used to build a novel dynamic profile of Android apps that has enabled several effective techniques and informative empirical studies concerning Android app security. We describe what this dataset includes, how it was created and stored, and how it has been used in the past and could be used in the future.
Tue 18 May (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
03:10 - 04:00 | Time series data | Data Showcase / Technical Papers at MSR Room 2 | Chair(s): Shane McIntosh (University of Waterloo)
03:11 | 3m | Talk | AndroCT: Ten Years of App Call Traces in Android | Data Showcase
03:14 | 4m | Talk | Mining Workflows for Anomalous Data Transfers | Technical Papers | Huy Tu (North Carolina State University, USA), George Papadimitriou (University of Southern California), Mariam Kiran (ESnet, LBNL), Cong Wang (Renaissance Computing Institute), Anirban Mandal (Renaissance Computing Institute), Ewa Deelman (University of Southern California), Tim Menzies (North Carolina State University, USA)
03:18 | 4m | Talk | Escaping the Time Pit: Pitfalls and Guidelines for Using Time-Based Git Data | Technical Papers | Samuel W. Flint (University of Nebraska-Lincoln), Jigyasa Chauhan (University of Nebraska-Lincoln), Robert Dyer (University of Nebraska-Lincoln)
03:22 | 4m | Paper | On the Naturalness and Localness of Software Logs | Technical Papers
03:26 | 4m | Talk | How Do Software Developers Use GitHub Actions to Automate Their Workflows? | Technical Papers | Timothy Kinsman (University of Adelaide), Mairieli Wessel (University of Sao Paulo), Marco Gerosa (Northern Arizona University, USA), Christoph Treude (University of Adelaide)
03:30 | 30m | Live Q&A | Discussions and Q&A | Technical Papers