Identifying multiple authors in a binary program

Xiaozhu Meng, Barton P. Miller, Kwang Sung Jun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

31 Scopus citations

Abstract

Knowing the authors of a binary program has significant application to forensics of malicious software (malware), software supply chain risk management, and software plagiarism detection. Existing techniques assume that a binary is written by a single author, which does not hold true in the real world because most modern software, including malware, often contains code from multiple authors. In this paper, we make the first step toward identifying multiple authors in a binary. We present new fine-grained techniques to address the tougher problem of determining the author of each basic block. The decision to attribute authors at the basic block level is based on an empirical study of three large open source software projects, in which we find that a large fraction of basic blocks can be well attributed to a single author. We present new code features that capture programming style at the basic block level, our approach for identifying external template library code, and a new approach to capture correlations between the authors of basic blocks in a binary. Our experiments show strong evidence that programming styles can be recovered at the basic block level and that it is practical to identify multiple authors in a binary.

Original language: English (US)
Title of host publication: Computer Security – ESORICS 2017 - 22nd European Symposium on Research in Computer Security, Proceedings
Editors: Simon N. Foley, Dieter Gollmann, Einar Snekkenes
Publisher: Springer-Verlag
Pages: 286-304
Number of pages: 19
ISBN (Print): 9783319663982
DOIs
State: Published - 2017
Externally published: Yes
Event: 22nd European Symposium on Research in Computer Security, ESORICS 2017 - Oslo, Norway
Duration: Sep 11 2017 to Sep 15 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10493 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 22nd European Symposium on Research in Computer Security, ESORICS 2017
Country/Territory: Norway
City: Oslo
Period: 9/11/17 to 9/15/17

Keywords

  • Binary code authorship
  • Code features
  • Software forensics

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
