Polyglot: Automatic extraction of protocol message format using dynamic binary analysis

Juan Caballero, Heng Yin, Zhenkai Liang, Dawn Song

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

242 Scopus citations

Abstract

Protocol reverse engineering, the process of extracting the application-level protocol used by an implementation without access to the protocol specification, is important for many network security applications. Recent work [17] has proposed protocol reverse engineering by using clustering on network traces. That kind of approach is limited by the lack of semantic information in network traces. In this paper we propose a new approach using program binaries. Our approach, shadowing, uses dynamic analysis and is based on a unique intuition: the way that an implementation of the protocol processes the received application data reveals a wealth of information about the protocol message format. We have implemented our approach in a system called Polyglot and evaluated it extensively using real-world implementations of five different protocols: DNS, HTTP, IRC, Samba and ICQ. We compare our results with the manually crafted message format included in Wireshark, one of the state-of-the-art protocol analyzers. The differences we find are small and usually due to different implementations handling fields in different ways. Finding such differences between implementations is an added benefit, as they are important for problems such as fingerprint generation, fuzzing, and error detection.

Original language: English (US)
Title of host publication: CCS'07 - Proceedings of the 14th ACM Conference on Computer and Communications Security
Pages: 317-329
Number of pages: 13
State: Published - Dec 1 2007
Event: 14th ACM Conference on Computer and Communications Security, CCS'07 - Alexandria, VA, United States
Duration: Oct 29 2007 to Nov 2 2007

Publication series

Name: Proceedings of the ACM Conference on Computer and Communications Security
ISSN (Print): 1543-7221

Other

Other: 14th ACM Conference on Computer and Communications Security, CCS'07
Country: United States
City: Alexandria, VA
Period: 10/29/07 to 11/2/07

Keywords

  • Binary analysis
  • Protocol reverse engineering

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications
