TY - GEN
T1 - Polyglot
T2 - 14th ACM Conference on Computer and Communications Security, CCS'07
AU - Caballero, Juan
AU - Yin, Heng
AU - Liang, Zhenkai
AU - Song, Dawn
N1 - Copyright 2010 Elsevier B.V., All rights reserved.
PY - 2007
Y1 - 2007
N2 - Protocol reverse engineering, the process of extracting the application-level protocol used by an implementation, without access to the protocol specification, is important for many network security applications. Recent work [17] has proposed protocol reverse engineering by using clustering on network traces. That kind of approach is limited by the lack of semantic information on network traces. In this paper we propose a new approach using program binaries. Our approach, shadowing, uses dynamic analysis and is based on a unique intuition: the way that an implementation of the protocol processes the received application data reveals a wealth of information about the protocol message format. We have implemented our approach in a system called Polyglot and evaluated it extensively using real-world implementations of five different protocols: DNS, HTTP, IRC, Samba and ICQ. We compare our results with the manually crafted message format, included in Wireshark, one of the state-of-the-art protocol analyzers. The differences we find are small and usually due to different implementations handling fields in different ways. Finding such differences between implementations is an added benefit, as they are important for problems such as fingerprint generation, fuzzing, and error detection.
AB - Protocol reverse engineering, the process of extracting the application-level protocol used by an implementation, without access to the protocol specification, is important for many network security applications. Recent work [17] has proposed protocol reverse engineering by using clustering on network traces. That kind of approach is limited by the lack of semantic information on network traces. In this paper we propose a new approach using program binaries. Our approach, shadowing, uses dynamic analysis and is based on a unique intuition: the way that an implementation of the protocol processes the received application data reveals a wealth of information about the protocol message format. We have implemented our approach in a system called Polyglot and evaluated it extensively using real-world implementations of five different protocols: DNS, HTTP, IRC, Samba and ICQ. We compare our results with the manually crafted message format, included in Wireshark, one of the state-of-the-art protocol analyzers. The differences we find are small and usually due to different implementations handling fields in different ways. Finding such differences between implementations is an added benefit, as they are important for problems such as fingerprint generation, fuzzing, and error detection.
KW - Binary analysis
KW - Protocol reverse engineering
UR - http://www.scopus.com/inward/record.url?scp=77952403312&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77952403312&partnerID=8YFLogxK
U2 - 10.1145/1315245.1315286
DO - 10.1145/1315245.1315286
M3 - Conference contribution
AN - SCOPUS:77952403312
SN - 9781595937032
T3 - Proceedings of the ACM Conference on Computer and Communications Security
SP - 317
EP - 329
BT - CCS'07 - Proceedings of the 14th ACM Conference on Computer and Communications Security
Y2 - 29 October 2007 through 2 November 2007
ER -