The Malware Cluster Lab
A Forensic Behavioral Analysis of Live Internet Malware Infections


last updated: 15 Jul 2008
Welcome to the SRI Malware Cluster Laboratory:

The SRI Malware Clustering Lab is exploring the use of behavioral attribute clustering as a method to automatically categorize common malware patterns under one forensic model description, and to help us rapidly identify new malware behavioral patterns. We explore malware clustering based on a multi-perspective collection of infection attributes (network communications, sensor alarms, binary attributes, host forensic changes) captured from our live Internet honeynet.   
Notice: The data on this website is for research purposes only. It is provided for your personal use only and is supplied AS IS, without warranty of any kind. Use or reliance on this data is at your own risk.
May through June 2008: 
6941 Malware Infections Analyzed
32 Behavioral Profiles


Behavioral Clusters (with more than 30 members)
Cluster A
A: 857 samples
Win2K-f (81%)
AV:  Unknown

Cluster B
B: 812 samples
Win XP/2K (61%/39%)
AV:  Unknown

Cluster C
C: 656 samples
WinXP (100%)
AV: Korgo

Cluster D
D: 519 samples
Win 2K/XP (70%/30%)
AV: Wootbot

Cluster E
E: 387 samples
Win XP/2K (54%/46%)
AV: Wootbot

Cluster F
F: 331 samples
WinXP (98%)
AV: Virut

Cluster G
G: 316 samples
WinXP (100%)
AV: Sasser

Cluster H
H: 333 samples
Win 2K/XP (54%/46%)

Cluster I
I: 310 samples
Win 2K/XP (60%/40%)
AV: Unknown

Cluster J
J: 314 samples
Win 2K/XP (59%/41%)
AV: Unknown (65%) (28%)

Cluster K
K: 311 samples
Win 2K/XP (64%/36%)
AV:  Unknown (75%) (63%)

Cluster L
L: 291 samples
AV: Unknown

Cluster M
M: 220 samples
Win2K/XP (65%/35%)
AV: Virut/Nachi (50%) (33%)

Cluster N
N: 172 samples
Win 2K/XP (57%/43%)
AV: Unknown (76%)

Cluster O
O: 159 samples
WinXP (100%)
AV: Padobot

Cluster P
P: 117 samples
Win 2K/XP (53%/47%)
AV: Unknown
US: (100%)   

Cluster Q
Q: 112 samples
WinXP (98%)
AV: Unknown

Cluster R
R: 89 samples
Win 2K/XP (71%/29%)
AV: Unknown

Cluster S
S: 84 samples
WinXP (100%)
AV: Unknown

Cluster T
T: 78 samples
Win 2K/XP (59%/41%)
AV: SDBot (76%)

Cluster U
U: 41 samples
WinXP (100%)

Cluster V
V: 38 samples
Win2K-f (100%)
AV: Unknown (32%) (26%) (26%) (26%)

Cluster W
W: 33 samples
Win2K-f (88%)
AV: Virut (52%) (48%) (43%) (39%) (35%)

Behavioral Clustering
Similarity Matrix

The two figures plot the pairwise similarity between the synopses. Each image can be read as a matrix: pixel (i, j) represents the similarity measure between the i-th and the j-th synopsis. A red pixel means the similarity is close to 1 (i.e., the two synopses are nearly identical). A blue pixel means the similarity is small.
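
As a rough illustration of how such a matrix might be built, the sketch below computes a pairwise similarity matrix over a few synopses. The page does not state which similarity measure the lab uses, so Jaccard similarity over attribute sets is an assumption here, and the attribute names and the `jaccard` and `similarity_matrix` helpers are hypothetical.

```python
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: 1.0 for identical sets, 0.0 for disjoint sets.

    Assumption: the lab's actual similarity measure is not given on the page.
    """
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity_matrix(synopses: List[Set[str]]) -> List[List[float]]:
    """Entry (i, j) holds the similarity between the i-th and j-th synopsis."""
    n = len(synopses)
    return [[jaccard(synopses[i], synopses[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical synopses: each is a set of observed infection attributes.
synopses = [
    {"net:irc", "alarm:scan", "host:registry_change"},
    {"net:irc", "alarm:scan", "host:registry_change"},
    {"net:http", "host:file_drop"},
]

matrix = similarity_matrix(synopses)
for row in matrix:
    print(" ".join(f"{value:.2f}" for value in row))
```

In a plot of this matrix, entries near 1 would be drawn in red and entries near 0 in blue, so groups of similar synopses would show up as red blocks along the diagonal once the rows are ordered by cluster.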

Development Team
Arvind Narayanan (UTexas Austin), Phillip Porras (SRI), Vinod Yegneswaran (SRI), Jian Zhang (SRI)
Acknowledgements: Special thanks to Cliff Wang at the Army Research Office (ARO) and Karl Levitt at the National Science Foundation for their sponsorship of this research.