BrainCan - The Topology of Malicious Activity on IPv4

by Suchin Gururangan & Bob Rudis

At Rapid7, we are committed to engaging in research to help defenders understand, detect and defeat attackers. We conduct internet-scale research to gain insight into the volatile threat landscape and share data with the community via initiatives like Project Sonar [1] and Heisenberg [2]. As we crunch this data, we get a better idea of the global exposure to common vulnerabilities and can see emerging patterns in offensive attacks.

We also use this data to add intelligence to our products and services. We’re developing machine learning models that use this daily internet telemetry to identify phishing sites and to find and classify devices through their certificate and site configurations.

We have recently focused our research on how these tools can work together to provide unique insight on the state of the internet. Looking at the internet as a whole can help researchers identify stable, macro-level trends in the individual attacks between IP addresses. In this post, we'll give you a window into these explorations.

IPv4 Topology

First, a quick primer on IPv4, the fourth version of the Internet Protocol. The topology of IPv4 is characterized by three levels of hierarchy, from smallest to largest: IP addresses, subnets, and autonomous systems (ASes). IP addresses on IPv4 are 32-bit sequences that identify hosts or network interfaces. Subnets are groups of IP addresses, and ASes are blocks of subnets managed by public institutions and private enterprises. IPv4 is divided into approximately 65,000 ASes, at least 30M subnets, and 2^32 IP addresses.

Malicious ASes

There has been a great deal of academic and industry focus on identifying malicious activity in and across autonomous systems [3,4,5,6], and for good reason. Well over 50% of “good” internet traffic comes from a small subset of large, well-defined, ocean-like ASes pushing content from Netflix, Google, Facebook, Apple and Amazon.
Despite this centralization of “cloud” content, we’ll show that the internet has become substantially more fragmented over time, enabling those with malicious intent to stake their claim in less friendly waters. In fact, our longitudinal data on phishing activity across IPv4 presented an interesting trend: a small subset of autonomous systems have regularly hosted a disproportionate amount of malicious activity. In particular, 200 ASes hosted 70% of phishing activity from 2007 to 2015 (data: cleanmx archives [7]). We wanted to understand what makes some autonomous systems more likely to host malicious activity.

IPv4 Fragmentation

We gathered historical data on the mapping between IP addresses and ASes from 2007 to 2015 to generate a longitudinal map of IPv4. This map clearly suggested that IPv4 has been fragmenting. In fact, the total number of ASes has grown 60% in the past decade. During the same period, there has been a rise in the number of small ASes and a decline in the number of large ones. These results make sense given that the IPv4 address space has been exhausted. This means that growth in IPv4 access requires the reallocation of existing address space into smaller and smaller independent blocks.

AS Fragmentation

Digging deeper into the Internet hierarchy, we analyzed the composition, size, and fragmentation of malicious ASes.
ARIN, one of the primary registrars of ASes, categorizes subnets based on the number of IP addresses they contain. We found that the smallest subnets available made up on average 56 ± 3.0 percent of a malicious AS.
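To make the subnet bookkeeping in this section concrete, here is a minimal sketch using Python's standard `ipaddress` module. It is not the code used for this research. It assumes that "smallest subnets" means /24 or longer prefixes and that the share is taken by subnet count rather than by address space; it also treats a "root" as any observed subnet that is not contained in another observed subnet. The function names and the prefixes are hypothetical.

```python
import ipaddress


def smallest_subnet_share(prefixes, smallest_prefixlen=24):
    """Fraction of an AS's subnets that fall in the smallest size bucket.

    Assumption: the smallest bucket is /24 or longer, counted per subnet.
    """
    nets = {ipaddress.ip_network(p) for p in prefixes}
    small = sum(1 for n in nets if n.prefixlen >= smallest_prefixlen)
    return small / len(nets)


def root_child_counts(prefixes):
    """Count root subnets (no observed parent) and child subnets."""
    nets = {ipaddress.ip_network(p) for p in prefixes}
    # A subnet is a root if no *other* observed subnet contains it.
    roots = sum(
        1 for n in nets
        if not any(n != m and n.subnet_of(m) for m in nets)
    )
    return roots, len(nets) - roots


# Hypothetical prefixes seen for one AS over its lifetime (not real data).
observed = [
    "10.0.0.0/16",
    "10.0.1.0/24",
    "10.0.2.0/24",
    "10.0.2.128/25",
    "192.0.2.0/24",
]

share = smallest_subnet_share(observed)
roots, children = root_child_counts(observed)
ratio = roots / children if children else float("inf")
print(f"smallest-subnet share: {share:.2f}")
print(f"roots={roots} children={children} root/child ratio={ratio:.2f}")
```

The pairwise containment check is quadratic in the number of prefixes, which is fine for a single AS; an internet-wide run would more likely use a prefix tree.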
We inferred the size of an AS by calculating its maximum amount of addressable space. Malicious ASes were in the 80-90th percentile in size across IPv4.

To compute fragmentation, subnets observed in ASes over time were organized into trees based on parent-child relationships (Figure 3). We then calculated the ratio of the number of root subnets, which have no parents, to the number of subsequent child subnets across the lifetime of the AS. We found that malicious ASes were 10-20% more fragmented than other ASes in IPv4.

These results suggest that malicious ASes are large and deeply fragmented into small subnets. ARIN fee schedules [8] showed that smaller subnets are significantly less expensive to purchase, and the inexpensive nature of small subnets may allow malicious registrars to purchase many IP blocks for traffic redirection or host proxy servers to better float under the radar.

Future Work

Further work is required to characterize the exact cost structure of buying subnets, registering IP blocks, and setting up infrastructure in malicious ASes. We'd also like to understand the network and system characteristics that cause attackers to choose to co-opt a specific autonomous system over another.

For example, we used Sonar’s historical forwardDNS service and our phishing detection algorithms to characterize all domains that have mapped to these ASes in the past two years. Domains hosted in malicious ASes had features that suggested deliberate use of specific infrastructure. For example, 'wordpress' sites were over-represented in some malicious ASes (like AS4808), and GoDaddy was by far the most popular registrar for malicious sites across the board.

We can also use our SSL Certificate classifier to understand the distribution of devices hosted in ASes across IPv4, as seen in the chart below:

Each square above shows the probability distribution (a fancier, prettier histogram) of device counts of a specific type.
Most ASes host fewer than 100 devices across a majority of categories. Are there skews in the presence of specific devices to propagate phishing attacks from these malicious ASes?

Conclusion

Our research presents the following results:

A small subset of ASes continue to host a disproportionate amount of malicious activity.
Smaller subnets and ASes are becoming more ubiquitous in IPv4.
Malicious ASes are deeply fragmented.
There is a concentrated use of specific infrastructure in malicious ASes.
Attackers both co-opt existing devices and stand up their own infrastructure within ASes (a gut-check would suggest this is obvious, but having data to back it up also makes it science).

Further work is required to characterize the exact cost structure of buying subnets, registering IP blocks, and setting up infrastructure in malicious ASes, along with what network and system characteristics cause attackers to choose to co-opt one device in one autonomous system over another.

This research represents an example of how Internet-scale data science can provide valuable insight on the threat landscape. We hope similar macro-level research is inspired by these explorations, and we will be bringing you more insights from Project Sonar & Heisenberg over the coming year.
[1] Sonar intro
[2] Heisenberg intro
[3] G. C. M. Moura, R. Sadre and A. Pras, _Internet Bad Neighborhoods: The spam case,_ Network and Service Management (CNSM), 2011 7th International Conference on, Paris, 2011, pp. 1-8.
[4] B. Stone-Gross, C. Kruegel, K. Almeroth, A. Moser and E. Kirda, “FIRE: FInding Rogue nEtworks”; doi: 10.1109/ACSAC.2009.29
[5] C. A. Shue, A. J. Kalafut and M. Gupta, “Abnormally Malicious Autonomous Systems and Their Internet Connectivity”; doi: 10.1109/TNET.2011.2157699
[6] A. J. Kalafut, C. A. Shue and M. Gupta, “Malicious Hubs: Detecting Abnormally Malicious Autonomous Systems”; doi: 10.1109/INFCOM.2010.5462220
[7] Cleanmx archive
[8] ARIN Fee Schedule

Source: rapid7.com
