Why Don’t Harmful Websites Disappear? A Connected Crime Ecosystem
Harmful websites have recently grown far beyond simple illegal operations. They range from illegal gambling platforms, which seriously affect adolescents and military personnel, to content piracy and distribution sites such as N○N○ TV and NewT○. These websites now cause severe economic losses, social disruption, and addiction-related and crime-related harm to individuals. Rather than existing in isolation, harmful sites are typically operated as interconnected networks, and a single operating group often manages several types of harmful websites at the same time. These sites target vulnerable groups, generate illegal profits, and survive by continually evading blocking and enforcement measures. As a result, harmful websites can no longer be treated as individual cases to be handled by simple crackdowns. They have become a structural threat that keeps evolving, stays interconnected, and spreads through the entire cyber ecosystem like a disease, making cybercrime markedly more severe.
To respond effectively to these threats, our research team is developing the Cyber Crime Tracker (CCT) framework, an AI-based proactive detection system designed around the real characteristics of harmful websites. The framework analyzes website features to identify groups of sites run by the same operating organization, then infers and visualizes the relationships between them to reveal the underlying structure of crime networks. It also automatically analyzes technical vulnerabilities in harmful websites and generates LLM-based analytical reports that can serve as intelligence for cybercrime investigations.
In this blog series, we introduce the research behind the CCT framework in detail. This first post briefly examines the key characteristics of harmful websites and explains why technical approaches for detecting this ecosystem and analyzing the relationships between websites are essential.
Harmful Websites from the CCT Perspective: illegal gambling, adult content, drugs and illegal drug trading, illegal webtoons, illegal torrent websites, illegal streaming services, and websites used to promote illegal platforms.
“We’re All in This Together”: Harmful Websites that Move as a Group
Harmful websites are not operated as isolated web pages. Instead, they function as organically connected networks. Through shared external traffic channels such as social media and text messages, multiple site addresses are distributed at the same time. Banner ads and referral links encourage users to move from one site to another, allowing traffic and illegal profits to circulate within the network. In addition, websites with similar structures are operated as bundled groups and often share overseas cloud infrastructure. Even if one site is blocked, users are quickly redirected to another site, or a replacement site is rapidly launched to keep the overall network running. This structure clearly shows why responding to individual websites alone is not enough to disrupt the harmful website ecosystem.

<Figure 1. An example of harmful websites with different names that share the same structure>
Connected Relationships Beyond Individual Harmful Websites
Because of the characteristics described earlier, harmful websites should not be seen as isolated points. Instead, they form a complex, web-like network composed of nodes, which represent individual harmful websites, and edges, which represent the connections between them. This means that analyzing harmful websites requires an approach that considers not only individual sites but also the relationships between sites and the structure of the organizations operating behind them.
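The node-and-edge view can be made concrete with a small sketch. The example below is only an illustration under stated assumptions, not the CCT implementation: the site names are hypothetical, and an edge stands for any observed connection between two sites. It groups sites that are linked directly or indirectly, which is the simplest way to see a set of sites as one network.

```python
from collections import defaultdict, deque

def connected_groups(edges):
    """Return groups of sites linked directly or indirectly by any edge."""
    # Build an undirected adjacency list: each site is a node, each link an edge.
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    groups = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Breadth-first search collects every site reachable from `start`.
        queue = deque([start])
        seen.add(start)
        group = []
        while queue:
            site = queue.popleft()
            group.append(site)
            for neighbor in sorted(adjacency[site]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        groups.append(sorted(group))
    return groups

# Hypothetical observations: (site, site) pairs with some observed connection.
edges = [
    ("site-a.example", "site-b.example"),
    ("site-a.example", "site-c.example"),
    ("site-d.example", "site-e.example"),
]
groups = connected_groups(edges)
print(groups)
```

With these sample edges, the sites fall into two groups: one containing sites a, b, and c, and one containing sites d and e. In practice an edge would likely carry a type and a weight, since not every connection is equally strong evidence of common operation.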
For quick, intuitive analysis, tools such as VirusTotal, Shodan, and Criminal IP can be effective. They let analysts rapidly look up WHOIS data for a specific domain or IP address, analyze malicious scripts, and detect phishing domains. However, while these tools are strong at analyzing individual websites, they have clear limitations in three areas: understanding the overall connectivity of the harmful website ecosystem, identifying relationships between operating organizations, and conducting content-based analysis that reflects the unique characteristics of harmful websites.
When real harmful websites are analyzed, clear signs of connections between sites often emerge. These include banner advertising links between websites, shared code and template structures, and the reuse of registration or referral codes. Manual analysis of these elements reveals that multiple harmful websites, rather than operating independently, form a single interconnected network.
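As a rough sketch of how the first of these signals, links between websites, might be collected automatically, the example below extracts the external hosts a page links to, using only the Python standard library. The host names and HTML snippet are hypothetical, and treating every external `<a href>` as a candidate connection is a simplifying assumption; real banner networks may also rely on scripts or redirects that this does not capture.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class OutboundLinkParser(HTMLParser):
    """Collect external hosts that a page links to via <a href=...>."""

    def __init__(self, own_host):
        super().__init__()
        self.own_host = own_host
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        host = urlparse(href).hostname
        # Relative links have no hostname; skip them and links to the page's own host.
        if host and host != self.own_host:
            self.hosts.add(host)

def outbound_edges(own_host, html):
    """Return sorted (source host, target host) pairs found in the page."""
    parser = OutboundLinkParser(own_host)
    parser.feed(html)
    return sorted((own_host, host) for host in parser.hosts)

# Hypothetical page from site A carrying two banner ads.
sample_html = """
<a href="/notice">Notice</a>
<a href="https://site-b.example/join?code=NEW"><img src="/banner/b.gif"></a>
<a href="https://site-c.example/"><img src="/banner/c.gif"></a>
<a href="https://site-a.example/board">Board</a>
"""
edges = outbound_edges("site-a.example", sample_html)
print(edges)
```

Each resulting pair can be read as a directed edge from the advertising site to the advertised site.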
For example, as shown in Figure 2, gambling website B, which is advertised on illegal content-sharing website A, uses the registration code “NEW.” Gambling website C, also advertised on website A, uses the same registration code. Although websites B and C have different names, their key elements are identical: banner image paths, menu structures, icon images, overall page layout, and HTML and JavaScript code. B and C may therefore appear to be separate websites, but they can reasonably be considered to be managed by the same operating organization. Furthermore, because website A also uses the same registration code, it can be inferred that website A is either operated by the same organization or cooperates closely with it at an operational level.

<Figure 2(a). An example of different gambling websites using the same registration code>

<Figure 2(b). Similarities between different illegal gambling websites>
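The registration code comparison in this example can be sketched as a simple grouping step. The snippet below is a minimal illustration with hypothetical site names; normalizing case and whitespace is our assumption, not a documented CCT rule. A generic code such as “NEW” could also be shared by coincidence, so a match is better treated as one signal to combine with others than as proof on its own.

```python
from collections import defaultdict

def group_by_registration_code(observations):
    """Map each registration code to the sorted sites where it was observed."""
    sites_by_code = defaultdict(set)
    for site, code in observations:
        # Assumption: normalize case and whitespace so "new " and "NEW" match.
        sites_by_code[code.strip().upper()].add(site)
    # Keep only codes seen on two or more sites; a single sighting links nothing.
    return {
        code: sorted(sites)
        for code, sites in sites_by_code.items()
        if len(sites) >= 2
    }

# Hypothetical (site, registration code) observations.
observations = [
    ("gambling-b.example", "NEW"),
    ("gambling-c.example", "new"),
    ("content-a.example", "NEW"),
    ("other-d.example", "VIP7"),
]
shared = group_by_registration_code(observations)
print(shared)
```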
Because of these operational patterns and structural characteristics, existing analysis tools need to be complemented with additional approaches: code and template analysis that compares HTML, JavaScript, and resource paths; inter-site and advertising network analysis based on banner ads, external links, and redirection paths; analysis of social media promotion channels and account IDs, focusing on shared accounts and behavioral patterns; and behavior-based clustering that aggregates operational elements such as registration codes and site policies. These analyses should be conducted step by step. Integrating their results makes it possible to build cybercrime intelligence that goes beyond individual websites and identifies common operating organizations and their extended structures.
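To illustrate the code and template analysis step, the sketch below compares two sites by the overlap of their resource paths using Jaccard similarity. This is one simple measure we chose for illustration, not necessarily the one CCT uses, and the paths are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    if not a and not b:
        # Two empty sets carry no evidence, so report no similarity.
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical resource paths collected from three sites.
site_b = {"/img/banner_top.png", "/js/main.js", "/css/layout.css", "/img/icon_cash.png"}
site_c = {"/img/banner_top.png", "/js/main.js", "/css/layout.css", "/img/icon_event.png"}
site_x = {"/static/app.js", "/static/site.css"}

print(jaccard(site_b, site_c))  # 3 shared paths out of 5 distinct paths
print(jaccard(site_b, site_x))  # no shared paths
```

Sites B and C share three of five distinct paths, while B and X share none. Where to set the threshold for "same template" is a judgment call that would need to be tuned on real data.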
The Cybercrime Ecosystem Tracked by CCT (The Harmful Website Crime Ecosystem)
Viewing the cybercrime ecosystem from this new analytical perspective means that the way we respond must also change. With this in mind, CCT goes beyond simple post-incident blocking. Its ultimate goal is to proactively detect harmful websites, analyze them in a structured manner, and build intelligence that can be used to track criminal organizations.
First, CCT aims to build a URL collection process and an AI-based proactive detection system tailored to the characteristics of harmful websites. By automatically determining whether collected websites are harmful, CCT goes beyond tracking websites that have already spread and focuses on identifying newly created harmful websites at an early stage. In this way, CCT moves away from traditional response models that rely on after-the-fact action and provides a more active, forward-looking detection framework.
Second, CCT analyzes harmful websites at the level of operating organizations rather than as individual sites. A single operating organization often manages multiple websites in parallel, and these sites are connected through various elements such as banner advertisements, website structure, and source code similarity. This research integrates such information to infer website operation groups using AI and to visualize the relationships between websites in the form of a crime network map. This allows investigators to more clearly understand which groups of websites are operated by the same organization and how the crime ecosystem expands structurally.
Finally, CCT focuses on the technical vulnerabilities embedded in harmful websites. By accumulating and structuring the results of automated vulnerability analysis, CCT builds actionable cybercrime response intelligence. Furthermore, by using large language models to automatically generate vulnerability analysis reports for each website, CCT aims to provide practical decision-support intelligence for next-generation cybercrime investigations.
In this post, we provided an overview of the goals of CCT and the perspective it takes in analyzing harmful websites. In Part 2, we will take a closer look at how effective data for understanding the cybercrime ecosystem is collected and how it is transformed into intelligence through the analysis process. We appreciate your continued interest.

<Figure 3. Detailed Overview of the CCT Framework>

Conducts a range of cybercrime-related research at the KAIST Cyber Security Research Center, including CCT, email anomaly detection, and voice phishing.

Researcher on the Cyber Threat Analysis Team at the KAIST Cyber Security Research Center, conducting malware analysis research.