Cybercrime is ever present in the increasingly digital landscape. Experts say that almost six ransomware attacks will occur every minute, and back in November 2021, our data science team reported a 280% increase in ransomware groups within the last year. But that’s only ransomware; cybercrime in general has also been increasing at a rapid pace. It’s more important than ever to understand cybercriminals so that they can be more easily profiled and, ultimately, shut down.
To give you a better understanding of cybercriminals, we have outlined 8 passive OSINT methods for profiling cybercriminals on the dark web. With these methods, your cybersecurity team can be equipped to be proactive in their activities and stop hackers before they attack.
What is Passive OSINT?
We can roughly divide investigation methods for malicious actors in the criminal underground into two categories: passive and active. Active would include any kind of Human Intelligence (HUMINT) activity, where we interact with actors, infiltrate communities, request information, etc. Passive, in this context, means the collection of information without the actor’s knowledge, and leveraging this information to extract additional insights.
The value of a passive profiling approach is that it can be done on a large scale, where technology can help by enabling highly automated crawling, scraping and analysis processes.
Across the large number of markets, forums and chat rooms, malicious actors tend to use multiple usernames.
Similar usernames are often used by vendors that are looking to build or maintain a reputation. As on mainstream ecommerce platforms, trust in a vendor is important and actors want to keep that trust to keep selling products, sometimes at a premium price due to high quality or reputation. Of course, this dynamic significantly simplifies tracking malicious actors from one platform to another. Large-scale dark web collection tools can almost instantly correlate these profiles together and surface insights coming from multiple sources.
On the other hand, actors can look to change usernames, for a number of reasons:
- Escaping a ruined reputation: Whether it was legitimately destroyed or not, a destroyed reputation can prevent an actor from achieving his desired goals.
- Hiding types of activities from other dark web actors: A user may want to hide certain types of activities from others – he could for example use one moniker* for fraud-related activities, and another for drug-related ones.
- Hiding activities from law enforcement: An actor may want to cover his tracks between different campaigns or attacks to reduce the risk of getting caught.
In this situation, it becomes much more difficult to track these pseudonyms and build a strong profile on the individual. That being said, there are a number of methods that can be applied to help the identification of those profiles.
8 Methods For Profiling Cybercriminals
Method #1 – Usernames
Username correlation is extremely straightforward, but only works when the actor is not attempting to hide himself, or when his low level of sophistication keeps him from knowing that this information can be leveraged by external parties.
As described below, it works well for popular actors and can quickly provide intelligence on the scope of their activities.
On the other hand, for simple monikers, it can cause significant false positives that make this technique only slightly useful, since additional analysis is required to validate that it is in fact the same actor.
The example below shows a situation where simple username correlation, with an added analysis of the profile description on each website, helps us understand that they seem to be the same actor.
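The username correlation described in this method can be sketched in a few lines of Python. This is a minimal illustration, not a production matcher: the normalization rules, the 0.85 threshold and the (platform, username) input format are all assumptions made for the example, and any match still needs the manual validation discussed above.

```python
import re
from difflib import SequenceMatcher


def normalize(username: str) -> str:
    """Lowercase, drop separators, and strip trailing digits (assumed to be noise)."""
    name = re.sub(r"[\W_]+", "", username.lower())
    # Fall back to the unstripped name if the username was digits only.
    return re.sub(r"\d+$", "", name) or name


def username_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two normalized usernames."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def correlate(profiles, threshold=0.85):
    """Return pairs of look-alike usernames seen on different platforms.

    `profiles` is a list of (platform, username) tuples (hypothetical format).
    """
    matches = []
    for i, (platform_a, user_a) in enumerate(profiles):
        for platform_b, user_b in profiles[i + 1:]:
            if platform_a == platform_b:
                continue
            score = username_similarity(user_a, user_b)
            if score >= threshold:
                matches.append((user_a, user_b, round(score, 2)))
    return matches
```

Short or generic monikers will score highly against many unrelated accounts, which is exactly the false-positive problem noted above; raising the threshold or requiring a minimum username length are two simple ways to reduce that noise.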
Method #2 – Vocabulary, Writing Styles, Typos
Many actors don’t communicate in their native language. They usually use English, at least on large mainstream dark web markets, to increase the reach of their offering. From this second-language communication (among other things) come many unique ways of writing, specific words and unique mistakes. Some of these will be regional, and entire groups of actors will use the same vocabulary (such as the word methode instead of method, used by many seemingly Francophone actors writing in English), but in other cases the uniqueness will be limited to a single individual, who can then be tracked across multiple pseudonyms.
In other cases, actors will copy-paste content that contains either a very specific string of text, or a very precise typo. Although it can simply mean that an actor copied another one, it can also be an indicator that the same individual created multiple listings or ads on different platforms.
These searches are difficult to run manually but become quite straightforward with any search engine that can query the full dataset of crawled data.
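As a rough illustration of the idea, the sketch below looks for unusual words (such as a recurring typo) that are shared by only a handful of pseudonyms. The (author, text) input format, the known_words vocabulary and the max_authors cutoff are assumptions for the example; a real deployment would rely on a proper dictionary and a search index rather than an in-memory loop.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def rare_shared_tokens(posts, known_words, max_authors=3):
    """Map each out-of-vocabulary token to the authors who used it.

    Only tokens used by at least 2 and at most `max_authors` distinct
    authors are kept: a single author gives nothing to link, and a widely
    used token is more likely regional slang than a personal signature.
    """
    authors_by_token = defaultdict(set)
    for author, text in posts:
        for token in set(tokenize(text)):
            if token not in known_words:
                authors_by_token[token].add(author)
    return {
        token: sorted(authors)
        for token, authors in authors_by_token.items()
        if 2 <= len(authors) <= max_authors
    }
```

A shared token is only a lead: as noted above, it can also mean that one actor copied another, so the result should be weighed together with the other methods.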
Method #3 – Types of Items
Another way to link profiles together is by looking at the type of items that are posted on the different platforms. This can include listings, forum posts, chatroom ads, etc. Malicious actors will generally be selling or advertising the same type of products or services across the different platforms. For example, a malicious actor may be publishing phishing kits for Canadian financial institutions. Creating a list of the malicious actors that post that kind of product is a good starting point, which can then be narrowed down based on additional criteria. Natural language processing can be especially useful here to help run the process on large amounts of data spread across multiple sources.
An interesting dynamic at play here is the desire for malicious actors to look and sound unique when they sell products and services. As with any mainstream legitimate vendor, they want their offering to be differentiated to attract buyers. A malicious actor that has seen a successful differentiation technique (with certain memorable listing titles for example) may apply it again and again, even if using different pseudonyms, at different points in time.
By itself, this technique has limited accuracy and can surface multiple false positives, so it should be combined with others to increase accuracy.
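One simple way to approximate this comparison, well short of full natural language processing, is to measure how many words two listing titles have in common. The sketch below uses Jaccard similarity on title tokens; the (actor, title) input format and the 0.6 threshold are assumptions chosen for illustration.

```python
import re
from itertools import combinations


def title_tokens(title):
    """Return the set of lowercase alphanumeric tokens in a listing title."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_listings(listings, threshold=0.6):
    """Return (actor_a, actor_b, score) for similar titles posted by different actors."""
    results = []
    for (actor_a, title_a), (actor_b, title_b) in combinations(listings, 2):
        if actor_a == actor_b:
            continue
        score = jaccard(title_tokens(title_a), title_tokens(title_b))
        if score >= threshold:
            results.append((actor_a, actor_b, round(score, 2)))
    return results
```

Because many vendors sell the same category of product with near-identical wording, a high score here only narrows the list of candidates; it does not establish that two pseudonyms belong to the same person.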
Method #4 – PGP Keys
Actors tend to list their PGP (Pretty Good Privacy) key on their profile to prove their legitimacy and enable encrypted communication with their peers. Out of laziness or through lack of sophistication, they will often use the same key on different platforms. By comparing the keys that are posted, we can link together these different profiles.
This technique is one of the most reliable approaches to matching profiles since there is very little benefit for a malicious actor to publicize another actor’s key (with the exception of a counter-intelligence approach).
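Comparing posted keys can be sketched as follows. Note the assumption: rather than parsing the key and computing its real OpenPGP fingerprint, this example hashes the normalized ASCII-armored body as a stand-in, which is enough to detect the same exported key pasted on two profiles but will not match two different exports of one key. The function names and the profile dictionary format are hypothetical.

```python
import hashlib
import re
from collections import defaultdict

ARMOR_RE = re.compile(
    r"-----BEGIN PGP PUBLIC KEY BLOCK-----(.*?)-----END PGP PUBLIC KEY BLOCK-----",
    re.DOTALL,
)


def key_digest(profile_text):
    """Return a SHA-256 digest of the first armored public key body, or None."""
    match = ARMOR_RE.search(profile_text)
    if match is None:
        return None
    body_lines = []
    for line in match.group(1).splitlines():
        line = line.strip()
        # Skip blank lines, armor headers ("Version: ...") and the checksum line ("=abcd").
        if not line or ":" in line or line.startswith("="):
            continue
        body_lines.append(line)
    if not body_lines:
        return None
    # Joining the lines makes the digest insensitive to line wrapping.
    return hashlib.sha256("".join(body_lines).encode("utf-8")).hexdigest()


def group_by_key(profiles):
    """Group profile names that published the same key.

    `profiles` maps a profile name to its scraped profile text.
    """
    groups = defaultdict(list)
    for name, text in profiles.items():
        digest = key_digest(text)
        if digest:
            groups[digest].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

Normalizing whitespace and headers matters in practice because forum software often reflows or decorates pasted key blocks, so a byte-for-byte comparison of the raw profile text would miss otherwise identical keys.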
Method #5 – Email Addresses
Although investigating email addresses is a much broader subject than what we cover here, there are certain elements to keep in mind when looking at the emails of dark web actors.
Finding these addresses should remain a continuous objective due to their value. Even without the legal access to query ISPs or email providers for additional information, we can run searches on a number of different OSINT tools to gather additional intelligence.
There are a number of places where we can encounter email addresses. Of course, the most straightforward places are when malicious actors post them in forum conversations, in chat rooms, or on their profiles on any of these platforms. By using a large-scale index of dark web data that supports regex-type queries, we can easily search for email addresses that are the same as, or close to, the pseudonyms that we are investigating. Depending on where the result is found, it may require more or less additional validation before being attached to a malicious actor, since the context of the information can vary (an actor might be posting the address of another actor for example).
A less obvious place where email addresses can be found, related to the previous method, is in the actor’s PGP key. An address is included when generating a key and, once again generally through lack of sophistication, malicious actors will include a legitimate address. This may be an anonymous address they have access to, such as a ProtonMail account, or it may actually be a real email address from a known email provider. If the investigator has the power to request information, through a warrant or otherwise, from organizations like Google or Microsoft, this can quickly lead to more insights on the malicious actor. If not, the information can still be used to pivot and gather more intelligence (more below).
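The regex-type search for addresses resembling a pseudonym could look something like the sketch below. The email pattern is deliberately simple and will not cover every valid address, and the (source, text) document format and the 0.8 similarity threshold are assumptions for the example.

```python
import re
from difflib import SequenceMatcher

# Simplified email pattern; intentionally not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def emails_near_pseudonym(documents, pseudonym, threshold=0.8):
    """Return (source, email, score) for addresses whose local part resembles the pseudonym.

    `documents` is a list of (source, text) tuples from crawled data.
    """
    target = pseudonym.lower()
    hits = []
    for source, text in documents:
        for email in EMAIL_RE.findall(text):
            local_part = email.split("@", 1)[0].lower()
            score = SequenceMatcher(None, target, local_part).ratio()
            # Keep exact containment as well as close fuzzy matches.
            if target in local_part or score >= threshold:
                hits.append((source, email, round(score, 2)))
    return hits
```

Keeping the source alongside each hit is a deliberate choice: as noted above, the context in which an address appears determines how much validation it needs before it is attached to an actor.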
Method #6 – Contact Information
On top of email addresses, other elements can be used to pivot on other sources. These include Telegram handles, usernames, nicknames, phone numbers, Discord usernames, etc. Leveraging these is usually quite straightforward, but depends on the public information available on the third-party platform. Some chat platforms, for example, support a search by username, while others do not. The minimum that should be done with any information of that type is a simple search on scraped dark web data to see if the information surfaces anywhere else.
Method #7 – Cryptocurrency Addresses
Cryptocurrency transactions might be a little out of scope, but they still relate to the dark web due to their ubiquitous use in transactions between malicious actors. At a minimum, an address can help, through a simple search, find other monikers that are requesting funds to be sent to that address. This technique has limited use on cryptomarkets since the transaction happens behind closed doors (addresses are generally not publicly visible since the market handles the transaction and takes a commission). In chat rooms or on forums, there is a higher chance that malicious actors will actually publicize an address.
A highly reliable result with this technique is a malicious actor requesting that funds be sent to a given address, since only the actor who controls an address can access the funds sent to it. Requesting funds to be sent to an address he does not control serves no purpose, except for counter-intelligence motives, if the malicious actor has a serious enough reason to accept losing funds to make the stratagem work.
A first way to leverage an address is to, once again, simply run a search and see if the address is mentioned anywhere else in the underground communities and if the context enables us to assign it to a malicious actor (as explained above, fund requests are highly accurate).
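The simple search described above can be sketched as an index from addresses to the monikers that mention them. The patterns below only check the general shape of Bitcoin and Ethereum addresses and do not validate checksums, so they can produce false matches; the (moniker, text) input format is an assumption for the example.

```python
import re
from collections import defaultdict

ADDRESS_RE = re.compile(
    r"\b(?:[13][a-km-zA-HJ-NP-Z1-9]{25,34}"  # Bitcoin legacy address (Base58)
    r"|bc1[ac-hj-np-z02-9]{11,71}"           # Bitcoin bech32 address
    r"|0x[a-fA-F0-9]{40})\b"                 # Ethereum address
)


def monikers_by_address(posts):
    """Return addresses mentioned by more than one moniker.

    `posts` is a list of (moniker, text) tuples from crawled data.
    """
    index = defaultdict(set)
    for moniker, text in posts:
        for address in ADDRESS_RE.findall(text):
            index[address].add(moniker)
    return {addr: sorted(names) for addr, names in index.items() if len(names) > 1}
```

The index says nothing about why an address was mentioned. Whether a post is a request for funds, which is the high-confidence case, or merely a reference to someone else's address still has to be read from the surrounding text.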
A second way to leverage an address is to explore the relevant blockchain to find related information such as other addresses, amounts and transaction dates. Highly anonymous cryptocurrencies such as Monero have limited use here, but blockchains such as Bitcoin or Ethereum do open up some possibilities.
Simple exercises involve looking at the origin and destination of the transactions, and performing additional searches on any related addresses.
More advanced approaches involve looking at clustering data, especially for Bitcoin, and identifying other addresses that may belong to the same actor.
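The article does not name a specific clustering method. One widely used heuristic for Bitcoin is common-input ownership: addresses that appear together as inputs of the same transaction are assumed to be controlled by the same entity. The sketch below applies that heuristic with a union-find structure over already-parsed transactions; the dictionary format with an "inputs" list is hypothetical, and the heuristic itself can be wrong (for example with collaborative transactions such as CoinJoin).

```python
def cluster_addresses(transactions):
    """Cluster addresses with the common-input-ownership heuristic.

    `transactions` is a list of dicts, each with an "inputs" list of
    address strings. Returns a sorted list of sorted address clusters.
    """
    parent = {}

    def find(addr):
        """Return the cluster representative for an address."""
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]  # path halving
            addr = parent[addr]
        return addr

    def union(a, b):
        """Merge the clusters containing a and b."""
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for tx in transactions:
        inputs = tx["inputs"]
        for addr in inputs:
            find(addr)  # register every input, even if it is alone
        for other in inputs[1:]:
            union(inputs[0], other)

    clusters = {}
    for addr in list(parent):
        clusters.setdefault(find(addr), set()).add(addr)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])
```

Each resulting cluster can then be fed back into the searches described earlier, since any address in the cluster may have been posted under a different moniker.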
Exploring the blockchain can quickly become overwhelming, and there are therefore commercial solutions that enable stronger and better intelligence gathering, analysis and tracking.
Method #8 – Passwords
Password reuse is certainly an issue in the general population, but can also be a tool to track malicious actors. Like any modern user, they also reuse passwords across different platforms and, with the growing number of credential breaches, searching for leaked passwords can help identify additional usernames related to a malicious actor. This technique is quite accurate, due to the sensitive nature of passwords (by definition, passwords are not shared between individuals). Of course, the more unique a password is, the more accurate the results will be (the password love123 may be used by a large number of non-related individuals).
The technique is quite simple: based on a username or email address, we can search through leaked credential databases to see if there are any matches. We can run another search for any password that is found and see if additional usernames or addresses are matched. This technique is especially powerful through the use of a platform that provides search access to these billions of leaked credentials through a search bar or an API (note that for privacy reasons haveibeenpwned.com does not provide this functionality, but a number of commercial solutions do).
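The two-step pivot described above can be sketched over an in-memory list of leaked credentials. The (identifier, password) record format and the common_passwords exclusion set are assumptions for the example; a real investigation would query a lawfully accessed breach dataset through a search platform or API rather than hold the records in a list.

```python
def pivot_on_passwords(records, seed, common_passwords=frozenset()):
    """Return identifiers that share an uncommon password with `seed`.

    `records` is a list of (identifier, password) tuples, where the
    identifier is a username or email address.
    """
    # Step 1: collect the passwords leaked for the seed identifier,
    # ignoring passwords too common to be meaningful.
    passwords = {
        password
        for identifier, password in records
        if identifier == seed and password not in common_passwords
    }
    # Step 2: find every other identifier that used one of those passwords.
    related = {
        identifier
        for identifier, password in records
        if password in passwords and identifier != seed
    }
    return sorted(related)
```

Excluding common passwords reflects the point made above about love123: a match on a weak, popular password says very little, while a match on a long, unusual one is a strong indicator.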
Overall, we should keep in mind that as actors gain in sophistication, they will inevitably start thinking about countering known investigation methods. As we look into more mature actors, we should ensure we are gathering more robust indicators and building stronger cases around them.
Technology can improve the results and the efficiency of what is described above. From tracking cryptocurrency wallets to matching writing styles, any operation that is effective at a large scale is very hard, if not impossible, to achieve without technological solutions. These can be as simple as in-house scripts that execute operations over large amounts of text or simple crawlers that monitor certain websites for changes, or as complete as full commercial solutions that include advanced algorithms, machine learning and large-scale collection and analysis. Given the large scale of passive OSINT, using a tool that can turn collected intelligence into prioritized, contextualized findings will allow for better coverage and team efficiency for your organization.
Stay tuned for more insights about AI from our AI Expert, Olivier Michaud.
*a name or nickname