Most organizations face a barrage of attacks every day from threat actors around the globe. Among the various vectors, attackers have found a relatively high degree of success by (spear) phishing the organization's employees. This lets them bypass perimeter defences and gain a foothold in the internal network.
SOC teams have several ways to detect such phishing attempts. The most common ones are listed below:
- An alert user notifies them of a suspicious email
- The email gateway detects a suspicious email and notifies them
- External threat intelligence feed providers
- Brand/domain monitoring services, which notify them if anyone is trying to create phishing pages or has bought a domain similar in name to the organization’s primary domain
- A threat detected internally which, when investigated, reveals a phishing email as the culprit
Most of these approaches are static in nature. For example, if a previous incident response concluded that abcdbank.com was a malicious domain, it is added to the blacklist for the email and proxy filters. However, if the attacker simply changes the domain name to abcbdank.com, the filters will fail.
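The limitation of a static list can be shown with a minimal sketch. The `BLACKLIST` set and the `is_blocked` helper are hypothetical names used only for illustration; a real email or proxy filter will differ, but an exact-match lookup behaves the same way.

```python
# Hypothetical exact-match blacklist, as a static email or proxy filter might apply it.
BLACKLIST = {"abcdbank.com"}


def is_blocked(domain: str) -> bool:
    """Return True only when the domain is literally present in the blacklist."""
    return domain.lower() in BLACKLIST


# The known-bad domain is caught, the slightly altered one is not.
for candidate in ("abcdbank.com", "abcbdank.com"):
    print(candidate, "blocked" if is_blocked(candidate) else "allowed")
```

Swapping two adjacent characters is enough to slip past the filter, which is the gap the searches below try to close.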
To detect the phishing domains used against your users more effectively, we explore the use of the ELK stack. Elasticsearch is a very powerful search engine and data store. It has gained popularity in the infosec community because of its data analytics capabilities and the visualization that Kibana adds on top of them.
We will use three search options supported by Elasticsearch which will enable us to hunt for domains which are similar to the primary domain of the organization.
- Regular Expressions (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html)
- Wildcard (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-wildcard-query.html)
- Fuzzy search (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-fuzzy-query.html)
Assuming you receive your Proxy and DNS logs into ELK, we will demonstrate the efficiency of each search operation on Elasticsearch.
To generate a sample set of phishing domains that an attacker might use to trick employees, we will use URLCrazy, a tool available in Kali Linux. As an example, we shall consider niiconsulting.com as the primary domain of the organization.
DNSTwist and other online tools can also be used to generate such domains.
URLCrazy generates such domains using multiple ‘typo algorithms’:
| Typo Type | Example |
| --- | --- |
| Character Omission | niconsulting.com |
| Character Repeat | niicconsulting.com |
| Character Swap | nicionsulting.com |
| Character Replacement | miiconsulting.com |
| Double Character Replacement | nuuconsulting.com |
| Character Insertion | niicoinsulting.com |
| Missing Dot | wwwniiconsulting.com |
| Singular or Pluralize | niiconsultings.com |
| Vowel Swap | niiconsalting.com |
| Homophones | nayeayeconsultayeng.com |
| Bit Flipping | naiconsulting.com |
| Homoglyphs | nilconsultlng.com |
| Wrong TLD | niiconsulting.es |
As can be seen above, the attacker has many avenues to trick a user who isn’t aware of the subtle changes in the domain names.
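Three of the simpler typo types in the table can be sketched in a few lines of Python. This is an illustration only, not URLCrazy's implementation; the function names are made up for this example, and each function varies only the part of the domain before the first dot.

```python
def split_domain(domain):
    """Split 'name.tld' at the first dot into ('name', 'tld')."""
    name, _, tld = domain.partition(".")
    return name, tld


def omissions(domain):
    """Character Omission: drop one character at a time."""
    name, tld = split_domain(domain)
    return {name[:i] + name[i + 1:] + "." + tld for i in range(len(name))}


def repeats(domain):
    """Character Repeat: double one character at a time."""
    name, tld = split_domain(domain)
    return {name[:i + 1] + name[i] + name[i + 1:] + "." + tld
            for i in range(len(name))}


def swaps(domain):
    """Character Swap: exchange two adjacent, different characters."""
    name, tld = split_domain(domain)
    variants = set()
    for i in range(len(name) - 1):
        if name[i] != name[i + 1]:
            swapped = name[:i] + name[i + 1] + name[i] + name[i + 2:]
            variants.add(swapped + "." + tld)
    return variants


primary = "niiconsulting.com"
candidates = omissions(primary) | repeats(primary) | swaps(primary)
print(len(candidates), "candidate look-alike domains")
```

The generated sets contain the table's examples for these three types (niconsulting.com, niicconsulting.com and nicionsulting.com).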
From URLCrazy, we get a list of 184 such domains, which we ingest into Elasticsearch using Logstash.
A sample standard output is provided below:
We can view the ingested events in the Kibana dashboard as well.
Using the ‘Dev Tools’ in Kibana 5.3, we can run different search queries against the data ingested in Elasticsearch.
Using the three search operators, we have different levels of success in finding the above ‘phish’ domains.
For regexp, we randomly select a substring of our primary domain and append regular expression syntax to the query. A sample search returns 103 such domains. However, it should be noted that in a large data set, such a query will generate many false positives.
A query of this kind can also return a positive hit for unrelated domains such as abcconsulting.com or associateconsult.com.
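As a rough sketch of the regexp approach, the request body can be built as a Python dictionary and its effect approximated locally. The field name `domain` and the pattern `.*consult.*` are assumptions for illustration (the exact query used in the original run is not reproduced here), and the field is assumed to hold the whole domain as a single un-analyzed term. Python's `re.fullmatch` is used as a stand-in because an Elasticsearch regexp has to match the entire term.

```python
import json
import re

# Assumed field name and pattern; adjust both to your own index mapping.
regexp_query = {"query": {"regexp": {"domain": ".*consult.*"}}}
print(json.dumps(regexp_query, indent=2))


def regexp_hits(pattern, domains):
    """Local approximation: keep domains that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [d for d in domains if compiled.fullmatch(d)]


sample = [
    "niconsulting.com",      # look-alike of the primary domain
    "niiconsultings.com",    # look-alike of the primary domain
    "abcconsulting.com",     # unrelated, but still matches
    "associateconsult.com",  # unrelated, but still matches
    "example.org",           # no match
]
print(regexp_hits(".*consult.*", sample))
```

The two unrelated domains match as well, which is the false-positive problem described above.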
The second search operator is a simple wildcard search.
The wildcard query returns the domain names that use TLDs other than .com, such as .ch, .ca and .nl.
This is a useful query for identifying domains that users in your organization have visited which carry the organization's name under a TLD the organization does not actually own. There is a high probability of an attacker hosting phishing pages on such domains.
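A wildcard search of this kind might look like the sketch below. The field name `domain` and the pattern `niiconsulting.*` are assumptions, and `fnmatch` from the Python standard library is used only to approximate the `*` wildcard locally. Dropping the organization's own `.com` domain from the hits is done here in Python for clarity.

```python
import json
from fnmatch import fnmatchcase

# Assumed field name and pattern; adjust both to your own index mapping.
wildcard_query = {"query": {"wildcard": {"domain": "niiconsulting.*"}}}
print(json.dumps(wildcard_query, indent=2))


def wildcard_hits(pattern, domains):
    """Local approximation of a wildcard match on the whole domain."""
    return [d for d in domains if fnmatchcase(d, pattern)]


def foreign_tld_hits(pattern, domains, own_tld=".com"):
    """Keep wildcard hits that are not under the organization's own TLD."""
    return [d for d in wildcard_hits(pattern, domains) if not d.endswith(own_tld)]


sample = [
    "niiconsulting.com",
    "niiconsulting.ch",
    "niiconsulting.ca",
    "niiconsulting.nl",
    "niconsulting.com",  # typo in the name, so the wildcard does not match
]
print(foreign_tld_hits("niiconsulting.*", sample))
```

Note that the wildcard only varies the TLD; a typo inside the name itself is missed, which is where fuzzy search comes in.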
Lastly, we shall look at the fuzzy search operator. This is a useful Elasticsearch capability that can help us hunt for domains created the way URLCrazy creates them, as described above. Fuzzy search returns positive results for strings which are similar (not an exact match) to our original query.
So, while regexp and wildcard searches may not help us identify niiiconsulting.com, a fuzzy search will find this domain and present it in the results. From the primary domain, we generated 184 domains through URLCrazy, which we ingested into Elasticsearch. Using a fuzzy search query for the primary domain, we could successfully identify 163 of them.
The 21 domains that were missed consisted of 12 non-dot-com TLDs (which we can hunt using the wildcard search operator) and 9 domains which even fuzzy search was unable to find.
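The idea behind fuzzy matching can be sketched with a plain Levenshtein edit distance. The field name `domain` and the `fuzziness` value of 2 are assumptions for illustration, and the local `edit_distance` function is only an approximation of how Elasticsearch scores similarity; it does not reproduce the counts reported above. Elasticsearch caps the edit distance at 2, so a variant that needs more edits, such as the homophone example from the table, falls outside the match. That may be one reason some generated domains are missed.

```python
import json

# Assumed field name; fuzziness 2 is the largest edit distance Elasticsearch allows.
fuzzy_query = {
    "query": {"fuzzy": {"domain": {"value": "niiconsulting.com", "fuzziness": 2}}}
}
print(json.dumps(fuzzy_query, indent=2))


def edit_distance(a, b):
    """Plain Levenshtein distance: insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                       # delete from a
                current[j - 1] + 1,                    # insert into a
                previous[j - 1] + (char_a != char_b),  # substitute or keep
            ))
        previous = current
    return previous[-1]


def fuzzy_hits(target, domains, max_edits=2):
    """Keep domains within max_edits of the target."""
    return [d for d in domains if edit_distance(target, d) <= max_edits]


sample = [
    "niiiconsulting.com",       # one inserted character
    "miiconsulting.com",        # one replaced character
    "nicionsulting.com",        # adjacent swap, two edits under plain Levenshtein
    "nayeayeconsultayeng.com",  # homophone variant, far more than two edits
]
print(fuzzy_hits("niiconsulting.com", sample))
```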