HIHAT - High Interaction Honeypot Analysis Tool

This section explains why transparent links (sometimes also called "invisible links") are essential for deploying web-based honeypots, and discusses the problems that arise.

With the growing popularity of web applications, attackers have come to recognize search engines like Google as powerful tools for their malicious activities. Search engines allow attackers to look for exactly the type and version of an application they are interested in. Instead of performing random scans, they can focus their efforts on targets that match their criteria, and can therefore conduct attacks much more efficiently.

To attract attackers to our honeypot, we need to catch their attention and interest. As most attacks on web-based applications use search engines to find their victims, we want our honeypot to be listed in the indices of the search engines. Once the honeypot is indexed, every attacker who uses a search engine can find the system, which drives more traffic to the honeypot.
The next question is how to add the honeypot to the index of a search engine. Nowadays the search index is constructed automatically with the help of so-called web spiders. Web spiders are programs that crawl the World Wide Web in a methodical, automated manner in order to build an index of the crawled content.
Search engine companies usually do not allow the search index to be manipulated manually, even when the aim is to support research and improve security. As a manual addition is not possible, we have to rely on the behaviour of the web spiders to get information about our honeypot into the search index.
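
As a rough illustration of how a web spider discovers new pages, the following Python sketch extracts the link targets from a fetched page using only the standard library. It is a simplification and not the behaviour of any real search engine spider, which (as discussed in this section) is not public. The URLs and the names `LinkCollector` and `extract_links` are hypothetical and chosen only for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the targets of all <a href="..."> tags in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the page.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return all link targets found in the given HTML text."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.links


# Hypothetical page that links to a forum and to a honeypot.
page = (
    '<p>See the <a href="/forum/">forum</a> and the '
    '<a href="http://honeypot.example/app/">application</a>.</p>'
)
print(extract_links("http://www.example.org/index.html", page))
```

A spider would add the returned URLs to its queue of pages to visit. Only pages that are reachable through such links can end up in the index, which is why a link to the honeypot has to exist somewhere on an already crawled site.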

This results in two major problems:

First, the details of the exact behaviour of the web spiders are usually kept secret as well, in order to prevent abuse or distortion of the index. Neither the exact criteria used to rank the index are public, nor is it known whether and how the content of a web page is rated. Although there are different ways to create a link to a honeypot, it is not even known which kinds of link a web spider will crawl and index at all.
Second, we cannot simply place arbitrary links to our honeypot on a website. Not only web spiders and attackers may follow such a link, but probably also many benign users who are just browsing the web page. By following the link, these users would cause many false positives in our logfiles and would also increase the chance that an attacker discovers the true purpose of the link to our honeypot.

To tackle these problems, the following solution is chosen: a specially crafted link is required that satisfies two requirements. First, it must be invisible to a normal Internet user viewing the web page. Second, it must still be recognized by web spiders crawling the page. A link of this type is called a transparent link.
One has to keep in mind that transparent links are a point where web-based honeypots differ strongly from other honeypots. Usually every access to a honeypot is considered an illicit use of that resource. Web-based honeypots, in contrast, need to be indexed by web spiders in order to attract a reasonable amount of interest and to work properly, as explained above. In this respect web-based honeypots follow a different concept than other honeypots. Nevertheless, the main value of both types of honeypot lies in the unauthorized or illicit use of the resource.
Experience shows that a high position in the ranking is not required to attract malicious tools that search for their targets in an automated way. However, if a web spider does not follow a specific link at all, the link is of course useless for this purpose.
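
The two requirements for a transparent link can be made concrete with a small Python sketch. It assumes one possible form of such a link, an anchor tag without any link text, and checks that a simple parser still finds the target while no additional text would be shown to a reader. The honeypot URL and the class name `PageView` are hypothetical, and whether a real web spider follows a link of this form is exactly the open question raised above.

```python
from html.parser import HTMLParser


class PageView(HTMLParser):
    """Records link targets and the text a reader would see."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self.visible_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_data(self, data):
        # Ignore whitespace-only text nodes.
        if data.strip():
            self.visible_text.append(data.strip())


# Hypothetical honeypot address, used only for this example.
HONEYPOT_URL = "http://honeypot.example/app/"

# Assumed form of a transparent link: an anchor with no link text.
page = '<p>Welcome to our site.</p><a href="%s"></a>' % HONEYPOT_URL

view = PageView()
view.feed(page)
view.close()

print(view.hrefs)         # link targets a parser can extract
print(view.visible_text)  # text a reader would see
```

In this sketch the link target appears in `hrefs`, but the link contributes nothing to `visible_text`, so a benign user has nothing to click on.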


Transparent Linking