Transparent Linking


I. Background of Transparent Linking
II. Link Types
III. Test Setup
IV. Test Results and Conclusion

To verify which of the twelve given link types are followed by web spiders, we established the following test setup. An entry page, index.html, was created to serve as the portal to the test area. From there, two further layers of links referred to PHP documents below. The document contents were generated randomly in order to emulate a regular web page. The index.html and the second-layer documents each contained further links, one for each type of transparent link. This results in a total of 157 files included in the test. Figure T.1 shows the layout of the test area. Every access to one of these files was monitored and logged.

This setup makes it possible to determine which types of transparent links are recognized and followed by web spiders. Furthermore, it shows not only whether spiders briefly check a link, but also whether they continue crawling the entire content of a page and its subpages. Compared to low-interaction systems, this can be a very important aspect for the deployment of a high-interaction honeypot, because the latter offers much more than just a simple frontend. If the decoy were not indexed completely, the deployment would be inefficient and would attract fewer attackers to the honeypot.

Figure T.1: Layout of Test Setup for Transparent Links
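The stated total of 157 files is consistent with reading the setup as one entry page, 12 second-layer documents (one per link type), and 12 x 12 = 144 third-layer documents. The following minimal Python sketch enumerates such a layout and shows how logged accesses could be mapped back to link types. The link type names (type01 to type12), the path scheme, and both function names are hypothetical placeholders, since the text specifies neither file names nor a log format.

```python
from itertools import product

# Placeholder names for the twelve transparent link types (assumption:
# the actual types are listed in the "Link Types" section).
LINK_TYPES = [f"type{i:02d}" for i in range(1, 13)]


def build_layout(link_types):
    """List every file in the assumed test area.

    Layer 1: index.html
    Layer 2: one PHP document per link type, linked from index.html
    Layer 3: one PHP document per link type under each layer-2 document
    """
    paths = ["index.html"]
    paths += [f"{a}.php" for a in link_types]
    paths += [f"{a}/{b}.php" for a, b in product(link_types, repeat=2)]
    return paths


def followed_types(accessed_paths, link_types):
    """Return the link types whose target document was requested at least once."""
    known = set(link_types)
    seen = set()
    for path in accessed_paths:
        if not path.endswith(".php"):
            continue  # the entry page itself is not reached via a test link
        stem = path.rsplit("/", 1)[-1][: -len(".php")]
        if stem in known:
            seen.add(stem)
    return seen


if __name__ == "__main__":
    layout = build_layout(LINK_TYPES)
    print(len(layout))  # 1 + 12 + 144 files
    sample_log = ["index.html", "type03.php", "type03/type07.php"]
    print(sorted(followed_types(sample_log, LINK_TYPES)))
```

Comparing the set of requested paths against the full layout would, under the same assumptions, also show whether a spider crawled the complete test area or stopped after the first layer.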

