MSN Index Spam Research
Many spam pages are generated automatically, and attempts to mass-produce content automatically leave easily recognizable traces. Although the MSN index is still infested with spam, MSN is making efforts to detect it.
Our approach is to treat each spam page as a dynamic program rather than a static page, and utilize a “monkey program” to analyze the traffic resulting from visiting each page with an actual browser so that the program can be executed in full fidelity. Many successful, large-scale spammers have created a huge number of doorway pages that either redirect to or fetch ads from a single domain that is responsible for serving all target pages. By identifying those domains that serve target pages for a large number of doorway pages, we can catch major spammers’ domains together with all their doorway pages and doorway domains.
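The aggregation step described in the quote, finding domains that serve targets or ads for many doorway pages, can be illustrated with a small sketch. This is not the researchers' implementation. It assumes the browser traffic has already been recorded as a mapping from each visited doorway URL to the URLs it redirected to or fetched. The function names, the `min_doorways` threshold, and the `.example` domains are all hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_doorways_by_target(observations):
    """Map each third-party host to the set of doorway URLs that contacted it.

    `observations` is an assumed input format: {doorway_url: [requested_url, ...]},
    i.e. the traffic recorded while a real browser loaded each doorway page.
    """
    doorways_by_domain = defaultdict(set)
    for doorway_url, requested_urls in observations.items():
        doorway_host = urlparse(doorway_url).hostname
        for requested in requested_urls:
            host = urlparse(requested).hostname
            # Ignore requests the doorway page makes to its own host.
            if host and host != doorway_host:
                doorways_by_domain[host].add(doorway_url)
    return doorways_by_domain


def suspicious_domains(observations, min_doorways=3):
    """Return hosts contacted by at least `min_doorways` distinct doorway pages.

    The threshold is a hypothetical tuning parameter, not a value from the source.
    Results are ordered with the most widely shared host first.
    """
    grouped = group_doorways_by_target(observations)
    flagged = {d: sorted(p) for d, p in grouped.items() if len(p) >= min_doorways}
    return dict(sorted(flagged.items(), key=lambda kv: -len(kv[1])))


if __name__ == "__main__":
    # Made-up traffic: three doorway pages all fetch from the same ad host.
    sample = {
        "http://a.example/p1": ["http://ads.example/serve?x=1", "http://a.example/style.css"],
        "http://b.example/p2": ["http://ads.example/serve?x=2"],
        "http://c.example/p3": ["http://ads.example/serve?x=3", "http://cdn.example/lib.js"],
    }
    for domain, pages in suspicious_domains(sample).items():
        print(domain, len(pages), pages)
```

A real system would also need to exclude legitimate shared hosts such as common CDNs and analytics services, which this sketch does not attempt.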