NTS Forums

Welcome to the Newtek Technology Services Forum!


Author Topic: compute-1.amazonaws.com is this a bot or what?  (Read 43489 times)

Offline italait

  • Sr. Member
  • ****
  • Posts: 917
  • Karma: +126/-4
  • My Desktop
    • YouCouldGetMe
compute-1.amazonaws.com is this a bot or what?
« on: March 06, 2009, 05:20:34 PM »
In one of my reports (Visitors -> IP Address), I noticed a fair amount of traffic from what appears to be a bot. The entries come from a number of IP addresses and look like this:

[75.101.137.206] ec2-75-101-137-206.compute-1.amazonaws.com
[174.129.234.8] ec2-174-129-234-8.compute-1.amazonaws.com
[67.202.28.213] ec2-67-202-28-213.compute-1.amazonaws.com

If this is a bot then it should be included amongst 'bots & spiders' stats, yet it is currently being counted as normal traffic.

Having googled compute-1.amazonaws.com, I am still not sure what it is.  Is it harmful, and should it be permitted?

The entries below also look like bots but are NOT being classified as such in my stats.
[69.58.178.29] ips-crawl4.colo-fo.ilg1.verisign.com
[64.56.65.37] istargeted.com
[89.151.116.52] bear.favsys.net
[91.121.89.155] mojito.smartwebsearching.be
[87.230.94.188] s3.popurls.com
[89.149.244.57] 89-149-244-57.internetserviceteam.com
[12.5.10.153] firewall.wescodist.com
Colin

italait...    it takes as long as it takes...
For those who have infinite patience everything happens immediately.

Offline Sol P

  • CrystalTech
  • Full Member
  • *****
  • Posts: 317
  • Karma: +20/-2
Re: compute-1.amazonaws.com is this a bot or what?
« Reply #1 on: March 11, 2009, 04:11:13 PM »
If I remember right, AWS is Amazon Web Services.

As for why it's crawling: Amazon owns Alexa, which collects site statistics, so that might have something to do with it.

Why stats doesn't pick it up:
Don't quote me on this, but I think SmarterStats only counts a visitor as a bot when it tags itself as a spider and follows the rules of a robots file.
Check the IIS logs and see what it announces itself as (or PM me the site details and a day to check and I'll take a look at it).
Usually in the IIS logs you'll see the bot named in the browser type (user agent) field.
ex:
HTTP/1.1 Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html)
HTTP/1.1 msnbot/1.1+(+http://search.msn.com/msnbot.htm)

Abuse bots, scrapers, etc. will usually show up as just a generic browser type.
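
For anyone who wants to run that check on their own logs, here is a minimal sketch in Python. It assumes the log is in the W3C extended format with a "#Fields:" header that includes cs(User-Agent). The token list, the function names and the sample rows (which use documentation-only IP addresses) are made up for illustration and are not what SmarterStats actually does.

```python
import re

# Tokens that self-identifying crawlers commonly put in their user agent.
# Illustrative assumption only, not SmarterStats' real rule set.
BOT_TOKENS = re.compile(r"bot|crawl|spider|slurp", re.IGNORECASE)

# Made-up sample rows using documentation-only IP addresses.
SAMPLE_LOG = [
    "#Fields: date time c-ip cs(User-Agent)",
    "2009-03-11 16:11:13 192.0.2.10 Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html)",
    "2009-03-11 16:11:14 192.0.2.11 Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+5.1)",
]


def user_agents(log_lines):
    """Yield the cs(User-Agent) value of each entry in a W3C extended log."""
    fields = []
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns of the rows that follow it.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or "cs(User-Agent)" not in fields:
            continue
        values = line.split(" ")
        if len(values) != len(fields):
            continue  # skip malformed rows
        # Spaces inside a field are logged as '+', so turn them back.
        yield values[fields.index("cs(User-Agent)")].replace("+", " ")


def looks_like_declared_bot(agent):
    """True when the user agent names itself as a crawler."""
    return bool(BOT_TOKENS.search(agent))


if __name__ == "__main__":
    for agent in user_agents(SAMPLE_LOG):
        print(looks_like_declared_bot(agent), agent)
```

A check like this only catches bots that announce themselves. Anything that sends a generic browser string will pass as a normal visitor.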

-Sol




Offline italait

  • Sr. Member
  • ****
  • Posts: 917
  • Karma: +126/-4
  • My Desktop
    • YouCouldGetMe
Re: compute-1.amazonaws.com is this a bot or what?
« Reply #2 on: March 12, 2009, 05:00:48 AM »
Thanks Sol (I did create a support ticket [331-1144C4EE-EF19])

I have a new site, which currently gets fairly low traffic levels.  With so many bots visiting the site (and not being excluded by SmarterStats), getting a true reading is very difficult.  I have set the stats to resolve IP addresses, so when I run the Visitors -> IP Address report I can quickly identify some bots, such as:

spider3.mail.ru
ips-crawl4.colo-fo.ilg1.verisign.com
mojito.smartwebsearching.be
ec2-67-202-28-213.compute-1.amazonaws.com
istargeted.com
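
The same eyeballing can be roughly sketched in Python. The labels, the function names and the two patterns below are assumptions for illustration, built only from the hostnames in this thread. The ec2-...compute-1.amazonaws.com form is the public DNS name Amazon gives to EC2 servers, so a hit from one is very likely a program rather than a person browsing.

```python
import re
import socket

# Public DNS names of Amazon EC2 servers as they appear in this thread.
EC2_HOST = re.compile(r"^ec2-\d+-\d+-\d+-\d+\.compute-1\.amazonaws\.com$")
# Hostnames that advertise a crawler role (illustrative, far from complete).
CRAWLER_HINT = re.compile(r"crawl|spider", re.IGNORECASE)


def classify_host(hostname):
    """Return a rough label for a resolved hostname."""
    if EC2_HOST.match(hostname):
        return "cloud-hosted"
    if CRAWLER_HINT.search(hostname):
        return "crawler"
    return "unknown"


def resolve(ip):
    """Reverse-resolve an IP address, or return None when there is no record."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


if __name__ == "__main__":
    for host in (
        "spider3.mail.ru",
        "ips-crawl4.colo-fo.ilg1.verisign.com",
        "mojito.smartwebsearching.be",
        "ec2-67-202-28-213.compute-1.amazonaws.com",
        "istargeted.com",
    ):
        print(classify_host(host), host)
```

Hostname patterns alone leave names like mojito.smartwebsearching.be and istargeted.com as "unknown", so this can only supplement a check on the user agent.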

SmarterTools asks on their website: "Would you like to measure true visitor traffic without spider and bot pollution?"

Well, my answer is yes, of course I would, but it appears that SmarterTools cannot deliver on their promise.

I did ask for more information on the SmarterTools forum, in the thread "Spiders, Bots and the Unreal", but have had no reply.  I really do hope that CT can influence Tim and SmarterTools to update their product so that it does what it says on the tin:

Understand Real Traffic
As much as 15% of the traffic on larger sites—and up to 50% of the traffic on smaller sites—may be directly related to automated hits generated by spiders and bots as they index a Web site. Unlike many competing analytics tools, SmarterStats separates real traffic from spider and bot traffic. The ability to identify real trends and habits will allow individuals and large organizations to make effective marketing decisions.


Colin


Offline Ben Amada

  • Sr. Member
  • ****
  • Posts: 812
  • Karma: +52/-3
Re: compute-1.amazonaws.com is this a bot or what?
« Reply #3 on: March 12, 2009, 03:39:14 PM »
SmarterStats should probably be maintaining a list of known bot user agent strings.  I once downloaded and integrated the XML file of all known user agent strings from this site into my own application.  They regularly update their list.

If SmarterStats is only checking for friendly bots that identify themselves as bots, then I can definitely see that it would be miscategorizing lots of bot traffic as non-bot traffic.

SmarterStats should be checking against a list of known bots like the list I mentioned above, and it should also have some auto-update mechanism in place to keep its list of user agent strings up to date, since new bot user agent strings appear pretty frequently.
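
As a rough sketch of that idea in Python: the class below matches user agents against a plain text file of known bot strings and re-reads the file whenever it changes on disk. The one-pattern-per-line file format, the class name and the reload-on-change approach are all assumptions for illustration. This is not the XML list mentioned above and not how SmarterStats works.

```python
import os


class KnownBotList:
    """Case-insensitive substring matcher backed by a plain text file.

    Hypothetical file format: one user agent fragment per line,
    with lines starting with '#' treated as comments.
    """

    def __init__(self, path):
        self.path = path
        self._mtime = None
        self._patterns = []

    def _refresh(self):
        # Re-read the file only when its modification time has changed.
        mtime = os.path.getmtime(self.path)
        if mtime == self._mtime:
            return
        patterns = []
        with open(self.path, encoding="utf-8") as handle:
            for line in handle:
                text = line.strip().lower()
                if text and not text.startswith("#"):
                    patterns.append(text)
        self._patterns = patterns
        self._mtime = mtime

    def is_bot(self, user_agent):
        """True when any known fragment occurs in the user agent."""
        self._refresh()
        agent = user_agent.lower()
        return any(pattern in agent for pattern in self._patterns)
```

With something like this, a scheduled job could download a fresh list and overwrite the file, and the next call to is_bot would pick up the new entries without a restart.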
Follow me at allben.net

Offline italait

  • Sr. Member
  • ****
  • Posts: 917
  • Karma: +126/-4
  • My Desktop
    • YouCouldGetMe
Re: compute-1.amazonaws.com is this a bot or what?
« Reply #4 on: March 12, 2009, 05:29:04 PM »
Ben, I totally agree. I hope SmarterTools answers my question on their forum and lets us know how they distinguish between bot traffic and real traffic.

Failing that, I hope that CT can get an answer for us.
Colin
