[Pro] Adding robots.txt File

Hi, Gang –

I’ve read through some of the earlier posts about adding robots.txt files, but I was wondering if there’s any new information on how to do that.

The earlier posts were from a few years ago. My question is: is there an Action or a simple way to add a robots.txt file to a Freeway site? If so, what is it?

Thanks,
Jamie


I’m wondering about this as well. I just received an email from Google saying they can’t access my robots.txt file and have therefore “postponed their crawl” of my site. I don’t think I ever uploaded a robots.txt file. Seems strange.

Doty


Doty and Jamie,

Although there are no Actions that create a robots.txt file, it’s really simple to write one up yourself and then upload it to the root folder of the website.

This is what the syntax looks like:

User-agent: *
Disallow: /dev/

The * after User-agent is a wildcard, so these two lines tell all robots/crawlers that the only content off-limits to them is anything under the /dev/ directory. Some people like to write positive rules in it (Allow: /a/folder), but in all reality that’s what the sitemap should be doing!
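For completeness, if you did want one of those positive rules anyway, the whole file would look something like this (the folder names here are just placeholders, and Allow is understood by the major crawlers even though it isn’t part of the original robots.txt standard):

User-agent: *
Disallow: /dev/
Allow: /dev/public/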


In the site folder’s meta tags, I have robots set to index, follow.

I thought this was an acceptable way of dealing with the robots file. It’s what I’ve always done, but with Google’s most recent email perhaps they’re asking me to manually upload a robots.txt file just as I did the sitemap.xml file. I wonder if something changed at Google; it’s strange that I’d only just now be getting this error message.
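As far as I understand it, that setting just writes a robots meta tag into the head of each page, something like:

<meta name="robots" content="index, follow">

rather than putting a robots.txt file on the server.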


The original version of the SiteMapper Action had an option to automatically create and upload a robots.txt file for your site. Unfortunately, this feature got dropped before the Actions shipped, which is a shame, as search engines often look in that file for the location of the sitemap.xml file.

You can easily create your own robots.txt file using any plain text editor and upload it using the Upload Stuff Action;
http://www.freewayactions.com/product.php?id=032

Specify the location of your Sitemap in your robots.txt file;
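A single line is enough, for example (with your own domain in place of example.com):

Sitemap: http://www.example.com/sitemap.xml

It can sit alongside the User-agent and Disallow lines in the same robots.txt file.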

The Meta Plus Action can also define rules for search robots, but these are added to the page as meta data rather than in a robots.txt file;
http://actionsforge.com/actions/view/229-meta-plus

Regards,
Tim.

On 30 Jan 2013, at 17:51, Doty wrote:

I thought this was an acceptable way of dealing with the robots file. It’s what I’ve always done, but with Google’s most recent email perhaps they’re asking me to manually upload a robots.txt file just as I did the sitemap.xml file. I wonder if something changed at Google; it’s strange that I’d only just now be getting this error message.


FreewayActions.com - Freeware and commercial Actions for Freeway Express & Pro - http://www.freewayactions.com


Thanks, Caleb & Tim. I think what happened with the Google error is exactly as you mentioned. When I uploaded my sitemap.xml, Google then went looking for the robots.txt file that I never thought to upload. I wonder how much all this really improves rankings? At least I understand better now. Thanks again.

Doty


Doing a Google search on a site I found the following:

“A description for this result is not available because of this site’s robots.txt”

I thought I had Site Mapper applied properly, but reading the above posts, it would appear that I have to add a robots.txt file to the site. Is that correct?

Robert


A robots file is usually set to disallow web crawlers (robots) from indexing a site or parts of a site.

So in the instance you quote, a web crawler is being disallowed from crawling the site to index it.
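For example, a robots.txt that keeps every crawler out of the whole site is just:

User-agent: *
Disallow: /

Anything along those lines aimed at Googlebot will give you the “description is not available” message you quoted.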

D


Robert,

Could we have a link? Usually, that error means that you aren’t allowing certain bots to have full run of your site, and this kind of thing is set in the robots.txt file or as a meta tag in the page’s <head>.


Caleb,

Head to <www.aVintageSole.com>


I think the issue may well be with your sitemap.xml file - head on over to http://avintagesole.com/sitemap.xml

That is not a sitemap xml page!

This page shows an example of what it should consist of:

http://www.sitemaps.org/protocol.html#sitemapXMLExample
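As a rough sketch, a minimal sitemap.xml is just something like this (the URL and date are placeholders):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2014-01-31</lastmod>
  </url>
</urlset>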

David


Hi Robert,
You need to FTP into your site and remove the robots.txt file, as it is blocking access for both the Google and BLEXBot (AKA WebMeUp, http://webmeup.com) crawlers;
http://avintagesole.com/robots.txt
Regards,
Tim.


Tim,

Thanks. I found and deleted the robots.txt file. My quandary is — how the heck did it get there in the first place?

Dave,

Thanks as well. Apparently I have Site Mapper applied incorrectly somehow. The page referenced at sitemaps.org is way above my pay grade. I’ll re-review the Softpress page about Site Mapper and try to figure out what I did wrong.

If anyone can suggest what I screwed up and how the robots.txt file wound up on the site, I’d be very grateful.

Robert


Hi Robert,
Check your hosting account’s cPanel (if you have one) and see if anything is set there. You can often find tools there that limit access for certain web crawlers.
Regards,
Tim.

On 31 Jan 2014, at 14:27, Robert wrote:

Thanks. I found and deleted the robots.txt file. My quandary is — how the heck did it get there in the first place?


Tim, Dave, Caleb, et al,

I believe I’ve straightened out the problem. It appears that the hosting company added this file after the site was subjected to an attack. It’s been removed and the Site Mapper Action reapplied.

One last question (I hope): how long does it normally take for Google to rescan so that it drops the robots.txt notice and puts the site’s description back?

Robert

