I’ve read through some of the earlier posts about adding robots.txt files, but I was wondering if there’s any new information on how to do that.
The earlier posts were from a few years ago. My question is – is there an action or a simple way to add a robots.txt file to a Freeway site? If so, what is it?
I’m wondering about this as well. I just received an email from Google saying they can’t access my robots.txt file and have therefore “postponed their crawl” of my site. I don’t think I ever uploaded a robots.txt file. Seems strange.
Although there are no Actions that create the robots.txt file, it’s really simple to write one up yourself and then upload it to the root folder of the website.
This is what the syntax looks like:
User-agent: *
Disallow: /dev/
The * is a wildcard, so these two lines tell all robots/crawlers that the only content off-limits to them is anything in the /dev/ directory. (A trailing wildcard, as in /dev/*, is redundant – Disallow rules match by path prefix.) Some people like to write positive stuff in it ( Allow: /a/folder ), but in all reality that’s what the sitemap should be doing!
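If you want to sanity-check what those two rules actually permit, you can feed them to Python’s standard-library robots.txt parser. This is just a quick sketch – the paths used here are made up for illustration:

```python
from urllib.robotparser import RobotFileParser

# Parse the same two rules shown above.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /dev/",
])

# Anything under /dev/ is off-limits; everything else is allowed.
print(rp.can_fetch("*", "/dev/secret.html"))  # False
print(rp.can_fetch("*", "/index.html"))       # True
```

Handy if you want to check a rule set before uploading it, rather than waiting for the next crawl to find out.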
In the Site folder’s meta tags, I have robots: index, follow
I thought this was an acceptable way of dealing with the robots file. It’s what I’ve always done, but with Google’s most recent email perhaps they’re asking me to manually upload a robots.txt file just as I did the sitemap.xml file. Wonder if something changed at Google. It’s strange I’d just now be getting this error message.
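For what it’s worth, a robots meta setting like that ends up in each page’s head as something along these lines (a sketch – the exact markup Freeway generates may differ):

```
<meta name="robots" content="index, follow">
```

Note that this is per-page and only controls indexing of pages a crawler has already fetched; it doesn’t replace a site-wide robots.txt file at the root.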
The original version of the SiteMapper action had an option to automatically create and upload a robots.txt file for your site. Unfortunately this feature got dropped before the Actions shipped which is a shame as search engines often look in the file for the location of the sitemap.xml file.
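You can still point search engines at the sitemap yourself by adding a Sitemap line to robots.txt. A minimal example (the domain here is a placeholder – substitute your own site’s URL):

```
User-agent: *
Disallow:

Sitemap: http://www.example.com/sitemap.xml
```

An empty Disallow means nothing is blocked, and the Sitemap line tells crawlers where to find the sitemap.xml file.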
Thanks Caleb & Tim. I think what happened with the Google error is exactly as you mentioned. When I uploaded my sitemap.xml, Google then went looking for the robots.txt file that I never thought to upload. I wonder how much all this really improves rankings??? At least I understand better now. Thanks again.
Doing a Google search on a site I found the following:
“A description for this result is not available because of this site’s robots.txt”
I thought I had Site Mapper applied properly, but reading the above posts it would appear that I have to add a “robots.txt” to the site. Is that correct?
Could we have a link? Usually, that message means that you aren’t allowing certain bots to have full run of your site, and this kind of thing is set in the robots.txt file or as a meta tag in the page’s head.
Hi Robert,
You need to FTP into your site and remove the robots.txt file as it is blocking access to both the Google and BLEXBot - AKA WebMeUp (http://webmeup.com) crawlers; http://avintagesole.com/robots.txt
Regards,
Tim.
Thanks. I found and deleted the robots.txt file. My quandary is — how the heck did it get there in the first place?
Dave,
Thanks as well. Apparently I have Site Mapper applying something wrong. The page reference at sitemaps.org is way above my pay grade. I’ll re-review the page on Softpress regarding Site Mapper and try to figure out what I did wrong.
If anyone can suggest what I screwed up and how the robots.txt file wound up on the site, I’d be very grateful.
Hi Robert,
Check your hosting account cPanel (if you have one) and see if anything is set there. You can often find tools there that limit access to certain web crawlers.
Regards,
Tim.
I believe I’ve straightened out the problem. It appears that the hosting company added this file after the site was subjected to an attack. It’s been removed and the Site Mapper action reapplied.
One last question (I hope), how long does it normally take for Google to rescan so that it drops the robots.txt notice and puts the sites description back?