A website we recently built using Freeway & WebYep has been indexed by Google. So far so good. But here’s the strange part: not only were the existing pages indexed, but so was a secondary page that comes after the initial ‘index.php’.
A URL like this came up in Google’s results:
www.sitename.nl/index.php/test.php
The layout and structure of the page shown are identical to all other pages; however, it looks blank because there is no CSS file linked to it. (There was actually a test.php file on the server.)
Furthermore, I can change the name ‘test.php’ to anything I like and the same result will be shown. As a matter of fact, only the forward slash after .php is enough to show a blank page. Is this normal?
Does anybody know how this is possible and, more importantly, how I can stop Google from showing this secondary file called ‘/test.php’ after the initial ‘index.php’?
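A likely explanation, offered as a sketch rather than a diagnosis of this particular server: many web servers hand whatever follows the script name to the script as extra path information, so index.php still answers the request. If the page links its stylesheet with a relative path, the browser then resolves that path against the longer URL and looks in the wrong place. The stylesheet path ‘css/style.css’ below is hypothetical, as is the assumption that the site uses relative links.

```python
from urllib.parse import urljoin


def resolve_stylesheet(page_url: str, href: str) -> str:
    """Resolve a relative stylesheet link the way a browser would."""
    return urljoin(page_url, href)


# Hypothetical relative stylesheet link, not taken from the actual site.
HREF = "css/style.css"

normal = resolve_stylesheet("http://www.sitename.nl/index.php", HREF)
extra = resolve_stylesheet("http://www.sitename.nl/index.php/test.php", HREF)
slash_only = resolve_stylesheet("http://www.sitename.nl/index.php/", HREF)

print(normal)      # stylesheet is looked up next to index.php
print(extra)       # stylesheet is looked up "inside" index.php/
print(slash_only)  # same wrong location with only the trailing slash
```

Under these assumptions the second and third lookups point at a location below ‘index.php/’ where no stylesheet exists, which would match the unstyled page described above.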
Sometime around 9/2/10 (at 07:24 -0500) rogier said:
As a matter of fact, only the forward slash after .php is enough to
show a blank page. Is this normal?
Well, putting a slash after a name means that the named thing isn’t a
page, it is a folder (or directory in geek-speak). This is odd. Do
you have any folders that are named as if they are pages? Although
that’s probably not the cause of this. Hmm.
Thanks Keith. There’s no folder with that name, it really is a page. Somehow Google managed to index a test file which is located somewhere on the server in a directory containing some PHP and Perl stuff. Why it is being shown in the search results is a real mystery. We’ve used a script to redirect it to the homepage, so problem solved.
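The redirect script itself was not posted, so the following is only a minimal sketch of the idea, with hypothetical function names: detect anything that trails the script name in the request path and, if so, send the visitor to the homepage. A real site would do this in its own server-side language and would normally answer with a permanent (301) redirect so that search engines drop the stray URL.

```python
from typing import Optional, Tuple


def split_path_info(path: str, script_ext: str = ".php") -> Tuple[str, str]:
    """Split a request path into (script, extra) at the first segment
    that ends in script_ext. 'extra' is empty when nothing follows."""
    segments = path.split("/")
    for i, segment in enumerate(segments):
        if segment.endswith(script_ext):
            script = "/".join(segments[: i + 1])
            has_more = i + 1 < len(segments)
            extra = "/" + "/".join(segments[i + 1:]) if has_more else ""
            return script, extra
    return path, ""


def redirect_target(path: str) -> Optional[str]:
    """Return '/' when the path carries extra path info, else None."""
    _, extra = split_path_info(path)
    return "/" if extra else None


print(split_path_info("/index.php/test.php"))
print(redirect_target("/index.php/test.php"))
print(redirect_target("/index.php"))
```

Note that a rule like this would not suit a site that deliberately routes through index.php/..., such as the EE setups mentioned further down.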
This is related to the duplicate content issue. The same thing happens in EE-driven sites. A page like mysite.com/index.php/templategroup/template/article.php is an existing page.
But mysite.com/index.php/templategroup/template/ is not. Nevertheless, Google ‘sees’ a page there just because EE, eh, well, just works the way it works. It just spits out ‘pages’ when a browser (or bot) asks for them. In my case I had the title of that template as the page title. So what happened was that the title tag showed all of the titles, and I got a comment from Google that the title tag contained too many characters. Obviously.