[Pro] Google index problem

A website we recently built using Freeway & WebYep has been indexed by Google. So far so good. But here’s the strange part: not only were the existing pages indexed, but so was a secondary page that comes after the initial ‘index.php’.

A URL like this came up in Google’s results:
www.sitename.nl/index.php/test.php

The layout and structure of the page shown are identical to all other pages, but it looks blank because there is no CSS file linked to it. (There actually was a test.php file on the server.)

Furthermore, I can change the name ‘test.php’ to anything I like and the same result will be shown. As a matter of fact, just the forward slash after .php is enough to show a blank page. Is this normal?

Does anybody know how this is possible and, more importantly, how I can stop Google from showing this secondary file called ‘/test.php’ after the initial ‘index.php’?

Thanks in advance for your help!

Best regards,
Rogier Luigjes


freewaytalk mailing list
email@hidden
Update your subscriptions at:
http://freewaytalk.net/person/options

Sometime around 9/2/10 (at 07:24 -0500) rogier said:

As a matter of fact, only the forward slash after .php is enough to
show a blank page. Is this normal?

Well, putting a slash after a name means that the named thing isn’t a
page, it is a folder (or directory in geek-speak). This is odd. Do
you have any folders that are named as if they are pages? Although
that’s probably not the cause of this. Hmm.

k
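One possible explanation for the missing CSS, sketched here in Python: if the stylesheet is linked with a relative path, a browser resolves that path against the page URL, and any slash after index.php makes index.php look like a folder. The stylesheet path `css/style.css` is a made-up example, not taken from the actual site, and whether the site links its CSS relatively is an assumption.

```python
from urllib.parse import urljoin

# Hypothetical relative stylesheet link, as it might appear in the page's HTML.
STYLESHEET = "css/style.css"

def resolved_stylesheet(page_url: str) -> str:
    """Resolve the relative stylesheet link the way a browser would."""
    return urljoin(page_url, STYLESHEET)

# Normal page: the link resolves next to index.php.
print(resolved_stylesheet("http://www.sitename.nl/index.php"))

# Extra path after index.php: index.php is now treated as a directory,
# so the browser asks for a stylesheet "inside" it, which does not exist.
print(resolved_stylesheet("http://www.sitename.nl/index.php/test.php"))
print(resolved_stylesheet("http://www.sitename.nl/index.php/"))
```

If that is what happens here, the page content is served normally and only the stylesheet request misses, which would match the unstyled page described above.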



Thanks Keith. There’s no folder with that name; it really is a page. Somehow Google managed to index a test file located somewhere on the server, in a directory containing some PHP and Perl stuff. Why it is being shown in the search results is a real mystery. We’ve used a script to redirect it to the homepage, so problem solved.
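For anyone wanting to do something similar, here is a rough sketch of the decision such a redirect script has to make, written in Python for illustration only (the script actually used on the site is not shown in this thread). The function name `redirect_target` and the default homepage path `/` are assumptions.

```python
import re
from typing import Optional

# Matches a path where something follows a .php script name,
# e.g. "/index.php/test.php" or "/index.php/".
SCRIPT_WITH_EXTRA = re.compile(r"^(/.*?\.php)(/.*)$")

def redirect_target(path: str, home: str = "/") -> Optional[str]:
    """Return the URL to redirect to, or None if the path is a normal page."""
    if SCRIPT_WITH_EXTRA.match(path) is None:
        return None  # plain page such as /index.php: serve it as usual
    return home      # extra path after the script: send the visitor home

print(redirect_target("/index.php/test.php"))  # redirect to the homepage
print(redirect_target("/index.php"))           # no redirect
```

This only makes sense on a site where no real page has extra segments after the script name. Sending the redirect as a permanent one (HTTP status 301) is the usual way to tell a crawler that the old address should no longer be kept.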



This is related to the danger of duplicate content. The same thing happens in EE-driven sites. A page like mysite.com/index.php/templategroup/template/article.php is an existing page.
But mysite.com/index.php/templategroup/template/ is not. Nevertheless, Google ‘sees’ a page there, simply because of the way EE works: it spits out ‘pages’ whenever a browser (or bot) asks for them. In my case I had the title of that template as the page title. So what happened was that the title tag showed all of the titles, and I got a comment from Google that the title tag contained too many characters. Obviously.

I wrote a (before page load) if-that-third-segment-is-missing redirect script that sends the browser to mysite.com/index.php/templategroup/template/index.php, which is an existing page.
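The logic of that check can be sketched roughly as follows. This is a Python illustration of the idea, not the actual EE script; the function name `missing_segment_redirect` and the `default_page` parameter are made up for the example.

```python
from typing import Optional

def missing_segment_redirect(path: str, default_page: str = "index.php") -> Optional[str]:
    """If the path has a template group and template but no third segment,
    return the URL of the default page to redirect to; otherwise None."""
    prefix, sep, rest = path.partition("/index.php/")
    if not sep:
        return None  # not an index.php/segment/... style URL
    # Ignore empty pieces so a trailing slash does not count as a segment.
    segments = [s for s in rest.split("/") if s]
    if len(segments) == 2:
        group, template = segments
        return f"{prefix}/index.php/{group}/{template}/{default_page}"
    return None  # third segment present (or too few segments): leave it alone

print(missing_segment_redirect("/index.php/templategroup/template/"))
print(missing_segment_redirect("/index.php/templategroup/template/article.php"))
```

The first call returns the index.php address for that template, the second returns nothing because the article page already exists.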

Google now notices that mysite.com/index.php/templategroup/template/ is the same as mysite.com/index.php/templategroup/template/index.php, which does no harm.

Long story, I know… :)

