Migrating links directories


Mark Brookes
09-06-2006, 07:25 AM
Hi

##1##
I have an old links script which displays category pages with URLs of the form:
http://www.sparklingspeeches.com/cgi-bin/links/Wedding-Accessories.shtml

My new eSyndiCat script displays category pages with URLs of the form:
http://www.sparklingspeeches.com/wedding-resources/index.php?category=3
or
http://www.sparklingspeeches.com/wedding-resources/Wedding-Links/wedding-accessories/

Many of my old category pages have Google PageRank (even more so on my other site, fine wedding speeches), which I do not want to lose.


##2##
I understand that search engines, and Google in particular, dislike duplicate content. At the moment the new category page duplicates the old category page.

##3##
I understand that it is "good" practice to 'never' delete old or superseded pages, because other sites across the internet might link to them and those links would become broken.

##4##
I would like to keep my online files tidy by deleting files that are no longer "live"


All these seem to be in conflict.



===== Q 1 =====
May I please ask for opinions on the best way to migrate from old links pages to new link pages?



===== Q 2 =====
IF the next version of eSC allows us to fully define the URL style of categories, would I be better off waiting for that version so that I could "force" the URLs to remain the same as the existing category pages?

e.g.
if I could get eSC to report the URL as
http://www.sparklingspeeches.com/cgi-bin/links/Wedding-Accessories.shtml
even if in fact the eSC program was in folder...../public-html/esc/...

Then presumably Google would never even know that there had been a change?

Also link partners would never know that there had been a change either?


regards
Mark

Vincent Wright
09-06-2006, 08:34 AM
I think that slightly tweaking .htaccess, making some code modifications, and [probably] changing category names (e.g. first letters in upper case) will do the trick.

Mark Brookes
09-06-2006, 10:17 AM
* Tweaking .htaccess - ...erm... could you advise how I should tweak it?

* Code modifications - could you advise what modifications?

* Category Name changes - :) This one I can manage:
admin-panel > Browse > select category > category path

Vincent Wright
09-06-2006, 10:48 AM
Hmm, I have just noticed a potential problem.

As you can see, eSyndiCat uses the FULL PATH when browsing categories, while your previous script uses only the category name.

E.g.

esc:
/wedding-links/wedding-accessories.html

old one:
/Wedding-Accessories.shtml

On the other hand this doesn't seem to be a huge problem.

Mark Brookes
09-07-2006, 11:58 AM
HEY :(

What happened to my carefully written & thoughtful reply which I posted yesterday?
Bother! I can't remember what I said!

Basically, if eSyndiCat could report a URL style of my choice OR could write ordinary HTML pages whose URL I could specify, this would be VERY helpful.

It would massively simplify the business of upgrading to eSC and actually going live with the eSC script, because I would not have to worry about any effects on Google or existing link partners. I could consider 'nice' urls at a later date.

If it really "doesn't seem to be a huge problem", PLEASE may I ask for details on how to do it?
( & will the next version have it included :) ?)

Vincent Wright
09-07-2006, 12:19 PM
At the time you were typing your post, I was typing a possible solution.

Please check this thread:
http://www.esyndicat.com/forum/about6921.html

Mark Brookes
09-07-2006, 05:48 PM
Hey Vincent,
That's pretty good.

Are we able to do something similar for the directory categories?

for example,
currently the eSC programs are in
http://www.sparklingspeeches.co.uk/wedding-resources/
and the categories with mod_rewrite have URLs like
http://www.sparklingspeeches.co.uk/wedding-resources/wedding-cars/

but I would like eSC to lie to visitors' browsers and say that the category URLs are actually like
http://www.sparklingspeeches.co.uk/cgi-bin/links/wedding-cars.shtml


To make this a general MOD I think it needs to collect some extra settings from the administrator:

Directory path for url display = /cgi-bin/links/
File name format (html, shtml,php etc) = shtml

Vincent Wright
09-12-2006, 09:40 AM
Greetings Mark,

you have overwhelmed me with your requests for a while :)

Well, let's resolve all the issues bit by bit.

I think the most urgent issue is the one about the old category paths, since it keeps you from adding more partners.

Here I'm going to describe the steps I took to achieve the desired result.

1. Disable ScriptAlias

ScriptAlias is a directive in the Apache configuration file that tells Apache something like this: "look, this folder contains CGI scripts, and they should be treated not as static pages but as dynamic ones, i.e. you should execute the scripts and return their output to clients".

Why do you have to bother? Because your old paths contain /cgi-bin/. As I found out, the ScriptAlias directive takes precedence over mod_rewrite, which means the rules in the .htaccess file will be ignored, and so you will not be able to make Apache convert paths like /cgi-bin/links/Cat-Name.shtml into paths that eSyndiCat can handle.

So, one way or another, you have to stop the /cgi-bin/ folder from being treated as a container for CGI scripts. If you are not sure how to do it, consult your hosting company. I can advise little here, since the only way I found is to disable it for the whole web server by commenting out the ScriptAlias directive; that is a no-no on shared hosting, so you had better ask your hosting support guys.
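
For reference, a rough sketch of what the directive in question typically looks like in the main server configuration. The filesystem path is only a placeholder and will differ on every server. ScriptAlias is only valid in the server or virtual host configuration, not in .htaccess, which is why the host usually has to make this change.

```apache
# httpd.conf (main server config; not editable from .htaccess)
# Commented out, so Apache no longer treats /cgi-bin/ as a CGI container.
# The path below is a placeholder, not taken from any real server.
# ScriptAlias /cgi-bin/ "/path/to/your/cgi-bin/"
```

Note that commenting it out affects every CGI script under /cgi-bin/ on that server or virtual host, not just the old links script.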

2. Introduce new path field

In order not to accidentally break any current functionality, I decided to introduce a new path field. The current implementation stores a `path` field in the Categories table, but it consists of the names of all the parent categories, e.g. GrandParentCat/ParentCat/ChildCat. We need a "one-level" path, i.e. we only need to store ChildCat in the new field.

To create this new field, please execute the following query:


ALTER TABLE `dir_categories` ADD `new_path` VARCHAR( 255 ) NOT NULL ;


NOTE: Don't forget to adjust the table prefix.

Now that you have created this additional field, you have to put in a value for EVERY category in your database.

NOTE: The values should be unique across the whole table. In other words, you cannot have TWO categories named Wedding Accessories, even if they are located in different parent categories. Otherwise this results in ambiguity.

The only way to do it for now is manually. If you have too many categories to handle manually, I will write a simple script that converts category names to one-level category paths.
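
A minimal sketch of what such a conversion could look like, assuming the one-level path is simply the category title with spaces and punctuation turned into hyphens. The function names make_new_path() and find_duplicate_paths() are hypothetical and not part of eSyndiCat, and the sketch does not touch the database; it only shows the string handling and the uniqueness check.

```php
<?php
// Hypothetical helper: turn a category title into a one-level path.
function make_new_path($title)
{
    // Replace every run of characters other than letters and digits with one hyphen.
    $slug = preg_replace('/[^A-Za-z0-9]+/', '-', trim($title));
    // Drop any hyphen left over at the start or end.
    return trim($slug, '-');
}

// Hypothetical helper: list the paths that occur more than once,
// so that the clashing categories can be renamed by hand.
function find_duplicate_paths(array $paths)
{
    $dupes = array();
    foreach (array_count_values($paths) as $path => $count) {
        if ($count > 1) {
            $dupes[] = (string) $path;
        }
    }
    return $dupes;
}
```

With this sketch, make_new_path('Wedding Accessories') gives Wedding-Accessories, and any value reported by find_duplicate_paths() needs a different new_path before the mod is applied.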

3. Adjust database class

Since we introduced a new field we have to make this field available to the script by adjusting some queries in classes/Dir.php.

Open classes/Dir.php, and apply the replacements below.

a) function getCategoryByPath()


$sql .= "WHERE `path` = '{$aPath}' ";


replace with


$sql .= "WHERE `new_path` = '{$aPath}' ";


b) function getCategoriesByParent()

In the


/** get subcategories **/
if (!empty($categories))
{


replace


$sql .= "(SELECT `id_parent`,`id`,`title`, `path` ";


with


$sql .= "(SELECT `id_parent`,`id`,`title`, `path`,`new_path` ";


c) function getBreadcrumb()


$aBreadcrumb[] = array ('id' => $category['id'], 'title' => $category['title'],
'description' => $category['description'], 'path' => $category['path']);


replace with


$aBreadcrumb[] = array ('id' => $category['id'], 'title' => $category['title'],
'description' => $category['description'], 'path' => $category['path'], 'new_path' => $category['new_path']);


That's all.

Save changes and close the file.

4. Finally, adjust category URLs

This is the last step: now you have to change the actual category URLs that are displayed to end users. As you might have guessed, it involves modifying template file(s).

NOTE: I used the GreenLeaves template. If you use a different template, the pieces of code may differ slightly.

Open templates/<your_template>/Layout.php.

a) function print_breadcrumb()

Change this line (#123)


$link = $gDirConfig['mod_rewrite'] ? $url.$item['path'].'/' : $url."index.php?category={$item['id']}";


to this one


$link = $gDirConfig['mod_rewrite'] ? $url.'cgi-bin/links/'.$item['new_path'].'.shtml' : $url."index.php?category={$item['id']}";


b) function print_categories()

Change this line (#197)


$url2 = $gDirConfig['mod_rewrite'] ? $url.$value2['path'].'/' : $url."index.php?category={$value2['id']}";


to this one


$url2 = $gDirConfig['mod_rewrite'] ? $url.'cgi-bin/links/'.$value2['new_path'].'.shtml' : $url."index.php?category={$value2['id']}";


Then, change this line (#216)


$url .= $gDirConfig['mod_rewrite'] ? $value['path'].'/' : "index.php?category={$value['id']}";


to this one


$url .= $gDirConfig['mod_rewrite'] ? 'cgi-bin/links/'.$value['new_path'].'.shtml' : "index.php?category={$value['id']}";


Save changes and close the file.
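
The three edits above repeat the same prefix and extension. As a rough sketch only, the same logic could be collected into one helper that takes the display directory and file name format as settings, along the lines of what Mark asked for earlier in the thread. The function category_url() and the $gUrlStyle array are hypothetical and do not exist in eSyndiCat.

```php
<?php
// Hypothetical settings; eSyndiCat itself does not define these keys.
$gUrlStyle = array(
    'display_dir' => 'cgi-bin/links/', // directory path shown in the URL
    'extension'   => 'shtml',          // file name format (html, shtml, php etc)
);

// Hypothetical helper that builds the link for one category.
// $base is the eSC root URL, $category needs the keys 'id' and 'new_path'.
function category_url($base, array $category, $modRewrite, array $style)
{
    if (!$modRewrite) {
        // Same fallback as in the original template code.
        return $base . "index.php?category={$category['id']}";
    }
    return $base . $style['display_dir'] . $category['new_path'] . '.' . $style['extension'];
}
```

For example, calling category_url('http://www.sparklingspeeches.co.uk/wedding-resources/', array('id' => 3, 'new_path' => 'Wedding-Accessories'), true, $gUrlStyle) returns http://www.sparklingspeeches.co.uk/wedding-resources/cgi-bin/links/Wedding-Accessories.shtml. The eSC root folder is still part of the result.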

Try to apply this mod and see what happens.

Should you have any problems with it feel free to contact me.

Vincent Wright
09-12-2006, 11:30 AM
5. Tweak .htaccess

I completely forgot about modifying .htaccess file to make it all work.

Well, open .htaccess, find these rules:


# mod_rewrite rules for categories pages
RewriteRule ^(.*)/$ index.php?category=$1 [QSA,L]
RewriteRule ^(.*)/index([0-9]+).html$ index.php?category=$1&page=$2 [QSA,L]


and replace them with this one:


# mod_rewrite rules for categories pages
RewriteRule ^cgi-bin/links/(.+)\.shtml$ index.php?category=$1 [QSA,L]


Save changes and close the file.
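
To see what the new rule captures, here is a small sketch that mimics its pattern with PHP's preg_match, with the dot before shtml escaped so that it only matches a literal dot. In a .htaccess file, mod_rewrite matches the pattern against the path relative to the folder holding that .htaccess, without a leading slash, so the sketch takes such a relative path. The function name match_category_rule() is hypothetical.

```php
<?php
// Mimics the rewrite rule's pattern to show what ends up in $1.
// Returns the captured category path, or null when the rule would not match.
function match_category_rule($relativePath)
{
    // '#' delimiters avoid having to escape the slashes in the pattern.
    if (preg_match('#^cgi-bin/links/(.+)\.shtml$#', $relativePath, $m)) {
        return $m[1];
    }
    return null;
}
```

Here match_category_rule('cgi-bin/links/Wedding-Accessories.shtml') returns Wedding-Accessories, which is the value that has to be stored in new_path for that category. A default-style path such as wedding-links/wedding-accessories/ returns null, so it is no longer rewritten once the old rules are removed.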

Mark Brookes
09-12-2006, 12:18 PM
Hello Vincent.

Thanks very much for looking into this. Here are the results:

1. Disable Script Alias - Host/server action

For testing purposes, because it takes time to get things changed by my host, and because I don't want to do anything drastic to my other CGI scripts, I have skipped this bit.

For testing I will aim for a display path of /links instead of /cgi-bin/links/ and worry about this aspect once we've got the rest working.

2. Add 'new_path' field to _categories table in database
Done

3. For each category create a new unique path name
Done

4. Changes to …/classes/Dir.php
Done

5. Changes to ….. / templates/…/ Layout.php
Done


Results
Parse error: syntax error, unexpected T_VARIABLE in /home/spa10001/public_html/wedding-resources/templates/Minimalistic/Layout.php on line 123

I have in fact tried it for both cgi-bin/links/ and just links/. In both cases it halts at line 123 of Layout.php

... Hoping this looks simple to you :huh:

regards
Mark

Vincent Wright
09-12-2006, 12:31 PM
Could you send me your Layout.php?

And tell me which template it is based on.

Mark Brookes
09-12-2006, 05:06 PM
Hi Vincent

I have emailed you my Layout.php
I am using the Minimalistic template.

I have had to temporarily reverse the mod so I can continue to work with the directory, so please let me know if/when you want me to re-apply it.

regards
mark

Vincent Wright
09-13-2006, 10:23 AM
Mark,

I have applied the mod for you -- please check your directory.

Mark Brookes
09-14-2006, 07:38 AM
To get the new URLs to appear in emails, some changes are needed in DirMailer.php, as shown in the attachment.

Mark Brookes
09-14-2006, 11:30 AM
Hi All

We have done a lot of experimenting with possible alternatives to the modification in this thread. Listed below is the outcome: what the modification can & can't do.



Browser Displayed url adapted by administrator
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~

This modification is about adapting the url which displays in the Address field of the internet browser when visitors view the links directory. This means that the url will also be adapted when search engines such as Google visit the site and add the pages to their S.E. Database.


Benefits
~~~~~~~~
* I can adapt the URL to optimise for search engines, e.g. I understand that they prefer ‘static’ pages (.html).

* I had links pages on my site before installing eSC; these pages have URLs which are
** already ranked by Google, and
** already exchanged with link partners

By using this modification I can cause the display-url’s to be identical to the existing url’s and therefore avoid complexities with about 150 link partners and also with Google.


Limitations:
~~~~~~~~~~~~
* This modification can alter the displayed url ‘within-the-eSC-root-Folder'. It cannot alter the eSC root folder itself.

e.g: If the installation folder is
http://www.sparklingspeeches.co.uk/wedding-resources/


This MOD will allow urls such as:
http://www.sparklingspeeches.com/wedding-resources/cgi-bin/links/Wedding-Accessories_wedding-decorations.shtml
or
http://www.sparklingspeeches.com/wedding-resources/links/Wedding-Accessories_wedding-decorations.shtml
or
http://www.sparklingspeeches.com/wedding-resources/wedding-pages/Wedding-Accessories_wedding-decorations.shtml


But cannot create url’s such as:
http://www.sparklingspeeches.com/cgi-bin/links/Wedding-Accessories_wedding-decorations.shtml
or
http://www.sparklingspeeches.com/links/Wedding-Accessories_wedding-decorations.shtml
or
http://www.sparklingspeeches.com/wedding-pages/Wedding-Accessories_wedding-decorations.shtml



* Therefore the ORIGINAL choice of what folder you installed eSC into, will determine the full display url.

** If you install eSC into the root folder of your web site, you will have complete control over the url display. But then you will have a web site root directory full of eSC files and folders which you may feel is messy.

** One possibility is to install eSC into the “desired” root folder to begin with. Unfortunately I did not become aware of this issue till much later.

e.g
Pre existing link pages have urls such as:
http://www.sparklingspeeches.co.uk/cgi-bin/links/wedding-cars.shtml

so IF I had installed eSC into
http://www.sparklingspeeches.co.uk/cgi-bin/links/
then this would be the default ‘root’ folder for eSC and a major hassle would have been avoided


Except that:
*** The cgi-bin/links/ folder would be muddled up with both old link program files & new eSC files which would make maintaining the web site messy.

*** I would still need this modification, in part, to adapt the eSC default url of (eg):
http://www.sparklingspeeches.co.uk/cgi-bin/links/wedding-cars/
to:
http://www.sparklingspeeches.co.uk/cgi-bin/links/wedding-cars.shtml