Re: The Get Started with Wordpress Workshop
[quote=rawTOP;16883]In setting up my blog I’m thinking up a few things that aren’t in the default install that will help. In terms of controlling spiders and duplicate content issues, I’d recommend the following…
Create a robots.txt file and have it be something like this…
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-login.php
Basically that makes it so your admin area and your login page aren’t crawled at all. You don’t want a general block on all things /wp- because there are files like stylesheets that the spiders will want to crawl (I made that mistake and actually remembered it while writing this up)…
I created a physical robots.txt file on disk, but there has to be a way to do it using WordPress itself, 'cause I then had to fight with WordPress: it wanted to serve its own robots.txt that didn’t match mine. I wound up adding an exception for the file to the rewrite rules in htaccess (actually VirtualHosts in my case) - that looked like…
RewriteCond %{REQUEST_URI} !^/robots\.txt$
I think if you’re using htaccess the condition stays the same, since %{REQUEST_URI} keeps its leading / in both places. It’s the RewriteRule pattern that loses the leading / in htaccess - the VirtualHosts file works differently than htaccess on stuff like that…
Does anyone know how to set up robots.txt in WordPress so you don’t have to go through that hassle?
The next thing I’d do is tell spiders not to index your archive pages. You want post pages to always be important, as well as your main blog and category pages, but not things like month and date archive pages. Even on the main blog and category pages you don’t really want anything other than the first page indexed - otherwise you’ll have way too much duplicate content on your site.
Now, you don’t want to block the pages completely because then they’ll have link juice that they’re not passing on to other pages. You want the spiders to crawl them, but not index them. To achieve that, put the following in your header.php file, somewhere between <head> and </head>…
<?php
// Date, search, author, and paged views: spiders may follow the links but should not index the page.
// is_paged() is true on page 2 and up, with or without pretty permalinks.
if (is_day() || is_month() || is_year() || is_search() || is_author() || is_paged()) { ?>
<meta name="robots" content="noindex,follow" />
<?php } ?>
I haven’t tested the paging part yet, but I’ve confirmed the other parts work…[/quote]
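For the meta tags, another option is to hook wp_head from the theme's functions.php instead of editing header.php. This is only a rough sketch, assuming the theme calls wp_head() inside its <head>. The names example_should_noindex and example_robots_meta are made up for illustration; the decision is split into its own function so it can be checked outside WordPress.

```php
<?php
// Hypothetical helper: returns true when any of the listed view flags is set.
// $view is a plain array of booleans, e.g. array('month' => true).
function example_should_noindex(array $view) {
    foreach (array('day', 'month', 'year', 'search', 'author', 'paged') as $key) {
        if (!empty($view[$key])) {
            return true;
        }
    }
    return false;
}

// Hypothetical callback: asks WordPress what kind of page this is and
// prints the same meta tag as the header.php snippet in the quote.
function example_robots_meta() {
    $view = array(
        'day'    => is_day(),
        'month'  => is_month(),
        'year'   => is_year(),
        'search' => is_search(),
        'author' => is_author(),
        'paged'  => is_paged(), // page 2 and up of any listing
    );
    if (example_should_noindex($view)) {
        echo '<meta name="robots" content="noindex,follow" />' . "\n";
    }
}

// Only register the hook when running inside WordPress.
if (function_exists('add_action')) {
    add_action('wp_head', 'example_robots_meta');
}
```

Use this or the header.php snippet, not both, or the tag gets printed twice.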
Have you looked into this robots text plugin? Robots Meta
If you try it, you might want to check the author’s page… seems there are issues with 2.6 and the options. Platinum SEO also lets you set a lot of the robots meta data per post/page, which might make it easier to use than the Robots Meta plugin…
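On the robots.txt question: one thing that might avoid the rewrite fight is to delete the physical file and let WordPress generate robots.txt, then change what it outputs with a filter. A minimal sketch, assuming your WordPress version provides the robots_txt filter (newer releases do; it may not exist in releases as old as 2.6) and that requests for /robots.txt actually reach WordPress. The function name example_robots_txt is made up for illustration.

```php
<?php
// Hypothetical filter callback: replaces WordPress's generated robots.txt body
// with the rules from the quoted post.
// $output is what WordPress generated; $public reflects the site's
// "visible to search engines" setting.
function example_robots_txt($output, $public) {
    // If the site is set to discourage search engines, keep WordPress's output.
    if (!$public) {
        return $output;
    }
    $rules = array(
        'User-agent: *',
        'Disallow: /wp-admin/',
        'Disallow: /wp-login.php',
    );
    return implode("\n", $rules) . "\n";
}

// Only register the hook when running inside WordPress.
if (function_exists('add_filter')) {
    add_filter('robots_txt', 'example_robots_txt', 10, 2);
}
```

This would go in the theme's functions.php or a small plugin. As far as I know WordPress only generates robots.txt when no real file exists on disk, so remove the physical one first.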