How to Fix Your ExpressionEngine RSS Template

I began investigating the fascinating minutiae of RSS when I couldn't find a reasonable answer in the EE forums as to why Google Reader was re-posting every updated post on my site even though the entry dates hadn't changed. As I went line-by-line through the prevalent templates floating around the EE community, I noticed several things that could be improved upon. The only critical fix, in my opinion, is removal of seconds from the dates in the <item /> section. If you want your feed to validate you'll want to add the atom namespace. The rest are optional improvements.

If you want to skip the wheres and whys, I've posted the updated (fixed) and improved RSS templates on Snipplr.com and in their entirety at the end of this article.

Existing RSS templates for ExpressionEngine

Resources

Methodology

I'm not going for a PhD in online syndication, I just wanted my RSS feeds to be error-free, to work as expected in major aggregators and to use best practices that could be determined within a somewhat low threshold of research pain. In addition to referring first to the RSS 2.0 specifications, I used the RSS feeds of some really large websites to serve as examples, making the assumption (I know) that these sites have probably done thorough research on this topic. Often my choices were the result of seeing if these "big players" were all doing the same thing. All the feeds are from major sites with the exception of the Flickr blog. The Flickr blog is using WordPress.com. I figured that with their huge user-base not only would the feeds have been thoroughly vetted, but most aggregators would be able to read them due to the sheer volume of WordPress-powered sites out there. I also chose a WordPress.com feed instead of a self-hosted installation of WP to make sure it was the well-tested standard feed. The feeds I used are:

RSS Feed Breakdown

Feed Format: RSS vs. Atom

As of mid-2005, the two most likely candidates are RSS 2.0 and Atom 1.0. Google Reader supports either fully and they suggest choosing one or the other (not both) because most RSS readers support all major formats and offering both can confuse users. The Atom syndication format, whose creation was in part motivated by a desire to get a clean start free of the issues surrounding RSS, has been adopted as IETF Proposed Standard RFC 4287 and is used by Google. However, RSS 2.0 was the first to support enclosures, has captured the podcasting audience and is the recommended format in the iTunes podcast specs.

I generally do as Google does when it comes to web optimization, and I am a big fan of standards. In some regards I would call Atom "the higher path". That said, I am also a big fan of simplicity and ease-of-use so I'm going with RSS 2.0 because:

  • I already hand-built an RSS 2.0 feed for podcasting (well, for iTunes) so would rather learn one standard / keep all feeds similarly formatted.
  • One less term that could potentially confuse end-users and "web lite" folk who might inherit my work later on.
  • A lot of really big sites that have probably carefully considered this topic went with RSS 2.0, including NYTimes.com, AListApart.com, Ebay.com, news.BBC.co.uk, and CNN.com.

RSS XML Namespaces

Here's where my first change kicks in. If you aren't actually USING a namespace in your RSS feed there's no need to include it - it's just cruft.

Before

<rss version="2.0"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:admin="http://webns.net/mvcb/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:content="http://purl.org/rss/1.0/modules/content/">

After

<rss version="2.0"
xmlns:dc="http://purl.org/dc/elements/1.1/">

Even Better

<rss version="2.0"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom">

The namespaces sy and content aren't used at all and can just be removed. Optional: I chose to remove xmlns:admin="http://webns.net/mvcb/" and the only tag that used it, <admin:generatorAgent rdf:resource="http://expressionengine.com/" /> - removing the admin namespace also eliminated rdf, which admin namespace depends on.
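
One way to spot namespace cruft before editing the template is to scan the rendered feed for prefixes that are declared but never used. The following is only a rough sketch: the function name and the sample feed are made up for illustration, and the regex check is a heuristic, not a real XML namespace analysis.

```python
import re


def unused_namespace_prefixes(feed_xml):
    """Return declared xmlns prefixes that never appear on an element or attribute."""
    declared = re.findall(r'xmlns:([A-Za-z_][\w.-]*)\s*=', feed_xml)
    # Remove the declarations themselves so they are not counted as usage.
    body = re.sub(r'xmlns:[A-Za-z_][\w.-]*\s*=\s*"[^"]*"', "", feed_xml)
    unused = []
    for prefix in declared:
        # Usage looks like <dc:date>, </dc:date> or rdf:resource="...".
        if not re.search(r"[<\s/]" + re.escape(prefix) + r":[A-Za-z_]", body):
            unused.append(prefix)
    return unused


# Hypothetical feed fragment: only the dc prefix is actually used.
SAMPLE_FEED = """<rss version="2.0"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
  xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel><dc:language>en</dc:language></channel>
</rss>"""

print(unused_namespace_prefixes(SAMPLE_FEED))
```

Anything the function reports is a candidate for removal, but check the template by hand first, since a prefix could be used in a conditional branch that did not render.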

I'm not saying any of the removed namespaces are bad, just unnecessary. For example, admin is a very common namespace, but if you leave it in you should update the URI to include the EE version number (dynamically) and add the <admin:errorReportsTo rdf:resource="URI"/> tag pointing to a valid email address for errors. It may be beneficial to aggregators in reporting statistics on the web frameworks and content management systems delivering aggregated RSS feeds.

I rather admire the efficiency and simplicity of the namespace declarations used by NYTimes.com, CNN.com and others - so again, this is personal choice.

The third option (even better) adds the atom namespace. The default EE RSS feed will not validate because it lacks the required atom:link tag containing the URI of the feed itself. There is some debate on whether this is actually necessary (some say the validator is wrong, not the feed that lacks the atom tag); read the article Adding Atom:Link to Your RSS Feed for background on this.
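
If you want to confirm that your rendered feed actually carries the self-referencing atom:link, a small check along these lines may help. It is a sketch using Python's standard library XML parser; the function name and the example.com feed are invented for illustration.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def find_self_link(feed_xml):
    """Return the href of the channel's atom:link rel="self", or None if absent."""
    root = ET.fromstring(feed_xml)
    channel = root.find("channel")
    if channel is None:
        return None
    for link in channel.findall("{%s}link" % ATOM_NS):
        if link.get("rel") == "self":
            return link.get("href")
    return None


# Hypothetical feeds: one with the atom:link, one without.
WITH_LINK = """<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Example</title>
    <atom:link href="http://example.com/rss" rel="self" type="application/rss+xml" />
  </channel>
</rss>"""

WITHOUT_LINK = """<rss version="2.0">
  <channel><title>Example</title></channel>
</rss>"""

print(find_self_link(WITH_LINK))
print(find_self_link(WITHOUT_LINK))
```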

Depending on your site's content, your SEO practices and target audience, it is likely that you may need additional namespaces. A media-rich site, for example, would benefit from the media namespace. The media namespace is used to syndicate video, images and other media and can open up your feed to consumption by media-rich aggregators and services like Cooliris and Yahoo Video Search.

Channel Area

Most of the default template makes perfect sense - just take a look at your feed output to make sure all the EE fields used have valid values. Also, make sure you want to use the {weblog_foo} tags - if you are providing a site feed that combines multiple sections you will probably want to hand-edit many of the tags in the channel area.

Dates

Important tip: make sure your feed is correctly reporting the timestamp of your entry date. If the seconds are changing on an item whenever you make an update this will cause many aggregators, including Google Reader, to repost the entry to your RSS feed. Either take the seconds off the time or replace '%s' with an arbitrary static value like '15'.

A search for 'RSS Updates' in the EE Forums will reveal that many people have had this problem. I tested all the date fields in my feeds (see this thread for details) and found that although the entry date's day, hour and minute don't change on update (as expected), the seconds do! This weird behavior has something to do with how the dates are stored in the database and/or how the date is interpreted by EE's date tags. What it means is that if you go back and edit a post from three years ago, some aggregators will repost the item to your RSS feed even though you did not change the entry date. This can be especially troubling if you like to go back and tweak a post a lot right after publishing - you may go to your feed reader to see it reposted several times in a row.
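
To see why dropping the seconds helps, here is a tiny illustration. It is not EE code: the two timestamps are invented to mimic an entry whose reported seconds drift after an edit.

```python
from datetime import datetime, timezone


def stable_timestamp(moment):
    """Zero out seconds and microseconds so edits cannot change the reported time."""
    return moment.replace(second=0, microsecond=0)


# Made-up values: same entry date, but the seconds differ after an update.
first_save = datetime(2009, 2, 11, 14, 30, 7, tzinfo=timezone.utc)
after_edit = datetime(2009, 2, 11, 14, 30, 52, tzinfo=timezone.utc)

print(first_save == after_edit)
print(stable_timestamp(first_save) == stable_timestamp(after_edit))
```

With the seconds included, an aggregator comparing the two values sees a changed item; with them removed, both saves report the same minute.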

pubDate vs. dc:date

<pubDate> is part of the RSS 2.0 spec. A lot of feeds out there still use <dc:date>, and this is either because they kept it from their RSS 1.0 template (for which dc:date was the only option), or they really like the very popular Dublin Core namespace, or they prefer it because of the ISO 8601 date format, which is much more prevalent than the (really old, as in ARPANET old!) RFC 822 date format that <pubDate> uses. On one hand it makes sense to stay with the spec and pull in namespace elements only as required. On the other hand, it makes sense to provide output in the most reusable way (updated date format). Feed readers parse either just fine, so this is a judgment call on your part. Here's an argument for each:

Based on my own survey of the feeds referenced above, I opted to switch to <pubDate>, replacing <dc:date /> in the channel with:

<pubDate>{gmt_date format="%D, %d %M %Y %H:%i:%s %T"}</pubDate>

And replacing <dc:date /> in the item declaration(s) with:

<pubDate>{gmt_entry_date format="%D, %d %M %Y %H:%i %T"}</pubDate>
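
For comparison, here is roughly what the two date styles look like for the same moment, sketched with Python's standard library rather than EE date tags. The entry date is made up, and note that Python's formatter always writes the seconds, so they are pinned to 00 here instead of being omitted as in the template above.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Hypothetical entry date with the seconds already zeroed out.
entry_date = datetime(2009, 2, 11, 14, 30, tzinfo=timezone.utc)

# RFC 822 style, as used by <pubDate>.
rfc822_date = format_datetime(entry_date, usegmt=True)

# ISO 8601 style, as used by <dc:date>, written without seconds.
iso8601_date = entry_date.strftime("%Y-%m-%dT%H:%MZ")

print(rfc822_date)
print(iso8601_date)
```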

Item <title ... />

The default tag is fine, but if your content people keep putting special characters in their titles (like mine do) then you might want to add the protect_entities="yes" attribute to the {exp:xml_encode} tag. For example the main EE site I work on uses &#187; (») and &amp; (&) a lot in titles.
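
As I understand it, the point of protecting entities is to avoid double-encoding references that are already in the title. The sketch below only illustrates that idea in Python; it is an assumption about the general behavior, not the actual implementation of {exp:xml_encode}, and the function name is borrowed purely for readability.

```python
import re

# Matches an ampersand that does NOT start an existing entity reference.
BARE_AMPERSAND = re.compile(r"&(?!#\d+;|#[xX][0-9a-fA-F]+;|[A-Za-z][A-Za-z0-9]*;)")


def xml_encode(text, protect_entities=False):
    """Escape &, < and > for XML, optionally leaving existing entities alone."""
    if protect_entities:
        text = BARE_AMPERSAND.sub("&amp;", text)
    else:
        text = text.replace("&", "&amp;")
    return text.replace("<", "&lt;").replace(">", "&gt;")


title = "Tips &#187; Tricks & more"
print(xml_encode(title))
print(xml_encode(title, protect_entities=True))
```

Without protection the &#187; reference turns into &amp;#187; and shows up as literal text in a feed reader; with protection only the bare ampersand is escaped.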

Even after protecting entities I was still having a heck of a time getting a trademark (™) symbol that is used on a site in many post titles and in a category to consistently display on both the webpage and in RSS feed aggregators - after some digging I realized the character entity that was being used (&#x2122;) for it was not the UTF-8 reference (&#8482;) specified as the encoding for both the RSS and XHTML. So, make sure you (or your content editors) use the correct character encoding entities for special characters!

Item <guid ... />

As formatted in the official EE RSS template, the <guid> is not a permalink and therefore should have the isPermaLink="false" attribute added to it. Of course you could use your actual permalink, and then you could leave that off or change it to "true".

"We recommend the use of the Atom and RSS 2.0 elements to unambiguously identify items. An item that is updated should keep its original ID, and a new item should never reuse an older item's ID. Changing IDs unnecessarily may result in duplicate items, and reusing IDs may cause some items to be hidden. "Tag URIs" make good IDs, since they don't change even when you need to reorganize your links." - Google Reader Tips for Publishers > Implementing Feeds

The above recommendation explains the issue I referred to earlier, where an entry is reposted whenever it is updated. Because of this, you will probably want to remove the '%s' from the formatting attribute as well. So, change the gmt_entry_date format string in the <guid> line to "%H:%iZ".

Optionally, you could just use the actual URI of your article and change the isPermaLink attribute to true. EE won't let you post two items with the same URL title within a weblog/channel, so you are pretty safe there (EE adds a number to the end of the URL title if one already exists).
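
The Google quote above mentions "Tag URIs" as good IDs. As a rough illustration of what one looks like (following the general tag:authority,date:specific shape from RFC 4151), here is a sketch in Python. The domain, year and entry id are placeholders, and this is not something the EE template does for you.

```python
def tag_uri(authority, date, specific):
    """Build a tag URI of the form tag:authority,date:specific."""
    return "tag:%s,%s:%s" % (authority, date, specific)


def entry_guid(domain, since_year, entry_id):
    """Derive a guid from values that never change when an entry is edited."""
    return tag_uri(domain, str(since_year), "entry/%d" % entry_id)


# Hypothetical values: the ID depends only on the domain, a fixed year and the entry id.
print(entry_guid("example.com", 2009, 42))
```

Because nothing in the ID comes from an edit time or from the URL structure, it stays the same when you update the entry or reorganize your links.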

Item <description.../>

This line is technically fine, but most people will change this to allow HTML formatting of their entries: <description><![CDATA[{summary}{body}]]></description>. This is what displays the bulk of your entry item and where most of your site-specific customization will happen. Customizing ExpressionEngine RSS 2.0 Template on 'A Blog Not Limited' is a great resource for this.

Categories

The template uses <dc:subject>. Your feed will be more interoperable with other systems and make more sense programmatically if each category is in its own tag. You can do this using the <dc:subject> format, or you can switch to using the <category> tag for each as provided for in the RSS 2.0 spec.

Original Template

Code:

<dc:subject>{exp:xml_encode}{categories backspace="1"}{category_name}, {/categories}{/exp:xml_encode}</dc:subject>

Result:

<dc:subject>Architecture, Science, Workplace</dc:subject>

Separate using <dc:subject>

Code:

{categories}<dc:subject>{exp:xml_encode}{category_name}{/exp:xml_encode}</dc:subject>{/categories}

Result:

<dc:subject>Architecture</dc:subject>
<dc:subject>Science</dc:subject>
<dc:subject>Workplace</dc:subject>

Separate using <category>

Code:

{categories}<category>{exp:xml_encode}{category_name}{/exp:xml_encode}</category>{/categories}

Result:

<category>Architecture</category>
<category>Science</category>
<category>Workplace</category>
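
To show why separate tags are easier to work with programmatically, here is a small Python sketch that turns the comma-joined value from the original template into one element per category. The function name is made up, and the escaping uses Python's standard library rather than {exp:xml_encode}.

```python
from xml.sax.saxutils import escape


def category_elements(names, tag="category"):
    """Wrap each non-empty category name in its own XML element."""
    return ["<%s>%s</%s>" % (tag, escape(name.strip()), tag)
            for name in names if name.strip()]


# The joined string is what a consumer of the original template has to split.
joined = "Architecture, Science, Workplace"
for element in category_elements(joined.split(",")):
    print(element)
```

A consumer of the original feed has to guess that the comma is a separator, which breaks as soon as a category name contains a comma; with one tag per category there is nothing to guess.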

Updated EE RSS 2.0 Template

This includes what I consider minimal mandatory fixes to ensure error-free code and to prevent (re)posting problems.

{assign_variable:master_weblog_name="blog"}
{assign_variable:master_weblog_status="open"}
{exp:rss:feed weblog="{master_weblog_name}" status="{master_weblog_status}"}

<?xml version="1.0" encoding="{encoding}"?>
<rss version="2.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:admin="http://webns.net/mvcb/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <channel>
    <title>{exp:xml_encode}{weblog_name}{/exp:xml_encode}</title>
    <link>{weblog_url}</link>
    <description>{weblog_description}</description>
    <dc:language>{weblog_language}</dc:language>
    <dc:creator>{email}</dc:creator>
    <dc:rights>Copyright {gmt_date format="%Y"}</dc:rights>
    <dc:date>{gmt_date format="%Y-%m-%dT%H:%i:%s%Q"}</dc:date>
    <admin:generatorAgent rdf:resource="http://expressionengine.com/" />
{exp:weblog:entries weblog="{master_weblog_name}" limit="10" rdf="off" dynamic_start="on" disable="member_data|trackbacks" status="{master_weblog_status}"}
    <item>
      <title>{exp:xml_encode}{title}{/exp:xml_encode}</title>
      <link>{title_permalink=site/index}</link>
      <guid isPermaLink="false">{title_permalink=site/index}#When:{gmt_entry_date format="%H:%iZ"}</guid>
      <description>{exp:xml_encode}{summary}{body}{/exp:xml_encode}</description>
      <dc:subject>{exp:xml_encode}{categories backspace="1"}{category_name}, {/categories}{/exp:xml_encode}</dc:subject>
      <dc:date>{gmt_entry_date format="%Y-%m-%dT%H:%i%Q"}</dc:date>
    </item>
{/exp:weblog:entries}
    </channel>
</rss>
{/exp:rss:feed}

Improved EE RSS 2.0 Template

This includes optional changes that I added as a result of various articles, the RSS 2.0 spec and by examining the feeds of major professional news sites.

{assign_variable:master_weblog_name="BLOG"}
{assign_variable:master_weblog_status="OPEN"}
{assign_variable:master_rss_uri="http://PATH/TO/THIS/RSS/FEED"}

{exp:rss:feed weblog="{master_weblog_name}" status="{master_weblog_status}"}
<?xml version="1.0" encoding="{encoding}"?>
<rss version="2.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
    <title>{exp:xml_encode}{weblog_name}{/exp:xml_encode}</title>
    <link>{weblog_url}</link>
    <description>{weblog_description}</description>
    <dc:language>{weblog_language}</dc:language>
    <dc:creator>{email}</dc:creator>
    <dc:rights>Copyright {gmt_date format="%Y"}</dc:rights>
    <pubDate>{gmt_date format="%D, %d %M %Y %H:%i:%s %T"}</pubDate>
    <atom:link href="{master_rss_uri}" rel="self" type="application/rss+xml" />   
{exp:weblog:entries weblog="{master_weblog_name}" limit="10" rdf="off" dynamic_start="on" disable="member_data|trackbacks" status="{master_weblog_status}"}
    <item>
      <title>{exp:xml_encode protect_entities="yes"}{title}{/exp:xml_encode}</title>
      <link>{title_permalink=site/index}</link>
      <guid isPermaLink="false">{title_permalink="site/index"}#id:{entry_id}#date:{gmt_entry_date format="%H:%i"}</guid>
      <description><![CDATA[{summary}{body}]]></description>
      {categories}<category>{exp:xml_encode protect_entities="yes"}{category_name}{/exp:xml_encode}</category>
      {/categories}
      <pubDate>{gmt_entry_date format="%D, %d %M %Y %H:%i %T"}</pubDate>
    </item>
{/exp:weblog:entries}
    </channel>
</rss>
{/exp:rss:feed}

Feedback

Please let me know if you have any suggestions for improvements on the basic template. I have already submitted these suggestions (as have others) on the EE Forums and I hope this article will soon be out-dated. For further information on customizing your RSS feed including adding Google Analytics tracking and additional fields such as author name, see Customizing ExpressionEngine RSS 2.0 Template at 'A Blog Not Limited' (if you use their updated template don't forget to remove the seconds from date fields in the item section).

Email protection using CSS

Email Link Protection - a graceful and user-friendly technique

This post originally appeared on my personal blog in March of 2005. It is still useful half a decade later, so I'm finally posting it here. This code is also available on Snipplr.

Any spambot worth its salt spawned by evil genius could "guess" email addresses by skimming for all 'alias@'s and automatically concatenating them with the site's url base and any other domains found on the page, so in the javascript I separate the '@' from the alias.

Enough about the wheres and whys, here's some cut n' paste fun for your use. This code will use javascript to display a normal, clickable mailto: link on your page and if javascript is turned off you still get a screen-readable, selectable email address displayed using a CSS trick.

<style type="text/css">
.backwards {
unicode-bidi:bidi-override; direction: rtl;font-weight:bold;
}
</style>
<script type="text/javascript">
<!--
linkAddy=('alias' + '@' + 'yourdomain.com')
document.write('<a href="mailto:' + linkAddy + '">' + linkAddy + '</a>')
//-->
</script>

<noscript>
<p>Please email me at
<strong><span class="backwards">moc.niamodruoy@saila</span>
</strong>.<br /></p>
</noscript>

Here's a version for you really paranoid folks that don't want live links at all (or maybe you just want a tiny user obstacle to ensure necessity of correspondence), but you do want your email address to appear correctly if possible but with better protection, plus insurance if javascript is turned off:

<style type="text/css">
.backwards {
unicode-bidi:bidi-override; direction: rtl;font-weight:bold;
}
</style>
Please email me at
<span class="backwards"><script type="text/javascript">
<!--
backAddy=('moc.niamodruoy' + '@' + 'saila')
document.write(backAddy)
//-->
</script></span>
<noscript><strong>alias at
yourdomain dot com</strong></noscript>
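
Typing an address backwards by hand is error-prone, so here is a tiny helper sketch in Python that produces the reversed string used in the snippets above. The function names are my own, and the markup it emits simply mirrors the .backwards span from this post.

```python
def reversed_for_css(address):
    """Reverse an address so the direction: rtl trick displays it correctly."""
    return address[::-1]


def backwards_span(address):
    """Wrap the reversed address in the span used by the CSS trick."""
    return '<span class="backwards">%s</span>' % reversed_for_css(address)


print(reversed_for_css("alias@yourdomain.com"))
print(backwards_span("alias@yourdomain.com"))
```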

Here's a working example:

Please email me at

Here are a few other ways to protect email addresses from spam bots. If you have a preferred method not listed here, please let me know in the comments.

  • Graceful Email Obfuscation article from A List Apart takes it up a notch, with an htaccess rewrite and redirect. Could be worth setting up for a large site where you need to add email links quickly and easily.
  • Hivelogic Enkoder Form - use their form to generate some javascript. Works great but relies on javascript to work.
  • Addressmunger.com - uses ASCII, JavaScript, and scrambling of letters in your email address.
  • K'nechtology Email Encryption - encodes email addresses into hexadecimal values. This isn't very secure as it would be extremely easy for a bot to decode hex.

Installing Using Drupal book source on Dreamhost

USE THIS INFORMATION AT YOUR OWN RISK. Any information found on this website is offered only as informational and includes no warranty, guarantees or support. The author claims no authority on any subject whatsoever.

Using Drupal, 1st Edition by O'Reilly

I just started reading the recently published book Using Drupal, 1st Edition, published by trusty ol’ O’Reilly and written by a bunch of Lullabots.

So far the book is great, but the instructions on setting up a dev environment aren’t exactly crystal clear for those completely new to Drupal. I thought I’d help out the next geek that bothers to GTS to find pitfalls before starting.

First review the preface section 2.7 ‘Downloading Drupal’. If you’ve never installed Drupal before, or any web application on a web server before, it’s a good idea to check out Lullabot Addison Berry’s very easy to follow video, Installing Drupal 6.

These instructions are specific to the context of a shared hosting account on Dreamhost, but may work for your environment as well. Be sure to review the book’s errata – the ‘confirmed errata’ will let you know about any code mistakes or problems with source discovered since the book was published. For troubleshooting help related to the book’s exercises and installation issues, review the Using Drupal book forum.

THESE INSTRUCTIONS MAY CAUSE YOUR DATAZ TO BE LOST. In addition to setting up things quickly, these steps include how to quickly delete all the stuff in your database, without bothersome ‘back up your data first’ steps. The idea is you are just creating a sandbox and nuking your install is no big deal.

Enough blab, let’s do this:

  • Create a subdomain for your dev environment like drupal.yourdomain.com
    • Dreamhost panel > Domains > Manage Domains > Add New Domain / Sub-Domain
  • Create a new database (on a new or existing subdomain)
    • Goodies > Manage MySQL, scroll to the bottom where it says ‘Create a new MySQL database:’
    • Enter something in all the fields: the database name must be unique system-wide, so get creative; create a new hostname like mysql.drupal.yourdomain.com; you can use your own username, but I created a special username and password that I’ll also be using for the admin user in the Drupal install (note: fine for a dev environment, not recommended for production); and you might want to put something in the comment field to remind you later what this is, like “Sandbox for Using Drupal Book”.
    • Write down all that DB stuff so you can use it later.
    • Wait a while for new database host name and/or new subdomain to propagate.
  • Download the latest source from the book site.
    • Extract the zip somewhere you can find it, like your desktop.
  • Change the database connection string in using_drupal_source\drupal\sites\default\default.settings.php
    • Open default.settings.php with your favorite text editor
    • Change the connection string stored in $db_url (line 92 at time of writing) from mysql://username:password@localhost/databasename to something like mysql://name:psswd@mysql.drupal.yourdomain.com/dbname
  • Copy default.settings.php in the same folder and call it settings.php
    • There should now be 2 files in the default folder
  • Save a backup copy of your default folder somewhere
    • I just copied the default folder and renamed it ‘_default’, or save somewhere on your hard drive, thumb drive, whatever…the point is, settings.php and default.settings.php that will soon live on your web server are going to change and you’re going to want these files again someday.
  • Upload the contents of using_drupal_source\drupal to the root of your new subdomain
  • chmod /sites/default to 777
    • On your web server (via your FTP client) navigate to /drupal/sites/default
    • If you’re using FileZilla, right click (ctrl+click for macs) the default folder and select file permissions, this will allow you to enter the numeric value 777 or just check read/write/execute for all roles
  • Open http://drupal.yourdomain.com
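
If you would rather assemble the $db_url value from the DB details you wrote down than edit it by hand, something like this Python sketch could do it. The function name is hypothetical, and percent-encoding special characters in the username and password is my assumption about how a URL-style connection string needs to be written; check the comments in your settings file before relying on it.

```python
from urllib.parse import quote


def build_db_url(user, password, host, database, scheme="mysql"):
    """Assemble a connection string shaped like scheme://user:password@host/database."""
    return "%s://%s:%s@%s/%s" % (
        scheme,
        quote(user, safe=""),      # encode characters such as @ : /
        quote(password, safe=""),
        host,
        database,
    )


# Placeholder values taken from the example in the steps above.
print(build_db_url("name", "psswd", "mysql.drupal.yourdomain.com", "dbname"))
```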

Installation profile options on successful install

You’ll be prompted to select an ‘Installation Profile’. The book source code includes scripts to automatically install a site with the assets and modules used in their examples for you. If you are just starting Chapter 2, where they send you off to the Appendix for installation instructions, choose the last option – the generic/basic Drupal install.

Here’s how to “start over” so you can use a different installation profile. These steps will cause you to lose any data you entered in Drupal – you will end up with a brand new install and a chance to choose a different installation profile:

  • Nuke all your database tables

    • Note I said ‘tables’, not the database.
    • Go to http://mysql.drupal.yourdomain.com (phpMyAdmin screen) and login
    • Select your Drupal database from the databases list at left (i.e. NOT information_schema)
    • Scroll to the bottom of tables listing page and select ‘Check All’, change the ‘With selected:’ drop down to ‘Drop’.
    • You will see a screen asking if you really want to execute this command. Click ‘Yes’.
  • CHMOD /sites/default to 777 again
    • On your web server, go to /sites/default
    • CHMOD default to 777 (again, because the previous install process modified the permissions) and be sure to check ‘recursive’ (or use -R on the command line) because there are new files in there and we need to blow everything in the default folder away.
  • Delete everything in /sites/default
  • Put a clean copy of default.settings.php and settings.php in /sites/default
    • Remember that saved copy of default.settings.php and settings.php?
  • Go to http://drupal.yourdomain.com/install.php and start all over again :)
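
The file-system half of those steps (recursive chmod, delete everything, restore the clean settings files) can be sketched in Python as below. This assumes you have direct access to the files rather than going through FTP, the function names are invented, and the demo runs against throwaway temp folders, not a real Drupal install. It does not touch the database tables.

```python
import os
import shutil
import tempfile


def make_writable(root):
    """Rough equivalent of chmod -R 777 on root."""
    os.chmod(root, 0o777)
    for dirpath, dirnames, filenames in os.walk(root):
        # Fix children before os.walk descends into them.
        for name in dirnames + filenames:
            os.chmod(os.path.join(dirpath, name), 0o777)


def reset_default_dir(default_dir, backup_dir):
    """Empty sites/default and restore the saved settings files."""
    make_writable(default_dir)
    for name in os.listdir(default_dir):
        path = os.path.join(default_dir, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)
    for name in ("default.settings.php", "settings.php"):
        shutil.copy2(os.path.join(backup_dir, name), os.path.join(default_dir, name))


# Demo on temporary folders standing in for /sites/default and the '_default' backup.
sandbox = tempfile.mkdtemp()
default_dir = os.path.join(sandbox, "default")
backup_dir = os.path.join(sandbox, "_default")
os.makedirs(os.path.join(default_dir, "files"))
os.makedirs(backup_dir)
for name in ("default.settings.php", "settings.php"):
    with open(os.path.join(backup_dir, name), "w") as handle:
        handle.write("<?php // clean copy\n")
with open(os.path.join(default_dir, "settings.php"), "w") as handle:
    handle.write("<?php // modified by the installer\n")
os.chmod(os.path.join(default_dir, "settings.php"), 0o444)

reset_default_dir(default_dir, backup_dir)
print(sorted(os.listdir(default_dir)))
```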

Bold moves in phpMyAdmin. Select all > drop.

I hope that was helpful for someone. I remember all too well not even knowing what chmod was and trying to write perl scripts, ouch! As you might guess, with as much time as I spend documenting when I should be doing book exercises I don’t really have time to offer support, but please let me know if anything I’ve written here needs to be corrected or elaborated on. Have fun in Drupal land!

Track user copying activity with Tracer

Tracer is a new tracking tool to see what people are copying on your website.

Tracer by Tynt.com is a new analytics tool for websites. Like Google Analytics, you install a link to an external javascript on your site and then the service tracks what images and words your visitors copy.

Read a product intro and clear step-by-step setup instructions at SideKickBlog.com. Tracer is installed there, so I’m using the site to test the script as well. I pasted some copied text from his post into this blog post while in HTML direct edit mode and this is what I got:

Tracer is a script designed to let you track this copied content and also provide all kinds of usage statistics for it. Best of all it will add an automatic attribution link back to your post so that content you have created can result in traffic to you regardless of where it resides. And to top it all it promises to let you do all this for free. Definately a must have for all the content creators out there.

Read more: "Track what’s been copied from your blog | SidekickBlog" - http://sidekickblog.com/track-whats-been-copied-from-your-blog-83.htm#ixzz09JKa0aSw

After pasting into a blank email in Outlook, a Gmail message, and a Word document, it’s clear that the script is adding the credit link to the copied item when it’s copied. Tracer will certainly fix the problem of an inadvertent failure to credit (due to laziness, lack of savvy, etc) but it will in no way prevent anyone intent on stealing your content from doing so.

The link adding is nice, but I’m hoping Tracer will really shine as I get real insight into what my sites’ visitors are copying. What do you think? Is Tracer worth its weight in script?

Installing Netbeans PHP IDE on Ubuntu

Lizard Steals Green Bean

Nothing amazing going on, just a few tips that might save you some time:

  • You need a Java runtime installed and working; you'll prolly want to run apt-cache search first to make sure you’re putting in the most recent version (6 as of this writing)

    • sudo apt-get update
    • sudo apt-get install sun-java6-jre sun-java6-plugin sun-java6-font
  • The netbeans in the repo is for the Java IDE, so don’t bother with apt-get
  • Download the install file here: http://www.netbeans.org/downloads/index.html
    • Be sure to pick the PHP bundle
  • If clicking on the netbeans-x.x-ml-php-linux.sh file gives you an error or tries to open in gedit or something, right-click > properties > permissions and check ‘allow executing file as a program’
  • Select Run (not ‘Run in Terminal’), running in terminal will throw some GTK errors
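
Checking ‘allow executing file as a program’ amounts, as far as I can tell, to setting the executable bit on the installer (like chmod +x). Here is a small Python sketch of that; the function name is made up and the demo uses a temporary file standing in for the downloaded .sh installer.

```python
import os
import stat
import tempfile


def make_executable(path):
    """Add the execute permission for user, group and others."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


# Temporary stand-in for the downloaded installer script.
with tempfile.NamedTemporaryFile(suffix=".sh", delete=False) as handle:
    installer = handle.name
os.chmod(installer, 0o644)

make_executable(installer)
print(oct(stat.S_IMODE(os.stat(installer).st_mode)))
```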

Next you might want to head over to the Netbeans website and watch the intro vid and orient yourself to the plethora of PHP-centric features.

As a fair newb to programming in PHP I can’t say I’m qualified to suggest an IDE. So why Netbeans? It’s free. It provides syntactic and semantic code highlighting for PHP and debugging through Xdebug. Folks in my Seattle PHP meetup group who know a lot more about programming than I do seem to really like it, and every time I go to install Eclipse I am daunted by the website, instructions and innumerable options. Finally, it was recommended in the recent Smashing Magazine article The Big PHP IDE Test: Why Use One And Which To Choose (2009.02.11) so I stopped resisting.

Do you like it, recommend others over it?

Google Doctype Screams “Fork ME!”

The newly released Google Doctype is intended to be the Wikipedia of web design. There’s a video introduction on the landing page of Mark Pilgrim explaining what Google has been internally calling the “Hitchhiker’s Guide to the Web”. He’s been working on Google Doctype and said it is supposed to be the cross-platform alternative to MSDN. MSDN? I don’t know any web designers that rely on MSDN as the go-to spot for quality cross-platform client-side code! Maybe they’re targeting ASP.NET developers…and that could explain the very un-wiki linear tree-style navigation.

Google Doctype Screenshot

The Good

My own private wiki, largely comprised of web development documentation for my own projects, code snippets and links to online resources, is invaluable to me – so the potential benefits of an open wiki of this nature are obvious and I’ve often wondered why there isn’t one (with critical mass) out there already. Certainly this project, or at least the idea of it, could be an invaluable tool to professional web designers and client-side developers. Some take-aways:

  • “Written by web developers, for web developers” and by that they mean client-side developers…most of the current content is specific to JavaScript DOM stuff and cross-browser CSS considerations. I think this fills a knowledge gap as a lot of CSS and even Ajax resources are designer-oriented (lacking meaty technical details) and many developer resources gloss over or ignore web standards or a lot of the details professional programmers take for granted (like finding a viewport or using javascript to manipulate classes)
  • It’s built on the Google Project framework so you can download the whole thing via SVN.
  • The licensing is pretty unrestrictive, so you could SVN everything and put it up on an intranet statically or keep an off line copy, as was mentioned in the intro video.
  • Discrete code snippets. Rather than a long tutorial with examples that are specific to a given situation, many of the HOWTOs are broken down into more abstracted uses. This style of documentation will help a lot when you’re stuck on a specific area of a bigger project. Personally, I learn more this way – I like the big step-by-step tutorials but when I cut and paste a lot I don’t retain very much.

The Ugly

Google suffers from chronic ugliness (IMHO) and this project is no exception. Don’t get me wrong, I’m a GOOG fangirl all the way, but there always seem to be some basic user interface and user experience problems with their apps/portals/projects/whatever. And here’s where I think Google Doctype has need of improvement:

  • No indication of off-site links. Not only does a link to MSDN look just like the internal links, there are links to other Google Code projects without any indication that you’re leaving Google Doctype; in fact, the logo is still Google Code. Navigation is a little confusing in general.
  • Lack of Style Guidelines. There is something to “just putting it out there” and I’m glad they did, but if a lot of people do start adding to this resource it could turn into quite a mess. It would have been ideal to have a written style established that would make sense for an open wiki. For example, statements like “generally, we recommend the following…” and “I’m not sure if this works on IE”. This type of thing would never fly on Wikipedia – now that the docs are open to the whole internets, such statements are ambiguous, lack authority and create a bad example that others are sure to follow.
  • Not really a wiki. First there’s the linear tree/node navigation pane (which seems to collapse by itself and disappear or reappear for no apparent reason). There is no discussion page (although there are comments, sort of like PHP.net), no page history (but you can manually add a free-form line to a log file, if you notice the option), and there’s no obvious way to check what links to a page. The list goes on.
  • Screaming “Fork Me”. A fork may be inevitable, and if a fork emerges using MediaWiki or any of a myriad of much more robust wiki platforms, I would be more likely to invest my time in that in spite of the Google mind share.

A Web Reference To Rule Them All

When I first read that Google published a web design wiki I was thrilled. I tried to think of other, similar resources. There are some great blogs, lists and forums out there but I’ve yet to find the one web reference to rule them all. If you know of one, please let me know! In the meantime I’m looking for domains…webwiki.com is just a db error, webwiki.net is a half-baked attempt at a wiki version of the Million Dollar Homepage. Hrm. If I come up with a load of extra time and a brilliant idea I will let you know. Until then, here are a few of my favorite web coder sites:

  • W3C.org – start at the top, right?
  • HTML Dog – very well organized reference and tutorials for CSS and (x)HTML
  • A List Apart – high quality articles published by those web standards freaks at Happy Cog.


Drupal 5.x on Ubuntu LAMP

The quick and dirty dev install of Drupal on Ubuntu

USE THIS INFORMATION AT YOUR OWN RISK. Any information found on this website is offered only as informational and includes no warranty, guarantees or support. The author claims no authority on any subject whatsoever.

Why Drupal, Why Ubuntu?

For me it's all about community. I've always enjoyed apache web development in part because of the active and helpful user groups, forums, irc channels, etc. I use Ubuntu as the operating system for my LAMP because it's really popular right now - it has a very active forum and pretty good documentation. Drupal is an open-source content management system, or you could look at it as a framework since it was built to make it easy for coders to override almost anything it does without hacking the core. This means you could make it do anything you want if you happen to be good enough at PHP and still take advantage of core development and security updates no matter how much you modify the product.

Why write installation instructions?

Good question. Well, the installation instructions at Drupal.org are good but they cover all sorts of environments (who wants to slog through all that?) and those in the Ubuntu Community Docs are great and pretty specific but cover Drupal 4.6.7 and 5.1. I probably should update the docs at Ubuntu; perhaps I will after I hash it out here and after a few people let me know whether these worked or what to change. Also, I like to search for instructions specific to my situation whenever I approach a new installation. It's good to see what other people in similar circumstances have encountered. I call it due diligence. I would suggest any user doing this install review the documentation mentioned above thoroughly. Also see related links at the end of this article.

Environment

These instructions don't cover the setup of your server environment. Mine happens to be:
  • Ubuntu 6.06 LTS server
  • Apache 2.0.55
  • PHP 5.1.2
  • MySQL 5.0.22

Get Drupal

wget http://ftp.drupal.org/files/projects/drupal-5.7.tar.gz
tar -zxvf drupal-5.7.tar.gz

I'm a big fan of apt-get but there were a lot of issues in the forum started by people having problems with Drupal in the repositories. Community Docs recommend getting the latest package from Drupal.org, right now that happens to be Drupal 5.7. (Drupal 6 is out now as well, and is very cool, but CCK/Views aren't ready for prime-time and I'm installing for the purposes of following tutorials written for 5.x.)

Move Drupal

sudo mkdir /var/www/drupaltest
sudo mv drupal-5.7/* drupal-5.7/.htaccess /var/www/drupaltest
sudo mkdir /var/www/drupaltest/files

My apache install is pretty much set up with the default config. /var/www is my web root, yours may vary. Because I'm just using this particular install as a test which I plan on destroying later, I'm going to put it in the boring old subdirectory 'drupaltest'. Actually I named mine d57_test_01 but thought drupaltest would be more comprehensible in the example.

In the mv command we explicitly move .htaccess because it's a hidden file.
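If you want to see why the explicit .htaccess is needed, here is a throwaway sketch you can run anywhere. It assumes nothing about your Drupal files; the temp directory and file names are made up for the demo, and it relies on the shell's default behavior of not matching dotfiles with `*`.

```shell
# Throwaway demo in a temp directory: '*' skips dotfiles such as .htaccess.
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/dest"
touch "$demo/src/index.php" "$demo/src/.htaccess"

# Move with the glob alone; the hidden file is left behind.
mv "$demo/src"/* "$demo/dest"
left_behind=$(ls -A "$demo/src")
echo "left behind: $left_behind"

# Naming the hidden file explicitly moves it too.
mv "$demo/src/.htaccess" "$demo/dest"
ls -A "$demo/dest"
```

A Drupal install without its .htaccess tends to show up later as broken clean URLs, so it's worth the extra argument.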

Database Setup

mysqladmin -u root -p create drupaltest
mysql -u root -p

Create the database for Drupal to use - you can replace 'drupaltest' with whatever you'd like to call the database, as long as you use the same name in the GRANT statement and in settings.php below. You'll need to enter your MySQL root password. If you get an access denied error make sure you're using the MySQL root password and not your login or Ubuntu root password. The second command puts you in the MySQL monitor, the command line interface for managing your MySQL server. The commands in the next code section are SQL. You could also run this in phpMyAdmin if you'd rather have a GUI.

GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, INDEX, ALTER, CREATE TEMPORARY TABLES, LOCK TABLES ON drupaltest.* TO 'drupal_usr'@'localhost' IDENTIFIED BY 'secretpassword';
FLUSH PRIVILEGES; \q

Change the database name, the username 'drupal_usr' and 'secretpassword' to whatever you like. Just don't forget to write them down somewhere safe because you'll need them later.

Edit Settings.php

sudo vi /var/www/drupaltest/sites/default/settings.php

Using vi (or whatevs) change the $db_url line. Note: If you use any fancy characters or dashes in your user, password or database names, replace them with URI hex encodings. This is detailed in the database settings comments section of the settings.php file.

$db_url="mysql://drupal_usr:secretpassword@localhost/drupaltest";
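For the hex encoding mentioned above, here is a minimal sketch of a helper that does the replacing for you. The function name `urlencode` and the sample password are hypothetical, and it assumes plain ASCII input. It encodes everything except letters, digits, '.', '_' and '~', dashes included, which is more than strictly necessary but harmless.

```shell
# Hypothetical helper: percent-encode every character that is not a
# letter, digit, '.', '_' or '~'. Assumes ASCII input.
urlencode() (
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    s=$rest
    case $c in
      [a-zA-Z0-9._~]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;   # e.g. '@' becomes %40
    esac
  done
  printf '%s\n' "$out"
)

# Example with a made-up password containing '@', '/' and '-'.
encoded=$(urlencode 'p@ss/word-1')
echo "mysql://drupal_usr:${encoded}@localhost/drupaltest"
```

Paste the resulting string into the $db_url line by hand; the password you give MySQL in the GRANT statement stays unencoded.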

Up Your PHP Memory Allocation

If you have a new LAMP install the default memory setting for scripts is 8M. This is redonkulous and Drupal will suck. Look for the 'Resource Limits' section and change memory_limit to 32M and then restart apache.

sudo vi /etc/php5/apache2/php.ini
sudo /etc/init.d/apache2 restart
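If you'd rather not hunt through php.ini in vi, the same change can be sketched with sed. The helper name `set_memory_limit` is my own invention, and the demo below runs against a scratch file, not the real /etc/php5/apache2/php.ini. Note that it replaces the whole line, so any trailing comment on it is dropped.

```shell
# Hypothetical helper: rewrite the memory_limit line in a php.ini-style file.
set_memory_limit() {
  ini=$1
  limit=$2
  cp "$ini" "$ini.bak"    # keep a backup next to the original
  sed "s/^memory_limit *=.*/memory_limit = $limit/" "$ini.bak" > "$ini"
}

# Try it on a scratch copy first.
scratch=$(mktemp)
printf '%s\n' \
  '[PHP]' \
  'max_execution_time = 30' \
  'memory_limit = 8M      ; Maximum amount of memory a script may consume' \
  > "$scratch"

set_memory_limit "$scratch" 32M
grep '^memory_limit' "$scratch"
```

On the real file you would need root (for example by running the whole thing under sudo), and apache still has to be restarted afterwards.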

Final Steps

Go to http://localhost/drupaltest/install.php (or your servername instead of localhost if DNS is setup). You should see this:

Screenshot of Drupal Installation Message

One last thing: if you click Administer you will probably get a 'one or more problems were detected' error message. Two things: your files directory isn't writable and your cron job hasn't run. The first one is easy - just make the files area writable by all:

sudo chmod 777 /var/www/drupaltest/files

As for cron, you can just click 'run cron manually' on the Status report page - but you'll need to do this anytime you want to update the index. For a quick dev install you're likely to trash soon it may not be necessary but for a production or long-term dev install you'll want to set up a cron job to hit http://localhost/drupaltest/admin/logs/status/run-cron every few minutes depending upon your site's traffic and requirements. See Configuring cron jobs in Drupal's Getting Started guide for more.
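For the long-term case, a crontab line along these lines is one way to do it. This is a sketch under a few assumptions: I'm pointing wget at the stock cron.php in the Drupal root (the admin URL above needs a logged-in session, so a bare wget can't use it), the 15-minute interval is just an example, and the site lives at http://localhost/drupaltest as in the rest of this article.

```shell
# Example crontab entry: fetch cron.php quietly every 15 minutes.
# The interval and the URL are assumptions; adjust both to your site.
cron_line='*/15 * * * * wget -O - -q -t 1 http://localhost/drupaltest/cron.php'
echo "$cron_line"

# One way to append it to the current user's crontab (left commented out
# here so nothing gets installed by accident):
#   ( crontab -l 2>/dev/null; echo "$cron_line" ) | crontab -
```

After installing it, `crontab -l` should list the new line, and the Status report page should stop complaining once the first run has happened.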

That's it. Good luck folks, now enjoy surfing the Drupal learning curve...heheh.

Related Links


Kill is Your Friend

Ah, slowly I am earning my beans in Ubuntu land. Tonight I had my first legitimate need to use kill and even found out what a PID number was and EVEN read my LOG FILES!! Wow. If you’re just learning Apache admin stuff and just can’t reload or restart your web server, head on over to /var/log/apache2 and check out your error.log file. You prolly shouldn’t randomly kill .pids but if you’re getting a repeated httpd error like mine: “httpd (pid 5347?) not started” it could be that the server was manually shut down (oops, old Windows debugging tactic) and the process just needs to be murdered.

kill 5347
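If you want to practice before murdering anything important, here is a harmless sketch that uses a background `sleep` as a stand-in for the stuck process. Nothing here touches apache. On a real box you'd get the PID from the error log, as I did, or from something like `ps aux | grep apache2`.

```shell
# Stand-in for a stuck process: a background sleep we can safely kill.
sleep 300 &
pid=$!
echo "started PID $pid"

kill "$pid"                          # sends SIGTERM, the polite default
wait "$pid" 2>/dev/null || true      # reap it so the PID really goes away

# kill -0 sends no signal; it only checks whether the PID still exists.
if kill -0 "$pid" 2>/dev/null; then
  state="still running"
else
  state="gone"
fi
echo "process is $state"
```

If a process ignores the plain kill, `kill -9` is the bigger hammer, but it gives the process no chance to clean up, so try the polite version first.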

Worked like a charm. Another hint for you fellow newbs…maybe reading an Apache admin book would actually SAVE you time, eh? All I needed to do to enable my dern rewrite was to change AllowOverride None to AllowOverride all in my sites-enabled config. Hmmm….I bet a quick doc on configuring sites in Apache2 would have saved me all that. But then I wouldn’t have ever found out how to kill stuff on linux…so then again, maybe it was all worth it.



Awesome drawing by pure evil bunny on Flickr.


Installing Plone 3.0 on Ubuntu 6.06

Plone on Ubuntu
Plone 3.0.2 on a 32-bit Ubuntu 6.06 LTS ‘Dapper Drake’ LAMP

What is Plone? Plone is a pretty cool content management system (CMS). Well, I think it’s cool based on some videos and documentation on their site. After comparing a lot of CMS options (including some really expensive closed-source stuff and 2 weeks of developer training on Sharepoint) I’m trying out Plone because a) open source is cool, b) Google people are smart, c) it supports LDAP authentication which I need for my all-M$ work enviro, d) there’s training/conferences/books/commercial support!? available and e) I’ve read some great reviews and CMS comparisons that rated it #1.

That was the good, now the most common complaint: hellish learning curve. Well, that’s if you’re a developer and you’re going to customize beyond the existing system and available plugins. Why? Well, you’ll have to learn some Python, and the database is Zope’s rather obscure object database (ZODB) instead of MySQL or some other SQL database. In fact, I just read this on 456 Berea St (one of my fav dev blogs) and it gave me the shivers:

And the inital learning curve, even though I was a fairly good python programmer, was insane.

Then again, people told me Perl was insane too and I loved working with Perl. Besides, we’re just playing right now, right?

What is Ubuntu? - Ubuntu is a community developed, linux-based operating system that I use for a web server. ‘Dapper Drake’ is the release name for 6.06 LTS. (Keyword: LTS – long term support). It’s surprisingly easy to install. If you’re bored, geeky and/or have extra VMs or boxes lying around you should try a LAMP install and get your web server up in 15 minutes!

There’s already a Plone From Scratch HOWTO on the Ubuntu Forums, however it is for a previous version of Plone (2.0.5) and uses the apt-get method, which I’ve noticed a lot of people in the forums having trouble with. Disclaimer: I do not provide ‘undo/uninstall’ instructions, other than: whip up a LAMP on VMware and test it first, or back up your system. I would welcome uninstall directions if someone who knows what they are doing could provide them.

Doesn’t Plone provide installation instructions? Well, yes. But when I started to read them I got real nervous. What the heck is gcc? libssl? TLS? etc. Those poor geeks at Plone don’t realize how noob noob can be! I tried to find better instructions (for me) at the Ubuntu Forums and instead I found a lot of people having problems trying to install Plone via Synaptic or apt-get instead of using Plone’s instructions. So here ya go.

You should already have gcc, g++, make and tar installed by default, but if you want to be anal do this:

sudo apt-get update
sudo apt-get install gcc g++ make tar

Sidetrack > (Possibly totally unnecessary step!) If you’re a fair server noob like me you have no idea what libssl and readline libraries and development headers are, and what the heck is TLS? Google suggests it has something to do with mail encryption. I read some confusing stuff on Ubuntu Forums about RPM and a program called Alien that converts RPMs to DEB packages. It seemed probably unrelated, but like a handy utility to have, and so:

sudo apt-get install alien

Download the Plone Installer Package. I saved it in my home folder – it doesn’t really matter where you put it so long as you can find it.

wget -c http://plone.googlecode.com/files/Plone-3.0.2-UnifiedInstaller-Rev2.tar.gz
tar zxf Plone-3.0.2-UnifiedInstaller-Rev2.tar.gz
cd Plone-3.0.2-UnifiedInstaller

Here’s where you have to pay attention. You can do a ZEO or Stand-Alone install. “WTF?” you say? Yeah, well just pick your poison after reading all about it. Me, I’m doing a Stand-Alone as root install.

According to the README.TXT in the installation package, my choice of a standalone instance installed as root will result in Plone being installed to /opt/Plone-3.0.2 and the libz and libjpeg libraries getting built in /usr/local. A “plone” user will be added and Zope will be configured to run under that user id. You need to start Zope as root user (via sudo).

sudo ./install.sh standalone

Then a whole bunch of stuff happens. Lots of gcc (compiling) and checking for things (lots of yes and a few no-s in my case). Just wait, watch, hail Mary and knock on wood. This is a good time to read about the rest of this tutorial or creating new Zope/Plone instances if you want to do that. Or just catch up on your RSS feeds and I’ll hold your hand some more ;o)

If you’re real lucky, this is what you’ll see when it’s all said and done:


#####################################################################
###################### Installation Complete ######################
Use the account information below to log into the Zope Management Interface
The account has full 'Manager' privileges.
Username: admin
Password: XXXXXXX
Before you start Plone, you should review the settings in:
/opt/Plone-3.0.2/zinstance/etc/zope.conf
Adjust the ports Plone uses before starting the site, if necessary
To start Plone, issue the following command in a Terminal window:
sudo /opt/Plone-3.0.2/zinstance/bin/zopectl start
To stop Plone, issue the following command in a Terminal window:
sudo /opt/Plone-3.0.2/zinstance/bin/zopectl stop
Plone successfully installed at /opt/Plone-3.0.2
See /opt/Plone-3.0.2/zinstance/adminPassword.txt
for password and startup instructions
Ask for help on plone-users list or #plone
Submit feedback and report errors at http://dev.plone.org/plone
This installer was created by Kamal Gill (kamalgill at mac.com)
Maintainers for Plone 3 are Kamal Gill and Steve McMahon (steve at dcn.org)

First, write down your admin password!! Then, check zope.conf to ‘review settings’. I’m not familiar with them, so I just scanned for the port. Found it on line 969 (YMMV), set to 8080, the default port a Zope/Plone standalone install grabs.

sudo vi /opt/Plone-3.0.2/zinstance/etc/zope.conf

Use :q! to quit vi without messing anything up
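If scrolling to line nine-hundred-something in vi isn’t your thing, grep can find the port for you. This is only a sketch: I’m assuming the port sits on an `address` line inside an `<http-server>` block, which is how my zope.conf looked, and the demo below runs on a scratch file standing in for the real /opt/Plone-3.0.2/zinstance/etc/zope.conf.

```shell
# Scratch file standing in for zope.conf (layout is an assumption).
conf=$(mktemp)
cat > "$conf" <<'EOF'
<http-server>
  # valid keys are "address" and "force-connection-close"
  address 8080
</http-server>
EOF

# -n prints line numbers; anchoring on leading spaces skips comment lines.
grep -n '^ *address' "$conf"

# Pull out just the number from the first matching line.
port=$(sed -n 's/^ *address  *\([0-9][0-9]*\).*/\1/p' "$conf" | head -n 1)
echo "Zope HTTP port: $port"
```

Against the real file you’d swap `$conf` for the zope.conf path (with sudo if your user can’t read it).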

Use netcat to see what ports you have open already. (Netcat comes with Ubuntu install)

nc -z -v -w2 localhost 1-65535

After confirming that 8080 is avail (or changing it in zope.conf if it is not available) continue following directions.

sudo /opt/Plone-3.0.2/zinstance/bin/zopectl start #that last part is an L not a 1, took me a while...

Then go to: http://localhost:8080 (or sub servername for localhost to test from other computer). If you’re lucky AGAIN you’ll see the Zope Quick Start page!! Wahoo! Hii Fiiiiveh to self! Then look at the example site: http://localhost:8080/Plone, and then check out the management interface at http://localhost:8080/manage. Have fun. I hope it was as good for you as it was for me!

BTW: I suggest the Ubuntu Support Forum and Plone Support Forum for help. I am totally new at server administration, I prolly shouldn’t even publish this and I definitely can’t support your lazy arse!


Website Contact Pages

Contact Page vs. Mailto Link

Oh the contact page. So boring, so obligatory. And not as simple as it may seem. I was hoping to jazz up the contact page at BuildCarbonNeutral.org with some sort of slick Ajax contact form. You see, I built the site in a really big hurry and could spare no time for extras like protecting raw email addresses. By the way, email address protection is not an extra, usually! It’s something I meant to rectify as soon as possible and sure enough, our general contact alias is already receiving spam. I thought I might take the email address off completely and post a contact form instead.

The Problem with Contact Forms

Even the best form is an obstacle. Users don’t like filling out forms and what’s more, you introduce an opportunity for error. Everyone commits a typo now and again, and what if someone sends you information you’d really like to follow up on but lo and behold, their email address bounces. Even if you add the extra email confirmation input (make the user enter it twice), there’s still the case of people using an incorrect email address just to harass you. But really it all comes down to user experience. Don’t make your user fill out a form if they don’t have to.

The Simple and Sincere Mailto Link

So it’s back to the good ol’ mailto link for me. The added benefit is people can save the email address in their contact list of choice and can format the email and send attachments if they choose. An email link is more personal, less corporate. Of course you all know that any email address present in the code of a public website is crawlable by spambots. Therefore be sure to put measures in place to protect all email addresses!
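One cheap example of such a measure is writing the address as HTML numeric entities, so the raw address never appears in the page source. Here is a minimal sketch; the function name `entity_encode` and the address are made up, it assumes ASCII input, and smarter spambots can decode entities, so treat it as a speed bump and not a lock.

```shell
# Hypothetical helper: turn each character into an HTML numeric entity.
# Assumes ASCII input; browsers render the result as the original text.
entity_encode() (
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    s=$rest
    out=$out$(printf '&#%d;' "'$c")   # e.g. '@' becomes &#64;
  done
  printf '%s\n' "$out"
)

# Example with a made-up address.
addr=$(entity_encode 'hello@example.org')
echo "<a href=\"mailto:$addr\">$addr</a>"
```

Visitors still see and click a normal mailto link; only the page source looks like gibberish.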

There’s Always an Exception

Sometimes you really should use a form. A common use for them is on high-traffic sites where they actually want to make it a little harder for users to get in contact. This approach is especially prevalent on sites that offer a product or service that results in a lot of support email and they want to encourage users to troubleshoot their own problem using existing documentation (FAQs, support forums, etc) before contacting the company/authors directly. Some sites don’t provide contact info at all for this reason. Chances are though, if your site is for a small business or is personal, you want to make it easier for people to contact you.
