Earlier this week I was trying to wrangle a new nonprofit search engine. Ever since, I've felt vaguely uncomfortable about the whole thing, as I'm a non-programmer in a programmer's land. But I've also decided that my non-programming background has its own value when I play around with new technology tools: I'm not a techie, so if I can do something, then other non-tech types can, too. Plus, I love a challenge, even if I end up looking idiotic in the end.
At any rate, Allen of Nonprofit Tech Blog has been working on a search engine too, with much better results. I think that's because Pligg also allows for community ratings, which make for a richer search experience. Also, Allen knows better what he's doing.
In the process, he's uncovered some challenges to creating a search engine and suggested some parameters that we should all be considering in any tool that gets created:
- "We should have a common easily accessible database of tagged sites and resources. As long as we use del.icio.us as one of our main tag databases, we will not be able to achieve this end. Basically, I want to traverse the entire database without using awkward programming workarounds that limit the ability of our peers who are not programmers from accessing or analyzing that data.
- Any API to the data source should allow us to extract the following pieces of information from each tag:
  - URL of the resource
  - tagger ID or username
  - date tagged
  - multiple tag list but with nptech as the sole required tag
- It should be just as easy and as well-supported as our current workflow. This means support for newly tagged resources going into the database and RSS feeds coming out of the database in the same manner as we are currently working. It would also allow for enhancements to our workflow such as filtering and the opportunity to edit the tag after the tag has been presented to the community.
- Tagged site lists should be easily converted into XML or OPML.
- We should have a way to provide for the re-editing of nptech tags and probably, a Digg-like interface to help us evaluate the more popular of these tags."
All of that should be done WITHOUT screen-scraping, through API method calls alone.
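To make the list a little more concrete, here is a rough sketch in Python of what one tagged record and the OPML conversion might look like. This is only an illustration, not Allen's design or any real service's API: the `TaggedResource` class, the `to_opml` function, and the `tagger` and `dateTagged` attribute names are all made up for the example. It simply holds the four pieces of information from the list, insists on the nptech tag, and writes the records out as OPML outlines.

```python
from dataclasses import dataclass
from datetime import date
from typing import List
import xml.etree.ElementTree as ET

REQUIRED_TAG = "nptech"  # the sole required tag, per the list above


@dataclass
class TaggedResource:
    """Hypothetical record holding the four fields from the list."""
    url: str
    tagger: str
    date_tagged: date
    tags: List[str]

    def __post_init__(self):
        # Reject any record that is missing the required tag.
        if REQUIRED_TAG not in self.tags:
            raise ValueError(f"every resource must carry the {REQUIRED_TAG!r} tag")


def to_opml(resources, title="nptech tagged resources"):
    """Convert a list of TaggedResource records into an OPML string."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for r in resources:
        ET.SubElement(body, "outline", {
            "type": "link",
            "text": r.url,
            "url": r.url,
            "category": ",".join(r.tags),
            # "tagger" and "dateTagged" are invented, non-standard
            # attributes used here only for illustration.
            "tagger": r.tagger,
            "dateTagged": r.date_tagged.isoformat(),
        })
    return ET.tostring(opml, encoding="unicode")


if __name__ == "__main__":
    example = TaggedResource(
        url="https://example.org/",
        tagger="alice",
        date_tagged=date(2007, 1, 15),
        tags=["nptech", "search"],
    )
    print(to_opml([example]))
```

The point of the sketch is that once the data is available as plain records like these, producing XML or OPML takes only a few lines, with no screen-scraping involved.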
I highly suggest a trip to Allen's place for a read of the rest and to make your contributions to his suggested list.