Blog Archives

Want Open Search Integration in Your Website…?

Over the past few weeks, Tatham Oddie, Damian Edwards and I have been working on publishing a framework/toolkit for integrating OpenSearch into any search-enabled ASP.NET website. I’m pleased to announce we have finally hit a release!

The project is available at opensearchtoolkit.codeplex.com. Tatham has a great post on his blog about how to integrate it into your site.

OpenSearch is a technology that already has widespread support across the web and is now getting even more relevant with Internet Explorer 8’s Visual Search feature and the Federated Search feature in the upcoming Windows 7 release.

Now it’s time to make it even easier. Ducas Francis, one of the other members of my team, took on the job of building out our JSON feed for Firefox as well as our RSS feed for Windows 7 Federated Search. More formats, more fiddly serialization code. Following this, he started the OpenSearch Toolkit: an open-source, drop-in toolkit for ASP.NET developers to use when they want to offer OpenSearch.

Today marks our first release.

So get on over to CodePlex, hit up Tatham’s blog for instructions and drop the toolkit into your website so you can take advantage of all the coolness that is OpenSearch.

Discovering Search Terms

More trawling through old code I had written brought this one to the surface. One of the requirements of the system I’m working on was to intercept a 404 (Page Not Found) response, determine whether the referrer was a search engine (e.g. Google) and, if so, redirect to a search page with the search term. Intercepting the 404 was quite easily done with an HTTP module…

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Web;

namespace DemoApplication
{
    public class SearchEngineRedirectModule : IHttpModule
    {
        HttpApplication _context;

        public void Dispose()
        {
            if (_context != null)
                _context.EndRequest -= new EventHandler(_context_EndRequest);
        }

        public void Init(HttpApplication context)
        {
            _context = context;
            _context.EndRequest += new EventHandler(_context_EndRequest);
        }

        void _context_EndRequest(object sender, EventArgs e)
        {
            string searchTerm = null;
            // Redirect only when the request was a 404 and a search term was found in the referrer.
            if (HttpContext.Current.Response.StatusCode == 404
                && (searchTerm = DiscoverSearchTerm(HttpContext.Current.Request.UrlReferrer)) != null)
            {
                HttpContext.Current.Response.Redirect("~/Search.aspx?q=" + searchTerm);
            }
        }

        public string DiscoverSearchTerm(Uri url)
        {
            …
        }
    }
}

Implementing DiscoverSearchTerm isn’t that difficult either. We just have to check search engine statistics to see which engines are most popular, then examine the URL each one produces when performing a search. Luckily for us, most are quite similar in that they use a very simple format that has the search term as a parameter in the query string. The search engines I analysed were Live, MSN, Yahoo, AOL, Google and Ask. The search term parameter of these engines was named either “p”, “q” or “query”.

Now, all we have to do is filter for all the requests that came from a search engine, find the search term parameter and return its value…

public string DiscoverSearchTerm(Uri url)
{
    string searchTerm = null;
    // Dots are escaped so they match a literal "." rather than any character.
    var engine = new Regex(@"(search\.(live|msn|yahoo|aol)\.com)|(google\.(com|ca|de|(co\.(nz|uk))))|(ask\.com)");
    if (url != null && engine.IsMatch(url.Host))
    {
        var queryString = url.Query;
        // Remove the question mark from the front and add an ampersand to the end for pattern matching.
        if (queryString.StartsWith("?")) queryString = queryString.Substring(1);
        if (!queryString.EndsWith("&")) queryString += "&";
        var queryValues = new Dictionary<string, string>();
        var r = new Regex(
            @"(?<name>[^=&]+)=(?<value>[^&]+)&",
            RegexOptions.IgnoreCase | RegexOptions.Compiled
        );
        string[] queryParams = { "q", "p", "query" };
        foreach (Match match in r.Matches(queryString))
        {
            var param = match.Groups["name"].Value;
            // Keep only the first occurrence of each candidate parameter;
            // Dictionary.Add throws if the same key is added twice.
            if (queryParams.Contains(param) && !queryValues.ContainsKey(param))
                queryValues.Add(param, match.Groups["value"].Value);
        }
        if (queryValues.Count > 0)
            searchTerm = queryValues.Values.First();
    }
    return searchTerm;
}

The above code uses two regular expressions, one to filter for a search engine and the other to separate the query string. Once it’s decided that the URL is a search engine’s, it creates a collection of query string parameters that could be search parameters and returns the first one.

Unfortunately, there wasn’t enough time in the iteration for me to properly match each search engine with its correct query parameter. In practice the search parameter usually appears quite early in the query string, so it’s fairly safe to assume that the first match is correct.
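For what it’s worth, the per-engine matching described above might look something like the sketch below. It is only a sketch: the class name SearchTermSketch is hypothetical, and the host patterns and the parameter paired with each engine (for example “p” for Yahoo and “query” for AOL) are assumptions that should be checked against real referrer URLs before use. Like the original method, it returns the value exactly as it appears in the query string, still URL-encoded.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SearchTermSketch
{
    // Assumed host-to-parameter mapping; verify against real referrers before relying on it.
    static readonly KeyValuePair<Regex, string>[] Engines =
    {
        new KeyValuePair<Regex, string>(new Regex(@"(^|\.)google\.(com|ca|de|co\.(nz|uk))$"), "q"),
        new KeyValuePair<Regex, string>(new Regex(@"^search\.(live|msn)\.com$"), "q"),
        new KeyValuePair<Regex, string>(new Regex(@"^search\.yahoo\.com$"), "p"),
        new KeyValuePair<Regex, string>(new Regex(@"^search\.aol\.com$"), "query"),
        new KeyValuePair<Regex, string>(new Regex(@"(^|\.)ask\.com$"), "q"),
    };

    public static string DiscoverSearchTerm(Uri url)
    {
        if (url == null) return null;

        foreach (var engine in Engines)
        {
            if (!engine.Key.IsMatch(url.Host)) continue;

            // Split "?a=1&b=2" into "a=1" and "b=2", then each pair into name and value.
            var pairs = url.Query.TrimStart('?')
                .Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries);
            foreach (var pair in pairs)
            {
                var parts = pair.Split(new[] { '=' }, 2);
                if (parts.Length == 2 && parts[0] == engine.Value && parts[1].Length > 0)
                    return parts[1];
            }

            // Known engine, but its search parameter was not present.
            return null;
        }

        // Referrer is not one of the listed search engines.
        return null;
    }
}
```

With this version a Yahoo referrer that happens to carry both “p” and “q” resolves to “p”, instead of whichever candidate was added to the dictionary first.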

Randomly Sorting a List using Extension Methods

I was trawling through some old code I had written while doing some “refactoring” and came across this little nugget. I wanted to sort a list of objects that I was retrieving from a database using LINQ to SQL into a random order. Seeing as extension methods are all the rage, I decided to use them…

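using System;
using System.Collections.Generic;
using System.Linq;

public static class ListExtensions
{
    public static IEnumerable<T> Randomise<T>(this IEnumerable<T> list)
    {
        // OrderBy computes a key once per element, so sorting on a random key shuffles the sequence.
        var rand = new Random();
        return list.OrderBy(l => rand.Next());
    }
}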

How does it work…? It adds a Randomise() extension method to any IEnumerable<T> (e.g. List<T>) and uses the OrderBy function to sort the elements by a randomly generated number.

var randomCategories = context.Categories.Randomise();

The above code applies the Randomise function to the Category objects retrieved from the context and assigns the randomly ordered sequence to randomCategories. Because OrderBy is evaluated lazily, the reordering actually happens when randomCategories is enumerated.
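To try this without a database, here is a small self-contained sketch using an in-memory list. The RandomiseDemo class and the category names are made up for illustration. Calling ToList() is a design choice here: it runs the sort once and fixes the order, whereas enumerating the lazy result a second time may produce a different order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListExtensions
{
    public static IEnumerable<T> Randomise<T>(this IEnumerable<T> list)
    {
        var rand = new Random();
        return list.OrderBy(l => rand.Next());
    }
}

public static class RandomiseDemo
{
    // Hypothetical stand-in for context.Categories.
    public static List<string> ShuffledCategories()
    {
        var categories = new List<string> { "Books", "Music", "Films", "Games" };

        // ToList() materialises the shuffled sequence so the order stays fixed afterwards.
        return categories.Randomise().ToList();
    }
}
```

The result always contains the same four items; only their order changes from call to call.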