Web Software Architecture and Engineering – Life on the Bleeding Edge

I’m a bit behind, but it seems a couple weeks ago Ray Camden finally approved my two CFLib entries for working with Solr.

As I mentioned in my previous posts, this is critical if you are moving from Verity to Solr, or you happen to come up against many of Solr little quirks.

I’m going to take a moment, and go through the UDFs here for your benefit.

First, grab them here: http://cflib.org/udf/solrClean, and http://cflib.org/udf/uCaseWordsForSolr.

Second, you’ll note that SolrClean sounds like the venerable VerityClean UDF. Its basically meant to do something similar: take your input and sanitize it for Solr. Also, SolrClean relies on uCaseWordsForSolr.

SolrClean essentially takes your text and does the following:

  • replaces any commas with OR – so happy,sad => happy OR sad.
  • strips any double spaces
  • strips bad characters
  • cleans up sequences of space characters
  • upper cases reserved words

The last one is especially critical since Solr can treat reserved words differently based on the case used. So we change and to AND, or to OR, and that is what the uCaseWordsForSolr is all about.

As a caveat – I am seeing some issues with the code, and it may or may not have to do with the UDFs. If there are any updates, I’ll let you know. My plans is to put everything up on GitHub anyways. I am also planning to work with a vendor who will take our Solr install to a whole new level implementing, among others: synonyms, field weighting, master/slave setup with replication, upgrading to the latest Solr version, “More Like This” functionality, caching/performance tweaks, paging search!, and so much more – so stay tuned.


Comments on: "Lessons Learned: Moving from Verity to Solr (Part 8)" (6)

  1. Thanks for your marvelous posting! I definitely enjoyed
    reading it, you might be a great author.I will be sure to bookmark your blog
    and may come back someday. I want to encourage you to definitely continue your great writing, have a nice weekend!

  2. Did you get any further into “More Like This” functionality? I’d be very interested in reading about that

  3. Now that Solr 4.9 is out there, have you made any progress with any 4.* version of Solr?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: