Web Software Architecture and Engineering – Life on the Bleeding Edge

In our Verity days, we used a UDF called VerityClean (still available at CFLib) that did a lot of the grunt work of cleaning keywords for Verity. In fact, the UDF's description says: “strips all invalid characters and word combinations from a search string to prevent verity from crashing.” Awesome, right?
Well, in moving to Solr, there was no equivalent. Solr can be very picky, so it rocks when you have a UDF that:

  • Replaces comma with OR
  • Strips double spaces
  • Strips bad characters
  • Cleans up sequences of space characters
  • Uppercases Solr terms like AND, OR, etc.
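
To make the steps above concrete, here is a rough sketch in Python. It is not the SolrClean or uCaseWordsForSolr code submitted to CFLib (those are ColdFusion UDFs). The function names `solr_clean` and `ucase_words_for_solr`, the operator set (AND, OR, NOT) and the set of "bad" characters (assumed here to be the Lucene query-syntax specials) are all assumptions made for illustration.

```python
import re

# Assumption: "AND, OR, etc." is taken to mean these three operators.
SOLR_OPERATORS = {"and", "or", "not"}

# Assumption: "bad characters" are the query-syntax specials
# (+ - ! ( ) { } [ ] ^ " ~ * ? : \ / & |); the real UDF may use another set.
BAD_CHARS = re.compile(r'[+\-!(){}\[\]^"~*?:\\/&|]')


def ucase_words_for_solr(text):
    """Uppercase whole words that match an operator; leave other words alone."""
    return " ".join(
        word.upper() if word.lower() in SOLR_OPERATORS else word
        for word in text.split()
    )


def solr_clean(keywords):
    """Hypothetical cleanup pass covering the five steps in the list above."""
    # Replace bad characters with a space so neighbouring words stay separate.
    without_bad = BAD_CHARS.sub(" ", keywords)
    # Split on commas, collapsing double spaces and longer whitespace runs
    # inside each piece.
    parts = (" ".join(part.split()) for part in without_bad.split(","))
    # Join the non-empty pieces with OR, so a stray comma cannot leave a
    # dangling operator at either end of the query.
    joined = " OR ".join(part for part in parts if part)
    # Uppercase operator words the user typed in lowercase.
    return ucase_words_for_solr(joined)


print(solr_clean("cold  fusion, solr and (verity)"))
# prints: cold fusion OR solr AND verity
```

In this sketch, bad characters are stripped before the comma split so that a piece made up only of punctuation disappears instead of producing a trailing OR.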

I just submitted SolrClean and a sister UDF uCaseWordsForSolr to CFLib. Enjoy!

UPDATE: The submissions to CFLib were never approved or simply disappeared. I’ll bring it to GitHub.

UPDATE 2: The submissions are now available on CFLib. I am preparing a separate post on them.

Comments on: "Lessons Learned: Moving from Verity to Solr (Part 7)" (4)

  1. Gabriel said:

    Is it possible to post “SolrClean” and “uCaseWordsForSolr”
    on your site or elsewhere? I don’t see them on CFLIB. Thank you.

    • Yeah, somehow they never made it past Ray. I’m going to make some additional improvements and put them up on GitHub. Stay tuned Gabriel.

  2. Dean Lawrence said:

    Sami, I’d be very interested in this once you are able to get it posted. Sounds like exactly what I was looking for.

    Thanks.
