In our Verity days, we used a UDF called VerityClean (still available at CFLib) that did a lot of the grunt work of cleaning keywords for Verity. In fact, if you read the description of the UDF, it says: “strips all invalid characters and word combinations from a search string to prevent verity from crashing.” Awesome, right?
Well, when we moved to Solr, there was no equivalent. Solr can be very picky about query syntax, so it really helps to have a UDF that:
- Replaces comma with OR
- Strips double spaces
- Strips bad characters
- Cleans up sequences of space characters
- Uppercases Solr terms like AND, OR, etc.
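To make the steps above concrete, here is a rough sketch of the same idea in Python. It is not the SolrClean code from CFLib. The function names simply mirror the UDF names, and the operator list and the set of characters treated as "bad" (the Lucene query parser's special characters) are assumptions chosen for illustration.

```python
import re

# Assumption: treat the Lucene/Solr query-syntax special characters as "bad".
# The real UDF may strip a different set.
BAD_CHARS = re.compile(r'[+\-&|!(){}\[\]^"~*?:\\/]')

# Assumption: the boolean operators we want uppercased.
SOLR_OPERATORS = {"and", "or", "not"}


def ucase_words_for_solr(text):
    """Uppercase boolean operators so Solr reads them as operators, not terms."""
    return " ".join(
        word.upper() if word.lower() in SOLR_OPERATORS else word
        for word in text.split()
    )


def solr_clean(keywords):
    """Apply the cleanup steps from the list above, in order."""
    cleaned = keywords.replace(",", " OR ")          # comma becomes OR
    cleaned = BAD_CHARS.sub(" ", cleaned)            # strip bad characters
    cleaned = re.sub(r"\s+", " ", cleaned).strip()   # collapse runs of spaces
    return ucase_words_for_solr(cleaned)             # and/or/not -> AND/OR/NOT


print(solr_clean("coldfusion,  solr and (verity)"))
# prints: coldfusion OR solr AND verity
```

Note that this sketch does not handle edge cases such as a leading comma, which would leave a dangling OR at the start of the query.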
I just submitted SolrClean and a sister UDF uCaseWordsForSolr to CFLib. Enjoy!
UPDATE: The submissions to CFLib were never approved or simply disappeared. I’ll put them on GitHub.
UPDATE 2: The submissions are now available on CFLib. I am preparing a separate post on them.
Comments on: "Lessons Learned: Moving from Verity to Solr (Part 7)" (4)
Is it possible to post “SolrClean” and “uCaseWordsForSolr” on your site or elsewhere? I don’t see them on CFLib. Thank you.
Yeah, somehow they never made it past Ray. I’m going to make some additional improvements and put them up on GitHub. Stay tuned, Gabriel.
Sami, I’d be very interested in this once you are able to get it posted. Sounds like exactly what I was looking for.
Thanks.
CFLib now has my UDFs.