Web Software Architecture and Engineering – Life on the Bleeding Edge

I’m a bit behind, but it seems a couple of weeks ago Ray Camden finally approved my two CFLib entries for working with Solr.

As I mentioned in my previous posts, this is critical if you are moving from Verity to Solr, or you happen to come up against one of Solr’s many little quirks.

I’m going to take a moment and go through the UDFs here for your benefit.

First, grab them here: http://cflib.org/udf/solrClean, and http://cflib.org/udf/uCaseWordsForSolr.

Second, you’ll note that SolrClean sounds like the venerable VerityClean UDF. It’s basically meant to do something similar: take your input and sanitize it for Solr. Also, SolrClean relies on uCaseWordsForSolr.

SolrClean essentially takes your text and does the following:

  • replaces any commas with OR – so happy,sad => happy OR sad.
  • strips any double spaces
  • strips bad characters
  • cleans up sequences of space characters
  • upper cases reserved words

The last one is especially critical since Solr can treat reserved words differently based on the case used. So we change and to AND, or to OR, and that is what the uCaseWordsForSolr is all about.
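To make the steps above concrete, here is a rough sketch in Python rather than ColdFusion. It is not the code from CFLib: the function names are just Python-style stand-ins, and both the list of "bad" characters and the list of reserved words are my assumptions for illustration.

```python
import re

# Assumed reserved words; the actual UDF may handle a different set.
RESERVED_WORDS = ("and", "or", "not")

# Assumed set of characters to strip; adjust to whatever trips up your Solr setup.
BAD_CHARS = re.compile(r'[+\-!(){}\[\]^"~*?:\\/&|]')


def ucase_words_for_solr(text: str) -> str:
    """Upper-case reserved words, matching whole words only."""
    pattern = r"\b(" + "|".join(RESERVED_WORDS) + r")\b"
    return re.sub(pattern, lambda m: m.group(1).upper(), text, flags=re.IGNORECASE)


def solr_clean(text: str) -> str:
    """Sanitize user-entered keywords before handing them to Solr."""
    text = re.sub(r"\s*,\s*", " OR ", text)    # commas become OR
    text = BAD_CHARS.sub(" ", text)            # strip bad characters
    text = re.sub(r"\s+", " ", text).strip()   # collapse runs of spaces
    return ucase_words_for_solr(text)          # and -> AND, or -> OR


print(solr_clean("happy,sad"))
print(solr_clean("cats  and dogs!"))
```

The word-boundary match matters here: without it, a term like "sandy" would have its "and" upper-cased too.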

As a caveat – I am seeing some issues with the code, and it may or may not have to do with the UDFs. If there are any updates, I’ll let you know. My plan is to put everything up on GitHub anyways. I am also planning to work with a vendor who will take our Solr install to a whole new level by implementing, among other things: synonyms, field weighting, master/slave setup with replication, upgrading to the latest Solr version, “More Like This” functionality, caching/performance tweaks, paging search!, and so much more – so stay tuned.

Short 5 minute video.

Our ColdFusion-based SaaS application sends out roughly 8,000 – 10,000 emails per day, and even more on the weekends.

I wrote the code in 2007, and since then have worked hard on improving memory usage. We optimized queries, re-worked string concatenation to use Java’s StringBuffer and StringBuilder classes, and so much more. But to tell you the truth, we always wanted to work with some other software to do the hard work for us. We have templates, to which we pass in variables and shoot that off to a slice of our clients.

Originally, we worked with WhatCounts back in 2008 to get this done. Unfortunately, their API wasn’t as mature as we thought, and batch processing was a pain. I spoke with the CEO on the phone back then, and we decided after working together for months to part ways.

I have to say our current solution is a work of art, but I’d like more. It would be nice to get email analytics – information on opens, clicks, etc. We could do that internally, but it’s not our core competency. So for a long time, I kept my eyes open for any new companies I could work with.

Originally, I thought that maybe I could make do with the ConstantContacts and VerticalResponses of the world. But they are not built for this type of work, and are geared more for marketing campaigns.

Fortunately, there are some new players in the field. I narrowed the field down to half a dozen players, and did proofs of concept with the top 3: SendGrid, PostmarkApp, and PostageApp. They all feature APIs and could meet some of my needs. What wasn’t clear was how mature the feature set was. But the great thing about them was they allowed for no-cost trials, so I went ahead and tried them.

My primary use case was the following. Remember when you first learned Microsoft Word, and one of the neat features was mail merge? You could create a template or form, pass in an address book, for example, and it would create a ton of letters. Well, that is the sort of use case I had with email. I wanted the 3rd-party system to house my templates, and for me to pass in, via API, the users who would receive the emails and the variables with the content they would receive. Simple, I thought – I mean, I could write something like that in ColdFusion if I had to. The great pluses were the deliverability improvements, anti-spam measures, and of course the analytics and logging.
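For anyone who hasn’t seen a mail merge, here is roughly what I mean, sketched in Python with the standard library’s string.Template. The template text, the field names, and the merge function are all made up for illustration; none of this is any vendor’s API, and a real service would keep the template on its side and take the recipients and variables over the wire.

```python
from string import Template

# Hypothetical template; in the setup I was after, the 3rd-party service would store this.
TEMPLATE = Template("Hi $first_name, your $report report is ready.")


def merge(template, recipients):
    """Return one (email address, rendered body) pair per recipient."""
    # substitute() ignores extra keys such as "email" and raises KeyError
    # if a variable used by the template is missing from the recipient.
    return [(r["email"], template.substitute(r)) for r in recipients]


recipients = [
    {"email": "ann@example.com", "first_name": "Ann", "report": "weekly"},
    {"email": "bob@example.com", "first_name": "Bob", "report": "monthly"},
]

for address, body in merge(TEMPLATE, recipients):
    print(address, "->", body)
```

The merge itself is the easy part, which is why the deliverability and analytics were what I was really shopping for.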

I’ll be featuring a few posts on my experience working with the 3. There was a clear winner, and some astonishing things that became quite apparent with a few – especially those written in Rails.

Does something like this interest you?

I have a love/hate relationship with MS SQL Server. While it’s easy to use and set up, it’s a pain to grow and manage. The tools seem to cover only 20% of the normal use case.

Anyways, wouldn’t it be nice if MS SQL Server actually told you where you are missing indexes, and where performance can be improved?

Well, it can! Actually, it won’t tell you, unless you ask… so to ask, run this script, and it will even have a column with the CREATE INDEX script for you!

    SELECT
      migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans) AS improvement_measure,
      'CREATE INDEX [missing_index_' + CONVERT (varchar, mig.index_group_handle) + '_' + CONVERT (varchar, mid.index_handle)
      + '_' + LEFT (PARSENAME(mid.statement, 1), 32) + ']'
      + ' ON ' + mid.statement
      + ' (' + ISNULL (mid.equality_columns,'')
        + CASE WHEN mid.equality_columns IS NOT NULL AND mid.inequality_columns IS NOT NULL THEN ',' ELSE '' END
        + ISNULL (mid.inequality_columns, '')
      + ')'
      + ISNULL (' INCLUDE (' + mid.included_columns + ')', '') AS create_index_statement,
      migs.*, mid.database_id, mid.[object_id]
    FROM sys.dm_db_missing_index_groups mig
    INNER JOIN sys.dm_db_missing_index_group_stats migs ON migs.group_handle = mig.index_group_handle
    INNER JOIN sys.dm_db_missing_index_details mid ON mig.index_handle = mid.index_handle
    WHERE migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans) > 10
    ORDER BY migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans) DESC

More details found here: http://blogs.msdn.com/b/bartd/archive/2007/07/19/are-you-using-sql-s-missing-index-dmvs.aspx

Funny YouTube Error

Last night I was surfing the web, and a link to YouTube produced this error. I didn’t know YouTube had a sense of humor.

500 Internal Server Error

Sorry, something went wrong.

A team of highly trained monkeys has been dispatched to deal with this situation.

If you see them, show them this information:

rrZoqK6l_OdI2aD1dfUKpDwIdFdKIqrsB5EiOabd9b6sZftRKsQ_ez4bbHX5
enfo3gc1yBk9FSlv4xPMszzkH57tvn7obwqf4c2m7arpc0l62SWCD7OYOS8N
oP63B1vnQ9-cF4Z2LSEULxIRPXTIqcFViH6mKFEkUESAJkk-XAVQIRyFl_dR
30-T_NpmaRnQpp8F0G8Er6gGr9UWAvMRf-0xs9NaJvlrDNaPQxD5usrrFlK1
sAQL348hfn63L32YNjwPq9UUaWhYKTDVTTtZIEAhLleVdQDyHviMJnvw5NXn
v0Obd71d-EvDr06f_U6cok2qf-GAsQK1QmnJFdBb7oZCds6h5tf6Xg2WBgoo
UtQ4UMPwMmwMt2ZulpBO3n_VczRTefRgtQWqdafCKeQrFesLRwR5Zz_93ieG
PlsQHS5TYFJKOZV6z-RHwYbpEuec-BTCxzTnB897-z82Yrj25P3zHEf_TMLu
EZbQLPfam9WvVyAu8zqfkWOnHC09SqvyHogT9K943D_6NMupprDjpA9w6Scy
1QWqLqCHDmmn6trQtcVnXd4CSCk1teIehd40W2XJ86OLl8nRFvQRiYJm2bFq
KcILGQz9LlGkpC2111qtzJEcy-5Kx5I-mEU2CKfzPaRAHluXXIKp7ipiFUFA
aTmgZjJ-X8cNDQZsJbJTzt8ZQkOpRVUKZC9tvTQPQI51aF7kKBwme91NuIF3
1lX33yWZnyzvFBV7QUhliR4S7J8L2NDzBYCBYO4vhJI_T3Ms_JpKAk_aVeH6
YNRNLqRpbn1ZPNukPv7ZGgV9SLn2vCLphsga8ueUzK-bPYbqBBykPkZEOQQA
CgxfHuZIJYwEOEPO5cfH9UPI2qgXpv4yNOdu-fyDONxeU98R5a6FCyc0vBZy
ZAvCPduakOGJ2cbSl7MxQzVwzr-g0l-LnUCe7TU6_OObVyTvzbIqR1cvksqJ
bq17NxKVDjROt8VkAq-WlyWYpy3ZvRQb9XCsEdm7WM68-RbOTzCy3fp87s6a
DzbzqZCLdJ7UL9SuTLtd0sZN22VauO30xIoFGb4SDvJcWtq9g5XPJGqVvHp-
--RJR7dbyAaJlWZmHamlj4XvSATX1MEXIA07Yrp5gjBwtoJGKVxppwKOerbx
2qf9R-26Ju6dxg1I4DnXw56sDmdcsLI2_V_pJhquhJdTGsMkS9VZdXku-cJR
ijOfTKjxk9ZsLu_QK6yNKg874Re_yR8H-8p6OiURYB830Z9I7JVfAFrv_Tef
Mea_uyrxV-mK_gJ3P_7eXNv1qVxPOSiqtwM1sCnUuqHaaczWX7fUD43-lYXr
CYNhjI8gA2PBcCgIDAKlY93b2xmCrvGTnX55goUYBR_fKLd3hOV1R1-k4lUY
YugyE0dFhHOXeiPB0AcRY4SaS148qix-mRUkYkElgu9wXjvaCqNIgpwBSAgN
ceibhkg906ekhc0iPYcgr5y5DG2hUQ5jqBOnDA9LXfQLifWLA1ou7BMHoko9
8UmnJLbz44U_FGdXLy4u2dJKMluUgFEwvolmZxU4rGAIh1HBr3wcdzzNV4x5
EGgbUEQFa8w5YpQZ-z_wIiev4gbtx7qmCXWih1mDVZq42-ReJR56IQBqx_Un
3LVFFT739PhYlRfh3QG3Miw=

I’ve moved my blog! Please make sure to update your RSS/ATOM feeds. More details forthcoming.

Like JSLint? Try JSHint!

JSHint is a new fork of JSLint.
To quote them:
“JSHint is a fork of Douglas Crockford’s JSLint that is designed to be more flexible than the original. Our goal is to make a tool that helps you to find errors in your JavaScript code and to enforce your favorite coding style. We realize that people use different styles and conventions, and we want our tool to adjust to them. JSHint will never enforce one particular convention. JSHint is developed and supported by the JavaScript developer community and we welcome feedback from everybody who cares about the language.”
We’re definitely going to start working on auditing and enforcing JS code through a mechanism like JSHint.

We have settled really well on Git, and have branches for all developers and each environment, like I described earlier.
However, what we don’t have is an easy way to deploy a git branch, or simply execute a git pull command remotely. I don’t want to do it using SSH and Git BASH, which seems to be what Google turns up.
I looked to see if I could whip up a home-grown ColdFusion-based solution. I looked at Git.cfc, however it lacks any teeth. You can execute a “git pull” command, but the code doesn’t have any mechanism to load git keys, so the code errors out. I also looked at the JGit JAR file, and interfacing with that, but there is no documentation – only a link to forums.
Has anyone built anything that they found useful? Or, if you use Git, how do you effectively issue an update command to, for example, a dev or QA server?

In our Verity days, we used a UDF called VerityClean (still available at CFLib) that did a lot of the grunt work of cleaning keywords for Verity. In fact, if you read the description of the UDF, it says: “strips all invalid characters and word combinations from a search string to prevent verity from crashing.” Awesome, right?
Well, in moving to Solr, there was no equivalent. Solr can be very picky, so it rocks when you have a UDF that:

  • Replaces comma with OR
  • Strips double spaces
  • Strips bad characters
  • Cleans up sequences of space characters
  • Uppercases Solr terms like AND, OR, etc.

I just submitted SolrClean and a sister UDF uCaseWordsForSolr to CFLib. Enjoy!

UPDATE: The submissions to CFLib were never approved or simply disappeared. I’ll bring it to GitHub.

UPDATE 2: The submissions are now available on CFLib. I am preparing a separate post on them.

These past 4 months have been quite arduous.
We launched a whole new upgrade to our ColdFusion-based SaaS product that included a new UX/UI re-do (10 month project!), moving from Verity to Solr (6 months!), and so much more.
After many restless nights and numerous 60-hour weeks, I’m back. Will be posting more shortly!
