Testing ThinkingSphinx with Test::Unit and Transactional Fixtures
February 13, 2010
I first started using Sphinx about a year ago. At The Auteurs we use it to solve some thorny localization and filtering issues, and we have been pushing its limits almost since the beginning. Our film index in particular is quite complex, hitting ten tables and defining two dozen attributes. I’ve been impressed with how flexible it is with a pretty basic set of primitives. But one thing that never had an obvious solution was testing. I just finished the third revision of our test harness, and I thought I’d share my approach because it runs a bit contrary to the current zeitgeist in Ruby testing circles.
Integration vs Unit Testing
I’m a big believer in integration testing, and when we first set up Sphinx we really wanted to test it end-to-end. Our first two harnesses reflected this approach. We ended up with a fairly elegant method to start and stop Sphinx on classes that needed it. This gave us good test coverage, but eventually cracks started to show. We already needed to speed up our tests, as I outlined a couple of weeks ago, and integrated Sphinx support was one of the worst offenders:
Startup time for Sphinx is on the order of a couple of seconds, which adds up over multiple tests
Indexing time is also significant, and it adds up even faster if you reindex for every individual test
We’re regularly adding new models to be indexed by Sphinx, which magnifies these sources of overhead
Our indexes are getting more complex, which means exercising them well at the integration level is too slow
Rails foxy fixtures create sparse ids which Sphinx is not in love with
Even worse, if the ID overflows Sphinx’s internal key size (which I think depends on whether it is compiled as 32-bit or 64-bit), those records are silently dropped
We rely on transactional fixtures for performance reasons, and they cannot be turned off efficiently on a case-by-case basis to solve the previous problems
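To make the sparse-id problem concrete, here is a small sketch. It assumes fixture ids are derived by hashing the fixture label, which is how foxy fixtures avoid hand-numbered ids. The `label_to_id` helper and the 32-bit limit check are illustrative stand-ins of my own, not Rails or Sphinx code.

```ruby
require "zlib"

# Hypothetical stand-in for a label-hashing id scheme (not Rails' own code).
def label_to_id(label)
  Zlib.crc32(label.to_s)
end

# Largest id that fits in an unsigned 32-bit key (assumed limit, for illustration).
MAX_32_BIT_ID = 2**32 - 1

def fits_32_bit_ids?(id)
  id >= 0 && id <= MAX_32_BIT_ID
end

# Hashed ids land all over the key space instead of counting up from 1.
%w[will farmer bubby].each do |label|
  puts "#{label} -> #{label_to_id(label)}"
end

# A wider hash value would not fit, so it should be caught before indexing.
puts "2**40 fits? #{fits_32_bit_ids?(2**40)}"
```

A check like `fits_32_bit_ids?` could run over fixture ids before indexing, so that an overflow fails loudly instead of silently dropping records.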
Designing Sphinx Tests for Performance
Clearly Sphinx can be quite complex and good test coverage is critical. But what constitutes ‘good’? The meat of Sphinx is simply querying some efficiently indexed data. Verifying the results that ThinkingSphinx#search returns allows you to test that everything between the index definition and the query is working as expected. There are a lot of moving parts under there, so it’s reassuring to have a solid sanity check that your indexes are doing what you think they’re doing.
Given that Sphinx is a high-performance full-text search engine, there’s no reason you can’t run a lot of tests quickly. I decided that, for the health of our test suite and testing discipline, we would stub out Sphinx in all our other tests and move the Sphinx tests into their own suite of highly efficient tests covering the actual indexes. In the end we lose a bit of regression coverage over the actual Sphinx calls in the application, but that is a comparatively small surface area for potential problems, and a fair price for encouraging better testing of the complex part. Those calls are also not likely to be subtly broken by some distant change, and if breakage does make it to production, exception notification has our back.
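As a rough sketch of what stubbing out Sphinx in the rest of the suite can look like, the plain-Ruby example below swaps a model’s search method for canned results. The `Film` class, its `search` method and the `stub_search` helper are hypothetical stand-ins; in a real suite a mocking library would usually do this job.

```ruby
# Hypothetical model whose search would normally talk to the Sphinx daemon.
class Film
  def self.search(query)
    raise "searchd is not running"
  end
end

# Replace klass.search with canned results for the duration of the block,
# then put the original method back, even if the block raises.
def stub_search(klass, results)
  original = klass.method(:search)
  klass.define_singleton_method(:search) { |*_args| results }
  yield
ensure
  klass.define_singleton_method(:search) { |*args| original.call(*args) }
end

stub_search(Film, ["Playtime"]) do
  puts Film.search("tati").inspect # prints ["Playtime"]
end
```

The point is only that application tests never need a running daemon; whether the canned results are realistic is the job of the dedicated Sphinx suite.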
Step One: New Test Suite
In order to isolate these tests from fixtures and minimize startup overhead, I moved all Sphinx tests to test/sphinx and wrote this Rake task:
namespace :test do
  task :sphinx => 'ts:config_test' do
    puts "! Starting Sphinx by rake"
    silence_stream(STDOUT) { Rake::Task["thinking_sphinx:start"].invoke }
    begin
      Rake::Task["test:sphinx_without_daemon"].invoke
    ensure
      puts "! Stopping Sphinx by rake"
      silence_stream(STDOUT) { Rake::Task["thinking_sphinx:stop"].invoke }
    end
  end

  Rake::TestTask.new(:sphinx_without_daemon) do |t|
    t.libs << "test"
    t.pattern = 'test/sphinx/*'
    t.verbose = true
  end
end

namespace :ts do
  task :config_test do
    # This is a horrible sin against humanity, but it ensures that A) the test config is
    # always up to date, B) doesn't require devs remembering to add RAILS_ENV=test when
    # they call rake and C) doesn't incur a penalty of shelling out to 2nd env
    ENV["RAILS_ENV"] = 'test'
    # This must be run after the test environment (as opposed to as a prereq) is
    # forced or sphinx gets the wrong env.
    Rake::Task["environment"].invoke
    puts "! Configuring Sphinx by rake"
    silence_stream(STDOUT) { Rake::Task["thinking_sphinx:configure"].invoke }
  end
end

task "test" => ["test:sphinx"]
The main task here is test:sphinx, which first ensures that the test config is up to date (by means of some fairly brutal methods, as noted in the comments), then starts Sphinx, runs the actual tests, and stops Sphinx again even if the tests fail.
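Stripped of Rake, the shape of that task is a start/run/stop wrapper whose stop step sits in an ensure clause. Here is a minimal sketch of the pattern, using a hypothetical `FakeDaemon` in place of searchd:

```ruby
# Hypothetical stand-in for the search daemon; it only records what happened.
class FakeDaemon
  attr_reader :events

  def initialize
    @events = []
  end

  def start
    @events << :start
  end

  def stop
    @events << :stop
  end
end

# Start the daemon, run the block, and always stop the daemon afterwards.
def with_daemon(daemon)
  daemon.start
  begin
    yield
  ensure
    daemon.stop
  end
end

daemon = FakeDaemon.new
begin
  with_daemon(daemon) { raise "a test blew up" }
rescue RuntimeError => e
  puts "tests failed: #{e.message}"
end
puts daemon.events.inspect # prints [:start, :stop]
```

Without the ensure, a failing test run would leave the daemon behind, and the next run would find it already started.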
Step Two: SphinxTestCase
To finish the optimization and make the test-writing experience smooth, a custom test case was required. The following has 3 purposes:
Start and stop Sphinx, but only if it’s not already started globally by Rake
TRUNCATE the tables this test uses so that leftover fixture data gets cleared and ids start at 1 again
Instantiate a set of test data, but only once per test class
The result:
class SphinxTestCase < ActiveSupport::TestCase
  cattr_accessor :sphinx_started_within_test_case
  class_inheritable_accessor :sphinx_tables_to_index

  def self.indexes_tables(*args)
    self.sphinx_tables_to_index = args.map(&:to_s).map(&:tableize)
  end

  # We use inherited hook to define this only on the concrete test class otherwise the method gets called twice.
  def self.inherited(subclass)
    def subclass.suite(*args)
      mysuite = super
      def mysuite.run(*args)
        # When running via rake we only start sphinxd once
        unless ThinkingSphinx.sphinx_running?
          # If running by rake and somehow sphinx doesn't start we should know about it
          puts "! Starting Sphinx on per-test basis"
          silence_stream(STDOUT) { ThinkingSphinxTestHelper.start! }
          SphinxTestCase.sphinx_started_within_test_case = true
        end
        super
        if SphinxTestCase.sphinx_started_within_test_case
          print "\n! Stopping Sphinx on per-test basis"
          silence_stream(STDOUT) { ThinkingSphinxTestHelper.stop! }
        end
      end
      mysuite
    end
  end

  # Instance variables set here are automatically set on each individual test instance
  def self.setup_database(&block)
    @database_setup_block = block
  end

  def setup_fixtures
    unless sphinx_tables_to_index.present?
      raise "Tables to be cleared must be defined using indexes_tables on the test class"
    end
    @@sphinx_test_case_already_loaded ||= {}
    unless @@sphinx_test_case_already_loaded[self.class]
      sphinx_tables_to_index.each do |table|
        ActiveRecord::Base.connection.execute("TRUNCATE #{table}")
      end
      db_setup_block = self.class.instance_variable_get(:@database_setup_block)
      raise "setup_database was not called for #{self.class}" unless db_setup_block
      db_setup_block.call
      ThinkingSphinxTestHelper.index!
      @@sphinx_test_case_already_loaded[self.class] = true
    end
    self.class.instance_variables.each do |ivar|
      self.instance_variable_set(ivar, self.class.instance_variable_get(ivar))
    end
  end

  def teardown_fixtures
    # Do nothing.
  end
end
Test class methods
Two class methods, indexes_tables and setup_database, let a SphinxTestCase define its environment in lieu of fixtures. The first takes a list of symbols à la standard Rails class methods; the second takes a block that inserts a set of database rows.
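For readers less used to this style of class-level declaration, here is a stripped-down sketch of the idea in plain Ruby. `MiniCase` and `PeopleCase` are hypothetical, and the sketch uses a plain `to_s` where the real class uses ActiveSupport’s `tableize`.

```ruby
class MiniCase
  class << self
    attr_reader :tables_to_index, :database_setup_block

    # Takes a list of symbols, in the style of Rails class methods.
    def indexes_tables(*names)
      @tables_to_index = names.map(&:to_s)
    end

    # Stores the block without running it; the test case decides when to call it.
    def setup_database(&block)
      @database_setup_block = block
    end
  end
end

class PeopleCase < MiniCase
  indexes_tables :users, :emails
  setup_database { @rows = ["Will", "Farmer"] }
end

puts PeopleCase.tables_to_index.inspect # prints ["users", "emails"]

# The block was written in the class body, so @rows lands on the class itself.
PeopleCase.database_setup_block.call
puts PeopleCase.instance_variable_get(:@rows).inspect # prints ["Will", "Farmer"]
```

Note that plain class-level instance variables like these are not inherited by subclasses, which is one reason the real class reaches for class_inheritable_accessor instead.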
Once-per-class hook
The self.inherited block is a bit hard to parse, but it’s essentially just a way to run something once per test class, similar to RSpec’s before(:all). It has to be called in the inherited callback because otherwise the super call ends up recursing and the code executes twice.
This idea was adapted from work by James Adam, thanks James!
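Here is the same trick boiled down to plain Ruby, as a sketch rather than the real Test::Unit machinery. `MiniSuite`, `MiniTestCase` and `ExampleTest` are hypothetical stand-ins; the part that carries over is defining singleton methods that call super from inside the inherited hook.

```ruby
# Hypothetical suite object: runs each test name and records it in a log.
class MiniSuite
  def initialize(test_names)
    @test_names = test_names
  end

  def run(log)
    @test_names.each { |name| log << "ran #{name}" }
  end
end

class MiniTestCase
  def self.suite
    MiniSuite.new(public_instance_methods(false).grep(/\Atest_/).sort)
  end

  # Called once for every subclass, so the wrapping happens only on concrete test classes.
  def self.inherited(subclass)
    super
    def subclass.suite
      mysuite = super
      # Wrap this one suite object's run with once-per-class setup and teardown.
      def mysuite.run(log)
        log << "before all"
        super
        log << "after all"
      end
      mysuite
    end
  end
end

class ExampleTest < MiniTestCase
  def test_a; end
  def test_b; end
end

log = []
ExampleTest.suite.run(log)
puts log.inspect # prints ["before all", "ran test_a", "ran test_b", "after all"]
```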
Override fixtures setup
The setup_fixtures and teardown_fixtures methods are defined by ActiveSupport::TestCase. They are completely overridden here, since there’s nothing efficient that can be done with transactional fixtures enabled anyway. Instead we just verify that some tables are specified, truncate them, set up the data, and index it. Because this happens only once per test class, and the setup block runs in the context of the class rather than of a test instance, we have to copy the instance variables over to each test instance to provide the usual semantics.
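The load-once-then-copy mechanics can be shown without Rails. This is a minimal sketch under that simplification; `SharedDataCase` and `AccountCase` are hypothetical, and a string stands in for a database row.

```ruby
class SharedDataCase
  @@already_loaded = {}

  def self.setup_database(&block)
    @database_setup_block = block
  end

  def setup_fixtures
    unless @@already_loaded[self.class]
      # The block was written in the class body, so the instance variables it
      # sets end up on the class object, not on this test instance.
      self.class.instance_variable_get(:@database_setup_block).call
      @@already_loaded[self.class] = true
    end
    # Copy the class-level instance variables onto this test instance.
    self.class.instance_variables.each do |ivar|
      instance_variable_set(ivar, self.class.instance_variable_get(ivar))
    end
  end
end

class AccountCase < SharedDataCase
  LOADS = [] # records how many times the setup block has run

  setup_database do
    LOADS << :loaded
    @user = "Will"
  end

  def user
    @user
  end
end

first = AccountCase.new
second = AccountCase.new
first.setup_fixtures
second.setup_fixtures
puts AccountCase::LOADS.size # prints 1
puts second.user             # prints Will
```

Both instances end up pointing at the very same object, which is exactly the data sharing the next section discusses.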
Isn’t this data sharing A Bad Thing?
This design means that adding or deleting data within tests will be error prone, since it can affect other tests. However, for Sphinx you would need to reindex anyway, and because Sphinx is all about querying full datasets, it made sense to me to set up one big set of test data and then write multiple tests to query against it. The worst case is that you simply need to create a new test class to set up a new context, which I personally find no more distasteful than contexts nested 3 or 4 levels deep.
Why not just truncate all tables?
That’s not a bad idea actually. Forgetting to specify tables tends to lead to bizarre duplicate key errors. However in our database we have 150 tables, so I decided to require selectivity. YMMV.
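If you do want to go the truncate-everything route, here is a minimal sketch of building the statements from a full table list. In a Rails app the list would come from something like `ActiveRecord::Base.connection.tables`; the `truncate_statements` helper and the skip list are assumptions of mine for illustration.

```ruby
# Tables that should survive a truncate-everything pass (assumed skip list).
SKIPPED_TABLES = ["schema_migrations"].freeze

# Build one TRUNCATE statement per table, leaving out the skipped ones.
def truncate_statements(all_tables, skipped = SKIPPED_TABLES)
  (all_tables - skipped).sort.map { |table| "TRUNCATE #{table}" }
end

puts truncate_statements(["users", "schema_migrations", "emails"])
# prints:
# TRUNCATE emails
# TRUNCATE users
```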
Where’s the code?
This seemed like too specific a use-case to release in any formal fashion, but let me know if you disagree.
Step Three: Write Your Tests
Here’s an example using FactoryGirl and Shoulda:
class UserSphinxTest < SphinxTestCase
  indexes_tables :users, :emails

  setup_database do
    @user = Factory(:user, :first_name => "Will", :last_name => "Fulignoranz")
    @user2 = Factory(:user, :first_name => "Farmer", :last_name => "William")
    @user3 = Factory(:user, :first_name => "Bubby", :last_name => "")
  end

  should "find will" do
    results = User.search('will')
    assert results.include?(@user)
    assert results.include?(@user2)
    assert ! results.include?(@user3)
  end

  should "find ignorance" do
    assert_equal @user, User.search('fulignoranz').first
  end

  should "not find the drummer" do
    assert_equal 0, User.search('friendly fred').size
  end
end
Results
The Sphinx suite is 19 tests and 36 assertions, and it finishes in 7 seconds. The full test suite is 20 seconds faster with 25 more assertions, and the marginal cost of additional Sphinx tests is now negligible.