Picky: Plumbing Overview Tweet
This is a (admittedly a bit ranty and chaotic, but bear with me – recipes will follow) post in the Picky series on its workings.
I’ve gotten a lot of feedback on Picky. Many people write in to tell me how cool everything looks, but often I don’t hear how it is working out later.
This led to me wondering if Picky is initially attracting users, but then losing them due to missing simple recipes on how everything is put together.
Out of thin air I get this feedback:
“for those just looking to get a glance at how the model, view and controller layers are set up for Picky there isn’t much in your docs to give that high-level glance. […] but there wasn’t anything in there […] detailing the actual plumbing that ties the app and data to picky.” (ellipses mine)
He’s right.
There is the overview image on the getting started page, but it isn’t very clear on how everything fits together.
There is also the best practices setup in the Wiki, but that does not really show any code, just how it is connected on an abstract level.
So, let me clear up a few things. This is the current state of how Picky is used:
We have multiple areas:
- The Picky server (gem
picky
) is a standalone server. You can send it HTTP requests and it will return HTTP responses with a JSON body. - The Picky Client (gem
picky-client
) is a way to query the server comfortably using Ruby instead of having to put together the queries yourself. - You use this Picky Client in your webapp to get result ids from the server.
- Picky also offers a Javascript interface that can display rendered results and a result count. The results need to be rendered in the webapp, the server only returns result ids.
The absolute best way to see all this in code and in action is to try the getting started. If you haven’t tried it, do so now, run it, and take a look at the code (especially in the server app/application.rb
, in the client app.rb
, the Sinatra app).
Picky is ORM agnostic
(This part is divided into my reasoning/ranting ;) for not offering ORM support and code examples on how to handle this)
The ORM rant
Most people trying Picky for the first time are expecting some sort of ActiveRecord or other ORM integration.
Let me tell you upfront: There is none. Yes, no requiring a gem and slapping on a module in Picky.
Why? Many other search engine Ruby adapters offer some sort of nice ORM support, which lets me easily search and find data.
While I would love to provide some sort ORM integration, let me tell you why I don’t support an ORM (yet):
It costs a lot of effort/resources to do right and I wanted to spend that time for making Picky good and have a great Javascript user interface.
Since for me the hard part is not the loading the data from some model into the index (that is mostly easy), but making a really good user interface and having the data indexed and searched really correctly.
I always felt that comfortable ORM integrations, while being comfortable, mostly hide the way your data is indexed.
They provide you an easy solution to an easy problem.
If your data is hard to index, your data might be too complicated, too normalized.
Picky on the other hand, gives you the power of doing searching right. In Ruby.
Because search engines never work the same:
- The last search engine you built simply had different data.
- There always will be edge cases, people not finding their data. Ever ran
rake 'try[some words]'
in the server directory? This will tell you exactly how Picky indexes these words, or preprocesses them before searching. - There always will be the pointy haired boss finding the way to your desk, asking why his best friend doesn’t find X, but Y instead. This can be shown, integration tested and fixed in minutes. Result: Friend finds X.
Although it might be enticing to have a search set up really fast, it is most of the time paid later: When all is about making the search work really well and edge cases crop up (due to the fact that most data is rather freeform).
Then again, you might not care about all these edge cases or having a really good search. Then again, why are you reading this exactly?
BIG BUT
Let me say though that I see the appeal of having an ORM integration, and the next few months may see our efforts shifted towards having a Picky ORM integration. This is a result of a long discussion with Karel Minařik, aka Mr. Tire.
It will probably take place first in the form of having a flexible external interface in the server through which data is sent and indexed.
The indexing definition would still be in the server, but the selection and sorting of data would be in the Rails / Sinatra etc. application.
In short:
- Your webapp selects and sorts the data, sending it to the server.
- The Picky server indexes your data.
But I need to think about this – your feedback is much appreciated!
How to index your Rails data
There are many ways to index your data. See the part under Flexible Sources which explains how to use the #each
method on your models to index.
Whatevs, pickle face! I want to index my models!
Don’t give in to the rage. Ruby is your Jedi weapon.
A few suggestions.
You have a model Book
in your Rails app.
class Book < ActiveRecord::Base
# your supermodel
end
and you’d like to reuse this in Picky.
Try this:
# Get the model.
#
require "#{PICKY_ROOT}/../rails_app/app/models/book"
# Get the database configuration from the Rails app.
#
db_config = YAML.load(File.open("#{PICKY_ROOT}/../rails_app/config/database.yml"))
# Establish a connection using the right environment.
#
Book.establish_connection db_config[PICKY_ENVIRONMENT]
# Utilize the #each method on e.g. Book.some_named_scope to index.
#
book_index = Index::Memory.new :book_each do
source Book.order('title ASC')
category :title
category :author
# ...
end
Yes, sometimes the models are much more complicated, using acts_as_something
(or the modern versions thereof) and class methods from them.
In that case, either require your rails app/environment, or just load the data from the database:
Relationship status: It’s complicated
Sometimes you need to index a complex combination of data (with a JOIN
or so). For this you can use a database source in the server:
book_index = Index::Memory.new :book_each do
source Sources::DB.new(
'SELECT b.id, b.title, a.name
FROM books b INNER JOIN authors a
ON a.id = b.author_id',
:file => "#{PICKY_ROOT}/rails_app/config/#{PICKY_ENVIRONMENT}/db.yml"
)
category :title
category :author
# ...
end
The Picky server is a standalone server
The server (currently) is completely independent of your Rails / Sinatra / ActiveRecord application.
That means it lives in a separate directory. It does not use your Rails environment.
The server offers a HTTP interface, returning JSON payload.
Let’s look at an example. In the server configuration app/application.rb
you will have a route defined:
route %r{\A/media\Z} => Search.new(books_index, mp3_index)
This does exactly what it says and will route search requests on /media
to a search using the books_index
and the mp3_index
.
To directly query the server, you can use curl
.
So, curl 'localhost:8080/media?query=Pirates&ids=20&offset=0'
will return e.g. the id of “Pirates of the Carribean”.
But it won’t be just a list of the ids, but a JSON response. Let’s look at it:
{
"allocations":[
["books",8.56,13,[["title","pirates","Pirates"]],[59,65,106,110,164,166,174,218,235,249,344,413,425]],
["mp3s",5.48,241,[["title","pirates","Pirates"]],[5,6,7,8,12,13,161]]
],
"offset": 0,
"duration": 0.009041,
"total": 254
}
We have several parts:
- allocations: In what index it was found, and also in what categories in that index, including the 20 top ids (in this example).
- offset: The offset that was used to search.
- duration: The time it took Picky to find the results.
- total: The total number of result ids.
Now, because it is a bit tedious to extract data from the JSON string, we wrote…
The Picky client gem
The Picky client handles the wrapping of the query and the unwrapping of the result JSON for you. For example, the command picky search some_url
or the integration tests use the client to make accessing the result data much easier.
gem install picky-client
First, configure the client. It is always configured to point at a specific search (path):
MediaSearch = Picky::Client.new :host => 'localhost', :port => 8080, :path => '/media'
Now you can use it like this:
results = MediaSearch.search 'some query text', :ids => 20, :offset => 0
The results
variable now simply holds a hash with the JSON data. Extend it with Picky::Convenience
to get a few nice methods on this hash.
results.extend Picky::Convenience
results.ids # => array of the ids
results.total # => amount of total ids (not just the 20)
results.empty? # => Do we have results?
Also nice is this one, which will take the result ids of the books, and load each corresponding Book model, then yield it to the block where you can render it:
results.populate_with Book do |book|
book.to_s
end
It’s best if you look at it in the Sinatra example application from the Getting Started.
Conclusion
So we’ve seen
- that Picky is a standalone server.
- that Picky does not yet offer an ORM integration.
- what you can do with the Picky client gem.
Hope you learnt something new!
Next Picky: Designing an ORM Integration 1Share
Previous Phony: Phone Numbers