Picky: Plumbing Overview

This is an (admittedly a bit ranty and chaotic, but bear with me: recipes will follow) post in the Picky series on how it works.

I’ve gotten a lot of feedback on Picky. Many people write in to tell me how cool everything looks, but often I don’t hear how it is working out later.

This led me to wonder whether Picky initially attracts users, but then loses them because simple recipes on how everything is put together are missing.

Then, out of thin air, I got this feedback:

“for those just looking to get a glance at how the model, view and controller layers are set up for Picky there isn’t much in your docs to give that high-level glance. […] but there wasn’t anything in there […] detailing the actual plumbing that ties the app and data to picky.” (ellipses mine)

He’s right.

There is the overview image on the getting started page, but it isn’t very clear on how everything fits together.

There is also the best practices setup in the Wiki, but that does not really show any code, just how it is connected on an abstract level.

So, let me clear up a few things. This is the current state of how Picky is used:

We have multiple areas:

The absolute best way to see all this in code and in action is to try the getting started. If you haven’t tried it, do so now, run it, and take a look at the code (especially in the server app/application.rb, in the client app.rb, the Sinatra app).

Picky is ORM agnostic

(This part is divided into my reasoning/ranting ;) for not offering ORM support and code examples on how to handle this)

The ORM rant

Most people trying Picky for the first time are expecting some sort of ActiveRecord or other ORM integration.

Let me tell you upfront: There is none. Yes, really: you cannot just require a gem and slap a module onto your models.

Why? Many other search engine Ruby adapters offer some sort of nice ORM support, which lets me easily search and find data.

While I would love to provide some sort of ORM integration, let me tell you why I don’t support an ORM (yet):

It costs a lot of effort/resources to do right, and I wanted to spend that time on making Picky good and giving it a great JavaScript user interface.

For me, the hard part is not loading the data from some model into the index (that is mostly easy), but making a really good user interface and having the data indexed and searched really correctly.

I always felt that ORM integrations, while comfortable, mostly hide the way your data is indexed.

They provide you an easy solution to an easy problem.

If your data is hard to index, your data might be too complicated, too normalized.

Picky, on the other hand, gives you the power to do search right. In Ruby.

Because search engines never work the same:

Although it might be enticing to have a search set up really fast, you usually pay for it later, when it is all about making the search work really well and edge cases crop up (because most data is rather freeform).

Then again, you might not care about all these edge cases or about having a really good search. But then, why exactly are you reading this?

BIG BUT

Let me say though that I see the appeal of having an ORM integration, and the next few months may see our efforts shifted towards having a Picky ORM integration. This is a result of a long discussion with Karel Minařik, aka Mr. Tire.

It will probably take place first in the form of having a flexible external interface in the server through which data is sent and indexed.

The indexing definition would still be in the server, but the selection and sorting of data would be in the Rails / Sinatra etc. application.

In short:

But I need to think about this – your feedback is much appreciated!

How to index your Rails data

There are many ways to index your data. See the part under Flexible Sources which explains how to use the #each method on your models to index.

Whatevs, pickle face! I want to index my models!

Don’t give in to the rage. Ruby is your Jedi weapon.

A few suggestions.

You have a model Book in your Rails app.

class Book < ActiveRecord::Base
  # your supermodel
end

and you’d like to reuse this in Picky.

Try this:

# Get the model.
#
require "#{PICKY_ROOT}/../rails_app/app/models/book"

# Get the database configuration from the Rails app.
# (YAML.load_file opens and closes the file for us.)
#
require 'yaml'
db_config = YAML.load_file("#{PICKY_ROOT}/../rails_app/config/database.yml")

# Establish a connection using the right environment.
#
Book.establish_connection db_config[PICKY_ENVIRONMENT]

# Utilize the #each method on e.g. Book.some_named_scope to index.
#
book_index = Index::Memory.new :book_each do
  source     Book.order('title ASC')
  category   :title
  category   :author
  # ...
end
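The #each idea is not tied to ActiveRecord. Here is a minimal sketch in plain Ruby, assuming a source only needs an #each that yields objects responding to id and to each category name (that contract is my assumption here, and BookRow and BookSource are hypothetical names, not part of Picky):

```ruby
# Hypothetical stand-in for one record; responds to id, title and author.
BookRow = Struct.new(:id, :title, :author)

# Hypothetical source: anything with an #each that yields such records.
class BookSource
  include Enumerable

  def initialize(rows)
    @rows = rows
  end

  # Yields one BookRow per raw row; returns an Enumerator without a block.
  def each
    return enum_for(:each) unless block_given?
    @rows.each { |row| yield BookRow.new(*row) }
  end
end

source = BookSource.new([
  [1, 'Moby Dick', 'Herman Melville'],
  [2, 'Walden',    'Henry David Thoreau']
])

titles = source.map(&:title)
ids    = source.map(&:id)
```

Under that assumption, an object like this could be handed to source in place of Book.order('title ASC').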

Yes, sometimes the models are much more complicated, using acts_as_something (or the modern versions thereof) and class methods from them.

In that case, either require your rails app/environment, or just load the data from the database:

Relationship status: It’s complicated

Sometimes you need to index a complex combination of data (with a JOIN or so). For this you can use a database source in the server:

book_index = Index::Memory.new :book_each do
  # Alias a.name to author so the column name matches the :author category.
  source     Sources::DB.new(
               'SELECT b.id, b.title, a.name AS author
                FROM books b INNER JOIN authors a
                ON a.id = b.author_id',
               :file => "#{PICKY_ROOT}/rails_app/config/#{PICKY_ENVIRONMENT}/db.yml"
             )
  category   :title
  category   :author
  # ...
end

The Picky server is a standalone server

The server (currently) is completely independent of your Rails / Sinatra / ActiveRecord application.

That means it lives in a separate directory. It does not use your Rails environment.

The server offers an HTTP interface that returns a JSON payload.

Let’s look at an example. In the server configuration app/application.rb you will have a route defined:

route %r{\A/media\Z} => Search.new(books_index, mp3_index)

This does exactly what it says and will route search requests on /media to a search using the books_index and the mp3_index.
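If you are unsure which paths such a pattern accepts, you can check it in plain Ruby without Picky. This small sketch only exercises the regular expression itself (the list of sample paths is made up for illustration):

```ruby
# The same pattern as in the route definition.
media_route = %r{\A/media\Z}

# Made-up sample paths to test against the pattern.
paths = ['/media', '/media/books', '/my/media']

# Keep only the paths the pattern matches from start (\A) to end (\Z).
matching = paths.select { |path| media_route.match?(path) }
```

Only the exact path /media ends up in matching, since \A and \Z anchor the pattern at both ends of the string.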

To directly query the server, you can use curl.

So, curl 'localhost:8080/media?query=Pirates&ids=20&offset=0' will return, for example, the id of “Pirates of the Caribbean”.

It won’t be just a list of ids, though, but a JSON response. Let’s look at it:

{
 "allocations":[
  ["books",8.56,13,[["title","pirates","Pirates"]],[59,65,106,110,164,166,174,218,235,249,344,413,425]],
  ["mp3s",5.48,241,[["title","pirates","Pirates"]],[5,6,7,8,12,13,161]]
 ],
 "offset": 0,
 "duration": 0.009041,
 "total": 254
}

We have several parts:
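You can take such a response apart with Ruby's standard json library. This is a minimal sketch that uses the sample response as a literal string. The meaning given to each allocation field (index name, score, count, combinations, ids) is read off the sample values and should be treated as an assumption:

```ruby
require 'json'

# The sample response, pasted in as a string.
body = <<~RESPONSE
  {
   "allocations":[
    ["books",8.56,13,[["title","pirates","Pirates"]],[59,65,106,110,164,166,174,218,235,249,344,413,425]],
    ["mp3s",5.48,241,[["title","pirates","Pirates"]],[5,6,7,8,12,13,161]]
   ],
   "offset": 0,
   "duration": 0.009041,
   "total": 254
  }
RESPONSE

response = JSON.parse(body)

# Assumed layout of one allocation:
# [index name, score, count, combinations, ids]
allocations = response['allocations'].map do |name, score, count, combinations, ids|
  { :index => name, :score => score, :count => count,
    :combinations => combinations, :ids => ids }
end

# All returned ids, in allocation order.
all_ids = allocations.flat_map { |allocation| allocation[:ids] }

# Sum of the per-allocation counts.
count_sum = allocations.inject(0) { |sum, allocation| sum + allocation[:count] }
```

In this sample the per-allocation counts add up to the total, and the number of returned ids equals the ids=20 of the request.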

Now, because it is a bit tedious to extract data from the JSON string, we wrote…

The Picky client gem

The Picky client handles the wrapping of the query and the unwrapping of the result JSON for you. For example, the command picky search some_url or the integration tests use the client to make accessing the result data much easier.

gem install picky-client

First, configure the client. It is always configured to point at a specific search (path):

MediaSearch = Picky::Client.new :host => 'localhost', :port => 8080, :path => '/media'

Now you can use it like this:

results = MediaSearch.search 'some query text', :ids => 20, :offset => 0

The results variable now simply holds a hash with the JSON data. Extend it with Picky::Convenience to get a few nice methods on this hash.

results.extend Picky::Convenience
results.ids # => array of the ids
results.total # => total number of results (not just the 20 returned)
results.empty? # => Do we have results?

Also nice is this one, which will take the result ids of the books, and load each corresponding Book model, then yield it to the block where you can render it:

results.populate_with Book do |book|
  book.to_s
end
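To make the idea behind these helpers concrete, here is a rough sketch of how such methods could sit on a plain hash via extend. This is not the code of Picky::Convenience. ResultSketch, render_with, BookStub and LIBRARY are hypothetical names, and the lookup is an in-memory hash instead of a database:

```ruby
# Hypothetical helper module; NOT the real Picky::Convenience.
module ResultSketch
  # All ids across the allocations (assumes ids are the last element).
  def ids
    self[:allocations].flat_map { |allocation| allocation.last }
  end

  def total
    self[:total]
  end

  # Looks up each id with the given finder and yields the found record.
  def render_with(finder)
    ids.map { |id| yield finder.call(id) }
  end
end

# Hypothetical stand-in for a model plus a tiny in-memory "table".
BookStub = Struct.new(:id, :title)
LIBRARY  = {
  59 => BookStub.new(59, 'Pirates!'),
  65 => BookStub.new(65, 'More Pirates')
}

results = {
  :allocations => [['books', 8.56, 2, [['title', 'pirates', 'Pirates']], [59, 65]]],
  :offset      => 0,
  :duration    => 0.009,
  :total       => 2
}
results.extend ResultSketch

rendered = results.render_with(LIBRARY.method(:fetch)) { |book| book.title }
```

The point is only the mechanism: the hash stays a hash, and extend adds the helper methods to that one object.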

It’s best if you look at it in the Sinatra example application from the Getting Started.

Conclusion

So we’ve seen

  1. that Picky is a standalone server.
  2. that Picky does not yet offer an ORM integration.
  3. what you can do with the Picky client gem.

Hope you learnt something new!
