Picky APIs


A few examples of how to inject your own functionality into Picky.

We’re going to look at a simple example and how to customize it with Picky 4.0!

The Copy & Paste Example

The example is simple. We have an index of four people (you might recognize the two famous ones). Each person has a first and a last name. Then we use a Search object to search the index.

Go ahead, copy it into TextMate 2 Alpha or similar!

require 'picky'

Person = Struct.new :id, :first, :last

data = Picky::Index.new :people do
  category :first
  category :last
end

data.replace Person.new(1, 'Donald', 'Knuth')
data.replace Person.new(2, 'Niklaus', 'Wirth')
data.replace Person.new(3, 'Donald', 'Worth')
data.replace Person.new(4, 'Peter', 'Niklaus')

people = Picky::Search.new data

results = people.search 'donald'

p results.ids
p results.allocations

This returns ids [3, 1] and the allocations [ [:people, 0.0, 2, [ [:first, "donald", "donald"] ], [3, 1]] ]. That might look a little funny, so let me explain: :people is the index name where it was found. 0.0 is the total weight. 2 is the total number of ids in this “allocation” (combination of categories). [:first, "donald", "donald"] is the category the query word was found in, together with the token and the original.
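If the nested array is hard to read, it can help to take it apart in plain Ruby. The following is only a sketch: it uses the literal array from above, and the variable names are made up to match the explanation; they are not Picky identifiers.

```ruby
# The allocation exactly as printed by the example above.
ALLOCATION = [:people, 0.0, 2, [[:first, "donald", "donald"]], [3, 1]]

# Destructure it into named parts (names chosen for illustration only).
index_name, total_weight, id_count, combinations, ids = ALLOCATION

puts "index:  #{index_name}"
puts "weight: #{total_weight}"
puts "count:  #{id_count}"
combinations.each do |category, token, original|
  puts "found #{original.inspect} as token #{token.inspect} in :#{category}"
end
puts "ids:    #{ids.inspect}"
```

Note that the third element is simply the size of the id list at the end.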

All clear?

Try searching for “Niklaus”:

results = people.search 'niklaus'

You should find ids [2, 4] and two allocations now, first in the first name, then in the last name.

What if you want to find the last name first? We add some weight to it!

Adding weight

By default, Picky already weighs the categories with a logarithmic weight. That is, the more a token occurs in a category, the “heavier” it is.

So this:

category :last

is actually

category :last, weight: Weights::Logarithmic.new

However, for “Niklaus”, that resolves to a weight of 0.0.
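Why 0.0? Here is a small sketch, assuming the logarithmic weight is the natural logarithm of the number of ids a token has in a category. That assumption fits the numbers in this post (0.0 for a single id, 0.693 for two ids further down), but the helper below is made up for illustration and is not Picky code.

```ruby
# Assumption: default weight = natural log of the number of ids
# that a token has in a category. Hypothetical helper, not Picky's.
def logarithmic_weight(amount_of_ids)
  Math.log(amount_of_ids)
end

p logarithmic_weight(1)          # "niklaus" occurs once per category
p logarithmic_weight(2).round(3) # "donald" occurs twice in :first
```

With only one "niklaus" per category, both allocations end up with the same weight, so the weight alone cannot put the last name first.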

So let’s add our own weight object. It just needs to respond to #weight_for(amount_of_ids) and return a float.

We ignore the amount and return a flat 12.3. Copy this into your example:

Weight = Class.new do
  def weight_for amount
    12.3
  end
end

and replace the category :last line with

category :last, weight: Weight.new

Now the last name comes first, with a weight of 12.3, not surprisingly.

[[:people, 12.3, 1, [[:last, "niklaus", "niklaus"]], [4]], [:people, 0.0, 1, [[:first, "niklaus", "niklaus"]], [2]]]
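The reordering can be replayed without Picky. This is a rough sketch under two assumptions: the default weight for a single id is the natural logarithm of 1, and allocations are ranked heaviest first. The small arrays are simplified stand-ins, not Picky's allocation objects.

```ruby
# The custom weight from above, unchanged.
Weight = Class.new do
  def weight_for amount
    12.3
  end
end

LAST_WEIGHT  = Weight.new.weight_for(1) # flat, whatever the amount is
FIRST_WEIGHT = Math.log(1)              # assumed default weight for one id

# Simplified stand-ins for the two allocations: category, weight, ids.
# Assumption: allocations are ranked by weight, heaviest first.
RANKED = [
  [:first, FIRST_WEIGHT, [2]],
  [:last,  LAST_WEIGHT,  [4]]
].sort_by { |_category, weight, _ids| -weight }

p RANKED
```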

Picky also ships with a few weights of its own, for example the Weights::Logarithmic shown above.

What if we want “Wirth” and “Worth” to be found at the same time?

Adding similarity

By default, Picky does not look for similar words.

This:

category :last

is actually

category :last, similarity: Similarity::None.new

Now, look for “warth~” (the ~ tells Picky to look for similar words):

results = people.search 'warth~'

You found nothing, right?

Picky only looks for similar words if the category enables it!

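Let’s write a similarity such that both will be found. Copy this into your example:

Similarity = Class.new do
  # Strips the vowels, so "wirth" and "worth" both become "wrth".
  def encode text
    text.gsub(/[aeiou]/, '')
  end
  # Left empty: the list of similar words is neither sorted nor trimmed.
  def prioritize ary, encoded

  end
end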

We encode a text such that its vowels are removed. This makes “wirth” and “worth” both resolve to “wrth”, and that makes them similar. (The prioritize method allows you to sort and trim the list of similar words.)

and replace the category :last line with

category :last, similarity: Similarity.new

Again, search for “warth~”.

results = people.search 'warth~'

This time you found both, right?
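To see why “warth~” now matches both names, the encoding can be tried outside Picky. A minimal sketch: the encode method is the one from the Similarity class above, while the grouping is just an illustration of the idea, not how Picky stores similar words internally.

```ruby
# Same encoding as in the Similarity class above: strip the vowels.
def encode(text)
  text.gsub(/[aeiou]/, '')
end

# The (downcased) last names from the example index.
LAST_NAMES = %w[knuth wirth worth niklaus]

# Group the names by their encoded form (illustration only).
SIMILAR = LAST_NAMES.group_by { |name| encode(name) }

p SIMILAR
p SIMILAR[encode('warth')]
```

The query word encodes to the same key as “wirth” and “worth”, so both count as similar to it.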

Picky offers Similarity::Soundex.new(amount_of_similar), Similarity::Metaphone.new(amount_of_similar) and Similarity::DoubleMetaphone.new(amount_of_similar). But rolling your own is easy, as you have seen.

Adding partial searching

Can you find Donald Knuth by entering “Donal”?

results = people.search 'donal'

You can. But why?

The word “donal” finds something because this:

category :first

is actually

category :first, partial: Partial::Postfix.new(from: -3)

That means it finds “dona”, “donal”, “donald”. Try them all!

Does it find “don”? Try it:

results = people.search 'don'

No, it doesn’t! We could use Partial::Postfix.new(from: -4) to include this case, but let’s write our own :)

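Partial = Class.new do
  # Yields ever shorter prefixes: "donald" gives "donal", "dona", "don", "do", "d".
  def each_partial text
    text = text.dup
    (text.size - 1).times do
      yield text.chop!
    end
  end
end

and replace the category :first line with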

category :first, partial: Partial.new

Try again:

results = people.search 'don'

Now we find Donald. You can even do this with our partial code:

results = people.search 'd'

We still find him.
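To compare the two behaviours side by side, here is a sketch in plain Ruby. The Partial class is the one from above. The postfix_partials helper is a made-up stand-in that reproduces what this post says about Partial::Postfix with a negative from; it is not Picky's implementation.

```ruby
# The custom Partial from above, unchanged.
Partial = Class.new do
  def each_partial text
    text = text.dup
    (text.size - 1).times do
      yield text.chop!
    end
  end
end

# Collect copies, because the same string object is chopped in place.
CUSTOM = []
Partial.new.each_partial('donald') { |partial| CUSTOM << partial.dup }
p CUSTOM

# Hypothetical stand-in for a postfix partial with a negative `from`:
# everything from text[0..from] up to the full text.
def postfix_partials(text, from)
  shortest = text[0..from].size
  (shortest..text.size).map { |length| text[0, length] }
end

p postfix_partials('donald', -3)
p postfix_partials('donald', -4)
```

One thing the output makes visible: the custom class yields every shorter prefix down to a single letter, but not the full word itself.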

Now, Picky already offers a few partial behaviours of its own, for example the Partial::Postfix shown above.

One important note: Picky always searches for the last token in the partial index, even without the asterisk next to the word. If it’s not the last word, you need an asterisk: “Don* Knuth”.

Boosting

To move an allocation up in the ranking, we used weights.

Picky knows a trick that almost no search engine knows. It can boost combinations!

Look for:

results = people.search 'Donald Knuth'

Looking at the allocations, we see that Picky tells us that Donald was found in a first name, and Knuth in a last name:

[[:people, 0.693, 1, [[:first, "donald", "donald"], [:last, "knuth", "knuth"]], [1]]]

That’s pretty useful to know what was found where.

As people usually look for the first name, then the last name, we want to give this more boost.

Replace this:

people = Picky::Search.new data

with this:

people = Picky::Search.new data do
  boost [:first, :last] => +3
end

Now try again:

results = people.search 'Donald Knuth'

A whole 3 points more! Try it the other way around:

results = people.search 'Knuth Donald'

We don’t get the boost. This is incredibly useful: if you look at how people search and then support them this way, they will find relevant results even more easily!

But what if we want to boost in a specific way?
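Why does the order matter? A rough sketch, assuming the boost is looked up by the exact sequence of categories and simply added to the allocation's weight. The boosted helper is invented for illustration and is not part of Picky.

```ruby
# The boost mapping from the Search block above.
BOOSTS = { [:first, :last] => +3 }

# Assumption: the boost for the matched category sequence is added to
# the allocation's weight; sequences that are not listed get no boost.
def boosted(weight, categories)
  weight + BOOSTS.fetch(categories, 0)
end

p boosted(0.693, [:first, :last]) # 'Donald Knuth'
p boosted(0.693, [:last, :first]) # 'Knuth Donald'
```

Since [:last, :first] is a different key than [:first, :last], the second query keeps its original weight.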

Custom Boosting

Copy this into the example:

Boosts = Class.new do
  def boost_for combinations
    @map ||= {
      [:first, :last] => +5
    }
    @map[combinations.map(&:category_name)] || -20
  end
end

(A combination is basically a tuple of category and token.)

and replace the search definition with:

people = Picky::Search.new data do
  boost Boosts.new
end

Now try again:

results = people.search 'Donald Knuth'

A whole 5 points more! Try it the other way around:

results = people.search 'Knuth Donald'

A whopping -20, which would send this allocation to the end of the list if there were more data.
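The Boosts class can also be tried on its own, without a search. In this sketch the Combination struct is a made-up stand-in: all the class above asks of a combination is that it responds to category_name, so that is all the stand-in provides (plus the token, for readability).

```ruby
# The Boosts class from above, unchanged.
Boosts = Class.new do
  def boost_for combinations
    @map ||= {
      [:first, :last] => +5
    }
    @map[combinations.map(&:category_name)] || -20
  end
end

# Hypothetical stand-in for Picky's combination objects.
Combination = Struct.new(:category_name, :token)

BOOSTER  = Boosts.new
FORWARD  = [Combination.new(:first, 'donald'), Combination.new(:last, 'knuth')]
BACKWARD = FORWARD.reverse # as in 'Knuth Donald'

p BOOSTER.boost_for(FORWARD)
p BOOSTER.boost_for(BACKWARD)
```

Because boost_for receives the whole list of combinations, it could just as well look at the tokens, the number of words, or anything else you can compute from them.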

Conclusion

I hope you’re going to try Picky in your next project.

See the next post for some fancy search options.
