Experimental Features for Picky 5
This is a quick post about two experimental features in Picky 4.11+ that will become stable in Picky 5.
Picky is very much driven by its users.
After I added stemming in Picky 4.6.6, following a push from John Barton and Glen Maddern of goodfil.ms fame, Andy Kitchen supplied a piece of code for automatic word segmentation and also mentioned that he needed range queries.
Both are now available as experimental features.
Let’s say you’d like to find all people born in 1977, 1978, and 1979. Previously, this was not easy to do in Picky.
Now it is. Let’s look at a full copy-and-paste-able example:
require 'picky'

index = Picky::Index.new :people do
  key_format :to_s
  category :year
end

Person = Struct.new :id, :year

index.add Person.new('Picky', 2008)
index.add Person.new('Kaspar', 1978)
index.add Person.new('Florian', 1977)
index.add Person.new('Joe', 1955)

people = Picky::Search.new index

p people.search('1977-1979').ids
p people.search('year:1977-1979').ids
p people.search('year:1900-2010').ids
The first result will contain "Florian" and "Kaspar", since I was born in 1977 and Kaspar was born in 1978. If you categorize it with year:1977-1979, it will yield the same result. If you only want results for a specific category, remember to categorize the search term or range by prefixing it with the category name, here year:.
By going over the whole range, as in the third result, you’ll get
["Joe", "Florian", "Kaspar", "Picky"]
as the range year:1900-2010 includes all the results.
Range queries the Ruby way
Picky internally uses Enumerable#inject, so any range will work. For example, initial:a-d will yield results for each of "a", "b", "c", and "d". Cool, eh?
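To see why any range works, here is a minimal sketch in plain Ruby (no Picky needed). The helper to_range is hypothetical and only illustrates the idea; it is not how Picky parses queries internally.

```ruby
# Hypothetical helper, not part of Picky: turns "min-max" text into a Ruby range.
def to_range(text)
  min, max = text.split('-', 2)
  min..max
end

# Range includes Enumerable, so #inject walks over every element of the range.
years = to_range('1977-1979').inject([]) { |tokens, year| tokens << year }
p years    # => ["1977", "1978", "1979"]

initials = to_range('a-d').inject([]) { |tokens, initial| tokens << initial }
p initials # => ["a", "b", "c", "d"]
```

Each element produced this way could then be looked up like a normal search term.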
Not impressed? Read on…
Andy Kitchen was happy with the range queries, but he needed ranges that wrap around. If somebody wanted to find, e.g., an event that was on between 10pm and 2am, the range query implementation did not allow that, as event_start:10-2 did not work (#inject will yield nothing, since the range from 10 to 2 is empty).
Because Picky accepts any kind of range, he implemented a wrapping range (the version here is a slight rewrite of the original):
class Wrap12Hours
  include Enumerable

  def initialize(min, max)
    @hours = 12
    @min = min.to_i
    @top = max.to_i
    @top += @hours if @top < @min
  end

  def each
    @min.upto(@top).each do |i|
      yield (i % @hours).to_s
    end
  end
end
This is then passed into an index category like this
category :hour, ranging: Wrap12Hours
to make Picky use this “ranging” for that category.
The result: If Wrap12Hours is given a range like 10-2, it will yield the hours 10, 11, 0, 1, and 2 (as strings), which is exactly what he needed.
Picky range queries use #inject, but there is no #inject defined in Wrap12Hours, so why does it work? Note that Andy does an include Enumerable. Enumerable uses the #each method, which is already there, to implement #inject and some other methods. Pretty snazzy! (And, I might add, the Ruby way of doing things)
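This can be checked in plain Ruby without Picky. The snippet repeats the class from above (with a slightly more compact #each) so that it runs on its own, and compares it with a normal Ruby range.

```ruby
class Wrap12Hours
  include Enumerable # Provides #inject, #map, #to_a, ... on top of #each.

  def initialize(min, max)
    @hours = 12
    @min = min.to_i
    @top = max.to_i
    @top += @hours if @top < @min # Wrap around: 10-2 becomes 10..14.
  end

  def each
    @min.upto(@top) { |i| yield((i % @hours).to_s) }
  end
end

# A plain range from 10 to 2 is empty, so there is nothing to inject over.
p((10..2).to_a) # => []

# The wrapping range only defines #each, yet #inject and #to_a work.
wrapped = Wrap12Hours.new('10', '2')
p wrapped.inject([]) { |hours, hour| hours << hour } # => ["10", "11", "0", "1", "2"]
p wrapped.to_a                                       # => ["10", "11", "0", "1", "2"]
```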
The ability to implement custom ranges is very powerful and underlines the flexibility of Picky.
Automatic word segmentation
Just a quick note on this, as it is currently only a sketch. A fully functional sketch, though.
What if you don’t want to split on a regexp as you usually would, but would like Picky to split on words in the index?
So if you had “purple”, “rainbow”, and “pony” (don’t ask) in your index, then you’d want Picky to automatically split a query like “purplerainbowpony” into “purple”, “rainbow”, “pony”.
This can be achieved by giving the search category option splits_text_on an automatic splitter rather than a regexp. The automatic splitter is initialized with the index category you’d like to use for the splitter.
automatic_splitter = Picky::Splitters::Automatic.new index[:text]

some_search = Picky::Search.new index do
  searching splits_text_on: automatic_splitter
end
Note that if you want to test the splitter itself, you can simply call #split on it, as this is the method called by the Picky Tokenizer to split incoming queries:
automatic_splitter.split 'hellopicky' # => ['hello', 'picky']
Please give it a go and report back!
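To get a feeling for what splitting on known words involves, here is a minimal sketch in plain Ruby. It is not Picky's implementation: the segment method is a hypothetical helper, a plain Set stands in for the index category, and unlike the splitter above it returns nil when the text cannot be covered completely.

```ruby
require 'set'

# Hypothetical helper, not part of Picky.
# Returns an array of known words that together cover the whole text,
# or nil if no such split exists.
def segment(text, words)
  return [] if text.empty?

  # Try longer first words before shorter ones, backtracking if the rest
  # of the text cannot be split.
  text.length.downto(1) do |length|
    head = text[0, length]
    next unless words.include?(head)

    rest = segment(text[length..-1], words)
    return [head, *rest] if rest
  end

  nil
end

words = Set.new %w[purple rainbow pony hello picky rain bowl]

p segment('purplerainbowpony', words) # => ["purple", "rainbow", "pony"]
p segment('hellopicky', words)        # => ["hello", "picky"]
p segment('rainbowl', words)          # => ["rain", "bowl"] (after backtracking from "rainbow")
p segment('hellopic', words)          # => nil
```

This sketch is fine for short queries; without memoization it can get slow on long texts.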
The partial option
The automatic splitter supports a partial option. This will make Picky also use the partial index.
automatic_splitter = Picky::Splitters::Automatic.new index[:text], partial: true
What does it mean? It means that it will
automatic_splitter.split 'hellopic' # => ['hello', 'pic']
correctly split off the partial ‘pic’. The non-partial version would simply split off ‘hello’:
automatic_splitter.split 'hellopic' # => ['hello']
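The difference between the two modes can be sketched in plain Ruby. This is an assumption-laden illustration, not Picky's code: split_words is a hypothetical helper, an array of words stands in for the index, and the "partial index" is approximated by checking whether the leftover text is the beginning of a known word.

```ruby
# Hypothetical helper, not part of Picky.
# Greedily splits off the longest known word at the start of the text.
# With partial: true, a leftover that begins a known word is kept as well.
def split_words(text, words, partial: false)
  result = []
  rest = text

  until rest.empty?
    match = words.select { |word| rest.start_with?(word) }.max_by(&:length)
    break unless match # No known word starts here.

    result << match
    rest = rest[match.length..-1]
  end

  # Stand-in for the partial index: is the leftover a prefix of a known word?
  if partial && !rest.empty? && words.any? { |word| word.start_with?(rest) }
    result << rest
  end

  result
end

words = %w[hello picky]

p split_words('hellopic', words, partial: true) # => ["hello", "pic"]
p split_words('hellopic', words)                # => ["hello"]
```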
As Picky grows and grows, I am especially happy that Picky is fed well by its enthusiastic and helpful users.
This is much appreciated, amigos! Keep it coming :D
Outlook for Picky 5
The above features will, after some polishing and feedback, be included in Picky 5.
Have you ever asked yourself if you really need environments?
I hope to cover this topic in the next post.
Cheers, and have (pink, tentacly) fun!