<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

 <title>code is code</title>
 <link href="http://florianhanke.com/blog/atom.xml" rel="self"/>
 <link href="http://florianhanke.com/blog/"/>
 <updated>2012-12-10T22:21:13+11:00</updated>
 <id>http://florianhanke.com/blog/</id>
 <author>
   <name>Florian Hanke</name>
   <email>florian.hanke@gmail.com</email>
 </author>

 
 <entry>
   <title type="html">Picky&amp;nbsp;Tutorial&amp;#58;&amp;nbsp;Rails&amp;nbsp;3.2</title>
   <link href="http://florianhanke.com/blog/2012/12/09/picky-tutorial-rails-3.html"/>
   <updated>2012-12-09T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2012/12/09/picky-tutorial-rails-3</id>
   <content type="html">&lt;p&gt;A quick sidenote: The main Picky site is now running at &lt;a href=&quot;http://pickyrb.com&quot;&gt;pickyrb.com&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Update: Thanks to Gleb Mazovetskiy (@glebm) for his input on ActiveRecord.&lt;/p&gt;
&lt;h2&gt;Intro&lt;/h2&gt;
&lt;p&gt;You&amp;#8217;d like to integrate a small Picky server directly into the Rails 3.2 app you are running?&lt;/p&gt;
&lt;p&gt;This is the tutorial for you.&lt;/p&gt;
&lt;p&gt;To make things a bit more interesting, I want to be able to filter a query with the current user – and also have an &lt;span class=&quot;caps&quot;&gt;AJAX&lt;/span&gt; search interface.&lt;/p&gt;
&lt;p&gt;Note that the indexes for this search will be created on startup and that they will live in your app. If you need big indexes, or a more elaborate search you should go for a separate Picky search server.&lt;/p&gt;
&lt;p&gt;The code pieces below are quite large mostly because of the elaborate comments. In reality, the whole search clocks in at about 30 lines – and could be further reduced to about 15, without any configuration.&lt;/p&gt;
&lt;h2&gt;Files We Will Touch&lt;/h2&gt;
&lt;ol&gt;
	&lt;li&gt;Gemfile&lt;/li&gt;
	&lt;li&gt;initializers/picky.rb&lt;/li&gt;
	&lt;li&gt;model.rb&lt;/li&gt;
	&lt;li&gt;controller.rb&lt;/li&gt;
	&lt;li&gt;views/JavaScript&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Gemfile&lt;/h2&gt;
&lt;p&gt;First of all, we start out by adding picky and the picky-client to the Gemfile, like so:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;gem 'picky', '~&amp;gt; 4.9'
gem 'picky-client', '~&amp;gt; 4.9'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The pessimistic operator &lt;code&gt;~&amp;gt;&lt;/code&gt; allows versions from &lt;code&gt;4.9&lt;/code&gt; up to, but not including, &lt;code&gt;5.0&lt;/code&gt;. At &lt;code&gt;5.0&lt;/code&gt; the &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; changes, which might break your application.&lt;/p&gt;
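&lt;p&gt;If you want to check which versions such a constraint admits, here is a small sketch using the &lt;code&gt;Gem::Requirement&lt;/code&gt; class from RubyGems. The version numbers tested are just examples I picked:&lt;/p&gt;

```ruby
require 'rubygems'

requirement = Gem::Requirement.new('~> 4.9')

# Any 4.x release from 4.9 upwards satisfies the constraint...
in_range = requirement.satisfied_by?(Gem::Version.new('4.11.0'))
# ...but 5.0 and anything below 4.9 do not.
too_new  = requirement.satisfied_by?(Gem::Version.new('5.0.0'))
too_old  = requirement.satisfied_by?(Gem::Version.new('4.8.9'))

p [in_range, too_new, too_old]
```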
&lt;p&gt;Then do a&lt;/p&gt;
&lt;pre class=&quot;sh_shell&quot;&gt;&lt;code&gt;bundle install&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;like the latest code preachers tell us to.&lt;/p&gt;
&lt;h2&gt;initializers/picky.rb&lt;/h2&gt;
&lt;p&gt;Here&amp;#8217;s where you define the actual indexes and configure Picky. This is an example where we use a very generic model, imaginatively named &amp;#8220;things&amp;#8221;:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Silence Picky, as an example.
#
Picky.logger = Picky::Loggers::Silent.new

# We create a new index and store it in the constant ThingsIndex.
#
ThingsIndex = Picky::Index.new :things do
  # Our keys are integers.
  # Use :to_s if you have strings.
  #
  key_format :to_i

  # Default indexing options.
  # Please see: https://github.com/floere/picky/wiki/Indexing-configuration
  # for more information.
  #
  indexing removes_characters: /[^a-z0-9\s\/\-\_\:\&quot;\&amp;amp;\.]/i,
           stopwords:          /\b(and|the|of|it|in|for)\b/i,
           splits_text_on:     /[\s\/\-\_\:\&quot;\&amp;amp;\/]/,
           rejects_token_if:   lambda { |token| token.size &amp;lt; 2 }

  # We can search on the titles of the thing.
  #
  # We use postfix partials which means a word can
  # be found if only part has been entered (from the beginning).
  #
  category :title, :partial =&amp;gt; Picky::Partial::Postfix.new(:from =&amp;gt; 1)

  # We should also be able to search the years that the things have.
  #
  # We want the exact year, so no partial searching.
  #
  category :year,
           :partial =&amp;gt; Picky::Partial::None.new

  # We should be able to restrict searches to a specific user.
  #
  # This needs to be an exact (non-partial) search, as we don't 
  # want user 15 to be found when searching for user 1.
  #
  # The :from designates the message used to get the user_ids.
  #
  category :user,
           :partial =&amp;gt; Picky::Partial::None.new, 
           :from =&amp;gt; :user_ids_as_string

end

# ThingsSearch is the search interface
# on the things index.
#
# See https://github.com/floere/picky/wiki/Searching-Configuration
# for some tokenizing options.
#
ThingsSearch = Picky::Search.new ThingsIndex

# We are indexing at the end of this file
# using explicit indexing.
#
# Feel free to run the initial indexing somewhere else.
#
Thing.order('title ASC').each do |thing|
  ThingsIndex.add thing
end&lt;/code&gt;&lt;/pre&gt;
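&lt;p&gt;To get a feeling for what the &lt;code&gt;indexing&lt;/code&gt; options do to a piece of text, here is a rough plain Ruby sketch. It is not the Picky tokenizer: the method &lt;code&gt;tokenize&lt;/code&gt;, the simplified regexps and the order of the steps are assumptions made for illustration only.&lt;/p&gt;

```ruby
# Hypothetical stand-ins for the indexing options; simplified regexps,
# and the order of the steps is an assumption, not Picky's pipeline.
REMOVES   = /[^a-z0-9\s\/\-_:".]/i
STOPWORDS = /\b(and|the|of|it|in|for)\b/i
SPLITS    = /[\s\/\-_:"]/

def tokenize(text)
  text.gsub(REMOVES, '')                   # removes_characters
      .gsub(STOPWORDS, '')                 # stopwords
      .split(SPLITS)                       # splits_text_on
      .reject { |token| 2 > token.size }   # rejects_token_if
end

p tokenize('The Lord of the Rings: Part 1!')
```

&lt;p&gt;In this sketch the stopwords, the exclamation mark and the one-character token &lt;code&gt;1&lt;/code&gt; are dropped, leaving only the longer words.&lt;/p&gt;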
&lt;p&gt;Next up is the model.&lt;/p&gt;
&lt;h2&gt;model.rb&lt;/h2&gt;
&lt;p&gt;The model is straightforward: we want to index when saving a model, or delete the model from the index.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# After committing, index.
#
after_commit :picky_index

# Index correctly, depending on whether it
# was destroyed or updated/created.
#
def picky_index
  if destroyed?
    ThingsIndex.remove id
  else
    ThingsIndex.replace self
  end
end

# Since we want to index all users that have something to
# do with this thing together with it, we return a string
# of space separated user ids.
# (Picky version 5 will be able to use user_ids directly)
#
def user_ids_as_string
  user_ids.join ' '
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If we didn&amp;#8217;t have the special case with the user ids, we&amp;#8217;d only have two lines in the model.&lt;/p&gt;
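&lt;p&gt;Why do the user ids need an exact, non-partial category? A small sketch in plain Ruby, without Picky. The &lt;code&gt;Thing&lt;/code&gt; struct and the helper &lt;code&gt;leading_parts&lt;/code&gt; are made up for this example, and the way partials are generated here is my simplified assumption of what a postfix partial from position 1 amounts to:&lt;/p&gt;

```ruby
# Hypothetical model stand-in; only user_ids matters here.
Thing = Struct.new(:id, :user_ids) do
  def user_ids_as_string
    user_ids.join ' '
  end
end

# Assumed behaviour of a partial from position 1:
# every leading part of the token is indexed as well.
def leading_parts(token)
  (1..token.size).map { |length| token[0, length] }
end

thing  = Thing.new(7, [15, 23])
tokens = thing.user_ids_as_string.split(' ')

exact_index   = tokens
partial_index = tokens.flat_map { |token| leading_parts(token) }

p exact_index.include?('1')   # exact lookup for user 1
p partial_index.include?('1') # partial lookup for user 1
```

&lt;p&gt;With the partial variant, a query for user 1 would also hit the thing that belongs to user 15, which is exactly what we want to avoid.&lt;/p&gt;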
&lt;p&gt;Now, the controller is a bit bigger…&lt;/p&gt;
&lt;h2&gt;controller.rb&lt;/h2&gt;
&lt;p&gt;Create a controller action and wire it up in the routes.rb correctly. For example:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;resources :things do
  collection { get :search }
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, back to the &lt;code&gt;search&lt;/code&gt; action.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;def search
  # This line prepends the current user to the query.
  #
  # Since we have indexed the thing's user in the
  # user category, we can prepend a filter to the
  # currently received query.
  #
  # A query like
  #   &quot;one two three&quot;
  # will be transformed into
  #   &quot;user:15 one two three&quot;
  # which will result in things only
  # being found if it is associated to the current user.
  #
  query = &quot;user:#{current_user.id} #{params[:query]}&quot;

  # Perform the search.
  #
  results = ThingsSearch.search query, params[:ids] || 20, params[:offset] || 0
  
  # Render each thing in the results nicely as a partial.
  #
  # (You need to have a &quot;thing&quot; partial file)
  #
  results = results.to_hash
  results.extend Picky::Convenience
  results.populate_with Thing do |thing|
    render_to_string :partial =&amp;gt; &quot;thing&quot;, :object =&amp;gt; thing
  end
  
  # We respond with a nice JSON result.
  #
  respond_to do |format|
    format.html do
      # Homework: Make this a nice HTML results page.
      #
      render :text =&amp;gt; &quot;Thing result ids: #{results.ids.to_s}&quot;
    end
    format.json do
      render :text =&amp;gt; results.to_json
    end
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;JavaScript&lt;/h2&gt;
&lt;p&gt;The JavaScript is a bit more elaborate.&lt;/p&gt;
&lt;p&gt;The picky-client helper method &lt;code&gt;.cached_interface&lt;/code&gt; (&lt;a href=&quot;https://github.com/floere/picky/blob/master/client/lib/picky-client/helper.rb#L51-L55&quot;&gt;code&lt;/a&gt;) gives you the &lt;span class=&quot;caps&quot;&gt;HTML&lt;/span&gt;:&lt;/p&gt;
&lt;pre class=&quot;sh_html&quot;&gt;&lt;code&gt;&amp;lt;%= Picky::Helper.cached_interface %&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Picky comes with its own JS library (&lt;a href=&quot;https://github.com/floere/picky/blob/master/client/javascripts/picky.min.js&quot;&gt;code&lt;/a&gt;, 12kB), and lots of configuration options (&lt;a href=&quot;https://github.com/floere/picky/issues/98&quot;&gt;list&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;It knows two modes of searching: full and live. Full searching is run on pressing enter and expected to return rendered results, to show them in a results list. Live searching runs while typing and only updates the counts next to the input box.&lt;/p&gt;
&lt;p&gt;This example is a bit special as it renders live searches as if they were full ones. It&amp;#8217;s like pressing enter while typing.&lt;/p&gt;
&lt;p&gt;So in a JS file – or coffeescript, if you like that – insert this:&lt;/p&gt;
&lt;pre class=&quot;sh_js&quot;&gt;&lt;code&gt;$(window).load(function() {
  pickyClient = new PickyClient({
    full: '/things/search',  // The URL that maps to our search action.
    fullResults: 50,         // Default is 20.
    live: '/things/search',  // Use the same URL as the full search.
    liveResults: 20,         // Default is 0.
    liveRendered: true,      // Render live results as if they were full ones.
    liveSearchInterval: 166, // Time between keystrokes before it sends the query.
    searchOnEmpty: true,     // Search even when the query field is empty.
    
    // beforeInsert: function(query) {  },   // Optional. Before a query is inserted via pickyClient.insert(...).
    // before: function(query, params) {  }, // Optional. Before Picky sends any data. Return modified query.
    // success: function(data, query) {  },  // Optional. Just after Picky receives data. (Get a PickyData object)
    // after: function(data, query) {  },    // Optional. After Picky has handled the data and updated the view.
  });
});&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As you can see, the Picky JS interface offers you four callbacks that are called: before inserting a query (sanitize a query), before sending the query (add any filters from radio buttons, checkboxes etc.), just after receiving the data (modify the incoming data as you wish), and after updating the view (make modifications and necessary updates to the view).&lt;/p&gt;
&lt;p&gt;This is pretty handy and is used in the &lt;a href=&quot;http://cocoapods.org&quot;&gt;cocoapods.org&lt;/a&gt; search (&lt;a href=&quot;https://github.com/CocoaPods/cocoapods.org/blob/master/views/index.erb#L214-L265&quot;&gt;example code&lt;/a&gt;) to add the OS filter to the query without it being visible in the search field (but in the &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt;).&lt;/p&gt;
&lt;h2&gt;End&lt;/h2&gt;
&lt;p&gt;I hope this helps getting Picky into your Rails app :)&lt;/p&gt;
&lt;p&gt;Finally, if you don&amp;#8217;t want to index each time your app is started, you could use load and dump on the index. Perhaps like this…&lt;/p&gt;
&lt;p&gt;In the initializer, to save the index:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;at_exit do
  ThingsIndex.dump
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To load the index:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;tries = 0
begin
  exit 1 if tries &amp;gt; 1
  ThingsIndex.load
rescue
  tries = tries + 1
  ThingsIndex.index
  retry
end&lt;/code&gt;&lt;/pre&gt;
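&lt;p&gt;If you would like to see the load-or-rebuild pattern in isolation, here is a sketch that runs without Picky. &lt;code&gt;FakeIndex&lt;/code&gt; and &lt;code&gt;load_or_build&lt;/code&gt; are hypothetical names, and unlike the snippet above it re-raises the error instead of calling &lt;code&gt;exit&lt;/code&gt;:&lt;/p&gt;

```ruby
# Hypothetical stand-in for an index whose dump does not exist yet:
# load fails until index has been called once.
class FakeIndex
  attr_reader :calls

  def initialize
    @indexed = false
    @calls   = []
  end

  def load
    @calls.push :load
    raise 'no dump found' unless @indexed
  end

  def index
    @calls.push :index
    @indexed = true
  end
end

def load_or_build(index)
  tries = 0
  begin
    index.load
  rescue StandardError
    tries += 1
    raise if tries > 1 # Give up instead of looping forever.
    index.index
    retry
  end
end

fake = FakeIndex.new
load_or_build(fake)
p fake.calls
```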
&lt;p&gt;Cheers and have fun!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Experimental&amp;nbsp;Features&amp;nbsp;for&amp;nbsp;Picky&amp;nbsp;5</title>
   <link href="http://florianhanke.com/blog/2012/11/20/experimental-features-for-picky-5.html"/>
   <updated>2012-11-20T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2012/11/20/experimental-features-for-picky-5</id>
   <content type="html">&lt;p&gt;This is a quick post about two experimental features in Picky 4.11+ that will be available stably in Picky 5.&lt;/p&gt;
&lt;h2&gt;Intro&lt;/h2&gt;
&lt;p&gt;Picky is very much driven by its users.&lt;/p&gt;
&lt;p&gt;After I added &lt;a href=&quot;http://en.wikipedia.org/wiki/Stemming&quot;&gt;stemming&lt;/a&gt; in Picky 4.6.6, following a push from &lt;a href=&quot;http://twitter.com/johnbarton&quot;&gt;John Barton&lt;/a&gt; and &lt;a href=&quot;http://twitter.com/glenmaddern&quot;&gt;Glen Maddern&lt;/a&gt; of &lt;a href=&quot;http://goodfil.ms&quot;&gt;goodfil.ms&lt;/a&gt; fame, &lt;a href=&quot;http://twitter.com/auastro&quot;&gt;Andy Kitchen&lt;/a&gt; supplied a piece of code for &lt;a href=&quot;http://norvig.com/ngrams/ch14.pdf&quot;&gt;automatic word segmentation&lt;/a&gt;, while also mentioning that he needed range queries.&lt;/p&gt;
&lt;p&gt;They are now both available as experimental features.&lt;/p&gt;
&lt;h2&gt;Range queries&lt;/h2&gt;
&lt;p&gt;Let&amp;#8217;s say you&amp;#8217;d like to find all people born in 1977, 1978, and 1979. Previously, this was not too easy to do in Picky.&lt;/p&gt;
&lt;p&gt;Now you can. Let&amp;#8217;s look at a full copy-and-paste-able example:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'picky'
  
index = Picky::Index.new :people do
  key_format :to_s
  category :year
end

Person = Struct.new :id, :year

index.add Person.new('Picky',   2008)
index.add Person.new('Kaspar',  1978)
index.add Person.new('Florian', 1977)
index.add Person.new('Joe',     1955)

people = Picky::Search.new index

p people.search('1977-1979').ids
p people.search('year:1977-1979').ids
p people.search('year:1900-2010').ids&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The first result will be&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;[&quot;Florian&quot;, &quot;Kaspar&quot;]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;since I was born in 1977, and Kaspar was born in 1978. If you categorize it with &lt;code&gt;year:1977-1979&lt;/code&gt; it will yield the same result. If you only want results for a specific category, remember to categorize it by prefixing the search term or range with &lt;code&gt;category_name:&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;By going over the whole range, as in the third result, you&amp;#8217;ll get&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;[&quot;Joe&quot;, &quot;Florian&quot;, &quot;Kaspar&quot;, &quot;Picky&quot;]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;as the range &lt;code&gt;year:1900-2010&lt;/code&gt; includes all the results.&lt;/p&gt;
&lt;h2&gt;Range queries the Ruby way&lt;/h2&gt;
&lt;p&gt;Picky internally uses &lt;code&gt;Enumerable#inject&lt;/code&gt;, so any range will work. For example, &lt;code&gt;initial:a-d&lt;/code&gt; will yield results for each &lt;code&gt;&quot;a&quot;, &quot;b&quot;, &quot;c&quot;, and &quot;d&quot;&lt;/code&gt;. Cool, eh?&lt;/p&gt;
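&lt;p&gt;Roughly, and only as a sketch of the idea rather than Picky&amp;#8217;s actual code, collecting results over a range with &lt;code&gt;#inject&lt;/code&gt; could look like this. The &lt;code&gt;initials&lt;/code&gt; hash is a made-up toy index:&lt;/p&gt;

```ruby
# Toy inverted index for a hypothetical :initial category.
initials = {
  'a' => [1, 4],
  'b' => [2],
  'd' => [3],
  'k' => [5]
}

# Collect the ids of every token in the range; tokens that are
# not in the index simply contribute nothing.
ids = ('a'..'d').inject([]) do |found, token|
  found + initials.fetch(token, [])
end

p ids
```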
&lt;p&gt;Not impressed? Read on…&lt;/p&gt;
&lt;h2&gt;Custom ranges!&lt;/h2&gt;
&lt;p&gt;Andy Kitchen was happy with the range queries, but he needed ranges that wrap around. If somebody wanted to find eg. an event that was on between 10pm and 2am, the range query implementation so far did not allow that: &lt;code&gt;event_start:10-2&lt;/code&gt; did not work, as &lt;code&gt;#each&lt;/code&gt; or &lt;code&gt;#inject&lt;/code&gt; on a range from 10 to 2 yields nothing.&lt;/p&gt;
&lt;p&gt;Because Picky accepts any kind of range, he implemented a wrapping range (the version here is a slight rewrite of the original):&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Wrap12Hours
  include Enumerable

  def initialize(min, max)
    @hours = 12
    @min   = min.to_i
    @top   = max.to_i
    @top   += @hours if @top &amp;lt; @min
  end

  def each
    @min.upto(@top).each do |i|
      yield (i % @hours).to_s
    end
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is then passed into an index category like this&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;category :hour, ranging: Wrap12Hours&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;to make Picky use this &amp;#8220;ranging&amp;#8221; for that category.&lt;/p&gt;
&lt;p&gt;The result: If &lt;code&gt;Wrap12Hours&lt;/code&gt; is given a range like &lt;code&gt;10-2&lt;/code&gt;, it will &lt;code&gt;#each&lt;/code&gt; this: &lt;code&gt;[&quot;10&quot;, &quot;11&quot;, &quot;0&quot;, &quot;1&quot;, &quot;2&quot;]&lt;/code&gt;, which is exactly what he needed.&lt;/p&gt;
&lt;p&gt;Picky range queries use &lt;code&gt;#inject&lt;/code&gt;, but there is no &lt;code&gt;#inject&lt;/code&gt; on &lt;code&gt;Wrap12Hours&lt;/code&gt; – so why does it work? Note that Andy does an &lt;code&gt;include Enumerable&lt;/code&gt;. &lt;code&gt;Enumerable#inject&lt;/code&gt; uses the &lt;code&gt;#each&lt;/code&gt; method which is already there to implement &lt;code&gt;#inject&lt;/code&gt; and some other methods. Pretty snazzy! (And, I might add, the Ruby way of doing things)&lt;/p&gt;
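&lt;p&gt;You can try this without Picky. The following sketch repeats the class (lightly condensed) and calls a few &lt;code&gt;Enumerable&lt;/code&gt; methods that come for free once &lt;code&gt;#each&lt;/code&gt; is defined:&lt;/p&gt;

```ruby
class Wrap12Hours
  include Enumerable

  def initialize(min, max)
    @hours = 12
    @min   = min.to_i
    @top   = max.to_i
    @top  += @hours if @min > @top
  end

  def each
    @min.upto(@top) { |i| yield((i % @hours).to_s) }
  end
end

wrapped = Wrap12Hours.new('10', '2')

# None of these methods are defined on the class itself;
# Enumerable builds them all on top of #each.
p wrapped.to_a
p wrapped.inject([]) { |hours, hour| hours + [hour] }
p wrapped.include?('0')
```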
&lt;p&gt;The ability to implement custom ranges is very powerful and underlines the flexibility of Picky.&lt;/p&gt;
&lt;h2&gt;Automatic word segmentation&lt;/h2&gt;
&lt;p&gt;Just a quick note on this as it is just a sketch, currently. A fully functional sketch, though.&lt;/p&gt;
&lt;p&gt;What if you don&amp;#8217;t want to split on a regexp as you usually would, but would like Picky to split on the words in the index?&lt;/p&gt;
&lt;p&gt;So if you had &amp;#8220;purple&amp;#8221;, &amp;#8220;rainbow&amp;#8221;, and &amp;#8220;pony&amp;#8221; (don&amp;#8217;t ask) in your index, then you&amp;#8217;d want Picky to automatically split a query like &amp;#8220;purplerainbowpony&amp;#8221; into &amp;#8220;purple&amp;#8221;, &amp;#8220;rainbow&amp;#8221;, &amp;#8220;pony&amp;#8221;.&lt;/p&gt;
&lt;p&gt;This can be achieved by giving the search category option &lt;code&gt;splits_text_on&lt;/code&gt; an automatic splitter rather than a regexp. The automatic splitter is initialized with the index category you&amp;#8217;d like to use for the splitter.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;automatic_splitter = Picky::Splitters::Automatic.new index[:text]

some_search = Picky::Search.new index do
  searching splits_text_on: automatic_splitter
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That&amp;#8217;s it!&lt;/p&gt;
&lt;p&gt;Note that if you want to test the splitter itself you can simply call &lt;code&gt;#split&lt;/code&gt; on it, as this is the method called by the Picky &lt;code&gt;Tokenizer&lt;/code&gt; to split incoming queries:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;automatic_splitter.split 'hellopicky' # =&amp;gt; ['hello', 'picky']&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Please give it a go and report back!&lt;/p&gt;
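&lt;p&gt;To illustrate the general idea of splitting on known words, here is a naive sketch in plain Ruby. It is not the algorithm Picky uses: it simply tries every known word at the start of the text and backtracks, without any weighting. &lt;code&gt;WORDS&lt;/code&gt; and &lt;code&gt;segment&lt;/code&gt; are names made up for this example.&lt;/p&gt;

```ruby
require 'set'

# Hypothetical vocabulary, standing in for the tokens of an index category.
WORDS = Set.new(%w[purple rainbow pony])

# Returns the first split of text into known words, or nil if none exists.
def segment(text)
  return [] if text.empty?

  1.upto(text.size) do |length|
    head = text[0, length]
    next unless WORDS.include?(head)

    rest = segment(text[length..-1])
    return [head] + rest if rest
  end

  nil
end

p segment('purplerainbowpony')
p segment('purpleunicorn')
```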
&lt;h3&gt;The partial option&lt;/h3&gt;
&lt;p&gt;The automatic splitter supports a &lt;code&gt;partial&lt;/code&gt; option. This will make Picky also use the partial index.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;automatic_splitter = Picky::Splitters::Automatic.new index[:text], partial: true&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;What does it mean? It means that it will&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;automatic_splitter.split 'hellopic' # =&amp;gt; ['hello', 'pic']&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;correctly split off the partial &amp;#8216;pic&amp;#8217;. The non-partial version would simply split off &amp;#8216;hello&amp;#8217;:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;automatic_splitter.split 'hellopic' # =&amp;gt; ['hello']&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Have fun!&lt;/h2&gt;
&lt;p&gt;As Picky grows and grows, I am especially happy that Picky is fed well by its enthusiastic and helpful users.&lt;/p&gt;
&lt;p&gt;This is much appreciated, amigos! Keep it coming :D&lt;/p&gt;
&lt;h2&gt;Outlook for Picky 5&lt;/h2&gt;
&lt;p&gt;The above features will – after some polishing and feedback – be included into Picky 5.&lt;/p&gt;
&lt;h3&gt;Environments&lt;/h3&gt;
&lt;p&gt;After a discussion with &lt;a href=&quot;http://twitter.com/kasparschiess&quot;&gt;Kaspar Schiess&lt;/a&gt; (my cofounder at &lt;a href=&quot;http://technologyastronauts.ch&quot;&gt;The Technology Astronauts&lt;/a&gt;), I am very inclined to drop environments (ie. &lt;em&gt;development&lt;/em&gt;, &lt;em&gt;test&lt;/em&gt;, &lt;em&gt;production&lt;/em&gt;) in the next Picky.&lt;/p&gt;
&lt;p&gt;Have you ever asked yourself if you really need environments?&lt;/p&gt;
&lt;p&gt;I hope to cover this topic in the next post.&lt;/p&gt;
&lt;p&gt;Cheers, and have (pink, tentacly) fun!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;nbsp;Stemming</title>
   <link href="http://florianhanke.com/blog/2012/10/15/picky-stemming.html"/>
   <updated>2012-10-15T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2012/10/15/picky-stemming</id>
   <content type="html">&lt;p&gt;This is a quick post about a new feature in Picky 4.6.6+: &lt;a href=&quot;http://en.wikipedia.org/wiki/Stemming&quot;&gt;stemming&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Stemming&lt;/h2&gt;
&lt;p&gt;Stemming is used in information retrieval, and basically serves the purpose of &amp;#8220;finding the thing&amp;#8221; in an index, even if the appearance of the thing was different in the original.&lt;/p&gt;
&lt;p&gt;In other words: if we had saved the word &amp;#8220;arguing&amp;#8221; in the index, then when somebody searches for &amp;#8220;argued&amp;#8221;, the saved document should still show up, even though &amp;#8220;arguing&amp;#8221; and &amp;#8220;argued&amp;#8221; are not exactly the same word. However, both are about the fact that somebody argued (a point, with somebody, themself or others). The words &amp;#8220;argued&amp;#8221; and &amp;#8220;arguing&amp;#8221; both resolve to the stem &amp;#8220;argu&amp;#8221;, which is not a word itself. This stem is what ends up in the index.&lt;/p&gt;
&lt;p&gt;This was not yet possible in Picky.&lt;/p&gt;
&lt;p&gt;And surprisingly, it did not seem urgent, as nobody complained.&lt;/p&gt;
&lt;p&gt;Until, of course, &lt;a href=&quot;http://goodfil.ms&quot;&gt;somebody&lt;/a&gt; did.&lt;/p&gt;
&lt;h2&gt;Usage&lt;/h2&gt;
&lt;p&gt;Let&amp;#8217;s make this simple: how do you use this in Picky?&lt;/p&gt;
&lt;p&gt;(Look up &lt;a href=&quot;https://github.com/floere/picky/blob/c73bec8b01acb44dcb3a4a437fef1940fd03a08d/server/spec/functional/stemming_spec.rb&quot;&gt;the current spec&lt;/a&gt;, if that is most convenient to you.)&lt;/p&gt;
&lt;p&gt;It is very easy. Both &lt;code&gt;Index#indexing&lt;/code&gt; and &lt;code&gt;Search#searching&lt;/code&gt; methods offer the option &lt;code&gt;stems_with&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;You give it an object that responds to &lt;code&gt;stem(word)&lt;/code&gt;, which gets a tokenized word, and returns a stemmed word. One such stemmer is &lt;code&gt;Lingua::Stemmer&lt;/code&gt;. In the tokenization pipeline, &lt;a href=&quot;https://github.com/floere/picky/blob/c73bec8b01acb44dcb3a4a437fef1940fd03a08d/server/lib/picky/tokenizer.rb#L275&quot;&gt;it is the last step&lt;/a&gt; to be executed.&lt;/p&gt;
&lt;p&gt;Therefore, if you want stemmed words in the index, use this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;index = Picky::Index.new :stemming do
  indexing stems_with: Lingua::Stemmer.new
  category :some_text_that_needs_to_be_stemmed
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Usually, if you use stemming, you also want search terms to be stemmed when searching (otherwise your search for &amp;#8220;arguing&amp;#8221; will not find &amp;#8220;argued&amp;#8221; in the index).&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;search = Picky::Search.new index do
  searching stems_with: Lingua::Stemmer.new
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;But as usual, the flexibility of Picky leaves that decision up to you: it could be that you are writing a stem-search, where you don&amp;#8217;t stem in the search. Or you already only get stems for the index, no stemming needed (or even allowed), and you only need to stem on the user&amp;#8217;s input.&lt;/p&gt;
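&lt;p&gt;Since any object that responds to &lt;code&gt;stem(word)&lt;/code&gt; will do, you could even pass your own. Here is a toy example. &lt;code&gt;SuffixStemmer&lt;/code&gt; and its suffix list are invented for this sketch and are no substitute for a real stemming algorithm:&lt;/p&gt;

```ruby
# Toy stemmer: it only needs to respond to stem(word).
# The suffix list is made up and far too crude for real use.
class SuffixStemmer
  SUFFIXES = %w[ing ed s].freeze

  def stem(word)
    suffix = SUFFIXES.find { |candidate| word.end_with?(candidate) }
    suffix ? word[0...-suffix.size] : word
  end
end

stemmer = SuffixStemmer.new
p stemmer.stem('arguing')
p stemmer.stem('argued')
p stemmer.stem('argue')
```

&lt;p&gt;Both &lt;code&gt;arguing&lt;/code&gt; and &lt;code&gt;argued&lt;/code&gt; end up as the same stem here, while &lt;code&gt;argue&lt;/code&gt; is left alone, which shows why you would rather use a proper stemmer.&lt;/p&gt;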
&lt;h2&gt;A word of caution&lt;/h2&gt;
&lt;p&gt;If somebody searches for e.g. &amp;#8220;Arguing!&amp;#8221;, and you don&amp;#8217;t remove the &amp;#8220;!&amp;#8221; (either by declaring it illegal in the tokenizer, or split on it), then Picky won&amp;#8217;t stem it, since the stemmer doesn&amp;#8217;t know what to do with &amp;#8220;Arguing!&amp;#8221;. It, however, would be perfectly able to stem &amp;#8220;Arguing&amp;#8221;. Consider yourself warned so we don&amp;#8217;t have to argue later on.&lt;/p&gt;
&lt;p&gt;Why anybody would search for &amp;#8220;Arguing!&amp;#8221;, I don&amp;#8217;t know. I could for example see &lt;a href=&quot;https://twitter.com/billmaher/status/256568696212967424&quot;&gt;Paul Ryan&lt;/a&gt; search for: &amp;#8220;Arguing and debating, how does it work?&amp;#8221;&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">How&amp;nbsp;I&amp;nbsp;develop&amp;nbsp;a&amp;nbsp;feature&amp;nbsp;for&amp;nbsp;Picky</title>
   <link href="http://florianhanke.com/blog/2012/07/23/how-I-develop-a-feature.html"/>
   <updated>2012-07-23T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2012/07/23/how-I-develop-a-feature</id>
   <content type="html">&lt;p&gt;How do I add a feature – here: &lt;a href=&quot;https://en.wikipedia.org/wiki/Faceted_search&quot;&gt;Facets&lt;/a&gt; – to &lt;a href=&quot;http://florianhanke.com/picky&quot;&gt;Picky&lt;/a&gt;? When? Why?&lt;/p&gt;
&lt;p&gt;Starting out 2 years ago, I had a relatively clear picture of what I was going to do in the original &lt;a href=&quot;https://github.com/floere/picky/wiki/Roadmap&quot;&gt;roadmap&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The last 3 points are:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Obtain real live octopus. Call it Picky and teach it searching tricks.&lt;/li&gt;
	&lt;li&gt;Become mayor of Krakow. Hold more Ruby conferences there. Eat all the available Polish food.&lt;/li&gt;
	&lt;li&gt;Implement coffee making capabilities.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Pets aren&amp;#8217;t allowed by my landlord. Also, as you can see I&amp;#8217;m still working on becoming the mayor of Krakow. Regarding the coffee making capabilities, I am still evaluating several brands of coffee, converging on Papua New Guinean blue mountain sun roasted beans.&lt;/p&gt;
&lt;p&gt;Thankfully, world domination is already achieved. Or can you show me one of the seven seas which is not yet filled with octopi?&lt;/p&gt;
&lt;p&gt;But seriously: Where do you go from here? Total chaos, burning lines of code? Software pattern anarchy? Class warfare?&lt;/p&gt;
&lt;h2&gt;&lt;span class=&quot;caps&quot;&gt;UNDD&lt;/span&gt;: User Need Driven Development&lt;/h2&gt;
&lt;p&gt;I find myself often without direction regarding Picky – since I don&amp;#8217;t use it myself for any especially challenging projects (with Picky, too, no project is challenging &amp;#8211; just kidding), how does it get to push its own boundaries?&lt;/p&gt;
&lt;p&gt;Thankfully, Picky has a few helpful users to push it a bit:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span class=&quot;caps&quot;&gt;UNDD&lt;/span&gt;&lt;/strong&gt;, aka User Need Driven Development!
(Coincidentally almost the German word for &amp;#8220;and&amp;#8221;, ie. &amp;#8220;und&amp;#8221; – &lt;span class=&quot;caps&quot;&gt;UNDD&lt;/span&gt; expressed as a sentence: &amp;#8220;We&amp;#8217;d like this and that and and and and…&amp;#8221;, it basically never ends)&lt;/p&gt;
&lt;p&gt;A week ago, &lt;span class=&quot;caps&quot;&gt;UNDD&lt;/span&gt; happened: &lt;a href=&quot;https://groups.google.com/forum/?fromgroups#!topic/picky-ruby/UvIxg4d1PME&quot;&gt;https://groups.google.com/forum/?fromgroups#!topic/picky-ruby/UvIxg4d1PME&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;David Lowenfels asked: &amp;#8220;I am wondering if Picky can do facets?&amp;#8221;&lt;/p&gt;
&lt;p&gt;As with any case of &lt;span class=&quot;caps&quot;&gt;UNDD&lt;/span&gt;, if there is no philosophical reason against including it in a framework, the answer is always:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Not yet, but…&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;Example: Facets&lt;/h2&gt;
&lt;p&gt;Facets – as I understand them – are a way of slicing the available data into categories and category-facets.&lt;/p&gt;
&lt;p&gt;David gave a good example with &lt;a href=&quot;http://www.trailspace.com/gear/boots/midweight/&quot;&gt;this hiking boot page&lt;/a&gt;. On the left facets are used to refine (filter) the results. In &amp;#8220;Brand&amp;#8221; we find &amp;#8220;Salomon&amp;#8221;, &amp;#8220;Merrell&amp;#8221;, &amp;#8220;Timberland&amp;#8221;, etc.&lt;/p&gt;
&lt;p&gt;If you then choose eg. &amp;#8220;Salomon&amp;#8221;, only Salomon shoes are shown. And, more importantly, not all Gender refinements are available anymore, but only the ones that are relevant to the brand &amp;#8220;Salomon&amp;#8221;.&lt;/p&gt;
&lt;p&gt;So, should I add that to Picky? Let&amp;#8217;s review the &lt;em&gt;official feature policy&lt;/em&gt;&amp;#8482;:&lt;/p&gt;
&lt;h2&gt;Feature Philosophy&lt;/h2&gt;
&lt;p&gt;Picky&amp;#8217;s &lt;a href=&quot;https://github.com/floere/picky/wiki/Feature-Philosophy&quot;&gt;Feature Philosophy&lt;/a&gt;, reprinted here:&lt;/p&gt;
&lt;pre class=&quot;sh_shell&quot;&gt;&lt;code&gt;1. If it is relatively easy to do, I write a feature myself.
2. If it is relatively easy to do, but not perfect, I write it myself too, with the option of adding an adapter to another search engine later.
3. If it is hard to do (and it is too much against Picky’s structure and way of doing things), I write a Query object that uses another search engine.&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Is it easy to do?&lt;/p&gt;
&lt;p&gt;My first reaction to David&amp;#8217;s question was: Of course! Facets are all about filtering – and Picky is all about filtering.&lt;/p&gt;
&lt;p&gt;Eeeeeasy. &lt;strong&gt;Right?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Not necessarily. Although Picky&amp;#8217;s inverted indexes (eg. { &amp;#8216;florian&amp;#8217; =&amp;gt; [1, 4, 5, 19] }) already contain the right structure to get facets, it&amp;#8217;s not so clear-cut when another facet has already been applied as a filter.&lt;/p&gt;
&lt;p&gt;Initially I thought that this is a #1 case, but because facets have to be filtered by the facets already applied, it&amp;#8217;s squarely in #2: I can write it myself, but it might not be &lt;em&gt;that&lt;/em&gt; easy.&lt;/p&gt;
&lt;p&gt;How do we go about implementing this feature?&lt;/p&gt;
&lt;h2&gt;Write first&lt;/h2&gt;
&lt;p&gt;Write first. Before your code reaches perfection, just write. This could be rewritten as &lt;strong&gt;Stupid and works &amp;gt; Perfect and doesn&amp;#8217;t&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;I always write a very simple solution first, and even though it might be slow, I am happy.&lt;/p&gt;
&lt;h3&gt;Straightforward facets on the Index instance&lt;/h3&gt;
&lt;p&gt;The first stab at facets for class &lt;code&gt;Picky::Index&lt;/code&gt; was ultra simple:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;def facets category_identifier
  self[category_identifier].exact.weights
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So I simply get the right category from the index and extract the right index. In this case the weights.&lt;/p&gt;
&lt;p&gt;It is used like so (&lt;code&gt;data&lt;/code&gt; is the index):&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;data.facets :brand&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This code eg. results in:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;{
  'salomon' =&amp;gt; 3.14,
  'merrell' =&amp;gt; 1.61,
  …
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Nice, eh?&lt;/p&gt;
&lt;p&gt;The actual method signature is now &lt;code&gt;facets(:category, more_than: N)&lt;/code&gt; with the &lt;code&gt;more_than&lt;/code&gt; option a filter for only including facets with weight higher than &lt;code&gt;N&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;This is, of course, blazingly fast.&lt;/p&gt;
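&lt;p&gt;As a plain Ruby sketch of why the inverted index already has the right shape for this: the method below derives facets from a toy &lt;code&gt;token =&amp;gt; ids&lt;/code&gt; hash. It assumes that a weight is simply the number of ids, which is a simplification for this example and not necessarily how Picky calculates its weights.&lt;/p&gt;

```ruby
# Toy inverted index for a hypothetical :brand category: token => ids.
brand_index = {
  'salomon'    => [1, 4, 5, 19],
  'merrell'    => [2, 7],
  'timberland' => [3]
}

# Assumption for this sketch: the weight of a facet is the number
# of ids stored under its token.
def facets(inverted_index, more_than: 0)
  inverted_index
    .map    { |token, ids| [token, ids.size] }
    .select { |_token, weight| weight > more_than }
    .to_h
end

p facets(brand_index)
p facets(brand_index, more_than: 1)
```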
&lt;p&gt;What about facet filtering?&lt;/p&gt;
&lt;h3&gt;Filtered facets on the Search instance&lt;/h3&gt;
&lt;p&gt;This one was a bit of a head scratcher. Picky does not have any indexes that would allow it to easily extract filtered facets.&lt;/p&gt;
&lt;p&gt;What was I to do?&lt;/p&gt;
&lt;p&gt;Remembering &amp;#8220;write first&amp;#8221; I simply made it work, disregarding all performance issues. Some details are omitted:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;def facets category_identifier, options = {}
  weights = index.facets category_identifier, options
  
  return weights unless filter_query = options[:filter]
  
  weights.select do |key, weight|
    search(&quot;#{filter_query} #{category_identifier}:#{key}&quot;, 0, 0).total &amp;gt; 0
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is used like so:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;search.facets :brand, filter: 'gender:unisex', more_than: 3.14&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let&amp;#8217;s look at the code pieces in turn:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;weights = index.facets category_identifier, options&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This gets the facet hash from the index&amp;#8217;s &lt;code&gt;facets&lt;/code&gt; method shown in the last section.&lt;/p&gt;
&lt;p&gt;If we don&amp;#8217;t filter:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;return weights unless filter_query = options[:filter]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;we simply return it as-is, as in the &lt;code&gt;facets&lt;/code&gt; method on an index.&lt;/p&gt;
&lt;p&gt;If we need to filter, go over all facets, and remove the ones where we get zero results when applying the filter:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;weights.select do |key, weight|
  search(&quot;#{filter_query} #{category_identifier}:#{key}&quot;, 0, 0).total &amp;gt; 0
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This returns a facet hash as in the other method.&lt;/p&gt;
&lt;p&gt;Note that Picky actually runs a query for each facet.&lt;/p&gt;
&lt;p&gt;Is this a problem? It was for David, as he had more than 100 facets. So for each of the 100 facets, a query was run.&lt;/p&gt;
&lt;p&gt;However, only in extreme cases do facets number more than 20. I&amp;#8217;d say a more useful range is 3 to 10 (see &lt;a href=&quot;http://www.trailspace.com/gear/boots/midweight/&quot;&gt;http://www.trailspace.com/gear/boots/midweight/&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;In addition to that, facet results are highly cacheable. There is no reason not to cache this result – except, of course, if the data is highly dynamic. But even then, I&amp;#8217;d cache it for half an hour.&lt;/p&gt;
&lt;p&gt;If you look at the last piece of code, you notice something: &lt;code&gt;filter_query&lt;/code&gt; is passed into that search multiple times. Couldn&amp;#8217;t that be optimized?&lt;/p&gt;
&lt;h2&gt;Clean up later&lt;/h2&gt;
&lt;p&gt;Indeed it can. But remember, we wanted to get it out and working first. This serves a dual purpose:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;A user can already work with it, with the promise of it getting faster.&lt;/li&gt;
	&lt;li&gt;I am now under pressure to improve it.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The above code then resulted in this mini roadmap for facets:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;&lt;del&gt;Write first simple implementation.&lt;/del&gt; (This can be released as &amp;#8220;experimental&amp;#8221;)&lt;/li&gt;
	&lt;li&gt;Improve the code by not tokenizing the filter query each time. (This can be released officially)&lt;/li&gt;
	&lt;li&gt;Optimize the code by either redefining the &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;, or only partially running the query. (This can be released in a white paper)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;What do I mean by #2? Again, for each facet, Picky does the work of tokenizing the &lt;code&gt;filter_query&lt;/code&gt; that is interpolated into the query. See:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;search(&quot;#{filter_query} #{category_identifier}:#{key}&quot;, 0, 0).total &amp;gt; 0&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is bad, of course. So we could rewrite the method to only accept a pretokenized filter, something like:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;search.facets :brand, filter: [['gender'], 'unisex', ['price', 'age'], 50], more_than: 3.14&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So, a filter would be an array of pairs, &lt;code&gt;filter categories&lt;/code&gt; and &lt;code&gt;filter value&lt;/code&gt;. This would reduce the impact on Picky a lot already. However, I like the flexibility of passing in a search string to filter.&lt;/p&gt;
&lt;p&gt;So #2 means that Picky will process the string once, and we will then use the tokenized results to put together an optimized query. Something akin to:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;filter_tokens = tokenize filter_query
facets.select do |key, _|
  query_tokens = tokenize &quot;#{category_identifier}:#{key}&quot;
  search_with(filter_tokens + query_tokens, 0, 0).total &amp;gt; 0
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Suddenly we don&amp;#8217;t do as much work anymore. Nice.&lt;/p&gt;
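&lt;p&gt;To make the idea concrete, here is a small runnable sketch built on toy stand-ins. &lt;code&gt;ToySearch&lt;/code&gt;, its &lt;code&gt;tokenize&lt;/code&gt; and its &lt;code&gt;total_for&lt;/code&gt; are hypothetical and far simpler than Picky&amp;#8217;s real tokenizer and query engine; the only point is that the filter string is tokenized once, outside the loop over the facets.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Toy stand-in, not Picky: documents are plain hashes.
class ToySearch
  def initialize(documents)
    @documents = documents
  end

  # Splits 'gender:unisex price:50' into [[:gender, 'unisex'], [:price, '50']].
  def tokenize(text)
    text.split.map do |part|
      category, value = part.split(':', 2)
      [category.to_sym, value]
    end
  end

  # Counts the documents that match every [category, value] token.
  def total_for(tokens)
    @documents.count do |document|
      tokens.all? { |category, value| document[category] == value }
    end
  end

  # Keeps the facet values that still have results with the filter applied.
  def facets(category, values, filter:)
    filter_tokens = tokenize(filter) # done once, not once per facet
    values.select do |value|
      total_for(filter_tokens + [[category, value]]).positive?
    end
  end
end

shoes = ToySearch.new([
  { brand: 'salomon', gender: 'unisex' },
  { brand: 'merell',  gender: 'men' }
])

p shoes.facets(:brand, %w[salomon merell], filter: 'gender:unisex')&lt;/code&gt;&lt;/pre&gt;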
&lt;p&gt;Point #3 is a bit harder, and usually, this is optional, or a coding/thinking goodie for later. Here, I could partially evaluate the filter query, and then use the halfway evaluated query to inject it with the variable parts (each facet), and continue running it for the final result. If this just sounded like garbled blah to you &amp;#8211; it&amp;#8217;s fine. It just means I have no idea how to specifically do this. Yet.&lt;/p&gt;
&lt;h2&gt;In short&lt;/h2&gt;
&lt;p&gt;This is how I develop Picky features:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;Listen to the needs of your users.&lt;/li&gt;
	&lt;li&gt;Check if the need goes against the Picky grain.&lt;/li&gt;
	&lt;li&gt;Say &amp;#8220;Not yet.&amp;#8221;&lt;/li&gt;
	&lt;li&gt;Implement stupidly.&lt;/li&gt;
	&lt;li&gt;Release experimentally.&lt;/li&gt;
	&lt;li&gt;Say &amp;#8220;Please try.&amp;#8221;&lt;/li&gt;
	&lt;li&gt;Refine cleverly.&lt;/li&gt;
	&lt;li&gt;Release officially.&lt;/li&gt;
	&lt;li&gt;Leave ultra-cool rewrite for a glorious future.&lt;/li&gt;
	&lt;li&gt;Wait for next user request.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;And that is it.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">And&amp;nbsp;faster&amp;nbsp;still</title>
   <link href="http://florianhanke.com/blog/2012/07/16/and-faster-still.html"/>
   <updated>2012-07-16T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2012/07/16/and-faster-still</id>
   <content type="html">&lt;p&gt;Lately I&amp;#8217;ve been obsessed with making Picky as fast as possible (while not sacrificing any flexibility).&lt;/p&gt;
&lt;p&gt;This post is all about exploiting Picky&amp;#8217;s flexibility to gain speed. We&amp;#8217;ll also push towards its extremes to see how to sacrifice some of the flexibility to gain even more speed!&lt;/p&gt;
&lt;p&gt;So if you need a high performance Picky, or simply like to see big numbers: This is the post for you!&lt;/p&gt;
&lt;p&gt;As is the trade off of the high priests of speed: On the altar of performance, they are going to sacrifice flexibility…&lt;/p&gt;
&lt;h2&gt;The tests&lt;/h2&gt;
&lt;p&gt;All tests are run on my MacBook Pro 2010 model with 2 cores. They are all based on the standard Picky example you get when you run:&lt;/p&gt;
&lt;pre class=&quot;sh_shell&quot;&gt;&lt;code&gt;$ picky generate server some_server_directory&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We will modify that example slightly to adapt it to use different servers, however.&lt;/p&gt;
&lt;p&gt;We run three queries of varying complexity. First, just &amp;#8220;a&amp;#8221; (which means &amp;#8220;a*&amp;#8221;), complexity 1, then &amp;#8220;a* a&amp;#8221;, complexity 2, then &amp;#8220;a* a* a&amp;#8221;, complexity 3 (see below for results of these queries). This covers more than 99% of all usual Picky search cases.
As Picky is a combinatorial search engine, we expect a nonlinearly increasing query duration.&lt;/p&gt;
&lt;p&gt;How much we will find out :)&lt;/p&gt;
&lt;p&gt;All numbers are in requests per second.&lt;/p&gt;
&lt;h2&gt;Unicorn&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;http://unicorn.bogomips.org/&quot;&gt;Unicorn&lt;/a&gt; is the workhorse of the web servers. It is reliable, can use multiple cores, and has so far been the recommended server for Picky, also because it weakens the impact of GC runs.&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s see how it fares:&lt;/p&gt;
&lt;table&gt;
	&lt;tr&gt;
		&lt;td&gt; Complexity 1: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;619&lt;/strong&gt; &lt;/td&gt;
		&lt;td&gt; = (600 + 632 + 625 + 620 + 619)/5 &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; Complexity 2: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;588&lt;/strong&gt; &lt;/td&gt;
		&lt;td&gt; = (595 + 585 + 580 + 596 + 584)/5 &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; Complexity 3: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;527&lt;/strong&gt; &lt;/td&gt;
		&lt;td&gt; = (561 + 537 + 425 + 552 + 562)/5 &lt;/td&gt;
	&lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;Quite respectably. But we don&amp;#8217;t want a workhorse. We want an Arabian horse that shoots fire out of its nostrils! (and anywhere else, for that matter)&lt;/p&gt;
&lt;h2&gt;Thin (with Sinatra)&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;http://code.macournoyer.com/thin/&quot;&gt;Thin&lt;/a&gt; is a very well known EventMachine-based server. It is fast.&lt;/p&gt;
&lt;p&gt;How fast?&lt;/p&gt;
&lt;table&gt;
	&lt;tr&gt;
		&lt;td&gt; Complexity 1: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1252&lt;/strong&gt; &lt;/td&gt;
		&lt;td&gt; = (1262 + 1213 + 1270 + 1244 + 1269) / 5 &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; Complexity 2: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1059&lt;/strong&gt; &lt;/td&gt;
		&lt;td&gt; = (1091 +  993 + 1042 + 1097 + 1074) / 5 &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; Complexity 3: &lt;/td&gt;
		&lt;td&gt;  &lt;strong&gt;936&lt;/strong&gt; &lt;/td&gt;
		&lt;td&gt; =  (872 +  931 +  946 +  975 +  954) / 5 &lt;/td&gt;
	&lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;That is impressive, given that these are the numbers from one core.&lt;/p&gt;
&lt;p&gt;Two weeks ago, this happened:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;http://174.142.61.111/forum/files/a-challenger-appears-nignog_178.png&quot; style=&quot;float:none;&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;h2&gt;Racer (with Sinatra)&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;http://github.com/charliesome/racer&quot;&gt;Racer&lt;/a&gt; by &lt;a href=&quot;http://twitter.com/charliesome&quot;&gt;Charlie Somerville&lt;/a&gt; is a &amp;#8220;Rack compliant Ruby web server&amp;#8221;. It is mainly based on &lt;a href=&quot;http://github.com/joyent/libuv&quot;&gt;libuv&lt;/a&gt;. According to its &lt;a href=&quot;http://github.com/charliesome/racer/blob/master/README.md&quot;&gt;&lt;span class=&quot;caps&quot;&gt;README&lt;/span&gt;&lt;/a&gt; (worth a look just for the image ;) ), it is twice as fast as thin using a &amp;#8220;Hello world!&amp;#8221; app.&lt;/p&gt;
&lt;p&gt;As Picky performs a bit more work than a simple &amp;#8220;Hello world!&amp;#8221;, it won&amp;#8217;t be twice as fast. But how much faster will it be? Let&amp;#8217;s see…&lt;/p&gt;
&lt;table&gt;
	&lt;tr&gt;
		&lt;td&gt; Complexity 1: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1370&lt;/strong&gt; &lt;/td&gt;
		&lt;td&gt; = (1374 + 1381 + 1384 + 1374 + 1337)/5 &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; Complexity 2: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1134&lt;/strong&gt; &lt;/td&gt;
		&lt;td&gt; = (1243 + 1153 + 1088 + 1072 + 1115)/5 &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; Complexity 3: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1094&lt;/strong&gt; &lt;/td&gt;
		&lt;td&gt; = (1143 + 1081 + 1081 + 1080 + 1084)/5 &lt;/td&gt;
	&lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;Now, why don&amp;#8217;t we get double the speed of thin, as shown on &lt;a href=&quot;https://github.com/charliesome/racer&quot;&gt;Racer&amp;#8217;s webpage&lt;/a&gt;, but just 10% more? The thing is, instead of just returning &amp;#8220;hello world&amp;#8221;, Picky needs to do a bit of work.&lt;/p&gt;
&lt;h2&gt;Picky vs. Racer&lt;/h2&gt;
&lt;p&gt;To calculate how much of this time is needed by Picky, let&amp;#8217;s assume &amp;#8220;hello world&amp;#8221; takes no time at all, and Racer is twice as fast as thin. With Picky, Racer is only 10% faster than thin. What does this tell us about Picky?&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s calculate a bit. With the time from &amp;#8220;hello world&amp;#8221; ignored we know:&lt;/p&gt;
&lt;table&gt;
	&lt;tr&gt;
		&lt;td&gt; 1: &lt;/td&gt;
		&lt;td&gt; T(thin) / T(racer) == 2 &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; 2: &lt;/td&gt;
		&lt;td&gt; (T(thin) + T(picky)) / (T(racer) + T(picky)) == 1.1 &lt;/td&gt;
	&lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;Rewriting:&lt;/p&gt;
&lt;table&gt;
	&lt;tr&gt;
		&lt;td&gt; 3: &lt;/td&gt;
		&lt;td&gt; T(thin) + T(picky) == 1.1*T(racer) + 1.1*T(picky) &lt;/td&gt;
		&lt;td&gt; from 2. &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; 4: &lt;/td&gt;
		&lt;td&gt; T(thin) - 1.1*T(racer) == 0.1*T(picky) &lt;/td&gt;
		&lt;td&gt; from 3. &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; 5: &lt;/td&gt;
		&lt;td&gt; T(thin) == 2*T(racer) &lt;/td&gt;
		&lt;td&gt; from 1. &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; 6: &lt;/td&gt;
		&lt;td&gt; 0.9*T(racer) == 0.1*T(picky) &lt;/td&gt;
		&lt;td&gt; from 4, 5. &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; 7: &lt;/td&gt;
		&lt;td&gt; T(picky) == 9*T(racer) &lt;/td&gt;
		&lt;td&gt; from 6. &lt;/td&gt;
	&lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;So, Picky (including Sinatra) takes around 9 times longer than Racer. Let&amp;#8217;s remember this for our conclusion.&lt;/p&gt;
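&lt;p&gt;The same derivation can be checked with exact arithmetic. This is a sketch under the assumptions stated above (&amp;#8220;hello world&amp;#8221; costs nothing, thin alone takes twice as long as Racer); the method name &lt;code&gt;picky_time_in_racer_units&lt;/code&gt; is made up for this example.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Solves (k*R + P) / (R + P) == s for P, expressed in units of R = T(racer).
#
#   k*R + P == s*R + s*P
#   (k - s)*R == (s - 1)*P
#   P == R * (k - s) / (s - 1)
#
# k: how many times longer thin takes than racer on its own (2 above).
# s: the ratio measured with Picky included (1.1 above).
def picky_time_in_racer_units(k, s)
  (k - s) / (s - 1)
end

# Rational keeps 1.1 exact, so no float rounding creeps in.
puts picky_time_in_racer_units(2, Rational(11, 10))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Plugging in other ratios shows how sensitive the result is: the closer the measured ratio gets to 1, the larger the share of time spent outside the web server.&lt;/p&gt;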
&lt;h2&gt;Multiple processes&lt;/h2&gt;
&lt;p&gt;In the Ruby web app world, to get more speed, we usually run more processes.&lt;/p&gt;
&lt;p&gt;As Racer cannot yet accept on file descriptors, I am going to use &lt;span class=&quot;caps&quot;&gt;HTTP&lt;/span&gt; load balancers &lt;a href=&quot;http://siag.nu/pen/&quot;&gt;Pen&lt;/a&gt; and &lt;a href=&quot;http://www.nginx.org/&quot;&gt;Nginx&lt;/a&gt; and see how they fare on my 2 core &lt;span class=&quot;caps&quot;&gt;MBP&lt;/span&gt;.&lt;/p&gt;
&lt;h2&gt;Pen (with Racer)&lt;/h2&gt;
&lt;table&gt;
	&lt;tr&gt;
		&lt;td&gt; Compl. 1: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1993&lt;/strong&gt; &lt;/td&gt;
		&lt;td&gt; = (2140 + 1915 + 1901 + 2142 + 1869)/5 &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1370&lt;/strong&gt; (1 core) &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; Compl. 2: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1696&lt;/strong&gt; &lt;/td&gt;
		&lt;td&gt; = (1798 + 1735 + 1631 + 1644 + 1673)/5 &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1134&lt;/strong&gt; (1 core) &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; Compl. 3: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1490&lt;/strong&gt; &lt;/td&gt;
		&lt;td&gt; = (1256 + 1546 + 1541 + 1542 + 1565)/5 &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1094&lt;/strong&gt; (1 core) &lt;/td&gt;
	&lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;Certainly a good result, and plausible since it is not 2x as fast.&lt;/p&gt;
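&lt;p&gt;How far from 2x is it? A quick calculation over the numbers in the table above; the helper name &lt;code&gt;speedup&lt;/code&gt; is only for illustration.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Ratio of two request rates, rounded to two decimals.
def speedup(single_core_rps, dual_core_rps)
  (dual_core_rps.to_f / single_core_rps).round(2)
end

# [complexity, Racer on 1 core, Racer behind Pen on 2 cores]
measurements = [[1, 1370, 1993], [2, 1134, 1696], [3, 1094, 1490]]

measurements.each do |complexity, single, dual|
  puts format('Complexity %d: %.2fx', complexity, speedup(single, dual))
end&lt;/code&gt;&lt;/pre&gt;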
&lt;h2&gt;Nginx (with Racer)&lt;/h2&gt;
&lt;table&gt;
	&lt;tr&gt;
		&lt;td&gt; Compl. 1: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;2048&lt;/strong&gt; &lt;/td&gt;
		&lt;td&gt; = (2078 + 1993 + 1790 + 2177 + 2203)/5 &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1370&lt;/strong&gt; (1 core) &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; Compl. 2: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1765&lt;/strong&gt; &lt;/td&gt;
		&lt;td&gt; = (1660 + 1843 + 1830 + 1684 + 1808)/5 &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1134&lt;/strong&gt; (1 core) &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; Compl. 3: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1489&lt;/strong&gt; &lt;/td&gt;
		&lt;td&gt; = (1549 + 1456 + 1463 + 1473 + 1503)/5 &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1094&lt;/strong&gt; (1 core) &lt;/td&gt;
	&lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;Nginx seems to be a bit more speed-stable than Pen, but otherwise in the same ball-park.&lt;/p&gt;
&lt;h2&gt;Sacrificing flexibility&lt;/h2&gt;
&lt;p&gt;A high priest of speed approaches us to remind us of a good rule:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;To gain speed, one must often sacrifice an abstraction layer and its inherent flexibility. Evaluate if this flexibility is needed, and if not, sacrifice without remorse.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The question here is: Do we really need the routing etc. capabilities of Sinatra? (while still keeping the abstraction given to us by Rack)&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s assume we don&amp;#8217;t and rewrite our app a bit. To remove Sinatra, we simply do not inherit from &lt;code&gt;Sinatra::Base&lt;/code&gt; and install a &lt;code&gt;#call&lt;/code&gt; method on our class.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Prepare a few pseudo-constants.
#
query_string = &quot;QUERY_STRING&quot;.freeze
result_array = [200, { &quot;Content-Type&quot; =&amp;gt; &quot;text/html&quot; }, []]
regexp       = /\Aquery=([^&amp;amp;]+)&amp;amp;ids=([^&amp;amp;]+)&amp;amp;offset=([^&amp;amp;]+)\z/

# Define #call method.
#
define_method :call do |env|
  # Extract relevant parameters.
  #
  _, query, ids, offset = *env[query_string].match(regexp)
  results = books.search query, ids || 20, offset || 0
  
  # Put together result.
  #
  result_array[2][0] = results.to_json
  
  result_array
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that we manually extract the parameters from the query_string, and thus reduce the work done to only what we actually need. We don&amp;#8217;t need routing or any other processing.&lt;/p&gt;
&lt;p&gt;However, we now can only call our app with a strictly ordered query string (and lose the flexibility afforded to us by Sinatra):&lt;/p&gt;
&lt;pre class=&quot;sh_shell&quot;&gt;&lt;code&gt;?query=S&amp;amp;ids=N&amp;amp;offset=M&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;(However, we still get Rack-conformant data)&lt;/p&gt;
&lt;p&gt;We run it the exact same way as the Sinatra app:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;run BookSearch.new&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;(We can do this since we still use the abstraction defined by Rack)&lt;/p&gt;
&lt;h2&gt;Removing Sinatra&lt;/h2&gt;
&lt;p&gt;Let&amp;#8217;s see how our no-Sinatra approach turns out and compare:&lt;/p&gt;
&lt;table&gt;
	&lt;tr&gt;
		&lt;td&gt; Compl. 1: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;3972&lt;/strong&gt; &lt;/td&gt;
		&lt;td&gt; = (3855 + 3900 + 4203 + 3574 + 4329)/5 &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;2048&lt;/strong&gt; (Sinatra) &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; Compl. 2: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;2295&lt;/strong&gt; &lt;/td&gt;
		&lt;td&gt; = (2246 + 2352 + 2337 + 2294 + 2245)/5 &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1765&lt;/strong&gt; (Sinatra) &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; Compl. 3: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1173&lt;/strong&gt; &lt;/td&gt;
		&lt;td&gt; = (1157 + 1157 + 1155 + 1166 + 1232)/5 &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1489&lt;/strong&gt; (Sinatra) &lt;/td&gt;
	&lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;Quite breathtaking, especially in the low complexity case!&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s calculate again a bit. We know that:&lt;/p&gt;
&lt;table&gt;
	&lt;tr&gt;
		&lt;td&gt; 1: &lt;/td&gt;
		&lt;td&gt; T(picky + sinatra) == 9*T(racer) == 1/2000 (roughly) &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; 2: &lt;/td&gt;
		&lt;td&gt; T(picky) == ?*T(racer) == 1/4000 (roughly) &lt;/td&gt;
	&lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;Rewriting:&lt;/p&gt;
&lt;table&gt;
	&lt;tr&gt;
		&lt;td&gt; 3: &lt;/td&gt;
		&lt;td&gt; T(picky + sinatra) == 2*T(picky) &lt;/td&gt;
		&lt;td&gt; from 1, 2. &lt;/td&gt;
	&lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;This was easier!&lt;/p&gt;
&lt;p&gt;From this we see that Sinatra takes as much time as does Picky in the low complexity case. For complexity 2, Sinatra takes about 30% of the time that Picky takes (2295/1765 is roughly 1.3).&lt;/p&gt;
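&lt;p&gt;The ratio can be read straight off the requests per second in the table. A rough sketch that, like the calculation above, counts everything that is not Sinatra as &amp;#8220;Picky&amp;#8221;; the helper name &lt;code&gt;sinatra_overhead&lt;/code&gt; is made up.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Time per request is 1/rps, so:
#
#   T(sinatra) / T(picky)
#     == (1/with - 1/without) / (1/without)
#     == without/with - 1
def sinatra_overhead(with_sinatra_rps, without_sinatra_rps)
  (without_sinatra_rps.to_f / with_sinatra_rps - 1).round(2)
end

puts sinatra_overhead(2048, 3972) # complexity 1
puts sinatra_overhead(1765, 2295) # complexity 2&lt;/code&gt;&lt;/pre&gt;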
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Given that we want speed, and only speed: Knowing that Sinatra and Picky each take about 4.5x the time that Racer does – is it prudent to try many fast servers, or should one simply not use Sinatra?&lt;/p&gt;
&lt;p&gt;We arrive at:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Which app server to choose is not as relevant as deciding whether to use Sinatra.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Surprised?&lt;/p&gt;
&lt;p&gt;Note (especially to Sinatra fans): Remember, this is always under the assumption that speed is the ultimate goal, and that flexibility can be sacrificed.&lt;/p&gt;
&lt;p&gt;However:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;If the ultimate speed is what you need, choosing a fast server also becomes important.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;That one is pretty obvious.&lt;/p&gt;
&lt;p&gt;What if we go one step further?&lt;/p&gt;
&lt;h2&gt;Next up: Sacrificing Rack?&lt;/h2&gt;
&lt;p&gt;The big question is:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What happens when we give up the flexibility afforded by Rack?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s say we were to rewrite Racer such that it would not call our app anymore with Rack-conformant data, but only with minimally processed data (for example, we would not process the domain, but only extract the query string).&lt;/p&gt;
&lt;p&gt;How fast can we get this thing? Please tune in in the next blog post, where we explore rewriting Racer for ultimate speed.&lt;/p&gt;
&lt;h2&gt;Footnote 1: The pinnacle of ultimate speed&lt;/h2&gt;
&lt;p&gt;To compare: How fast would this be without app servers?&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s first see how fast we can get in pure Ruby:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'benchmark'

p Benchmark.measure {
  5000.times {
    results = books.search 'a', 20, 0 # and &quot;a* a&quot;, and &quot;a* a* a&quot;, as above.
    results.to_json
  }
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Running this on a single core yields us the following (rounded) numbers:&lt;/p&gt;
&lt;table&gt;
	&lt;tr&gt;
		&lt;td&gt; Complexity 1: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;6250&lt;/strong&gt; &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; Complexity 2: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;3000&lt;/strong&gt; &lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt; Complexity 3: &lt;/td&gt;
		&lt;td&gt; &lt;strong&gt;1500&lt;/strong&gt; &lt;/td&gt;
	&lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;Impressive.&lt;/p&gt;
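&lt;p&gt;Numbers like the ones above are simply the run count divided by the elapsed time. A minimal sketch using Ruby&amp;#8217;s standard &lt;code&gt;Benchmark.realtime&lt;/code&gt;; the helper &lt;code&gt;requests_per_second&lt;/code&gt; and the stand-in workload are assumptions, since the real measurement needs the &lt;code&gt;books&lt;/code&gt; index.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'benchmark'

# Runs the given block `runs` times and returns the rate in runs per second.
def requests_per_second(runs)
  elapsed = Benchmark.realtime { runs.times { yield } }
  runs / elapsed
end

# Stand-in workload. In the measurement above this would be:
#   books.search('a', 20, 0).to_json
rate = requests_per_second(5000) { (1..100).reduce(:+) }

puts rate.round&lt;/code&gt;&lt;/pre&gt;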
&lt;h2&gt;Footnote 2: Results&lt;/h2&gt;
&lt;p&gt;&lt;span class=&quot;caps&quot;&gt;FYI&lt;/span&gt;, these are the &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt; results Picky put together for each &lt;span class=&quot;caps&quot;&gt;HTTP&lt;/span&gt; response:&lt;/p&gt;
&lt;p&gt;a: &lt;pre class=&quot;sh_json&quot;&gt;&lt;code&gt;{&quot;allocations&quot;:[[&quot;books&quot;,18.439999999999998,74,[[&quot;author&quot;,&quot;a&quot;,&quot;a&quot;]],[4,7,8,11,18,38,48,51,55,80,97,108,117,119,125,126,132,134,138,140]]],&quot;offset&quot;:0,&quot;duration&quot;:0.000163,&quot;total&quot;:74}&lt;/code&gt;&lt;/pre&gt;&lt;/p&gt;
&lt;p&gt;a*-a: &lt;pre class=&quot;sh_json&quot;&gt;&lt;code&gt;{&quot;allocations&quot;:[[&quot;books&quot;,9.872,36,[[&quot;author&quot;,&quot;a*&quot;,&quot;a&quot;],[&quot;title&quot;,&quot;a&quot;,&quot;a&quot;]],[4,7,8,11,18,38,48,51,55,80,117,119,132,134,138,142,165,184,227,239]],[&quot;books&quot;,6.568,262,[[&quot;title&quot;,&quot;a*&quot;,&quot;a&quot;],[&quot;title&quot;,&quot;a&quot;,&quot;a&quot;]],[2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]]],&quot;offset&quot;:0,&quot;duration&quot;:0.00019,&quot;total&quot;:36}&lt;/code&gt;&lt;/pre&gt;&lt;/p&gt;
&lt;p&gt;a*-a*-a: &lt;pre class=&quot;sh_json&quot;&gt;&lt;code&gt;{&quot;allocations&quot;:[[&quot;books&quot;,15.44,36,[[&quot;author&quot;,&quot;a*&quot;,&quot;a&quot;],[&quot;title&quot;,&quot;a*&quot;,&quot;a&quot;],[&quot;title&quot;,&quot;a&quot;,&quot;a&quot;]],[4,7,8,11,18,38,48,51,55,80,117,119,132,134,138,142,165,184,227,239]],[&quot;books&quot;,9.872,36,[[&quot;title&quot;,&quot;a*&quot;,&quot;a&quot;],[&quot;author&quot;,&quot;a*&quot;,&quot;a&quot;],[&quot;title&quot;,&quot;a&quot;,&quot;a&quot;]],[4,7,8,11,18,38,48,51,55,80,117,119,132,134,138,142,165,184,227,239]],[&quot;books&quot;,6.568,262,[[&quot;title&quot;,&quot;a*&quot;,&quot;a&quot;],[&quot;title&quot;,&quot;a*&quot;,&quot;a&quot;],[&quot;title&quot;,&quot;a&quot;,&quot;a&quot;]],[2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]]],&quot;offset&quot;:0,&quot;duration&quot;:0.000226,&quot;total&quot;:36}&lt;/code&gt;&lt;/pre&gt;&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;nbsp;Statistics&amp;nbsp;Interface</title>
   <link href="http://florianhanke.com/blog/2012/07/02/picky-statistics-interface.html"/>
   <updated>2012-07-02T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2012/07/02/picky-statistics-interface</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/index.html&quot;&gt;Picky&lt;/a&gt; series on its workings. If you haven&amp;#8217;t tried it yet, do so in the &lt;a href=&quot;http://florianhanke.com/picky/getting_started.html&quot;&gt;Getting Started&lt;/a&gt; section. It&amp;#8217;s quick and painless :)&lt;/p&gt;
&lt;p&gt;This post is about a fun statistics interface I&amp;#8217;ve been working on, including a video. Download 4.5.2+, and enter this in your preferred shell.&lt;/p&gt;
&lt;pre class=&quot;sh_shell&quot;&gt;&lt;code&gt;picky stats path/to/log/file.log&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This will tell you this:&lt;/p&gt;
&lt;pre class=&quot;sh_shell&quot;&gt;&lt;code&gt;Logfile path/to/log/file.log found.
Clam, Picky's friend, is looking at Picky's logfile
path/to/log/file.log
and showing results on port 4567.
== Sinatra/1.3.2 has taken the stage on 4567 for development with backup from Thin
&amp;gt;&amp;gt; Thin web server (v1.3.1 codename Triple Espresso)
&amp;gt;&amp;gt; Maximum connections set to 1024
&amp;gt;&amp;gt; Listening on 0.0.0.0:4567, CTRL+C to stop&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then, in another shell, enter&lt;/p&gt;
&lt;pre class=&quot;sh_shell&quot;&gt;&lt;code&gt;open localhost:4567&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;(on &lt;span class=&quot;caps&quot;&gt;OSX&lt;/span&gt;) and have fun!&lt;/p&gt;
&lt;h2&gt;Video Demo&lt;/h2&gt;
&lt;p&gt;See this short video (it&amp;#8217;s best to full-screen it):&lt;/p&gt;
&lt;p&gt;&lt;iframe src=&quot;http://player.vimeo.com/video/45051903&quot; width=&quot;600&quot; height=&quot;387&quot; frameborder=&quot;0&quot;&gt;&lt;/iframe&gt;&lt;/p&gt;
&lt;p&gt;The interface uses this great JS lib: &lt;a href=&quot;http://square.github.com/crossfilter/&quot;&gt;http://square.github.com/crossfilter/&lt;/a&gt;. Check it out :)&lt;/p&gt;
&lt;h2&gt;Interface Usage Ideas&lt;/h2&gt;
&lt;p&gt;Slice and dice your data:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;What queries are slowest?&lt;/li&gt;
	&lt;li&gt;Are they suspiciously slow in the morning?&lt;/li&gt;
	&lt;li&gt;How many return more than 1 allocation?&lt;/li&gt;
	&lt;li&gt;Do more allocations also mean slower queries? Or more results?&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Etc.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Visual&amp;nbsp;Programming&amp;nbsp;1</title>
   <link href="http://florianhanke.com/blog/2012/06/25/visual-programming-1.html"/>
   <updated>2012-06-25T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2012/06/25/visual-programming-1</id>
   <content type="html">&lt;p&gt;Have you ever heard of &lt;a href=&quot;https://en.wikipedia.org/wiki/Visual_programming_language&quot;&gt;visual programming&lt;/a&gt;?&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s about &amp;#8220;writing&amp;#8221;, or rather, drawing programs by using a graphical language, where programs aren&amp;#8217;t codified as text, but mostly using boxes and lines, with only little text.&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s one of my favorite hobby horses.&lt;/p&gt;
&lt;h2&gt;Designing a visual programming language&lt;/h2&gt;
&lt;p&gt;As an exercise, I&amp;#8217;d like to design a visual programming language top-down. By top-down I mean that first we are going to sketch out the visual interface, and how it would work.&lt;/p&gt;
&lt;p&gt;We will be looking at whether we can interface the program with its running counterpart, to display debugging information and other information normally hard to look at. Also, can we make a running program changeable, so we can play with the program, as Bret Victor has shown in one of his latest presentations, &lt;a href=&quot;http://vimeo.com/36579366&quot;&gt;Inventing on Principle&lt;/a&gt;?&lt;/p&gt;
&lt;p&gt;If the design proves feasible, we will start working on an implementation. I&amp;#8217;d like to do it in Ruby, but I get a strong feeling that Haskell would be far more suited. Why? Because it&amp;#8217;s a purely functional language. But more on that later. We&amp;#8217;ll see.&lt;/p&gt;
&lt;h2&gt;Dataflow programming&lt;/h2&gt;
&lt;p&gt;Let&amp;#8217;s look at some ideas. &lt;a href=&quot;https://en.wikipedia.org/wiki/Dataflow_programming&quot;&gt;Dataflow programming&lt;/a&gt; is an often used paradigm in visual programming environments where graphics are processed. In dataflow programming, all operations on data are represented by boxes with multiple inputs. If all inputs are &amp;#8220;ready&amp;#8221;, ie. have valid data to offer, the operation will run.&lt;/p&gt;
&lt;p&gt;One famous example of this is &lt;a href=&quot;https://en.wikipedia.org/wiki/Quartz_Composer&quot;&gt;Quartz Composer&lt;/a&gt;. If you haven&amp;#8217;t seen it and own &lt;span class=&quot;caps&quot;&gt;OSX&lt;/span&gt;, you should type cmd-space quartz composer right now and have a look. It&amp;#8217;s great to play with. Some people even used it to make a 24 hours music video stream once… (ah, the memories).&lt;/p&gt;
&lt;p&gt;The data in dataflow programming comes from various sources, always positioned on the left hand side, and flowing to the right, where the sinks are. A sink might be a display of some sort, or a loudspeaker.&lt;/p&gt;
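&lt;p&gt;The firing rule described above (an operation runs once all of its inputs are ready) fits in a few lines of Ruby. This is only a sketch: the class name &lt;code&gt;Box&lt;/code&gt; and the use of &lt;code&gt;nil&lt;/code&gt; for &amp;#8220;no data yet&amp;#8221; are assumptions, not how Quartz Composer or any other environment works internally.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical dataflow box: runs its operation once every input has a value.
class Box
  attr_reader :output

  def initialize(input_names, operation)
    # nil marks an input that has not received data yet.
    @inputs = input_names.to_h { |name| [name, nil] }
    @operation = operation
  end

  # Feeds one input and fires the operation as soon as all inputs are ready.
  def feed(name, value)
    @inputs[name] = value
    @output = @operation.call(**@inputs) if ready?
  end

  def ready?
    @inputs.values.none?(nil)
  end
end

adder = Box.new(%i[left right], proc { |left:, right:| left + right })

adder.feed(:left, 1)  # only one input is ready, nothing runs
adder.feed(:right, 2) # both inputs are ready, the operation runs

puts adder.output&lt;/code&gt;&lt;/pre&gt;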
&lt;h2&gt;This language&lt;/h2&gt;
&lt;p&gt;However, in my imagination, programs look more like &lt;a href=&quot;https://en.wikipedia.org/wiki/Tree_structure&quot;&gt;trees&lt;/a&gt;, or a forest (a set of trees), that are traversed according to rules set in the nodes.&lt;/p&gt;
&lt;p&gt;&amp;#8220;Huh?&amp;#8221;, you wonder, &amp;#8220;Tree? What?&amp;#8221;. Consider this piece of Ruby code:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;def this_or_that thing
  if thing
    this thing
  else
    that thing
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let&amp;#8217;s assume the tree grows from the left to the right.&lt;/p&gt;
&lt;p&gt;This method would be represented by a node, with two branches growing to the right. One branch would be traversed if the thing wasn&amp;#8217;t nil or false, and the other if it was.&lt;/p&gt;
&lt;p&gt;In fact, the &lt;code&gt;this_or_that&lt;/code&gt; method could be considered a tree itself, with a root and two branches, that can be grafted onto other trees. The newly formed tree would represent a new program.&lt;/p&gt;
&lt;p&gt;So instead of flowing from left to right, a – let&amp;#8217;s say – &lt;a href=&quot;https://en.wikipedia.org/wiki/Logo_(programming_language)&quot;&gt;turtle&lt;/a&gt; would traverse the tree down to its branches, and return, following instructions along the way.&lt;/p&gt;
&lt;p&gt;Going right would represent function calls, and left returning from them.&lt;/p&gt;
&lt;p&gt;The tree metaphor is fitting in other ways. Perhaps you&amp;#8217;ve seen the &lt;a href=&quot;http://groups.csail.mit.edu/mac/classes/6.001/abelson-sussman-lectures/&quot;&gt;fantastic video series by Abelson and Sussman&lt;/a&gt;? In the first lesson they talk about interfaces.
Let&amp;#8217;s say you collapse a subtree of a program – if you then are looking at a useful set of inputs, you know your interface is fine.
If you still understand the program when collapsing a subtree, chances are you did a good job of designing the interfaces.&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s look at a quick example. The program is about cleaning your house.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/2012-06-25-program.png&quot; style=&quot;float:none;&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;(The instructions for &lt;code&gt;Clean Kitchen&lt;/code&gt; and &lt;code&gt;Clean Garage&lt;/code&gt; are collapsed)&lt;/p&gt;
&lt;p&gt;If you collapsed everything that was &amp;#8220;inside&amp;#8221; &lt;code&gt;Clean Bedroom&lt;/code&gt;, i.e. &lt;code&gt;Clean Bed&lt;/code&gt;, &lt;code&gt;Clean Floor&lt;/code&gt;, &lt;code&gt;Clean Windows&lt;/code&gt;, would you still understand the program itself?&lt;/p&gt;
&lt;p&gt;Yes you would! Whoever is going to do the actual work on cleaning the bedroom would need the detailed instructions, but you know that one of the things that will be cleaned will be the bedroom (well, assuming your kid is actually doing the work – if it&amp;#8217;s the compiler, the work will be done). You still understand the program itself: Possibly well designed.&lt;/p&gt;
&lt;p&gt;You can already see that naming the nodes is very important, and that it &lt;em&gt;might&lt;/em&gt; be easier designing APIs using a visual programming language than text.&lt;/p&gt;
&lt;h2&gt;So far&lt;/h2&gt;
&lt;p&gt;Without really designing anything, we already made a lot of assumptions on how this language is going to look.&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s just throw our example element up there:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/2012-06-25-exists.png&quot; style=&quot;float:none;&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;That looks ok. There&amp;#8217;s an input and two function calls. We&amp;#8217;ve come surprisingly far already, since before looking at simple components, we&amp;#8217;ve already built our more complex component.&lt;/p&gt;
&lt;p&gt;Did you notice anything? Do we have to use constrained method names like &lt;code&gt;does_it_exist?&lt;/code&gt;? Do we have to use &lt;code&gt;true&lt;/code&gt; or &lt;code&gt;false&lt;/code&gt;? Is the underlying language in any way important for how we &lt;em&gt;name&lt;/em&gt; things?&lt;/p&gt;
&lt;p&gt;No, no, and no: Luckily this is not important anymore. You ask: &amp;#8220;How is this possible, when this is incredibly important in &lt;em&gt;my favorite programming language&lt;/em&gt;?&amp;#8221; Magic?&lt;/p&gt;
&lt;h2&gt;Possibly next&lt;/h2&gt;
&lt;p&gt;Knowing a bit about visual programming languages:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;How would you design a &lt;code&gt;for&lt;/code&gt; loop? Would it be easy to do? (My guess: Hard.) Is it needed? (No.)&lt;/li&gt;
	&lt;li&gt;How about a &lt;code&gt;map&lt;/code&gt; operation? (Easy)&lt;/li&gt;
	&lt;li&gt;How would you combine functions? (Super easy)&lt;/li&gt;
	&lt;li&gt;What does it mean to have a higher-order function? (Uh-oh)&lt;/li&gt;
&lt;/ul&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Business&amp;nbsp;Cards</title>
   <link href="http://florianhanke.com/blog/2012/06/20/business-cards.html"/>
   <updated>2012-06-20T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2012/06/20/business-cards</id>
   <content type="html">&lt;p&gt;Last year, &lt;a href=&quot;http://absurd.li/&quot;&gt;Kaspar Schiess&lt;/a&gt; and I opened our own business: &lt;a href=&quot;http://technologyastronauts.ch&quot;&gt;Technology Astronauts&lt;/a&gt;! This May, we started in earnest. Hooray!&lt;/p&gt;
&lt;p&gt;(Wait until crowd&amp;#8217;s cheers calm down)&lt;/p&gt;
&lt;p&gt;However, too many times we found ourselves short a business card when friendly exchanges were in order. Only one way to rectify this:&lt;/p&gt;
&lt;h2&gt;Technology Astronauts Business Cards&lt;/h2&gt;
&lt;p&gt;We opted to go for a very simple, striking high-contrast &lt;a href=&quot;https://en.wikipedia.org/wiki/Woodcut&quot;&gt;xylography&lt;/a&gt; design with a modern font and subjects that are indirectly or directly related to the rough business of astronauting.&lt;/p&gt;
&lt;p&gt;First, the moon:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/2012-06-25-moon.png&quot; style=&quot;float:none;&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;In the northern hemisphere, our main area of operations (apart from, you know, space), a moon seen in this configuration is filling up, mere days away from blasting earth with photons. Its craters remind us of its resilience towards hits, its striking character and its ability to accumulate new material without changing too much: Moon stays moon, born from the earth itself.&lt;/p&gt;
&lt;p&gt;Second, coding astronaut:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/2012-06-25-astronaut.png&quot; style=&quot;float:none;&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Joel Spolsky for example &lt;a href=&quot;http://www.joelonsoftware.com/articles/fog0000000018.html&quot;&gt;sees the astronaut as running out of oxygen&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I disagree: The astronaut represents the ultimate in human achievement in engineering. Astronauts themselves operate in an extreme environment, focusing on a specific number of tasks and performing above the rest of us.&lt;/p&gt;
&lt;p&gt;This is fairly standard in business cards. Where do we differentiate ourselves?&lt;/p&gt;
&lt;h2&gt;&amp;#8220;Put a face on it&amp;#8221;&lt;/h2&gt;
&lt;p&gt;I haven&amp;#8217;t found any business cards with faces on them, but there must be some out there!&lt;/p&gt;
&lt;p&gt;In any case, after dozens of conferences and business meetings I have accumulated about a hundred business cards. Looking at them, I can&amp;#8217;t remember the person behind them, if I don&amp;#8217;t communicate regularly.&lt;/p&gt;
&lt;p&gt;Up to about three weeks after a meeting a face is fresh in my mind. After that? Not so much. If the business card has been handed to me with an accompanying anecdote, I can remember.&lt;/p&gt;
&lt;p&gt;Now, this might be just my brain. But chances are, this might be your brain and memory as well.&lt;/p&gt;
&lt;p&gt;To bring it up to speed much faster, we decided to &amp;#8220;put a face on it&amp;#8221;:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/2012-06-25-florian.png&quot; style=&quot;float:none;&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;This is me laughing uselessly into space. But: We believe that this jogs your memory much better than just mere text. How do you like it?&lt;/p&gt;
&lt;p&gt;Say &amp;#8220;Hi!&amp;#8221; either to Kaspar or me if you want one too :)&lt;/p&gt;
&lt;p&gt;P.S: We like it even better if you say &amp;#8220;Hi, I have this fantastic project for you!&amp;#8221;.&lt;/p&gt;
&lt;p&gt;P.P.S: The &amp;#8220;Put a face on it&amp;#8221; is a reference to &lt;a href=&quot;http://www.youtube.com/watch?v=iHmLljk2t8M&quot;&gt;this YouTube episode of Portlandia&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;P.P.P.S: Joel, you know who is much more about oxygen than astronauts? Divers.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Guest&amp;nbsp;Post:&amp;nbsp;Chris&amp;nbsp;Corbyn&amp;nbsp;of&amp;nbsp;Flippa</title>
   <link href="http://florianhanke.com/blog/2012/03/20/guest-post-chris-corbyn.html"/>
   <updated>2012-03-20T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2012/03/20/guest-post-chris-corbyn</id>
   <content type="html">&lt;p&gt;This is a great guest post by &lt;a href=&quot;http://chriscorbyn.co.uk/&quot;&gt;Chris Corbyn&lt;/a&gt; where he explains the search engine journey undertaken by &lt;a href=&quot;http://flippa.com&quot;&gt;Flippa&lt;/a&gt; and the decisions behind them.&lt;/p&gt;
&lt;h2&gt;Intro&lt;/h2&gt;
&lt;p&gt;(Later sections written by Chris Corbyn)&lt;/p&gt;
&lt;p&gt;Wondering why we developers don&amp;#8217;t talk much more about search engine design, I asked on Twitter:&lt;/p&gt;
&lt;blockquote class=&quot;twitter-tweet&quot;&gt;&lt;p&gt;App developers with searches: Am I the only one here who thinks search design should be much much more important than it is? I&amp;#8217;m interested!&lt;/p&gt;&lt;p&gt;&amp;mdash; Florian Hanke (@hanke) &lt;a href=&quot;https://twitter.com/hanke/status/175850202840829952&quot; data-datetime=&quot;2012-03-03T07:48:50+00:00&quot;&gt;March 3, 2012&lt;/a&gt;&lt;/blockquote&gt;
&lt;script src=&quot;//platform.twitter.com/widgets.js&quot; charset=&quot;utf-8&quot;&gt;&lt;/script&gt;
&lt;p&gt;Subsequently, a few discussionlets developed with other people also interested in search engine design: &lt;a href=&quot;https://twitter.com/#!/ezkl&quot;&gt;@ezkl&lt;/a&gt;, &lt;a href=&quot;https://twitter.com/#!/_tomash&quot;&gt;@_tomash&lt;/a&gt;, &lt;a href=&quot;https://twitter.com/#!/manfreds&quot;&gt;@manfreds&lt;/a&gt; and last but not least &lt;a href=&quot;https://twitter.com/#!/d11wtq&quot;&gt;@d11wtq&lt;/a&gt;. Thanks all!&lt;/p&gt;
&lt;p&gt;The man behind the curious pseudonym &lt;a href=&quot;https://twitter.com/#!/d11wtq&quot;&gt;@d11wtq&lt;/a&gt; was &lt;a href=&quot;http://chriscorbyn.co.uk/&quot;&gt;Chris Corbyn&lt;/a&gt;, who took the time to respond in full on the design of &lt;a href=&quot;http://flippa.com&quot;&gt;Flippa&lt;/a&gt;, &amp;#8220;The #1 Marketplace for Buying and Selling Websites&amp;#8221;, where the search engine takes center stage.&lt;/p&gt;
&lt;p&gt;With his gracious permission I am reprinting his email in full.&lt;/p&gt;
&lt;p&gt;In Chris&amp;#8217; words:&lt;/p&gt;
&lt;h2&gt;&amp;#8220;The motivation behind putting a focus on search&lt;/h2&gt;
&lt;p&gt;At Flippa, we&amp;#8217;re currently up to our 3rd implementation of search and we consider it &lt;strong&gt;hugely&lt;/strong&gt; important to the success of our business.  We&amp;#8217;re something along the lines of an eBay platform, but built purely for buying and selling websites.  If buyers cannot find what they are looking for, we quickly lose those users, and if we don&amp;#8217;t have buyers on the site, logically we lose the sellers who market to them.  I still think we have a lot of room to improve, but when we look back at our first implementation, we have come a long way.  I guess we have come to learn over time just how important search is, rather than it being something that was apparent to us right from the day we launched Flippa (three years ago).&lt;/p&gt;
&lt;h2&gt;The first implementation&lt;/h2&gt;
&lt;p&gt;When we built Flippa, we knew we needed a search, but the scope of this was simply something that had to exist so that users could find listings by keywords.  It was not something that was well-integrated with the rest of the application.  We used Solr (as was the fashion at the time) and search was just a &amp;#8220;side feature&amp;#8221; that was often forgotten about.  Users could enter a keyword and get a set of results in a listing format entirely different from the layout we use when browsing our listings via the primary navigation.  Users regularly complained, with reason… our search was more or less useless for their needs, which were far more complex than matching on keywords.&lt;/p&gt;
&lt;h2&gt;The second implementation&lt;/h2&gt;
&lt;p&gt;Acknowledging that users needed to be able to search on a range of metrics and that we needed to make some rather substantial changes to our search infrastructure, we sat down to discuss what our end goals were.  Our users are interested more in raw numbers, than in text (e.g. they search for websites based on revenue, on page views, on alexa rank, etc).  Users also wanted the ability to put together a custom search and save it to the database, so that when they returned to the site they could easily repeat a previous search.&lt;/p&gt;
&lt;p&gt;We decided that we effectively needed to build a complete model around our search system, providing all the criteria our users would search on, in such a way that a fully-built set of criteria would be saved into the database for re-use.  We also wanted to integrate our primary navigation with this search system.  I don&amp;#8217;t remember what the driving force was behind using the same system for our primary navigation, but I suspect it was mostly about unifying the UI and the underlying model code, in addition to improving our categorization of listings—sellers could previously specify if their website was &amp;#8220;high end&amp;#8221; or &amp;#8220;turnkey&amp;#8221;, for example, but with this new search system we could determine such things by looking at the numbers, in realtime.&lt;/p&gt;
&lt;p&gt;We dropped Solr in favour of Sphinx, for two reasons:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;Indexing time with MySQL was &lt;strong&gt;considerably&lt;/strong&gt; faster.&lt;/li&gt;
	&lt;li&gt;It provided SphinxSE, which is a plugin for MySQL, allowing the index to be queried through MySQL.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;We built an advanced search library around Sphinx, allowing us to compose searches from a selection of pre-defined criteria, which were all exposed in the UI through our advanced search page.  Because of the MySQL integration, internally searches became a combination of full-text index querying + an &lt;span class=&quot;caps&quot;&gt;INNER&lt;/span&gt; &lt;span class=&quot;caps&quot;&gt;JOIN&lt;/span&gt; to our listings table in MySQL.  A sort of hybrid of MySQL and Sphinx full-text querying.  Searches could be saved to the database, though regrettably, as entire serialized objects.  We actually stored our primary navigation options this way in the database too.  This turned out to be a big mistake when it came to data portability.&lt;/p&gt;
&lt;p&gt;From our users&amp;#8217; perspective, the search capabilities were good, but it was too difficult to use.  We had tried to provide all the options they could ever need, but the end result was that there were too many options, some of which seemed ambiguous and confusing.  Additionally, every time we changed the name of a search field, the serialized objects saved into the database broke, and the migration procedure was much more complicated than it should have been.&lt;/p&gt;
&lt;p&gt;Users also found it difficult to &amp;#8220;narrow down&amp;#8221; their search, since it wasn&amp;#8217;t clear what impact changing an input in the advanced search would have on the size of the result set, without performing that search.&lt;/p&gt;
&lt;p&gt;At the time we built this particular search implementation, the only way to index your data with Sphinx was to rebuild the entire index.  Fortunately this only took about 20 seconds or so in our case (Sphinx is good at doing this stuff efficiently).  Though since we wanted close-to-realtime results (as our data changes practically every second or two), we were re-indexing the entire dataset every minute via cron, which was adding some strain to our database servers.&lt;/p&gt;
&lt;h2&gt;The third (and current) implementation&lt;/h2&gt;
&lt;p&gt;Generally we are happy with what we have now, but we do have some things planned for further improvements.&lt;/p&gt;
&lt;p&gt;When we rewrote search this time around, beyond our desire to improve the underlying code internally, we wanted to make it easier for users to &amp;#8220;visualize&amp;#8221; the data as they browsed.  Since users were searching primarily on factors such as revenue and page views, we set about building a faceted search designed to allow click-by-click drill-down of the results, where the facets always show how many results you&amp;#8217;ll get if you click on them.  The facets would be displayed all the time, no matter what you were searching for.  This presented some challenges, since now instead of executing a single query per search, we had to execute something like 20 queries.&lt;/p&gt;
&lt;p&gt;Like the previous implementation, we used Sphinx—albeit a newer version with support for realtime indexes and multi-queries, which is how the facets are able to execute efficiently.  We also retained the idea of having our primary navigation hooked into our search system.  This had worked well for us in the previous implementation.  We ditched SphinxSE due to the complexity it added to our server infrastructure and the fact we wanted to use multi-queries in Sphinx, which would not work efficiently through MySQL.  While we still use the search system for our primary navigation (which means you&amp;#8217;ll always have facets down the side of the page), we stopped storing these in the database and simply have them formalized in code.  This makes tweaking them simpler, since it&amp;#8217;s a code edit, not a data migration.  We also built a proper schema for saving searches, instead of being lazy and serializing objects to the database (the benefits of which, probably do not require further explanation).&lt;/p&gt;
&lt;p&gt;Since the primary complaint with our previous implementation, from the user experience perspective, was that it was too confusing to use, we spent a considerable amount of time assessing what options we were providing to users via our advanced search page.  It was overly complicated and ambiguous in places.  As a result, we decided to either remove search options entirely, or combine them together, thus greatly simplifying the UI for our users.  I believe at the same time, we added new options, but the end result was still simpler.  Part of this change, however, was designed to draw the focus away from the advanced search and more towards our pre-defined facets, which suit the needs of most casual users browsing the site.&lt;/p&gt;
&lt;p&gt;The feedback we&amp;#8217;ve had regarding our current search has been extremely positive.  Many of our listings are at the low end of the scale, which many buyers are not interested in.  Now buyers are able to quickly filter these out simply by clicking on the facets, directly from our primary navigation options.  This is something we were aiming to achieve… we&amp;#8217;re looking to encourage more quality listings, so making it easier for buyers to reach these listings and hide the 3-day-old WordPress blogs solves this.&lt;/p&gt;
&lt;p&gt;All the code is custom-written in &lt;span class=&quot;caps&quot;&gt;PHP&lt;/span&gt; (parts of our site are written in &lt;span class=&quot;caps&quot;&gt;PHP&lt;/span&gt;, other parts are written in Ruby).  We&amp;#8217;ll likely be porting this code to Ruby at some point, though we need a Sphinx gem that supports the features we&amp;#8217;re using from Sphinx 2, and Pat Allan&amp;#8217;s Riddle gem doesn&amp;#8217;t offer this just yet.  We may end up writing this ourselves.&lt;/p&gt;
&lt;h2&gt;Some things we have built around our search&lt;/h2&gt;
&lt;ul&gt;
	&lt;li&gt;From any of our primary navigation options, you may click &amp;#8220;Advanced&amp;#8221; at the top of the facets, to load the advanced search page with the inputs used to execute the search for that navigation option, either for inspection, or to modify them.&lt;/li&gt;
	&lt;li&gt;We have a &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt; search &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;, available only on request, used by third-parties who analyze our data for use on their own websites.&lt;/li&gt;
	&lt;li&gt;Users can have the results of a search emailed to them on a daily basis.  This simply loads the search from the database and executes it via a background job.&lt;/li&gt;
	&lt;li&gt;Some smaller features, such as watching certain tags and sellers, use the search internally.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Where to next?&lt;/h2&gt;
&lt;p&gt;We have some things on the agenda for future improvements to our search, though nothing quite as major as our previous iterations.  There are some internal optimizations we can certainly make, such as having an effective caching strategy (though cache invalidation is &lt;strong&gt;hard&lt;/strong&gt;).  We also have some changes planned that focus on tailoring the search according to the region of the user, though I can&amp;#8217;t go into details on this.  All in all, we think we&amp;#8217;re getting there!&amp;quot;&lt;/p&gt;
&lt;h2&gt;Thanks / Guest Posts&lt;/h2&gt;
&lt;p&gt;Many thanks to Chris! Please post feedback right here or send to &lt;a href=&quot;http://twitter.com/d11wtq&quot;&gt;Chris&amp;#8217; Twitter&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;If you don&amp;#8217;t have a blog or are interested in writing a guest post, roughly in these areas: Ruby, Framework Design, Search Design or similar, please &lt;a href=&quot;http://twitter.com/hanke&quot;&gt;contact me&lt;/a&gt;.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Normalizing&amp;nbsp;Indexed&amp;nbsp;Data</title>
   <link href="http://florianhanke.com/blog/2012/03/16/normalize-indexed-data.html"/>
   <updated>2012-03-16T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2012/03/16/normalize-indexed-data</id>
   <content type="html">&lt;p&gt;A quick blog post on a Picky tokenizer option.&lt;/p&gt;
&lt;h2&gt;Intro / Problem&lt;/h2&gt;
&lt;p&gt;On mobile devices it can be a bit annoying to enter special symbols, like &lt;code&gt;+&lt;/code&gt;, or &lt;code&gt;&amp;amp;&lt;/code&gt;, and it would be easier to just enter &lt;code&gt;plus&lt;/code&gt;, or &lt;code&gt;and&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Or maybe there are a lot of abbreviations, like &lt;code&gt;abbrev&lt;/code&gt;, or &lt;code&gt;e.g.&lt;/code&gt;, but you&amp;#8217;d still like to find the item when searching for &lt;code&gt;abbreviation&lt;/code&gt;, or &lt;code&gt;example&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Or maybe you&amp;#8217;d like number &lt;code&gt;1&lt;/code&gt; to be findable with &lt;code&gt;one&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;In the search engine domain, this is one part of &lt;a href=&quot;https://en.wikipedia.org/wiki/Text_normalization&quot;&gt;text normalization&lt;/a&gt;, the examples being expanding abbreviations and converting numbers.&lt;/p&gt;
&lt;p&gt;In Picky, this is done using the tokenizer option &lt;code&gt;normalizes_words&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;Tokenizer option &amp;#8220;normalizes_words&amp;#8221;&lt;/h2&gt;
&lt;p&gt;This option makes the tokenizer normalize words before indexing them.&lt;/p&gt;
&lt;p&gt;The usage is very simple. Just pass a 2d array of regexps and replacement terms into the &lt;code&gt;normalizes_words&lt;/code&gt; option, like so:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;index = Picky::Index.new :normalized do
  indexing normalizes_words: [
    [/\+/, 'plus'], # + -&amp;gt; plus
    [/\&amp;amp;/, 'and'], # &amp;amp; -&amp;gt; and
    [/\bw\//, 'with'], # w/ -&amp;gt; with (a plain \w would match any word character)
    [/abbr(ev)?/, 'abbreviation'], # abbr, abbrev -&amp;gt; abbreviation
    [/e\.g\./, 'example given'] # e.g. -&amp;gt; example given (note that the . have to survive)
  ]
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;stopwords&lt;/li&gt;
	&lt;li&gt;case&lt;/li&gt;
	&lt;li&gt;character removal&lt;/li&gt;
	&lt;li&gt;character replacement&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;are specifically handled in options&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;code&gt;stopwords: /\b(word1|word2|...)\b/&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;case_sensitive: true/false&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;remove_characters: /[characters]/&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;substitutes_characters_with: Picky::CharacterSubstituters::WestEuropean.new&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;and should be handled there.&lt;/p&gt;
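&lt;p&gt;If you&amp;#8217;d like to get a feeling for what such regexp and replacement pairs do to a piece of text, here is a small plain Ruby sketch. It is not Picky&amp;#8217;s actual implementation: the &lt;code&gt;normalize_words&lt;/code&gt; helper is hypothetical and simply assumes that every pair is applied in order using &lt;code&gt;gsub&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical helper, not part of Picky: applies each
# [regexp, replacement] pair to the text, one after the other.
def normalize_words(text, rules)
  rules.reduce(text) do |result, (regexp, replacement)|
    result.gsub(regexp, replacement)
  end
end

rules = [
  [/\+/, 'plus'],
  [/e\.g\./, 'example given']
]

normalize_words('rock + roll', rules) # returns 'rock plus roll'
normalize_words('e.g. this', rules)   # returns 'example given this'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is also a handy way to try out your regexps in irb before putting them into the index definition.&lt;/p&gt;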
&lt;h2&gt;Alternatives&lt;/h2&gt;
&lt;p&gt;What if this doesn&amp;#8217;t work for you?&lt;/p&gt;
&lt;p&gt;No problemo! Picky is all Ruby, so feel free to either monkey patch, or probably better: Preprocess the data to your heart&amp;#8217;s content.&lt;/p&gt;
&lt;p&gt;Have fun!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">CocoaPods&amp;nbsp;Search&amp;nbsp;Design</title>
   <link href="http://florianhanke.com/blog/2012/03/01/cocoapods-search-design.html"/>
   <updated>2012-03-01T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2012/03/01/cocoapods-search-design</id>
   <content type="html">&lt;p&gt;You probably have heard of &lt;a href=&quot;http://cocoapods.org&quot;&gt;CocoaPods&lt;/a&gt;, an Objective-C library dependency manager. The project was initiated by &lt;a href=&quot;http://twitter.com/alloy&quot;&gt;Eloy Durán&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Let me tell you it&amp;#8217;s good stuff!&lt;/p&gt;
&lt;h2&gt;Intro&lt;/h2&gt;
&lt;p&gt;This post is about designing a search engine for CocoaPods. I&amp;#8217;m using &lt;a href=&quot;http://florianhanke.com/picky&quot;&gt;Picky&lt;/a&gt; for it, with moderate modifications.&lt;/p&gt;
&lt;p&gt;Chances are you know &lt;a href=&quot;http://rubygems.org&quot;&gt;RubyGems&lt;/a&gt;. CocoaPods uses a slightly different approach, one I personally find very elegant: After creating a &lt;a href=&quot;http://cocoapods.org/#get_started&quot;&gt;podspec&lt;/a&gt; (similar to a gemspec), you ask for it to be included in the &lt;a href=&quot;http://github.com/CocoaPods/Specs&quot;&gt;central repository&lt;/a&gt; via a pull request. If it is accepted, from then on you get commit rights to push other pods.&lt;/p&gt;
&lt;p&gt;Since I think the &lt;a href=&quot;http://rubygems.org/search&quot;&gt;rubygems search&lt;/a&gt; is too slow, and not very impressive, I tried to make the &lt;a href=&quot;http://cocoapods.org&quot;&gt;CocoaPods search&lt;/a&gt; an example of how such a search should be designed. Try it! :)&lt;/p&gt;
&lt;p&gt;(Note: I&amp;#8217;m not just criticizing, but also &lt;em&gt;putting code where my mouth is&lt;/em&gt; regarding the rubygems search – try my &lt;a href=&quot;http://gemsearch.heroku.com/&quot;&gt;alternative take&lt;/a&gt; on it and &lt;a href=&quot;./../../../2011/02/13/a-better-rubygems-search.html&quot;&gt;read about it here&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;Many ideas for the CocoaPods search come from the old gem search alternative, but a few features are new, compiled in the…&lt;/p&gt;
&lt;h2&gt;Highlights&lt;/h2&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;a href=&quot;#hooks&quot;&gt;Automagic index updates via Github post receive hooks&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;#composites&quot;&gt;Making composite names (e.g. BlocksKit) searchable&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;#callbacks&quot;&gt;Advanced: Invisible filtering by OS&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;#duplicates&quot;&gt;Advanced: Removing duplicates from results&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;#fun&quot;&gt;Fun things to try!&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;hooks&quot;&gt;Automagic index updates via Github post receive hooks&lt;/h2&gt;
&lt;p&gt;The challenge was to have Picky automatically update the search index without restarting, and without polling.&lt;/p&gt;
&lt;p&gt;The fact that the CocoaPods specs live in their own repository is fantastic – it means that we have the full power of Github&amp;#8217;s repo features at our disposal.&lt;/p&gt;
&lt;p&gt;The feature we use is &lt;a href=&quot;http://help.github.com/post-receive-hooks/&quot;&gt;post receive hooks&lt;/a&gt;. Every time someone pushes a new spec, or updates a spec, the search engine sinatra app is notified via a garbled &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt;, as follows:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;post &quot;/my_example_hook_url/#{ENV['GARBLED_HOOK_PATH']}&quot; do
  # index updating code here
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Every time this &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt; is called, Picky downloads the zip file from github, unzips it, and indexes the loaded specs. All while running. That&amp;#8217;s it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span class=&quot;caps&quot;&gt;HOLD&lt;/span&gt; ON!&lt;/strong&gt;, you say, why don&amp;#8217;t you just do a &lt;code&gt;git pull&lt;/code&gt;? I wish I could. But currently, Heroku doesn&amp;#8217;t allow &lt;code&gt;git pull&lt;/code&gt;,
or &lt;code&gt;tar&lt;/code&gt;,
or &lt;code&gt;gunzip&lt;/code&gt;. So currently, the search engine always downloads the zip file.&lt;/p&gt;
&lt;h2 id=&quot;composites&quot;&gt;Making composite names searchable&lt;/h2&gt;
&lt;p&gt;Pod names do not use spaces but are camelcased, e.g. &amp;#8220;BlocksKit&amp;#8221;. Like most search engines, Picky would index this as one word.&lt;/p&gt;
&lt;p&gt;Another issue with pod names is that authors sometimes prepend their initials to it. So, for example, &amp;#8220;Mocky&amp;#8221; would actually be called &amp;#8220;LRMocky&amp;#8221;.&lt;/p&gt;
&lt;p&gt;However, getting back to the &amp;#8220;BlocksKit&amp;#8221; example, we want people to be able to find it when they type &lt;a href=&quot;http://cocoapods.org/?q=blocks%20kit&quot;&gt;blocks kit&lt;/a&gt;, or just &lt;a href=&quot;http://cocoapods.org/?q=kit&quot;&gt;kit&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In Picky lingo: If the data contains &lt;code&gt;&quot;BlocksKit&quot;&lt;/code&gt;, how do we index it as &lt;code&gt;&quot;BlocksKit Blocks Kit&quot;&lt;/code&gt;?&lt;/p&gt;
&lt;p&gt;Turns out there is a snazzy Ruby regexp for that:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;&quot;BlocksKit&quot;.split /([A-Z]?[a-z]+)/ # =&amp;gt; [&quot;&quot;, &quot;Blocks&quot;, &quot;&quot;, &quot;Kit&quot;]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Nice, eh? As a bonus, it works fine with numbers :)&lt;/p&gt;
&lt;p&gt;The Pod model offers a &lt;code&gt;prepared_name&lt;/code&gt; method, using the above &lt;code&gt;split&lt;/code&gt;, returning &lt;code&gt;&quot;BlocksKit Blocks Kit&quot;&lt;/code&gt;, which Picky uses for the &lt;code&gt;name&lt;/code&gt; category and consequently indexes all three words.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;category :name,
         similarity: Similarity::DoubleMetaphone.new(2),
         partial: Partial::Substring.new(from: 1),
         qualifiers: [:name, :pod],
         :from =&amp;gt; :prepared_name # &amp;lt;= :from indicates which (data) method to call in the source object&lt;/code&gt;&lt;/pre&gt;
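&lt;p&gt;For illustration, a &lt;code&gt;prepared_name&lt;/code&gt; method along these lines could look roughly like the following sketch. The &lt;code&gt;Pod&lt;/code&gt; class shown here is a stripped-down stand-in, not the actual model from the search app, and dropping the empty strings and duplicates is my assumption on how to tidy up the &lt;code&gt;split&lt;/code&gt; result.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical, stripped-down Pod model: only the name handling is shown.
class Pod
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # Returns the original name followed by its camelcase parts,
  # so that each part is indexed as a word of its own.
  def prepared_name
    parts = name.split(/([A-Z]?[a-z]+)/).reject { |part| part.empty? }
    ([name] + parts).uniq.join(' ')
  end
end

Pod.new('BlocksKit').prepared_name # returns 'BlocksKit Blocks Kit'
Pod.new('LRMocky').prepared_name   # returns 'LRMocky LR Mocky'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note how the prepended initials end up as a separate word, so &lt;code&gt;mocky&lt;/code&gt; on its own is enough to find the pod.&lt;/p&gt;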
&lt;p&gt;Try it with &lt;a href=&quot;http://cocoapods.org/?q=dynamic%20delegate&quot;&gt;dynamic delegate&lt;/a&gt;! :)&lt;/p&gt;
&lt;h2 id=&quot;callbacks&quot;&gt;Filtering by OS&lt;/h2&gt;
&lt;p&gt;This is a more advanced Picky trick, which might only be interesting to pros.&lt;/p&gt;
&lt;p&gt;Like Ruby gems, pods can run on multiple OSs: On iOS and/or on OS X.&lt;/p&gt;
&lt;p&gt;We always want to filter by either both (&lt;span class=&quot;caps&quot;&gt;AND&lt;/span&gt;), iOS, or OS X. This means we always prepend the platform filter to the query like so: &lt;code&gt;&quot;on:some_platform rest of the query&quot;&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;This is problematic since it uses a lot of input field space, and also confuses the user.&lt;/p&gt;
&lt;p&gt;We would like to not show the OS in the search field, but use the value from the iOS style radio buttons.&lt;/p&gt;
&lt;p&gt;Picky helps us by offering multiple JS callbacks. If you copy a search link like &lt;a href=&quot;http://cocoapods.org/?q=on:osx%20Kiwi&quot;&gt;http://cocoapods.org/?q=on:osx%20Kiwi&lt;/a&gt; into the &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt; bar, Picky
runs a few JS callbacks, in the following order:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;&lt;code&gt;beforeInsert(query) // Before inserting the query into the search field.&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;before(query, params) // Before sending the query back to the server.&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;after(data, query) // After receiving the results back, before rendering.&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;success(data, query) // After the view/results have been updated.&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;(&lt;code&gt;data&lt;/code&gt; is the JS PickyData object)&lt;/p&gt;
&lt;p&gt;We need both &lt;code&gt;beforeInsert&lt;/code&gt; and &lt;code&gt;before&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;In &lt;code&gt;beforeInsert&lt;/code&gt;, we remove the &lt;code&gt;os&lt;/code&gt; part, before it is inserted into the search field. In &lt;code&gt;before&lt;/code&gt;, before sending it to the backend, we add the OS back into the query, taken from the radio button value.&lt;/p&gt;
&lt;p&gt;In code (the Picky JS search client options), it looks like this:&lt;/p&gt;
&lt;pre class=&quot;sh_javascript&quot;&gt;&lt;code&gt;// Before a query is inserted into the search field
// we clean it of any platform terms.
//
beforeInsert: function(query) {
  return query.replace(platformRemoverRegexp, '');
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The regexp to remove the platform search term looks like this:&lt;/p&gt;
&lt;pre class=&quot;sh_javascript&quot;&gt;&lt;code&gt;var platformRemoverRegexp = /((platform|on)\:\w+\s?)+/;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And before sending the search request to the backend, Picky calls the &lt;code&gt;before&lt;/code&gt; callback where we remove any OS parts, prepending the selected one (the iOS style radio buttons have the values &lt;code&gt;on:ios on:osx&lt;/code&gt;, &lt;code&gt;on:ios&lt;/code&gt;, and &lt;code&gt;on:osx&lt;/code&gt;).&lt;/p&gt;
&lt;pre class=&quot;sh_javascript&quot;&gt;&lt;code&gt;before: function(query, params) {
  query = query.replace(platformRemoverRegexp, ''); // Clean the query.
  var platformModifier = platformSelect.find(&quot;input:checked&quot;).val(); // Get the selected OS.
  return platformModifier + ' ' + query; // Prepend it to the query.
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;However, the complete query, including the OS is still inserted into the &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt;, ready for you to copy and send to friends.&lt;/p&gt;
&lt;p&gt;5 lines of nicely customizable code :)&lt;/p&gt;
&lt;h2 id=&quot;duplicates&quot;&gt;Removing duplicates from results&lt;/h2&gt;
&lt;p&gt;This is another more advanced Picky trick, which might only be interesting to pros.&lt;/p&gt;
&lt;p&gt;I often get asked how to remove duplicates from search results.&lt;/p&gt;
&lt;p&gt;Why are there duplicates in Picky&amp;#8217;s search results anyway?&lt;/p&gt;
&lt;p&gt;Picky returns categorized search results. For example, it might deem the combination of categories &lt;code&gt;&quot;first_name&quot;, &quot;last_name&quot;&lt;/code&gt; more important and list its results before all results found in the categories &lt;code&gt;&quot;street&quot;, &quot;last_name&quot;&lt;/code&gt;. But this also means that the same entry can be contained in both combinations of categories!&lt;/p&gt;
&lt;p&gt;Many Picky users just use &lt;code&gt;results.ids&lt;/code&gt; to extract a list of ids. To get the list of ids, Picky goes through the results in each combination of categories and extracts the ids. This means that Picky may well return &lt;code&gt;[1,3,1,2,3]&lt;/code&gt;,
with results &lt;code&gt;1&lt;/code&gt;
and &lt;code&gt;3&lt;/code&gt; occurring twice.&lt;/p&gt;
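&lt;p&gt;To make the duplication concrete, here is a minimal sketch in plain Ruby. The nested arrays are a hypothetical stand-in for the ids found in two allocations, not actual Picky objects:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Ids per combination of categories (made-up example data).
ids_per_allocation = [
  [1, 3],    # e.g. found in first_name, last_name
  [1, 2, 3]  # e.g. found in street, last_name
]

# Collecting the ids allocation by allocation keeps the repeats.
ids = ids_per_allocation.flatten

# Array#uniq keeps the first occurrence of each id, so the order survives.
unique_ids = ids.uniq&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here &lt;code&gt;ids&lt;/code&gt; is &lt;code&gt;[1, 3, 1, 2, 3]&lt;/code&gt; and &lt;code&gt;unique_ids&lt;/code&gt; is &lt;code&gt;[1, 3, 2]&lt;/code&gt;.&lt;/p&gt;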
&lt;p&gt;Since cocoapods.org only wants to show an uncategorized list of result pods, we wish to remove duplicates to not confuse searchers.&lt;/p&gt;
&lt;p&gt;We achieve this by using Picky&amp;#8217;s JS &lt;code&gt;success&lt;/code&gt; callback. This goes through all combinations of categories (aka &lt;em&gt;allocations&lt;/em&gt;) and removes entries from the allocations if we&amp;#8217;ve already seen them previously. It ensures we only see unique results.&lt;/p&gt;
&lt;pre class=&quot;sh_javascript&quot;&gt;&lt;code&gt;// We filter duplicate ids here.
// (Not in the server as it might be
// used for APIs etc.)
//
success: function(data, query) {
  var seen = {};
  
  var allocations = data.allocations;
  allocations.each(function(i, allocation) {
    var ids     = allocation.ids;
    var entries = allocation.entries;
    var remove = [];
    
    ids.each(function(j, id) {
      if (seen[id]) {
        data.total -= 1;
        remove.push(j);
      } else {
        seen[id] = true;
      }
    });
    
    for(var l = remove.length-1; 0 &amp;lt;= l; l--) {
      entries.splice(remove[l], 1);
    }
    
    allocation.entries = entries;
  });
  
  return data;
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We could well do this in the server, but I opted against it, because a possible future search &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; might want to expose the duplicate results. This is why we do it in the client.&lt;/p&gt;
&lt;h2 id=&quot;fun&quot;&gt;Other fun things to try!&lt;/h2&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;a href=&quot;http://cocoapods.org&quot;&gt;Search&lt;/a&gt; for anything and then click on a pod author name in the results.&lt;/li&gt;
	&lt;li&gt;Enter &lt;a href=&quot;http://cocoapods.org/?q=luke%201.0&quot;&gt;Luke 1.0&lt;/a&gt; to get all pods written by a luke with version 1.0*.&lt;/li&gt;
	&lt;li&gt;Enter e.g. &lt;a href=&quot;http://cocoapods.org/?q=stacked&quot;&gt;stacked&lt;/a&gt; and press each platform button to see what happens to the results.&lt;/li&gt;
	&lt;li&gt;Enter e.g. &lt;a href=&quot;http://cocoapods.org/?q=uses:json&quot;&gt;uses:json&lt;/a&gt; to see all pods which use a pod with &amp;#8220;json&amp;#8221; in their name.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Feedback&lt;/h2&gt;
&lt;p&gt;We&amp;#8217;re very glad for feedback – shoot us a line at &lt;a href=&quot;http://twitter.com/CocoaPodsOrg&quot;&gt;http://twitter.com/CocoaPodsOrg&lt;/a&gt;, or at &lt;a href=&quot;http://twitter.com/picky_rb&quot;&gt;http://twitter.com/picky_rb&lt;/a&gt;. Thanks!&lt;/p&gt;
&lt;p&gt;Thanks also to the CocoaPods team for a great project!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;nbsp;Active&amp;nbsp;Record&amp;nbsp;3</title>
   <link href="http://florianhanke.com/blog/2012/01/14/picky-active-record-3.html"/>
   <updated>2012-01-14T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2012/01/14/picky-active-record-3</id>
   <content type="html">&lt;p&gt;This post talks about integrating Picky directly into Rails/ActiveRecord.&lt;/p&gt;
&lt;p&gt;(By the way, greetings from &lt;a href=&quot;http://www.cmswire.com/events/item/rails-camp-x-adelaide-2012-013342.php&quot;&gt;Rails Camp X Adelaide&lt;/a&gt; – come up and say hi if you are here!)&lt;/p&gt;
&lt;p&gt;The last post illustrated a way of writing an active record integration. Still missing is index persistence.&lt;/p&gt;
&lt;p&gt;However, in this post I&amp;#8217;d like to talk about wrapping the last solution up into a nicer bundle.&lt;/p&gt;
&lt;h2&gt;Beautifying the last solution&lt;/h2&gt;
&lt;p&gt;Why? It contains a few advanced Ruby concepts and statements. While I think everybody should know about &lt;code&gt;class &amp;lt;&amp;lt; self&lt;/code&gt; and &lt;code&gt;define_method&lt;/code&gt;, it can get kind of hard to read compared to a more declarative style that Tire (Elastic Search), Thinking Sphinx (Sphinx), or Sunspot (Solr) offer.&lt;/p&gt;
&lt;p&gt;However: While I like the declarative style in many cases, some libraries hide away too much important code. Many times even code that is hugely important, or does things to your model which you only find out about after reading the library source. After a crash. In production.&lt;/p&gt;
&lt;h2&gt;Goals&lt;/h2&gt;
&lt;p&gt;So what I&amp;#8217;d like is&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;have the important bits be visible and manipulable.&lt;/li&gt;
	&lt;li&gt;hide away boilerplate code that makes code harder to read.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And maybe most important:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;use the standard Picky &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A quick reminder of what the basic Picky &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; looks like:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;data = Picky::Index.new :name do
  category :name
end
things = Picky::Search.new data
things.search 'something'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Most other search engine adapters try to elegantify the original &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;. This is nice.&lt;/p&gt;
&lt;p&gt;However, having control over both APIs, I believe that using the original (standard) Picky &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; creates a pressure on it to stay as elegant as possible and as useable as possible.&lt;/p&gt;
&lt;p&gt;If we hide away the Picky &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;, pressure is only exerted on the ActiveRecord/Picky adapter. This also means that people who only use the Picky ActiveRecord &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; only come into contact with that one.&lt;/p&gt;
&lt;p&gt;Why is this a problem?
This is a problem when people want to &lt;em&gt;transcend&lt;/em&gt; the AR &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; to use for example the separate and specific Picky server. If the APIs look and feel fundamentally different, users will not willingly make this jump.
In fact, many people then start looking for search engine alternatives. This is a bad thing. Let me put this in bold, because it gets violated so many times:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The jump from the simple &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; to the harder &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; should not be noticeable.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The only way to do this is use a subset of the original &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; for the simpler one. However, since Picky is about giving &lt;strong&gt;you&lt;/strong&gt; the power, we will not constrict you, but instead make the whole &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; accessible.&lt;/p&gt;
&lt;h2&gt;A first draft&lt;/h2&gt;
&lt;p&gt;I am not the biggest fan of the following pattern:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Model &amp;lt; ActiveRecord::Base
  include Picky::ActiveRecord
  
  some_method_call_from_the included, module
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I am not sure why since it&amp;#8217;s perfectly ok Ruby. I believe it is because it usually consists of two lines, and only one really describes what is going on: &amp;#8220;I am using this&amp;#8221; and &amp;#8220;I am using it like this&amp;#8221;.&lt;/p&gt;
&lt;p&gt;With this subgoal in mind, I started drafting the &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;. It turned out like this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Model &amp;lt; ActiveRecord::Base
  extend(Picky::ActiveRecord.new(:models) do
    Picky::Index.new :models do
      category :name
      category :surname
    end
  end)
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Don&amp;#8217;t judge me. It gets better.&lt;/p&gt;
&lt;p&gt;Why do I use so many round parentheses, having declared them unnecessary not so long ago?&lt;/p&gt;
&lt;p&gt;Turns out, &lt;code&gt;extend&lt;/code&gt; gobbles up my block. Try running the following code:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;module A; end
class B
  extend A do
    # ...
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I am unsure what happens here. Looking at the &lt;a href=&quot;http://ruby-doc.org/core-1.9.3/Object.html#method-i-extend&quot;&gt;CRuby code&lt;/a&gt; didn&amp;#8217;t help. Ideas?&lt;/p&gt;
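&lt;p&gt;One likely explanation, sketched in plain Ruby: without parentheses, a &lt;code&gt;do ... end&lt;/code&gt; block attaches to the outermost method call on the line. So the block is handed to &lt;code&gt;extend&lt;/code&gt;, which silently ignores it. &lt;code&gt;Builder&lt;/code&gt; below is a hypothetical stand-in for &lt;code&gt;Picky::ActiveRecord&lt;/code&gt;; all it does is remember whether a block reached it.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical stand-in: builds a module that reports
# whether build was called with a block.
class Builder
  def self.build(name)
    received = block_given?
    Module.new do
      define_method(:block_received?) { received }
    end
  end
end

class WithoutParens
  extend Builder.build(:models) do
    # This block goes to extend, which ignores it.
  end
end

class WithParens
  extend(Builder.build(:models) do
    # This block goes to Builder.build.
  end)
end

without_parens = WithoutParens.block_received?
with_parens    = WithParens.block_received?&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here &lt;code&gt;without_parens&lt;/code&gt; is &lt;code&gt;false&lt;/code&gt; and &lt;code&gt;with_parens&lt;/code&gt; is &lt;code&gt;true&lt;/code&gt;, which would explain why the round parentheses are needed.&lt;/p&gt;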
&lt;p&gt;I guess we can all agree that this &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; is neither good looking nor elegant. Let&amp;#8217;s try again.&lt;/p&gt;
&lt;h2&gt;A better draft&lt;/h2&gt;
&lt;p&gt;So, teeth grinding, we return to the standard solution of having a separate include and declarations. However, I&amp;#8217;d like to be able to use the Picky &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;This is what I&amp;#8217;ve come up with:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Model &amp;lt; ActiveRecord::Base
  include Picky::ActiveRecord
  
  index = Picky::Index.new :models do
    category :name
    category :surname
  end
  
  search = Picky::Search.new index
  
  updates_picky index
  searches_picky search
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let&amp;#8217;s look at the design in detail.&lt;/p&gt;
&lt;h3&gt;In detail&lt;/h3&gt;
&lt;p&gt;First of all, note that no saving of indexes in instance variables is done. You can do it, should you need it, but Picky is not saving anything like &lt;code&gt;@__picky_index&lt;/code&gt; for you. Instead, the index and the search are each passed into a method in which they are captured in a closure.&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s look at the &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; code.&lt;/p&gt;
&lt;p&gt;The line&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;include Picky::ActiveRecord&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;includes two other modules, &lt;code&gt;Picky::ActiveRecord::Indexing&lt;/code&gt; and &lt;code&gt;Picky::ActiveRecord::Searching&lt;/code&gt;, that are concerned with indexing and searching, respectively. It is well imaginable that one doesn&amp;#8217;t want realtime indexing, just searching, or vice versa.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;index = Picky::Index.new :models do
  category :name
  category :surname
end

search = Picky::Search.new index&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is the standard Picky &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;. You create an index (definition) and pass it into the search.&lt;/p&gt;
&lt;p&gt;The line&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;updates_picky index&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;tells this class to automatically update the given index whenever the &lt;code&gt;after_commit&lt;/code&gt; callback fires.&lt;/p&gt;
&lt;p&gt;This method can also be called as follows:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;updates_picky :models&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;or&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;updates_picky&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The first one uses the index called &lt;code&gt;:models&lt;/code&gt; and the second one uses &lt;code&gt;model_class.name.tableize&lt;/code&gt; to derive the index name.&lt;/p&gt;
&lt;p&gt;Finally, the line&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;searches_picky search&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;installs a &lt;code&gt;Model.search&lt;/code&gt; method using the given &lt;code&gt;search&lt;/code&gt;.&lt;/p&gt;
&lt;h3&gt;Also of note&lt;/h3&gt;
&lt;p&gt;This &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; does not really care where anything is set up. This is well possible:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Model &amp;lt; ActiveRecord::Base
  include Picky::ActiveRecord
end

# In e.g. initializers/picky.rb
#
index = Picky::Index.new :models do
  category :name
  category :surname
end
  
search = Picky::Search.new index
  
Model.updates_picky index
Model.searches_picky search
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;for the case where you&amp;#8217;d like your search code outside the model.&lt;/p&gt;
&lt;p&gt;Also, you can call &lt;code&gt;updates_picky&lt;/code&gt; multiple times:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Model.updates_picky index
Model.updates_picky index2
Model.updates_picky index3&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Any updates to the model will update each index.&lt;/p&gt;
&lt;h2&gt;Implementation&lt;/h2&gt;
&lt;p&gt;If you&amp;#8217;re interested in the implementation, see &lt;a href=&quot;https://github.com/floere/picky/blob/de7784bb15768fba38870601b2ecf59a64009ec7/server/prototypes/integrated_active_record/active_record.rb&quot;&gt;the Picky::ActiveRecord module&lt;/a&gt; (code at the time of this writing).&lt;/p&gt;
&lt;h2&gt;Finally, you&lt;/h2&gt;
&lt;p&gt;Hope you like the &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; design series. The &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; is certainly turning out to be simple. Too simple? Who knows.&lt;/p&gt;
&lt;p&gt;Opinions, ideas?&lt;/p&gt;
&lt;p&gt;We still haven&amp;#8217;t looked at index persistence. We save this for another blog post.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;nbsp;Active&amp;nbsp;Record&amp;nbsp;2</title>
   <link href="http://florianhanke.com/blog/2012/01/13/picky-active-record-2.html"/>
   <updated>2012-01-13T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2012/01/13/picky-active-record-2</id>
   <content type="html">&lt;p&gt;This post talks about integrating Picky directly into Rails/ActiveRecord.&lt;/p&gt;
&lt;p&gt;(By the way, greetings from &lt;a href=&quot;http://www.cmswire.com/events/item/rails-camp-x-adelaide-2012-013342.php&quot;&gt;Rails Camp X Adelaide&lt;/a&gt; – come up and say hi if you are here!)&lt;/p&gt;
&lt;p&gt;In the last post we talked about a light active record integration. This has been &lt;a href=&quot;https://github.com/floere/picky/tree/master/server/prototypes/active_record&quot;&gt;implemented in the prototype&lt;/a&gt; and released in Picky 4.0.9.&lt;/p&gt;
&lt;p&gt;By light integration we mean:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;You have a separate Picky server.&lt;/li&gt;
	&lt;li&gt;The Picky server is not configured via the ActiveRecord model.&lt;/li&gt;
	&lt;li&gt;The ActiveRecord data is simply sent to the Picky server as-is for indexing after each commit.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;A quick example of 4.0.9 ActiveRecord integration&lt;/h3&gt;
&lt;p&gt;First, configure a Sinatra Picky server to be open for external indexing.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class YourSearch &amp;lt; Sinatra::Base
  extend Sinatra::IndexActions

  # Configure indexes etc. as usual
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then, configure your AR model:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Model &amp;lt; ActiveRecord::Base
  # These are the default options.
  #
  extend Picky::Client::ActiveRecord.configure(host: 'localhost', port: 8080, path: '/')

  # The model definition as usual.
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And that&amp;#8217;s it already :)&lt;/p&gt;
&lt;h2&gt;Direct integration&lt;/h2&gt;
&lt;p&gt;While the above is very nice, you still need a separate server.&lt;/p&gt;
&lt;p&gt;Usually I advocate keeping search separate from the app, because normally, search and app have different goals. For example, caching for either needs to work differently. Search maybe needs to be restarted independently etc.&lt;/p&gt;
&lt;p&gt;But sometimes, you simply want a quick and simple search to directly run in the one server you have.&lt;/p&gt;
&lt;p&gt;So instead of setting up a separate server, we would integrate Picky directly in the model.&lt;/p&gt;
&lt;p&gt;How would we do this?&lt;/p&gt;
&lt;h2&gt;A first simple implementation&lt;/h2&gt;
&lt;p&gt;At this point I am incredibly glad to have designed Picky to work and run anywhere.&lt;/p&gt;
&lt;p&gt;Since you already can stick it anywhere (a Sinatra server, a DRb server, a simple script, a &lt;span class=&quot;caps&quot;&gt;PORO&lt;/span&gt;, &amp;#8230;), you can relatively easily stick it into an active record model.&lt;/p&gt;
&lt;p&gt;How, you ask? Let me show you &lt;a href=&quot;https://github.com/floere/picky/tree/master/server/prototypes/inside_active_record&quot;&gt;the whole thing&lt;/a&gt; and then pick it apart.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Model &amp;lt; ActiveRecord::Base
  
  class &amp;lt;&amp;lt; self
    data = Picky::Index.new :models do
      category :name
      category :surname
    end
  
    define_method :replace do |model|
      data.replace model
    end
    
    define_method :remove do |id|
      data.remove id
    end
  
    models = Picky::Search.new data
    
    define_method :search do |*args|
      models.search *args
    end
  end
  
  after_commit do
    if destroyed?
      self.class.remove self.id
    else
      self.class.replace self
    end
  end
  
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Got that? If not, here&amp;#8217;s a step by step explanation:&lt;/p&gt;
&lt;p&gt;We want the index and the search object to reside in the (singleton) class to define methods there, so we open it:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class &amp;lt;&amp;lt; self&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then we define a Picky index (two searchable categories, &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;surname&lt;/code&gt;) and two methods. One to &lt;code&gt;replace&lt;/code&gt; (&amp;#8220;insert or update&amp;#8221;) indexed models and one to &lt;code&gt;remove&lt;/code&gt; indexed models with a given id:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;data = Picky::Index.new :models do
  category :name
  category :surname
end

define_method :replace do |model|
  data.replace model
end

define_method :remove do |id|
  data.remove id
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Why am I using &lt;code&gt;define_method&lt;/code&gt; instead of &lt;code&gt;def&lt;/code&gt;? I want to capture the &lt;code&gt;data&lt;/code&gt; (index) and the &lt;code&gt;models&lt;/code&gt; (search) in the block for these methods to use them later on.&lt;/p&gt;
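&lt;p&gt;A minimal sketch of the difference, in plain Ruby without Picky. &lt;code&gt;Registry&lt;/code&gt; and its array are hypothetical stand-ins for the model and the index, and &lt;code&gt;define_singleton_method&lt;/code&gt; is used here as a shorthand for opening the singleton class and calling &lt;code&gt;define_method&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Registry
  # A local variable, standing in for the Picky index.
  data = []

  # A block is a closure, so data stays reachable inside this method.
  define_singleton_method :replace do |item|
    data.delete item
    data.push item
    data.size
  end

  # def opens a fresh scope and cannot see the local variable data.
  def self.size_via_def
    data.size
  end
end

Registry.replace :first
closure_size = Registry.replace :second

def_failed = begin
  Registry.size_via_def
  false
rescue NameError
  true
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here &lt;code&gt;closure_size&lt;/code&gt; is &lt;code&gt;2&lt;/code&gt;, while the &lt;code&gt;def&lt;/code&gt; variant raises a &lt;code&gt;NameError&lt;/code&gt;, so &lt;code&gt;def_failed&lt;/code&gt; is &lt;code&gt;true&lt;/code&gt;.&lt;/p&gt;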
&lt;p&gt;These two methods, since they are defined on the class&amp;#8217; singleton class, are used like this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Model.replace model&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Model.remove model_id&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;These are all the methods that have to do with curating the index.&lt;/p&gt;
&lt;p&gt;Finally, we want the class to update the index as soon as it changes. We use the AR 3.0+ &lt;code&gt;after_commit&lt;/code&gt; callback for that:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;after_commit do
  if destroyed?
    self.class.remove self.id
  else
    self.class.replace self
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So if the object has been destroyed, we remove it from the index (using the &amp;#8220;class methods&amp;#8221; we defined earlier). If it hasn&amp;#8217;t, we simply replace the data.&lt;/p&gt;
&lt;p&gt;Interesting to note: On a &lt;code&gt;replace&lt;/code&gt;, Picky simply calls the methods named after the categories: &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;surname&lt;/code&gt;. So not only can Picky index Active Record attributes, but the result of any method the model has.&lt;/p&gt;
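&lt;p&gt;The idea can be sketched in plain Ruby as follows. This is not how Picky is implemented internally; &lt;code&gt;Person&lt;/code&gt;, &lt;code&gt;full_name&lt;/code&gt; and &lt;code&gt;values_for&lt;/code&gt; are made up for illustration. The point is only that a category name is treated as a method name:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical model; full_name is a plain method, not a stored attribute.
Person = Struct.new(:id, :first, :last) do
  def full_name
    [first, last].join(' ')
  end
end

# Made-up helper: ask the object for each category by calling
# the method of the same name.
def values_for(object, categories)
  categories.map { |category| [category, object.public_send(category)] }
end

person = Person.new(7, 'Florian', 'Hanke')
values = values_for(person, [:first, :full_name])&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here &lt;code&gt;values&lt;/code&gt; is &lt;code&gt;[[:first, 'Florian'], [:full_name, 'Florian Hanke']]&lt;/code&gt;: the computed &lt;code&gt;full_name&lt;/code&gt; is picked up just like the stored field.&lt;/p&gt;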
&lt;h2&gt;First conclusion&lt;/h2&gt;
&lt;p&gt;You can already do this in the current Picky version 4.0.9.&lt;/p&gt;
&lt;p&gt;However, this has a few disadvantages:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;The indexes aren&amp;#8217;t yet saved. (Hint: &lt;code&gt;Picky::Indexes.dump&lt;/code&gt;)&lt;/li&gt;
	&lt;li&gt;Even if they were saved, they would not yet be reloaded.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;How do we do this? The dumping is relatively easy, but how do we get the data back into that index when restarting and loading the index?
If you&amp;#8217;re into trying to implement that, have a go. If not, stay tuned! :)&lt;/p&gt;
&lt;p&gt;Another question for you: Is sticking the method on the &lt;code&gt;Model&lt;/code&gt; like&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Model.replace model&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;actually a good idea? What if, say Thinking Sphinx, reloads your models? Is your model – being an AR model – not already doing enough? What about the single responsibility principle?&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s already night here at Rails Camp X Adelaide, so good night. And good luck. Stay tuned!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;nbsp;Active&amp;nbsp;Record</title>
   <link href="http://florianhanke.com/blog/2012/01/04/picky-active-record.html"/>
   <updated>2012-01-04T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2012/01/04/picky-active-record</id>
   <content type="html">&lt;p&gt;This post is about the challenges of designing an Active Record interface for Picky.&lt;/p&gt;
&lt;p&gt;When we last looked at writing a nice ActiveRecord integration, around version 2.0, and then 3.0, Picky the server wasn&amp;#8217;t ready yet.&lt;/p&gt;
&lt;p&gt;What was missing?&lt;/p&gt;
&lt;p&gt;Most importantly, an interface to save updates as they come in (in Picky: &lt;code&gt;Index#add&lt;/code&gt;, &lt;code&gt;Index#remove&lt;/code&gt;, &lt;code&gt;Index#replace&lt;/code&gt;). Secondly, the possibility to dump indexes during runtime (&lt;code&gt;Index#dump&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;How would we go about designing an Active Record interface for Picky? How do others do it?&lt;/p&gt;
&lt;h2&gt;Others&lt;/h2&gt;
&lt;p&gt;Some search servers (like &lt;a href=&quot;http://sphinxsearch.com/&quot;&gt;Sphinx&lt;/a&gt;) do not really offer an interface for live updates, but instead go the route of cleverly reindexing from a central data repository.&lt;/p&gt;
&lt;p&gt;Other search servers offer &lt;span class=&quot;caps&quot;&gt;HTTP&lt;/span&gt; interfaces (for example &lt;a href=&quot;http://elasticsearch.org&quot;&gt;elasticsearch&lt;/a&gt; with its &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt; &lt;span class=&quot;caps&quot;&gt;POST&lt;/span&gt;/&lt;span class=&quot;caps&quot;&gt;PUT&lt;/span&gt; etc. interface).&lt;/p&gt;
&lt;p&gt;Since it is a nice and flexible standard interface, it enables interested coders to write software for it, for example &lt;a href=&quot;http://github.com/karmi/tire&quot;&gt;Tire&lt;/a&gt;. This is a great way of attracting effort.&lt;/p&gt;
&lt;p&gt;Another idea would be to open a port the engine listens on, pipes, or any form of communication imaginable.&lt;/p&gt;
&lt;p&gt;In any case, Picky needs a standard interface.&lt;/p&gt;
&lt;h2&gt;The rough idea&lt;/h2&gt;
&lt;p&gt;Our rough idea is to listen for updates in the server and create a gem for use with active record (and others), which talks to the server every time some data is updated.&lt;/p&gt;
&lt;p&gt;What are the challenges in the server?&lt;/p&gt;
&lt;h2&gt;The Server&lt;/h2&gt;
&lt;p&gt;Picky does not have a standard external interface beyond the &lt;code&gt;Picky::Index&lt;/code&gt; and &lt;code&gt;Picky::Search&lt;/code&gt;, which searches over the indexes.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;index = Picky::Index.new(:name) do
  # ...
end
things = Picky::Search.new index
things.search 'something'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Of course, this is a very flexible approach, but comes with the problem that we need an implementation for all the different containers of Picky.&lt;/p&gt;
&lt;p&gt;In the case of Sinatra, it will offer an &lt;span class=&quot;caps&quot;&gt;HTTP&lt;/span&gt; interface to which the picky-activerecord gem will send updates.&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s see how we would implement that. For updates, we will define a put action:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;put '/' do
  index_name = params['index']
  index = Picky::Indexes[index_name.to_sym] # Get the right index from the indexes.
  index.replace_from params['data']
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The method &lt;code&gt;replace_from(hash)&lt;/code&gt; is available in edge currently. Error handling is omitted.&lt;/p&gt;
&lt;p&gt;Then we can write up the &lt;span class=&quot;caps&quot;&gt;DELETE&lt;/span&gt; action etc., wrap it into a nice module &lt;code&gt;Picky::Interfaces::External&lt;/code&gt;, for example.&lt;/p&gt;
&lt;p&gt;Finally, if someone wants their indexes updated by anything external, she would extend the Sinatra app with that &lt;code&gt;Module&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class MyPickyServer &amp;lt; Sinatra::Base
  extend Picky::Interfaces::External
  
  # ...
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then, when we&amp;#8217;d like to create/update/delete an indexed entry, we simply send an &lt;span class=&quot;caps&quot;&gt;HTTP&lt;/span&gt; request to the Picky server with the following payload:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  index: &quot;people&quot;,
  data: {
    id: 7,
    name: &quot;Florian&quot;,
    surname: &quot;Hanke&quot;
  }
}&lt;/code&gt;&lt;/pre&gt;
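&lt;p&gt;How the payload travels over the wire is not decided yet. As one possible sketch, using only the Ruby standard library, it could be sent as nested form fields, which a Rack-based app such as Sinatra typically exposes as &lt;code&gt;params['index']&lt;/code&gt; and &lt;code&gt;params['data']&lt;/code&gt;. The host, port and encoding are assumptions, and this is not necessarily what the eventual client gem will do:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'net/http'
require 'uri'

# Assumed location of the Picky server.
uri = URI('http://localhost:8080/')

# The payload from above, flattened into nested form fields.
fields = [
  ['index', 'people'],
  ['data[id]', 7],
  ['data[name]', 'Florian'],
  ['data[surname]', 'Hanke']
]

request = Net::HTTP::Put.new(uri.path)
request['Content-Type'] = 'application/x-www-form-urlencoded'
request.body = URI.encode_www_form(fields)

# Sending is left commented out so the sketch runs without a server:
# Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }&lt;/code&gt;&lt;/pre&gt;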
&lt;p&gt;Sounds easy so far, right?&lt;/p&gt;
&lt;p&gt;Ah, but what if we stop and restart Picky? What happens to the indexed data?&lt;/p&gt;
&lt;h3&gt;When Picky is restarted&lt;/h3&gt;
&lt;p&gt;Let&amp;#8217;s say you don&amp;#8217;t use the realtime &lt;code&gt;SQLite&lt;/code&gt; or &lt;code&gt;Redis&lt;/code&gt; persistent backend to store your indexes, but the standard &lt;code&gt;Memory&lt;/code&gt; backend.&lt;/p&gt;
&lt;p&gt;If we simply restarted, we would lose the indexes. We need a way to dump the data. One way to do this is simply dumping it when you quit Picky:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;at_exit { Picky::Indexes.dump }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;or a specific index:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;at_exit { the_index.dump }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And then, as you restart the server, you simply load the indexes. Probably in &lt;code&gt;config.ru&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Picky::Indexes.load&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I&amp;#8217;m quite excited about this!&lt;/p&gt;
&lt;p&gt;Sure, you have to write this yourself, but … you also &lt;strong&gt;&lt;span class=&quot;caps&quot;&gt;CAN&lt;/span&gt;&lt;/strong&gt; write it yourself. And control the behaviour of it. Dump it every X requests? Only on exit? I don&amp;#8217;t care! (I mean, I &lt;strong&gt;do&lt;/strong&gt;, but not how you do it :) )&lt;/p&gt;
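&lt;p&gt;As an illustration of the &amp;#8220;every X requests&amp;#8221; idea, here is a minimal sketch. &lt;code&gt;DumpEvery&lt;/code&gt; is a hypothetical helper, not part of Picky, and the lambda only counts; in a real server it would call &lt;code&gt;Picky::Indexes.dump&lt;/code&gt; instead:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical helper: calls the given dumper after every nth update.
class DumpEvery
  def initialize(every, dumper)
    @every   = every
    @dumper  = dumper
    @updates = 0
  end

  # Call this once per index update.
  def updated
    @updates += 1
    @dumper.call if (@updates % @every).zero?
  end
end

dumps  = 0
# Stand-in for the real dump call; it just counts how often it ran.
policy = DumpEvery.new(3, lambda { dumps += 1 })

7.times { policy.updated }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After seven updates with a dump every three, &lt;code&gt;dumps&lt;/code&gt; is &lt;code&gt;2&lt;/code&gt;. You would call &lt;code&gt;policy.updated&lt;/code&gt; from wherever your server handles an index update.&lt;/p&gt;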
&lt;p&gt;In closing, I like that in the documentation, picky-activerecord will only need a single line for the server: Add &lt;code&gt;extend Picky::Interfaces::External&lt;/code&gt; to your Sinatra app.&lt;/p&gt;
&lt;h3&gt;Other interfaces?&lt;/h3&gt;
&lt;p&gt;At the beginning, we will focus on writing an experimental/standard Sinatra interface.&lt;/p&gt;
&lt;p&gt;This will result in a nice Module that people can use to make their Sinatra Picky server open to external updates.&lt;/p&gt;
&lt;p&gt;But what about other interfaces?&lt;/p&gt;
&lt;p&gt;Since we expect the Picky Sinatra external interface to only be around 20-30 lines, we&amp;#8217;ll just leave it open for now and implement as the need arises.&lt;/p&gt;
&lt;h2&gt;The Client&lt;/h2&gt;
&lt;p&gt;We&amp;#8217;ll save the discussion on the client for later, but just quickly outline the ideas:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;It should offer a simple and easy configuration possibility, with the default being &lt;code&gt;host: 'localhost', port: 8080, path:'/'&lt;/code&gt;.&lt;/li&gt;
	&lt;li&gt;It hooks into the &lt;code&gt;after_save&lt;/code&gt; callback.&lt;/li&gt;
	&lt;li&gt;It offers the possibility to save arbitrary data (not just model &lt;code&gt;Person&lt;/code&gt;, or &lt;code&gt;Company&lt;/code&gt; etc., but arbitrary hashes, like &lt;code&gt;Music&lt;/code&gt;, including a list of &lt;code&gt;Genres&lt;/code&gt;, even though that combined object doesn&amp;#8217;t exist – I might note that it could make great sense to create a combined model like this).&lt;/li&gt;
	&lt;li&gt;It should be less than 100 lines. I&amp;#8217;m not kidding.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;You&lt;/h2&gt;
&lt;p&gt;What do you think of the server design? Any obvious flaws? Ideas? Suggestions by those who have used other, similar interfaces?&lt;/p&gt;
&lt;p&gt;Have you already started on writing a picky-activerecord gem? :D&lt;/p&gt;
&lt;p&gt;In any case, thanks for following the slow but steady progress of Picky!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;nbsp;Recipes</title>
   <link href="http://florianhanke.com/blog/2011/12/29/picky-recipes-1.html"/>
   <updated>2011-12-29T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/12/29/picky-recipes-1</id>
   <content type="html">&lt;p&gt;I&amp;#8217;m currently putting together a collection of Picky recipes.&lt;/p&gt;
&lt;p&gt;I noticed that people who wanted to try Picky had a bit of trouble getting into it. Sure, there is the &lt;a href=&quot;http://florianhanke.com/picky/getting_started.html&quot;&gt;getting started guide&lt;/a&gt; on the &lt;a href=&quot;http://florianhanke.com/picky&quot;&gt;main page&lt;/a&gt;. And there&amp;#8217;s also videos and blog posts.&lt;/p&gt;
&lt;h2&gt;The Quick Demo Fix&lt;/h2&gt;
&lt;p&gt;BUT. The question I should have asked myself is: When I try something for the first time, what do I need? I guess, like many others, I am guilty of being rather lazy when trying software – I need a quick fix.&lt;/p&gt;
&lt;p&gt;This led me to put up a quick copy and paste code example on the &lt;a href=&quot;http://florianhanke.com/picky&quot;&gt;main page&lt;/a&gt;. I haven&amp;#8217;t received any feedback on it yet (except from a friend who urged me to use syntax highlighting), but I am happy with it. It shows off the strengths of Picky nicely.&lt;/p&gt;
&lt;h2&gt;Customized Search Engines in Projects&lt;/h2&gt;
&lt;p&gt;However, &lt;strong&gt;all&lt;/strong&gt; of the Picky projects are projects where the search engine needed to be modified, from a little to – mostly – a lot. And in all the cases I or someone else helped get it right. I don&amp;#8217;t believe this is a problem of Picky, but mostly a mixture of not knowing what options there are and the fact that Picky is not a &amp;#8220;boolean&amp;#8221; search engine framework.
(And, I might add, some of the stunts would not be possible with a non-flexible search engine like … not-Picky)&lt;/p&gt;
&lt;h2&gt;The Recipes&lt;/h2&gt;
&lt;p&gt;This led me to start putting together a few examples which you can copy and paste quickly to see how something works and how it can be used.&lt;/p&gt;
&lt;p&gt;The &lt;a href=&quot;http://github.com/floere/picky/tree/master/recipes&quot;&gt;first 25 simple recipes&lt;/a&gt; are pushed to the picky repo. You can clone the project, then run the recipes by using &amp;#8220;rake&amp;#8221; on the command line inside the recipes directory.&lt;/p&gt;
&lt;p&gt;Let me show one or two that I like to whet your appetite.&lt;/p&gt;
&lt;h3&gt;Realtime Indexing vs. Static Indexing&lt;/h3&gt;
&lt;p&gt;These examples illustrate how to use the static vs. the realtime index.&lt;/p&gt;
&lt;p&gt;See &lt;a href=&quot;https://github.com/floere/picky/blob/master/recipes/basic/static_index.rb&quot;&gt;https://github.com/floere/picky/blob/master/recipes/basic/static_index.rb&lt;/a&gt; and &lt;a href=&quot;https://github.com/floere/picky/blob/master/recipes/basic/realtime_index.rb&quot;&gt;https://github.com/floere/picky/blob/master/recipes/basic/realtime_index.rb&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Static indexing is easier if you only index once per day and are happy to use &lt;code&gt;rake index&lt;/code&gt;. Realtime indexing shows you how you can update the index as you get new data.&lt;/p&gt;
&lt;h3&gt;Only finds evenly sized partials&lt;/h3&gt;
&lt;p&gt;This is a bit of a silly recipe, but it illustrates well how easy it is to add a custom partializer.&lt;/p&gt;
&lt;p&gt;Partial searches refer to somebody being able to search for &amp;#8220;flor&amp;#8221; and still finding &amp;#8220;florian&amp;#8221; (use &amp;#8220;flor*&amp;#8221; to explicitly search partially for that word).&lt;/p&gt;
&lt;p&gt;Now what we want is to find only partial words whose length is even. This is the recipe for it: &lt;a href=&quot;https://github.com/floere/picky/blob/master/recipes/partial/customized.rb&quot;&gt;https://github.com/floere/picky/blob/master/recipes/partial/customized.rb&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;To use it, we just pass in our own partializer&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;data = Picky::Index.new :people do
  category :first  
  category :last, partial: Partializer.new # &amp;lt;= Passed in here.
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;that is defined as&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Partializer
  def each_partial text
    temp = text.dup
    temp.length.times do
      yield temp if temp.size.even?
      temp.chop!
    end
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Picky just needs an object with an &lt;code&gt;each_partial&lt;/code&gt; method. Our special partializer chops the word apart until it is gone, and yields if the word is of even length.&lt;/p&gt;
&lt;p&gt;Thus we only find a partial if it is of even length.&lt;/p&gt;
&lt;p&gt;Wasn&amp;#8217;t that easy?&lt;/p&gt;
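&lt;p&gt;To see which partials this produces, here is a minimal sketch that runs the same partializer outside of Picky and collects what it yields. The one addition is that each partial is copied with &lt;code&gt;dup&lt;/code&gt; before it is collected, since the partializer yields the very string it then shortens with &lt;code&gt;chop!&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Partializer
  def each_partial text
    temp = text.dup
    temp.length.times do
      yield temp if temp.size.even?
      temp.chop!
    end
  end
end

partials = []
# Copy each partial, as the yielded string is chopped further afterwards.
Partializer.new.each_partial('florian') { |partial| partials.push partial.dup }

p partials # [&quot;floria&quot;, &quot;flor&quot;, &quot;fl&quot;]&lt;/code&gt;&lt;/pre&gt;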
&lt;h4&gt;With a Twist&lt;/h4&gt;
&lt;p&gt;Thanks to it yielding, we could have just wrapped one of the given partializers to do the work for us.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Partializer
  def initialize wrapped = Picky::Partial::Postfix.new(from: 1)
    @wrapped = wrapped
  end
  def each_partial text
    # Pass the text on to the wrapped partializer and filter what it yields.
    @wrapped.each_partial text do |partial|
      yield partial if partial.size.even?
    end
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I like it! The partializer doesn&amp;#8217;t really know what partializer it gets. However, it will still only yield partials that are of even length. Think of it as a filter when used in this style.&lt;/p&gt;
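&lt;p&gt;The filter idea can be tried without Picky, too. In this sketch, &lt;code&gt;EveryPrefix&lt;/code&gt; is a made-up stand-in for one of Picky&amp;#8217;s partializers (it is not part of Picky), and &lt;code&gt;EvenFilter&lt;/code&gt; is just the wrapper from above under a different name.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical stand-in for a Picky partializer: yields every prefix of the text.
class EveryPrefix
  def each_partial text
    text.size.downto(1) { |length| yield text[0, length] }
  end
end

# Wraps any object with an each_partial method and only lets even lengths through.
class EvenFilter
  def initialize wrapped
    @wrapped = wrapped
  end
  def each_partial text
    @wrapped.each_partial text do |partial|
      yield partial if partial.size.even?
    end
  end
end

partials = []
EvenFilter.new(EveryPrefix.new).each_partial('florian') { |partial| partials.push partial }

p partials # [&quot;floria&quot;, &quot;flor&quot;, &quot;fl&quot;]&lt;/code&gt;&lt;/pre&gt;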
&lt;h3&gt;Context Sensitive Advertisements&lt;/h3&gt;
&lt;p&gt;Let&amp;#8217;s say you want to search for people via name or location. In addition, you&amp;#8217;d like to show an advertisement next to the search results corresponding to the location.&lt;/p&gt;
&lt;p&gt;See &lt;a href=&quot;https://github.com/floere/picky/blob/master/recipes/advanced/advertisement.rb&quot;&gt;https://github.com/floere/picky/blob/master/recipes/advanced/advertisement.rb&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;So if someone searches for &amp;#8220;Florian Melbourne&amp;#8221;, it should find a Florian in Melbourne, but also show an ad from Melbourne.&lt;/p&gt;
&lt;p&gt;The problem is that if I just use two indexes (one for people, one for ads) and search in both, the ad index won&amp;#8217;t return any results if the query contains a name. So how do we make the ad search ignore &lt;em&gt;names&lt;/em&gt;?&lt;/p&gt;
&lt;p&gt;Picky tries to assign every search word to a likely category. What we&amp;#8217;d like is to only assign to locations, and if it is a name, to just ignore it.&lt;/p&gt;
&lt;p&gt;The magic thing to use here is &lt;code&gt;ignore_unassigned_tokens&lt;/code&gt;. So if a name cannot be assigned to a category, it will simply be ignored. That&amp;#8217;s it! Run the full example to see for yourself.&lt;/p&gt;
&lt;h2&gt;Yours?&lt;/h2&gt;
&lt;p&gt;If you have recipes to contribute, don&amp;#8217;t be shy. I&amp;#8217;d particularly be happy for a Rails one.&lt;/p&gt;
&lt;h2&gt;Outlook&lt;/h2&gt;
&lt;p&gt;I&amp;#8217;ll be adding recipes as I go. What do you think? Do the recipes help? Do they bewilder you? Do you find what you are looking for? Why? Why not?&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Unthinking&amp;nbsp;Autoloader</title>
   <link href="http://florianhanke.com/blog/2011/12/28/unthinking-autoloaders.html"/>
   <updated>2011-12-28T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/12/28/unthinking-autoloaders</id>
   <content type="html">&lt;p&gt;Let me inform you that the original title was &amp;#8220;Autoloading Is Cancer&amp;#8221;. That basically sets the scene.&lt;/p&gt;
&lt;p&gt;Don&amp;#8217;t know what autoloading is? Check out &lt;a href=&quot;http://www.rubyinside.com/ruby-techniques-revealed-autoload-1652.html&quot;&gt;Peter Cooper&amp;#8217;s quick intro&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Intro&lt;/h2&gt;
&lt;p&gt;I believe that autoloading is used for all the wrong reasons, and I posit that coders who use autoloading don&amp;#8217;t really know why they use it.&lt;/p&gt;
&lt;p&gt;If you &lt;strong&gt;are&lt;/strong&gt; using it: Do you know why? Maybe I am unfair here, but this is a blog post to shake you up a little bit. You filthy autoloading pig.&lt;/p&gt;
&lt;h2&gt;Readable and clean code&lt;/h2&gt;
&lt;p&gt;After years of coding, one of the most important functions of code for me is &lt;strong&gt;its readability&lt;/strong&gt;. I do it for &lt;strong&gt;myself&lt;/strong&gt;, but first for &lt;strong&gt;everyone&lt;/strong&gt; who has a problem with my lib&amp;#8217;s functionality and/or simply wants to know how something works.&lt;/p&gt;
&lt;p&gt;After all, code is the best documentation, and is used most as such (apart from its &lt;em&gt;raison d&amp;#8217;être&lt;/em&gt; of being run).&lt;/p&gt;
&lt;p&gt;If others can go in and read your code, and even enjoy it, and are learning something from it, then you know that you have a great lib.&lt;/p&gt;
&lt;p&gt;Even if the reader can&amp;#8217;t use it right away, he/she can take away something from it.&lt;/p&gt;
&lt;h2&gt;Information transportation&lt;/h2&gt;
&lt;p&gt;To take away something from your code, it needs to transport information (into your brain that is) in the most efficient fashion.&lt;/p&gt;
&lt;h3&gt;Contrast and compare&lt;/h3&gt;
&lt;p&gt;Let&amp;#8217;s see what information can be gained by reading this code:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'models/person'
require 'models/company'
require 'server/auxiliary'
require 'server/core'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;From this code, I can take away a lot of things!&lt;/p&gt;
&lt;p&gt;In order of importance:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;4 files are required (obvious)&lt;/li&gt;
	&lt;li&gt;Apparently we have model-related code and server-related code (obvious since someone has done his/her naming homework)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;But much cooler:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;The &lt;code&gt;Person&lt;/code&gt; model is probably* most independent.&lt;/li&gt;
	&lt;li&gt;The &lt;code&gt;Company&lt;/code&gt; model might depend on the &lt;code&gt;Person&lt;/code&gt;, but the &lt;code&gt;Person&lt;/code&gt; model is independent* of the &lt;code&gt;Company&lt;/code&gt; model.&lt;/li&gt;
	&lt;li&gt;The server code might use the models.&lt;/li&gt;
	&lt;li&gt;The auxiliary server code is probably* independent of the core server code.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The * refers to the fact that the code, dynamically, might still not be independent (since it could refer to a constant in a method etc.). If it isn&amp;#8217;t, and is required before the &amp;#8220;dependency&amp;#8221;, ewwww.&lt;/p&gt;
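&lt;p&gt;A small sketch of what the * means. The method name &lt;code&gt;employer&lt;/code&gt; is made up for illustration. &lt;code&gt;Person&lt;/code&gt; is defined before &lt;code&gt;Company&lt;/code&gt; and loads without complaint, because the constant is only looked up when the method runs:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Person
  def employer
    Company.new # Company is only looked up when this method is called.
  end
end

# Defined after Person, so the require order says nothing about this dependency.
class Company
end

p Person.new.employer.class # Company&lt;/code&gt;&lt;/pre&gt;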
&lt;p&gt;So, why do I think this independence thing is so cool? Let me show you another example:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'server/core'
require 'models/person'
require 'models/company'
require 'server/auxiliary'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This code tells me with a high probability that something might be awry in this code and I definitely need to take a closer look. Do the models refer to the server? Does the server not use the models at all? Does the &amp;#8220;auxiliary&amp;#8221; code use the models? Is this just a naming problem or do we have something at hand that needs to be trashed?&lt;/p&gt;
&lt;p&gt;Now that we know that one can gain quite a bit of information by reading code (surprise! :) ), let&amp;#8217;s take a look at the autoloading example:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;module Server
  autoload :Core,      'server/core'
  autoload :Auxiliary, 'server/auxiliary'
end
module Models
  autoload :Person,  'models/person'
  autoload :Company, 'models/company'
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;From this code we can take away the following things that are non-obvious:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Jack shit.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I have no clue as to who needs what, and why. No hints.&lt;/p&gt;
&lt;p&gt;This code fails me in readability on so many levels. Never mind that it introduces unneeded complexity. Also, can you tell me what happens when code like this is run in forked child processes? What about threads? What about both?&lt;/p&gt;
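&lt;p&gt;To make the hidden load order concrete, here is a minimal sketch. It assumes a throwaway file written to a temporary directory, and the file content is made up for illustration. The file is only loaded at the first reference to the constant, wherever in the program that happens to be:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'tmpdir'

Server = Module.new
states = []

Dir.mktmpdir do |dir|
  path = File.join dir, 'core.rb'
  File.write path, 'module Server; Core = Class.new; end'

  Server.autoload :Core, path
  states.push Server.autoload?(:Core) # The path: registered, but not loaded yet.
  core = Server::Core                 # The first reference triggers the load.
  states.push Server.autoload?(:Core) # nil: the constant is defined now.
end

p states.map { |state| state.nil? } # [false, true]&lt;/code&gt;&lt;/pre&gt;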
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;I noticed that many people use autoloading for three reasons:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;It&amp;#8217;s a cool Ruby technique. (I am ignoring this one)&lt;/li&gt;
	&lt;li&gt;They don&amp;#8217;t want to think about dependencies in their code. (Also ignoring this one)&lt;/li&gt;
	&lt;li&gt;It improves startup time.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Startup Time&lt;/h3&gt;
&lt;p&gt;Although it might help a bit by spreading the loading code over the run time of your program, believe me: This is not the place to solve this problem.&lt;/p&gt;
&lt;p&gt;A long startup time hints at deeper problems with code structure, unnecessary precaching, etc. in your lib or the libs you are using.&lt;/p&gt;
&lt;p&gt;Autoloading is not a solution for slow startup time. It is, at best, a quick fix for a problem which really is begging for some brains to be applied.&lt;/p&gt;
&lt;h2&gt;Final Question&lt;/h2&gt;
&lt;p&gt;Why do you use autoloading? Do you have good reasons that I haven&amp;#8217;t considered?&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Why&amp;nbsp;I&amp;nbsp;don't&amp;nbsp;use&amp;nbsp;round&amp;nbsp;brackets</title>
   <link href="http://florianhanke.com/blog/2011/12/18/why-i-dont-use-parentheses.html"/>
   <updated>2011-12-18T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/12/18/why-i-dont-use-parentheses</id>
   <content type="html">&lt;p&gt;This is a blog post for once &lt;span class=&quot;caps&quot;&gt;NOT&lt;/span&gt; &lt;span class=&quot;caps&quot;&gt;ABOUT&lt;/span&gt; &lt;span class=&quot;caps&quot;&gt;PICKY&lt;/span&gt;! :D So enjoy the tentacle-free space.&lt;/p&gt;
&lt;p&gt;Let me be blunt: I really don&amp;#8217;t like reading Ruby code that uses a lot of round brackets.&lt;/p&gt;
&lt;p&gt;No, let me be blunter: I hate reading code that uses a lot of round brackets.&lt;/p&gt;
&lt;p&gt;Actually, it&amp;#8217;s like this: &lt;strong&gt;Round brackets are the training wheels of a Ruby coder. They might be useful in the beginning, but at some point they should come off!&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;But let me be less contrarian and just show you why I don&amp;#8217;t use them anymore…&lt;/p&gt;
&lt;h2&gt;Weaning yourself off the training wheels&lt;/h2&gt;
&lt;p&gt;There&amp;#8217;s a few good reasons why I don&amp;#8217;t use round brackets anymore.&lt;/p&gt;
&lt;h3&gt;Less noise&lt;/h3&gt;
&lt;p&gt;Brackets introduce visual noise. Compare and contrast these two method signatures:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;def extract_from(text)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;with&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;def extract_from text&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;What do you gain by introducing brackets? Would you gain something by introducing them into text?&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;My name is(Florian Hanke)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you think this text example has nothing to do with code then we have different views on code readability. The version without brackets is more legible to me.&lt;/p&gt;
&lt;h3&gt;Law of Demeter&lt;/h3&gt;
&lt;p&gt;You&amp;#8217;ve probably heard of the &lt;a href=&quot;http://en.wikipedia.org/wiki/Law_of_Demeter&quot;&gt;Law of Demeter&lt;/a&gt;? If you haven&amp;#8217;t, please read about it :)&lt;/p&gt;
&lt;p&gt;Not wanting to use round brackets introduces a strain every time I am about to break the Law of Demeter.&lt;/p&gt;
&lt;p&gt;Consider this code:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;text = extract other_text&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, if I wanted to call another method on the result, I&amp;#8217;d have to write this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;text = extract(other_text).process&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Spotting violations is easy for me. I just look for the brackets. If I see brackets in my code, I instantly know that they are there for a good reason and that I actually had a reason to break the Law of Demeter.&lt;/p&gt;
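&lt;p&gt;Why the brackets are forced on me here: without them, the chained call binds to the argument instead of the result. A small sketch, with a made-up &lt;code&gt;extract&lt;/code&gt; that just returns the first four characters:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;def extract text
  text[0, 4]
end

chained   = extract('florian').reverse # reverse is called on the result of extract
unchained = extract 'florian'.reverse  # reverse is called on the argument first

p chained   # &quot;rolf&quot;
p unchained # &quot;nair&quot;&lt;/code&gt;&lt;/pre&gt;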
&lt;p&gt;Code like&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;a.b(c).d(e).f&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;is simply impossible for me, and that&amp;#8217;s a good thing!&lt;/p&gt;
&lt;h3&gt;Typing&lt;/h3&gt;
&lt;p&gt;This is not about typing speed. It is simply about comfort. The comfort of not having to do bracket acrobatics&amp;#8482;.&lt;/p&gt;
&lt;p&gt;Not using brackets lets you type as if the code was free text.&lt;/p&gt;
&lt;p&gt;As opposed to e.g. JavaScript, Ruby actually lets you do this, so take advantage.&lt;/p&gt;
&lt;h3&gt;Being explicit about no parameters&lt;/h3&gt;
&lt;p&gt;Two small counterpoints.&lt;/p&gt;
&lt;p&gt;I use RSpec. Chances are, you use it as well.&lt;/p&gt;
&lt;p&gt;There&amp;#8217;s an expression that goes like this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;thing.should_receive(:some_method).once.with&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It&amp;#8217;s a fluent interface, so using parentheses is ok for me. One of the exceptions. However, I even add them explicitly to tell the future me that I really don&amp;#8217;t expect any parameters:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;thing.should_receive(:some_method).once.with()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Equals &amp;#8220;with nothing&amp;#8221;.&lt;/p&gt;
&lt;p&gt;Another exception is the &amp;#8220;gobbler&amp;#8221; * argument to a method, where Ruby needs brackets to know what it is looking at.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;def try(*) end&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;But I&amp;#8217;m used to it!&lt;/h3&gt;
&lt;p&gt;Yes, and you&amp;#8217;re also trained on &lt;span class=&quot;caps&quot;&gt;QWERTY&lt;/span&gt;. Doesn&amp;#8217;t mean it was a good idea.&lt;/p&gt;
&lt;h3&gt;But, but, I need to help Ruby with reading my code!&lt;/h3&gt;
&lt;p&gt;Please. You&amp;#8217;re probably the first to cheer when the robot overlords arrive.&lt;/p&gt;
&lt;h3&gt;Conclusion&lt;/h3&gt;
&lt;p&gt;It&amp;#8217;s a good idea to be sceptical.&lt;/p&gt;
&lt;p&gt;I simply asked myself: Why am I actually using brackets when they are not needed?&lt;/p&gt;
&lt;p&gt;I couldn&amp;#8217;t think of good reasons, while I was able to find some reasons against using brackets.&lt;/p&gt;
&lt;p&gt;Hence, no brackets.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;caps&quot;&gt;WDYT&lt;/span&gt;?&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;nbsp;Search&amp;nbsp;Options</title>
   <link href="http://florianhanke.com/blog/2011/12/18/picky-search-options.html"/>
   <updated>2011-12-18T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/12/18/picky-search-options</id>
   <content type="html">&lt;p&gt;A few examples of the search options available in &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We&amp;#8217;re going to look at a simple example and how to search it with Picky 4.0!&lt;/p&gt;
&lt;h2&gt;The Copy &amp;amp; Paste Example&lt;/h2&gt;
&lt;p&gt;(This is the same example as in the last post)&lt;/p&gt;
&lt;p&gt;The example is simple. We have an index of 4 persons (you might recognize the two famous ones). Each person has a first and a last name. Then we use a &lt;code&gt;Search&lt;/code&gt; object on the index to search on it.&lt;/p&gt;
&lt;p&gt;Go ahead, copy it into TextMate or similar!&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'picky'

Person = Struct.new :id, :first, :last

data = Picky::Index.new :people do
  category :first
  category :last
end

data.replace Person.new(1, 'Donald', 'Knuth')
data.replace Person.new(2, 'Niklaus', 'Wirth')
data.replace Person.new(3, 'Donald', 'Worth')
data.replace Person.new(4, 'Peter', 'Niklaus')

people = Picky::Search.new data

results = people.search 'donald'

p results.ids
p results.allocations&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This returns ids &lt;code&gt;[3, 1]&lt;/code&gt; and
the allocations &lt;code&gt;[ [:people, 0.0, 2, [ [:first, &quot;donald&quot;, &quot;donald&quot;] ], [3, 1]] ]&lt;/code&gt;. That might look a little funny, so let me explain: &lt;code&gt;:people&lt;/code&gt; is the index name where it was found. &lt;code&gt;0.0&lt;/code&gt; is the total weight. &lt;code&gt;2&lt;/code&gt; is the total number of ids in this &amp;#8220;allocation&amp;#8221; (combination of categories).
&lt;code&gt;[:first, &quot;donald&quot;, &quot;donald&quot;]&lt;/code&gt; is the category the query word was found in, together with the token and the original.&lt;/p&gt;
&lt;p&gt;All clear?&lt;/p&gt;
&lt;p&gt;Try searching for &amp;#8220;Niklaus&amp;#8221;:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'niklaus'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You should find ids &lt;code&gt;[2, 4]&lt;/code&gt; and two allocations now, first in the first name, then in the last name.&lt;/p&gt;
&lt;p&gt;Cool. Are there some options to fudge the search?&lt;/p&gt;
&lt;p&gt;Sure!&lt;/p&gt;
&lt;h3&gt;boost&lt;/h3&gt;
&lt;p&gt;To move an allocation up in the ranking, we used weights (see last post).&lt;/p&gt;
&lt;p&gt;Picky knows a trick that almost no search engine knows. It can &lt;strong&gt;boost combinations&lt;/strong&gt;!&lt;/p&gt;
&lt;p&gt;Look for:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'Donald Knuth'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Looking at the allocations, we see that Picky tells us that Donald was found in a first name, and Knuth in a last name:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;[[:people, 0.693, 1, [[:first, &quot;donald&quot;, &quot;donald&quot;], [:last, &quot;knuth&quot;, &quot;knuth&quot;]], [1]]]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That&amp;#8217;s pretty useful to know what was found where.&lt;/p&gt;
&lt;p&gt;As people usually look for the first name, then the last name, we want to give this more boost.&lt;/p&gt;
&lt;p&gt;Replace this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;people = Picky::Search.new data&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;with this&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;people = Picky::Search.new data do
  boost [:first, :last] =&amp;gt; +3
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now try again:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'Donald Knuth'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A whole 3 points more! Try it the other way around:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'Knuth Donald'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We don&amp;#8217;t get the boost. This is incredibly useful: If you look at how people search and then support them this way, they will find relevant results even more easily!&lt;/p&gt;
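&lt;p&gt;The order sensitivity can be pictured as a lookup keyed by the exact sequence of categories. This is only a sketch of the idea in plain Ruby, not Picky&amp;#8217;s implementation. The helper &lt;code&gt;boosted&lt;/code&gt; is made up, and 0.693 is the weight from the allocation above:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Combinations without an entry get a boost of 0.
boosts = Hash.new 0
boosts[[:first, :last]] = 3

# Hypothetical helper: adds the boost for this exact order of categories.
def boosted weight, categories, boosts
  weight + boosts[categories]
end

p boosted(0.693, [:first, :last], boosts) # first name, then last name: 3 is added
p boosted(0.693, [:last, :first], boosts) # the other way around: no boost&lt;/code&gt;&lt;/pre&gt;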
&lt;h3&gt;max_allocations&lt;/h3&gt;
&lt;p&gt;Sometimes you only want the best allocation to appear in the results.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'Niklaus'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This finds two ids and two allocations, once in the first name, once in the last name.&lt;/p&gt;
&lt;p&gt;Replace the search definition with:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;people = Picky::Search.new data do
  max_allocations 1
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now Picky only calculates 1 allocation. Try&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'Niklaus'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Only the best allocation is found.&lt;/p&gt;
&lt;h3&gt;ignore_unassigned_tokens&lt;/h3&gt;
&lt;p&gt;Did Donald Knuth ever have the nickname &amp;#8220;Popeye&amp;#8221;? Try this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'Donald Popeye Knuth'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Not really. But what if we want to find him even if one token cannot be assigned to a category?&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;people = Picky::Search.new data do
  ignore_unassigned_tokens
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Try again:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'Donald Popeye Knuth'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Voilà!&lt;/p&gt;
&lt;p&gt;This is incredibly useful for an advertisement search. Say in the ads index you only index the city where a person lives. If someone looks for &lt;code&gt;Florian Hanke Melbourne&lt;/code&gt;, you can show the person relevant ads from Melbourne.&lt;/p&gt;
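&lt;p&gt;Reduced to plain Ruby, the idea looks roughly like this. It is a sketch of the concept only, not of how Picky does it, and the list of cities is a made-up stand-in for the ads index:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;cities = ['melbourne', 'zurich'] # Hypothetical content of the ads index.

tokens   = 'Florian Hanke Melbourne'.downcase.split
assigned = tokens.select { |token| cities.include? token }

p assigned # [&quot;melbourne&quot;], the name tokens are simply dropped&lt;/code&gt;&lt;/pre&gt;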
&lt;h3&gt;terminate_early&lt;/h3&gt;
&lt;p&gt;Search for niklaus, and tell Picky you only want 1 id:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'Niklaus', 1&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Yes, Picky only calculates 1 id, but still calculates and returns all valid allocations. If you only really need the ids (the Picky interface needs the allocations), then calculating the remaining allocations is unnecessary work, and the search could be faster without it.&lt;/p&gt;
&lt;p&gt;Replace the search definition with:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;people = Picky::Search.new data do
  terminate_early
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Try again:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'Niklaus', 1&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Hey presto! Just one allocation.&lt;/p&gt;
&lt;p&gt;This code&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;people = Picky::Search.new data do
  terminate_early +2
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;will tell Picky to calculate all necessary allocations, plus 2 following ones, for good measure.&lt;/p&gt;
&lt;h3&gt;ignore&lt;/h3&gt;
&lt;p&gt;Try this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'Niklaus'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You&amp;#8217;ll get results in first and last name. If you only wanted results from the first name, you&amp;#8217;d search for this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'first:Niklaus'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Cool. But let&amp;#8217;s say: You, the search engine designer, don&amp;#8217;t want anybody to find anything in a last name, for any reason. Using &lt;code&gt;first:&lt;/code&gt; will select only first. But you might only want to remove the &lt;code&gt;last&lt;/code&gt; category. Do this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;people = Picky::Search.new data do
  ignore :last
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Try again:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'Niklaus'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This time, Niklaus is no longer found in the last name.&lt;/p&gt;
&lt;p&gt;You can give it even more:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;people = Picky::Search.new data do
  ignore :first, :last
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;But that is pretty silly in this example. Picky won&amp;#8217;t find anything anymore!&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;And those are the options the Picky Search object has. As you&amp;#8217;ve seen in the last post, some searching is defined on the indexes, but some options are exclusive to the search side, and are only defined there.&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s best to play a bit to unlock their versatility and power :)&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;nbsp;APIs</title>
   <link href="http://florianhanke.com/blog/2011/12/18/picky-apis.html"/>
   <updated>2011-12-18T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/12/18/picky-apis</id>
   <content type="html">&lt;p&gt;A few examples of how to inject your own functionality into &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We&amp;#8217;re going to look at a simple example and how to customize it with Picky 4.0!&lt;/p&gt;
&lt;h2&gt;The Copy &amp;amp; Paste Example&lt;/h2&gt;
&lt;p&gt;The example is simple. We have an index of 4 persons (you might recognize the two famous ones). Each person has a first and a last name. Then we use a &lt;code&gt;Search&lt;/code&gt; object on the index to search on it.&lt;/p&gt;
&lt;p&gt;Go ahead, copy it into TextMate 2 Alpha or similar!&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'picky'

Person = Struct.new :id, :first, :last

data = Picky::Index.new :people do
  category :first
  category :last
end

data.replace Person.new(1, 'Donald', 'Knuth')
data.replace Person.new(2, 'Niklaus', 'Wirth')
data.replace Person.new(3, 'Donald', 'Worth')
data.replace Person.new(4, 'Peter', 'Niklaus')

people = Picky::Search.new data

results = people.search 'donald'

p results.ids
p results.allocations&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This returns ids &lt;code&gt;[3, 1]&lt;/code&gt; and
the allocations &lt;code&gt;[ [:people, 0.0, 2, [ [:first, &quot;donald&quot;, &quot;donald&quot;] ], [3, 1]] ]&lt;/code&gt;. That might look a little funny, so let me explain: &lt;code&gt;:people&lt;/code&gt; is the index name where it was found. &lt;code&gt;0.0&lt;/code&gt; is the total weight. &lt;code&gt;2&lt;/code&gt; is the total number of ids in this &amp;#8220;allocation&amp;#8221; (combination of categories).
&lt;code&gt;[:first, &quot;donald&quot;, &quot;donald&quot;]&lt;/code&gt; is the category the query word was found in, together with the token and the original.&lt;/p&gt;
&lt;p&gt;All clear?&lt;/p&gt;
&lt;p&gt;Try searching for &amp;#8220;Niklaus&amp;#8221;:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'niklaus'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You should find ids &lt;code&gt;[2, 4]&lt;/code&gt; and two allocations now, first in the first name, then in the last name.&lt;/p&gt;
&lt;p&gt;What if you want to find the last name first? We add some weight to it!&lt;/p&gt;
&lt;h3&gt;Adding weight&lt;/h3&gt;
&lt;p&gt;By default, Picky already weighs the categories with a logarithmic weight. That is, the more often a token occurs in a category, the &amp;#8220;heavier&amp;#8221; it is.&lt;/p&gt;
&lt;p&gt;So this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;category :last&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;is actually&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;category :last, weight: Picky::Weights::Logarithmic.new&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;However, for &amp;#8220;Niklaus&amp;#8221;, that resolves to a weight of 0.0.&lt;/p&gt;
&lt;p&gt;So let&amp;#8217;s add our own weight object. It just needs to respond to &lt;code&gt;#weight_for(amount_of_ids)&lt;/code&gt; and return a float.&lt;/p&gt;
&lt;p&gt;We ignore the amount and return a flat 12.3. Copy this into your example:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Weight = Class.new do
  def weight_for amount
    12.3
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and replace&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;category :last, weight: Weight.new&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now the last name comes first, with a weight of 12.3, not surprisingly.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;[[:people, 12.3, 1, [[:last, &quot;niklaus&quot;, &quot;niklaus&quot;]], [4]], [:people, 0.0, 1, [[:first, &quot;niklaus&quot;, &quot;niklaus&quot;]], [2]]]&lt;/code&gt;&lt;/pre&gt;
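&lt;p&gt;Since a weight is just an object that responds to &lt;code&gt;weight_for&lt;/code&gt;, it can be tried on its own. In this sketch, &lt;code&gt;LogWeight&lt;/code&gt; is an assumption about what a logarithmic weight roughly does (a natural logarithm of the amount of ids), not Picky&amp;#8217;s actual class. It does show why a token with a single id ends up at 0.0:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# The flat weight from above, under a different name.
FlatWeight = Class.new do
  def weight_for amount
    12.3
  end
end

# Assumption: a logarithmic weight along the lines of Math.log.
LogWeight = Class.new do
  def weight_for amount
    Math.log amount
  end
end

p FlatWeight.new.weight_for(1) # 12.3
p LogWeight.new.weight_for(1)  # 0.0&lt;/code&gt;&lt;/pre&gt;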
&lt;p&gt;Picky provides a few weights itself:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;code&gt;Picky::Weights::Logarithmic.new&lt;/code&gt; The default.&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;Picky::Weights::Constant.new&lt;/code&gt; (with 0.0) or &lt;code&gt;Picky::Weights::Constant.new(1.23)&lt;/code&gt; (with 1.23)&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;Picky::Weights::Dynamic.new { |str_or_sym| str_or_sym.size }&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What if we want &amp;#8220;Wirth&amp;#8221; and &amp;#8220;Worth&amp;#8221; to be found at the same time?&lt;/p&gt;
&lt;h3&gt;Adding similarity&lt;/h3&gt;
&lt;p&gt;By default, Picky does not look for similar words.&lt;/p&gt;
&lt;p&gt;This:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;category :last&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;is actually&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;category :last, similarity: Picky::Similarity::None.new&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, look for &amp;#8220;warth~&amp;#8221; (the ~ tells Picky to look for similar words):&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'warth~'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You found nothing, right?&lt;/p&gt;
&lt;p&gt;Picky only looks for similar words if the category enables it!&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s write a similarity such that both will be found. Copy this into your example:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Similarity = Class.new do
  def encode text
    text.gsub /[aeiou]/, ''
  end
  def prioritize ary, encoded
    # Intentionally empty: we neither sort nor trim the list of similar words.
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We encode a text by removing its vowels. This makes &amp;#8220;wirth&amp;#8221; and &amp;#8220;worth&amp;#8221; both resolve to &amp;#8220;wrth&amp;#8221;, and that makes them similar.
(The &lt;code&gt;prioritize&lt;/code&gt; method allows you to sort and trim the list of similar words.)&lt;/p&gt;
&lt;p&gt;and replace the &lt;code&gt;:last&lt;/code&gt; category definition with:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;category :last, similarity: Similarity.new&lt;/code&gt;&lt;/pre&gt;
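&lt;p&gt;To see why the query will match, you can try the encoding on its own, without Picky. This sketch uses a stand-in class (the name &lt;code&gt;VowelStripper&lt;/code&gt; is made up) with the same &lt;code&gt;encode&lt;/code&gt; method:&lt;/p&gt;

```ruby
# Stand-in for the similarity class above; only the encoding part.
class VowelStripper
  def encode text
    text.gsub(/[aeiou]/, '')
  end
end

encoder = VowelStripper.new
codes = %w[wirth worth warth].map { |word| encoder.encode word }
puts codes.inspect # ["wrth", "wrth", "wrth"]
```

&lt;p&gt;All three spellings end up with the same code, so the query and both indexed names count as similar.&lt;/p&gt;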
&lt;p&gt;Again, search for &amp;#8220;warth~&amp;#8221;.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'warth~'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This time you found both, right?&lt;/p&gt;
&lt;p&gt;Picky offers &lt;code&gt;Similarity::Soundex.new(amount_of_similar)&lt;/code&gt;, &lt;code&gt;Similarity::Metaphone.new(amount_of_similar)&lt;/code&gt; and &lt;code&gt;Similarity::DoubleMetaphone.new(amount_of_similar)&lt;/code&gt;. But rolling your own is easy, as you have seen.&lt;/p&gt;
&lt;h3&gt;Adding partial searching&lt;/h3&gt;
&lt;p&gt;Can you find Donald Knuth by entering &amp;#8220;Donal&amp;#8221;?&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'donal'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can. But why?&lt;/p&gt;
&lt;p&gt;The word &amp;#8220;donal&amp;#8221; finds something because this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;category :first&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;is actually&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;category :first, partial: Partial::Postfix.new(from: -3)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That means it finds &amp;#8220;dona&amp;#8221;, &amp;#8220;donal&amp;#8221;, &amp;#8220;donald&amp;#8221;. Try them all!&lt;/p&gt;
&lt;p&gt;Does it find &amp;#8220;don&amp;#8221;? Try it:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'don'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;No, it doesn&amp;#8217;t! We could use &lt;code&gt;Partial::Postfix.new(from: -4)&lt;/code&gt; to include this case, but let&amp;#8217;s write our own :)&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Partial = Class.new do
  def each_partial text
    text = text.dup
    (text.size - 1).times do
      yield text.chop!
    end
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and replace the &lt;code&gt;:first&lt;/code&gt; category definition with:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;category :first, partial: Partial.new&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Try again:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'don'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now we find Donald. You can even do this with our partial code:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'd'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We still find him.&lt;/p&gt;
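&lt;p&gt;If you are curious which partials our class generates, you can call &lt;code&gt;each_partial&lt;/code&gt; yourself. This is a plain Ruby sketch with a stand-in class (the name &lt;code&gt;ChoppingPartial&lt;/code&gt; is made up). Note that &lt;code&gt;chop!&lt;/code&gt; keeps shortening the same string object, so we copy each partial before collecting it:&lt;/p&gt;

```ruby
# Same method body as the partial class above, under a different name.
class ChoppingPartial
  def each_partial text
    text = text.dup
    (text.size - 1).times do
      yield text.chop!
    end
  end
end

partials = []
ChoppingPartial.new.each_partial('donald') do |partial|
  partials.push partial.dup # copy, because the yielded string is mutated later
end
puts partials.inspect # ["donal", "dona", "don", "do", "d"]
```

&lt;p&gt;Every prefix down to a single character is generated, which is why even &amp;#8220;d&amp;#8221; finds Donald.&lt;/p&gt;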
&lt;p&gt;Now, Picky already offers a few partial behaviours:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;code&gt;Partial::None.new&lt;/code&gt; (Do not search for a partial)&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;Partial::Postfix.new(from: position)&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;Partial::Substring.new(from: position, to: position)&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;Partial::Infix.new(min: size, max: size)&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;One important note: Picky always searches for the last token in the partial index, even without the asterisk next to the word. If it&amp;#8217;s not the last word, you need an asterisk: &amp;#8220;Don* Knuth&amp;#8221;.&lt;/p&gt;
&lt;h3&gt;Boosting&lt;/h3&gt;
&lt;p&gt;To move an allocation up in the ranking, we used weights.&lt;/p&gt;
&lt;p&gt;Picky knows a trick that almost no search engine knows. It can &lt;strong&gt;boost combinations&lt;/strong&gt;!&lt;/p&gt;
&lt;p&gt;Look for:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'Donald Knuth'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Looking at the allocations, we see that Picky tells us that Donald was found in a first name, and Knuth in a last name:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;[[:people, 0.693, 1, [[:first, &quot;donald&quot;, &quot;donald&quot;], [:last, &quot;knuth&quot;, &quot;knuth&quot;]], [1]]]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It&amp;#8217;s pretty useful to know what was found where.&lt;/p&gt;
&lt;p&gt;As people usually look for the first name, then the last name, we want to give this more boost.&lt;/p&gt;
&lt;p&gt;Replace this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;people = Picky::Search.new data&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;with this&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;people = Picky::Search.new data do
  boost [:first, :last] =&amp;gt; +3
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now try again:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'Donald Knuth'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A whole 3 points more! Try it the other way around:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'Knuth Donald'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We don&amp;#8217;t get the boost. This is incredibly useful: If you look at how people search and then support them this way, they will find relevant results even more easily!&lt;/p&gt;
&lt;p&gt;But what if we want to boost in a specific way?&lt;/p&gt;
&lt;h3&gt;Custom Boosting&lt;/h3&gt;
&lt;p&gt;Copy this into the example:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Boosts = Class.new do
  def boost_for combinations
    @map ||= {
      [:first, :last] =&amp;gt; +5
    }
    @map[combinations.map(&amp;amp;:category_name)] || -20
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;(A combination is basically a tuple of category and token)&lt;/p&gt;
&lt;p&gt;and replace:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;people = Picky::Search.new data do
  boost Boosts.new
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now try again:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'Donald Knuth'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A whole 5 points more! Try it the other way around:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = people.search 'Knuth Donald'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A whopping -20, which would send this allocation to the end of the list if there were more data.&lt;/p&gt;
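&lt;p&gt;The boost lookup can also be tried without Picky. In this sketch, &lt;code&gt;Combination&lt;/code&gt; is a made-up stand-in for the real combination objects (all we assume is that they respond to &lt;code&gt;category_name&lt;/code&gt;), and &lt;code&gt;ExampleBoosts&lt;/code&gt; repeats the logic from above under a different name:&lt;/p&gt;

```ruby
# Hypothetical stand-in for the combination objects handed to #boost_for.
Combination = Struct.new(:category_name, :token)

class ExampleBoosts
  def boost_for combinations
    @map ||= { [:first, :last] => +5 }
    # Look up the ordered list of category names; anything unknown gets -20.
    @map[combinations.map { |combination| combination.category_name }] || -20
  end
end

first = Combination.new(:first, 'donald')
last  = Combination.new(:last, 'knuth')

boosts = ExampleBoosts.new
puts boosts.boost_for([first, last]) # 5
puts boosts.boost_for([last, first]) # -20
```

&lt;p&gt;Because the hash key is an ordered array, the order of the categories matters.&lt;/p&gt;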
&lt;h3&gt;Conclusion&lt;/h3&gt;
&lt;p&gt;I hope you&amp;#8217;re going to try Picky in your next project.&lt;/p&gt;
&lt;p&gt;See the next post for some fancy search options.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;nbsp;4.0</title>
   <link href="http://florianhanke.com/blog/2011/12/18/picky-4-0-0.html"/>
   <updated>2011-12-18T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/12/18/picky-4-0-0</id>
   <content type="html">&lt;p&gt;&lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; 4.0 release &amp;#8211; a quick description of the goals and the changes from version 3.6.16. More to come later.&lt;/p&gt;
&lt;h2&gt;Goals&lt;/h2&gt;
&lt;p&gt;The ultimate goal of Picky is to become &lt;strong&gt;&lt;span class=&quot;caps&quot;&gt;THE&lt;/span&gt;&lt;/strong&gt; choice for a &lt;strong&gt;lightweight&lt;/strong&gt; search engine, as &lt;strong&gt;flexible&lt;/strong&gt; as possible, regarding the container on one hand (useable in a script/a &lt;a href=&quot;http://sinatrarb.com&quot;&gt;Sinatra&lt;/a&gt; instance, a DRb server, wherever) and in itself on the other hand, offering a rich &lt;a href=&quot;https://github.com/floere/picky/blob/master/server/APIs.textile&quot;&gt;&lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;&lt;/a&gt; where you plug in search engine behavior.&lt;/p&gt;
&lt;p&gt;Release 4.0 is another big step towards these goals.&lt;/p&gt;
&lt;h2&gt;Thanks&lt;/h2&gt;
&lt;p&gt;Thanks to all who helped with this release! Among others: &lt;a href=&quot;http://twitter.com/rogerbraun&quot;&gt;Roger Braun&lt;/a&gt;, &lt;a href=&quot;http://twitter.com/ende42&quot;&gt;Niko Dittmann&lt;/a&gt;, &lt;a href=&quot;http://twitter.com/kasparschiess&quot;&gt;Kaspar Schiess&lt;/a&gt;, &lt;a href=&quot;http://twitter.com/glenmaddern&quot;&gt;Glen Maddern&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Changes (tl;dr)&lt;/h2&gt;
&lt;p&gt;The one big change is that both the classic Picky application and classic Picky sources have been removed. If you need these, please continue using 3.6.16.&lt;/p&gt;
&lt;p&gt;If you want to jump on 4.0, replace the classic application with a Sinatra app and convert your source into one that responds to &lt;code&gt;#each&lt;/code&gt; (See the Wiki on sources).&lt;/p&gt;
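&lt;p&gt;As a rough idea of what a source responding to &lt;code&gt;#each&lt;/code&gt; can look like, here is a minimal sketch. The names &lt;code&gt;Person&lt;/code&gt; and &lt;code&gt;PeopleSource&lt;/code&gt; are made up for illustration, and the assumption is that each yielded object has an &lt;code&gt;id&lt;/code&gt; and one method per category:&lt;/p&gt;

```ruby
# Hypothetical record type with an id and two category methods.
Person = Struct.new(:id, :first, :last)

# A source here is simply an object that responds to #each.
class PeopleSource
  def initialize people
    @people = people
  end

  def each
    @people.each { |person| yield person }
  end
end

source = PeopleSource.new([
  Person.new(1, 'Donald', 'Knuth'),
  Person.new(2, 'Niklaus', 'Wirth')
])
source.each { |person| puts "#{person.id}: #{person.first} #{person.last}" }
```

&lt;p&gt;A plain array of such objects responds to &lt;code&gt;#each&lt;/code&gt; as well, so a wrapper class is only needed if you load the data from somewhere else.&lt;/p&gt;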
&lt;p&gt;Other important changes:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;code&gt;Picky::Index&lt;/code&gt;, option &lt;code&gt;weights&lt;/code&gt; has been renamed to &lt;code&gt;weight&lt;/code&gt;.&lt;/li&gt;
	&lt;li&gt;Picky uses the &lt;a href=&quot;https://github.com/kschiess/procrastinate&quot;&gt;procrastinate&lt;/a&gt; gem to parallelize indexing.&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;Picky::Indexes.reload&lt;/code&gt; =&amp;gt; &lt;code&gt;Picky::Indexes.load&lt;/code&gt;, and analogously on &lt;code&gt;Index&lt;/code&gt; and &lt;code&gt;Category&lt;/code&gt;.&lt;/li&gt;
	&lt;li&gt;If you call any &lt;code&gt;define_*&lt;/code&gt; methods, please remove the &lt;code&gt;define_&lt;/code&gt; part.&lt;/li&gt;
	&lt;li&gt;If you defined a &lt;code&gt;source { with a block }&lt;/code&gt;, the block is now evaluated each time the indexer runs on a category.&lt;/li&gt;
	&lt;li&gt;Rake task &lt;code&gt;rake index:parallel&lt;/code&gt; is used by &lt;code&gt;rake index&lt;/code&gt;. If you can&amp;#8217;t index in multiple processes, please use &lt;code&gt;rake index:serial&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Detailed Changes&lt;/h2&gt;
&lt;p&gt;This is for users that are currently on version 3.6.&amp;#215;. Extracted from the &lt;a href=&quot;https://github.com/floere/picky/blob/master/history.textile&quot;&gt;history.textile&lt;/a&gt; file:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;hanke: (server) &lt;span class=&quot;caps&quot;&gt;BREAKING&lt;/span&gt; &lt;code&gt;Picky::Indexes.index&lt;/code&gt; does not index in parallel anymore.&lt;/li&gt;
	&lt;li&gt;hanke: (server) &lt;span class=&quot;caps&quot;&gt;BREAKING&lt;/span&gt; Renamed &lt;code&gt;Picky::Indexes.index_for_tests&lt;/code&gt; to &lt;code&gt;Picky::Indexes.index&lt;/code&gt;.&lt;/li&gt;
	&lt;li&gt;hanke: (server) If you want to explicitly run parallel indexing programmatically, use &lt;code&gt;Picky::Indexes.index Picky::Scheduler.new(parallel: true)&lt;/code&gt; or &lt;code&gt;Picky::Indexes[:index_name].index Picky::Scheduler.new(parallel: true)&lt;/code&gt;.&lt;/li&gt;
	&lt;li&gt;hanke: (server) &lt;span class=&quot;caps&quot;&gt;BREAKING&lt;/span&gt; Renamed &lt;code&gt;Picky::Wrappers::Category::ExactFirst&lt;/code&gt; to &lt;code&gt;Picky::Results::ExactFirst&lt;/code&gt;. Extend instead of wrap: &lt;code&gt;index.extend Results::ExactFirst&lt;/code&gt; or &lt;code&gt;category.extend Results::ExactFirst&lt;/code&gt;. If an index is extended, each category of the index will be extended.&lt;/li&gt;
	&lt;li&gt;hanke: (server) &lt;span class=&quot;caps&quot;&gt;BREAKING&lt;/span&gt; &lt;code&gt;Picky::Indexes.reload&lt;/code&gt; has been renamed to &lt;code&gt;Picky::Indexes.load&lt;/code&gt;.&lt;/li&gt;
	&lt;li&gt;hanke: (server) &lt;span class=&quot;caps&quot;&gt;BREAKING&lt;/span&gt; &lt;code&gt;index.reload&lt;/code&gt; has been renamed to &lt;code&gt;index.load&lt;/code&gt;.&lt;/li&gt;
	&lt;li&gt;hanke: (server) &lt;span class=&quot;caps&quot;&gt;BREAKING&lt;/span&gt; &lt;code&gt;category.reload&lt;/code&gt; has been renamed to &lt;code&gt;category.load&lt;/code&gt;.&lt;/li&gt;
	&lt;li&gt;hanke: (server) &lt;span class=&quot;caps&quot;&gt;BREAKING&lt;/span&gt; Removed all &lt;code&gt;define_...&lt;/code&gt; methods on indexes.&lt;/li&gt;
	&lt;li&gt;hanke: (server) &lt;span class=&quot;caps&quot;&gt;BREAKING&lt;/span&gt; Removed Picky classic application. Please use Picky e.g. in a Sinatra app.&lt;/li&gt;
	&lt;li&gt;hanke: (server) &lt;span class=&quot;caps&quot;&gt;BREAKING&lt;/span&gt; Removed Picky classic sources. Please use a source with the #each method.&lt;/li&gt;
	&lt;li&gt;hanke: (server) &lt;span class=&quot;caps&quot;&gt;BREAKING&lt;/span&gt; Option &lt;code&gt;weights&lt;/code&gt; for the &lt;code&gt;Picky::Index#category&lt;/code&gt; method has been renamed &lt;code&gt;weight&lt;/code&gt; to conform with the other methods.&lt;/li&gt;
	&lt;li&gt;hanke: (server) &lt;span class=&quot;caps&quot;&gt;BREAKING&lt;/span&gt; Picky does not require the text gem anymore by default. Only when you use phonetic similarity. It will tell you what it needs.&lt;/li&gt;
	&lt;li&gt;hanke: (server) &lt;span class=&quot;caps&quot;&gt;BREAKING&lt;/span&gt; Added the PICKY_ENVIRONMENT in front of the Redis key namespace to differentiate the various environments.&lt;/li&gt;
	&lt;li&gt;hanke: (server) &lt;span class=&quot;caps&quot;&gt;BREAKING&lt;/span&gt; Removed &lt;code&gt;rake routes&lt;/code&gt; since only the classic server was able to provide it.&lt;/li&gt;
	&lt;li&gt;hanke: (server) &lt;span class=&quot;caps&quot;&gt;BREAKING&lt;/span&gt; Removed the classic server from the generators.&lt;/li&gt;
	&lt;li&gt;hanke: (server) &lt;span class=&quot;caps&quot;&gt;BREAKING&lt;/span&gt; Reverting customizable backends from version 3.3.2. They are no longer available. Please use simple subclassing to achieve funky backends.&lt;/li&gt;
	&lt;li&gt;hanke: (server) &lt;span class=&quot;caps&quot;&gt;BREAKING&lt;/span&gt; The SQLite &lt;code&gt;self_indexed&lt;/code&gt; and Redis &lt;code&gt;immediate&lt;/code&gt; options are now called &lt;code&gt;realtime&lt;/code&gt;, as changes go directly through to the actual backends, in &amp;#8220;realtime&amp;#8221;.&lt;/li&gt;
	&lt;li&gt;hanke: (server) &lt;span class=&quot;caps&quot;&gt;BREAKING&lt;/span&gt; The &lt;code&gt;tokenizer&lt;/code&gt; option for a category has been renamed to &lt;code&gt;indexing&lt;/code&gt;, to conform with the methods for the index and the sinatra app.&lt;/li&gt;
	&lt;li&gt;hanke: (server) &lt;span class=&quot;caps&quot;&gt;BREAKING&lt;/span&gt; Internal &lt;code&gt;Similarity#encoded&lt;/code&gt; method has been renamed to &lt;code&gt;#encode&lt;/code&gt;.&lt;/li&gt;
	&lt;li&gt;hanke: (statistics) Overhauled statistics interface. Use &lt;code&gt;picky statistics log/search.log&lt;/code&gt; to start it.&lt;/li&gt;
	&lt;li&gt;hanke: (server) The &lt;code&gt;Index#source&lt;/code&gt; block is now evaluated every time an indexer runs.&lt;/li&gt;
	&lt;li&gt;hanke: (server) Explicitly uses &lt;code&gt;Yajl::Encoder#encode&lt;/code&gt; for &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt; encoding.&lt;/li&gt;
	&lt;li&gt;hanke: (server) Fixed cases where even when no similarity was defined on a category, similar results were still found.&lt;/li&gt;
	&lt;li&gt;hanke: (server) Rake task &lt;code&gt;index&lt;/code&gt; now points to task &lt;code&gt;index:parallel&lt;/code&gt; by default. Call &lt;code&gt;rake index:serial&lt;/code&gt; to index serially.&lt;/li&gt;
	&lt;li&gt;hanke: (server) Indexer calls &lt;code&gt;reconnect!&lt;/code&gt; on sources that support it.&lt;/li&gt;
	&lt;li&gt;hanke: (server) Location/Volumetric/Geosearch rewritten.&lt;/li&gt;
	&lt;li&gt;hanke: (generators) Fixed integration specs for the generated &amp;#8220;all in one&amp;#8221; server/client.&lt;/li&gt;
	&lt;li&gt;hanke: (generators) Changed method calls to adapt to above changes.&lt;/li&gt;
	&lt;li&gt;hanke: (server) Using the &lt;code&gt;procrastinate&lt;/code&gt; gem to parallelize indexing.&lt;/li&gt;
	&lt;li&gt;hanke: (server) Indexing call structure cleaned up. Improves performance by about 40%.&lt;/li&gt;
&lt;/ul&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;nbsp;Search&amp;nbsp;Performance&amp;nbsp;(Backends)</title>
   <link href="http://florianhanke.com/blog/2011/11/20/picky-search-performance.html"/>
   <updated>2011-11-20T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/11/20/picky-search-performance</id>
   <content type="html">&lt;p&gt;This is a post about &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; performance when searching in various backends.&lt;/p&gt;
&lt;p&gt;But first, a picture that was taken during the performance tests:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/2011-11-20-picky-runs.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;How is taking this picture possible you ask? I am writing this from a hospital.&lt;/p&gt;
&lt;p&gt;Heh, no. Not really.&lt;/p&gt;
&lt;h2&gt;tl;dr&lt;/h2&gt;
&lt;p&gt;In the single-process/single-threaded case on one core of a 2.66 GHz i7 Macbook Pro, Picky&amp;#8217;s search performance ranges from 0.0001s for a single-word query on the memory backend to 0.01s for a three word query on the Redis backend. Around 0.0003s per query on the memory backend for a more realistic case.&lt;/p&gt;
&lt;h2&gt;Why?&lt;/h2&gt;
&lt;p&gt;We are currently working on designing the Picky backends, amongst other ideas, to enable realtime indexing.&lt;/p&gt;
&lt;p&gt;If you want to contribute a backend, please do!&lt;/p&gt;
&lt;h2&gt;The raw data&lt;/h2&gt;
&lt;p&gt;In descending order of performance, we evaluated four backends that are available: &lt;code&gt;Memory&lt;/code&gt;, &lt;code&gt;File&lt;/code&gt;, &lt;code&gt;SQLite&lt;/code&gt; (graciously donated by &lt;a href=&quot;http://twitter.com/rogerbraun&quot;&gt;Roger Braun&lt;/a&gt;) and the &lt;code&gt;Redis&lt;/code&gt; backend.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/2011-11-20-table.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;The 10 &amp;#8211; 100000 show the number of objects in the database. The columns 1-3 denote the complexity. 1 is just using one word, and 3 means we looked for three words.&lt;/p&gt;
&lt;p&gt;We were wondering about the Redis backend a bit, and also the file backend (see below). Memory and SQLite are as expected. What did we expect?&lt;/p&gt;
&lt;h2&gt;Expectations&lt;/h2&gt;
&lt;p&gt;All of the following charts show the three different complexity levels in various index sizes (objects indexed).&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/2011-11-20-memory-file.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Since the memory backend runs fully in memory (duh), we get the best performance there. It&amp;#8217;s all fully in memory, so none of the dirty slow stuff even gets touched.&lt;/p&gt;
&lt;p&gt;With the exception of that dirty old man that touches everything, the Ruby Garbage Collector.&lt;/p&gt;
&lt;p&gt;The file backend (very naïve, &lt;a href=&quot;http://github.com/floere/picky/blob/master/server/lib/picky/backends/file/json.rb&quot;&gt;see here&lt;/a&gt;) surprised us a bit, since we are actually loading &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt; encoded data from a file.&lt;/p&gt;
&lt;p&gt;However, seeking in Ruby and decoding with Yajl &lt;code&gt;Yajl::Parser.parse IO.read(cache_path, length, offset)&lt;/code&gt; is apparently quite fast.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/2011-11-20-sqlite-redis.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Tests of a first draft of a SQLite database (by &lt;a href=&quot;http://twitter.com/rogerbraun&quot;&gt;Roger Braun&lt;/a&gt;) show lots of promise as well.&lt;/p&gt;
&lt;p&gt;Redis is rather slow, as expected. However, this is not just Redis&amp;#8217; fault. The current implementation does three roundtrips per simple internal query.&lt;/p&gt;
&lt;p&gt;For example, a three-word query where each word can be in any of four categories results in 36 up to 72 roundtrips. And for that, the Redis backend performs very well.&lt;/p&gt;
&lt;p&gt;With the arrival of Redis 2.6.0, we will make use of the &lt;a href=&quot;http://antirez.com/post/scripting-branch-released.html&quot;&gt;Lua scripting&lt;/a&gt; and the &lt;a href=&quot;http://redis.io/commands/eval&quot;&gt;&lt;span class=&quot;caps&quot;&gt;EVALSHA&lt;/span&gt; command&lt;/a&gt; to divide the number of roundtrips by 3.&lt;/p&gt;
&lt;p&gt;That will, for a four category, three word query result in only 12 up to 24 roundtrips. Still a lot, but this should prove to be much faster.&lt;/p&gt;
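&lt;p&gt;The lower figures can be reproduced with a little arithmetic. This sketch assumes one simple internal query per word/category pair, which is our reading of the numbers above. It does not derive the upper figures of 72 and 24:&lt;/p&gt;

```ruby
words                = 3 # words in the query
categories           = 4 # categories each word could be in
roundtrips_per_query = 3 # current implementation, per simple internal query

internal_queries = words * categories # assumption: one per word/category pair

puts internal_queries * roundtrips_per_query # 36 roundtrips today
puts internal_queries * 1                    # 12 with a single roundtrip each
```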
&lt;p&gt;One Redis behaviour that surprised us a lot was that for the &amp;#8220;complexity 3&amp;#8221; case, where we looked for three words, the performance of Redis in the graph seems to remain constant. Why does it remain constant, and why doesn&amp;#8217;t it show the same behaviour as the other cases?&lt;/p&gt;
&lt;p&gt;Turns out, the curve does exactly the same, but is squished, because the complexity tends to make a large difference to the baseline.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/2011-11-20-redis-detail.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;If you look at just the &amp;#8220;complexity 3&amp;#8221; case (here in blue instead of yellow), we can see the same behaviour.&lt;/p&gt;
&lt;p&gt;What happens is that for the multi-word case, the amount of expensive roundtrips shoots up. The amount of combinatorics and calculations that Picky does is just the cherry on top of a large roundtrip cake.&lt;/p&gt;
&lt;p&gt;For four words, this would be even worse: We would have to search for the line around 0.02s.&lt;/p&gt;
&lt;p&gt;We hope to reduce this greatly with Redis 2.6.0 and expect a 3-4x speed increase.&lt;/p&gt;
&lt;h2&gt;Comparisons&lt;/h2&gt;
&lt;p&gt;Comparing each of the complexity cases (1 word, 2 words, 3 words) for the backends, they are nicely evenly spaced apart.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/2011-11-20-complexity.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;That is, on a log scale. From &lt;code&gt;Memory&lt;/code&gt; to &lt;code&gt;File&lt;/code&gt;, from &lt;code&gt;File&lt;/code&gt; to &lt;code&gt;SQLite&lt;/code&gt;, and from &lt;code&gt;SQLite&lt;/code&gt; to &lt;code&gt;Redis&lt;/code&gt;, each step is about a 2x query time increase. Comparing &lt;code&gt;Memory&lt;/code&gt; and &lt;code&gt;Redis&lt;/code&gt;, we thus get about an 8x increase (actually, more like 10x).&lt;/p&gt;
&lt;p&gt;While for the one word case, the data remains quite flat as the index size increases, the impact on performance is very noticeable in the three word cases.&lt;/p&gt;
&lt;p&gt;A note on the index sizes:
Yes, 100&amp;#8217;000 entries is not a very realistic size (we do not have access to large servers yet). But it is enough to see Picky&amp;#8217;s behaviour regarding speed. Moreover, the curves&amp;#8217; behaviour is quite predictable and can be extrapolated from the charts above.&lt;/p&gt;
&lt;p&gt;For example, if you extend the curve of the memory case to 1000 times the size (to 100&amp;#8217;000&amp;#8217;000 entries), it arrives at 0.0002s in complexity case 1 and at around 0.005s in complexity case 3.&lt;/p&gt;
&lt;p&gt;In the case of 15&amp;#8217;000&amp;#8217;000 entries, this is exactly what we found to be true for the memory case. Please see
&lt;a href=&quot;http://florianhanke.com/picky/enterprise.html#use_case_1&quot;&gt;use case 1&lt;/a&gt; on the Picky page.&lt;/p&gt;
&lt;h2&gt;Selecting a backend&lt;/h2&gt;
&lt;p&gt;What does it mean for you when choosing the backend?&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;If you need a realtime index, then the only backend that supports this is the &lt;code&gt;Memory&lt;/code&gt; backend (current version at the time of this post is 3.5.4). We are working on getting the others up to speed, but this is what&amp;#8217;s there for now.&lt;/li&gt;
	&lt;li&gt;If you need persistence and/or distributed Pickies, we recommend the Redis backend. Speed may not be fantastic, but from Redis 2.6.0 on it will be quite a bit faster. We predict around 3-4 times faster.&lt;/li&gt;
	&lt;li&gt;The &lt;code&gt;File&lt;/code&gt; and &lt;code&gt;SQLite&lt;/code&gt; backends are still in development. Use the &lt;code&gt;File&lt;/code&gt; backend when you have a static index and do not want to use too much memory. The same holds for the &lt;code&gt;SQLite&lt;/code&gt; backend, with the improvement that you have all the SQLite tools at your service.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As usual, it&amp;#8217;s a tradeoff between speed, space, tools etc.&lt;/p&gt;
&lt;h2&gt;The code&lt;/h2&gt;
&lt;p&gt;The code for these tests is here:&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://github.com/floere/picky/blob/master/server/performance_tests/search.rb&quot;&gt;http://github.com/floere/picky/blob/master/server/performance_tests/search.rb&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;We generated sets of 10-100000 indexed things, each with 4 categories and an id. Then we randomly selected data from the indexes and, in roughly half of the cases, searched for just part of a word, for which Picky uses a partial search.&lt;/p&gt;
&lt;p&gt;We ran 100 random queries each, and divided the resulting time by 100 to get an average per-query-time.&lt;/p&gt;
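&lt;p&gt;The timing approach can be sketched as follows. This is not the linked test code: the &lt;code&gt;search&lt;/code&gt; lambda and the query list are made-up stand-ins, so that the sketch runs without Picky:&lt;/p&gt;

```ruby
require 'benchmark'

# Hypothetical stand-in for a Picky search; swap in a real search object.
search = ->(query) { query.split.map { |word| word.downcase } }

# Hypothetical queries; the real test drew its queries from the indexed data.
queries = Array.new(100) { |i| "query number #{i}" }

# Time all queries together, then divide to get the average per query.
total = Benchmark.realtime do
  queries.each { |query| search.call query }
end

average = total / queries.size
puts format('%.6fs per query', average)
```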
&lt;h2&gt;A note on combinatorial search engines&lt;/h2&gt;
&lt;p&gt;Combinatorial search engines are hard to performance test.&lt;/p&gt;
&lt;p&gt;If in a phone book search on Picky you search for &amp;#8220;peter paul victoria&amp;#8221;, Picky evaluates what you are most likely looking for. This involves a fair bit of calculation.&lt;/p&gt;
&lt;p&gt;In the mentioned case, if &amp;#8220;peter&amp;#8221; can be a first name, name, street, city, and the other words are similarly ambiguous, then Picky has to look at all the possible combinations and has to find out which one is the one that is most likely, based on the &lt;a href=&quot;http://florianhanke.com/picky/documentation.html#indexes-categories-weights&quot;&gt;weights&lt;/a&gt; and &lt;a href=&quot;http://florianhanke.com/picky/documentation.html#search-options-boost&quot;&gt;boost&lt;/a&gt; you defined.&lt;/p&gt;
&lt;p&gt;Now, this is very dependent on the data underlying it. So I tried to use relatively standard data.&lt;/p&gt;
&lt;p&gt;So, in closing, it must be said that it is hard to compare this style of search engine to one of the generic search engines. But Picky would really like to take one on soon ;)&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;nbsp;Update&amp;nbsp;Performance</title>
   <link href="http://florianhanke.com/blog/2011/11/13/picky-update-performance.html"/>
   <updated>2011-11-13T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/11/13/picky-update-performance</id>
   <content type="html">&lt;p&gt;This is a post about &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; performance when updating realtime indexes.&lt;/p&gt;
&lt;h2&gt;tl;dr&lt;/h2&gt;
&lt;p&gt;In the single-process/single-threaded case on one core of a 2.66 GHz i7 Macbook Pro, Picky realtime index update performance ranges from 500 updates/s to 25&amp;#8217;700 updates/s. Around 2&amp;#8217;300 to 5&amp;#8217;100 updates/s for a default case.&lt;/p&gt;
&lt;h2&gt;Quick realtime index refresher&lt;/h2&gt;
&lt;p&gt;If you didn&amp;#8217;t know, since 3.2.0, you can add/remove/replace (update) objects from and to a Picky index. In realtime.&lt;/p&gt;
&lt;p&gt;For example,&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;index = Index.new :things do
  category :text
end

index.replace thing_with_text_method&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;would replace the index data for the &lt;code&gt;thing_with_text_method&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;If you added a search interface for the index,&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;things = Search.new index&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;you could also search for it and it would return different things if you changed the index in between.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;things.search &quot;some thing&quot; # =&amp;gt; Finds the thing.

index.remove thing_with_text_method.id

things.search &quot;some thing&quot; # =&amp;gt; Finds it no more.&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;The Setup&lt;/h2&gt;
&lt;p&gt;All numbers are valid for a 2.66 GHz i7 Macbook Pro (one core of it) with 4GB 1067 MHz DDR3 &lt;span class=&quot;caps&quot;&gt;RAM&lt;/span&gt;, using Picky 3.5.3 on ruby 1.9.3p0 (2011-10-30 revision 33570) [x86_64-darwin11.2.0].&lt;/p&gt;
&lt;p&gt;For testing performance, we randomly pregenerated a large set of objects with methods &lt;code&gt;id&lt;/code&gt;, &lt;code&gt;user&lt;/code&gt; (8 random characters), &lt;code&gt;text1&lt;/code&gt; (20 random characters, drawn from the 26 letters of the alphabet and 5 spaces), and &lt;code&gt;text2&lt;/code&gt;, &lt;code&gt;text3&lt;/code&gt;, &lt;code&gt;text4&lt;/code&gt; (generated like &lt;code&gt;text1&lt;/code&gt;).&lt;/p&gt;
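&lt;p&gt;Generating such objects could look roughly like this. It is a sketch, not the original generator: the &lt;code&gt;Thing&lt;/code&gt; struct, the helper names, the set size of 1000, and the choice of lowercase letters for &lt;code&gt;user&lt;/code&gt; are all assumptions:&lt;/p&gt;

```ruby
# Hypothetical object layout with the methods listed above.
Thing = Struct.new(:id, :user, :text1, :text2, :text3, :text4)

LETTERS = ('a'..'z').to_a
POOL    = LETTERS + [' '] * 5 # 26 letters and 5 spaces

# 8 random characters (assumed here to be lowercase letters).
def random_user
  Array.new(8) { LETTERS.sample }.join
end

# 20 random characters drawn from the letters-and-spaces pool.
def random_text
  Array.new(20) { POOL.sample }.join
end

things = Array.new(1000) do |id|
  Thing.new id, random_user, random_text, random_text, random_text, random_text
end

puts things.first.inspect
```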
&lt;p&gt;Then, we used the following index. To make everything easier, we used a config variable to enable/disable categories.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;include Picky

config = 0 # 1, 2, 3, 4.

index = Index.new :things do

  weights    = Weights::Default    # These configurations were changed.
  partial    = Partial::Default    #
  similarity = Similarity::Default #

  if config &amp;gt;= 0
    category :user,
             weights:    weights,
             partial:    partial,
             similarity: similarity
  end
  if config &amp;gt;= 1
    category :text1,
             weights:    weights,
             partial:    partial,
             similarity: similarity
  end
  if config &amp;gt;= 2
    category :text2,
             weights:    weights,
             partial:    partial,
             similarity: similarity
  end
  if config &amp;gt;= 3
    category :text3,
             weights:    weights,
             partial:    partial,
             similarity: similarity
  end
  if config &amp;gt;= 4
    category :text4,
             weights:    weights,
             partial:    partial,
             similarity: similarity
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If &lt;code&gt;config&lt;/code&gt; was for example &lt;code&gt;3&lt;/code&gt;, Picky used categories &lt;code&gt;:user&lt;/code&gt;, &lt;code&gt;:text1&lt;/code&gt;, &lt;code&gt;:text2&lt;/code&gt; and &lt;code&gt;:text3&lt;/code&gt;.&lt;/p&gt;
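&lt;p&gt;Put differently, &lt;code&gt;config&lt;/code&gt; selects a prefix of the category list. A compact way to express the same selection (the names here are made up for this sketch) would be:&lt;/p&gt;

```ruby
ALL_CATEGORIES = [:user, :text1, :text2, :text3, :text4]

# config 0 selects just :user, config 4 selects all five categories.
def categories_for config
  ALL_CATEGORIES.first(config + 1)
end

puts categories_for(3).inspect # [:user, :text1, :text2, :text3]
```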
&lt;p&gt;We then varied the configurations for &lt;code&gt;weights&lt;/code&gt;, &lt;code&gt;partial&lt;/code&gt;, &lt;code&gt;similarity&lt;/code&gt;. Weights control how the categories are weighed. Partial controls how you can search partially (for example just for the first character, or only for the exact word). And similarity defines whether you can search for words similar to the one you entered.&lt;/p&gt;
&lt;p&gt;We indexed until the average update/s value stabilized.&lt;/p&gt;
&lt;h2&gt;The Chart&lt;/h2&gt;
&lt;p&gt;A quick explanation of the legend: It is ordered weights/partial/similarity. From fastest to slowest, the options were…&lt;/p&gt;
&lt;p&gt;Weights (2): Constant, Default/Logarithmic.&lt;/p&gt;
&lt;p&gt;Partial (3): None, Default/Postfix(from: -3), Postfix(from: 1).&lt;/p&gt;
&lt;p&gt;Similarity (3): Default/None, Soundex(3), DoubleMetaphone(3).&lt;/p&gt;
&lt;p&gt;We did not explore all combinations. The numbers were rounded down to the nearest hundred.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/2011-11-13-updating-chart.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;From left to right, we first indexed just the user category, then successively added text1, text2, text3, and text4.&lt;/p&gt;
&lt;p&gt;The baseline (not shown), when no category was defined, was &lt;code&gt;212'000&lt;/code&gt; updates/s.&lt;/p&gt;
&lt;p&gt;The absolute winner is indexing just the 8 character user category, with a constant weight, no partial indexing, and no similarity, at &lt;code&gt;25'700&lt;/code&gt; u/s.&lt;/p&gt;
&lt;p&gt;Usually though, you&amp;#8217;d want Picky&amp;#8217;s weighing and scoring to be used. So, the same scenario, with no partial/no similarity yields a speed of &lt;code&gt;22'000&lt;/code&gt; u/s. In a more realistic case with 3 text categories and 1 user category, it is &lt;code&gt;5'400&lt;/code&gt; u/s.&lt;/p&gt;
&lt;p&gt;For added convenience, you&amp;#8217;d use the default partial algorithm, which also includes parts of words, from the 3rd last character of a word to the last. With default weighing, and no similarity, this yields &lt;code&gt;1'600&lt;/code&gt; up to &lt;code&gt;10'400&lt;/code&gt; u/s. This is with all settings to default (Default/Default/Default), as if you had defined nothing:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;category :text1 # etc.&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you are interested in similarity, but not partial, the numbers range from &lt;code&gt;900&lt;/code&gt; to &lt;code&gt;6'200&lt;/code&gt; u/s.&lt;/p&gt;
&lt;p&gt;The most brutal case, standard weighing, full partial, and best similarity costs dearly: Only &lt;code&gt;500&lt;/code&gt; up to &lt;code&gt;4'700&lt;/code&gt; updates per second.&lt;/p&gt;
&lt;p&gt;If you like pure numbers better than a graph, here&amp;#8217;s a table for you:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/2011-11-13-updating-table.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Now, let&amp;#8217;s all &amp;#8220;jump&amp;#8221; to conclusions! ;)&lt;/p&gt;
&lt;h2&gt;Conclusions&lt;/h2&gt;
&lt;p&gt;First of all, we are very happy with the numbers. We did expect a much lower performance. (Sorry, Picky)&lt;/p&gt;
&lt;p&gt;How Picky weighs and scores the data didn&amp;#8217;t impact the results much. This is no surprise, as no string manipulation is done.&lt;/p&gt;
&lt;p&gt;Partial indexing impact was what we expected. Around a 40% to 60% reduction from (Default/None/Default) to (Default/Default/Default) in speed depending on how many text categories were indexed. The jump to (Default/Postfix(from: 1)/Default) – an all inclusive partial – is around 50% to around 70%.&lt;/p&gt;
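&lt;p&gt;As a rough sanity check of that range, the reduction can be recomputed from the numbers quoted above. This sketch assumes that &lt;code&gt;22'000&lt;/code&gt; and &lt;code&gt;10'400&lt;/code&gt; u/s both belong to the configuration with only the user category; the helper name &lt;code&gt;reduction_percent&lt;/code&gt; is made up for illustration.&lt;/p&gt;

```ruby
# Relative slowdown between two update rates, in percent.
def reduction_percent(before, after)
  ((1.0 - after.to_f / before) * 100).round
end

# Assumed pairing: user category only,
# (Default/None/Default) versus (Default/Default/Default).
puts reduction_percent(22_000, 10_400)
```

&lt;p&gt;Under that assumption the reduction comes out at roughly 53%, which sits inside the 40% to 60% band.&lt;/p&gt;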
&lt;p&gt;The worst impact comes from similarity indexing: Using similarity brings down indexing speed to about 25% (no partial) to 50% (also using partial).&lt;/p&gt;
&lt;p&gt;The big takeaway: Text categories with much content are most important, followed by whether you do similarity, followed by whether you do partial searches. Weighing plays almost no role.&lt;/p&gt;
&lt;p&gt;The bigger takeaway: Picky is fast when updating indexes. And does it in realtime. On Ruby.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Search&amp;nbsp;Engine&amp;nbsp;in&amp;nbsp;a&amp;nbsp;Script</title>
   <link href="http://florianhanke.com/blog/2011/10/26/search-engine-in-a-script.html"/>
   <updated>2011-10-26T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/10/26/search-engine-in-a-script</id>
   <content type="html">&lt;p&gt;This is a post about running &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; in a small script.&lt;/p&gt;
&lt;h2&gt;Design Philosophy&lt;/h2&gt;
&lt;p&gt;You all know that with Picky we want the full flexibility of Ruby.&lt;/p&gt;
&lt;p&gt;What we also want is a search engine that runs with a minimal setup. Small and sweet. Portable, lightweight, bam!&lt;/p&gt;
&lt;p&gt;In short: Picky wants to be the &lt;a href=&quot;http://sinatrarb.com&quot;&gt;Sinatra&lt;/a&gt; of search engines. Did we achieve this? Not yet, but we are very close indeed.&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s have a quick dance in the rain.&lt;/p&gt;
&lt;h2&gt;The code&lt;/h2&gt;
&lt;p&gt;Go ahead and replace Picky with a very small script! You can copy, right? And paste? Ok. Go.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Possible since Picky 3.2.0.
#
require 'picky'

include Picky

Thing = Struct.new :id, :name

index = Index.new :test do
  category :name, similarity: Similarity::DoubleMetaphone.new(3)
end

index.replace Thing.new(1, 'Picky')
index.replace Thing.new(2, 'Parslet')

things = Search.new(index) do
  boost [:name] =&amp;gt; +3
end

p things.search(&quot;Pick&quot;).ids
p things.search(&quot;Pic&quot;).to_hash
p things.search(&quot;Parsley~&quot;).allocations&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That&amp;#8217;s it. Easy to just try something, and later evolve into a fully fledged, super-powerful search engine.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Twitter&amp;nbsp;Account</title>
   <link href="http://florianhanke.com/blog/2011/10/23/twitter.html"/>
   <updated>2011-10-23T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/10/23/twitter</id>
   <content type="html">&lt;p&gt;Our loveable octopus, &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt;, got a Twitter account under &lt;a href=&quot;http://twitter.com/picky_rb&quot;&gt;@picky_rb&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In one of his first tweets he noted &lt;a href=&quot;http://twitter.com/#!/picky_rb/status/127674363632828416&quot;&gt;Man I look fat in my profile image. I really should cut back on crabs and molluscs.&lt;/a&gt;. Yeah, you should. Also on rich indexes, I might add.&lt;/p&gt;
&lt;p&gt;He will not tweet very often, apparently, as he mentions in &lt;a href=&quot;http://twitter.com/#!/picky_rb/status/127674733134221313&quot;&gt;this tweet&lt;/a&gt;. Just version updates and some personal life stuff.&lt;/p&gt;
&lt;p&gt;He can be a bit of a blabbermouth. Let&amp;#8217;s hope he can control himself.&lt;/p&gt;
&lt;p&gt;Last thing I heard he was engaged in a semi-epic battle in the Mariana Trench with his bigger buddies, against whales. In a DM, in his usual style he wrote me: &amp;#8220;Battling the big blue ones. 5 suitably categorized targets found in 0.000013s. Wish me luck! P.S: Don&amp;#8217;t snack on my sushi.&amp;#8221;&lt;/p&gt;
&lt;p&gt;I hope he makes it out alive. We need to get going on this realtime indexes update.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Designing&amp;nbsp;Realtime&amp;nbsp;Indexes</title>
   <link href="http://florianhanke.com/blog/2011/10/23/designing-realtime-indexes.html"/>
   <updated>2011-10-23T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/10/23/designing-realtime-indexes</id>
   <content type="html">&lt;p&gt;This is a post about designing realtime indexes for  &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Realtime indexes are an exciting thing! The possibility of inserting something into a search engine, then having the thing pop up immediately in results is fantastic. Wouldn&amp;#8217;t you love that in Picky? Man, me too!&lt;/p&gt;
&lt;p&gt;Too bad that we have yet to implement it. Heh.&lt;/p&gt;
&lt;p&gt;On the other hand, some good &lt;span class=&quot;caps&quot;&gt;TDD&lt;/span&gt; should do it. &amp;#8220;&lt;span class=&quot;caps&quot;&gt;TDD&lt;/span&gt;&amp;#8221; you ask? &lt;span class=&quot;caps&quot;&gt;TDD&lt;/span&gt; of course, is the noble activity of Thought Driven Development. Also known as &lt;span class=&quot;caps&quot;&gt;QDD&lt;/span&gt;, Question Driven Development.&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s fire up the cranial engines and get the gray matter bubbling.&lt;/p&gt;
&lt;p&gt;Specifically, I&amp;#8217;d like to talk about the &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;, and how to implement it in Picky. Along the way I will touch on the inverted index, necessary bookkeeping, how I will implement it, how to use it and how &lt;strong&gt;not&lt;/strong&gt; to use it, the latter being more important than the former.&lt;/p&gt;
&lt;h2&gt;What is a realtime index?&lt;/h2&gt;
&lt;p&gt;A realtime index is an index that can have e.g. text indexed at runtime, and that returns results for that text &lt;strong&gt;immediately&lt;/strong&gt; after indexing.&lt;/p&gt;
&lt;p&gt;One example of a Ruby realtime search engine is &lt;a href=&quot;http://masanjin.net/whistlepig/&quot;&gt;Whistlepig&lt;/a&gt; by &lt;a href=&quot;http://twitter.com/wm&quot;&gt;William Morgan&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Ok, let&amp;#8217;s talk about what we want.&lt;/p&gt;
&lt;h2&gt;What I want&lt;/h2&gt;
&lt;p&gt;In some ways, writing Picky has been like being the first person in a group of mountaineers: You climb a mountain. It&amp;#8217;s a bit taxing. But we take it one step at a time. The summit is visible at all times. Meaning: Goal clear, steps towards it too.&lt;/p&gt;
&lt;p&gt;As a first step towards a multi-process, multi-threaded realtime index we&amp;#8217;d like to get it working for a single process, for a single thread. Then cross the next bridge when we get to it.&lt;/p&gt;
&lt;p&gt;When designing software, I have yet to see a case where designing multiple things at the same time is better than focusing on a single thing at first.&lt;/p&gt;
&lt;p&gt;Ok, so let&amp;#8217;s just look at what we want:&lt;/p&gt;
&lt;h2&gt;&lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;Let&amp;#8217;s offer three methods on an index:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;code&gt;#remove(id)&lt;/code&gt; (Removes an element with a given id from the index)&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;#add(object)&lt;/code&gt; (Adds an element with a given id to the index)&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;#replace(object)&lt;/code&gt; (&lt;code&gt;#remove&lt;/code&gt; followed by &lt;code&gt;#add&lt;/code&gt; – this is what you&amp;#8217;d usually use)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We could just do the replace method first, but there might be cases where you&amp;#8217;d want to remove and add separately. When you have a producer and a consumer, for example.&lt;/p&gt;
&lt;p&gt;What would they return? I&amp;#8217;m quite unsure yet. Let&amp;#8217;s leave that out for later. Maybe you have some ideas?&lt;/p&gt;
&lt;p&gt;How would we call these?&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;index = Picky::Index.new(:example) do
  # index definition
end
index.remove(13)
index.add(thing)
index.replace(thing_that_responds_to_the_id_method)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You call these methods from a &lt;a href=&quot;http://sinatrarb.com&quot;&gt;Sinatra&lt;/a&gt; or Rails action, from a Signal trap, etc.&lt;/p&gt;
&lt;h2&gt;What I do (not yet) want&lt;/h2&gt;
&lt;p&gt;To focus on a good &lt;span class=&quot;caps&quot;&gt;SRP&lt;/span&gt; implementation of realtime indexes, we won&amp;#8217;t (yet) implement:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Multiprocess&lt;/li&gt;
	&lt;li&gt;Multithreading&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;and work for the (assumed) 80% case where we want to have more recent objects sorted at the top of the results (Whistlepig also does this).&lt;/p&gt;
&lt;p&gt;So, realtime indexes will be sorted initially like a normal index, but will then gravitate towards a &amp;#8220;most recent first&amp;#8221; sorting.&lt;/p&gt;
&lt;h2&gt;Summary&lt;/h2&gt;
&lt;p&gt;So, what we want is a realtime index that lets us add and remove elements at runtime. Elements that are removed will not show up anymore in the results and elements that are added will show up on top of the results.&lt;/p&gt;
&lt;p&gt;So,&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;&lt;code&gt;thing_search.search &quot;blah&quot; # =&amp;gt; [1,2,3]&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;index_of_thing_search.add(thing_with_id_5_and_text_blah)&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;thing_search.search &quot;blah&quot; # =&amp;gt; [5,1,2,3]&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;is what we want.&lt;/p&gt;
&lt;p&gt;That is an (assumed) default case and this is what we will go for in this first implementation.&lt;/p&gt;
&lt;h2&gt;The inverted index&lt;/h2&gt;
&lt;p&gt;Amongst other things, Picky contains an &lt;a href=&quot;http://en.wikipedia.org/wiki/Inverted_index&quot;&gt;Inverted Index&lt;/a&gt; that is central to most search engines.&lt;/p&gt;
&lt;p&gt;We&amp;#8217;ll review it quickly so you can follow the implementation.&lt;/p&gt;
&lt;p&gt;In its simplest form, the inverted index saves tokens that point to a list of ids.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;{
  :token1 =&amp;gt; [1,4,2,5,6],
  :token2 =&amp;gt; [3,4,8,2]
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and so on. This makes it easy to look up text that the user is looking for. Just do a&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;ids = inverted_index[text]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and you have all the ids that contain that text.&lt;/p&gt;
&lt;p&gt;Picky has quite a few more internal indexes that help it look stuff up, but we&amp;#8217;ll focus on the inverted index here.&lt;/p&gt;
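&lt;p&gt;To make the structure concrete, here is a minimal sketch of how such a hash could be built from a few texts. It is not how Picky does it: the method name &lt;code&gt;build_inverted_index&lt;/code&gt;, the sample data, and the tokenizer (downcase, then split on whitespace) are assumptions made for illustration.&lt;/p&gt;

```ruby
# Builds a token-to-ids hash from [id, text] pairs.
# Tokenizing by downcasing and splitting on whitespace is a simplification.
def build_inverted_index(things)
  inverted_index = Hash.new { |hash, token| hash[token] = [] }
  things.each do |id, text|
    text.downcase.split.uniq.each do |token|
      inverted_index[token.to_sym].push id
    end
  end
  inverted_index
end

inverted_index = build_inverted_index([
  [1, 'Picky search'],
  [2, 'Picky Ruby'],
  [3, 'Ruby search']
])

p inverted_index[:picky]  # prints [1, 2]
p inverted_index[:search] # prints [1, 3]
```

&lt;p&gt;Each token points to the ids of the things whose text contained it, in the order they were indexed.&lt;/p&gt;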
&lt;p&gt;All clear? Now let&amp;#8217;s add realtime indexing to that.&lt;/p&gt;
&lt;h2&gt;The naive approach&lt;/h2&gt;
&lt;p&gt;So, given that we have an inverted index like&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;{
  :picky =&amp;gt; [1,2,3,4,5],
  :whistlepig =&amp;gt; [5,6,7,8,9]
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and we want to remove an id, say &lt;code&gt;5&lt;/code&gt;, in a naive way.&lt;/p&gt;
&lt;p&gt;We could just iterate over all arrays on the values side of the hash. Here, this would be easy:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;inverted_index.each do |_, ids|
  ids.delete id_to_remove
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You probably already see the problem. On a 12GB index (the first Picky production use case), this would take a loooong time.&lt;/p&gt;
&lt;p&gt;So, although nice and very understandable, this is not feasible.&lt;/p&gt;
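&lt;p&gt;A runnable version of the naive removal makes the cost visible. The &lt;code&gt;naive_remove&lt;/code&gt; method and its visit counter are only there for illustration; the point is that every id array is touched, and each &lt;code&gt;delete&lt;/code&gt; is itself a scan through that array.&lt;/p&gt;

```ruby
inverted_index = {
  picky:      [1, 2, 3, 4, 5],
  whistlepig: [5, 6, 7, 8, 9]
}

# Naive removal: visits every id array, whether it contains the id or not.
# Returns the number of arrays it had to look at.
def naive_remove(inverted_index, id_to_remove)
  visited = 0
  inverted_index.each_value do |ids|
    visited += 1
    ids.delete id_to_remove
  end
  visited
end

visited = naive_remove(inverted_index, 5)
p inverted_index[:picky]      # prints [1, 2, 3, 4]
p inverted_index[:whistlepig] # prints [6, 7, 8, 9]
puts "arrays visited: #{visited}"
```

&lt;p&gt;With two tokens that is two arrays. With millions of tokens it is millions of arrays, for every single removal.&lt;/p&gt;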
&lt;p&gt;We need to make it faster.&lt;/p&gt;
&lt;h2&gt;A better way&lt;/h2&gt;
&lt;p&gt;Q: How do you make something faster in computer science?&lt;/p&gt;
&lt;p&gt;A: Get a bigger computer?&lt;/p&gt;
&lt;p&gt;A: More processors?&lt;/p&gt;
&lt;p&gt;A: Uh, why are you looking at me like that?&lt;/p&gt;
&lt;p&gt;Q: &lt;b&gt;whacks student with a large trout&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;But seriously, if you want to get &lt;strong&gt;speed&lt;/strong&gt;, you have to sacrifice &lt;strong&gt;space&lt;/strong&gt;. Hello, age-old trade-off.&lt;/p&gt;
&lt;p&gt;This always means adding some sort of data structure, since when I say space, I mean data structures. And this means complexity. From which follows that we have consistency troubles ahead of us.&lt;/p&gt;
&lt;p&gt;Anyway, on with it!&lt;/p&gt;
&lt;h3&gt;The fast approach that needs some bookkeeping&amp;#8482;&lt;/h3&gt;
&lt;p&gt;So, instead of iterating over all id arrays, we should remember which array had a certain id in it.&lt;/p&gt;
&lt;p&gt;How would you do this?&lt;/p&gt;
&lt;p&gt;Hello Mr. Hash. We remember which id was in which array. So we have a telephone book of ids that maps to the id array &lt;em&gt;references&lt;/em&gt;, such that:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;{
  1 =&amp;gt; [[1,2,3,4,5]],             # reference
  5 =&amp;gt; [[1,2,3,4,5], [5,6,7,8,9]] # references
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now we can ask this mapping to find out incredibly quickly which arrays we need to update:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;array_of_id_arrays = mapping[5]&lt;/code&gt;&lt;/pre&gt;
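&lt;p&gt;One way such a mapping could be derived from an existing inverted index is sketched below. This is an assumption about the bookkeeping, not Picky&amp;#8217;s actual code; note that the mapping stores the id arrays themselves, not copies of them.&lt;/p&gt;

```ruby
inverted_index = {
  picky:      [1, 2, 3, 4, 5],
  whistlepig: [5, 6, 7, 8, 9]
}

# For each id, remember the id arrays (the objects themselves) that contain it.
mapping = Hash.new { |hash, id| hash[id] = [] }
inverted_index.each_value do |ids|
  ids.each { |id| mapping[id].push ids }
end

p mapping[1].size # prints 1
p mapping[5].size # prints 2
# The mapping holds the very same array object as the inverted index.
p mapping[5].first.equal?(inverted_index[:picky]) # prints true
```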
&lt;h3&gt;A bit of a kicker / homework&lt;/h3&gt;
&lt;p&gt;I&amp;#8217;ve got a question for you:&lt;/p&gt;
&lt;p&gt;In the case of removing an id, how would you remove it? Look at&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;mapping[5].each do |id_array|
  id_array.delete 5
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Does this work? Has the array in the hash changed? If so, why? If not, why not?&lt;/p&gt;
&lt;p&gt;Hint: It&amp;#8217;s not an accident I was talking about &lt;em&gt;references&lt;/em&gt;, above.&lt;/p&gt;
&lt;h3&gt;Adding&lt;/h3&gt;
&lt;p&gt;Removing is relatively easy. How about adding?&lt;/p&gt;
&lt;p&gt;When adding, we process the data to get tokens, then look up each token in the inverted index, prepending the id to the id_array.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;tokens.each do |token|
  inverted_index[token].unshift id
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Easy as well.&lt;/p&gt;
&lt;p&gt;Note: This is only a good thing to do if the id isn&amp;#8217;t in the index yet.&lt;/p&gt;
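&lt;p&gt;Putting the pieces together, a toy version of the three methods could look like the following sketch. It is not Picky&amp;#8217;s implementation: the class name &lt;code&gt;ToyRealtimeIndex&lt;/code&gt;, the &lt;code&gt;Thing&lt;/code&gt; struct, the &lt;code&gt;ids_for&lt;/code&gt; lookup and the whitespace tokenizer are all assumptions made for illustration, and it only covers the inverted index plus the bookkeeping hash, for a single process and a single thread.&lt;/p&gt;

```ruby
Thing = Struct.new(:id, :text)

# A toy realtime index: an inverted index plus an id-to-arrays mapping.
class ToyRealtimeIndex
  def initialize
    @inverted_index = Hash.new { |hash, token| hash[token] = [] }
    @mapping        = Hash.new { |hash, id| hash[id] = [] }
  end

  # Prepends the id to each token's id array ("most recent first")
  # and remembers those arrays for a later removal.
  def add(thing)
    tokenize(thing.text).each do |token|
      ids = @inverted_index[token]
      ids.unshift thing.id
      @mapping[thing.id].push ids
    end
  end

  # Only touches the arrays that are known to contain the id.
  def remove(id)
    arrays = @mapping.delete(id) || []
    arrays.each { |ids| ids.delete id }
  end

  def replace(thing)
    remove thing.id
    add thing
  end

  # Returns a copy of the ids for a single token.
  def ids_for(token)
    @inverted_index.fetch(token.downcase.to_sym, []).dup
  end

  private

  def tokenize(text)
    text.downcase.split.uniq.map { |word| word.to_sym }
  end
end

index = ToyRealtimeIndex.new
index.add Thing.new(3, 'blah')
index.add Thing.new(2, 'blah')
index.add Thing.new(1, 'blah blub')
p index.ids_for('blah') # prints [1, 2, 3]

index.add Thing.new(5, 'blah')
p index.ids_for('blah') # prints [5, 1, 2, 3]

index.replace Thing.new(3, 'blub')
p index.ids_for('blah') # prints [5, 1, 2]
p index.ids_for('blub') # prints [3, 1]
```

&lt;p&gt;Using &lt;code&gt;replace&lt;/code&gt; instead of a bare &lt;code&gt;add&lt;/code&gt; sidesteps the problem from the note above, since the id is removed first if it is already in the index.&lt;/p&gt;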
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;When just looking at the inverted index, realtime indexing looks rather easy doesn&amp;#8217;t it?&lt;/p&gt;
&lt;p&gt;Well, I hope it does now. I also hope that the basics of search engines seem less daunting to you :)&lt;/p&gt;
&lt;p&gt;It will be a bit more complicated to implement, as a few more internal indexes need to be held consistent, but as usual, a large array of tests should help with that.&lt;/p&gt;
&lt;h2&gt;Caveats&lt;/h2&gt;
&lt;p&gt;This implementation completely ignores the case where Picky runs in multiple processes (e.g. in Unicorn), or in multiple threads. But we&amp;#8217;ll cross that bridge when we get to it. These concerns are completely orthogonal, thus it&amp;#8217;s a good thing to separate thinking about them. As usual.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;nbsp;Case&amp;nbsp;Study&amp;#58;&amp;nbsp;Single&amp;nbsp;Server&amp;nbsp;App&amp;nbsp;for&amp;nbsp;Heroku</title>
   <link href="http://florianhanke.com/blog/2011/09/11/picky-case-study-single-server-app-for-heroku.html"/>
   <updated>2011-09-11T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/09/11/picky-case-study-single-server-app-for-heroku</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; series on its workings.&lt;/p&gt;
&lt;p&gt;This is about running a Picky search on a single server on Heroku.&lt;/p&gt;
&lt;p&gt;Skipping options:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;a href=&quot;#heroku&quot;&gt;Skip the Intro, but what is Heroku?&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;#singleserverapp&quot;&gt;Skip the Intro, I know what Heroku is.&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Intro&lt;/h2&gt;
&lt;p&gt;Last night you got together with your friends. Beer flowed freely, smoothly moved over to wine, Caipirinhas. The sizzling of meat on a grill. Chicken hearts. Entrecôtes.&lt;/p&gt;
&lt;p&gt;Then, pure Vodka, shots, maybe even as far as &lt;a href=&quot;http://www.drinksmixer.com/drink1026.html&quot;&gt;Baltimore Zoos&lt;/a&gt;. Women. Making out.&lt;/p&gt;
&lt;p&gt;The night drags on. One of your friends mistakes the kitchen for a toilet. Dancing on tables. The police visit multiple times. Sausages.&lt;/p&gt;
&lt;p&gt;The policemen decide to join you. Vomit. Promises. Friendships.&lt;/p&gt;
&lt;p&gt;And dares. You are the computer dude of the group.&lt;/p&gt;
&lt;p&gt;&amp;#8220;Make a new Google in a day!&amp;#8221; someone shouts. &amp;#8220;I dare you!&amp;#8221;&lt;/p&gt;
&lt;p&gt;That&amp;#8217;s the last thing you remember as you dive nose first into an Aperol Spritz.&lt;/p&gt;
&lt;p&gt;Make that &amp;#8220;eye first&amp;#8221;.&lt;/p&gt;
&lt;h3&gt;The next day&lt;/h3&gt;
&lt;p&gt;You wake up with a grandmother of a hangover. A lingering smell of meat and vomit, caked on your lips. Ketchup stains. Who is that girl on the floor?&lt;/p&gt;
&lt;p&gt;Blearily, you wander to your computer, take a look at your emails, a swig of water, a munch on raw bacon. Shit.&lt;/p&gt;
&lt;p&gt;There it is. The email you&amp;#8217;ve been dreading. A dare and promise forged in blood: &amp;#8220;Make a drinks search engine. You have until midnight.&amp;#8221;&lt;/p&gt;
&lt;h3&gt;Picky&lt;/h3&gt;
&lt;p&gt;You barely remember a blog post by a crazy dude called Florian Hanke, always touting a search engine&amp;#8217;s simplicity and usability, on using it with Heroku. Man, that guy is crazy. Fucking foaming at the mouth.&lt;/p&gt;
&lt;p&gt;What was it called again? &amp;#8220;Pinky&amp;#8221;? What a silly name.&lt;/p&gt;
&lt;p&gt;Maybe he&amp;#8217;s right, though. Let&amp;#8217;s see.&lt;/p&gt;
&lt;p&gt;You try to navigate Google, but the search bar keeps moving. It&amp;#8217;s like being seasick, but on the interwebs. Man, totally netsick. Heh, netsick. &lt;strong&gt;snort&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;There it is. Found it. Man, thank goodness it&amp;#8217;s rather short.&lt;/p&gt;
&lt;h2 id=&quot;heroku&quot;&gt;Heroku&lt;/h2&gt;
&lt;p&gt;This use case uses Heroku.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://www.heroku.com/&quot;&gt;Heroku&lt;/a&gt; is a great place to host your small search engine. They are very generous in offering free servers for your projects.&lt;/p&gt;
&lt;p&gt;The original &lt;a href=&quot;http://gemsearch.heroku.com/&quot;&gt;GemSearch&lt;/a&gt; was running on two servers. One for running the web app, one for running the actual Picky server. Read more about it &lt;a href=&quot;http://florianhanke.com/blog/2011/02/13/a-better-rubygems-search.html&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This was problematic, since the data for the index needed to be on both servers. Once as an index, and once for rendering, in the web app.&lt;/p&gt;
&lt;p&gt;Another thing was that free Heroku servers are started up on demand. This meant waiting a little for the web app, then waiting for the search server. Many people were wondering why their search was taking so long.&lt;/p&gt;
&lt;p&gt;We can speed this up by moving the web app and the search server into a single Heroku server.&lt;/p&gt;
&lt;h2 id=&quot;singleserverapp&quot;&gt;Single Server App&lt;/h2&gt;
&lt;p&gt;Picky 3.0+ offers the possibility of generating single server apps (aka &amp;#8220;all in one&amp;#8221;). Just type:&lt;/p&gt;
&lt;pre class=&quot;sh_bash&quot;&gt;&lt;code&gt;$ picky generate all_in_one drinks&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;to generate such an app in the &lt;code&gt;drinks&lt;/code&gt; directory. This app combines the Picky server with the web app.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;app.rb&lt;/code&gt; represents the web app and the search server in one (the separate areas are clearly marked). The &lt;code&gt;images&lt;/code&gt;, &lt;code&gt;javascripts&lt;/code&gt;, &lt;code&gt;stylesheets&lt;/code&gt; and &lt;code&gt;views&lt;/code&gt; directories belong to the web app. And the &lt;code&gt;index&lt;/code&gt; directory is from the server.&lt;/p&gt;
&lt;p&gt;With this in mind, adapt it to your needs.&lt;/p&gt;
&lt;h2&gt;Herokuizing this Single Server App&lt;/h2&gt;
&lt;p&gt;Four simple steps:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;First, make it a Heroku app: &lt;a href=&quot;http://devcenter.heroku.com/articles/quickstart&quot;&gt;http://devcenter.heroku.com/articles/quickstart&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;Index your data: &lt;pre class=&quot;sh_bash&quot;&gt;&lt;code&gt;$ PICKY_ENV=production bundle exec rake index&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
	&lt;li&gt;Then, check the production index into git. The app loads the index from there.&lt;/li&gt;
	&lt;li&gt;Finally, let it loose: &lt;pre class=&quot;sh_bash&quot;&gt;&lt;code&gt;git push heroku master&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;One example of this is the &lt;a href=&quot;http://gemsearch.heroku.com/&quot;&gt;Gem search&lt;/a&gt;. The code is &lt;a href=&quot;https://github.com/floere/gemsearch&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Outro&lt;/h2&gt;
&lt;p&gt;After two hours you&amp;#8217;re done. A bit of sun next to the lake does you good. Over the iPhone you look up that crazy drink you&amp;#8217;re having, &lt;a href=&quot;http://www.drinksmixer.com/drink6373.html&quot;&gt;The Ricky Martini&lt;/a&gt;. Man, where do they find these bartenders?&lt;/p&gt;
&lt;p&gt;Smooth. It works. Rose&amp;#8217;s Lime Juice? It&amp;#8217;s good, though.&lt;/p&gt;
&lt;p&gt;Your end of the dare is met.&lt;/p&gt;
&lt;p&gt;With a broad grin you type your friend&amp;#8217;s email address. Your dare. His turn.&lt;/p&gt;
&lt;p&gt;You&amp;#8217;re wondering though where he&amp;#8217;s going to get a &lt;a href=&quot;http://en.wikipedia.org/wiki/Ballet_tutu&quot;&gt;Tutu&lt;/a&gt; and a &lt;a href=&quot;http://en.wikipedia.org/wiki/Kick_scooter&quot;&gt;Scooter&lt;/a&gt; on a Sunday…&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;#58;&amp;nbsp;Ignoring&amp;nbsp;Unassigned&amp;nbsp;Tokens</title>
   <link href="http://florianhanke.com/blog/2011/09/05/picky-ignoring-unassigned-tokens.html"/>
   <updated>2011-09-05T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/09/05/picky-ignoring-unassigned-tokens</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; series on its workings.&lt;/p&gt;
&lt;p&gt;It is about a new &lt;code&gt;Search&lt;/code&gt; object option &lt;code&gt;ignore_unassigned_tokens&lt;/code&gt; that is exposed from version 3.1.5 onwards. It allows you to tell Picky that it should just ignore any tokens which cannot be found in an index.&lt;/p&gt;
&lt;p&gt;This is how you set it:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Search.new my_index do
  ignore_unassigned_tokens true
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The option was buried in an internal &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; but slowly made its way out to the &lt;code&gt;Search&lt;/code&gt; object (see last post).&lt;/p&gt;
&lt;h2&gt;Ignoring unassigned tokens&lt;/h2&gt;
&lt;p&gt;What do I mean by this?&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s say you are searching for &lt;code&gt;&quot;Chicken Cajun Style&quot;&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Picky only has &amp;#8220;Chicken&amp;#8221; and &amp;#8220;Cajun&amp;#8221; indexed, as a recipe title.&lt;/p&gt;
&lt;p&gt;What happens is: Picky will find the token &amp;#8220;Chicken&amp;#8221; in the title category, and the token &amp;#8220;Cajun&amp;#8221;, also in the title category. But it won&amp;#8217;t find &amp;#8220;Style&amp;#8221; anywhere in the index. It might, but not for the same indexed object.&lt;/p&gt;
&lt;p&gt;So Picky will return an empty result set.&lt;/p&gt;
&lt;p&gt;So maybe you want to make Picky more forgiving.&lt;/p&gt;
&lt;p&gt;One way to do this is to tell it to ignore unassignable/unassigned tokens. This means that if a token cannot be matched to any category, it will be thrown away.&lt;/p&gt;
&lt;p&gt;So, in the example above, Picky would return the results for &lt;code&gt;&quot;Chicken Cajun&quot;&lt;/code&gt;. It&amp;#8217;s as if the &amp;#8220;Style&amp;#8221; had never existed.&lt;/p&gt;
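&lt;p&gt;The behaviour can be mimicked in a few lines of plain Ruby. This is only a sketch of the idea, not what Picky does internally: the &lt;code&gt;toy_search&lt;/code&gt; method, the hash standing in for the title category, and the recipe ids are invented for illustration.&lt;/p&gt;

```ruby
# Toy title category: token to ids of the recipes whose title contains it.
title_index = {
  chicken: [1, 2],
  cajun:   [2, 3]
}

def toy_search(index, query, ignore_unassigned_tokens: false)
  tokens = query.downcase.split.map { |word| word.to_sym }
  # Optionally throw away tokens that cannot be found anywhere in the index.
  tokens = tokens.select { |token| index.key?(token) } if ignore_unassigned_tokens
  return [] if tokens.empty?
  # Every remaining token has to match, so keep only the ids common to all.
  tokens.map { |token| index.fetch(token, []) }.reduce do |common, ids|
    common.select { |id| ids.include?(id) }
  end
end

p toy_search(title_index, 'Chicken Cajun Style')
# prints []
p toy_search(title_index, 'Chicken Cajun Style', ignore_unassigned_tokens: true)
# prints [2]
```

&lt;p&gt;Without the option, the unmatched &amp;#8220;Style&amp;#8221; empties the result. With it, the query behaves like &lt;code&gt;&quot;Chicken Cajun&quot;&lt;/code&gt;.&lt;/p&gt;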
&lt;h2&gt;An idea on how to use this&lt;/h2&gt;
&lt;p&gt;One idea on how to use this is in an implicit search, separate from the main search.&lt;/p&gt;
&lt;p&gt;So you have a main search, using the Picky interface, but also a space where you show relevant ads.&lt;/p&gt;
&lt;p&gt;Say you have a &lt;code&gt;Car&lt;/code&gt; model, with advertisements attached. If someone searches for a car, it will show relevant ads.&lt;/p&gt;
&lt;p&gt;In the code you&amp;#8217;d have:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;cars_search = Search.new cars_index

ads_search = Search.new cars_index do
  ignore_unassigned_tokens true
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And then you&amp;#8217;d do two searches. The idea here is – even if there is no exact result in the main search – to show anything that is in any way related to the query. (See the case study on location based ads three posts back on how to fine-tune this)&lt;/p&gt;
&lt;p&gt;That&amp;#8217;s it – hope it inspires you to try making Picky more lenient, or perhaps this was exactly what you were looking for!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">A&amp;nbsp;quick&amp;nbsp;note&amp;nbsp;on&amp;nbsp;APIs</title>
   <link href="http://florianhanke.com/blog/2011/09/04/a-quick-note-on-apis.html"/>
   <updated>2011-09-04T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/09/04/a-quick-note-on-apis</id>
   <content type="html">&lt;p&gt;While writing Picky, one thing occurred to me:
If you have an (external) &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;, it will exert pressure on the internal APIs, or the design, the structure of your code.&lt;/p&gt;
&lt;h2&gt;Lowest energy state&lt;/h2&gt;
&lt;p&gt;If your internal structure is too complicated, it takes more energy from you – in maintaining, coding, testing.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;A system will always push towards the lowest energy state.&lt;/strong&gt;*&lt;/p&gt;
&lt;p&gt;And I believe this is true even for your code structure, even though code is not actually something that is alive while you write it.
But by invoking it periodically, by running tests, or the program itself, pressure will be exerted.&lt;/p&gt;
&lt;p&gt;If information is not in the right place, the information needs to be passed around, adding more parameters, or more ugly looking method signatures.&lt;/p&gt;
&lt;p&gt;You can try to package the parameters in a capsule object, to make it look neater, but by doing this you are merely &amp;#8220;pushing the bubble in the carpet around&amp;#8221;, which I will explain later.&lt;/p&gt;
&lt;p&gt;Assuming you are running the code quite often, and looking at it, a system under your care will tend to become more beautiful, as a more ugly system will take up more energy.*&lt;/p&gt;
&lt;h2&gt;Simple illustration&lt;/h2&gt;
&lt;p&gt;Say you have an external &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; on class A, and this class calls B, which in turn calls a method in C, which then calls a method in B.&lt;/p&gt;
&lt;p&gt;So, A &amp;#8594; B &amp;#8594; C &amp;#8594; B&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s also say you use tests, integration or otherwise: It will be hard to set up nice tests.&lt;/p&gt;
&lt;p&gt;Such a system will (most probably) tend to move towards this:&lt;/p&gt;
&lt;p&gt;A &amp;#8594; B &amp;#8594; C&lt;/p&gt;
&lt;p&gt;Yes, you could argue that C calls a callback on B, but then it would look most likely like this:&lt;/p&gt;
&lt;p&gt;A &amp;#8594; C &amp;#8594; B&lt;/p&gt;
&lt;p&gt;(Where B is passed into C by A)&lt;/p&gt;
&lt;p&gt;What I am trying to say is: If the information makes detours, if it needs to be passed around, i.e. is not in the right place, it will gravitate towards the right place.&lt;/p&gt;
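&lt;p&gt;A minimal Ruby sketch of the callback variant, A &amp;#8594; C &amp;#8594; B, with B passed into C by A. The class bodies and the method names &lt;code&gt;call&lt;/code&gt;, &lt;code&gt;run&lt;/code&gt; and &lt;code&gt;render&lt;/code&gt; are placeholders made up for illustration.&lt;/p&gt;

```ruby
# B only knows how to present a value.
class B
  def render(value)
    "result: #{value}"
  end
end

# C does the work and hands its result to whatever renderer it was given.
class C
  def initialize(renderer)
    @renderer = renderer
  end

  def run(input)
    @renderer.render(input * 2)
  end
end

# A is the external API: it wires B into C, so information flows A, C, B
# and never has to travel back up.
class A
  def call(input)
    C.new(B.new).run(input)
  end
end

puts A.new.call(21) # prints result: 42
```

&lt;p&gt;Because C only depends on something that responds to &lt;code&gt;render&lt;/code&gt;, a test can hand it a stand-in instead of a real B.&lt;/p&gt;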
&lt;h2&gt;Pushing the bubble in the carpet&lt;/h2&gt;
&lt;p&gt;One image I always get when working on APIs is the one where &lt;strong&gt;I push around bubbles in a carpet&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Picky for example is littered with TODOs. This does not mean that Picky is buggy, or parts of it cannot be used. A &lt;span class=&quot;caps&quot;&gt;TODO&lt;/span&gt; is very often a location where I spotted a bubble in the carpet of Picky code.&lt;/p&gt;
&lt;p&gt;It works, but somehow it&amp;#8217;s a parameter that needs to be passed through, and hasn&amp;#8217;t yet found its rightful place.&lt;/p&gt;
&lt;h2&gt;From ball to snowflake&lt;/h2&gt;
&lt;p&gt;In the beginning, many systems tend to look like a clump, a ball of code.&lt;/p&gt;
&lt;p&gt;Maybe you start with a more complex structure, but relative to the end, the beginning looks clumpy.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;There are bubbles everywhere in the thing.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;As they are pushed out – and by &amp;#8220;pushed out&amp;#8221; I mean, towards the edges, and hopefully removed – as they are pushed out, the ball-like structure tends to look more and more like a snowflake.
A snowflake with an external &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; in the middle. One or more method calls that tend to call multiple other methods, which use other methods, resulting in smaller, more detailed, fine-grained code.&lt;/p&gt;
&lt;h2&gt;The beauty&lt;/h2&gt;
&lt;p&gt;The beautiful thing about all of it is:&lt;/p&gt;
&lt;p&gt;I don&amp;#8217;t feel I am the conscious writer of all of it. It feels like it is the system itself that wishes I push the bubbles out.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The system is designing itself.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Like a statue under a chiseler&amp;#8217;s care, yearning to escape the block of marble.&lt;/p&gt;
&lt;h2&gt;*Disclaimer&lt;/h2&gt;
&lt;p&gt;This assumes you want your code to use up the &lt;strong&gt;least amount of energy from you&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;If you are somebody who pushes overly complicated code systems for job security reasons, all of the above does not apply.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;nbsp;Case&amp;nbsp;Study&amp;#58;&amp;nbsp;Running&amp;nbsp;it&amp;nbsp;in&amp;nbsp;a&amp;nbsp;DRb&amp;nbsp;Server</title>
   <link href="http://florianhanke.com/blog/2011/09/01/picky-case-study-not-singing-in-the-rain.html"/>
   <updated>2011-09-01T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/09/01/picky-case-study-not-singing-in-the-rain</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; series on its workings.&lt;/p&gt;
&lt;h2&gt;Intro&lt;/h2&gt;
&lt;p&gt;The picky generators, for example &lt;code&gt;picky generate server &amp;lt;dirname&amp;gt;&lt;/code&gt; only generate web server examples, like the &lt;a href=&quot;http://sinatrarb.com&quot;&gt;Sinatra&lt;/a&gt; server.&lt;/p&gt;
&lt;p&gt;However, who tells you to always sing in the rain? Sometimes it is much more prudent to just use a &lt;a href=&quot;http://segment7.net/projects/ruby/drb/introduction.html&quot;&gt;DRb (Distributed Ruby) Server&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;How can we have one run our searches? Not much differently than in the Sinatra server. Or the classic server.
(With the exception of how the access is defined. In the classic server, it&amp;#8217;s &lt;code&gt;route&lt;/code&gt;, in Sinatra it&amp;#8217;s probably &lt;code&gt;get&lt;/code&gt;, and here it&amp;#8217;s starting the service.)&lt;/p&gt;
&lt;h2&gt;Server&lt;/h2&gt;
&lt;p&gt;So, copy-and-paste away, into a file called &lt;code&gt;app.rb&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'active_support'
require 'yajl'
require 'picky'
require 'drb/drb'

# &quot;Model&quot;.
#
class Item
  attr_reader :id, :name
  def initialize id, name
    @id, @name = id, name
  end
end

# Server.
#
class Server

  items = [
    Item.new(1, 'picky'),
    Item.new(2, 'drb'),
    Item.new(3, 'test'),
  ]

  drb_index = Picky::Index.new(:drb) do
    source   items
    category :name
  end
  drb_index.reindex

  drb_search = Picky::Search.new drb_index

  define_method :search do |*args|
    drb_search.search(*args).to_json
  end

end

DRb.start_service 'druby://localhost:8787', Server.new
DRb.thread.join&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And that&amp;#8217;s it for the server. Note that you don&amp;#8217;t need to index right in the server. I only do that for your copy-paste convenience.&lt;/p&gt;
&lt;p&gt;You could, for example, add a&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Signal.trap('USR1') do
  drb_index.reindex
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;to have the server index on receiving the &lt;code&gt;USR1&lt;/code&gt; signal (&lt;code&gt;kill -USR1 &amp;lt;pid&amp;gt;&lt;/code&gt;).&lt;/p&gt;
&lt;h2&gt;Client&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;client.rb&lt;/code&gt; is much easier:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'drb/drb'

search_server = DRbObject.new_with_uri 'druby://localhost:8787'
1_000.times do
  puts search_server.search 'test'
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And that&amp;#8217;s it.&lt;/p&gt;
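&lt;p&gt;If you would like to try the DRb round trip before installing Picky, here is a minimal sketch using only the Ruby standard library. The &lt;code&gt;FakeSearch&lt;/code&gt; class and its item list are made up for illustration and stand in for the Picky-backed &lt;code&gt;Server&lt;/code&gt; above; passing &lt;code&gt;nil&lt;/code&gt; as the &lt;span class=&quot;caps&quot;&gt;URI&lt;/span&gt; lets DRb pick a free port, so server and client can live in one script.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'drb/drb'
require 'json'

# Hypothetical stand-in for the Picky-backed Server (not part of Picky).
#
class FakeSearch
  ITEMS = [[1, 'picky'], [2, 'drb'], [3, 'test']]

  # Returns the ids of all items whose name equals the text, as JSON.
  #
  def search text
    ids = ITEMS.select { |_, name| name == text }.map { |id, _| id }
    JSON.generate ids: ids
  end
end

# nil as URI: DRb chooses a free port on its own.
#
DRb.start_service nil, FakeSearch.new

search_server = DRbObject.new_with_uri DRb.uri
response      = JSON.parse search_server.search('test')
puts response['ids'].inspect

DRb.stop_service&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The client side is the same as in &lt;code&gt;client.rb&lt;/code&gt;, only the &lt;span class=&quot;caps&quot;&gt;URI&lt;/span&gt; differs.&lt;/p&gt;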
&lt;h2&gt;Running it&lt;/h2&gt;
&lt;p&gt;Start the server&lt;/p&gt;
&lt;pre class=&quot;sh_bash&quot;&gt;&lt;code&gt;$ ruby app.rb&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and in another Terminal window you enter&lt;/p&gt;
&lt;pre class=&quot;sh_bash&quot;&gt;&lt;code&gt;$ ruby client.rb&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;to see the queries fly.&lt;/p&gt;
&lt;p&gt;On my MacBook Pro I get 1600 &amp;#8220;requests&amp;#8221; per second. And that is on a single core!&lt;/p&gt;
&lt;p&gt;… perhaps it could even be faster using &lt;a href=&quot;http://msgpack.org/&quot;&gt;http://msgpack.org/&lt;/a&gt;?&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;nbsp;Case&amp;nbsp;Study&amp;#58;&amp;nbsp;Location&amp;nbsp;Based&amp;nbsp;Ads</title>
   <link href="http://florianhanke.com/blog/2011/09/01/picky-case-study-location-based-ads.html"/>
   <updated>2011-09-01T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/09/01/picky-case-study-location-based-ads</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; series on its workings.&lt;/p&gt;
&lt;h2&gt;Intro&lt;/h2&gt;
&lt;p&gt;Let&amp;#8217;s say we offered a search engine where we could search stores using a name and/or location. A location could be a zipcode or suburb.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Store
 attr_reader :id,
             :name,
             :location
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, when users search for a store using a name and location, the page should also show, in a sidebar, what other stores are in that location, to help with exploration.&lt;/p&gt;
&lt;p&gt;So, when you&amp;#8217;d look for &amp;#8220;Barbershop Brooklyn&amp;#8221;, you&amp;#8217;d also get other nice stores that are located in &amp;#8220;Brooklyn&amp;#8221;.&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s tricky. Without Picky.&lt;/p&gt;
&lt;p&gt;We could define two indexes. Both index all stores. But one just has the &lt;code&gt;location&lt;/code&gt; category, and the other has &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;location&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;But that is a waste of precious memory space.&lt;/p&gt;
&lt;p&gt;That&amp;#8217;s what the new Picky version can help with.&lt;/p&gt;
&lt;h2&gt;Picky 3.1.3&lt;/h2&gt;
&lt;p&gt;Version 3.1.3 introduces the &lt;code&gt;ignore&lt;/code&gt; option in the search definition block:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;stores = Index.new :stores do
  source { Store.order('name DESC') }
  category :name
  category :location
end

search = Search.new stores do
  ignore :name
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;ignore :name&lt;/code&gt; makes that &lt;code&gt;Search&lt;/code&gt; throw away (&lt;code&gt;ignore&lt;/code&gt;) any tokens that map to that category. So if Picky finds that the word &amp;#8220;barbershop&amp;#8221; in &amp;#8220;barbershop brooklyn&amp;#8221; maps to the &lt;code&gt;:name&lt;/code&gt; category, such that both would map to &lt;code&gt;[:name, :location]&lt;/code&gt;,
then Picky throws away the &amp;#8220;barbershop&amp;#8221;, such that only &lt;code&gt;:location&lt;/code&gt; brooklyn remains.&lt;/p&gt;
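&lt;p&gt;To make the throwing-away concrete, here is a plain-Ruby sketch of the idea. It is not how Picky implements &lt;code&gt;ignore&lt;/code&gt;; the &lt;code&gt;CATEGORY_OF&lt;/code&gt; table and the &lt;code&gt;remaining_tokens&lt;/code&gt; method are hypothetical and only mimic the outcome for the example query.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical lookup of which category a token maps to.
# In Picky, this information comes from the index itself.
#
CATEGORY_OF = { barbershop: :name, brooklyn: :location }

# Drops every token that maps to one of the ignored categories.
#
def remaining_tokens query, ignored
  query.downcase.split.reject do |token|
    ignored.include? CATEGORY_OF[token.to_sym]
  end
end

puts remaining_tokens('barbershop brooklyn', [:name]).inspect
puts remaining_tokens('barbershop brooklyn', []).inspect&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With &lt;code&gt;:name&lt;/code&gt; ignored, only brooklyn is left. With nothing ignored, both tokens survive.&lt;/p&gt;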
&lt;h2&gt;Location-based Ads&lt;/h2&gt;
&lt;p&gt;For our example, we would define the main search like this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;main_search = Search.new stores&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;because we want it to not ignore anything. If the user enters &amp;#8220;barbershop brooklyn&amp;#8221;, it must be found in the name (barbershop) and location (brooklyn), or Picky won&amp;#8217;t return it.&lt;/p&gt;
&lt;p&gt;Now, the ads search works a little differently. Whatever search word maps to name, we ignore it. We are only interested in words matching the location:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;ads_search = Search.new stores do
  ignore :name
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the webapp, we would then search twice: Once for the &amp;#8220;real&amp;#8221; search, and once just for the ads to show on the side, using the same query.*&lt;/p&gt;
&lt;p&gt;Because wouldn&amp;#8217;t you just love to try Vinnie&amp;#8217;s Pizza after Uncle Joe&amp;#8217;s Barbershop? I would.&lt;/p&gt;
&lt;h2&gt;Examples&lt;/h2&gt;
&lt;p&gt;Not following? Let me give you a few examples:&lt;/p&gt;
&lt;p&gt;Searching for &amp;#8220;Barbershop&amp;#8221; will yield results in the main search, but none in the ads, since &amp;#8220;Barbershop&amp;#8221; does not match any location.&lt;/p&gt;
&lt;p&gt;Searching for &amp;#8220;Santa Barbara&amp;#8221; will probably yield something like &amp;#8220;Santa Lucia Pizzeria, Santa Barbara&amp;#8221; for the main results, and return ads from Santa Barbara, since &amp;#8220;Santa&amp;#8221; or &amp;#8220;Barbara&amp;#8221; matching as names is ignored.&lt;/p&gt;
&lt;p&gt;Searching for &amp;#8220;Chicago&amp;#8221; will return basically the same for the main result and the ads. But who searches just for &amp;#8220;Chicago&amp;#8221;?&lt;/p&gt;
&lt;h2&gt;Advanced*&lt;/h2&gt;
&lt;p&gt;If you think calling the Picky server a second time just for the ads is too much, you can use the &lt;em&gt;piggybacking&lt;/em&gt; technique:&lt;/p&gt;
&lt;p&gt;In the Sinatra server, search the main search, but at the same time, search the ads. Then, stick the results for the ads onto the main results.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;get '/stores' do
  query = params[:query]

  main_results = main_search.search query # etc.
  ads_results  = ads_search.search query # etc.

  results_hash = main_results.to_hash
  results_hash[:ads] = ads_results.to_hash

  results_hash.to_json
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then, in the app server, de-piggyback the ad results and render separately. As usual, it&amp;#8217;s all Ruby.&lt;/p&gt;
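&lt;p&gt;De-piggybacking could look roughly like this in the app server. The &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt; body and its keys (&lt;code&gt;total&lt;/code&gt;, &lt;code&gt;allocations&lt;/code&gt;) are made up for the sketch and are not meant as Picky&amp;#8217;s exact result format; only the &lt;code&gt;ads&lt;/code&gt; key comes from the Sinatra code above.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'json'

# Hypothetical response body, as the Sinatra action above might return it.
#
body = JSON.generate total: 1, allocations: [], ads: { total: 2, allocations: [] }

results = JSON.parse body
ads     = results.delete 'ads' # Removes the ads and returns them.

puts results.keys.inspect # The main results, without the ads.
puts ads['total']&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After the &lt;code&gt;delete&lt;/code&gt;, the main results and the ads can be rendered independently.&lt;/p&gt;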
&lt;h2&gt;Note&lt;/h2&gt;
&lt;p&gt;You could of course use a real geosearch instead of the simple location above. But it&amp;#8217;s just more understandable like this.&lt;/p&gt;
&lt;p&gt;Also, sometimes this is enough, and anything more correct is simply unnecessary and costs too much time.&lt;/p&gt;
&lt;h2&gt;Note 2&lt;/h2&gt;
&lt;p&gt;I recommend not using this in the normal search. It&amp;#8217;s just too surprising for users to have their precious search words thrown away like this.&lt;/p&gt;
&lt;p&gt;As if they were just mere strings. To be tentacled away.&lt;/p&gt;
&lt;p&gt;That reminds me… one of the next blog posts really &lt;strong&gt;has to be&lt;/strong&gt; called &amp;#8220;Day of the Tentacle&amp;#8221;! &lt;strong&gt;cough&lt;/strong&gt;&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;nbsp;Case&amp;nbsp;Study&amp;#58;&amp;nbsp;Restricting&amp;nbsp;Results</title>
   <link href="http://florianhanke.com/blog/2011/08/31/picky-case-study-restricting-results.html"/>
   <updated>2011-08-31T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/08/31/picky-case-study-restricting-results</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; series on its workings.&lt;/p&gt;
&lt;h2&gt;Intro&lt;/h2&gt;
&lt;p&gt;Recently a Picky user contacted me with an intriguing question.
&lt;strong&gt;Items have restricted visibility&lt;/strong&gt;. Some items can only be seen by Mr. Black (user id 5), but others only by Mr. Pink (user id 42). Each item can only be seen by a small number of users.&lt;/p&gt;
&lt;p&gt;The question: &amp;#8220;How can we do it?&amp;#8221;&lt;/p&gt;
&lt;p&gt;It turns out, Picky can do this already quite easily.&lt;/p&gt;
&lt;h2&gt;Here goes&lt;/h2&gt;
&lt;p&gt;Let&amp;#8217;s say we have items that have a method &lt;code&gt;#restricted_to_user_ids&lt;/code&gt; that returns an array of user ids which can &amp;#8220;see&amp;#8221; this item in results:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Item
 attr_reader :id # e.g. 42
 attr_reader :name # e.g. &quot;Dan&quot;
 attr_reader :restricted_to_user_ids # e.g. [2,3,5,7,11]
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Quite nice.&lt;/p&gt;
&lt;p&gt;But how can we ask Picky to just return results that the current user can see?&lt;/p&gt;
&lt;p&gt;Since Picky is good at filtering, we could prefix each query by, say,&lt;/p&gt;
&lt;p&gt;&lt;code&gt;restricted:5&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;which would create queries like&lt;/p&gt;
&lt;p&gt;&lt;code&gt;restricted:5 my cool query&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;(how we do this we&amp;#8217;ll see later). This means we&amp;#8217;d only search for items which have 5 in their restricted user ids list.&lt;/p&gt;
&lt;p&gt;Now. Since Picky cannot yet directly index the array returned by &lt;code&gt;#restricted_to_user_ids&lt;/code&gt;, we have to use a technique which in German would be called &amp;#8220;&lt;em&gt;from behind through the breast into the eye&lt;/em&gt;&amp;#8221;:&lt;/p&gt;
&lt;p&gt;We create a reader, which simply joins the array from &lt;code&gt;#restricted_to_user_ids&lt;/code&gt; into a string with space-separated user id values.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Item
 attr_reader :id # e.g. 42
 attr_reader :name # e.g. &quot;Dan&quot;
 attr_reader :restricted_to_user_ids # e.g. [2,3,5,7,11]
 def restricted
   restricted_to_user_ids.join(' ') # e.g. &quot;2 3 5 7 11&quot;
 end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Assuming we split the data on spaces, Picky indexes the ids nicely for each item.&lt;/p&gt;
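&lt;p&gt;As a plain-Ruby illustration (no Picky involved, and using a &lt;code&gt;Struct&lt;/code&gt; instead of the class above just to keep it short), this is roughly what splitting on whitespace gives the indexer to work with:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Item = Struct.new(:id, :name, :restricted_to_user_ids) do
  def restricted
    restricted_to_user_ids.join ' '
  end
end

item   = Item.new 42, 'Dan', [2, 3, 5, 7, 11]
tokens = item.restricted.split(/\s/) # One token per user id.

puts tokens.inspect
puts tokens.include?('5') # User 5 may see this item.
puts tokens.include?('4') # User 4 may not.&lt;/code&gt;&lt;/pre&gt;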
&lt;p&gt;Then, all we have to do is add the category &lt;code&gt;:restricted&lt;/code&gt; (which uses the reader we just defined) to the index.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;items = Picky::Index.new :items do
 source { Item.order('name DESC') }
 indexing splits_text_on: /\s/
 category :name
 category :restricted
end&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;The JS frontend&lt;/h2&gt;
&lt;p&gt;Finally, to add the &lt;code&gt;restricted:&amp;lt;user_id&amp;gt;&lt;/code&gt; text in front of each query, we use the JavaScript callback available in the generated client, &lt;code&gt;before&lt;/code&gt;. Since version 3.1.2, &lt;code&gt;before&lt;/code&gt; gets the query and the params.&lt;/p&gt;
&lt;p&gt;Whatever you return is used as the new query.&lt;/p&gt;
&lt;pre class=&quot;sh_javascript&quot;&gt;&lt;code&gt;before: function(query, params) { return query.replace(/^/, 'restricted:' + user_id + ' ') }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This code replaces &lt;code&gt;&quot;my beautiful query&quot; =&amp;gt; &quot;restricted:5 my beautiful query&quot;&lt;/code&gt; (Please note that the JS function &lt;code&gt;#replace&lt;/code&gt; leaves the original string alone).&lt;/p&gt;
&lt;h2&gt;One little problem&lt;/h2&gt;
&lt;p&gt;Did you notice? There&amp;#8217;s one little problem with solving it in JavaScript.&lt;/p&gt;
&lt;p&gt;If the visibility restriction is not crucial, but only helpful to your users, we would be finished.&lt;/p&gt;
&lt;p&gt;However, if Mr. Pink must never see results that only Mr. Black should have access to, we&amp;#8217;d now have a big problem.&lt;/p&gt;
&lt;h2&gt;The solution?&lt;/h2&gt;
&lt;p&gt;The solution is to route the full and live requests through our web server, and add the &lt;code&gt;restricted:&amp;lt;user_id&amp;gt;&lt;/code&gt; there. So in the server you&amp;#8217;d prepend your query with &lt;code&gt;&quot;restricted:#{current_user.id} #{params[:query]}&quot;&lt;/code&gt; and send it off to the Picky server.&lt;/p&gt;
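&lt;p&gt;A minimal sketch of that prepending, as a helper you might call from a controller. The method name &lt;code&gt;restricted_query&lt;/code&gt; is made up, and how you obtain the current user and send the query on to Picky depends on your app.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Builds the query that is sent to the Picky server.
# The user id should come from the server-side session, never from the browser.
#
def restricted_query user_id, query
  format 'restricted:%s %s', user_id, query
end

puts restricted_query(5, 'my cool query')&lt;/code&gt;&lt;/pre&gt;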
&lt;p&gt;And that&amp;#8217;s it already. Nobody loses an ear. Quite easy, don&amp;#8217;t you think?&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Migrating&amp;nbsp;to&amp;nbsp;Picky&amp;nbsp;3.1&amp;nbsp;(from&amp;nbsp;3.0)</title>
   <link href="http://florianhanke.com/blog/2011/08/26/migrating-to-picky-31.html"/>
   <updated>2011-08-26T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/08/26/migrating-to-picky-31</id>
   <content type="html">&lt;p&gt;This post is intended for &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; users that are at version 3.0 (or near) and would like to move to version 3.1.&lt;/p&gt;
&lt;h2&gt;Picky 3.1 is released!&lt;/h2&gt;
&lt;p&gt;You&amp;#8217;re probably wondering: The last post handled upgrading to 3.0, why is there another update so close to it?&lt;/p&gt;
&lt;p&gt;First of all, let me say sorry for the quick succession of upgrades. Picky will help you and tell you what to do, as well as it can.&lt;/p&gt;
&lt;p&gt;Secondly, Picky&amp;#8217;s goal is to be very &lt;strong&gt;modular&lt;/strong&gt; and have &lt;strong&gt;exchangeable modules&lt;/strong&gt;, while &lt;strong&gt;not being more complicated&lt;/strong&gt; to read or use.&lt;/p&gt;
&lt;p&gt;What does this have to do with this update?&lt;/p&gt;
&lt;h2&gt;What has changed?&lt;/h2&gt;
&lt;p&gt;Instead of defining your memory/redis indexes like so&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;memory_index = Picky::Indexes::Memory.new :name do
  # definition
end

redis_index = Picky::Indexes::Redis.new :name do
  # definition
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;you now only use &lt;code&gt;Picky::Index.new&lt;/code&gt; and pass in the appropriate index backend. Since the memory backend is the default, you don&amp;#8217;t need to pass it in. For the Redis backend, you use &lt;code&gt;Picky::Backends::Redis.new&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;memory_index = Picky::Index.new :name do
  # definition
end

redis_index = Picky::Index.new :name do
  backend Picky::Backends::Redis.new
  # definition
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Two reasons:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Exchangeable backends&lt;/li&gt;
	&lt;li&gt;Inheritance is overrated&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Double Index. What does it meeeean?&lt;/h2&gt;
&lt;p&gt;This means that from now on you can pass in your own backend!&lt;/p&gt;
&lt;p&gt;We would be quite happy if someone decided to do a purely file-based backend :) Got one? Please contribute!
(As an example, see &lt;a href=&quot;http://github.com/floere/picky/blob/master/server/lib/picky/backends/redis.rb&quot;&gt;http://github.com/floere/picky/blob/master/server/lib/picky/backends/redis.rb&lt;/a&gt;, explanations will follow. Stay tuned!)&lt;/p&gt;
&lt;p&gt;This is the main &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; change in 3.1.&lt;/p&gt;
&lt;h2&gt;ちわ, WaDoku!&lt;/h2&gt;
&lt;p&gt;In other news, Picky now can index and search Japanese.
(Mainly due to &lt;a href=&quot;http://wadoku.eu/&quot;&gt;this project&lt;/a&gt; and the combined efforts of &lt;a href=&quot;http://twitter.com/rogerbraun&quot;&gt;Roger Braun&lt;/a&gt; and &lt;a href=&quot;http://twitter.com/brianmario&quot;&gt;Brian Lopez&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;Thanks for reading and have fun! さよなら!!!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Migrating&amp;nbsp;to&amp;nbsp;Picky&amp;nbsp;3.0&amp;nbsp;(from&amp;nbsp;2.7)</title>
   <link href="http://florianhanke.com/blog/2011/08/23/migrating-to-picky-30.html"/>
   <updated>2011-08-23T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/08/23/migrating-to-picky-30</id>
   <content type="html">&lt;p&gt;This post is intended for &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; users that are at version 2.7 (or near) and would like to move to version 3.0.&lt;/p&gt;
&lt;p&gt;An update recipe:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;Rakefile: Rewrite &lt;code&gt;require 'picky-tasks'&lt;/code&gt; =&amp;gt; &lt;code&gt;require 'picky/tasks'&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;Index::Memory&lt;/code&gt; has been renamed to &lt;code&gt;Indexes::Memory&lt;/code&gt;, same with &lt;code&gt;Index::Redis&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;If you pass options into the index initializer: They have been removed. Options can now only be set in the initializer block.&lt;/li&gt;
	&lt;li&gt;If you have already been using Sinatra as a server, please do not call &lt;code&gt;#search_with_text&lt;/code&gt; anymore. Instead call &lt;code&gt;#search(text, ids, offset)&lt;/code&gt;, the new &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; method. It still returns a &lt;code&gt;Result&lt;/code&gt;.&lt;/li&gt;
	&lt;li&gt;The &lt;code&gt;logging.rb&lt;/code&gt; file is not &lt;code&gt;load&lt;/code&gt;ed anymore, so you can load whatever you want (being less opinionated). If you still want to load the &lt;code&gt;logging.rb&lt;/code&gt; file, please &lt;code&gt;require&lt;/code&gt; or &lt;code&gt;load&lt;/code&gt; it in the application file, for example. If you &lt;code&gt;load&lt;/code&gt; it in the application file, it will be reloaded if you call &lt;code&gt;Picky::Application.reload&lt;/code&gt;.&lt;/li&gt;
	&lt;li&gt;If you&amp;#8217;ve been using the generated example &lt;code&gt;logging.rb&lt;/code&gt;, rewrite &lt;code&gt;PickyLog =&lt;/code&gt; to &lt;code&gt;Picky.logger =&lt;/code&gt; and do not wrap the &lt;code&gt;::Logger.new&lt;/code&gt; in a &lt;code&gt;Loggers::Search.new&lt;/code&gt;, but assign the logger directly.&lt;/li&gt;
	&lt;li&gt;Note that the generator for a Picky project is now called the &amp;#8220;classic&amp;#8221; generator, as opposed to the Sinatra generator.&lt;/li&gt;
	&lt;li&gt;Note that an &amp;#8220;All In One&amp;#8221; generator has been added, which generates a combined server/client for use mainly on e.g. Heroku.&lt;/li&gt;
	&lt;li&gt;If you use &lt;code&gt;Results#to_log&lt;/code&gt;, note that it has been renamed to &lt;code&gt;Results#to_s&lt;/code&gt;.&lt;/li&gt;
	&lt;li&gt;In the client, using &lt;code&gt;#allocations_size&lt;/code&gt; does not work anymore on results (that have been extended by &lt;code&gt;Picky::Convenience&lt;/code&gt;). Replace with &lt;code&gt;results.allocations.size&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;These are the main &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; changes in 3.0.&lt;/p&gt;
&lt;p&gt;Thanks for reading and have fun!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Ego&amp;nbsp;Trippin&amp;#8217;</title>
   <link href="http://florianhanke.com/blog/2011/08/17/ego-trippin.html"/>
   <updated>2011-08-17T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/08/17/ego-trippin</id>
   <content type="html">&lt;p&gt;During the last year, I started noticing a surge in &lt;a href=&quot;http://www.urbandictionary.com/define.php?term=ego%20trip&quot;&gt;&lt;em&gt;ego tripping&lt;/em&gt;&lt;/a&gt; in the Ruby community.&lt;/p&gt;
&lt;p&gt;Some open source projects come with a big ego attached. And if a project is released that fills a niche next to that project, that ego feels threatened.&lt;/p&gt;
&lt;p&gt;I get that a project can be like one&amp;#8217;s baby. And you may cherish it. But you are not your baby.&lt;/p&gt;
&lt;p&gt;If you feel personally attacked by someone releasing a project similar to yours, that&amp;#8217;s a signal to take it easy for a few days. Yes, your project will lose some users. But they might come back.
Despite all the early hype and enthusiasm: In the long run, people use what&amp;#8217;s good.&lt;/p&gt;
&lt;p&gt;And what&amp;#8217;s good usually went at least through some pressure and inspiration from other projects. &lt;sup&gt;1&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;Conversely, I noticed that, instead of contributing to existing projects, some egos needed to have their own.&lt;/p&gt;
&lt;p&gt;Yes, &amp;#8220;I saw that the core method didn&amp;#8217;t work the way I wanted&amp;#8221; etc., but did you really try and discuss it with the owner, or send a pull request?&lt;/p&gt;
&lt;p&gt;Now, this is not about not having a voice of one&amp;#8217;s own. This is not about you wanting a bit of recognition for your hard-learned skills. This is simply a call for a bit of humility and respect for the work of others. And a call to look at what others might do better in their projects, and to learn from it. And also a call to try to teach and improve someone else&amp;#8217;s project.&lt;/p&gt;
&lt;p&gt;Discuss the thing, and not the egos.&lt;/p&gt;
&lt;p&gt;Since in the end, the gift of knowledge and respect is one of the greatest you can give (and receive).&lt;/p&gt;
&lt;p&gt;So try to be humble.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;I wanted to thank two guys especially who recently gave and are giving me great feedback on Picky: &lt;a href=&quot;http://github.com/rogerbraun&quot;&gt;http://github.com/rogerbraun&lt;/a&gt; and &lt;a href=&quot;http://github.com/clintkrollwood&quot;&gt;http://github.com/clintkrollwood&lt;/a&gt;. They, like &lt;a href=&quot;http://github.com/floere/picky/wiki/Contributions-and-contributors&quot;&gt;all contributors&lt;/a&gt;, continue to give great feedback and code. All these people are the real, unsung heroes. So, thanks!&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Some good further reads:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;a href=&quot;http://blog.nodejitsu.com/getting-refunds-on-open-source-projects&quot;&gt;http://blog.nodejitsu.com/getting-refunds-on-open-source-projects&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;http://blog.steveklabnik.com/2011/08/12/we-forget-that-open-source-is-made-of-people.html&quot;&gt;http://blog.steveklabnik.com/2011/08/12/we-forget-that-open-source-is-made-of-people.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;sup&gt;1&lt;/sup&gt; Picky got positive pressure from &lt;a href=&quot;http://github.com/karmi/tire&quot;&gt;Tire&lt;/a&gt;. Very thankful for that.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;#58;&amp;nbsp;Happy&amp;nbsp;1st&amp;nbsp;Birthday!</title>
   <link href="http://florianhanke.com/blog/2011/08/16/picky-happy-1st-birthday.html"/>
   <updated>2011-08-16T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/08/16/picky-happy-1st-birthday</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; series on its workings.&lt;/p&gt;
&lt;h2&gt;A big fat 1. Congratulations!&lt;/h2&gt;
&lt;p&gt;Unbelievably, a whole year has passed since the small pink octopus left the private womb for the big world of wide open source.&lt;/p&gt;
&lt;p&gt;Since then, it has seen almost any type of project, mastered almost all challenges and helped quite a few people, many of whom seem to be very glad to be his buddies.&lt;/p&gt;
&lt;p&gt;It also has grown in experience, but has lost a lot of its baby fat at the same time.&lt;/p&gt;
&lt;p&gt;As a gift to Picky, the team gave him a &lt;a href=&quot;http://sinatrarb.com&quot;&gt;Sinatra&lt;/a&gt; collection. A new tune that you can play on release 3.0 that came out today! Picky could sing Sinatra songs all day in the rain. Man, he loves that stuff. So much Ruby goodness!&lt;/p&gt;
&lt;p&gt;Also, he got a spanking new &lt;a href=&quot;http://florianhanke.com/picky/documentation.html&quot;&gt;Single Page Help&lt;/a&gt; inspired by the &lt;a href=&quot;http://www.sinatrarb.com/intro&quot;&gt;Sinatra &lt;span class=&quot;caps&quot;&gt;README&lt;/span&gt;&lt;/a&gt;, which the team just loves.&lt;/p&gt;
&lt;p&gt;So, congratulations Picky! He and &lt;a href=&quot;http://github.com/floere/picky/wiki/Contributions-and-contributors&quot;&gt;the team&lt;/a&gt; will be &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;partying&lt;/a&gt; (see logo) and going out for Sushi and other fishy goods all night!&lt;/p&gt;
&lt;p&gt;We probably won&amp;#8217;t be answering any issues or pull requests until the sake is out of our system. Also, any blog posts on the new goodness that is 3.0 will have to wait a little.&lt;/p&gt;
&lt;p&gt;Picky would especially like to thank the &lt;a href=&quot;http://github.com/floere/picky/wiki/Contributions-and-contributors&quot;&gt;whole team&lt;/a&gt;. He wouldn&amp;#8217;t be what he is without their guidance and support. Thanks!&lt;/p&gt;
&lt;p&gt;What? Not &lt;a href=&quot;http://florianhanke.com/picky/getting_started.html&quot;&gt;tried it yet&lt;/a&gt;?&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;nbsp;3.0&amp;#58;&amp;nbsp;It's&amp;nbsp;all&amp;nbsp;Ruby!&amp;nbsp;(Part&amp;nbsp;1)</title>
   <link href="http://florianhanke.com/blog/2011/08/15/picky-30-its-all-ruby-part-1.html"/>
   <updated>2011-08-15T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/08/15/picky-30-its-all-ruby-part-1</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; series on its workings.&lt;/p&gt;
&lt;p&gt;This is a quick look at the customizability of Picky in the upcoming 3.0 release.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;#part1&quot;&gt;Too much intro? Jump down to the code!&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;#summary&quot;&gt;Even too much code? Jump down to the summary!&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&quot;intro&quot;&gt;Intro&lt;/h2&gt;
&lt;p&gt;Remember when you wrote your first Ruby code?&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;bananas.each { |banana| banana.peel }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;http://groups.csail.mit.edu/mac/classes/6.001/abelson-sussman-lectures/wizard.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;You probably felt more powerful than the freakish wizard at the beginning of &lt;a href=&quot;http://groups.csail.mit.edu/mac/classes/6.001/abelson-sussman-lectures/&quot;&gt;Structure &amp;amp; Interpretation of Computer Programs&lt;/a&gt; by Abelson and Sussman.&lt;/p&gt;
&lt;p&gt;Finally, no more initializing an anonymous class and overriding its methods just to traverse an array like a mere acolyte.&lt;/p&gt;
&lt;p&gt;Accusatorily, you shake your magic wand at me. Yes, we can even write&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;bananas.each &amp;amp;:peel&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The point here is:
Ruby is powerful. Or more importantly: Ruby does not take away the possibilities. There is a way, always, whereas with other, more restrictive languages I usually hit a wall and then have a feeling of powerlessness wash over me.&lt;/p&gt;
&lt;p&gt;I don&amp;#8217;t know you, but chances are, you feel the same.&lt;/p&gt;
&lt;h2 id=&quot;power&quot;&gt;Powerlessness and the Power of Ruby&lt;/h2&gt;
&lt;p&gt;A quick story:
Back when I still worked with Java Lucene servers, I found myself often deep in rather big &lt;span class=&quot;caps&quot;&gt;XML&lt;/span&gt; files.&lt;/p&gt;
&lt;p&gt;The way it worked is that you wrote down a string on what tokenizer you&amp;#8217;d like to use. For example, &lt;code&gt;&quot;whitespace&quot;&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Lo and behold, the beast roared and duly split search text on whitespaces.&lt;/p&gt;
&lt;p&gt;Sometimes a typo crept in: &lt;code&gt;&quot;whitspace&quot;&lt;/code&gt;. The beast just lifted an eyebrow and continued doing… nothing.&lt;/p&gt;
&lt;p&gt;This is bad. Why?&lt;/p&gt;
&lt;p&gt;Strings are the weakest of command words. If you have to step down from a type to a String you have already lost.&lt;/p&gt;
&lt;p&gt;You have just lost a lot of information that only a type can carry.&lt;/p&gt;
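&lt;p&gt;A small Ruby sketch of the difference. The &lt;code&gt;TOKENIZERS&lt;/code&gt; table and the &lt;code&gt;WhitespaceTokenizer&lt;/code&gt; class are invented for this illustration and are neither Lucene nor Picky &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# String-based configuration: a typo silently selects nothing.
#
TOKENIZERS = Hash['whitespace', lambda { |text| text.split }]

tokenizer = TOKENIZERS['whitspace'] # Typo. No error, just nil.
puts tokenizer.inspect

# Type-based configuration: the same typo fails loudly, right away.
#
class WhitespaceTokenizer
  def tokenize text
    text.split
  end
end

begin
  WhitspaceTokenizer.new # Same typo, now in a constant name.
rescue NameError
  puts 'Typo caught immediately.'
end&lt;/code&gt;&lt;/pre&gt;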
&lt;p&gt;More often than not – since you usually needed a very specific sort of tokenizer for that given project – I was not quite happy with any of the tokenizers.&lt;/p&gt;
&lt;p&gt;It was time to leave the world of &lt;span class=&quot;caps&quot;&gt;XML&lt;/span&gt; for the world of Java classes. This was not acolyte school anymore. This was the &amp;#8220;Dark Forest&amp;#8221;, with creepy trees and bugs lurking left and right.&lt;/p&gt;
&lt;p&gt;After valiantly capturing a tokenizer you dragged your ungodly creation out of the forest back to the acolyte school to then proudly write its name down on the &lt;span class=&quot;caps&quot;&gt;XML&lt;/span&gt; scroll: &lt;code&gt;&quot;com.florianhanke.tokenizers.NotQuiteAWhitespaceTokenizer&quot;&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Beautiful *cough*&lt;/p&gt;
&lt;p&gt;Of course, now that you know Ruby, you&amp;#8217;d rather use objects than Strings.&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s leave the world of wizards and beasts and enter the land of rainbows and rubies.&lt;/p&gt;
&lt;h2 id=&quot;part1&quot;&gt;Part I: Derived Indexes.&lt;/h2&gt;
&lt;p&gt;Indexing is very customizable in Picky.&lt;/p&gt;
&lt;p&gt;Most search engines use some sort of &lt;a href=&quot;http://en.wikipedia.org/wiki/Inverted_index&quot;&gt;inverted index&lt;/a&gt;. Picky also does that. In addition, it generates 3 other derived indexes from that inverted index.&lt;/p&gt;
&lt;p&gt;The generators for these derived indexes can be passed into a category definition:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;category   :title,
           weights:    Picky::Weights::Logarithmic.new,            # Default
           partial:    Picky::Partial::Substring.new(:from =&amp;gt; -3), # Default
           similarity: Picky::Similarity::DoubleMetaphone.new(2)   # Default is ::None.&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let&amp;#8217;s look at the inverted index first:&lt;/p&gt;
&lt;h2 id=&quot;inverted&quot;&gt;Inverted Index&lt;/h2&gt;
&lt;p&gt;An &lt;a href=&quot;http://en.wikipedia.org/wiki/Inverted_index&quot;&gt;inverted index&lt;/a&gt; in Picky is simply a Hash that consists of &lt;code&gt;:symbols =&amp;gt; [ids]&lt;/code&gt;. For example if we have things like&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Thing(id: 1, text: &quot;Hello Picky&quot;)
Thing(id: 2, text: &quot;Hello!&quot;)
Thing(id: 3, text: &quot;Hello, hello.&quot;)
Thing(id: 5, text: &quot;PICKY&quot;)
Thing(id: 11, text: &quot;Picky, hello.&quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;an inverted index would probably look like this&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;{
  :hello =&amp;gt; [1, 3, 2, 11],
  :picky =&amp;gt; [1, 5, 11]
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this case, the things we indexed had &amp;#8220;Hello&amp;#8221; and &amp;#8220;Picky&amp;#8221; in the texts. Some had both, some only one of these.&lt;/p&gt;
&lt;p&gt;If you search for &lt;code&gt;&quot;picky&quot;&lt;/code&gt;, you will get &lt;code&gt;[1, 5, 11]&lt;/code&gt;, since – simplified – Picky does a hash lookup.
That means when you search for just &lt;code&gt;&quot;pic&quot;&lt;/code&gt;, Picky will not find anything.&lt;/p&gt;
&lt;p&gt;For that it needs a partial index.&lt;/p&gt;
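&lt;p&gt;To see why an exact lookup misses &lt;code&gt;pic&lt;/code&gt;, here is a minimal sketch in plain Ruby of how an inverted index like the one above could be built and queried. &lt;code&gt;Thing&lt;/code&gt; is a stand-in struct and the tokenizing is deliberately naive, so treat this as an illustration, not as Picky&amp;#8217;s actual code. The order of the ids depends on the indexing order and may differ from the hash shown earlier.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Sketch only: Thing is a stand-in, not a Picky class.
Thing = Struct.new(:id, :text)

things = [
  Thing.new(1, 'Hello Picky'),
  Thing.new(2, 'Hello!'),
  Thing.new(3, 'Hello, hello.'),
  Thing.new(5, 'PICKY'),
  Thing.new(11, 'Picky, hello.')
]

# Every unknown key starts out with an empty id array.
inverted = Hash.new { |hash, key| hash[key] = [] }

things.each do |thing|
  # Downcase, keep only word characters, index each word once per thing.
  thing.text.downcase.scan(/\w+/).uniq.each do |word|
    inverted[word.to_sym].push(thing.id)
  end
end

inverted[:picky]         # [1, 5, 11]
inverted.fetch(:pic, []) # [] since there is no :pic key&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The lookup is exact: only complete words are keys, so anything shorter falls through.&lt;/p&gt;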
&lt;h2 id=&quot;partial&quot;&gt;Partial Index&lt;/h2&gt;
&lt;p&gt;A partial index is an index where we also find pieces of the words above. Say, we want to also find &lt;code&gt;[1, 5, 11]&lt;/code&gt;
when looking for &lt;code&gt;&quot;pic&quot;&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;What you need to do is provide Picky with a generator that generates a new inverted index just for partial matches.&lt;/p&gt;
&lt;p&gt;Picky already provides one:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;partial: Picky::Partial::Substring.new(:from =&amp;gt; -3)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This one generates the following index from the above one:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;{
  :hello =&amp;gt; [1, 3, 2, 11],
  :hell =&amp;gt; [1, 3, 2, 11],
  :hel =&amp;gt; [1, 3, 2, 11],
  :picky =&amp;gt; [1, 5, 11],
  :pick =&amp;gt; [1, 5, 11],
  :pic =&amp;gt; [1, 5, 11]
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Incidentally, this &lt;code&gt;(from: -3)&lt;/code&gt; is the default one.&lt;/p&gt;
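&lt;p&gt;Judging from the example above, &lt;code&gt;from: -3&lt;/code&gt; means each word is also indexed with its end cut back, down to the third-last character. Here is a rough sketch of a generator with that behaviour. The class name &lt;code&gt;SketchSubstring&lt;/code&gt; is made up, and this only mimics the example output. It assumes nothing about how Picky implements it internally.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Sketch only: reproduces the example above, not Picky's real implementation.
class SketchSubstring
  def initialize(from: -3)
    @from = from
  end

  def generate_from(inverted_index)
    partial = Hash.new { |hash, key| hash[key] = [] }
    inverted_index.each do |sym, ids|
      word = sym.to_s
      # End positions -1, -2, ... down to @from, e.g. picky, pick, pic.
      -1.downto(@from) do |last|
        piece = word[0..last]
        next if piece.empty? # very short words run out of characters
        partial[piece.to_sym] |= ids
      end
    end
    partial
  end
end

inverted = { hello: [1, 3, 2, 11], picky: [1, 5, 11] }
partial  = SketchSubstring.new(from: -3).generate_from(inverted)

partial.keys  # [:hello, :hell, :hel, :picky, :pick, :pic]
partial[:pic] # [1, 5, 11]&lt;/code&gt;&lt;/pre&gt;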
&lt;p&gt;If you don&amp;#8217;t want a partial index, use &lt;code&gt;partial: Picky::Partial::None.new&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Now, this might not be what you want. How do you write your own?&lt;/p&gt;
&lt;h3&gt;Your own?&lt;/h3&gt;
&lt;p&gt;All derived indexes implement the method &lt;code&gt;#generate_from(inverted_index)&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;A partial generator should return an inverted index with &lt;code&gt;Symbols&lt;/code&gt; as keys and id arrays as values.&lt;/p&gt;
&lt;p&gt;Read more about it in &lt;a href=&quot;/2011/01/17/searching-with-picky-partial-search.html&quot;&gt;Searching with Picky Partial Search&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Also, who said they need to be actual partials? Go wild!
(And remember that Picky looks in the partial indexes when a &lt;code&gt;*&lt;/code&gt;
is used in the queries or on the last word of a query, the implicit &lt;code&gt;*&lt;/code&gt; at the end)&lt;/p&gt;
&lt;p&gt;When would you use this? For example, you&amp;#8217;d like partial searches that match word endings instead of beginnings, cutting characters off the front. So, &lt;code&gt;picky&lt;/code&gt;, &lt;code&gt;icky&lt;/code&gt;, &lt;code&gt;cky&lt;/code&gt;, &lt;code&gt;ky&lt;/code&gt; and &lt;code&gt;y&lt;/code&gt; would match.&lt;/p&gt;
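&lt;p&gt;A custom generator for exactly this case could look roughly like the following. It is a hypothetical sketch: the only contract taken from this article is that &lt;code&gt;#generate_from(inverted_index)&lt;/code&gt; returns a hash of symbols to id arrays.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical custom generator: matches word endings instead of beginnings.
class StarInFrontSubstringPartial
  def generate_from(inverted_index)
    partial = Hash.new { |hash, key| hash[key] = [] }
    inverted_index.each do |sym, ids|
      word = sym.to_s
      # Drop 0, 1, 2, ... characters from the front: picky, icky, cky, ky, y.
      word.length.times do |start|
        partial[word[start..-1].to_sym] |= ids
      end
    end
    partial
  end
end

partial = StarInFrontSubstringPartial.new.generate_from({ picky: [1, 5, 11] })

partial.keys  # [:picky, :icky, :cky, :ky, :y]
partial[:cky] # [1, 5, 11]&lt;/code&gt;&lt;/pre&gt;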
&lt;p&gt;Next up is weighing symbols.&lt;/p&gt;
&lt;h2 id=&quot;weights&quot;&gt;Weight Index&lt;/h2&gt;
&lt;p&gt;Weights are assigned to all the symbols and are used to weigh the results.&lt;/p&gt;
&lt;p&gt;A weight generator also implements &lt;code&gt;#generate_from(inverted_index)&lt;/code&gt;, but the hash it returns has weights as values instead of id arrays.&lt;/p&gt;
&lt;p&gt;So, a weight index derived from the above inverted index might look like this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;{
  :hello =&amp;gt; 0.6,
  :picky =&amp;gt; 0.48
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The default weight index generator is &lt;code&gt;Picky::Weights::Default&lt;/code&gt;, which is equal to the &lt;code&gt;Picky::Weights::Logarithmic&lt;/code&gt;.&lt;/p&gt;
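&lt;p&gt;The numbers in the example above are consistent with a base-10 logarithm of the number of ids per symbol (4 ids for &lt;code&gt;:hello&lt;/code&gt;, 3 for &lt;code&gt;:picky&lt;/code&gt;). A sketch under that assumption follows. Which base and rounding Picky really uses is not stated here, and the class name is made up.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Sketch only: assumes base 10 and two decimals to match the example.
class SketchLogarithmicWeights
  def generate_from(inverted_index)
    weights = {}
    inverted_index.each do |sym, ids|
      # The more ids a symbol has, the higher its weight, but only slowly.
      weights[sym] = Math.log10(ids.size).round(2)
    end
    weights
  end
end

inverted = { hello: [1, 3, 2, 11], picky: [1, 5, 11] }
weights  = SketchLogarithmicWeights.new.generate_from(inverted)
# weights is { hello: 0.6, picky: 0.48 }&lt;/code&gt;&lt;/pre&gt;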
&lt;p&gt;If you want all indexed words to be treated equally, you&amp;#8217;d pass in something like this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class EqualWeightsForAll

  def generate_from inverted_index
    equality = {}
    inverted_index.each do |sym, ids|
      equality[sym] = 0
    end
    equality
  end

end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When would you write your own? For example, you&amp;#8217;d like words that occur more often to be more important. You could implement a &lt;code&gt;LinearWeight&lt;/code&gt; – the weight is equal to the size of the ids array.&lt;/p&gt;
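&lt;p&gt;A minimal sketch of such a &lt;code&gt;LinearWeight&lt;/code&gt;, following the same &lt;code&gt;#generate_from&lt;/code&gt; contract as the class above:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class LinearWeight
  def generate_from(inverted_index)
    weights = {}
    inverted_index.each do |sym, ids|
      weights[sym] = ids.size # one point per indexed id
    end
    weights
  end
end

inverted = { hello: [1, 3, 2, 11], picky: [1, 5, 11] }
weights  = LinearWeight.new.generate_from(inverted)
# weights is { hello: 4, picky: 3 }&lt;/code&gt;&lt;/pre&gt;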
&lt;p&gt;That&amp;#8217;s it!&lt;/p&gt;
&lt;h2 id=&quot;similarity&quot;&gt;Similarity Index&lt;/h2&gt;
&lt;p&gt;The similarity index should have the structure &lt;code&gt;:encoded_symbol =&amp;gt; [:original_symbols_from_inverted_index]&lt;/code&gt;: each encoded symbol points to the original symbols that share that encoding. For example, the originals could have been encoded with the metaphone algorithm.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;{
  :HL =&amp;gt; [:hello],
  :PK =&amp;gt; [:picky]
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;:HL&lt;/code&gt; is the encoded symbol for &lt;code&gt;:hello&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;To generate this index, just offer a &lt;code&gt;generate_from(inverted_index)&lt;/code&gt; and an &lt;code&gt;encoded(original_symbol) # =&amp;gt; encoded_symbol&lt;/code&gt; method.&lt;/p&gt;
&lt;p&gt;If you have a phonetic encoding, you could just implement &lt;code&gt;encoded(original_symbol)&lt;/code&gt; and derive from &lt;code&gt;Picky::Generators::Similarity::Phonetic&lt;/code&gt;, like in &lt;a href=&quot;http://github.com/floere/picky/blob/master/server/lib/picky/generators/similarity/metaphone.rb&quot;&gt;this example&lt;/a&gt;.&lt;/p&gt;
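&lt;p&gt;If you would rather write both methods yourself, a sketch could look like this. The encoding is a made-up toy that is far cruder than metaphone, and &lt;code&gt;ToySimilarity&lt;/code&gt; is not part of Picky. It only illustrates how the two methods fit together.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class ToySimilarity
  # Made-up encoding: treat ck like k, drop vowels and y,
  # collapse doubled letters, then upcase.
  def encoded(original_symbol)
    word = original_symbol.to_s.downcase.gsub('ck', 'k')
    word.delete('aeiouy').squeeze.upcase.to_sym
  end

  # Groups all original symbols under their encoded symbol.
  def generate_from(inverted_index)
    similar = Hash.new { |hash, key| hash[key] = [] }
    inverted_index.each_key do |sym|
      similar[encoded(sym)].push(sym)
    end
    similar
  end
end

index      = { hello: [1, 3, 2, 11], hallo: [7], picky: [1, 5, 11] }
similarity = ToySimilarity.new.generate_from(index)

similarity[:HL] # [:hello, :hallo]
similarity[:PK] # [:picky]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Because the same &lt;code&gt;encoded&lt;/code&gt; method can be applied to a query word, a search for a misspelt word that encodes to &lt;code&gt;:HL&lt;/code&gt; would lead back to &lt;code&gt;:hello&lt;/code&gt; and &lt;code&gt;:hallo&lt;/code&gt;.&lt;/p&gt;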
&lt;p&gt;When would you use this? For example, you&amp;#8217;d like to implement a Chinese tone similarity algorithm instead of the more Western-oriented ones that come with Picky.&lt;/p&gt;
&lt;p&gt;(If you do, please send us a pull request)&lt;/p&gt;
&lt;p&gt;What can I do again?&lt;/p&gt;
&lt;h2 id=&quot;summary&quot;&gt;In short&lt;/h2&gt;
&lt;p&gt;Picky lets you inject your own functionality.&lt;/p&gt;
&lt;p&gt;You pass options &lt;code&gt;partial&lt;/code&gt;, &lt;code&gt;weights&lt;/code&gt;, and &lt;code&gt;similarity&lt;/code&gt; to the &lt;code&gt;category&lt;/code&gt; method inside an index block. You give it an instance either of the built-in types or create your own.&lt;/p&gt;
&lt;p&gt;Like so:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;category   :title,
           weights:    Picky::Weights::Logarithmic.new,            # Default
           partial:    Picky::Partial::Substring.new(:from =&amp;gt; -3), # Default
           similarity: Picky::Similarity::DoubleMetaphone.new(2)   # Default is ::None.&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Or with your own:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;category   :title,
           weights:    AllWeightsAreOne.new,            # Your own weights
           partial:    StarInFrontSubstringPartial.new, # Your own partial
           similarity: JapaneseSimilarity.new           # Your own similarity&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Creating your own. How?&lt;/h3&gt;
&lt;h4&gt;Partial&lt;/h4&gt;
&lt;p&gt;Implement method &lt;code&gt;#generate_from(inverted_index)&lt;/code&gt; which returns an inverted index with &lt;code&gt;{ :partial_symbol =&amp;gt; [ids array] }&lt;/code&gt;.&lt;/p&gt;
&lt;h4&gt;Weights&lt;/h4&gt;
&lt;p&gt;Implement method &lt;code&gt;#generate_from(inverted_index)&lt;/code&gt; which returns an inverted index with &lt;code&gt;{ :original_symbol =&amp;gt; some_weight_number }&lt;/code&gt;.&lt;/p&gt;
&lt;h4&gt;Similarity&lt;/h4&gt;
&lt;p&gt;Implement method &lt;code&gt;#generate_from(inverted_index)&lt;/code&gt; which returns an inverted index with &lt;code&gt;{ :encoded_symbol =&amp;gt; [:original_sym1, :original_sym2] }&lt;/code&gt;.
Also implement &lt;code&gt;encoded(original_symbol)&lt;/code&gt;, which returns an encoded symbol. The encoded symbol should correspond to the one in the returned inverted index.&lt;/p&gt;
&lt;h2 id=&quot;nextup&quot;&gt;Next up?&lt;/h2&gt;
&lt;p&gt;This is how you customize the derived indexes.&lt;/p&gt;
&lt;p&gt;There&amp;#8217;s much more. Next time we will be writing about tokenizing and character substituters!&lt;/p&gt;
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;that Picky is all Ruby, all the time.&lt;/li&gt;
	&lt;li&gt;that you can customize the indexes a lot.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">James&amp;#58;&amp;nbsp;Code&amp;nbsp;Brawl</title>
   <link href="http://florianhanke.com/blog/2011/07/13/james-code-brawl.html"/>
   <updated>2011-07-13T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/07/13/james-code-brawl</id>
   <content type="html">&lt;h2&gt;First Rule: You do not talk about Code Brawl&lt;/h2&gt;
&lt;p&gt;Mischief. Mayhem. Ruby.&lt;/p&gt;
&lt;p&gt;You might have read all about James in the previous post…
Thanks to &lt;a href=&quot;http://jeffkreeftmeijer.com/&quot;&gt;Jeff Kreeftmeijer&lt;/a&gt;,
now is your chance to show off with whatever crazy dialog you can come up with!&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s only after we&amp;#8217;ve lost everything that we&amp;#8217;re free to do anything.&lt;/p&gt;
&lt;p&gt;Will you install an Asterisk phone system that will make you able to call James at home where he will
do various things for you, like switch lights on/off, feed the hamster, or yell at the kids?&lt;/p&gt;
&lt;p&gt;Or will you go the way of the informative, connecting it to your local train information system,
so that James can say &amp;#8220;Dude, you should run!&amp;#8221; if you ask him &amp;#8220;When does my train go&amp;#8221;?&lt;/p&gt;
&lt;p&gt;OR will you program some sort of voice based text adventure like &lt;a href=&quot;http://en.wikipedia.org/wiki/Zork&quot;&gt;Zork&lt;/a&gt;,
where you control the main character by the powers of your voice only?&lt;/p&gt;
&lt;p&gt;Go here and fulfil your wildest dreams of &lt;a href=&quot;http://codebrawl.com/contests/james-your-very-own-voice-commanded-servant&quot;&gt;talking to a computer&lt;/a&gt;.
Or here, and &lt;a href=&quot;http://github.com/floere/james/wiki&quot;&gt;enter some ideas&lt;/a&gt; if you only feel like thinking, but not typing.&lt;/p&gt;
&lt;p&gt;Without pain, without sacrifice, we would have nothing.&lt;/p&gt;
&lt;p&gt;No shirts, no shoes. If this is your first night at Code Brawl, &lt;a href=&quot;http://codebrawl.com/contests/james-your-very-own-voice-commanded-servant&quot;&gt;you have to brawl&lt;/a&gt;!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">James</title>
   <link href="http://florianhanke.com/blog/2011/06/15/james.html"/>
   <updated>2011-06-15T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/06/15/james</id>
   <content type="html">&lt;h2&gt;tl;dr&lt;/h2&gt;
&lt;p&gt;This article contains stuff related to speech synthesis:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;What the Amiga 1000 could do.&lt;/li&gt;
	&lt;li&gt;The &lt;a href=&quot;http://www.youtube.com/watch?v=Hh3C0vyyttk&quot;&gt;famous Scotty scene&lt;/a&gt; where he talks into a mouse.&lt;/li&gt;
	&lt;li&gt;Speech Synthesis is hard.&lt;/li&gt;
	&lt;li&gt;Have your Mac say something.&lt;/li&gt;
	&lt;li&gt;Better voices for your Mac.&lt;/li&gt;
	&lt;li&gt;James, a non-walking, talking butler, a dialog system, a MacRuby gem.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Intro&lt;/h2&gt;
&lt;p&gt;As far back as I can remember, I always wanted to be a gangster.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;cough&lt;/strong&gt; Let&amp;#8217;s try that again…&lt;/p&gt;
&lt;p&gt;When I was around 8, my dad and I went shopping for an Amiga 1000.&lt;/p&gt;
&lt;p&gt;Here it is in its full glory:&lt;/p&gt;
&lt;p&gt;&lt;iframe width=&quot;600&quot; height=&quot;493&quot; src=&quot;http://www.youtube.com/embed/ovi4KC-PkRE&quot; frameborder=&quot;0&quot; allowfullscreen&gt;&lt;/iframe&gt;&lt;/p&gt;
&lt;p&gt;I&amp;#8217;m pretty sure I heard these synthesized organs when unwrapping it! :)&lt;/p&gt;
&lt;p&gt;Now, apart from the incredible bouncing ball and the amazing 4096 colors it had (8-year-old me is writing this), it could synthesize speech. Skip to 0:35 to see the guy enter some text for the Amiga to speak.&lt;/p&gt;
&lt;p&gt;&lt;iframe width=&quot;600&quot; height=&quot;493&quot; src=&quot;http://www.youtube.com/embed/6cyZ99W9QL0&quot; frameborder=&quot;0&quot; allowfullscreen&gt;&lt;/iframe&gt;&lt;/p&gt;
&lt;p&gt;Doesn&amp;#8217;t sound much worse than what you get on a Mac these days. Run this in a Terminal:&lt;/p&gt;
&lt;pre class=&quot;sh_shell&quot;&gt;&lt;code&gt;say 'Hello there, sexy!'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Why isn&amp;#8217;t it much better these days? &lt;a href=&quot;http://en.wikipedia.org/wiki/Speech_synthesis&quot;&gt;Speech Synthesis&lt;/a&gt; is hard.&lt;/p&gt;
&lt;p&gt;Not only that, but it needs to be done for each language separately. Chinese intonation is complicated, for example, and real people don&amp;#8217;t pronounce the four pitched tones in the same way. They&amp;#8217;re pronounced differently or not at all, depending on which tone went before and which came after, and also on the mood and health of the speaker.&lt;/p&gt;
&lt;p&gt;On &lt;span class=&quot;caps&quot;&gt;OSX&lt;/span&gt;, there are two possibilities to improve the existing voices. Try the demos:
&lt;a href=&quot;http://www.assistiveware.com/ivoxsamples.php&quot;&gt;AssistiveWare iVox Samples&lt;/a&gt; and &lt;a href=&quot;https://www.cepstral.com/demos/&quot;&gt;Cepstral Demos&lt;/a&gt;. I prefer iVox for European voices. Love the French &amp;amp; Swedish women. … voices, I mean.&lt;/p&gt;
&lt;p&gt;But still, even if it has a long way to go, you can already use this in clever ways:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;http://imgs.xkcd.com/comics/im_an_idiot.png&quot; width=&quot;600px&quot;&gt;&lt;/img&gt;&lt;/p&gt;
&lt;p&gt;Best &lt;a href=&quot;http://xkcd.com&quot;&gt;xkcd&lt;/a&gt; ever!&lt;/p&gt;
&lt;p&gt;But apart from playful applications, speech synthesis is very important. Many people &lt;a href=&quot;http://www.assistiveware.com/videos.php&quot;&gt;rely on it&lt;/a&gt; every day.&lt;/p&gt;
&lt;h2&gt;James&lt;/h2&gt;
&lt;p&gt;Imagine you are an 8-year-old boy who wants to control a computer using only his voice – or imagine you are in pain, need to sit down often, and don&amp;#8217;t always have a device with you.&lt;/p&gt;
&lt;p&gt;For this, I wrote &lt;a href=&quot;http://github.com/floere/james&quot;&gt;James&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Get the gem for &lt;a href=&quot;http://www.macruby.org/&quot;&gt;MacRuby&lt;/a&gt;.&lt;/p&gt;
&lt;pre class=&quot;sh_shell&quot;&gt;&lt;code&gt;$ rvm use macruby
$ gem install james&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Create a file called &lt;code&gt;time_dialog.rb&lt;/code&gt; and copy this code into it:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;James.dialog do

  hear 'What time is it?' =&amp;gt; :time

  state :time do
    hear ['What time is it?', 'And now?'] =&amp;gt; :time
    into { time = Time.now; &quot;It is currently #{time.hour} #{time.min}.&quot; }
    exit {} # Optional, listed for completeness.
  end

end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;then run it using&lt;/p&gt;
&lt;pre class=&quot;sh_shell&quot;&gt;&lt;code&gt;james time_dialog.rb&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The Terminal will show you the available options.&lt;/p&gt;
&lt;p&gt;This is a dialog consisting only of one state, &lt;code&gt;time&lt;/code&gt;. The dialog (and &lt;code&gt;time&lt;/code&gt; state) is entered when saying &amp;#8220;What time is it?&amp;#8221;. When it enters, it will say the current time, or whatever is returned by the &lt;code&gt;into&lt;/code&gt; block.&lt;/p&gt;
&lt;p&gt;James already provides a simple entry dialog to control where you are. &amp;#8220;Thanks, James&amp;#8221; for example will exit the current dialog.&lt;/p&gt;
&lt;p&gt;Easy, isn&amp;#8217;t it?&lt;/p&gt;
&lt;p&gt;If you want more dialogs, just load more:&lt;/p&gt;
&lt;pre class=&quot;sh_shell&quot;&gt;&lt;code&gt;james {time,twitter,stocks}_dialog.rb&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That&amp;#8217;s it! You can write more complex dialogs, but this is out of scope for this article.&lt;/p&gt;
&lt;p&gt;More &lt;a href=&quot;http://github.com/floere/james/tree/master/examples&quot;&gt;examples&lt;/a&gt; and &lt;a href=&quot;http://github.com/floere/james/wiki&quot;&gt;ideas for examples&lt;/a&gt;. Just add your own, if you want :)&lt;/p&gt;
&lt;h2&gt;How about…?&lt;/h2&gt;
&lt;p&gt;So if you&amp;#8217;ve written up a few nice James dialogs, why not take that old MacMini, install MacRuby and James, attach a few microphones, and distribute them around the house?&lt;/p&gt;
&lt;h2&gt;Closing&lt;/h2&gt;
&lt;p&gt;I&amp;#8217;m looking forward to the day when I can perform basic operations like looking up the weather etc. while eating breakfast, without having to context switch.&lt;/p&gt;
&lt;p&gt;&amp;#8220;James?&amp;#8221;&lt;/p&gt;
&lt;p&gt;&amp;#8220;Yes?&amp;#8221;&lt;/p&gt;
&lt;p&gt;&amp;#8220;What is the weather going to be like today?&amp;#8221;&lt;/p&gt;
&lt;p&gt;&amp;#8220;Warm and sunny.&amp;#8221;&lt;/p&gt;
&lt;p&gt;&amp;#8220;Great! I&amp;#8217;ll be outside, doing some cycling then.&amp;#8221;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;doors slam one by one&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&amp;#8220;I&amp;#8217;m sorry Dave, I&amp;#8217;m afraid I can&amp;#8217;t allow that.&amp;#8221;&lt;/p&gt;
&lt;p&gt;&amp;#8220;Not again! You #$&amp;amp;@@^%!&amp;#8221;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;James keeps silent&lt;/strong&gt;&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;#58;&amp;nbsp;Designing&amp;nbsp;an ORM&amp;nbsp;Integration&amp;nbsp;1</title>
   <link href="http://florianhanke.com/blog/2011/05/30/picky-designing-orm-integration-1.html"/>
   <updated>2011-05-30T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/05/30/picky-designing-orm-integration-1</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; series on its workings.&lt;/p&gt;
&lt;p&gt;In this post, I want you to peek over my shoulder as I go through some of my thoughts regarding Picky &lt;span class=&quot;caps&quot;&gt;ORM&lt;/span&gt; integration.&lt;/p&gt;
&lt;h2&gt;tl;dr&lt;/h2&gt;
&lt;p&gt;Picky needs to be more accessible. How can we do this? We provide a simple &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; to be used in an ActiveModel which provides indexing and searching.&lt;/p&gt;
&lt;p&gt;The result: A possible Picky &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;.&lt;/p&gt;
&lt;h2&gt;Intro&lt;/h2&gt;
&lt;p&gt;Now Picky is cool, sports quite a few features, and is written in Ruby so you can easily extend it. I also think it fills a feature gap that &amp;#8220;Generic Search Engine X&amp;#8221; and &amp;#8220;Hyperfast Russian Text Looker-Througher&amp;#8221; (I write this lovingly) do not address. Etc etc, yadda yadda.&lt;/p&gt;
&lt;p&gt;So what is the problem I&amp;#8217;m addressing?&lt;/p&gt;
&lt;p&gt;El problemo: Picky is not as &lt;strong&gt;accessible&lt;/strong&gt; as other search engines.&lt;/p&gt;
&lt;p&gt;What do I mean by accessible?&lt;/p&gt;
&lt;h2&gt;Accessibility?&lt;/h2&gt;
&lt;p&gt;One example of accessibility is &lt;a href=&quot;http://karmi.cz&quot;&gt;Karel Minařik&amp;#8217;s&lt;/a&gt; &lt;a href=&quot;http://github.com/karmi/tire&quot;&gt;Tire frontend&lt;/a&gt; for ElasticSearch.&lt;/p&gt;
&lt;p&gt;He did a great job in making it accessible through &lt;a href=&quot;http://gist.github.com/951343&quot;&gt;this script&lt;/a&gt;. The gist installs Rails &amp;amp; ElasticSearch in one fell swoop.
Let&amp;#8217;s call this kind of accessibility the &amp;#8220;Boom&amp;#8221; factor.&lt;/p&gt;
&lt;p&gt;Remember Steve Jobs? &amp;#8220;Boom&amp;#8221; this and &amp;#8220;Boom&amp;#8221; that. Magique!&lt;/p&gt;
&lt;p&gt;Now, sure, Picky does have a &lt;a href=&quot;http://florianhanke.com/picky/getting_started.html&quot;&gt;Getting Started&lt;/a&gt; that does exactly that in 5 minutes, including an in-site manual. And to be fair, it &lt;em&gt;also&lt;/em&gt; generates the views including a full search interface.&lt;/p&gt;
&lt;p&gt;But still. The question remains: If I have an existing Rails app, how does this work? Can&amp;#8217;t I just add Picky to my model and have a search?&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Person
  pickify
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and then&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Person.indexes(:mi5, :cia, :kgb).offset(30).search 'bond, james'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Not yet. I do have my reservations about this approach (see last post), but I see its appeal: People have a nice starting point to get into the finer details of searching (which is exactly what I want people to do – build better searches!).&lt;/p&gt;
&lt;p&gt;In short: Picky needs to up its Boom Factor!&lt;/p&gt;
&lt;h2&gt;The Boom Factor&lt;/h2&gt;
&lt;p&gt;Between us and going to Boom Factor 11 stands a lot of code.&lt;/p&gt;
&lt;p&gt;But before the code, a lot of thinking of how the code is supposed to look.&lt;/p&gt;
&lt;p&gt;And before we can even begin to think, we should know what we want, and what information we need in the &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;.&lt;/p&gt;
&lt;h2&gt;What do we want?&lt;/h2&gt;
&lt;p&gt;A few things:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;We want a nice &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;, which &amp;#8220;helps the user find what he wants&amp;#8221; (The sacred Picky design goal).&lt;/li&gt;
	&lt;li&gt;We want it to interact nicely with &lt;a href=&quot;http://yehudakatz.com/2010/01/10/activemodel-make-any-ruby-object-feel-like-activerecord/&quot;&gt;ActiveModel&lt;/a&gt;.&lt;/li&gt;
	&lt;li&gt;We also want to make it easy in a controller to interact with the Picky JavaScript interface.&lt;/li&gt;
	&lt;li&gt;We&amp;#8217;d also like to have the juiciest food the whole of France has to offer, but this is another story completely.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That is what we want. What information do we need?&lt;/p&gt;
&lt;h2&gt;What do we need?&lt;/h2&gt;
&lt;p&gt;We need different things for searching and for indexing.&lt;/p&gt;
&lt;p&gt;For searching, we need to be able to tell Picky:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;how to prepare the search text.&lt;/li&gt;
	&lt;li&gt;which indexes to search.&lt;/li&gt;
	&lt;li&gt;the offset the results should have.&lt;/li&gt;
	&lt;li&gt;what to search (obviously).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Quite a bit of information!&lt;/p&gt;
&lt;p&gt;For indexing, we need to be able to tell Picky:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;how to prepare the text to be indexed.&lt;/li&gt;
	&lt;li&gt;which index(es) to save it to.&lt;/li&gt;
	&lt;li&gt;how to categorize the data.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Not bad either…&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s try a few variations!&lt;/p&gt;
&lt;h2&gt;&lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; Designs&lt;/h2&gt;
&lt;p&gt;All this goes into a special gem called &lt;code&gt;picky-activemodel&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s say we start with the obvious, telling the class that it can be pickified.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Person
  include Picky
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is snappy and short. Maybe too short? Let&amp;#8217;s take a look at indexing.&lt;/p&gt;
&lt;h3&gt;Indexing&lt;/h3&gt;
&lt;p&gt;Since Picky does not yet offer incremental indexing (most people don&amp;#8217;t need it even if they think so), we&amp;#8217;d have to provide an explicit &lt;code&gt;index!&lt;/code&gt; method of sorts.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Person.index!&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;But how would we define the indexing? In Picky you can define index text preparation for all indexes, for each index separately, even for each category separately.&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s see. (Using just split_on in the example)&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Person
  include Picky

  index.split_on /[\s]/

  index do
    split_on /\W/

    category :first_name do
      split_on /\s/
      partial :substring, 1
    end
    category :name do
      from :last_name
    end
  end

  index :advertisements do
    split_on /\s/
    category :last_name do
      qualifiers [:ad_name, :an]
    end
  end
end

Person.index!&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That means that generally, index text is split on &lt;code&gt;/\s/&lt;/code&gt;. Then we make an index with the implicitly pluralized name &lt;code&gt;&quot;persons&quot;&lt;/code&gt;, which splits on &lt;code&gt;/\W/&lt;/code&gt;. It indexes two categories. The first is the first name, which is split in its own way and indexed for partial searching:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;category :first_name do
  split_on /\s/
  partial :substring, 1
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There&amp;#8217;s an interesting question there: Should it be&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;partial :substring, 1&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;using a weak symbol/number parameter based config or a more powerful&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;partial Picky::Partial::Substring.new(1)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;with the problem that we now need the Substring class defined not only in Picky, but also in the &lt;code&gt;picky-activemodel&lt;/code&gt; gem.&lt;/p&gt;
&lt;p&gt;Not too easy indeed. I&amp;#8217;m not a big fan of String definitions. They&amp;#8217;re just so incredibly weak.&lt;/p&gt;
&lt;p&gt;Anyway, back to the example.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;category :name do
  from :last_name
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;What does this mean? It means that the data for category &lt;code&gt;:name&lt;/code&gt; is taken from the attribute &lt;code&gt;:last_name&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Further down, we have another index definition, &lt;code&gt;:advertisements&lt;/code&gt;, which is explicitly named.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;index :advertisements do&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Last but not least, we index explicitly using &lt;code&gt;Person.index!&lt;/code&gt;.&lt;/p&gt;
&lt;h3&gt;Searching&lt;/h3&gt;
&lt;p&gt;Searching is quite interesting.&lt;/p&gt;
&lt;p&gt;On the one hand, we could have a fluent interface for which indexes to search, and with what parameters. Let&amp;#8217;s look at it:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Person.search.indexes(:advertisements).offset(30).ids(20).with(&quot;Bond, James&quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;to search with text &amp;#8220;Bond, James&amp;#8221; in index :advertisements, getting 20 result ids starting after the first 30.&lt;/p&gt;
&lt;p&gt;The short form&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Person.search(&quot;Bond, James&quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;would be much more crisp, searching in the default, unnamed index with offset 0 and 20 result ids.&lt;/p&gt;
&lt;p&gt;This would not return an array of ids, but the Picky result hash, which contains weights, categories, totals, search duration.&lt;/p&gt;
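&lt;p&gt;To get a feel for how such a chain could work, here is a hypothetical sketch. Nothing in it exists in Picky: &lt;code&gt;SearchRequest&lt;/code&gt;, the &lt;code&gt;:default&lt;/code&gt; index name and the returned hash are placeholders. A real version would query Picky in &lt;code&gt;with&lt;/code&gt; and return the result hash described above.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical sketch of the fluent interface, not an existing Picky API.
class SearchRequest
  def initialize(options = {})
    # Defaults as described above: default index, offset 0, 20 result ids.
    @options = { indexes: [:default], offset: 0, ids: 20 }.merge(options)
  end

  def indexes(*names)
    with_option(indexes: names)
  end

  def offset(amount)
    with_option(offset: amount)
  end

  def ids(amount)
    with_option(ids: amount)
  end

  # Terminal call. This sketch just returns the collected parameters.
  def with(text)
    @options.merge(text: text)
  end

  private

  # Each step returns a new request, so partial chains can be reused.
  def with_option(option)
    SearchRequest.new(@options.merge(option))
  end
end

request = SearchRequest.new.indexes(:advertisements).offset(30).ids(20)
params  = request.with('Bond, James')
# params is { indexes: [:advertisements], offset: 30, ids: 20, text: 'Bond, James' }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Returning a fresh object from every step keeps half-built chains safe to share, which matters if you want to store one in a method or constant.&lt;/p&gt;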
&lt;p&gt;An alternative would be&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Person.search do
  indexes :advertisements
  offset  30
  ids     20
  with    &quot;Bond, James&quot;
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I&amp;#8217;m inclined to allow both forms, as well as any combination of the two.&lt;/p&gt;
&lt;p&gt;This was the easy part. But where do I tell Picky how to prepare the search text? (How to split and so on?)&lt;/p&gt;
&lt;p&gt;One idea is to put this in the model as well.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Person
  include Picky

  searching do
    split_on /\s/
  end

end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Sounds good, but is the way we prepare the search text really model-specific?&lt;/p&gt;
&lt;p&gt;Not really. Let&amp;#8217;s try the search request:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Person.search(&quot;Bond, James&quot;) do
  split_on /\s/
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Not too sexy either. Perhaps also chained?&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Person.search.split_on(/\s/).with(&quot;Bond, James&quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Could work but is too wordy.&lt;/p&gt;
&lt;p&gt;How about we use a simple method?&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Person
  def self.simple_splitting_search
    @simple_splitting_search ||= search.split_on(/\s/).removes_characters(/[\&amp;amp;\-]/)
  end
end

Person.simple_splitting_search.with(&quot;Bond, James&quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now this would be Ruby-esque! Methods and stuff. Who needs scopes? :)&lt;/p&gt;
&lt;p&gt;Also, the truly dynamic part would be exposed, the semi-fixed part would be summarized in the method name. Also one could decide to memoize it, as above.&lt;/p&gt;
&lt;p&gt;I think we can work with something like that.&lt;/p&gt;
&lt;p&gt;But the case where we just index a Person is the easy case. What if we also want to index its addresses, which are saved as a separate model, together in a single index?&lt;/p&gt;
&lt;h3&gt;Indexing relations&lt;/h3&gt;
&lt;p&gt;The best way in my humble opinion would be to define a very specific model, just for searching – to avoid cluttering the normal model, obey the &lt;acronym title=&quot;Single Responsibility Principle&quot;&gt;&lt;span class=&quot;caps&quot;&gt;SRP&lt;/span&gt;&lt;/acronym&gt;.&lt;/p&gt;
&lt;p&gt;But probably this is not what many people would want.&lt;/p&gt;
&lt;p&gt;So let&amp;#8217;s give it a go with the above-mentioned addresses relation:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Person
  include Picky

  index do
    category :first_name do
      # ...
    end
    category :street do
      from { addresses.map(&amp;amp;:street).join(&quot; &quot;) }
    end
  end

end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Yep. I wouldn&amp;#8217;t conjure up a complicated &lt;span class=&quot;caps&quot;&gt;DSL&lt;/span&gt;, but use the trusty &lt;code&gt;from&lt;/code&gt; method, and then just give it a block which is evaluated in each model instance, just taking the data the block returns.&lt;/p&gt;
&lt;h3&gt;Possible problems&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;search&lt;/code&gt; and &lt;code&gt;index&lt;/code&gt; methods could already have been installed by other libraries. So what could we do in this case?&lt;/p&gt;
&lt;p&gt;The Picky way of doing things would be to play nice:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Person
  include Picky

  picky.index do
    category :first_name do
      split_on /\s/
    end
  end

end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So if the &lt;code&gt;index&lt;/code&gt;, &lt;code&gt;index!&lt;/code&gt; or &lt;code&gt;search&lt;/code&gt; method was already installed, Picky would just install a method named &lt;code&gt;picky&lt;/code&gt; (presumably not yet taken) that acts as a proxy.&lt;/p&gt;
&lt;p&gt;Also in searching,&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Person.picky.search(&quot;Bond, James&quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;reads quite ok.&lt;/p&gt;
&lt;p&gt;One idea might be to call it &lt;code&gt;picky_search&lt;/code&gt;, but I&amp;#8217;m not too partial to that.&lt;/p&gt;
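&lt;p&gt;How such a fallback could be wired up is sketched here in plain Ruby. This is only an illustration under assumptions: &lt;code&gt;PickySketch&lt;/code&gt; and its &lt;code&gt;Proxy&lt;/code&gt; are made-up names, not the real Picky module, and the sketch assumes the proxy is always installed while the short &lt;code&gt;search&lt;/code&gt; form is only added when nobody else has defined it.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical sketch, not the real Picky module.
module PickySketch
  # The proxy carries the search, so it never clashes with existing methods.
  class Proxy
    def initialize(model)
      @model = model
    end

    def search(text)
      'Picky search on ' + @model.name + ' for ' + text
    end
  end

  def self.included(model)
    # Assumption: the proxy is made available in every case.
    model.define_singleton_method(:picky) { Proxy.new(model) }

    # Install the short form only if nobody else defined it first.
    unless model.respond_to?(:search)
      model.define_singleton_method(:search) { |text| picky.search(text) }
    end
  end
end

# No other library involved: both forms work.
class Person
  include PickySketch
end

# Another library already claimed search: only the proxy is added.
class Book
  def self.search(text)
    'some other library'
  end

  include PickySketch
end

puts Person.search('Bond, James')     # prints Picky search on Person for Bond, James
puts Book.search('Bond, James')       # prints some other library
puts Book.picky.search('Bond, James') # prints Picky search on Book for Bond, James&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The existing &lt;code&gt;Book.search&lt;/code&gt; is left alone, and the Picky search stays reachable through the proxy.&lt;/p&gt;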
&lt;p&gt;So yeah, hope you enjoyed looking over my shoulder. There&amp;#8217;s a lot to do still, but this looks like a hopeful start. I&amp;#8217;d give it a Boom Factor of 10 :)&lt;/p&gt;
&lt;p&gt;If you find any problems or have ideas, let me know in the comments!&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;how you might go about designing an &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;#58;&amp;nbsp;Plumbing&amp;nbsp;Overview</title>
   <link href="http://florianhanke.com/blog/2011/05/19/picky-plumbing-overview.html"/>
   <updated>2011-05-19T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/05/19/picky-plumbing-overview</id>
   <content type="html">&lt;p&gt;This is an (admittedly a bit ranty and chaotic, but bear with me – recipes will follow) post in the &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; series on its workings.&lt;/p&gt;
&lt;p&gt;I&amp;#8217;ve gotten a lot of feedback on Picky. Many people write in to tell me how cool everything looks, but often I don&amp;#8217;t hear how it is working out later.&lt;/p&gt;
&lt;p&gt;This led to me wondering if Picky is initially attracting users, but then losing them due to missing simple recipes on how everything is put together.&lt;/p&gt;
&lt;p&gt;Out of thin air I get this feedback:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&amp;#8220;for those just looking to get a glance at how the model, view and controller layers are set up for Picky there isn&amp;#8217;t much in your docs to give that high-level glance. […] but there wasn&amp;#8217;t anything in there […] detailing the actual plumbing that ties the app and data to picky.&amp;#8221;&lt;/em&gt; (ellipses mine)&lt;/p&gt;
&lt;p&gt;He&amp;#8217;s right.&lt;/p&gt;
&lt;p&gt;There is the &lt;a href=&quot;http://florianhanke.com/picky/getting_started.html&quot;&gt;overview image&lt;/a&gt; on the &lt;em&gt;getting started&lt;/em&gt; page, but it isn&amp;#8217;t very clear on how everything fits together.&lt;/p&gt;
&lt;p&gt;There is also the &lt;a href=&quot;http://github.com/floere/picky/wiki/Best-Practices-Setup&quot;&gt;best practices setup&lt;/a&gt; in &lt;a href=&quot;http://github.com/floere/picky/wiki/&quot;&gt;the Wiki&lt;/a&gt;, but that does not really show any code, just how it is connected on an abstract level.&lt;/p&gt;
&lt;p&gt;So, let me clear up a few things. This is the current state of how Picky is used:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/2011-05-19-overview.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;We have multiple areas:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;The Picky server (gem &lt;code&gt;picky&lt;/code&gt;) is a &lt;em&gt;standalone&lt;/em&gt; server. You can send it &lt;span class=&quot;caps&quot;&gt;HTTP&lt;/span&gt; requests and it will return &lt;span class=&quot;caps&quot;&gt;HTTP&lt;/span&gt; responses with a &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt; body.&lt;/li&gt;
	&lt;li&gt;The Picky Client (gem &lt;code&gt;picky-client&lt;/code&gt;) is a way to query the server comfortably using Ruby instead of having to put together the queries yourself.&lt;/li&gt;
	&lt;li&gt;You use this Picky Client in your webapp to get &lt;em&gt;result ids&lt;/em&gt; from the server.&lt;/li&gt;
	&lt;li&gt;Picky also offers a Javascript interface that can display rendered results and a result count. The results need to be rendered in the webapp, the server only returns result ids.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;strong&gt;absolute best way&lt;/strong&gt; to see all this in code and in action is to try the &lt;a href=&quot;http://florianhanke.com/picky/getting_started.html&quot;&gt;getting started&lt;/a&gt;. If you haven&amp;#8217;t tried it, do so now, run it, and take a look at the code (especially in the server &lt;code&gt;app/application.rb&lt;/code&gt;, in the client &lt;code&gt;app.rb&lt;/code&gt;, the Sinatra app).&lt;/p&gt;
&lt;h2&gt;Picky is &lt;span class=&quot;caps&quot;&gt;ORM&lt;/span&gt; agnostic&lt;/h2&gt;
&lt;p&gt;(This part is divided into my reasoning/ranting ;) for not offering &lt;span class=&quot;caps&quot;&gt;ORM&lt;/span&gt; support and code examples on how to handle this)&lt;/p&gt;
&lt;h3&gt;The &lt;span class=&quot;caps&quot;&gt;ORM&lt;/span&gt; rant&lt;/h3&gt;
&lt;p&gt;Most people trying Picky for the first time are expecting some sort of ActiveRecord or other &lt;span class=&quot;caps&quot;&gt;ORM&lt;/span&gt; integration.&lt;/p&gt;
&lt;p&gt;Let me tell you upfront: There is none. Yes, no requiring a gem and slapping on a module in Picky.&lt;/p&gt;
&lt;p&gt;Why? Many other search engine Ruby adapters offer some sort of nice &lt;span class=&quot;caps&quot;&gt;ORM&lt;/span&gt; support, which lets me easily search and find data.&lt;/p&gt;
&lt;p&gt;While I would &lt;strong&gt;love&lt;/strong&gt; to provide some sort of &lt;span class=&quot;caps&quot;&gt;ORM&lt;/span&gt; integration, let me tell you why I don&amp;#8217;t support an &lt;span class=&quot;caps&quot;&gt;ORM&lt;/span&gt; (yet):&lt;/p&gt;
&lt;p&gt;It costs a lot of effort/resources to do right, and I wanted to spend that time on making Picky good and giving it a great Javascript user interface.&lt;/p&gt;
&lt;p&gt;For me, the &lt;strong&gt;hard part is not loading the data from some model into the index&lt;/strong&gt; (that is mostly easy), &lt;strong&gt;but making a really good user interface and having the data indexed and searched really correctly&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;I always felt that comfortable &lt;span class=&quot;caps&quot;&gt;ORM&lt;/span&gt; integrations, while being comfortable, mostly hide the way your data is indexed.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;They provide you an easy solution to an easy problem.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;If your data is hard to index, your data might be too complicated, too normalized.&lt;/p&gt;
&lt;p&gt;Picky on the other hand, gives you the power of doing searching right. &lt;strong&gt;In Ruby.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Because search engines never work the same:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;The last search engine you built simply had different data.&lt;/li&gt;
	&lt;li&gt;There always will be edge cases, people not finding their data. Ever run &lt;code&gt;rake 'try[some words]'&lt;/code&gt; in the server directory? This will tell you exactly how Picky indexes these words, or preprocesses them before searching.&lt;/li&gt;
	&lt;li&gt;There always will be the pointy haired boss finding the way to your desk, asking why his best friend doesn&amp;#8217;t find X, but Y instead. This can be shown,  &lt;a href=&quot;http://florianhanke.com/blog/2011/04/17/picky-integration-testing.html&quot;&gt;integration tested&lt;/a&gt; and fixed in minutes. Result: Friend finds X.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Although it might be enticing to have a search set up really fast, you usually pay for it later, when it is all about making the search work really well and edge cases crop up (because most data is rather freeform).&lt;/p&gt;
&lt;p&gt;Then again, you might not care about all these edge cases or having a really good search. Then again, why are you reading this exactly?&lt;/p&gt;
&lt;h4&gt;&lt;span class=&quot;caps&quot;&gt;BIG&lt;/span&gt; &lt;strong&gt;&lt;span class=&quot;caps&quot;&gt;BUT&lt;/span&gt;&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Let me say though that I see the appeal of having an &lt;span class=&quot;caps&quot;&gt;ORM&lt;/span&gt; integration, and the next few months may see our efforts shifted towards having a Picky &lt;span class=&quot;caps&quot;&gt;ORM&lt;/span&gt; integration. This is a result of a long discussion with &lt;a href=&quot;http://github.com/karmi&quot;&gt;Karel Minařik&lt;/a&gt;, aka &lt;a href=&quot;http://github.com/karmi/tire&quot;&gt;Mr. Tire&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It will probably take place first in the form of having a flexible external interface in the server through which data is sent and indexed.&lt;/p&gt;
&lt;p&gt;The indexing definition would still be in the server, but the selection and sorting of data would be in the Rails / Sinatra etc. application.&lt;/p&gt;
&lt;p&gt;In short:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Your webapp &lt;strong&gt;selects and sorts&lt;/strong&gt; the data, sending it to the server.&lt;/li&gt;
	&lt;li&gt;The Picky server &lt;strong&gt;indexes&lt;/strong&gt; your data.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;But I need to think about this – your feedback is much appreciated!&lt;/p&gt;
&lt;h3&gt;How to index your Rails data&lt;/h3&gt;
&lt;p&gt;There are many ways to index your data. See &lt;a href=&quot;http://florianhanke.com/blog/2011/04/14/picky-two-point-two-point-oh.html&quot;&gt;the part under Flexible Sources&lt;/a&gt; which explains how to use the &lt;code&gt;#each&lt;/code&gt; method on your models to index.&lt;/p&gt;
&lt;h4&gt;Whatevs, pickle face! I want to index my models!&lt;/h4&gt;
&lt;p&gt;Don&amp;#8217;t give in to the rage. Ruby is your Jedi weapon.&lt;/p&gt;
&lt;p&gt;A few suggestions.&lt;/p&gt;
&lt;p&gt;You have a model &lt;code&gt;Book&lt;/code&gt; in your Rails app.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Book &amp;lt; ActiveRecord::Base
  # your supermodel
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and you&amp;#8217;d like to reuse this in Picky.&lt;/p&gt;
&lt;p&gt;Try this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Get the model.
#
require &quot;#{PICKY_ROOT}/../rails_app/app/models/book&quot;

# Get the database configuration from the Rails app.
#
db_config = YAML.load(File.open(&quot;#{PICKY_ROOT}/../rails_app/config/database.yml&quot;))

# Establish a connection using the right environment.
#
Book.establish_connection db_config[PICKY_ENVIRONMENT]

# Utilize the #each method on e.g. Book.some_named_scope to index.
#
book_index = Index::Memory.new :book_each do
  source     Book.order('title ASC')
  category   :title
  category   :author
  # ...
end
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Yes, sometimes the models are much more complicated, using &lt;code&gt;acts_as_something&lt;/code&gt; (or the modern versions thereof) and class methods from them.&lt;/p&gt;
&lt;p&gt;In that case, either require your rails app/environment, or just load the data from the database:&lt;/p&gt;
&lt;h4&gt;Relationship status: It&amp;#8217;s complicated&lt;/h4&gt;
&lt;p&gt;Sometimes you need to index a complex combination of data (with a &lt;code&gt;JOIN&lt;/code&gt; or so). For this you can use a database source in the server:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;book_index = Index::Memory.new :book_each do
  source     Sources::DB.new(
               'SELECT b.id, b.title, a.name
                FROM books b INNER JOIN authors a
                ON a.id = b.author_id',
               :file =&amp;gt; &quot;#{PICKY_ROOT}/rails_app/config/#{PICKY_ENVIRONMENT}/db.yml&quot;
             )
  category   :title
  category   :author
  # ...
end&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;The Picky server is a standalone server&lt;/h2&gt;
&lt;p&gt;The server (currently) is completely independent of your Rails / Sinatra / ActiveRecord application.&lt;/p&gt;
&lt;p&gt;That means it lives in a separate directory. It does not use your Rails environment.&lt;/p&gt;
&lt;p&gt;The server offers an &lt;span class=&quot;caps&quot;&gt;HTTP&lt;/span&gt; interface that returns a &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt; payload.&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s look at an example. In the server configuration &lt;code&gt;app/application.rb&lt;/code&gt; you will have a route defined:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;route %r{\A/media\Z} =&amp;gt; Search.new(books_index, mp3_index)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This does exactly what it says and will route search requests on &lt;code&gt;/media&lt;/code&gt; to a search using the &lt;code&gt;books_index&lt;/code&gt; and the &lt;code&gt;mp3_index&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;To directly query the server, you can use &lt;code&gt;curl&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;So, &lt;code&gt;curl 'localhost:8080/media?query=Pirates&amp;amp;ids=20&amp;amp;offset=0'&lt;/code&gt; will return e.g. the id of &amp;#8220;Pirates of the Caribbean&amp;#8221;.&lt;/p&gt;
&lt;p&gt;But it won&amp;#8217;t be just a list of the ids, but a &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt; response. Let&amp;#8217;s look at it:
&lt;pre class=&quot;sh_json&quot;&gt;&lt;code&gt;{
 &quot;allocations&quot;:[
  [&quot;books&quot;,8.56,13,[[&quot;title&quot;,&quot;pirates&quot;,&quot;Pirates&quot;]],[59,65,106,110,164,166,174,218,235,249,344,413,425]],
  [&quot;mp3s&quot;,5.48,241,[[&quot;title&quot;,&quot;pirates&quot;,&quot;Pirates&quot;]],[5,6,7,8,12,13,161]]
 ],
 &quot;offset&quot;: 0,
 &quot;duration&quot;: 0.009041,
 &quot;total&quot;: 254
}&lt;/code&gt;&lt;/pre&gt;&lt;/p&gt;
&lt;p&gt;We have several parts:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;allocations: In what index it was found, and also in what categories in that index, including the 20 top ids (in this example).&lt;/li&gt;
	&lt;li&gt;offset: The offset that was used to search.&lt;/li&gt;
	&lt;li&gt;duration: The time it took Picky to find the results.&lt;/li&gt;
	&lt;li&gt;total: The total number of result ids.&lt;/li&gt;
&lt;/ul&gt;
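&lt;p&gt;To get a feeling for that structure, here is a small sketch that picks the response apart with Ruby&amp;#8217;s standard &lt;code&gt;json&lt;/code&gt; library. It only uses the example body from above. Reading the five allocation fields as index name, score, count, matched categories and ids is my assumption from the example (the counts 13 and 241 add up to the total), not a documented format.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'json'

# The response body from the example above, as a plain string.
body = '{
 "allocations":[
  ["books",8.56,13,[["title","pirates","Pirates"]],[59,65,106,110,164,166,174,218,235,249,344,413,425]],
  ["mp3s",5.48,241,[["title","pirates","Pirates"]],[5,6,7,8,12,13,161]]
 ],
 "offset": 0,
 "duration": 0.009041,
 "total": 254
}'

response = JSON.parse(body)

# Assumed layout of one allocation:
# [index name, score, count, matched categories, ids]
ids_by_index = {}
response['allocations'].each do |index_name, _score, _count, _categories, ids|
  ids_by_index[index_name] = ids
end

# All returned ids, in allocation order.
all_ids = response['allocations'].flat_map { |allocation| allocation.last }

puts ids_by_index['mp3s'].inspect # prints [5, 6, 7, 8, 12, 13, 161]
puts all_ids.size                 # prints 20
puts response['total']            # prints 254&lt;/code&gt;&lt;/pre&gt;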
&lt;p&gt;Now, because it is a bit tedious to extract data from the &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt; string, we wrote…&lt;/p&gt;
&lt;h2&gt;The Picky client gem&lt;/h2&gt;
&lt;p&gt;The Picky client handles the wrapping of the query and the unwrapping of the result &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt; for you. For example, the command &lt;code&gt;picky search some_url&lt;/code&gt; or the integration tests use the client to make accessing the result data much easier.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;gem install picky-client&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;First, configure the client. It is always configured to point at a specific search (path):&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;MediaSearch = Picky::Client.new :host =&amp;gt; 'localhost', :port =&amp;gt; 8080, :path =&amp;gt; '/media'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now you can use it like this:
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = MediaSearch.search 'some query text', :ids =&amp;gt; 20, :offset =&amp;gt; 0&lt;/code&gt;&lt;/pre&gt;&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;results&lt;/code&gt; variable now simply holds a hash with the &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt; data. Extend it with &lt;code&gt;Picky::Convenience&lt;/code&gt; to get a few nice methods on this hash.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results.extend Picky::Convenience
results.ids # =&amp;gt; array of the ids
results.total # =&amp;gt; total number of ids (not just the 20)
results.empty? # =&amp;gt; Do we have results?&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Also nice is this one, which will take the result ids of the books, and load each corresponding Book model, then yield it to the block where you can render it:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results.populate_with Book do |book|
  book.to_s
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It&amp;#8217;s best if you look at it in the Sinatra example application from the &lt;a href=&quot;http://florianhanke.com/picky/getting_started.html&quot;&gt;Getting Started&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;that Picky is a standalone server.&lt;/li&gt;
	&lt;li&gt;that Picky does not yet offer an &lt;span class=&quot;caps&quot;&gt;ORM&lt;/span&gt; integration.&lt;/li&gt;
	&lt;li&gt;what you can do with the Picky client gem.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Phony&amp;#58;&amp;nbsp;Phone&amp;nbsp;Numbers</title>
   <link href="http://florianhanke.com/blog/2011/05/01/phony-phone-numbers.html"/>
   <updated>2011-05-01T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/05/01/phony-phone-numbers</id>
   <content type="html">&lt;p&gt;This is a post about &lt;a href=&quot;http://florianhanke.com/phony/&quot;&gt;Phony 1.4.1+&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Overview&lt;/h2&gt;
&lt;ol&gt;
	&lt;li&gt;&lt;a href=&quot;#intro&quot;&gt;Intro&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;#problem&quot;&gt;The Problem&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;#phony&quot;&gt;Phony&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;#try&quot;&gt;Try it&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;#api&quot;&gt;Internal &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;#e164&quot;&gt;E.164&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;#model&quot;&gt;Model/Representation Aside – in ActiveRecord&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;#status&quot;&gt;Status&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;#endnote1&quot;&gt;Endnote 1&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;#endnote2&quot;&gt;Endnote 2&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;#conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&quot;intro&quot;&gt;Intro&lt;/h2&gt;
&lt;p&gt;Imagine…&lt;/p&gt;
&lt;p&gt;You own a little startup, which has created apps that were only relevant for the domestic market. Until now.&lt;/p&gt;
&lt;p&gt;Suddenly, the big breakthrough – your online car/music/housing/pet/houseboatlover&amp;#8217;s website has been an overnight (5+ yrs) success, and people demand it be available to customers all over the world.&lt;/p&gt;
&lt;p&gt;Coding goes very well, until suddenly one of your customers notices that their phone number is all awry. Instead of the melodious French 2-digit grouping &lt;code&gt;33 1 12 34 56 78&lt;/code&gt;, it is a horrible jumble of North American clumping: &lt;code&gt;3 (311) 234-5678&lt;/code&gt;. This is an outrage! Sacrebleu!&lt;/p&gt;
&lt;p&gt;France invades the US on the very next day. Freedom fries are forbidden and … well, you know how the story goes.&lt;/p&gt;
&lt;p&gt;This could all have been avoided if you had used &lt;a href=&quot;http://florianhanke.com/phony/&quot;&gt;Phony&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;problem&quot;&gt;The problem&lt;/h2&gt;
&lt;p&gt;The big problem is that countries all over the world have different ways of splitting and formatting their phone numbers.&lt;/p&gt;
&lt;p&gt;For example, Switzerland uses a 2-digit national destination code, like &lt;code&gt;+41 44 123 12 12&lt;/code&gt; – the &lt;code&gt;44&lt;/code&gt; is the national destination code, which originally was geographic in nature, but isn&amp;#8217;t anymore.&lt;/p&gt;
&lt;p&gt;Germany is different in that it has a variable length &lt;span class=&quot;caps&quot;&gt;NDC&lt;/span&gt;, from 1 to 5, for example Freiburg im Breisgau uses 3: &lt;code&gt;+49 761 476 7676&lt;/code&gt;, and Berlin uses 2: &lt;code&gt;+49 30 386 25454&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Denmark on the other hand has no &lt;span class=&quot;caps&quot;&gt;NDC&lt;/span&gt; at all. And let&amp;#8217;s not talk about Italy. No, let&amp;#8217;s not.&lt;/p&gt;
&lt;p&gt;You see? Big mess.&lt;/p&gt;
&lt;p&gt;Well, there is some standardization called &lt;a href=&quot;http://florianhanke.com/phony/e164.html&quot;&gt;E164&lt;/a&gt;, and I&amp;#8217;ll talk about it below. But first, Phony.&lt;/p&gt;
&lt;h2 id=&quot;phony&quot;&gt;Phony&lt;/h2&gt;
&lt;p&gt;Phony does the ugly and dirty work of correctly formatting international phone numbers for you.&lt;/p&gt;
&lt;p&gt;It can format, split, and normalize:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Austria: &lt;code&gt;Phony.format('43198110', :format =&amp;gt; :international, :spaces =&amp;gt; :-) # =&amp;gt; '+43-1-98110'&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;France: &lt;code&gt;Phony.split('33112345678') # =&amp;gt; ['33', '1', '12','34','56','78']&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;North America: &lt;code&gt;Phony.normalize('1 (703) 451-5115') # =&amp;gt; '17034515115'&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And it does it very fast. Each of these ops for 5 numbers is &lt;a href=&quot;http://github.com/floere/phony/blob/master/spec/lib/phony_spec.rb#L178-215&quot;&gt;around 1 10&amp;#8217;000th of a second&lt;/a&gt; on my &lt;span class=&quot;caps&quot;&gt;MBP&lt;/span&gt; using Ruby 1.9.2.&lt;/p&gt;
&lt;p&gt;You use normalizing before saving a phone number into a database etc.&lt;/p&gt;
&lt;p&gt;Splitting is helpful if you want to do your own special formatting, or remove certain parts.&lt;/p&gt;
&lt;p&gt;Although that is probably not needed, as Phony can take care of that for you: Formatting renders a number in international/national/local form, with zeroes, &lt;code&gt;00&lt;/code&gt;, plus &lt;code&gt;+&lt;/code&gt; and special spaces, if you need them (&lt;code&gt;&quot; &quot;&lt;/code&gt; is default).&lt;/p&gt;
&lt;p&gt;Look at &lt;a href=&quot;http://florianhanke.com/phony/examples.html&quot;&gt;a few more examples&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;try&quot;&gt;Try it&lt;/h2&gt;
&lt;p&gt;First, get the gem: &lt;code&gt;gem install phony&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Then,&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'phony'

p Phony.format('43198110', :format =&amp;gt; :international, :spaces =&amp;gt; :-) # =&amp;gt; '+43-1-98110'
p Phony.split('33112345678') # =&amp;gt; ['33', '1', '12','34','56','78']
p Phony.normalize('1 (703) 451-5115') # =&amp;gt; '17034515115'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;My country is not formatted correctly! What do I do?&lt;/p&gt;
&lt;h2 id=&quot;api&quot;&gt;Internal &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;Sometimes I have a nice document to go on, most of the time I don&amp;#8217;t, and not even in any of the languages or writing systems I know. Sometimes I simply made a mistake. This is where you can help Phony!&lt;/p&gt;
&lt;p&gt;To add your &amp;#8220;missing&amp;#8221; country, fork Phony and look at the &lt;a href=&quot;http://github.com/floere/phony/blob/master/lib/phony/countries.rb&quot;&gt;lib/phony/countries.rb file&lt;/a&gt;. It contains (almost) all the definitions. The more complicated ones – like Germany, Italy, etc. – are in their own files.&lt;/p&gt;
&lt;p&gt;The internal &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; uses a little &lt;span class=&quot;caps&quot;&gt;DSL&lt;/span&gt; to make managing and coding all the different formats easier.&lt;/p&gt;
&lt;p&gt;The phone numbers of France, for example, have a very elegant structure:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;country '33', fixed(1) &amp;gt;&amp;gt; split(2,2,2,2)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This says that the country with country code &lt;code&gt;33&lt;/code&gt; should have an &lt;span class=&quot;caps&quot;&gt;NDC&lt;/span&gt; of &lt;code&gt;fixed&lt;/code&gt; length &lt;code&gt;1&lt;/code&gt;, followed (&lt;code&gt;&amp;gt;&amp;gt;&lt;/code&gt;) by a national code that is &lt;code&gt;split&lt;/code&gt; into four groups of &lt;code&gt;2&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;As another quick example, the freshly added Slovakia:
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;country '421', match(/^(9\d\d).+$/) &amp;gt;&amp;gt; split(6) | # Mobile
               one_of('2')          &amp;gt;&amp;gt; split(8) | # Bratislava
               fixed(2)             &amp;gt;&amp;gt; split(7)&lt;/code&gt;&lt;/pre&gt;&lt;/p&gt;
&lt;p&gt;This says that Slovakia uses &lt;code&gt;421&lt;/code&gt; as country code. If a phone number with &lt;span class=&quot;caps&quot;&gt;NDC&lt;/span&gt; &lt;code&gt;9xx&lt;/code&gt; is found, &lt;code&gt;split&lt;/code&gt; the national part into one big part with &lt;code&gt;6&lt;/code&gt;
digits. If not, go and check if the &lt;span class=&quot;caps&quot;&gt;NDC&lt;/span&gt; is a &lt;code&gt;2&lt;/code&gt;, if yes, &lt;code&gt;split&lt;/code&gt; it into a thing with &lt;code&gt;8&lt;/code&gt; digits as national. If not, it must be a 2-digit &lt;span class=&quot;caps&quot;&gt;NDC&lt;/span&gt;, with 7 digits following.&lt;/p&gt;
&lt;p&gt;So:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;code&gt;421912123456 # =&amp;gt; 421 912 123456&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;421212345678 # =&amp;gt; 421 2 12345678&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;421371234567 # =&amp;gt; 421 37 1234567&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
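&lt;p&gt;Spelled out in plain Ruby, without the &lt;span class=&quot;caps&quot;&gt;DSL&lt;/span&gt;, the three Slovak rules read roughly like this. It is only an illustration of the rule order: &lt;code&gt;split_slovak&lt;/code&gt; is a made-up helper, not part of Phony, and it assumes an already normalized number starting with &lt;code&gt;421&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical helper: splits a normalized Slovak number
# into [country code, NDC, rest]. Not part of Phony.
def split_slovak(number)
  country_code = '421'
  national = number.sub(/\A421/, '') # everything after the country code

  if national.match(/\A9\d\d/)       # mobile: 3-digit NDC starting with 9
    ndc_length = 3
  elsif national.start_with?('2')    # Bratislava: 1-digit NDC
    ndc_length = 1
  else                               # everything else: 2-digit NDC
    ndc_length = 2
  end

  [country_code, national[0, ndc_length], national[ndc_length..-1]]
end

puts split_slovak('421912123456').join(' ') # prints 421 912 123456
puts split_slovak('421212345678').join(' ') # prints 421 2 12345678
puts split_slovak('421371234567').join(' ') # prints 421 37 1234567&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The order matters: the mobile rule is checked first, then Bratislava, and the 2-digit rule catches the rest.&lt;/p&gt;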
&lt;p&gt;The description of what matching/splitting is available is at the top of the file.&lt;/p&gt;
&lt;p&gt;First, add specs with a few example numbers, then fix, and send me a pull request. Get big thanks in &lt;a href=&quot;http://github.com/floere/phony/wiki/Contributors&quot;&gt;the contributors entries&lt;/a&gt;. Try to beat Keith Bingman! :)&lt;/p&gt;
&lt;p&gt;But let&amp;#8217;s get back to phone numbers.&lt;/p&gt;
&lt;h2 id=&quot;e164&quot;&gt;E.164&lt;/h2&gt;
&lt;p&gt;E.164, or E164 for short, is a recommendation which defines a numbering scheme and phone number formats. The &lt;a href=&quot;http://en.wikipedia.org/wiki/E.164&quot;&gt;Wikipedia entry&lt;/a&gt; is very helpful.&lt;/p&gt;
&lt;p&gt;For coders, there are 2 important facts to be gleaned:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;The maximum length is 15 digits.&lt;/li&gt;
	&lt;li&gt;The country code is a 1-3 digit &lt;a href=&quot;http://en.wikipedia.org/wiki/Prefix_code&quot;&gt;prefix code&lt;/a&gt;. This is defined in E164. After that it is a horrible mess.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So, in e.g. ActiveRecord you can exploit fact #1 like this:
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;t.string &quot;normalized_phone&quot;, :limit =&amp;gt; 15&lt;/code&gt;&lt;/pre&gt;&lt;/p&gt;
&lt;p&gt;Fact #2 is harder to exploit, and this is what Phony is here for.&lt;/p&gt;
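&lt;p&gt;Fact #1 can also be checked in plain Ruby before a number ever reaches the database. A rough sketch: &lt;code&gt;fits_e164_length?&lt;/code&gt; is a made-up helper, and stripping everything that is not a digit is only a crude stand-in for real normalization.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;E164_MAX_DIGITS = 15

# Hypothetical helper: does the number fit the E.164 length limit?
def fits_e164_length?(number)
  digits = number.gsub(/\D/, '') # keep the digits only
  digits.length.between?(1, E164_MAX_DIGITS)
end

puts fits_e164_length?('+41 44 123 12 12') # prints true
puts fits_e164_length?('1234567890123456') # prints false&lt;/code&gt;&lt;/pre&gt;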
&lt;h2 id=&quot;model&quot;&gt;Model/Representation Aside&lt;/h2&gt;
&lt;p&gt;Btw, if you have customers who want to enter specific phone numbers (like &amp;#8220;+34/123-(555)001!&amp;#8221;), you could code it up like this in ActiveRecord:&lt;/p&gt;
&lt;p&gt;Before saving, you could normalize it quickly if it is dirty, to see if it needs to be saved in the specific_phone attribute (if normalized != given_specific). This is just off the top of my head.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;def phone
  read_attribute(:specific_phone) || read_attribute(:normalized_phone)
end&lt;/code&gt;&lt;/pre&gt;
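&lt;p&gt;Putting the save-time check and the reader together, a plain Ruby sketch could look like this. It is not ActiveRecord: &lt;code&gt;UserSketch&lt;/code&gt; is a made-up class, and &lt;code&gt;gsub&lt;/code&gt; stands in for &lt;code&gt;Phony.normalize&lt;/code&gt; so that the example runs without the gem.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Plain Ruby stand-in for the model; not ActiveRecord.
class UserSketch
  attr_reader :normalized_phone, :specific_phone

  def phone=(given)
    normalized = given.gsub(/\D/, '') # crude stand-in for Phony.normalize
    @normalized_phone = normalized
    # Keep the customer's own formatting only if it differs.
    @specific_phone = (normalized == given) ? nil : given
  end

  def phone
    specific_phone || normalized_phone
  end
end

user = UserSketch.new
user.phone = '+34/123-(555)001!'
puts user.normalized_phone # prints 34123555001
puts user.phone            # prints +34/123-(555)001!

user.phone = '34123555001'
puts user.specific_phone.inspect # prints nil
puts user.phone                  # prints 34123555001&lt;/code&gt;&lt;/pre&gt;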
&lt;p&gt;Then, in the view, use e.g.:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;= Phony.format(user.phone)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Even better to use representers/&lt;a href=&quot;https://github.com/floere/view_models&quot;&gt;view models&lt;/a&gt;, in which you just define a method:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;def phone
  Phony.format(model.phone)
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then, in the view it becomes:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;= user.phone&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I really like that last line.&lt;/p&gt;
&lt;h2 id=&quot;status&quot;&gt;Status&lt;/h2&gt;
&lt;p&gt;At the time of this writing, we include 44 countries, and counting. See &lt;a href=&quot;http://github.com/floere/phony/blob/master/README.textile&quot;&gt;the &lt;span class=&quot;caps&quot;&gt;README&lt;/span&gt;&lt;/a&gt; for a list.&lt;/p&gt;
&lt;h2 id=&quot;endnote1&quot;&gt;Endnote 1&lt;/h2&gt;
&lt;p&gt;Q: Why are this dude&amp;#8217;s libraries named after negative attributes?&lt;/p&gt;
&lt;p&gt;A: No.&lt;/p&gt;
&lt;h2 id=&quot;endnote2&quot;&gt;Endnote 2&lt;/h2&gt;
&lt;p&gt;If I&amp;#8217;ve found out just one thing about phone numbers then it is this formula:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;1 / (standardization + well-oiled-bureaucracy) = phone-number-structure-mess-quantifier&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Switzerland has a well oiled bureaucracy, 1, but not a big drive for standardization, 0, = 1.&lt;/p&gt;
&lt;p&gt;France does not have a well oiled bureaucracy, 0, but a big drive for standardization, 1, = 1.&lt;/p&gt;
&lt;p&gt;For Italy, the result is around 1.825&amp;#215;10e7. Booo.&lt;/p&gt;
&lt;p&gt;A special thank you goes to Belgium which uses 4xx as its mobile phone prefix, but has a region, Liège, which uses 4 as its land line prefix. Belgium, do you know what a bloody prefix code is? &lt;span class=&quot;caps&quot;&gt;OTOH&lt;/span&gt;, this led me to rewrite Phony a second time, and all is much better.&lt;/p&gt;
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;that Phony can normalize a phone number.&lt;/li&gt;
	&lt;li&gt;that Phony can split a phone number into its constituent parts.&lt;/li&gt;
	&lt;li&gt;that Phony can format a phone number for you.&lt;/li&gt;
	&lt;li&gt;that it does all this very fast.&lt;/li&gt;
	&lt;li&gt;what E164 is.&lt;/li&gt;
	&lt;li&gt;what the lib status is.&lt;/li&gt;
	&lt;li&gt;that some countries &lt;span class=&quot;caps&quot;&gt;ARE&lt;/span&gt; better than others ;)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;#58;&amp;nbsp;Geosearch&amp;nbsp;2</title>
   <link href="http://florianhanke.com/blog/2011/04/26/picky-geosearch-2.html"/>
   <updated>2011-04-26T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/04/26/picky-geosearch-2</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; series on its workings.&lt;/p&gt;
&lt;p&gt;In this quick one I&amp;#8217;ll be using my own iPhone&amp;#8217;s geodata as data for a space/time Picky search.&lt;/p&gt;
&lt;p&gt;Lean back and enjoy the screencast.&lt;/p&gt;
&lt;h2&gt;Enjoy the show&lt;/h2&gt;
&lt;p&gt;I&amp;#8217;ll be searching time and space for my own footprints in Switzerland, Germany and Australia.&lt;/p&gt;
&lt;p&gt;Best viewed in full-screen. Warning: Safe for work with the possible exception of my voice, which has in the past triggered attacks by various animals/politicians.&lt;/p&gt;
&lt;p&gt;View &lt;a href=&quot;http://www.universalsubtitles.org/en/videos/Xdsz5BqRPH2q/info/&quot;&gt;with subtitles&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;iframe src=&quot;http://player.vimeo.com/video/22889123&quot; width=&quot;707&quot; height=&quot;927&quot; frameborder=&quot;0&quot;&gt;&lt;/iframe&gt;&lt;/p&gt;
&lt;p&gt;(When I say &amp;#8220;Apple is collecting&amp;#8221;, I mean &amp;#8220;&amp;#8216;Apple&amp;#8217; is collecting&amp;#8221; – the phone)&lt;/p&gt;
&lt;p&gt;So how do you get your iPhone&amp;#8217;s geodata?&lt;/p&gt;
&lt;h2&gt;iPhone geodata&lt;/h2&gt;
&lt;p&gt;First of all, let me direct you to a nice &lt;span class=&quot;caps&quot;&gt;OSX&lt;/span&gt; application:
&lt;a href=&quot;http://petewarden.github.com/iPhoneTracker/&quot;&gt;http://petewarden.github.com/iPhoneTracker/&lt;/a&gt;
This enables you to view your data nicely.&lt;/p&gt;
&lt;p&gt;The third question in the &lt;span class=&quot;caps&quot;&gt;FAQ&lt;/span&gt; explains how to get your data out of the phone:
&lt;a href=&quot;http://petewarden.github.com/iPhoneTracker/#2&quot;&gt;How can I examine the data without running the application?&lt;/a&gt;
(Also look at the updates)&lt;/p&gt;
&lt;p&gt;That&amp;#8217;s it. At the end you should have access to a SQLite database, from which I extracted &lt;span class=&quot;caps&quot;&gt;CSV&lt;/span&gt; data into the file &lt;code&gt;data/iphone_locations.csv&lt;/code&gt; (with header data removed).&lt;/p&gt;
&lt;p&gt;What did I do with the data?&lt;/p&gt;
&lt;h2&gt;The code&lt;/h2&gt;
&lt;p&gt;We&amp;#8217;ll first be looking at the server, then at the client.&lt;/p&gt;
&lt;h3&gt;Server&lt;/h3&gt;
&lt;p&gt;In the server, define an index like this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;iphone_locations = Index::Memory.new :iphone do
  source Sources::CSV.new(
    :mcc,
    :mnc,
    :lac,
    :ci,
    :timestamp,
    :latitude,
    :longitude,
    :horizontal_accuracy,
    :altitude,
    :vertical_accuracy,
    :speed,
    :course,
    :confidence,
    file: 'data/iphone_locations.csv'
  )
  ranged_category :timestamp, 86_400, precision: 5, qualifiers: [:ts, :timestamp]
  geo_categories  :latitude, :longitude, 25, precision: 3
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As you can see, I&amp;#8217;m only using timestamp, latitude and longitude. I listed all the other available data fields for completeness&amp;#8217; sake, in case I need to refer to one of them later on.&lt;/p&gt;
&lt;p&gt;The timestamp uses a &amp;#8220;radius&amp;#8221; of 86&amp;#8217;400 seconds (a day). That means it includes all results around the given timestamp in a range of ts-1.day..ts+1.day.&lt;/p&gt;
&lt;p&gt;It also sets a short qualifier (&amp;#8220;ts&amp;#8221;) so that the qualifier doesn&amp;#8217;t fill up the search input field, i.e. searching for &amp;#8220;ts:&amp;#8230;&amp;#8221; is equivalent to searching for &amp;#8220;timestamp:&amp;#8230;&amp;#8221;.&lt;/p&gt;
&lt;p&gt;The geodata uses &lt;code&gt;geo_categories&lt;/code&gt; (see last post), with 25 km as radius and an average precision of 3 (1 = low, 5 = high).&lt;/p&gt;
&lt;p&gt;You can now already search your data, e.g. with &lt;code&gt;curl 'localhost:8080/iphone?query=longitude:8.2'&lt;/code&gt;. Note that the timestamp data is saved as seconds since January 1st 2001 (as per the Apple data).&lt;/p&gt;
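&lt;p&gt;To get a feeling for these timestamps, here is a small sketch that converts a Ruby &lt;code&gt;Time&lt;/code&gt; into seconds since January 1st 2001 and back, and computes the one-day range around a timestamp. It assumes the epoch is in &lt;span class=&quot;caps&quot;&gt;UTC&lt;/span&gt;. The names &lt;code&gt;APPLE_EPOCH&lt;/code&gt;, &lt;code&gt;to_apple_timestamp&lt;/code&gt;, &lt;code&gt;from_apple_timestamp&lt;/code&gt; and &lt;code&gt;day_range&lt;/code&gt; are made up for this illustration and are not part of Picky.&lt;/p&gt;

```ruby
# Sketch only: converts between Ruby Time objects and the
# "seconds since January 1st 2001" timestamps described above.
# UTC is an assumption; all names here are hypothetical helpers,
# not Picky API.
APPLE_EPOCH = Time.utc(2001, 1, 1)
ONE_DAY     = 86_400 # seconds, the "radius" given to ranged_category

def to_apple_timestamp(time)
  (time - APPLE_EPOCH).to_i
end

def from_apple_timestamp(seconds)
  APPLE_EPOCH + seconds
end

# The range of timestamps covered by a one-day radius around ts.
def day_range(ts)
  (ts - ONE_DAY)..(ts + ONE_DAY)
end

ts = to_apple_timestamp(Time.utc(2011, 4, 26))
puts 'ts:' + ts.to_s
puts day_range(ts).inspect
```

&lt;p&gt;The printed &lt;code&gt;ts:&amp;#8230;&lt;/code&gt; string has the same shape as the qualified query the client inserts into the search field.&lt;/p&gt;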
&lt;h3&gt;Client&lt;/h3&gt;
&lt;p&gt;The client actually stayed almost exactly the same since the last blog post, with the geo data piggybacking on the results hash.&lt;/p&gt;
&lt;p&gt;The only notable addition is the HTML5 slider, which is a simple &lt;code&gt;input[type=range]&lt;/code&gt;, with a &lt;code&gt;change&lt;/code&gt; listener defined on it, which triggers the insertion of the (&amp;#8220;ts:&amp;#8221; qualified) search string.&lt;/p&gt;
&lt;p&gt;One problem I had was that I did not know that Javascript numbers months from 0 to 11, but leaves years as they are, so 1977 &lt;strong&gt;is&lt;/strong&gt; 1977, and not 1978, thankfully. Still, it is quite a stumbling block if you&amp;#8217;re unaware of it.&lt;/p&gt;
&lt;h2&gt;Finally&lt;/h2&gt;
&lt;p&gt;Have fun doing crazy space/time searches!&lt;/p&gt;
&lt;p&gt;… and don&amp;#8217;t run into time paradoxes. Those are nasty. Watch Back to the Future 1 for tips and tricks. First one is free: Learn to play an electric guitar.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;how to extract your iPhone&amp;#8217;s geodata.&lt;/li&gt;
	&lt;li&gt;that you can search space/time.&lt;/li&gt;
	&lt;li&gt;how you might write your own.&lt;/li&gt;
	&lt;li&gt;that Javascript Date handling – although lauded by many &lt;span class=&quot;caps&quot;&gt;PHP&lt;/span&gt; programmers – is crap.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;#58;&amp;nbsp;Geosearch&amp;nbsp;1</title>
   <link href="http://florianhanke.com/blog/2011/04/19/picky-geosearch-1.html"/>
   <updated>2011-04-19T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/04/19/picky-geosearch-1</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; series on its workings.&lt;/p&gt;
&lt;p&gt;Let me show you how to do a simple and fun geo search in Picky.&lt;/p&gt;
&lt;p&gt;But first, lean back.&lt;/p&gt;
&lt;h2&gt;Enjoy the show&lt;/h2&gt;
&lt;p&gt;The index contains around 21&amp;#8217;000 Swiss places, taken from Wikipedia.&lt;/p&gt;
&lt;p&gt;First, I click a little around – Picky gives me places around the clicked location.&lt;/p&gt;
&lt;p&gt;After that I show what happens if I just give Picky a latitude or a longitude. Then, combined with the place text, finally, just with the place text.&lt;/p&gt;
&lt;p&gt;You&amp;#8217;ll understand when you see it :)&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s best to switch to full-screen:&lt;/p&gt;
&lt;p&gt;&lt;iframe src=&quot;http://player.vimeo.com/video/22594668&quot; width=&quot;707&quot; height=&quot;726&quot; frameborder=&quot;0&quot;&gt;&lt;/iframe&gt;&lt;/p&gt;
&lt;p&gt;The blob in the middle is Switzerland, by the way ;)&lt;/p&gt;
&lt;p&gt;How do we do it?&lt;/p&gt;
&lt;h2&gt;The server code&lt;/h2&gt;
&lt;p&gt;The server … you could probably have done in your sleep if you&amp;#8217;ve been reading this blog diligently ;)&lt;/p&gt;
&lt;p&gt;The data comes from the &lt;span class=&quot;caps&quot;&gt;CSV&lt;/span&gt; file &lt;code&gt;data/swiss_places.csv&lt;/code&gt;&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;places = Index::Memory.new :geo do
  source         Sources::CSV.new(:location, :north, :east, file: 'data/swiss_places.csv')
  category       :location, partial: Partial::Substring.new(from: 1)
  geo_categories :north, :east, 1, precision: 3
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;What&amp;#8217;s interesting here is the &lt;code&gt;geo_categories&lt;/code&gt; method. It takes two categories, &lt;code&gt;north&lt;/code&gt;, and &lt;code&gt;east&lt;/code&gt;, which are both in the lat/lng format, e.g. &lt;code&gt;47.2&lt;/code&gt;, &lt;code&gt;8.3&lt;/code&gt;. (It also takes options &lt;code&gt;lat_from&lt;/code&gt;, and &lt;code&gt;lng_from&lt;/code&gt; if the categories don&amp;#8217;t have the same names as in the data source)&lt;/p&gt;
&lt;p&gt;Also, the 1 parameter in &lt;code&gt;geo_categories&lt;/code&gt; denotes that we search 1 km around the clicked location.&lt;/p&gt;
&lt;p&gt;This is actually the simple part. It does no exact calculation, but an approximate one that&amp;#8217;s most accurate in temperate zones. But as you see in the video, it works well, especially in a &amp;#8220;what&amp;#8217;s around me&amp;#8221; type search.&lt;/p&gt;
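&lt;p&gt;To illustrate what such an approximation could look like, here is a rough sketch that turns a radius in km into latitude and longitude ranges around a point. This is not Picky&amp;#8217;s actual implementation, just one common way to approximate it: roughly 111 km per degree of latitude, with longitude degrees shrinking by the cosine of the latitude. &lt;code&gt;approximate_box&lt;/code&gt; and &lt;code&gt;KM_PER_DEGREE&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```ruby
# Rough sketch, NOT Picky's actual implementation: turns a radius in km
# into approximate latitude/longitude ranges around a point.
KM_PER_DEGREE = 111.0 # approximate length of one degree of latitude

def approximate_box(lat, lng, radius_km)
  lat_delta = radius_km / KM_PER_DEGREE
  # Degrees of longitude shrink with the cosine of the latitude.
  lng_delta = radius_km / (KM_PER_DEGREE * Math.cos(lat * Math::PI / 180))
  {
    north: (lat - lat_delta)..(lat + lat_delta),
    east:  (lng - lng_delta)..(lng + lng_delta)
  }
end

box = approximate_box(47.2, 8.3, 1)
puts box[:north].inspect
puts box[:east].inspect
```

&lt;p&gt;A box like this ignores the curvature of the earth inside the box, which is fine for a small radius and a &amp;#8220;what&amp;#8217;s around me&amp;#8221; search.&lt;/p&gt;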
&lt;p&gt;Still in the server config &lt;code&gt;app/application.rb&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;route %r{\A/places\Z} =&amp;gt; Search.new(places)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Self-explanatory, eh? As regexp, you could also use &lt;code&gt;%r{^/places$}&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;That&amp;#8217;s it for the server. Nothing special so far.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;rake index; rake start&lt;/code&gt; and off we go.&lt;/p&gt;
&lt;h2&gt;The client code&lt;/h2&gt;
&lt;p&gt;In this part we&amp;#8217;re going to install the map.&lt;/p&gt;
&lt;p&gt;We&amp;#8217;re using the generated code, but adding a little more information to the returned &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt; hash.&lt;/p&gt;
&lt;p&gt;We not only need the list results, but also the coordinates themselves. So we&amp;#8217;re going to add them to the results separately.&lt;/p&gt;
&lt;p&gt;We (ab)use &lt;code&gt;populate_with&lt;/code&gt;, the method that makes models out of the returned ids and yields them to the block to be rendered.&lt;/p&gt;
&lt;p&gt;We then use the models to add geo coordinates to the result hash that is sent to the client.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;results = Geo.search params[:query], :ids =&amp;gt; params[:ids], :offset =&amp;gt; params[:offset]
results.extend Picky::Convenience
results[:geo] ||= [] # &amp;lt;= We initialize an array of coordinates in the results hash.
results.populate_with Location do |location|
  results[:geo] &amp;lt;&amp;lt; [location.north, location.east] # &amp;lt;- and we populate it with the coordinates.
  location.to_s
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So essentially, our geo data piggybacks to the Javascript client. JS, here we come!&lt;/p&gt;
&lt;h2&gt;The javascript client code&lt;/h2&gt;
&lt;p&gt;The javascript client requires a bit more work. Well, the map does.&lt;/p&gt;
&lt;p&gt;We insert this after the &lt;code&gt;PickyClient&lt;/code&gt; code. The first 6 lines are noise and map preparation.&lt;/p&gt;
&lt;pre class=&quot;sh_javascript&quot;&gt;&lt;code&gt;// The map
//
$(document).ready(function() {
  if (GBrowserIsCompatible()) {
    // Map setup.
    //
    map = new GMap2(document.getElementById('map_div'));
    map.addControl(new GSmallMapControl());
    map.setCenter(new GLatLng(46.85, 8.05), 13);
    map.setZoom(7);

    // Click listener.
    //
    GEvent.addListener(map, &quot;click&quot;, function(overlay, latlng) {
      if (latlng) {
        pickyClient.insert(Math.round(latlng.lat()*1000)/1000 + ' ' + Math.round(latlng.lng()*1000)/1000);
      }
    });
  }
});&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then, we add the most important part: A click &lt;code&gt;listener&lt;/code&gt; that inserts the coordinates (rounded to 3 digits) in the search field, as you have seen in the video.&lt;/p&gt;
&lt;p&gt;Now, searches are already sent off to Picky and come back. Whoosh!&lt;/p&gt;
&lt;p&gt;What do we need to do now? Yes, draw some markers in the map. The &lt;code&gt;PickyClient&lt;/code&gt; offers a callback that is called after Picky has updated the results (there are also &lt;code&gt;before&lt;/code&gt; and &lt;code&gt;success&lt;/code&gt;):&lt;/p&gt;
&lt;pre class=&quot;sh_javascript&quot;&gt;&lt;code&gt;after: function(data, query) {
  map.clearOverlays();

  var geo = data.original_hash.geo;
  if (geo) {
    for (var i = 0; i &amp;lt; geo.length; i++) {
      map.addOverlay(new GMarker(new GLatLng(geo[i][0], geo[i][1])));
    };
  }
},&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;First we clear the overlays for the new results.&lt;/p&gt;
&lt;p&gt;Then, we get the piggybacking geo data using the data object&amp;#8217;s &lt;code&gt;original_hash&lt;/code&gt; property, finally iterating over all coordinates and adding overlays as we go.&lt;/p&gt;
&lt;p&gt;By default, the client only gets 20 results at a time. We set it to 100 using the &lt;code&gt;fullResults&lt;/code&gt; option.&lt;/p&gt;
&lt;pre class=&quot;sh_javascript&quot;&gt;&lt;code&gt;fullResults: 100&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That&amp;#8217;s it. It&amp;#8217;s fast and quite easy to set up.&lt;/p&gt;
&lt;h2&gt;Sidenote&lt;/h2&gt;
&lt;p&gt;Since for Swiss data it is clear which value is the latitude and which is the longitude (the value ranges don&amp;#8217;t overlap), we can just enter e.g. &lt;code&gt;47.2 8.3&lt;/code&gt;. If your data area isn&amp;#8217;t that clear-cut, e.g. &lt;code&gt;33.1 33.2&lt;/code&gt;, where latitude values can also be longitude values, add qualifiers such as &lt;code&gt;north:33.1 east:33.2&lt;/code&gt; to denote what is what (assuming north and east are the names of your categories).&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;that a geo search in Picky is quite snappy.&lt;/li&gt;
	&lt;li&gt;that you can search for latitude and location name only, for example.&lt;/li&gt;
	&lt;li&gt;how you can configure the server.&lt;/li&gt;
	&lt;li&gt;how you can configure the client and the web frontend.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;#58;&amp;nbsp;Environmental&amp;nbsp;Considerations</title>
   <link href="http://florianhanke.com/blog/2011/04/18/picky-environmental-considerations.html"/>
   <updated>2011-04-18T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/04/18/picky-environmental-considerations</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; series on its workings.&lt;/p&gt;
&lt;p&gt;(Man, being in Australia is cool in that I can post on the 18th, while most of you are still wallowing in the 17th)&lt;/p&gt;
&lt;p&gt;This is a Google Analytics driven post. I saw recently that many people looked for &amp;#8220;Picky environment and Rails&amp;#8221; or similar.&lt;/p&gt;
&lt;h2&gt;PICKY_ENVIRONMENT and PICKY_ROOT&lt;/h2&gt;
&lt;p&gt;Almost like e.g. Rails, Picky has a constant ready for your environment handling: &lt;code&gt;PICKY_ENVIRONMENT&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;That&amp;#8217;s what you use to differentiate, for example, data source files from each other. So you might have a &lt;code&gt;data&lt;/code&gt; directory with population data for Zimbabwe in the &lt;span class=&quot;caps&quot;&gt;CSV&lt;/span&gt; format. It would be a good idea to have three different files, &lt;code&gt;data/development/zimbabwe.csv&lt;/code&gt;, &lt;code&gt;data/test/zimbabwe.csv&lt;/code&gt;, and &lt;code&gt;data/production/zimbabwe.csv&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;(Since for testing you probably use only a subset of your data)&lt;/p&gt;
&lt;p&gt;Then, in your index data source definition, use &lt;code&gt;PICKY_ENVIRONMENT&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Index::Memory.new(:zimbabwe) do
  source Sources::CSV.new(file: &quot;data/#{PICKY_ENVIRONMENT}/zimbabwe.csv&quot;)
  # ...
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Well, you&amp;#8217;re probably used to that from using Rails, right?&lt;/p&gt;
&lt;p&gt;It may be interesting how this constant is defined.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;ENV['PICKY_ENV'] ||= ENV['RACK_ENV']

PICKY_ENVIRONMENT = ENV['PICKY_ENV'] || 'development' unless defined? PICKY_ENVIRONMENT&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So, if you haven&amp;#8217;t set the &lt;code&gt;PICKY_ENV&lt;/code&gt; environment variable, Picky will use the one set by Rack. Then, if you haven&amp;#8217;t set &lt;code&gt;PICKY_ENVIRONMENT&lt;/code&gt; explicitly by hand, Picky will use the environment variable to set &lt;code&gt;PICKY_ENVIRONMENT&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;So you have two overriding possibilities: Either through an env variable, or through setting a Ruby constant.&lt;/p&gt;
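&lt;p&gt;If you&amp;#8217;d like to play with this lookup order without booting Picky, here is a minimal sketch of it as a plain method. &lt;code&gt;resolve_environment&lt;/code&gt; is a hypothetical name, the &lt;code&gt;env&lt;/code&gt; hash stands in for Ruby&amp;#8217;s &lt;code&gt;ENV&lt;/code&gt;, and &lt;code&gt;explicit&lt;/code&gt; stands in for a &lt;code&gt;PICKY_ENVIRONMENT&lt;/code&gt; constant you set by hand.&lt;/p&gt;

```ruby
# Sketch of the lookup order described above, written as a plain method
# so it can be tried without Picky. resolve_environment is a hypothetical
# name; env stands in for Ruby's ENV hash, explicit for a constant that
# was set by hand.
def resolve_environment(env, explicit = nil)
  # An explicitly set constant wins over everything else.
  return explicit if explicit
  # Otherwise PICKY_ENV, then RACK_ENV, then the 'development' default.
  env['PICKY_ENV'] || env['RACK_ENV'] || 'development'
end

puts resolve_environment({})
puts resolve_environment({ 'RACK_ENV' => 'production' })
puts resolve_environment({ 'PICKY_ENV' => 'test', 'RACK_ENV' => 'production' })
puts resolve_environment({ 'PICKY_ENV' => 'test' }, 'staging')
```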
&lt;p&gt;&lt;code&gt;PICKY_ROOT&lt;/code&gt; is also available, and is defined like this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;PICKY_ROOT = Dir.pwd unless defined? PICKY_ROOT&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It just uses the current directory, unless you want it to point somewhere else, explicitly. Everywhere in Picky where a file is used (mostly in the data sources), &lt;code&gt;PICKY_ROOT&lt;/code&gt; is used.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;how &lt;code&gt;PICKY_ENVIRONMENT&lt;/code&gt; and &lt;code&gt;PICKY_ROOT&lt;/code&gt; are set.&lt;/li&gt;
	&lt;li&gt;how you can use &lt;code&gt;PICKY_ENVIRONMENT&lt;/code&gt; to your advantage.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;#58;&amp;nbsp;Integration&amp;nbsp;Testing</title>
   <link href="http://florianhanke.com/blog/2011/04/17/picky-integration-testing.html"/>
   <updated>2011-04-17T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/04/17/picky-integration-testing</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; series on its workings.&lt;/p&gt;
&lt;p&gt;Let me start off by saying that it&amp;#8217;s embarrassing that this topic is discussed only as Picky 2.3.0 is released. Especially as a proponent of test driven design. (Picky has 1300 tests and 50% more spec code than normal code)&lt;/p&gt;
&lt;p&gt;So let&amp;#8217;s check out how you can write the most beautifully tested Picky servers. Oh yeah.&lt;/p&gt;
&lt;h2&gt;Doin&amp;#8217; it&lt;/h2&gt;
&lt;p&gt;As of 2.3.0, if you use &lt;code&gt;picky generate unicorn_server&lt;/code&gt;, you&amp;#8217;ll get a &lt;code&gt;rake spec&lt;/code&gt; for free which already runs integration specs on the example data.&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s look at the example, and after that, at each separate part.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'spec_helper'
require 'picky-client/spec'

describe 'Integration Tests' do

  before(:all) do
    Indexes.index_for_tests
    Indexes.load_from_cache
  end

  let(:books) { Picky::TestClient.new(PickySearch, :path =&amp;gt; '/books') }

  # Testing a count of results.
  #
  it { books.search('a s').total.should == 42 }

  # Testing a specific order of result ids.
  #
  it { books.search('alan').ids.should == [259, 307, 449] }

  # Testing an order of result categories.
  #
  it { books.search('alan').should have_categories(['author'], ['title']) }
  it { books.search('alan p').should have_categories(['author', 'title'], ['title', 'author']) }

end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It starts off like any RSpec file, by requiring &lt;code&gt;spec_helper&lt;/code&gt;. Then we require the spec part of the picky client.&lt;/p&gt;
&lt;p&gt;What does it do? It provides us with the testing counterpart of the client&amp;#8217;s &lt;code&gt;Picky::Client&lt;/code&gt;, which is &lt;code&gt;Picky::TestClient&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The test client works almost exactly like the real client, with the exception that the test client never sends &lt;span class=&quot;caps&quot;&gt;HTTP&lt;/span&gt; requests, but uses your app&amp;#8217;s Rack adapter. But more about that later.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'spec_helper'
require 'picky-client/spec'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Next, we set up the environment for the tests, i.e. get the indexes up and running.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Indexes.index_for_tests&lt;/code&gt; is a special index method that does not fork and runs silently (to not disturb the deadly test bugs that trawl the area).&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;before(:all) do
  Indexes.index_for_tests
  Indexes.load_from_cache
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;Indexes.load_from_cache&lt;/code&gt; loads the generated index (caches) into memory (or just leaves them alone in Redis).&lt;/p&gt;
&lt;p&gt;Now we&amp;#8217;re ready to do some testing!&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;let(:books) { Picky::TestClient.new(PickySearch, :path =&amp;gt; '/books') }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This sets up an accessor for your tests. You give the &lt;code&gt;TestClient&lt;/code&gt; your application&amp;#8217;s constant, &lt;code&gt;PickySearch&lt;/code&gt; here, and the path to send queries to, here &lt;code&gt;'/books'&lt;/code&gt;. This only works if you &lt;code&gt;route&lt;/code&gt; the path &lt;code&gt;'/books'&lt;/code&gt; to a &lt;code&gt;Search&lt;/code&gt; in your &lt;code&gt;app/application.rb&lt;/code&gt;, of course.&lt;/p&gt;
&lt;p&gt;That&amp;#8217;s it! Easy so far, right?&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Testing a count of results.
#
it { books.search('a s').total.should == 42 }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;books&lt;/code&gt; is the test client we defined with the &lt;code&gt;let&lt;/code&gt;, above. As with the normal &lt;code&gt;Picky::Client&lt;/code&gt;, it offers a &lt;code&gt;#search(text, options = {})&lt;/code&gt; method.&lt;/p&gt;
&lt;p&gt;As return value, we get a hash with the result data. However, it has already been enriched through &lt;code&gt;Picky::Convenience&lt;/code&gt;, which you might know if you&amp;#8217;ve set up a client webapp already.&lt;/p&gt;
&lt;p&gt;This means we get a &lt;code&gt;#total&lt;/code&gt; method, but also &lt;code&gt;#ids&lt;/code&gt;, &lt;code&gt;#empty?&lt;/code&gt;, &lt;code&gt;#allocations&lt;/code&gt; and more which are less useful for testing.&lt;/p&gt;
&lt;p&gt;So to test the count of results, just use &lt;code&gt;#total&lt;/code&gt; on the result of the search.&lt;/p&gt;
&lt;p&gt;To get a sorted array of the top ids, use &amp;#8211; surprise &amp;#8211; &lt;code&gt;#ids&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Testing a specific order of result ids.
#
it { books.search('alan').ids.should == [259, 307, 449] }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It is also useful to test whether the category combination boosts/weights are correct. So if &lt;code&gt;author&lt;/code&gt;, like in the first example below, should be boosted, use the &lt;code&gt;have_categories&lt;/code&gt; matcher to check for that.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Testing an order of result categories.
#
it { books.search('alan').should have_categories(['author'], ['title']) }
it { books.search('alan p').should have_categories(['author', 'title'], ['title', 'author']) }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And that&amp;#8217;s how you do integration testing in Picky.&lt;/p&gt;
&lt;p&gt;About time. Test away!&lt;/p&gt;
&lt;h2&gt;spec_helper and Rakefile&lt;/h2&gt;
&lt;p&gt;This is what your &lt;code&gt;spec/spec_helper.rb&lt;/code&gt; would look like:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;ENV['PICKY_ENV'] = 'test'

require 'picky'

SearchLog = Loggers::Search.new ::Logger.new(STDOUT)
puts &quot;Using STDOUT as test log.&quot;

Loader.load_application&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the &lt;code&gt;Rakefile&lt;/code&gt; just add&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'rspec'
require 'rspec/core/rake_task'

RSpec::Core::RakeTask.new :spec&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;if you haven&amp;#8217;t done this already.&lt;/p&gt;
&lt;h2&gt;Sidenote&lt;/h2&gt;
&lt;p&gt;Should any RSpec vs. Test::Unit controversy erupt around Picky… just kidding ;)&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;how you do integration testing in Picky&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;nbsp;2.2.0</title>
   <link href="http://florianhanke.com/blog/2011/04/14/picky-two-point-two-point-oh.html"/>
   <updated>2011-04-14T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/04/14/picky-two-point-two-point-oh</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; series on its workings.&lt;/p&gt;
&lt;p&gt;Picky 2.2.0 will be released shortly.&lt;/p&gt;
&lt;p&gt;What is good and new?&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Breaking &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; change (Please read this if you already have Picky running)&lt;/li&gt;
	&lt;li&gt;More flexible sources (This is the cool stuff)&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;rake search&lt;/code&gt; is now &lt;code&gt;picky search&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;Uses ActiveRecord/ActiveSupport 3.0&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Breaking &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; change&lt;/h2&gt;
&lt;p&gt;2.2.0 will introduce an &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; change that will break your existing, pre-2.2.0 server configuration.&lt;/p&gt;
&lt;p&gt;Instead of being passed as the second parameter, the data source is now passed in as an option, or set inside the configuration block.&lt;/p&gt;
&lt;p&gt;The old style:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Index::Memory.new :users, your_data_source do
  category :name, similarity: Similarity::DoubleMetaphone.new(3)
  category :age
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;has now become the&lt;/p&gt;
&lt;p&gt;new style:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Index::Memory.new :users, source: your_data_source do
  category :name, similarity: Similarity::DoubleMetaphone.new(3)
  category :age
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;OR&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Index::Memory.new :users do
  source   your_data_source
  category :name, similarity: Similarity::DoubleMetaphone.new(3)
  category :age
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Why?&lt;/p&gt;
&lt;p&gt;The old style was actually more correct, since an index &lt;strong&gt;needs&lt;/strong&gt; a data source. But I never really made friends with it, since it looked so unwieldy, especially when you have a &amp;#8220;long&amp;#8221; data source, like
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Sources::CSV.new(:abra, :ca, :dabra, file: 'some/file/that/is/somewhere.csv')&lt;/code&gt;&lt;/pre&gt;&lt;/p&gt;
&lt;p&gt;The new style is much cleaner to look at. And Picky will tell you if you forgot the data source as early as possible.&lt;/p&gt;
&lt;p&gt;If you use the old style config, Picky will tell you how you need to update your config on server restart. But still, sorry about the breaking change!&lt;/p&gt;
&lt;h2&gt;Flexible sources&lt;/h2&gt;
&lt;p&gt;We&amp;#8217;ve completely rewritten the sources.&lt;/p&gt;
&lt;p&gt;Before 2.2.0, the data source needed to be an object that responds to the &lt;code&gt;#harvest&lt;/code&gt; method.&lt;/p&gt;
&lt;p&gt;In 2.2.0, it can be any object responding to the &lt;code&gt;#each&lt;/code&gt; method, as long as that method yields objects that at least respond to the &lt;code&gt;#id&lt;/code&gt; method and to any methods specified by the category method.&lt;/p&gt;
&lt;p&gt;Let me give you an example. Let&amp;#8217;s say we have some monkeys that we&amp;#8217;d like to index.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Monkey
  attr_reader :id, :name, :color
  def initialize id, name, color
    @id, @name, @color = id, name, color
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We&amp;#8217;ll create three monkeys and save them in an array:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;monkeys = [
  Monkey.new(1, 'pete', 'red'),
  Monkey.new(2, 'joey', 'green'),
  Monkey.new(3, 'hans', 'blue')
]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then, since an &lt;code&gt;Array&lt;/code&gt; has the &lt;code&gt;#each&lt;/code&gt; method, you can index it:
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Index::Memory.new :monkeys do
  source   monkeys
  category :name
  category :couleur, :from =&amp;gt; :color # The couleur category will take its data from the #color method.
end&lt;/code&gt;&lt;/pre&gt;&lt;/p&gt;
&lt;p&gt;Since each monkey has an &lt;code&gt;#id&lt;/code&gt;, a &lt;code&gt;#name&lt;/code&gt;, and a &lt;code&gt;#color&lt;/code&gt; method, Picky will happily index the monkeys for you. Note that the couleur category uses the &lt;code&gt;from&lt;/code&gt; option to define where in the source it takes its data from.&lt;/p&gt;
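&lt;p&gt;The source doesn&amp;#8217;t have to be an &lt;code&gt;Array&lt;/code&gt;, by the way. Here is a minimal sketch of a hand-written source that fulfils the same contract. &lt;code&gt;LineSource&lt;/code&gt; and &lt;code&gt;Animal&lt;/code&gt; are hypothetical names, and the comma-separated line format is just an assumption for the example. The only thing relied on is &lt;code&gt;#each&lt;/code&gt; yielding objects that respond to &lt;code&gt;#id&lt;/code&gt; and the category methods.&lt;/p&gt;

```ruby
# Sketch under assumptions: LineSource and Animal are hypothetical names.
# The only contract relied on is the one described above: the source
# responds to #each, and each yielded object responds to #id plus the
# methods named by the categories.
Animal = Struct.new(:id, :name, :color)

class LineSource
  include Enumerable

  def initialize(lines)
    @lines = lines
  end

  # Yields one Animal per comma-separated line.
  def each
    @lines.each do |line|
      id, name, color = line.strip.split(',')
      yield Animal.new(id.to_i, name, color)
    end
  end
end

source = LineSource.new(['1,pete,red', '2,joey,green'])
source.each { |animal| puts [animal.id, animal.name, animal.color].join(' ') }
```

&lt;p&gt;An object like &lt;code&gt;source&lt;/code&gt; could then be handed to &lt;code&gt;source&lt;/code&gt; in the index block in place of the &lt;code&gt;monkeys&lt;/code&gt; array.&lt;/p&gt;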
&lt;p&gt;Hmmmm&amp;#8230; id method? You&amp;#8217;re probably thinking the same thing as I.&lt;/p&gt;
&lt;p&gt;MongoMapper, the new ActiveRecord and others use a fluent-style interface (see last post), whose proxies support &lt;code&gt;#each&lt;/code&gt;, and the yielded objects support &lt;code&gt;#id&lt;/code&gt; and various methods!&lt;/p&gt;
&lt;p&gt;So this becomes possible:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;
# For completeness:
#
class Book &amp;lt; ActiveRecord::Base; end
Book.establish_connection YAML.load(File.open('app/db.yml'))

Index::Memory.new :books do
  source   Book.order('title ASC')
  category :id
  category :title
  category :author
  category :year
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;See the first line in the index config block?&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Book.order('title ASC')&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This passes the AR proxy as source to the &lt;code&gt;books&lt;/code&gt; index. Since it provides a &lt;code&gt;#each&lt;/code&gt; method, and the yielded objects support &lt;code&gt;#id&lt;/code&gt; etc., Picky will index all books in a &lt;code&gt;title ASC&lt;/code&gt; order.&lt;/p&gt;
&lt;p&gt;I love it!&lt;/p&gt;
&lt;p&gt;Note that the old style sources still work. And for &lt;code&gt;ranged_category&lt;/code&gt;-s, it is still necessary to use the old style sources. We&amp;#8217;ll be working on that, but for the near future, use the old style sources for range/area/volume searches.&lt;/p&gt;
&lt;h2&gt;rake search &amp;#8594; picky search&lt;/h2&gt;
&lt;p&gt;See the last post.&lt;/p&gt;
&lt;p&gt;Since &lt;code&gt;rake search&lt;/code&gt; was project specific, but its functionality is actually &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt; specific, I&amp;#8217;ve deprecated the rake task (it will tell you so), and created &lt;code&gt;picky search&lt;/code&gt; that you can use.&lt;/p&gt;
&lt;h2&gt;AR 3.0 / AS 3.0&lt;/h2&gt;
&lt;p&gt;In other news, Picky now uses AR 3.0 / AS 3.0.&lt;/p&gt;
&lt;p&gt;In your existing Gemfile, please update the line&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;gem 'activerecord',  '~&amp;gt; 2.3.8', :require =&amp;gt; 'active_record'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;to&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;gem 'activesupport', '~&amp;gt; 3.0', :require =&amp;gt; 'active_support/core_ext'
gem 'activerecord',  '~&amp;gt; 3.0', :require =&amp;gt; 'active_record'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Thanks!&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;that the &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; broke a little.&lt;/li&gt;
	&lt;li&gt;that a new group of data sources is available.&lt;/li&gt;
	&lt;li&gt;that rake search is now picky search.&lt;/li&gt;
	&lt;li&gt;that Picky now uses AR 3.0 / AS 3.0.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky&amp;nbsp;Data&amp;nbsp;Sources&amp;#58; Next&amp;nbsp;Steps</title>
   <link href="http://florianhanke.com/blog/2011/04/12/picky-data-sources-next-steps.html"/>
   <updated>2011-04-12T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/04/12/picky-data-sources-next-steps</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/index.html&quot;&gt;Picky&lt;/a&gt; series on its workings.&lt;/p&gt;
&lt;p&gt;For quite some time now I have been thinking about rewriting the Picky data sources.&lt;/p&gt;
&lt;p&gt;Although the ones that Picky uses now work well, they do feel inelegant and un-Ruby-ish.&lt;/p&gt;
&lt;p&gt;But I&amp;#8217;ll let you be the judge of that in the next part: How it works currently.&lt;/p&gt;
&lt;p&gt;After that, I&amp;#8217;ll talk about the problems with the current approach, how I&amp;#8217;d like it to be, and how that could be done. Feedback welcome, as always!&lt;/p&gt;
&lt;h2&gt;How does it work now?&lt;/h2&gt;
&lt;p&gt;At the moment, every index needs a data source. So you might write:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;data_source = Sources::DB.new 'SELECT id, title, author, year FROM books', file: 'app/db.yml'
Index::Memory.new :books, data_source do
  # categories ...
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the example, the data is coming from a database which is defined in &lt;code&gt;app/db.yml&lt;/code&gt; (the &lt;code&gt;file&lt;/code&gt; option).&lt;/p&gt;
&lt;p&gt;Then, Picky&amp;#8217;s indexer takes a snapshot of the data using your query and saves it in another table. The query can be anything, with joins and conditions etc.&lt;/p&gt;
&lt;p&gt;Then, from this intermediate table, it will load batches of data, ordered in the way you ordered the results in your DB data source query.&lt;/p&gt;
&lt;p&gt;So if you happened to say&lt;/p&gt;
&lt;pre class=&quot;sh_sql&quot;&gt;&lt;code&gt;SELECT id, titulo as title, author, year FROM books ORDER BY year DESC&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;then your results would be ordered by &lt;code&gt;year&lt;/code&gt;, descending.&lt;/p&gt;
&lt;p&gt;Picky is really &lt;em&gt;data driven&lt;/em&gt;. If you sort the data in a certain way, it will be sorted like that in the results. (Well, inside each category combination, but let&amp;#8217;s not go into that for the moment. Just know that it will help your user.)&lt;/p&gt;
&lt;p&gt;By the way, don&amp;#8217;t hesitate to use &lt;code&gt;REGEXP&lt;/code&gt;, &lt;code&gt;SUBSTRING&lt;/code&gt; or other functions in your &lt;code&gt;SELECT&lt;/code&gt; statement to preprocess your data. Doing this work right in the query is incredibly powerful.&lt;/p&gt;
&lt;h2&gt;How does it work in the code?&lt;/h2&gt;
&lt;p&gt;What Picky does is instantiate an indexer for each combination of (index, category, source, tokenizer). So as an example, it is indexing the &lt;code&gt;title&lt;/code&gt; category of a &lt;code&gt;books&lt;/code&gt; index, with data coming from a &lt;code&gt;db source&lt;/code&gt;, using the &lt;code&gt;indexing tokenizer&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;What the indexer first does is call the &lt;code&gt;harvest(index, category)&lt;/code&gt; method on the data source, passing it the current index and category. That&amp;#8217;s step 1.&lt;/p&gt;
&lt;p&gt;The source can then use the index and/or category to get the data from its backend.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/2011-04-12-data-sources.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;The source then gets the data from the backend and extracts the relevant parts. For the books index and title category it would do a select on the database using that information. Then, in step 2, it yields (slightly normalized) information back to the indexer, i.e. the id to index, and the data, the text to index.&lt;/p&gt;
&lt;p&gt;The indexer then, in step 3, tokenizes the data as you defined with the &lt;code&gt;default_indexing&lt;/code&gt; options, and finally, after some caching, writes it to the human readable index file.&lt;/p&gt;
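&lt;p&gt;To make the three steps a bit more concrete, here is a rough sketch in plain Ruby. It is not Picky&amp;#8217;s actual code: &lt;code&gt;ArraySource&lt;/code&gt; and &lt;code&gt;index_category&lt;/code&gt; are made-up names, the tokenizer is just a downcase and a split on whitespace, and the result is kept in a hash instead of being cached and written to a file. Only the &lt;code&gt;harvest(index, category)&lt;/code&gt; call mirrors what is described above.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical stand-ins for illustration, not Picky's real classes.
class ArraySource
  def initialize(rows)
    @rows = rows
  end

  # Step 1: the indexer asks for the data of one category.
  # Step 2: the source yields each id together with the text to index.
  def harvest(_index, category)
    @rows.each { |row| yield row[:id], row[category].to_s }
  end
end

# Step 3: tokenize the text and remember which ids each token belongs to.
def index_category(source, index, category)
  inverted = Hash.new { |hash, token| hash[token] = [] }
  source.harvest(index, category) do |id, text|
    text.downcase.split(/\s+/).each { |token| inverted[token].push(id) }
  end
  inverted
end

books = ArraySource.new([
  { id: 1, title: 'Moby Dick', author: 'Herman Melville' },
  { id: 2, title: 'Dick Tracy', author: 'Chester Gould' }
])

title_index = index_category(books, :books, :title)
p title_index['dick'] # prints [1, 2]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that &lt;code&gt;harvest&lt;/code&gt; has to be called once per category, so indexing the author as well means walking through the source a second time.&lt;/p&gt;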
&lt;p&gt;The human readable index files are located in the Picky server directory &lt;code&gt;index/{development,test,production}/books/&lt;/code&gt; where you&amp;#8217;ll find lots of files named &lt;code&gt;category_...&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;I urge you to look at them! Lots of indexing questions can be answered by just looking at &lt;code&gt;title_exact_index.json&lt;/code&gt;, for example.&lt;/p&gt;
&lt;p&gt;Note that all index files are encoded in json, with the exception of the similarity indexes, which are &lt;code&gt;Marshal&lt;/code&gt; dumped. So these are only human readable if you load them using &lt;code&gt;Marshal.load&lt;/code&gt;, I&amp;#8217;m afraid.&lt;/p&gt;
&lt;h2&gt;The problems&lt;/h2&gt;
&lt;p&gt;Although it all sounds nice, probably, there are three problems:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;The indexer is a &amp;#8220;serial&amp;#8221; indexer. Meaning that for each category, it asks the database to give it the data for the current category. So for each id, it asks the database for each data category separately. So for id 1 it asks for the title, then, later, for the author etc.&lt;/li&gt;
	&lt;li&gt;In a similar vein, if I&amp;#8217;d like to index correlated values that need to be processed together, like geocoded data, it is simply not possible with the current indexer.&lt;/li&gt;
	&lt;li&gt;It seems a bit unwieldy for a user, imho. This could be a sign that it could be more elegant.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Let&amp;#8217;s look at the problems in more detail:&lt;/p&gt;
&lt;h3&gt;Serial Indexer&lt;/h3&gt;
&lt;p&gt;The first problem, that Picky goes to the database for each category, is one of performance. Although it does not have much impact the way it is done now (you probably haven&amp;#8217;t noticed it yet), it still irks me that it does several round trips per id.&lt;/p&gt;
&lt;h3&gt;Correlated values not possible&lt;/h3&gt;
&lt;p&gt;Correlated values are not possible. What does this mean?&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s say that we have geocoded data, longitude and latitude. If you now try to do a geosearch by (ab)using the &lt;code&gt;ranged_category&lt;/code&gt; method, you will run into problems that get worse the closer the location is to a pole. On the equator, Picky will search around the location in a nice square.&lt;/p&gt;
&lt;p&gt;But if you e.g. move north, the lines of longitude get closer and closer together, and the ranged search distance shrinks with them. While 0.008 degrees might mean roughly a kilometer on the equator, near the north pole it will be closer and closer to zero kilometers. So the square will be squished until it finally looks like a triangle.&lt;/p&gt;
&lt;p&gt;Depending on the cartographic method used, this might not be a problem for you. But it certainly is if you&amp;#8217;re looking at the whole earth. Now, if the categories were indexed together, Picky could recalculate the data for you such that the square area search (see one of the last blog posts) would be preserved.&lt;/p&gt;
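&lt;p&gt;To put numbers on the squishing, here is a back-of-the-envelope sketch. It assumes a perfectly spherical earth and about 111.32 km per degree on the equator; &lt;code&gt;longitude_span_km&lt;/code&gt; is a name made up for this post and not part of Picky.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Assumption: spherical earth, one degree on the equator is about 111.32 km.
KM_PER_DEGREE = 111.32

# East-west width in km of a span of longitude degrees at a given latitude.
# The width shrinks with the cosine of the latitude.
def longitude_span_km(degrees, latitude)
  degrees * KM_PER_DEGREE * Math.cos(latitude * Math::PI / 180)
end

[0, 47, 80, 89].each do |latitude|
  km = longitude_span_km(0.008, latitude)
  puts format('latitude %2d: 0.008 degrees of longitude span %.3f km', latitude, km)
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;On the equator this gives a bit under a kilometer, and at 89 degrees north only a few meters are left. The north-south extent of the same 0.008 degrees stays the same everywhere, which is why the square gets squished.&lt;/p&gt;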
&lt;p&gt;One approach to how this could look is this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Index::Memory.new :books, data_source do
  geocoded_category :longitude, :latitude, 1.km
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In a &amp;#8220;parallel&amp;#8221; indexer, Picky could load both &lt;code&gt;longitude&lt;/code&gt; and &lt;code&gt;latitude&lt;/code&gt; and do corrections on the longitude/latitude to preprocess the data so it would return correctly geocoded results.&lt;/p&gt;
&lt;h3&gt;Elegance&lt;/h3&gt;
&lt;p&gt;This is the part I am most unsure about. But this&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;data_source = Sources::DB.new 'SELECT id, title, author, year FROM books', file: 'app/db.yml'
Index::Memory.new :books, data_source do
  # categories ...
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;just doesn&amp;#8217;t look good. Granted, you need to inject a lot of information in a few lines:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Type of source (&lt;code&gt;DB&lt;/code&gt;)&lt;/li&gt;
	&lt;li&gt;Selection of data from the source (&lt;code&gt;SELECT&lt;/code&gt;)&lt;/li&gt;
	&lt;li&gt;Configuration of source (&lt;code&gt;file: 'app/db.yml'&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;But still, I&amp;#8217;d love it to be much more elegant.&lt;/p&gt;
&lt;p&gt;For quite some time, I wasn&amp;#8217;t sure what to do. There isn&amp;#8217;t a single nice interface shared by all the data sources. ActiveRecord does it one way, MongoMapper another, etc.&lt;/p&gt;
&lt;p&gt;So Simon from Berlin asked me last night about whether I had experience with Picky and MongoMapper. I don&amp;#8217;t, but it would certainly be cool to include it as a data source in one of the next versions of Picky.&lt;/p&gt;
&lt;p&gt;I took a closer look at it. Similar to the new way in Rails 3, it uses a fluent interface, where some methods just modify the query, while some are &amp;#8220;kicker&amp;#8221; methods that actually do something:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;User.where(:age.gt =&amp;gt; 27).sort(:age).all&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;More here, &lt;a href=&quot;http://railstips.org/blog/archives/2010/06/16/mongomapper-08-goodies-galore/&quot;&gt;http://railstips.org/blog/archives/2010/06/16/mongomapper-08-goodies-galore/&lt;/a&gt;. The &lt;code&gt;all&lt;/code&gt; method at the end of a chain would be a kicker method, loading all objects.&lt;/p&gt;
&lt;p&gt;That got me thinking.&lt;/p&gt;
&lt;h2&gt;How I would like it to be&lt;/h2&gt;
&lt;p&gt;Wouldn&amp;#8217;t it be nice if, instead of a dedicated data source, we could just pass any object as the data source? For example, with MongoMapper:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Index::Memory.new :books, User.where(:age.gt =&amp;gt; 27).sort(:age) do
  category :name, similarity: Similarity::DoubleMetaphone.new(3)
  category :age
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Quite a bit sexier, imho. Since the result of the &lt;code&gt;sort(:age)&lt;/code&gt; method is a proxy that offers kicker and non-kicker methods, the Picky indexer could now call &lt;code&gt;each&lt;/code&gt; on it.&lt;/p&gt;
&lt;p&gt;The contract would then be that each object that is yielded by &lt;code&gt;#each&lt;/code&gt; must offer methods that are named like the categories (or named like the &lt;code&gt;from&lt;/code&gt; option – e.g. &lt;code&gt;category :name, from: :surname&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;So, in the above example, each User object would have methods &lt;code&gt;#name&lt;/code&gt; and &lt;code&gt;#age&lt;/code&gt; such that Picky could extract the data.&lt;/p&gt;
&lt;p&gt;The cool thing with that would be that I could just pass in an Array of data. So, this would work (&lt;code&gt;a&lt;/code&gt;, &lt;code&gt;b&lt;/code&gt;, &lt;code&gt;c&lt;/code&gt; all respond to &lt;code&gt;#name&lt;/code&gt;):&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Index::Memory.new :books, [a, b, c] do
  category :name
end&lt;/code&gt;&lt;/pre&gt;
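&lt;p&gt;The contract can be sketched in a few lines of plain Ruby. Everything in here is an assumption made for illustration: the &lt;code&gt;Person&lt;/code&gt; struct, the &lt;code&gt;extract&lt;/code&gt; method, and the idea that each object also responds to &lt;code&gt;#id&lt;/code&gt;. The hash maps a category name to the method to call, which is what the &lt;code&gt;from&lt;/code&gt; option would express.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical sketch of the proposed contract, not Picky code.
Person = Struct.new(:id, :name, :surname, :age)

# categories maps a category name to the method that is called on each
# object to get the text for that category.
def extract(source, categories)
  source.each do |object|
    categories.each do |category, from|
      yield object.id, category, object.public_send(from).to_s
    end
  end
end

people = [Person.new(1, 'Peter', 'Pan', 12), Person.new(2, 'James', 'Hook', 47)]

# The name category is filled from surname, age comes from age.
extract(people, { name: :surname, age: :age }) do |id, category, text|
  puts format('%d %s %s', id, category, text)
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Since each object is visited exactly once and asked for all of its categories, this is already the &amp;#8220;parallel&amp;#8221; way of loading data.&lt;/p&gt;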
&lt;p&gt;What would we have to do to make this work in Picky?&lt;/p&gt;
&lt;h2&gt;How to get there?&lt;/h2&gt;
&lt;p&gt;First of all, Picky would need to be rewritten, or at least partially rewritten, to use a &amp;#8220;parallel&amp;#8221; indexer, where each category would be loaded along with the others. So loading data set 1 would load &lt;code&gt;title&lt;/code&gt;, &lt;code&gt;author&lt;/code&gt;, &lt;code&gt;year&lt;/code&gt; etc. at the same time. (This is necessary since some of these frameworks throw away the data after it has been yielded with &lt;code&gt;#each&lt;/code&gt;.)&lt;/p&gt;
&lt;p&gt;The nice side-effect of this is that it opens real geosearch (or any combined category search) possibilities in Picky.&lt;/p&gt;
&lt;p&gt;Probably, the frameworks offering the &lt;code&gt;#each&lt;/code&gt; way would need to yield lazily, i.e. &lt;code&gt;#each&lt;/code&gt; should not preload all the data before yielding as the data in question might be huge. Or maybe load it in batches.&lt;/p&gt;
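&lt;p&gt;A lazily yielding &lt;code&gt;#each&lt;/code&gt; could look roughly like the following sketch. &lt;code&gt;BatchedSource&lt;/code&gt; and &lt;code&gt;load_batch&lt;/code&gt; are invented for this example; in a real source, &lt;code&gt;load_batch&lt;/code&gt; would be the place for the actual backend call. The counter is only there to show how many batches were fetched.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical source that fetches rows in batches instead of all at once.
class BatchedSource
  include Enumerable

  attr_reader :batches_loaded

  def initialize(total, batch_size)
    @total = total
    @batch_size = batch_size
    @batches_loaded = 0
  end

  # Stand-in for a backend call; here it just builds the rows in memory.
  def load_batch(offset)
    @batches_loaded += 1
    last = [offset + @batch_size, @total].min
    (offset...last).map { |i| { id: i + 1, title: format('Book %d', i + 1) } }
  end

  # Yields row by row, but only ever holds one batch at a time.
  def each
    return to_enum(:each) unless block_given?

    0.step(@total - 1, @batch_size) do |offset|
      load_batch(offset).each { |row| yield row }
    end
  end
end

source = BatchedSource.new(5, 2)
ids = source.first(3).map { |row| row[:id] }
p ids                   # prints [1, 2, 3]
p source.batches_loaded # prints 2&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Asking for the first three rows only fetches two batches; the third batch is never loaded.&lt;/p&gt;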
&lt;p&gt;How could we migrate from the current state to the new indexer?&lt;/p&gt;
&lt;p&gt;I suggest that before instantiating the indexer, the index would first look at the source. If &lt;code&gt;source.respond_to?(:each)&lt;/code&gt;, the parallel indexer would be used. If not, the &amp;#8220;serial&amp;#8221; indexer would be used, doing things the old way.&lt;/p&gt;
&lt;p&gt;So the contract for parallel sources would be that they implement &lt;code&gt;#each&lt;/code&gt; in a way that would load the data in batches and only yield objects which respond to the category names.&lt;/p&gt;
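&lt;p&gt;The switch between the two indexers is tiny. In this sketch, &lt;code&gt;ParallelIndexer&lt;/code&gt;, &lt;code&gt;SerialIndexer&lt;/code&gt;, &lt;code&gt;LegacySource&lt;/code&gt; and &lt;code&gt;indexer_for&lt;/code&gt; are placeholder names and not existing Picky classes; only &lt;code&gt;respond_to?&lt;/code&gt; is plain Ruby.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical indexer classes, only here to show the dispatch.
ParallelIndexer = Struct.new(:source)
SerialIndexer   = Struct.new(:source)

# Old-style source: it only knows harvest and does not offer each.
class LegacySource
  def harvest(_index, _category); end
end

# Sources that offer each get the parallel indexer, all others the serial one.
def indexer_for(source)
  source.respond_to?(:each) ? ParallelIndexer.new(source) : SerialIndexer.new(source)
end

p indexer_for([1, 2, 3]).class        # prints ParallelIndexer
p indexer_for(LegacySource.new).class # prints SerialIndexer&lt;/code&gt;&lt;/pre&gt;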
&lt;p&gt;Let&amp;#8217;s see if we can get this working soon :)&lt;/p&gt;
&lt;p&gt;What I am wondering: Are we walking down a fool&amp;#8217;s path? Comment if you have an opinion about that, please.&lt;/p&gt;
&lt;h2&gt;Possible problems&lt;/h2&gt;
&lt;p&gt;One problem could be that we lose speed since we&amp;#8217;ll be instantiating lots of objects that respond to the categories. On the other hand, the round trips per id would not be necessary anymore.&lt;/p&gt;
&lt;p&gt;Another problem is that since we&amp;#8217;re just depending on &lt;code&gt;#each&lt;/code&gt;, we couldn&amp;#8217;t pass the source the index and category anymore. So choosing the right data would be the responsibility of the user. I do not think this to be a big problem.&lt;/p&gt;
&lt;h2&gt;Final remarks&lt;/h2&gt;
&lt;p&gt;Although I&amp;#8217;d like to make it more elegant, I&amp;#8217;d still like to preserve the old way of doing things. Sure,&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Index::Memory.new :users, User do
  category :name, similarity: Similarity::DoubleMetaphone.new(3)
  category :age
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;might look nicer than&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;user_source = Sources::DB.new 'SELECT id, name, age FROM users', file: 'app/db.yml'
Index::Memory.new :users, user_source do
  category :name, similarity: Similarity::DoubleMetaphone.new(3)
  category :age
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Still, I&amp;#8217;d like the old way to stay available, since doing the right &lt;code&gt;SELECT&lt;/code&gt; is incredibly useful.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;how Picky data sources work now.&lt;/li&gt;
	&lt;li&gt;how they ought to work.&lt;/li&gt;
	&lt;li&gt;that &lt;code&gt;#each&lt;/code&gt; would be more ruby-ish.&lt;/li&gt;
	&lt;li&gt;what a migration path could look like.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Searching with Picky&amp;#58; In&amp;nbsp;the&amp;nbsp;Terminal</title>
   <link href="http://florianhanke.com/blog/2011/04/11/searching-with-picky-rake-search.html"/>
   <updated>2011-04-11T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/04/11/searching-with-picky-rake-search</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/index.html&quot;&gt;Picky&lt;/a&gt; series on its workings. If you haven&amp;#8217;t tried it yet, do so in the &lt;a href=&quot;http://florianhanke.com/picky/getting_started.html&quot;&gt;Getting Started&lt;/a&gt; section. It&amp;#8217;s quick and painless :)&lt;/p&gt;
&lt;p&gt;This post is about a fun little experimental toy I&amp;#8217;ve been working on: &lt;code&gt;picky search &amp;lt;url&amp;gt;&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;Update!&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;rake search&lt;/code&gt; is &lt;code&gt;picky search&lt;/code&gt; from 2.2.0 on.&lt;/p&gt;
&lt;h2&gt;picky search &amp;lt;url&amp;gt;?&lt;/h2&gt;
&lt;p&gt;Yes. While working on a server, I sometimes want to see if the search engine works correctly directly in the terminal (normally I use tests, but sometimes I need that quick look).&lt;/p&gt;
&lt;h2&gt;How do I use it?&lt;/h2&gt;
&lt;p&gt;See this short video (it&amp;#8217;s best to full-screen it):&lt;/p&gt;
&lt;p&gt;&lt;iframe src=&quot;http://player.vimeo.com/video/22216442&quot; width=&quot;600&quot; height=&quot;387&quot; frameborder=&quot;0&quot;&gt;&lt;/iframe&gt;&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;Start a Picky server.&lt;/li&gt;
	&lt;li&gt;Then type &lt;code&gt;picky search /some/url&lt;/code&gt; (where &lt;code&gt;/some/url&lt;/code&gt; is a path – or url if not on this server – you&amp;#8217;ve defined using &lt;code&gt;route&lt;/code&gt; in &lt;code&gt;app/application.rb&lt;/code&gt;).&lt;/li&gt;
	&lt;li&gt;Then, just type away.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The result id count will update as you type.&lt;/p&gt;
&lt;p&gt;When pressing enter, the top 20 result ids will appear next to your search text.&lt;/p&gt;
&lt;p&gt;If you want to exit, just &lt;code&gt;Ctrl-C&lt;/code&gt;. That&amp;#8217;s it.&lt;/p&gt;
&lt;p&gt;Note that you need the picky-client &amp;amp; highline gems installed. But Picky will tell you so if they are missing.&lt;/p&gt;
&lt;h2&gt;How does it work?&lt;/h2&gt;
&lt;p&gt;I use the highline gem (by &lt;a href=&quot;http://twitter.com/jeg2&quot;&gt;@JEG2&lt;/a&gt;) to get single characters (using the appropriately named &lt;code&gt;get_character&lt;/code&gt;) from the user and then move the cursor around using &lt;code&gt;\e[#{amount}D&lt;/code&gt; (left) and &lt;code&gt;\e[#{amount}C&lt;/code&gt; (right), &lt;code&gt;print&lt;/code&gt; ing to the &lt;span class=&quot;caps&quot;&gt;STDOUT&lt;/span&gt; and &lt;code&gt;flush&lt;/code&gt; ing it a lot.&lt;/p&gt;
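&lt;p&gt;The cursor part can be tried without Picky or highline. The following is a minimal sketch and not the code of &lt;code&gt;picky search&lt;/code&gt;: the helper names &lt;code&gt;left&lt;/code&gt; and &lt;code&gt;right&lt;/code&gt; are made up, and instead of reading characters from the user it simply counts up, overwriting the same three columns each time.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;ESC = 27.chr # the escape character, written \e inside a double-quoted string

# Escape sequences: D moves the cursor left, C moves it right.
def left(amount)
  format('%s[%dD', ESC, amount)
end

def right(amount)
  format('%s[%dC', ESC, amount)
end

# Overwrite a three-column counter in place, five times.
print 'results:   0'
1.upto(5) do |count|
  print left(3), count.to_s.rjust(3)
  $stdout.flush
  sleep 0.1
end
puts&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Without the &lt;code&gt;flush&lt;/code&gt;, the output may sit in a buffer and the counter would not appear to move.&lt;/p&gt;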
&lt;p&gt;If there is a gem which makes it easy to position objects in the terminal which update it (by being used in a visitor pattern or however), I&amp;#8217;d like to hear about it!&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;how you run a search directly in the terminal.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Searching with Picky&amp;#58; Range/Area/Volume&amp;nbsp;etc.&amp;nbsp;Search</title>
   <link href="http://florianhanke.com/blog/2011/04/09/searching-with-picky-area-search.html"/>
   <updated>2011-04-09T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2011/04/09/searching-with-picky-area-search</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/index.html&quot;&gt;Picky&lt;/a&gt; series on its workings. If you haven&amp;#8217;t tried it yet, do so in the &lt;a href=&quot;http://florianhanke.com/picky/getting_started.html&quot;&gt;Getting Started&lt;/a&gt; section. It&amp;#8217;s quick and painless :)&lt;/p&gt;
&lt;p&gt;This post is all about searching areas, volumes, space and time – and more!&lt;/p&gt;
&lt;h2&gt;tl;dr&lt;/h2&gt;
&lt;p&gt;Using &lt;code&gt;ranged_category&lt;/code&gt; instead of &lt;code&gt;category&lt;/code&gt; in index definition lets you search inside numeric ranges (instead of exact or partial strings). Example:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;ranged_category :height,
                50,          # units &quot;around&quot; the searched value, here: meters
                precision: 5 # very high precision, 1% error&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Warp Area&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;#area&quot;&gt;Range Search&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;#areahow&quot;&gt;Range, how?&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;#map&quot;&gt;Area Search&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;#volumetric&quot;&gt;Volume Search&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;#funky&quot;&gt;&amp;#8220;Find all locations in a thin slice of N47.11 to N47.13, whose names start with F, that are in height 362m to 462m&amp;#8221;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;#spacetime&quot;&gt;Space &amp;amp; Time Search&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;#radiuses&quot;&gt;Different Radiuses/Volume Sizes etc.&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;#caveats&quot;&gt;Caveats&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;#conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&quot;area&quot;&gt;Range Search&lt;/h2&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/2011-04-09-intersection.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Picky is good at intersecting stuff – and guessing which of the intersections you actually are looking for.&lt;/p&gt;
&lt;p&gt;The pink part is where e.g. &amp;#8220;name:eisenhower&amp;#8221; and &amp;#8220;title:wa&amp;#8221;(r) intersect in a speech database, and Picky finds it. The blue part is where &amp;#8220;name:eisenhower&amp;#8221; and &amp;#8220;title:wa&amp;#8221;(rthog) intersect. Less interesting, and Picky thinks so too.&lt;/p&gt;
&lt;p&gt;Usually, what Picky does is intersect these circles of words you are looking for, resulting in funky Venn diagrams of the kind that have so successfully been used in 60s style living rooms.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/2011-04-09-area.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Hey, doesn&amp;#8217;t a map have grids that intersect somehow? What if Picky could intersect the area between the x lines (light blue) with the area between the y lines (also light blue)?&lt;/p&gt;
&lt;p&gt;What we&amp;#8217;d get is the results in the pinkish area.&lt;/p&gt;
&lt;p&gt;This type of diagram has been successfully used by Piet Mondrian at the beginning of last century.&lt;/p&gt;
&lt;p&gt;Now, if we could pass Picky the median x value, and the median y value and get it to return the results in the pink area, wouldn&amp;#8217;t that be something?&lt;/p&gt;
&lt;p&gt;Indeed it would, and indeed it already can. You probably just didn&amp;#8217;t know.&lt;/p&gt;
&lt;h2 id=&quot;areahow&quot;&gt;But how can I do a range search?&lt;/h2&gt;
&lt;p&gt;Apart from searching exact or partial strings with the &lt;code&gt;#category&lt;/code&gt; method, Picky offers a &lt;code&gt;#ranged_category&lt;/code&gt; method for numerical values.&lt;/p&gt;
&lt;p&gt;Let me show you how it works. Let&amp;#8217;s say that I have a &lt;span class=&quot;caps&quot;&gt;CSV&lt;/span&gt; file, &lt;code&gt;mountains.csv&lt;/code&gt;, with the mountains of the world, from lowest to highest, in meters:&lt;/p&gt;
&lt;pre class=&quot;sh_csv&quot;&gt;&lt;code&gt;1, Tokelau (NZ), 5.0
...
124, Vaalserberg (NL), 321.9
...
78513, Mount Everest (NP), 8850.0&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now we want the user to be able to enter&lt;/p&gt;
&lt;pre class=&quot;sh_search&quot;&gt;&lt;code&gt;200&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and get all the mountains whose height is within 50 meters of 200, i.e. between 150 and 250.&lt;/p&gt;
&lt;p&gt;For that you use &lt;code&gt;ranged_category(name, units_around, options = {})&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;data_source = Sources::CSV.new(:name, :height, file: 'data/mountains.csv') # column names match the categories below
mountains = Index::Memory.new(:mountains, data_source) do
  category        :name
  ranged_category :height, 50, precision: 3 # 50 is the units around the searched height
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So we&amp;#8217;d have a name (that is searched with the default config, like text) and a height that is searched with a precision of 3, 50 meters around the number the user enters.&lt;/p&gt;
&lt;p&gt;What does the precision mean?&lt;/p&gt;
&lt;p&gt;Precision 1, the default, has 5% error and is really, really fast, and precision 5 has 1% error and is just fast. You can go up to wherever you want, but 5 is a good tradeoff if you need a precise result.&lt;/p&gt;
&lt;p&gt;Note that – since Picky does intersections – you can also search for height &lt;span class=&quot;caps&quot;&gt;AND&lt;/span&gt; name at the same time. If you add a full partial search option to the name category, &lt;code&gt;category :name, partial: Partial::Substring.new(from: 1)&lt;/code&gt; then when you search for example for&lt;/p&gt;
&lt;pre class=&quot;sh_search&quot;&gt;&lt;code&gt;300 va&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;you will find all the mountains from height 250 to 350 whose name starts with &amp;#8220;va&amp;#8221;. Nice eh?&lt;/p&gt;
&lt;h2 id=&quot;map&quot;&gt;Nice indeed, but can I use this for an area search?&lt;/h2&gt;
&lt;p&gt;Let&amp;#8217;s say I have a &lt;span class=&quot;caps&quot;&gt;CSV&lt;/span&gt; file, &lt;code&gt;swiss_places.csv&lt;/code&gt;, with all places, 20910 in all, in Switzerland, like so:&lt;/p&gt;
&lt;pre class=&quot;sh_csv&quot;&gt;&lt;code&gt;1,Zuger See,47.11667,8.48333
2,Zwischbergental,46.16667,8.13333
3,Zwischbergen,46.16667,8.11667
4,Zwingen,47.43825,7.53027
...
20910,Les 4 Vallées,46.17572,7.32142&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is the data. Then I tell Picky where to find the data (in the &lt;span class=&quot;caps&quot;&gt;CSV&lt;/span&gt;) and how to index it:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;data_source = Sources::CSV.new(:location, :north, :east, file: 'data/swiss_places.csv')
swiss_places = Index::Memory.new(:swiss_places, data_source) do
  category        :location
  ranged_category :north, 0.01, precision: 3
  ranged_category :east,  0.01, precision: 3
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This means that we can search for the location, and the north and the east value, with 0.01 leeway around the searched number. So entering 47.12 would find numbers in the range 47.11..47.13.&lt;/p&gt;
&lt;p&gt;Now, if you search for&lt;/p&gt;
&lt;pre class=&quot;sh_search&quot;&gt;&lt;code&gt;47.12, 8.48&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;you find the &amp;#8220;Zuger See&amp;#8221;.&lt;/p&gt;
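&lt;p&gt;Why this query lands on the Zuger See can be replayed in plain Ruby. This is only an illustration of the idea of intersecting two bands, using the first four rows from the &lt;span class=&quot;caps&quot;&gt;CSV&lt;/span&gt; above; it says nothing about how Picky stores or searches its indexes, and &lt;code&gt;ranged_ids&lt;/code&gt; is a made-up helper.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Plain Ruby illustration of the idea, not how Picky stores its indexes.
PLACES = [
  [1, 'Zuger See',       47.11667, 8.48333],
  [2, 'Zwischbergental', 46.16667, 8.13333],
  [3, 'Zwischbergen',    46.16667, 8.11667],
  [4, 'Zwingen',         47.43825, 7.53027]
].freeze

# Ids of all places whose value in the given column lies within
# around units of the searched value.
def ranged_ids(column, searched, around)
  range = (searched - around)..(searched + around)
  PLACES.select { |place| range.cover?(place[column]) }.map { |place| place[0] }
end

north_ids = ranged_ids(2, 47.12, 0.01) # the band between 47.11 and 47.13
east_ids  = ranged_ids(3, 8.48, 0.01)  # the band between 8.47 and 8.49

# The area search is the intersection of the two bands.
hits = north_ids.select { |id| east_ids.include?(id) }
p hits # prints [1]&lt;/code&gt;&lt;/pre&gt;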
&lt;p&gt;Since for Switzerland, the north and east coordinates are exclusive (one around 47, the other around 8.4), Picky knows what is what by itself.&lt;/p&gt;
&lt;p&gt;If your values aren&amp;#8217;t exclusive, for example both are in the range 1..3, then entering the search&lt;/p&gt;
&lt;pre class=&quot;sh_search&quot;&gt;&lt;code&gt;1.3, 2.4&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;might make Picky ask you which one is what. It&amp;#8217;s not clear if you want 1.3 from the one and 2.4 from the other, or vice versa. This can be remedied by explicitly specifying what is what:&lt;/p&gt;
&lt;pre class=&quot;sh_search&quot;&gt;&lt;code&gt;north:1.3, east:2.4&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The best thing is that you don&amp;#8217;t need to use the Picky interface. You could whip up a JavaScript interface showing some area, where a click runs a search on Picky and the returned results are displayed in that area.&lt;/p&gt;
&lt;p&gt;But now, let&amp;#8217;s go a little crazy!&lt;/p&gt;
&lt;h2 id=&quot;volumetric&quot;&gt;Volumetric Search&lt;/h2&gt;
&lt;p&gt;Say the Swiss data also had heights:&lt;/p&gt;
&lt;pre class=&quot;sh_csv&quot;&gt;&lt;code&gt;1,Zuger See,47.11667,8.48333,410.0
2,Zwischbergental,46.16667,8.13333,
...
20910,Les 4 Vallées,46.17572,7.32142,1205.3&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Just add the new line in the index definition, and in the source:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;data_source = Sources::CSV.new(:location, :north, :east, :height, file: 'data/swiss_places.csv')
swiss_places = Index::Memory.new(:swiss_places, data_source) do
  ...
  ranged_category :height, 50
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Voilà!&lt;/p&gt;
&lt;pre class=&quot;sh_search&quot;&gt;&lt;code&gt;47.12, 8.48, 400&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This would make you find the &amp;#8220;Zuger See&amp;#8221;, while using a height of 500 wouldn&amp;#8217;t.&lt;/p&gt;
&lt;h3 id=&quot;funky&quot;&gt;Let&amp;#8217;s get funky!&lt;/h3&gt;
&lt;p&gt;We don&amp;#8217;t need to use all categories:&lt;/p&gt;
&lt;pre class=&quot;sh_search&quot;&gt;&lt;code&gt;47.12, f*, 412&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Funky search, but this would find all locations in a thin band of north 47.11..47.13, whose names start with f, and that are in height 362..462.&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s add more dimensions.&lt;/p&gt;
&lt;h2 id=&quot;spacetime&quot;&gt;Space and Time&lt;/h2&gt;
&lt;p&gt;So how would we search in space and time? Space is easy, that is just a volumetric search.&lt;/p&gt;
&lt;p&gt;Now: How would you add in time?&lt;/p&gt;
&lt;p&gt;Probably you&amp;#8217;d index it in seconds from January 1st, 1970 or something like that, then define a ranged search with &amp;#8220;radius&amp;#8221; 1800. This would make Picky find things in the hour around the searched seconds since 1970.&lt;/p&gt;
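&lt;p&gt;Here is what that one hour window looks like in numbers. The events and their times are invented for this sketch, and the filtering is done by hand in plain Ruby; in Picky it would be the job of a ranged category with 1800 units around the searched value.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Seconds to look to either side of the searched point in time.
RADIUS = 1800

# Seconds since January 1st, 1970 for noon on the day of this post.
searched = Time.utc(2011, 4, 9, 12, 0, 0).to_i
window   = (searched - RADIUS)..(searched + RADIUS)

# Made-up events, indexed as seconds since 1970.
events = {
  breakfast: Time.utc(2011, 4, 9, 7, 30, 0).to_i,
  lunch:     Time.utc(2011, 4, 9, 12, 20, 0).to_i,
  dinner:    Time.utc(2011, 4, 9, 19, 0, 0).to_i
}

found = events.select { |_name, seconds| window.cover?(seconds) }.keys
p found # prints [:lunch]&lt;/code&gt;&lt;/pre&gt;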
&lt;h2 id=&quot;radiuses&quot;&gt;I want to be able to search in 1m, 10m, 100m&lt;/h2&gt;
&lt;p&gt;Now, as you saw, we looked for heights 50 meters around it using:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;ranged_category :height, 50&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;What if we want to search 1 meter, 10 meters, 100 meters around it, choosing as we go?&lt;/p&gt;
&lt;p&gt;This is accomplished by adding more searchable categories, like so. You name the category specifically, and tell Picky from where in the data source it should get the data, using the &lt;code&gt;from&lt;/code&gt; option.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;ranged_category :height1,     1, from: :height
ranged_category :height10,   10, from: :height
ranged_category :height100, 100, from: :height
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Choosing from the categories is done as usual. If you want 10 meters, search like this:&lt;/p&gt;
&lt;pre class=&quot;sh_search&quot;&gt;&lt;code&gt;height10:412&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This will find locations of heights 402..422.&lt;/p&gt;
&lt;h2 id=&quot;caveats&quot;&gt;Caveats&lt;/h2&gt;
&lt;p&gt;Actually, there is a catch if you use &lt;code&gt;ranged_category&lt;/code&gt; on a larger area on a ball, like earth. For example in Australia – the place I am staying in, currently – you will find that the further south you go, towards the pole, the less square and more rectangular your search area gets. This is because Picky does not correct for the curvature of the ball. I&amp;#8217;m working on it.&lt;/p&gt;
&lt;p&gt;So, Picky cannot handle your balls yet.&lt;/p&gt;
&lt;p&gt;For small countries it is still useful, and of course for lots of graph searches etc.&lt;/p&gt;
&lt;p&gt;Flat things it does marvellously. And super fast!&lt;/p&gt;
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;how Picky can search areas.&lt;/li&gt;
	&lt;li&gt;how Picky can search volumes.&lt;/li&gt;
	&lt;li&gt;how Picky can search any number of dimensions.&lt;/li&gt;
	&lt;li&gt;how you can choose any combination of areas and other features.&lt;/li&gt;
	&lt;li&gt;how you search in different ranges on the same thing/category.&lt;/li&gt;
	&lt;li&gt;that you cannot quite search on a ball, like earth.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">On Searching</title>
   <link href="http://florianhanke.com/blog/2011/03/30/on-searching.html"/>
   <updated>2011-03-30T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/03/30/on-searching</id>
   <content type="html">&lt;h2&gt;tl; dr&lt;/h2&gt;
&lt;p&gt;This post is about engineers, our pride in information gathering and organizing, and how we often fail in information searching.&lt;/p&gt;
&lt;p&gt;Also about different types of search engines.&lt;/p&gt;
&lt;h2&gt;Engineering&lt;/h2&gt;
&lt;p&gt;Imagine a structural engineer planning a bridge.&lt;/p&gt;
&lt;p&gt;How do you think he approaches the problem? Does he just build a standard concrete/steel bridge?&lt;/p&gt;
&lt;p&gt;Probably not. He analyzes the constraints put on it by various factors, monetary, environmental, political, and last but not least – time, and sets out to build the bridge that fits as many of these constraints as possible.&lt;/p&gt;
&lt;p&gt;Similarly with software engineering: We analyze various options, plan, code, release. (In this magical dream world I am conjuring up. But bear with me.)&lt;/p&gt;
&lt;p&gt;And most of the time, we do this well. An incredible number of blog posts, books etc. describe various options and tools in the software world that can be used as blueprints, tools, or inspiration to build our specific &amp;#8220;bridges&amp;#8221;.&lt;/p&gt;
&lt;h2&gt;Information gathering and structuring&lt;/h2&gt;
&lt;p&gt;When it is about collecting information, we are world masters.&lt;/p&gt;
&lt;p&gt;There is an enormous wealth of information regarding how to structure data, which database/key-value-store/glorified-hash etc. to use, when, and how.&lt;/p&gt;
&lt;p&gt;How to acquire users, how to acquire information from these users, also how to access this information through APIs and how to make information accessible and so on.&lt;/p&gt;
&lt;p&gt;Tell me the size of your valley, the number and color of cars expected, and I can provide you with a set of blueprints in a nice price range.&lt;/p&gt;
&lt;p&gt;This is great. But what happens when it is about making this information searchable? Not so great in my humble opinion.&lt;/p&gt;
&lt;h2&gt;Information searching&lt;/h2&gt;
&lt;p&gt;I&amp;#8217;ve recently experienced a few cases where the analysis for which search engine to use went something like this:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&amp;#8220;Oh, we&amp;#8217;ve used it for project X, it will be great in project Y (totally different project).&amp;#8221;&lt;/li&gt;
	&lt;li&gt;&amp;#8220;Just use the gem in ActiveRecord, and it takes care of everything.&amp;#8221;&lt;/li&gt;
	&lt;li&gt;&amp;#8220;Search engine X is cool, Y recommended it on his blog yesterday!&amp;#8221;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;While I appreciate the strict time constraints often involved in projects, the above reasons should not be used by an engineer worth his salt.&lt;/p&gt;
&lt;p&gt;Yes, using search engine X will not end in disaster, and yes, it will return some results to the user.&lt;/p&gt;
&lt;p&gt;But instead of building an elegant bamboo bridge over the wide jungle river, perfect for one person, you built a concrete bridge.&lt;/p&gt;
&lt;p&gt;Yes, it works. Yes, a person can safely cross the river. But most of the jungle is destroyed. Nobody feels comfortable using it. The town next to it had to spend most of its money on it.&lt;/p&gt;
&lt;p&gt;What I am saying is: You arrived through lots of reasoning at your decision to use e.g. Redis over MongoDB, and can and will defend it if asked for your reasons – but in information searching, this is often not the case.&lt;/p&gt;
&lt;p&gt;Or can you tell me why you used search engine X in your last project?&lt;/p&gt;
&lt;p&gt;I know that often the first step is acquiring information, and towards the end, project managers notice that they were so busy acquiring all this information that they totally forgot to think about making it properly searchable. Time constraints then trash sensible search engine selection.&lt;/p&gt;
&lt;p&gt;There are other reasons why searching is neglected, but this is the one I have experienced most often.&lt;/p&gt;
&lt;h2&gt;The resulting problem&lt;/h2&gt;
&lt;p&gt;Often the end result of our careless choice is that the coders are quite happy, and the end users are relatively happy. But not a good happy, more of an accepting happy. Yes, we can search, and we should be grateful for it.&lt;/p&gt;
&lt;p&gt;But are you really happy? Did you really put your engineering savvy into it to help your users advance?&lt;/p&gt;
&lt;p&gt;Not really, right?&lt;/p&gt;
&lt;h2&gt;What we need to do&lt;/h2&gt;
&lt;p&gt;Know your problem domain, your information structure. Know your options and tools too.&lt;/p&gt;
&lt;p&gt;Do you specifically need a realtime indexing search? There&amp;#8217;s &lt;a href=&quot;http://masanjin.net/whistlepig/&quot;&gt;one written in Ruby&lt;/a&gt; (just as an example for a rather special/specific search engine – not sure how far it is yet).&lt;/p&gt;
&lt;p&gt;Do you really need a full-text search? Do you know &lt;a href=&quot;http://en.wikipedia.org/wiki/Full_text_search&quot;&gt;what a full text search&lt;/a&gt; is? Do you know when to use one and also, when not? When is a &lt;a href=&quot;http://en.wikipedia.org/wiki/Semantic_search&quot;&gt;semantic search engine&lt;/a&gt; the better choice?&lt;/p&gt;
&lt;p&gt;Do you know the answers?&lt;/p&gt;
&lt;p&gt;Btw, not dissing full-text search engines to promote &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;Picky&lt;/a&gt; (the semantic search engine) here ;) They&amp;#8217;re great.&lt;/p&gt;
&lt;p&gt;What I&amp;#8217;m criticizing is the indiscriminate choice by many of my peers. I&amp;#8217;m just trying to bring the point across that one should weigh the options, and decide based on reason.&lt;/p&gt;
&lt;h2&gt;Fallacy: Search Engines are hard&lt;/h2&gt;
&lt;p&gt;I guess that sometimes the problem is just that search engines seem like magic. Sure, most of the time you know which knob to turn, but when something unexpected happens, you feel like a wet dog out in the wind.&lt;/p&gt;
&lt;p&gt;Search engines are easy, actually. Take some time and &lt;a href=&quot;http://en.wikipedia.org/wiki/Search_engine_(computing)&quot;&gt;read all about them&lt;/a&gt;, especially by following the links.&lt;/p&gt;
&lt;p&gt;Mind, blown?&lt;/p&gt;
&lt;h2&gt;Sidenote: Computer Science vs. &amp;#8220;Informatik&amp;#8221;&lt;/h2&gt;
&lt;p&gt;&amp;#8220;We want information. In-for-mation!&amp;#8221; &lt;a href=&quot;http://en.wikipedia.org/wiki/The_Prisoner&quot;&gt;(The Prisoner)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;German-speaking countries got it right: They got computer science pegged.&lt;/p&gt;
&lt;p&gt;I love the start of this &lt;a href=&quot;http://groups.csail.mit.edu/mac/classes/6.001/abelson-sussman-lectures/&quot;&gt;set of lectures&lt;/a&gt; by Abelson and Sussman. In it, one of the guys casually strikes through &amp;#8220;Computer&amp;#8221;, then &amp;#8220;Science&amp;#8221;.&lt;/p&gt;
&lt;p&gt;Watch them and be enlightened. And they are so right.&lt;/p&gt;
&lt;p&gt;Why? Our work is not about computers, it&amp;#8217;s about information. Acquiring, analyzing, understanding, searching, offering: Information.&lt;/p&gt;
&lt;p&gt;In German, &lt;a href=&quot;http://de.wikipedia.org/wiki/Informatik&quot;&gt;Informatik&lt;/a&gt; is a combination of &amp;#8220;Information&amp;#8221; and &amp;#8220;Mathematik&amp;#8221;. That&amp;#8217;s calling a horse a horse!&lt;/p&gt;
&lt;p&gt;Actually, in English the term exists as well, &lt;a href=&quot;http://en.wikipedia.org/wiki/Informatics&quot;&gt;Informatics&lt;/a&gt; – but I&amp;#8217;ve never heard it used.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;that in information searching we sometimes forget we&amp;#8217;re engineers.&lt;/li&gt;
	&lt;li&gt;that there are many different types of search engines.&lt;/li&gt;
	&lt;li&gt;that we perhaps should be talking about &amp;#8220;informatics&amp;#8221; from now on.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new :)&lt;/p&gt;
&lt;p&gt;Comments and feedback, as usual, are appreciated.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky 2.0</title>
   <link href="http://florianhanke.com/blog/2011/03/28/picky-two-dot-oh.html"/>
   <updated>2011-03-28T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/03/28/picky-two-dot-oh</id>
   <content type="html">&lt;p&gt;In my previous post, I talked about what bothers me in Picky&amp;#8217;s &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;, and did a few 2.0 prerelease versions with the improvements.&lt;/p&gt;
&lt;p&gt;After quite a bit of feedback, Picky 2.0 is released! :)&lt;/p&gt;
&lt;p&gt;So, what&amp;#8217;s in it for you and what do you need to change in your 1.x version to use the spankingly new gem?&lt;/p&gt;
&lt;h2&gt;What has changed?&lt;/h2&gt;
&lt;p&gt;Only four things. 2.0&amp;#8217;s change list is short but sweet.&lt;/p&gt;
&lt;h3&gt;Index definitions&lt;/h3&gt;
&lt;p&gt;We&amp;#8217;ve added a nice new possibility to define categories on an index. The blocky initializer. So where you had&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;index = Index::Memory.new(:name, source)
index.define_category :a
index.define_category :b
index.define_category :c&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;you now can write&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;index = Index::Memory.new(:name, source) do
  category :a
  category :b
  category :c
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This helps keep everything together a bit more tightly. Also, smoother skin by not having to type as much ;)&lt;/p&gt;
&lt;p&gt;The old style still works, but is totally shunned by veteran Pickiers. Be the hippest Pickier in town by using the blocky initializer style. You know you want it.&lt;/p&gt;
&lt;h3&gt;Query::Full/Live &amp;#8594; Search&lt;/h3&gt;
&lt;p&gt;The double definitions, &lt;code&gt;Query::Full&lt;/code&gt; and &lt;code&gt;Query::Live&lt;/code&gt; are no more. Good riddance!&lt;/p&gt;
&lt;p&gt;Instead, you simply use &lt;code&gt;Search&lt;/code&gt;. So instead of&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class MyBeooootifulPickySearch &amp;lt; Application

  route %r{^/books/full} =&amp;gt; Query::Full.new(some_index),
        %r{^/books/live} =&amp;gt; Query::Live.new(some_index)

end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;you use&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class MyBeooootifulPickySearch &amp;lt; Application

  route %r{^/books} =&amp;gt; Search.new(some_index)

end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It says &amp;#8220;Route this &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt; to that search with these indexes and options&amp;#8221;.
Much more understandable and sexier! :)&lt;/p&gt;
&lt;p&gt;To discern whether it is a full (with result ids) or live (without result ids) search, you pass an &lt;code&gt;ids&lt;/code&gt; query parameter, e.g. with curl:&lt;/p&gt;
&lt;pre class=&quot;sh_bash&quot;&gt;&lt;code&gt;$ curl 'localhost:8080/books?query=meow&amp;amp;ids=15&amp;amp;offset=0'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Defaults are 20 &lt;code&gt;ids&lt;/code&gt; and 0 &lt;code&gt;offset&lt;/code&gt;.&lt;/p&gt;
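&lt;p&gt;As a rough sketch of how a client might assemble such a request, the snippet below builds the path and query string with Ruby&amp;#8217;s standard &lt;code&gt;URI.encode_www_form&lt;/code&gt;. The helper name &lt;code&gt;search_path&lt;/code&gt; is made up for illustration and is not part of Picky; only the parameter names and the defaults (20 ids, offset 0) are taken from above.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'uri'

# Hypothetical helper, not part of Picky: builds the path and query string
# for a search request. The defaults mirror the ones stated above.
def search_path(path, query, ids: 20, offset: 0)
  path + '?' + URI.encode_www_form(query: query, ids: ids, offset: offset)
end

# A full search with the default 20 ids.
puts search_path('/books', 'meow')

# A search without result ids, i.e. what used to be a live query.
puts search_path('/books', 'meow', ids: 0)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Passing &lt;code&gt;ids: 0&lt;/code&gt; gives you what used to be a live query, everything else a full one.&lt;/p&gt;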
&lt;h3&gt;Similarity::Phonetic &amp;#8594; Similarity::DoubleMetaphone&lt;/h3&gt;
&lt;p&gt;We&amp;#8217;ve renamed &lt;code&gt;Similarity::Phonetic&lt;/code&gt; to &lt;code&gt;Similarity::DoubleMetaphone&lt;/code&gt;. It&amp;#8217;s still the same algorithm. See &lt;a href=&quot;http://en.wikipedia.org/wiki/Metaphone#Double_Metaphone&quot;&gt;the double metaphone&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Also, we&amp;#8217;ve added two default implementations, &lt;code&gt;Similarity::Metaphone&lt;/code&gt; and &lt;code&gt;Similarity::Soundex&lt;/code&gt; for your similarity pleasure :)&lt;/p&gt;
&lt;p&gt;Since Picky is normally used by programmers, &lt;code&gt;DoubleMetaphone&lt;/code&gt; is much clearer for what it actually does than &lt;code&gt;Phonetic&lt;/code&gt; – it&amp;#8217;s a bit of a mouthful, I admit.&lt;/p&gt;
&lt;p&gt;Picky will tell you if you still use the old &lt;code&gt;Phonetic&lt;/code&gt; definition in your &lt;code&gt;app/application.rb&lt;/code&gt;, so you don&amp;#8217;t need to learn this by heart.&lt;/p&gt;
&lt;h3&gt;Picky::Client::Full/Live (in a client) &amp;#8594; Picky::Client&lt;/h3&gt;
&lt;p&gt;The Picky client in your application needs a few changes. Only a single client is needed anymore. So instead of&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;FullBooksSearch = Picky::Client::Full.new ...
LiveBooksSearch = Picky::Client::Live.new ...&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;you use&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;BooksSearch = Picky::Client.new ...&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then, e.g. in your controller actions, pass the number of ids you need:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;BooksSearch.search params[:query], :ids =&amp;gt; params[:ids], :offset =&amp;gt; params[:offset]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;or directly, using &lt;code&gt;:ids =&amp;gt; 20&lt;/code&gt; or however you like it.&lt;/p&gt;
&lt;h3&gt;Various&lt;/h3&gt;
&lt;p&gt;Leading up to 2.0, we&amp;#8217;ve removed the hashbangs in the JS client history, added &lt;code&gt;rake stats&lt;/code&gt; and &lt;code&gt;rake analyze&lt;/code&gt;. See more in the repo&amp;#8217;s top level &lt;code&gt;history.textile&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;that Picky is two-dot-oh-soooome!&lt;/li&gt;
	&lt;li&gt;what you&amp;#8217;d need to change to be 2.0 compatible.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new :)&lt;/p&gt;
&lt;p&gt;Btw, protip: Generate a client and server using &lt;code&gt;picky generate&lt;/code&gt; and see how everything is defined in 2.0 and compare.&lt;/p&gt;
&lt;p&gt;Comments and feedback, as usual, are appreciated.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Picky's Coming of Age</title>
   <link href="http://florianhanke.com/blog/2011/03/16/pickys-adolescence.html"/>
   <updated>2011-03-16T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/03/16/pickys-adolescence</id>
   <content type="html">&lt;p&gt;I&amp;#8217;m gonna talk about what bothers me in Picky&amp;#8217;s current configuration and what I&amp;#8217;d like to propose for 2.0. Opinions or ideas for new &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt; features are very welcome!&lt;/p&gt;
&lt;h2&gt;A spot of bother&lt;/h2&gt;
&lt;p&gt;Since releasing 1.0, something&amp;#8217;s always bothered me about Picky&amp;#8217;s configuration.&lt;/p&gt;
&lt;p&gt;I used to think it&amp;#8217;s the abundance of class methods used in defining indexing, querying, or routing:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class MyBeooootifulPickySearch &amp;lt; Application

  default_indexing removes_characters: /[^äöüa-zA-Z0-9\s\/\-\&quot;\&amp;amp;\.]/
  # etc.

end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I usually prefer instances on which I define things. In a nutshell, it&amp;#8217;s more easily testable. But this is not really the problem.&lt;/p&gt;
&lt;p&gt;So, what is it that is bothering me?&lt;/p&gt;
&lt;h2&gt;What is really bothering me&lt;/h2&gt;
&lt;p&gt;Take a look at how routing and queries are defined:&lt;/p&gt;
&lt;p&gt;Here, we&amp;#8217;re routing &lt;code&gt;/all/full&lt;/code&gt;, &lt;code&gt;/all/live&lt;/code&gt; to queries that include three indexes, and &lt;code&gt;/contacts/full&lt;/code&gt;, &lt;code&gt;/contacts/live&lt;/code&gt; to queries with just the contacts index:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;route %r{\A/all/full\Z}      =&amp;gt; Query::Full.new(accounts_index, users_index, contacts_index),
      %r{\A/all/live\Z}      =&amp;gt; Query::Live.new(accounts_index, users_index, contacts_index),
      %r{\A/contacts/full\Z} =&amp;gt; Query::Full.new(contacts_index),
      %r{\A/contacts/live\Z} =&amp;gt; Query::Live.new(contacts_index)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the last sentence, I mention two things that are routed – why do I need double the number of route definitions?&lt;/p&gt;
&lt;h2&gt;Full and Live queries. Why?&lt;/h2&gt;
&lt;p&gt;Let me talk a little about the client to explain why this is so.&lt;/p&gt;
&lt;p&gt;The Picky client does two different types of queries:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;A &amp;#8220;live&amp;#8221; query, which is sent when typing, to update the number of results.&lt;/li&gt;
	&lt;li&gt;A &amp;#8220;full&amp;#8221; query, which is sent when the user presses return or chooses an allocation.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A full query needs to be enriched with rendered results, e.g. with list entries.&lt;/p&gt;
&lt;p&gt;This means that full queries need to go through the webapp to be enriched (rendered results etc.) and the live queries can go directly to the server, as no enriching is needed.&lt;/p&gt;
&lt;p&gt;Also, live and full queries were once very different. I&amp;#8217;ve worked hard to unify them, and the only difference that still exists is that live queries don&amp;#8217;t contain the result ids, or more precisely: They return 0 result ids, while full queries return 20 ids by default.&lt;/p&gt;
&lt;p&gt;The other reason was that I needed two different URLs to have &lt;a href=&quot;http://www.varnish-cache.org/&quot;&gt;Varnish&lt;/a&gt; route the live queries directly to the server (since the id count alone didn&amp;#8217;t need to be enriched by the webapp), and the full queries were routed through the webapp, like so:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/api1.png&quot; class=&quot;nonfloat&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Isn&amp;#8217;t it a bit overkill having to define two identical routes for two queries where just the amount of ids is different?&lt;/p&gt;
&lt;p&gt;Absolutely.&lt;/p&gt;
&lt;h2&gt;A better solution&lt;/h2&gt;
&lt;p&gt;What I&amp;#8217;d like to have is the following&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;route %r{\A/all\Z}      =&amp;gt; Query.new(accounts_index, users_index, contacts_index),
      %r{\A/contacts\Z} =&amp;gt; Query.new(contacts_index)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This would &lt;span class=&quot;caps&quot;&gt;DRY&lt;/span&gt; up the code immensely.&lt;/p&gt;
&lt;h2&gt;Problems with this solution&lt;/h2&gt;
&lt;p&gt;But now we&amp;#8217;re presented with two problems:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;How do we tell the server that we need 0, or 20 ids, and where?&lt;/li&gt;
	&lt;li&gt;How can I route the queries differently?&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Solutions to these problems&lt;/h2&gt;
&lt;p&gt;I suggest that the first problem is handled by a query parameter &lt;code&gt;ids&lt;/code&gt;. So a query through curl would look like this:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;curl 'localhost:8080/contacts?query=miller&amp;amp;ids=20'&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Even if this means more typing, it is much more convenient and flexible to use. What I now can do is define default amounts in the JS client and in the webapp client (picky-client gem).&lt;/p&gt;
&lt;p&gt;The second problem is routing the queries differently. With the new way, you are much more flexible in this. Several solutions are possible. Say you have a Varnish:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;If query param &lt;code&gt;ids&lt;/code&gt; is 0, we route directly to the server. If not, it is routed through the webapp.&lt;/li&gt;
	&lt;li&gt;Define two different URLs, route the live one right on to the server and send the other through the webapp.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Or without Varnish (or Nginx etc.):&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Speed is not an issue? Route both through the webapp, and do different queries from there – one with 0 ids, one with 20.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Or any other way that suits you best.&lt;/p&gt;
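&lt;p&gt;To make the first Varnish option a bit more concrete, here is the decision itself written as a small Ruby function. It is only a sketch of the rule, not Picky or Varnish code: the name &lt;code&gt;route_target&lt;/code&gt; and the return values are invented, and treating a missing &lt;code&gt;ids&lt;/code&gt; parameter as 20 is an assumption.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'uri'

# Hypothetical rule, not Picky or Varnish code: decide where a request goes
# based on its ids parameter. A missing ids is assumed to default to 20.
def route_target(query_string)
  params = URI.decode_www_form(query_string).to_h
  params.fetch('ids', '20').to_i.zero? ? :server : :webapp
end

puts route_target('ids=0')    # server
puts route_target('ids=20')   # webapp&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In a real setup this check would live in the proxy configuration, not in Ruby.&lt;/p&gt;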
&lt;h2&gt;Picky 2.0&lt;/h2&gt;
&lt;p&gt;Since this really irritates me, I&amp;#8217;ll start working on it &lt;span class=&quot;caps&quot;&gt;ASAP&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;Most work is needed in the documentation – so if after the release, you see the old style anywhere, please tell me so.&lt;/p&gt;
&lt;p&gt;Yeah, Picky 2.0 – good times! :)&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;that Picky lives in a wet environment and needs some DRYing up.&lt;/li&gt;
	&lt;li&gt;that Picky 2.0 is around the corner.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new :)&lt;/p&gt;
&lt;p&gt;If you have some feedback on what else could be improved, comment away!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Searching with Picky&amp;#58; Rake Tasks</title>
   <link href="http://florianhanke.com/blog/2011/03/13/searching-with-picky-rake-tasks.html"/>
   <updated>2011-03-13T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/03/13/searching-with-picky-rake-tasks</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/index.html&quot;&gt;Picky&lt;/a&gt; series on its workings. If you haven&amp;#8217;t tried it yet, do so in the &lt;a href=&quot;http://florianhanke.com/picky/getting_started.html&quot;&gt;Getting Started&lt;/a&gt; section. It&amp;#8217;s quick and painless :)&lt;/p&gt;
&lt;p&gt;We&amp;#8217;ve all used &lt;code&gt;rake index&lt;/code&gt; and &lt;code&gt;rake start&lt;/code&gt; to index and then start up a server. But did you know that Picky (and Rake, one of his best buddies) offer quite a few more?&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s do a quick &lt;code&gt;rake -T&lt;/code&gt;. What we get is:&lt;/p&gt;
&lt;pre class=&quot;sh_bash&quot;&gt;&lt;code&gt;$ rake -T
rake analyze                         # Analyze your indexes (needs rake index).
rake check:index                     # Checks the index files for files that are small or missing.
rake index                           # Generate the index (random order).
rake index:ordered                   # Takes a snapshot, indexes, and caches in order given.
rake index:randomly                  # Takes a snapshot, indexes, and caches in random order.
rake index:specific[index,category]  # Generates a specific index from index snapshots (category opt).
rake routes                          # Shows the available URL paths
rake spec                            # Run all specs in spec directory.
rake start                           # Start the server.
rake stats                           # Application summary.
rake stop                            # Stop the server.
rake try[text,index,category]        # Try the given text in the indexer/query (index/category opt).
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I will give you a quick overview over each of them, with the idea that you know what&amp;#8217;s there and can try them yourself if you need details.&lt;/p&gt;
&lt;p&gt;Before we begin, a note on the naming: I used to name rake tasks &lt;code&gt;rake subject:verb&lt;/code&gt;, but not in Picky, since Picky has a lot of single word tasks. So they&amp;#8217;re named &lt;code&gt;rake verb:subject&lt;/code&gt;, as subjects are usually not present.&lt;/p&gt;
&lt;p&gt;I&amp;#8217;ll start out with the fun ones.&lt;/p&gt;
&lt;h2&gt;rake try[text,index,category]&lt;/h2&gt;
&lt;p&gt;Suppose you send a few queries to Picky and you get empty results, even though you know that &amp;#8220;it must be in the indextubes&amp;#8221; aka &amp;#8220;Y U NO &lt;span class=&quot;caps&quot;&gt;FIND&lt;/span&gt;?&amp;#8221;.&lt;/p&gt;
&lt;p&gt;This is the task for you! It shows you how a text gets split up into tokens, in indexing and querying. Let me show you with an example project of mine:&lt;/p&gt;
&lt;pre class=&quot;sh_bash&quot;&gt;&lt;code&gt;$ rake 'try[flöre.hanke]'
...
&quot;flöre.hanke&quot; is saved in the index as             [:floerehanke]
&quot;flöre.hanke&quot; as a query will be preprocessed into [:&quot;floere.hanke&quot;]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I used single quotes to remind you that you might need these to escape special characters.&lt;/p&gt;
&lt;p&gt;So what we see is that if my specific Picky app encounters &lt;code&gt;flöre.hanke&lt;/code&gt;, it will index it as one word, remove &lt;code&gt;.&lt;/code&gt;, and replace the umlaut &lt;code&gt;ö&lt;/code&gt; with &lt;code&gt;oe&lt;/code&gt;, as per German rules.&lt;/p&gt;
&lt;p&gt;However, in a query, if someone searches for &lt;code&gt;flöre.hanke&lt;/code&gt;, my specific Picky app will not remove the &lt;code&gt;.&lt;/code&gt; but use it as given (with the exception of the replaced &lt;code&gt;ö&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;So, in this case, nothing would be found.&lt;/p&gt;
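&lt;p&gt;The mismatch is easy to reproduce in plain Ruby. The following is a minimal sketch that does not use Picky&amp;#8217;s tokenizers at all; the method names are invented, and the two methods only mimic the behaviour that &lt;code&gt;rake try&lt;/code&gt; reported for my app.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Plain-Ruby sketch, not Picky's API: two hypothetical preprocessors that
# mimic the indexing and querying behaviour reported by rake try.

# Replace German umlauts the way the example app does.
def substitute_umlauts(text)
  text.gsub('ä', 'ae').gsub('ö', 'oe').gsub('ü', 'ue')
end

# Indexing side: substitutes umlauts and removes dots.
def index_token(text)
  substitute_umlauts(text).delete('.').to_sym
end

# Querying side: substitutes umlauts but keeps dots.
def query_token(text)
  substitute_umlauts(text).to_sym
end

indexed = index_token('flöre.hanke')   # dot removed
queried = query_token('flöre.hanke')   # dot kept

# The tokens differ, so the query cannot hit the indexed key.
puts indexed == queried                # false&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Making both sides treat the &lt;code&gt;.&lt;/code&gt; the same way is one way to get the two tokens to line up again.&lt;/p&gt;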
&lt;p&gt;The &lt;code&gt;index&lt;/code&gt; and &lt;code&gt;category&lt;/code&gt; options let you specify with which index and category you&amp;#8217;d like to &lt;code&gt;try&lt;/code&gt; them.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;rake try&lt;/code&gt; is your first line of defense against nasty configuration bugs.&lt;/p&gt;
&lt;p&gt;The interesting thing here is that often, the configurations for indexing and querying are similar.
The intelligence and beauty lies in &lt;em&gt;where they are not&lt;/em&gt;.&lt;/p&gt;
&lt;h2&gt;rake routes&lt;/h2&gt;
&lt;p&gt;Remember Rails? That huge framework that was eventually replaced by Sinatra? Same rake task: &lt;code&gt;rake routes&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;It blasts out all your routes and where they route to:&lt;/p&gt;
&lt;pre class=&quot;sh_bash&quot;&gt;&lt;code&gt;$ rake routes
...
Note: Anchored (✓) regexps are faster, e.g. /\A.*\Z/ or /^.*$/.
✓  \A/admin\Z      =&amp;gt; Suckerfish Live Interface (Use the picky-live gem to introspect)
✓  \A/books/full\Z =&amp;gt; Query::Full(books, isbn, weights: {[:author]=&amp;gt;6, [:title, :author]=&amp;gt;5})
✓  \A/books/live\Z =&amp;gt; Query::Live(books, isbn, weights: {[:author]=&amp;gt;6, [:title, :author]=&amp;gt;5})
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;rake stats&lt;/h2&gt;
&lt;p&gt;Similar to Rails&amp;#8217; &lt;code&gt;rake stats&lt;/code&gt;, but with more steroids. Let me just show you an example:&lt;/p&gt;
&lt;pre class=&quot;sh_bash&quot;&gt;&lt;code&gt;$ rake stats
...
Application(s)
  Definition LOC:    81
  Indexes defined:    2

  BookSearch
    Indexing (default):
      Removes characters:        /[^äöüa-zA-Z0-9\s\/\-\&quot;\&amp;amp;\.]/
      Stopwords:                 /\b(und|and|the|or|on|of|in|is|to|from|as|at|an)\b/
      Splits text on:            /[\s\/\-\&quot;\&amp;amp;]/
      Removes chars after split: /[\.]/
      Normalizes words:          [[/\$(\w+)/i, &quot;\\1 dollars&quot;]]
      Rejects tokens?            Yes, see line 10 in app/application.rb
      Substitutes chars?         Yes, using CharacterSubstituters::WestEuropean.

    Querying (default):
      Removes characters:        /[^ïôåñëäöüa-zA-Z0-9\s\/\-\,\&amp;amp;\.\&quot;\~\*\:]/
      Stopwords:                 /\b(und|and|the|or|on|of|in|is|to|from|as|at|an)\b/
      Splits text on:            /[\s\/\-\,\&amp;amp;]+/
      Removes chars after split: //
      Normalizes words:          -
      Rejects tokens?            -
      Substitutes chars?         Yes, using CharacterSubstituters::WestEuropean.

    Indexes:
      books (Index::Memory):
        source:            Sources::DB(&quot;SELECT id, title, author, year FROM books&quot;, {:file=&amp;gt;&quot;app/db.yml&quot;})
        categories:        id, title, author, year
        result identifier: &quot;boooookies&quot;

      redis (Index::Redis):
        source:            Sources::CSV(title, author, isbn, year, publisher, subjects, {:file=&amp;gt;&quot;data/books.csv&quot;})
        categories:        title, author, year, publisher, subjects


    Routes:
      Note: Anchored (✓) regexps are faster, e.g. /\A.*\Z/ or /^.*$/.

      ✓  \A/admin\Z      =&amp;gt; Suckerfish Live Interface (Use the picky-live gem to introspect)
      ✓  \A/books/full\Z =&amp;gt; Query::Full(books, redis, weights: {[:author]=&amp;gt;6, [:title, :author]=&amp;gt;5)
      ✓  \A/books/live\Z =&amp;gt; Query::Live(books, redis, weights: {[:author]=&amp;gt;6, [:title, :author]=&amp;gt;5)
      ✓  \A/redis/full\Z =&amp;gt; Query::Full(redis, weights: {[:author]=&amp;gt;6, [:title, :author]=&amp;gt;5)
      ✓  \A/redis/live\Z =&amp;gt; Query::Live(redis, weights: {[:author]=&amp;gt;6, [:title, :author]=&amp;gt;5)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is cool, right? In one fell swoop you see who uses what stopwords,
which characters aren&amp;#8217;t removed, and how many &lt;span class=&quot;caps&quot;&gt;LOC&lt;/span&gt; your config file has. I love it.&lt;/p&gt;
&lt;p&gt;The routes are also available separately for just $9.99 … uh, I mean, as &lt;code&gt;rake routes&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;rake analyze&lt;/h2&gt;
&lt;p&gt;This task takes a look at your indexes and tells you a few statistics about them.
This is most likely to evolve into something more powerful with each iteration.&lt;/p&gt;
&lt;p&gt;For now, it gives you this:&lt;/p&gt;
&lt;pre class=&quot;sh_bash&quot;&gt;&lt;code&gt;$ rake analyze
...
Indexes analysis:
  books:id::
    exact:
      Index matches single characters.
      There's only one id per key – you'll only get single results.
      index key cardinality:                       540
      index key length range (avg):               1..3 (2.8)
      index ids per key length range (avg):       1..1 (1.0)
      weights range (avg):                    0.0..0.0 (0.0)
    partial*:
      Index matches single characters.
      index key cardinality:                       540
      index key length range (avg):               1..3 (2.8)
      index ids per key length range (avg):     1..111 (2.8)
      weights range (avg):                   0.0..4.71 (0.26)

  books:title::
    exact:
      Index matches single characters.
      index key cardinality:                      1681
      index key length range (avg):              1..19 (7.4)
      index ids per key length range (avg):      1..81 (1.9)
      weights range (avg):                   0.0..4.39 (0.33)
      similarity key length range (avg):          0..4 (3.58)
    partial*:
      Index matches single characters.
      index key cardinality:                      7010
      index key length range (avg):              1..19 (6.29)
      index ids per key length range (avg):     1..242 (3.08)
      weights range (avg):                   0.0..5.49 (0.52)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Most of it is probably gibberish. Picky tries to give you useful notes (in color, not visible above) about the indexes,
for example the &lt;span style=&quot;color:orange;&quot;&gt;Index matches single characters&lt;/span&gt; (when a single character already gets results) or
as a warning &lt;span style=&quot;color:red;&quot;&gt;There&amp;#8217;s only one id per key – you&amp;#8217;ll only get single results&lt;/span&gt;
(when you&amp;#8217;ll only get one result id per query – which might not be what you want).&lt;/p&gt;
&lt;h2&gt;rake index&amp;#8230;&lt;/h2&gt;
&lt;p&gt;Frankly, if you haven&amp;#8217;t seen &lt;code&gt;rake index&lt;/code&gt; yet, you haven&amp;#8217;t tried Picky yet. If this were a flow diagram, you&amp;#8217;d be sent back to the start ;)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;rake index&lt;/code&gt; does just that. It indexes.&lt;/p&gt;
&lt;p&gt;You can tell it in what order to index them by using &lt;code&gt;rake index:ordered&lt;/code&gt; and &lt;code&gt;rake index:randomly&lt;/code&gt;, which will index the indexes either in the order they were defined or in a random fashion. Default is randomly, but if you&amp;#8217;re not happy with that, tell Picky explicitly.&lt;/p&gt;
&lt;p&gt;You can tell Picky to just index a single index, or even more specifically, a single category inside a given index. Use &lt;code&gt;rake index:specific[books,title]&lt;/code&gt;.
It also tells you when an index or category is not there:&lt;/p&gt;
&lt;pre class=&quot;sh_bash&quot;&gt;&lt;code&gt;$ rake index:specific[books,isbn]
...
rake aborted!
Index category &quot;isbn&quot; not found. Possible categories: &quot;id&quot;, &quot;title&quot;, &quot;author&quot;, &quot;year&quot;.&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;rake check&amp;#8230;&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;rake check:index&lt;/code&gt; checks the indexes for suspiciously small or nonexistent indexes.&lt;/p&gt;
&lt;h2&gt;rake start/stop&lt;/h2&gt;
&lt;p&gt;One starts a Unicorn server, one stops it. I always forget which is which.&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s not too webserver agnostic yet, but as soon as somebody complains, I will rewrite it to be so – if you&amp;#8217;re not faster with one of these beloved pull requests :)&lt;/p&gt;
&lt;h2&gt;rake spec&lt;/h2&gt;
&lt;p&gt;You will be surprised by this one: Runs the specs in the spec directory.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;that Picky does not just &lt;code&gt;rake index&lt;/code&gt; and &lt;code&gt;rake start&lt;/code&gt;.&lt;/li&gt;
	&lt;li&gt;that Picky gives you a few command line tools (apart from the web tools) to find bugs in your config.&lt;/li&gt;
	&lt;li&gt;that Picky is not just good for picking up girls in bars.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new :)&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Searching with Picky&amp;#58; Redis</title>
   <link href="http://florianhanke.com/blog/2011/03/02/searching-with-picky-redis.html"/>
   <updated>2011-03-02T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/03/02/searching-with-picky-redis</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/index.html&quot;&gt;Picky&lt;/a&gt; series on its workings. If you haven&amp;#8217;t tried it yet, do so in the &lt;a href=&quot;http://florianhanke.com/picky/getting_started.html&quot;&gt;Getting Started&lt;/a&gt; section. It&amp;#8217;s quick and painless :)&lt;/p&gt;
&lt;p&gt;This post will be a very short introduction on Redis index backends and Picky, and how to configure your indexes to run on Redis.&lt;/p&gt;
&lt;p&gt;I intended to do a massive writeup, but since all you do is change 6 characters &lt;code&gt;Memory&lt;/code&gt; into 5 different characters &lt;code&gt;Redis&lt;/code&gt; it just seemed like a massive overkill.&lt;/p&gt;
&lt;p&gt;I admit though that many massive writeups have been done on even smaller changes, like &amp;#8220;1.8&amp;#8221; &amp;#8594; &amp;#8220;1.9&amp;#8221; ;)&lt;/p&gt;
&lt;p&gt;Ok, so what am I talking about?&lt;/p&gt;
&lt;h2&gt;tl;dr&lt;/h2&gt;
&lt;ol&gt;
	&lt;li&gt;&lt;a href=&quot;http://redis.io/&quot;&gt;Redis&lt;/a&gt; can now be used in Picky as an index backend.&lt;/li&gt;
	&lt;li&gt;In your config, do &lt;code&gt;Index::Memory.new&lt;/code&gt; &amp;#8594; &lt;code&gt;Index::Redis.new&lt;/code&gt; and you&amp;#8217;re set :)&lt;/li&gt;
	&lt;li&gt;Memory and Redis indexes cannot (yet) be mixed and matched.&lt;/li&gt;
	&lt;li&gt;In 1.5.0, Picky uses Redis database 15.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;What is Redis?&lt;/h2&gt;
&lt;p&gt;Redis is – taken from the website – an &amp;#8220;&lt;em&gt;open source, advanced key-value store&lt;/em&gt;&amp;#8221;. But this is not all. It also is a &amp;#8220;&lt;em&gt;data structure server&lt;/em&gt;&amp;#8221;. Check it out &lt;a href=&quot;http://redis.io/&quot;&gt;on its very nicely done website&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&amp;#8220;But we already have the massively fast in-memory backend. Why Redis?&amp;#8221;, you scream, indignantly.&lt;/p&gt;
&lt;h2&gt;Why Redis?&lt;/h2&gt;
&lt;p&gt;Granted, in-memory indexes in Picky are really fast. But they have a few drawbacks:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;Relatively slow search engine startup, as the &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt; index files need to be loaded into memory. This is especially noticeable if the index is around 12 GB.&lt;/li&gt;
	&lt;li&gt;To restart Unicorn without a hitch you need double the space the in-memory index needs, since Unicorn starts up a second master in parallel to the old one.&lt;/li&gt;
	&lt;li&gt;They need to be reloaded to be updated (see last blog post).&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;I haven&amp;#8217;t had any problems with that, but I can see the problem. Hence, Redis.&lt;/p&gt;
&lt;h2&gt;How do you use Redis indexes?&lt;/h2&gt;
&lt;p&gt;Looking at the configuration that the scaffolding generates, you see that it uses an &lt;code&gt;Index::Memory&lt;/code&gt; called books:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;books_index = Index::Memory.new :books, Sources::CSV.new(:title, :author, file: 'app/library.csv')&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you&amp;#8217;d like to use the Redis backend instead, you&amp;#8217;ll have to change &lt;code&gt;Memory&lt;/code&gt; into &lt;code&gt;Redis&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;books_index = Index::Redis.new :books, Sources::CSV.new(:title, :author, file: 'app/library.csv')&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I know. Picky is hard on the typing hand ;)&lt;/p&gt;
&lt;p&gt;Uh. That&amp;#8217;s already it. Welcome Redis. Bye bye, Memory.&lt;/p&gt;
&lt;p&gt;What you have to do now is re-index and start Picky:&lt;/p&gt;
&lt;pre class=&quot;sh_bash&quot;&gt;&lt;code&gt;$ rake index
... indexing output ...
$ rake start&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Or, start Picky, re-index and search while it is indexing, to get some added fun value.&lt;/p&gt;
&lt;h2&gt;What is the impact of Redis indexes?&lt;/h2&gt;
&lt;p&gt;Compared to the in-memory index, what are the advantages and disadvantages?&lt;/p&gt;
&lt;p&gt;Advantages:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Faster startup time, especially with a large index.&lt;/li&gt;
	&lt;li&gt;Indexing as-you-search. (No index reloading)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Drawbacks:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Factors slower.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Caveats / Next Versions&lt;/h2&gt;
&lt;p&gt;The Redis backend implementation in Picky is not yet customizable. This means that:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;It uses Redis database 15.&lt;/li&gt;
	&lt;li&gt;Returned entry ids are always strings, even when they were integers going in. You&amp;#8217;ll have to convert them back.&lt;/li&gt;
	&lt;li&gt;Redis and Memory indexes cannot (yet) be mixed and matched. So this isn&amp;#8217;t possible: &lt;code&gt;Query::Full.new(redis_index, memory_index)&lt;/code&gt;. Picky will notify you if you try to do so, so no worries.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;I am focusing on these points in the upcoming 1.5.* versions.&lt;/p&gt;
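&lt;p&gt;Until then, converting the ids back is a one-liner. A minimal sketch, assuming the ids were integers going in (the sample ids are made up for illustration):&lt;/p&gt;

```ruby
# Made-up ids, as strings, the way a Redis-backed index hands them back.
string_ids = ['1', '42', '1337']

# Integer() raises an ArgumentError on anything that is not a clean integer,
# which is safer than String#to_i (that silently turns 'abc' into 0).
integer_ids = string_ids.map { |id| Integer(id, 10) }

p integer_ids
```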
&lt;h2&gt;Outlook&lt;/h2&gt;
&lt;p&gt;One of the next blog posts will look at the performance differences between the Redis backend and the memory backend.&lt;/p&gt;
&lt;p&gt;I can already reveal that the memory backend will be faster. Surprise! ;) The question is: Is Redis so much slower as to be unbearable?&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Music, pregnant with suspense, fills the room: Dun dun &lt;span class=&quot;caps&quot;&gt;DUNNN&lt;/span&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;what Redis is.&lt;/li&gt;
	&lt;li&gt;that Picky offers two different index backends: In-Memory and Redis.&lt;/li&gt;
	&lt;li&gt;how you use/implement the Redis index backend in your search.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new :)&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Searching with Picky&amp;#58; Live&amp;nbsp;reloading&amp;nbsp;indexes</title>
   <link href="http://florianhanke.com/blog/2011/02/20/searching-with-picky-live-reloading-indexes.html"/>
   <updated>2011-02-20T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/02/20/searching-with-picky-live-reloading-indexes</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/index.html&quot;&gt;Picky&lt;/a&gt; series on its workings. If you haven&amp;#8217;t tried it yet, do so in the &lt;a href=&quot;http://florianhanke.com/picky/getting_started.html&quot;&gt;Getting Started&lt;/a&gt; section. It&amp;#8217;s quick and painless :)&lt;/p&gt;
&lt;p&gt;This post is on reloading indexes by way of signals. So, first let&amp;#8217;s talk a little about signals. Then, in the second half, I talk about reloading the memory index in Picky.&lt;/p&gt;
&lt;p&gt;Warp 9?&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;a href=&quot;#signals&quot;&gt;Signals in Ruby&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;#oldhandler&quot;&gt;Still calling the old trap handler&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;#reloading&quot;&gt;Reloading the indexes&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;#exception&quot;&gt;Back when Ruby was mostly foxes and bacon&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;#conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;En-gage.&lt;/p&gt;
&lt;h2 id=&quot;signals&quot;&gt;Signals in Ruby&lt;/h2&gt;
&lt;p&gt;Signals are a way of sending instructions to a running process. Here&amp;#8217;s a &lt;a href=&quot;http://www.ruby-doc.org/core/classes/Signal.html#M001253&quot;&gt;list of signals&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In Ruby you handle these signals by giving the &lt;a href=&quot;http://www.ruby-doc.org/core/classes/Signal.html#M001252&quot;&gt;Signal.trap&lt;/a&gt; method a handler block.&lt;/p&gt;
&lt;p&gt;What if I give it two? Let&amp;#8217;s try it.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Signal.trap('USR1') { p &quot;hello&quot; }
Signal.trap('USR1') { p &quot;world&quot; }

# Print out the process PID such that it is easier
# to enter &quot;kill -USR1 the_printed_process_pid&quot;
#
puts Process.pid

# You have sixty seconds to defuse … err try this example.
#
sleep 60&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then, enter &lt;code&gt;kill -USR1 &amp;lt;the printed process pid&amp;gt;&lt;/code&gt; and see what happens.&lt;/p&gt;
&lt;p&gt;What happens is that the second block that prints &amp;#8220;world&amp;#8221; replaces the first one. So we see:&lt;/p&gt;
&lt;pre class=&quot;sh_bash&quot;&gt;&lt;code&gt;type here&amp;gt; ruby signals.rb 
77306
&quot;world&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Ruby throws the old block away. What if I don&amp;#8217;t want this?&lt;/p&gt;
&lt;h2 id=&quot;oldhandler&quot;&gt;Still calling the old trap handler&lt;/h2&gt;
&lt;p&gt;So, for example, in Unicorn, sending the &lt;code&gt;USR1&lt;/code&gt; signal reopens all logs. What if I want to do something else? If I just do&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Signal.trap('USR1') { something_else }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;the old handler will be gone.&lt;/p&gt;
&lt;p&gt;So, my assumption was that Ruby gives me the old handler when calling&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;old_handler = Signal.trap('USR1')&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Nope. Hurting the &lt;a href=&quot;http://de.wikipedia.org/wiki/Principle_of_Least_Surprise&quot;&gt;&lt;span class=&quot;caps&quot;&gt;POLS&lt;/span&gt;&lt;/a&gt; a little here. It only gives me the old handler when installing a new one.&lt;/p&gt;
&lt;p&gt;So what can you do? Use this &amp;#8220;trick&amp;#8221;:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;old_handler = Signal.trap('USR1') {}
Signal.trap('USR1') { something_else; old_handler.call }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So I install a bogus handler to get the old handler, then throw the bogus handler away, right away, and call the old handler in the new handler.&lt;/p&gt;
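&lt;p&gt;One caveat: &lt;code&gt;Signal.trap&lt;/code&gt; only returns something callable if a Ruby handler was installed before. Otherwise it returns a string such as &lt;code&gt;DEFAULT&lt;/code&gt; (or nil), and calling it would fail. A slightly more defensive sketch of the same trick follows; the helper name &lt;code&gt;chain_trap&lt;/code&gt; is made up for this post and is not part of Ruby or Picky:&lt;/p&gt;

```ruby
# Hypothetical helper: installs +handler+ (anything that responds to #call)
# for +signal+ and still runs the previously installed handler afterwards.
def chain_trap(signal, handler)
  # Install a throwaway handler, only to get hold of the old one.
  previous = Signal.trap(signal) {}

  Signal.trap(signal) do
    handler.call
    # The old "handler" can be a String like "DEFAULT" or nil,
    # so only call it if it is actually callable.
    previous.call if previous.respond_to?(:call)
  end

  previous
end

# Usage, e.g. in a Picky server:
#   chain_trap('USR1', -> { Indexes.reload })
```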
&lt;h2 id=&quot;reloading&quot;&gt;Reloading the indexes&lt;/h2&gt;
&lt;p&gt;Currently, Picky does not support realtime indexes. It also runs with memory-only indexes (a &lt;a href=&quot;http://redis.io/&quot;&gt;Redis&lt;/a&gt; index backend is in the works). So, while the Picky server is running, it does not by itself pick up the new indexes, even if you generate new index files by running &lt;code&gt;rake index&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Btw, did you ever try to call &lt;code&gt;rake -T&lt;/code&gt; while in your Picky server project?&lt;/p&gt;
&lt;p&gt;How can we reload the indexes?&lt;/p&gt;
&lt;p&gt;Quite easy, actually. Reloading the memory indexes is done by calling&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Indexes.reload&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That&amp;#8217;s it.&lt;/p&gt;
&lt;p&gt;How do we get the Picky server process to call &lt;code&gt;Indexes.reload&lt;/code&gt;?&lt;/p&gt;
&lt;p&gt;Now talking about all that signal handling pays off! :)&lt;/p&gt;
&lt;h3&gt;… in a non-forking web server.&lt;/h3&gt;
&lt;p&gt;When running Picky in a non-forking web server, in e.g. thin, in the file &lt;code&gt;app/application.rb&lt;/code&gt; we&amp;#8217;d call&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Signal.trap('USR1') { Indexes.reload }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and then in the Terminal, we run&lt;/p&gt;
&lt;pre class=&quot;sh_bash&quot;&gt;&lt;code&gt;type here&amp;gt; rake index
... (Picky indexes and writes new index files. Afterwards you tell the server to reload the indexes.)
type here&amp;gt; kill -USR1 your_picky_server_process_id&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You should see some output that the server has reloaded the indexes.&lt;/p&gt;
&lt;h3&gt;… in a forking web server.&lt;/h3&gt;
&lt;p&gt;Unicorn, for example. Picky&amp;#8217;s current web server of choice.&lt;/p&gt;
&lt;p&gt;Since Unicorn already defines &lt;code&gt;USR1&lt;/code&gt;, we use the trick we&amp;#8217;ve talked about above to not replace the unicorn handler (if you need it):&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;old_handler = Signal.trap('USR1') {}
Signal.trap('USR1') { Indexes.reload; old_handler.call }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;(Doesn&amp;#8217;t have to be &lt;code&gt;USR1&lt;/code&gt;, btw)&lt;/p&gt;
&lt;p&gt;After indexing and sending the &lt;code&gt;USR1&lt;/code&gt; signal to the Unicorn master, we aren&amp;#8217;t finished, since the indexes have only been reloaded in the master, while the children are still happily using the old indexes.&lt;/p&gt;
&lt;p&gt;Check out &lt;a href=&quot;http://unicorn.bogomips.org/SIGNALS.html&quot;&gt;this very helpful page about signals in Unicorn&lt;/a&gt;. If &lt;code&gt;preload_app&lt;/code&gt; is set to &lt;code&gt;false&lt;/code&gt; in the unicorn.rb, you can just send a &lt;code&gt;HUP&lt;/code&gt; signal to the master. It will then kill all children and fork new ones. Finished.&lt;/p&gt;
&lt;p&gt;When using Unicorn, you may of course also use &lt;a href=&quot;http://unicorn.bogomips.org/SIGNALS.html&quot;&gt;the way Unicorn does it&lt;/a&gt;. See the instructions under &amp;#8220;Procedure to replace a running unicorn executable&amp;#8221;.&lt;/p&gt;
&lt;p&gt;Good stuff! Note, though, that this procedure needs around double the memory the Picky server normally uses, while index reloading only needs around 1.5 times the size of the largest sub-index (in a nutshell, a lot less than the Unicorn replacement technique).&lt;/p&gt;
&lt;h3&gt;… periodically.&lt;/h3&gt;
&lt;p&gt;What about reloading the indexes periodically?&lt;/p&gt;
&lt;p&gt;You could, of course, try to use a &lt;code&gt;Thread&lt;/code&gt;, trying to reload the indexes every X time units and monkey around with it (tell me if you are successful :) ). I wouldn&amp;#8217;t.&lt;/p&gt;
&lt;p&gt;I recommend triggering &lt;code&gt;rake index&lt;/code&gt; externally, and then triggering the reloads from outside using the signals mentioned above.&lt;/p&gt;
&lt;h2 id=&quot;exception&quot;&gt;Btw, a fun thing with signals you should not do&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Back when Ruby was mostly foxes and bacon&lt;/em&gt;, I happened to type this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;begin
  p Process.pid
  looong_running_method
rescue Exception =&amp;gt; e
  p &quot;Oh deary me!&quot;
  retry
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note: I did not actually type &lt;code&gt;looong_running_method&lt;/code&gt; and &lt;code&gt;&quot;Oh deary me!&quot;&lt;/code&gt;, but you get the idea ;)&lt;/p&gt;
&lt;p&gt;The idea was that if the long running method fails, it&amp;#8217;d just retry running it.&lt;/p&gt;
&lt;p&gt;Sounds good, right? Try running it, and stop it with &lt;code&gt;Ctrl-C&lt;/code&gt;. The problem is the line &lt;code&gt;rescue Exception =&amp;gt; e&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Why? I soon found out that catching all &lt;code&gt;Exceptions&lt;/code&gt; is not a good idea if you&amp;#8217;d like to stop your program by way of &lt;code&gt;Ctrl-C&lt;/code&gt;, since &lt;code&gt;SignalException&lt;/code&gt; inherits from &lt;code&gt;Exception&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;p SignalException.ancestors # =&amp;gt; [SignalException, Exception, Object, Kernel, BasicObject]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;Ctrl-C&lt;/code&gt; sends a &lt;code&gt;SIGINT&lt;/code&gt;, an &lt;code&gt;INT&lt;/code&gt; signal to your process. Internally, an &lt;code&gt;Interrupt&lt;/code&gt; (a subclass of &lt;code&gt;SignalException&lt;/code&gt;) is raised, which is then caught by the &lt;code&gt;rescue&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;A &lt;code&gt;kill -9&lt;/code&gt; sends this process to Walhalla. The place where all programs go that have incurred a major learning experience on their writers.&lt;/p&gt;
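&lt;p&gt;A less painful variant is to rescue &lt;code&gt;StandardError&lt;/code&gt; only, and to give up after a few attempts. &lt;code&gt;Interrupt&lt;/code&gt; is not a &lt;code&gt;StandardError&lt;/code&gt;, so &lt;code&gt;Ctrl-C&lt;/code&gt; still gets through. A minimal sketch; the helper &lt;code&gt;with_retries&lt;/code&gt; is made up for illustration:&lt;/p&gt;

```ruby
# Hypothetical helper: runs +job+ (anything that responds to #call) and
# retries it on ordinary errors, at most +max_attempts+ times in total.
def with_retries(max_attempts, job)
  attempts = 0
  begin
    attempts += 1
    job.call
  rescue StandardError
    # Signals such as Ctrl-C raise Interrupt, which is not a
    # StandardError, so they are not caught here.
    retry unless attempts >= max_attempts
    raise # out of attempts: re-raise the last error
  end
end

# Usage:
#   with_retries(3, -> { looong_running_method })
```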
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;how signals work&lt;/li&gt;
	&lt;li&gt;that reloading indexes in a running Picky server is easy&lt;/li&gt;
	&lt;li&gt;how you use signals to reload the server&lt;/li&gt;
	&lt;li&gt;how reloading works in different web servers&lt;/li&gt;
	&lt;li&gt;that reloading the indexes isn&amp;#8217;t without problems&lt;/li&gt;
	&lt;li&gt;that you need to be careful when catching exceptions&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">A better Rubygems search</title>
   <link href="http://florianhanke.com/blog/2011/02/13/a-better-rubygems-search.html"/>
   <updated>2011-02-13T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/02/13/a-better-rubygems-search</id>
   <content type="html">&lt;p&gt;Some time ago, &lt;a href=&quot;http://blog.absurd.li/&quot;&gt;Kaspar&lt;/a&gt; mentioned to me that it would be nice to have a &lt;em&gt;gem dependency search&lt;/em&gt;, i.e. where you could search in which gems a gem is used.&lt;/p&gt;
&lt;p&gt;I thought so too, so I wrote one :)
(and added some more features in the process)&lt;/p&gt;
&lt;p&gt;Take a look: &lt;a href=&quot;http://gemsearch.heroku.com/&quot;&gt;http://gemsearch.heroku.com/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;(Note, it might take &lt;a href=&quot;http://heroku.com&quot;&gt;Heroku&lt;/a&gt; some time to ramp up the server)&lt;/p&gt;
&lt;h2&gt;Current state&lt;/h2&gt;
&lt;p&gt;While the &lt;a href=&quot;http://rubygems.org/&quot;&gt;current search&lt;/a&gt; isn&amp;#8217;t bad, it is missing the possibility of searching for an &lt;strong&gt;author&lt;/strong&gt;, &lt;strong&gt;where&lt;/strong&gt; a gem is used, or which &lt;strong&gt;version&lt;/strong&gt; it has. Or any combination thereof, for that matter.&lt;/p&gt;
&lt;h2&gt;Building the search&lt;/h2&gt;
&lt;p&gt;I happened to have a &lt;a href=&quot;http://florianhanke.com/picky/&quot;&gt;fast &amp;amp; clever search engine&lt;/a&gt; lying around ;) so this is what I used.&lt;/p&gt;
&lt;p&gt;How do you go about building or configuring a search engine?&lt;/p&gt;
&lt;h3&gt;1. Look at what your goals are.&lt;/h3&gt;
&lt;p&gt;My goals seemed simple enough.&lt;/p&gt;
&lt;p&gt;Each gem should be findable under:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Its &lt;strong&gt;name&lt;/strong&gt; (&lt;a href=&quot;http://gemsearch.heroku.com/#/?q=heroku_addon&quot;&gt;Try it&lt;/a&gt;).&lt;/li&gt;
	&lt;li&gt;Its &lt;strong&gt;version(s)&lt;/strong&gt;, entered like x.y.z, or part thereof, x, x., x.y, x.y. (&lt;a href=&quot;http://gemsearch.heroku.com/#/?q=1%2E2%2E&quot;&gt;try it&lt;/a&gt;).&lt;/li&gt;
	&lt;li&gt;Its &lt;strong&gt;author&amp;#8217;s names&lt;/strong&gt;, or first/last names. Or parts thereof, like &amp;#8220;flo&amp;#8221; for florian (&lt;a href=&quot;http://gemsearch.heroku.com/#/?q=hanke+flo&quot;&gt;try it&lt;/a&gt;).&lt;/li&gt;
	&lt;li&gt;The gems it is &lt;strong&gt;dependent&lt;/strong&gt; upon. universe-parsing depends on parslet, for example (&lt;a href=&quot;http://gemsearch.heroku.com/#/?q=needs%3Aparslet&quot;&gt;try it&lt;/a&gt;).&lt;/li&gt;
	&lt;li&gt;The names, gem name and dependent gem name should be &lt;strong&gt;phonetically findable&lt;/strong&gt; (&lt;a href=&quot;http://gemsearch.heroku.com/#/?q=rspoc%7E&quot;&gt;try it&lt;/a&gt;).&lt;/li&gt;
	&lt;li&gt;The authors too should be &lt;strong&gt;phonetically findable&lt;/strong&gt; – since who knows how to write &amp;#8220;Heinemeier&amp;#8221; (&lt;a href=&quot;http://gemsearch.heroku.com/#/?q=heynemeyer%7E&quot;&gt;try it&lt;/a&gt;)?&lt;/li&gt;
	&lt;li&gt;All should be findable without entering the whole thing, like &amp;#8220;1.0&amp;#8221;, or &amp;#8220;activesupp&amp;#8221; (&lt;a href=&quot;http://gemsearch.heroku.com/#/?q=activesupp&quot;&gt;try it&lt;/a&gt;).&lt;/li&gt;
	&lt;li&gt;One should be able to specify what they are looking for by prefixing e.g. &amp;#8220;uses:&amp;#8221; in front of the search term (&lt;a href=&quot;http://gemsearch.heroku.com/#/?q=uses%3Apicky&quot;&gt;Try it&lt;/a&gt;). Or others, like &amp;#8220;dependency:&amp;#8221;, &amp;#8220;dependencies:&amp;#8221;, &amp;#8220;depends:&amp;#8221;, &amp;#8220;using:&amp;#8221;, &amp;#8220;uses:&amp;#8221;, &amp;#8220;use:&amp;#8221;, &amp;#8220;needs:&amp;#8221; (all possible).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I leave out the description for now, as it requires quite a bit of thinking and tinkering.&lt;/p&gt;
&lt;p&gt;With the goals defined&amp;#8230;&lt;/p&gt;
&lt;h3&gt;2. Look at the data.&lt;/h3&gt;
&lt;p&gt;I downloaded the &lt;a href=&quot;http://rubygems.org/Marshal.4.8.Z&quot;&gt;Marshal&lt;/a&gt; file, extracted the relevant data and restructured it into a readable &lt;span class=&quot;caps&quot;&gt;CSV&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;Two potential problems I noticed:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Gem names are spaced using either an underscore _ or a hyphen -.&lt;/li&gt;
	&lt;li&gt;For the same name, there are sometimes up to three different encodings. Take the gems of &lt;a href=&quot;http://www.twitter.com/godfoca&quot;&gt;Nicolás Sanguinetti&lt;/a&gt; for example. &lt;a href=&quot;http://gemsearch.heroku.com/#/?q=nicolas+sanguinetti&quot;&gt;Try it&lt;/a&gt; and look at the author names.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Those were problematic. What does one do? Try to find an optimal solution.&lt;/p&gt;
&lt;h3&gt;3. Marry the goals and the data.&lt;/h3&gt;
&lt;p&gt;I decided not to tackle the display issues of the second point, encodings, but just the indexing issues. What I do is use &lt;a href=&quot;/2011/01/13/searching-with-picky-character-substituters.html&quot;&gt;character substitution&lt;/a&gt;, to make &amp;#8220;Nicolás&amp;#8221; findable under &amp;#8220;nicolas&amp;#8221;. This I do by saving the name as &amp;#8220;nicolas&amp;#8221; in the index, and also perform this character substitution on each search. &lt;a href=&quot;http://gemsearch.heroku.com/#/?q=nicol%C3%A1s+sanguinetti&quot;&gt;Try it with án áccent&lt;/a&gt;.&lt;/p&gt;
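&lt;p&gt;To illustrate the idea in plain Ruby: the sketch below is not Picky&amp;#8217;s character substituter, just a stand-in using the standard library, and the method name &lt;code&gt;searchable&lt;/code&gt; is made up. The same normalization is applied to the indexed text and to the query, so both end up as the same token:&lt;/p&gt;

```ruby
# Hypothetical helper: decomposes accented characters (NFD), removes the
# combining marks and downcases the rest.
def searchable(text)
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, '').downcase
end

indexed = searchable('Nicolás') # what would go into the index
query   = searchable('nicolás') # what the searcher typed

p indexed          # => "nicolas"
p indexed == query # => true
```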
&lt;p&gt;Deciding on what to do with the gem names was a little harder. What is the problem?&lt;/p&gt;
&lt;p&gt;The problem is manifold. For one, searchers should not need to know whether a gem was spaced with an underscore or a hyphen. Actually, I thought it best they be able to find it using a space. So the picky-live gem should be findable by typing &amp;#8220;picky live&amp;#8221; (&lt;a href=&quot;http://gemsearch.heroku.com/#/?q=picky+live&quot;&gt;Try it&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;However, if you then look for &amp;#8220;sinatra&amp;#8221;, the actual sinatra gem is not the first in the list. This is because I opted to go for an alphabetical ordering.&lt;/p&gt;
&lt;p&gt;However, if I need the user to enter the full name (like &amp;#8220;anthonymoralez-apn_on_rails&amp;#8221;), they might not find it at all.&lt;/p&gt;
&lt;p&gt;So, the way I did it now is have the user be able to use spaces when searching and trust people to depend on Picky&amp;#8217;s combinatorial nature. For example, if you look for sinatra and know that one of the owners is called Tomayko, you&amp;#8217;ll get to your answer directly: &lt;a href=&quot;http://gemsearch.heroku.com/#/?q=sinatra+tomayko&quot;&gt;Search for &amp;#8216;sinatra tomayko&amp;#8217;&lt;/a&gt;.&lt;/p&gt;
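&lt;p&gt;In plain Ruby terms, the gem name is treated as several words instead of one. A rough sketch of that splitting, assuming hyphens and underscores are the only separators (the method name &lt;code&gt;name_tokens&lt;/code&gt; is made up, and this is not the actual indexing code):&lt;/p&gt;

```ruby
# Hypothetical helper: splits a gem name into the words a searcher
# could type, separated by spaces.
def name_tokens(gem_name)
  gem_name.downcase.split(/[-_]+/)
end

p name_tokens('picky-live')                  # => ["picky", "live"]
p name_tokens('anthonymoralez-apn_on_rails') # => ["anthonymoralez", "apn", "on", "rails"]
```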
&lt;p&gt;Generally, the more you can help Picky, the more it will help you right back.&lt;/p&gt;
&lt;h3&gt;4. Have users try it and get feedback.&lt;/h3&gt;
&lt;p&gt;This is where you come in :)&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://gemsearch.heroku.com/&quot;&gt;Check it out&lt;/a&gt;, if you haven&amp;#8217;t yet and tell me what you think &lt;a href=&quot;http://www.twitter.com/hanke&quot;&gt;@hanke&lt;/a&gt;! Do you have ideas for improvement? (If yes, tell me which so I can improve it)&lt;/p&gt;
&lt;p&gt;How about we use this search on &lt;a href=&quot;http://rubygems.org&quot;&gt;rubygems.org&lt;/a&gt;? :)&lt;/p&gt;
&lt;h3&gt;A few technical Picky specifics.&lt;/h3&gt;
&lt;p&gt;A few Picky specifics for insiders:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;We have 4 data categories: &lt;code&gt;name, version, author, dependencies&lt;/code&gt;.&lt;/li&gt;
	&lt;li&gt;The partial search &amp;#8220;rail*&amp;#8221; is done using &lt;code&gt;Partial::Substring.new(from: 1)&lt;/code&gt;.&lt;/li&gt;
	&lt;li&gt;The similarity &amp;#8220;hallou~&amp;#8221; is done using: &lt;code&gt;Similarity::Phonetic.new(2)&lt;/code&gt;.&lt;/li&gt;
	&lt;li&gt;A singly occurring name will be weighted up a little: &lt;code&gt;:weights =&amp;gt; { [:name] =&amp;gt; +1 }&lt;/code&gt;.&lt;/li&gt;
	&lt;li&gt;The author for example can be prefixed with: &lt;code&gt;qualifiers: [:author, :authors, :written, :writer, :by]&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
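&lt;p&gt;What a partial index starting from position 1 means, conceptually: every prefix of a word, down to its first character, points to the same gem. The following is only a sketch of that idea in plain Ruby, not how Picky implements it; the method name &lt;code&gt;prefixes&lt;/code&gt; is made up:&lt;/p&gt;

```ruby
# Hypothetical helper: all prefixes of +token+ that are at least
# +from+ characters long, longest first.
def prefixes(token, from = 1)
  (from..token.length).map { |length| token[0, length] }.reverse
end

p prefixes('rails')  # => ["rails", "rail", "rai", "ra", "r"]
p prefixes('1.2', 2) # => ["1.2", "1."]
```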
&lt;p&gt;Yes, currently I &lt;a href=&quot;http://www.isolani.co.uk/blog/javascript/BreakingTheWebWithHashBangs&quot;&gt;break the web with hashbangs&lt;/a&gt; – I&amp;#8217;m rewriting it to use &lt;code&gt;pushState&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;Thanks&lt;/h2&gt;
&lt;p&gt;Many thanks to &lt;a href=&quot;http://heroku.com&quot;&gt;Heroku&lt;/a&gt; for providing the infrastructure!&lt;/p&gt;
&lt;h2&gt;Conclusions&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;that there&amp;#8217;s a better way to search Rubygems&lt;/li&gt;
	&lt;li&gt;where you can go to try it&lt;/li&gt;
	&lt;li&gt;how you could go about creating a search&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new :)&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Running Sinatra inside a Ruby Gem</title>
   <link href="http://florianhanke.com/blog/2011/02/02/running-sinatra-inside-a-gem.html"/>
   <updated>2011-02-02T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/02/02/running-sinatra-inside-a-gem</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/blog/images/sinatra.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/index.html&quot;&gt;Picky&lt;/a&gt; series on its workings. If you haven&amp;#8217;t tried it yet, do so in the &lt;a href=&quot;http://florianhanke.com/picky/getting_started.html&quot;&gt;Getting Started&lt;/a&gt; section. It&amp;#8217;s quick and painless :)&lt;/p&gt;
&lt;p&gt;In this post I&amp;#8217;ll show how to have &lt;a href=&quot;http://www.sinatrarb.com/&quot;&gt;Sinatra&lt;/a&gt; run directly from inside a gem. And at the end, how Picky uses this to its advantage.&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s go singing in the gem!&lt;/p&gt;
&lt;h2&gt;The thing is…&lt;/h2&gt;
&lt;p&gt;What I wanted, was to add a nice statistics web interface to Picky.&lt;/p&gt;
&lt;p&gt;First I thought about adding it to the server, but soon after (~1.2µs) decided that this was a silly idea.&lt;/p&gt;
&lt;p&gt;Picky is heavily designed around loosely connected elements in the server. I think this is an even better idea outside of a large component such as a server. So what I found myself thinking about – while showering – next was, to have a gem which generates a &lt;a href=&quot;http://www.sinatrarb.com/&quot;&gt;Sinatra&lt;/a&gt; application…&lt;/p&gt;
&lt;p&gt;Suddenly the room lit up and I spotted, scrawled on the wall in burning letters of blood:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;The wrong question.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;I gave it not much thought, as it can get crazy in this part of Zürich. Then, while gorging myself on my beloved alphabet soup, and thinking about how to structure files in this web application exactly, the letters suddenly formed a sentence:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Dude the wrong, fucking question.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;(Soups can only spell &lt;em&gt;so&lt;/em&gt; well)&lt;/p&gt;
&lt;p&gt;I only got it a few hours later, while three Swedish massage therapists kneaded my shoulders.&lt;/p&gt;
&lt;p&gt;In computer science, the answers aren&amp;#8217;t nearly as important as asking:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;…the right fucking question.&lt;/code&gt;&lt;/p&gt;
&lt;h2&gt;The right fucking question&lt;/h2&gt;
&lt;p&gt;The right question is:&lt;/p&gt;
&lt;p&gt;How do I fit a web application wholly in a gem, such that I can do a&lt;/p&gt;
&lt;p&gt;&lt;code&gt;$ picky stats log/search.log&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;on any Picky logfile and it will parse it and show me a nice statistical representation of it in a browser without soiling the directory and everything else?&lt;/p&gt;
&lt;h2&gt;The right fucking tool for the job&lt;/h2&gt;
&lt;p&gt;That&amp;#8217;s &lt;a href=&quot;http://www.sinatrarb.com/&quot;&gt;Sinatra&lt;/a&gt; I&amp;#8217;m talking about. The great and &lt;strong&gt;extremely&lt;/strong&gt; easy to use Ruby &lt;span class=&quot;caps&quot;&gt;DSL&lt;/span&gt; for web applications.&lt;/p&gt;
&lt;p&gt;Give it a whirl if you haven&amp;#8217;t seen it!&lt;/p&gt;
&lt;h2&gt;How to do it&lt;/h2&gt;
&lt;p&gt;First, set up a gem structure – let&amp;#8217;s call the gem &amp;#8220;rain_singing&amp;#8221;. Then, inside it, set up the following structure:&lt;/p&gt;
&lt;pre class=&quot;sh_text&quot;&gt;&lt;code&gt;rain_singing
  /bin
  /lib
    /rain_singing
      /application   # &amp;lt;- the app is in here
        app.rb       # &amp;lt;- the webapp itself
        /images
        /javascripts
        /stylesheets
        /views
    rain_singing.rb
  rain_singing.gemspec
  /spec&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &amp;#8220;hardest&amp;#8221; thing is getting the directories correctly set up.&lt;/p&gt;
&lt;p&gt;So what you do inside &lt;code&gt;app.rb&lt;/code&gt; is:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'sinatra'
require 'haml' # if you use haml views

class SingingRain &amp;lt; Sinatra::Base

  set :static, true                             # set up static file routing
  set :public, File.expand_path('..', __FILE__) # set up the static dir (with images/js/css inside)
  
  set :views,  File.expand_path('../views', __FILE__) # set up the views dir
  set :haml, { :format =&amp;gt; :html5 }                    # if you use haml
  
  # Your &quot;actions&quot; go here…
  #
  get '/' do
    haml :'/index'
  end
  
end

# Run the app!
#
puts &quot;Hello, you're running your web app from a gem!&quot;
SingingRain.run!&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And that&amp;#8217;s already it for the app.&lt;/p&gt;
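&lt;p&gt;If the &lt;code&gt;File.expand_path&lt;/code&gt; calls look a bit magic: the second argument is the starting point, and since that is a file, the first &lt;code&gt;..&lt;/code&gt; only gets you to the directory the file is in. A small sketch with a made-up install path (on a Unix-like system) instead of &lt;code&gt;__FILE__&lt;/code&gt;:&lt;/p&gt;

```ruby
# Made-up location of app.rb inside an installed gem.
app_file = '/gems/rain_singing/lib/rain_singing/application/app.rb'

# '..' relative to the file is the directory containing it,
# which is where images/, javascripts/ and stylesheets/ live.
public_dir = File.expand_path('..', app_file)

# '../views' is therefore the views directory next to app.rb.
views_dir = File.expand_path('../views', app_file)

puts public_dir
puts views_dir
```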
&lt;p&gt;Now, if you want to define a binary for the gem, put an executable &lt;code&gt;rain_singing&lt;/code&gt; file into &lt;code&gt;/bin&lt;/code&gt;. Into this file you&amp;#8217;d write:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;#!/usr/bin/env ruby
#
begin
  require 'rain_singing/application/app.rb'
rescue LoadError =&amp;gt; e
  require 'rubygems'
  path = File.expand_path '../../lib', __FILE__
  $:.unshift(path) if File.directory?(path) &amp;amp;&amp;amp; !$:.include?(path)
  require 'rain_singing/application/app.rb'
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then, we need to tell rubygems that this gem has an executable:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Gem::Specification.new do |s|
  
  ...
  
  s.executables = ['rain_singing']
  s.default_executable = 'rain_singing'
  
  ...
  
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After generating your gem with&lt;/p&gt;
&lt;pre class=&quot;sh_text&quot;&gt;&lt;code&gt;$ gem build rain_singing.gemspec&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and installing it with&lt;/p&gt;
&lt;pre class=&quot;sh_text&quot;&gt;&lt;code&gt;$ gem install rain_singing-1.0.0.gem&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;you are ready to run&lt;/p&gt;
&lt;pre class=&quot;sh_text&quot;&gt;&lt;code&gt;$ rain_singing
Hello, you're running your web app from a gem!&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Good stuff. Good stuff. Makes me want to sing in the rain.&lt;/p&gt;
&lt;h2&gt;In Picky&lt;/h2&gt;
&lt;p&gt;Picky uses this for two things.&lt;/p&gt;
&lt;p&gt;A statistics interface (&lt;code&gt;$ gem install picky-statistics&lt;/code&gt;), run&lt;/p&gt;
&lt;pre class=&quot;sh_text&quot;&gt;&lt;code&gt;$ picky stats path/to/your/search.log 1234&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;or the live interface to the running server (&lt;code&gt;$ gem install picky-live&lt;/code&gt;), run&lt;/p&gt;
&lt;pre class=&quot;sh_text&quot;&gt;&lt;code&gt;$ picky live localhost:8080/admin 1234&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You need to add &lt;code&gt;route %r{/admin} =&amp;gt; LiveParameters.new&lt;/code&gt; in the server to have it work. But then you get the interface &lt;a href=&quot;http://florianhanke.com/blog/2011/01/27/searching-with-picky-live-parameters-part-2.html&quot;&gt;described in this blog post&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Nice, eh?&lt;/p&gt;
&lt;h2&gt;Conclusions&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;that Sinatra rocks my noodles&lt;/li&gt;
	&lt;li&gt;that a Gem can contain a whole webapp without footprint&lt;/li&gt;
	&lt;li&gt;that Picky uses both for maximal profit!&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new :)&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Parslet Intro</title>
   <link href="http://florianhanke.com/blog/2011/02/01/parslet-intro.html"/>
   <updated>2011-02-01T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/02/01/parslet-intro</id>
   <content type="html">&lt;p&gt;Tonight I wanted to take some time off from Picky to write about &lt;a href=&quot;http://kschiess.github.com/parslet/&quot;&gt;Parslet&lt;/a&gt;, a parser construction library by my dear friend &lt;a href=&quot;http://www.absurd.li/&quot;&gt;Kaspar&amp;nbsp;Schiess&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;tl;dr&lt;/h2&gt;
&lt;ol&gt;
	&lt;li&gt;Parslet is great.&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;gem install parslet&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;Look at &lt;a href=&quot;https://github.com/kschiess/parslet/tree/master/example&quot;&gt;any of the examples&lt;/a&gt;.&lt;/li&gt;
	&lt;li&gt;Try, learn, try again, profit!&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;What is it?&lt;/h2&gt;
&lt;p&gt;In Kaspar&amp;#8217;s words: &amp;#8220;A small Ruby library for constructing parsers in the &lt;a href=&quot;http://en.wikipedia.org/wiki/Parsing_expression_grammar&quot;&gt;&lt;span class=&quot;caps&quot;&gt;PEG&lt;/span&gt;&lt;/a&gt; (Parsing Expression Grammar) fashion&amp;#8221;.&lt;/p&gt;
&lt;p&gt;A parser is used to transform text data into a semantically meaningful structure by injecting information based on assumptions on the text&amp;#8217;s structure. For example, &lt;code&gt;&quot;Hello, Florian!&quot;&lt;/code&gt; could be parsed into something like: &lt;code&gt;[sentence: [greeting:hello, separation:comma, name:florian, mark:exclamation]]&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s probably best if you just &lt;a href=&quot;http://kschiess.github.com/parslet/get-started.html&quot;&gt;tried it for yourself&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Are there other parser constructors?&lt;/h2&gt;
&lt;p&gt;Yes, &lt;a href=&quot;https://github.com/mjijackson/citrus&quot;&gt;Citrus&lt;/a&gt; and &lt;a href=&quot;http://treetop.rubyforge.org/&quot;&gt;Treetop&lt;/a&gt;. But let&amp;#8217;s be frank here. Parslet eats these for breakfast in terms of ease of use and power, in my humble and almost unbiased opinion. Let me explain why.&lt;/p&gt;
&lt;h2&gt;Why is it so powerful and easy?&lt;/h2&gt;
&lt;p&gt;On the main page, Kaspar notes that Parslet is especially easy by &amp;#8220;providing the best error reporting&amp;#8221; and &amp;#8220;not generating reams of code for you to debug&amp;#8221;.&lt;/p&gt;
&lt;p&gt;Both are certainly true, but I don&amp;#8217;t think that is what makes Parslet so easy or powerful. Easi-&lt;em&gt;er&lt;/em&gt;, surely, but the main reason I love it is that it harnesses the power of Ruby.&lt;/p&gt;
&lt;p&gt;The second reason I consider it so great is that it is split into a &lt;em&gt;parser&lt;/em&gt; and a &lt;em&gt;transformer&lt;/em&gt; step, with an intermediate syntax tree that is entirely in Ruby basic atoms, like hashes and arrays.&lt;/p&gt;
&lt;p&gt;Why is this cool? To repeat my example, above:
The parser would first parse &lt;code&gt;&quot;Hello, Florian!&quot;&lt;/code&gt; into &lt;code&gt;[sentence: [greeting:hello, separation:comma, name:florian, mark:exclamation]]&lt;/code&gt; and then, for example,
a &lt;code&gt;FrenchTransformer&lt;/code&gt; could be used to transform this into: &lt;code&gt;Bonjour, Florian!&lt;/code&gt;, the French representation of the English input sentence. So first we get an intermediate semantic expression that we can then transform into something else. And there can be a lot of transformers starting from where the parser ended. Thinking about a &lt;code&gt;SwedishTransformer&lt;/code&gt; or an &lt;code&gt;ItalianTransformer&lt;/code&gt;? Me too. &amp;#8220;Optimus Primo, transformate! Ciao!&amp;#8221;&lt;/p&gt;
&lt;p&gt;Or a chain of transformers that first take the intermediate tree and morph it into a different intermediate tree. The possibilities are endless.&lt;/p&gt;
&lt;h2&gt;Simple Example&lt;/h2&gt;
&lt;p&gt;Let&amp;#8217;s consider a simple example. It is a subpart of the &lt;a href=&quot;http://www.ruby-doc.org/stdlib/libdoc/erb/rdoc/classes/ERB.html&quot;&gt;&lt;span class=&quot;caps&quot;&gt;ERB&lt;/span&gt;&lt;/a&gt; &lt;a href=&quot;https://github.com/kschiess/parslet/blob/master/example/erb.rb&quot;&gt;parser&amp;nbsp;and&amp;nbsp;transformer&lt;/a&gt;  that I wrote. &lt;span class=&quot;caps&quot;&gt;ERB&lt;/span&gt; is a Ruby templating language by Seki Masatoshi.&lt;/p&gt;
&lt;p&gt;We&amp;#8217;ll look at the whole thing later on.&lt;/p&gt;
&lt;p&gt;A simple &lt;span class=&quot;caps&quot;&gt;ERB&lt;/span&gt; example would be &lt;span class=&quot;caps&quot;&gt;ERB&lt;/span&gt; with a Ruby expression inside:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Hello, my name is&amp;lt;%= name %&amp;gt;!&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;What we get out of the parser are the parts that are text, and the parts that are ruby code.
So with parslet we&amp;#8217;d write this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;require 'parslet'

class ErbParser &amp;lt; Parslet::Parser
  
  rule(:ruby_expression) { (str('%&amp;gt;').absnt? &amp;gt;&amp;gt; any).repeat.as(:ruby) }
  rule(:erb_with_tags) { str('&amp;lt;%=') &amp;gt;&amp;gt; ruby_expression &amp;gt;&amp;gt; str('%&amp;gt;') }
  
  rule(:text) { (str('&amp;lt;%=').absnt? &amp;gt;&amp;gt; any).repeat(1).as(:text) }
  
  rule(:text_with_ruby_expressions) { (text | erb_with_tags).repeat }
  root(:text_with_ruby_expressions)
end

p ErbParser.new.parse(&quot;Hello, my name is&amp;lt;%= name %&amp;gt;!&quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Just run it :) What you get is a nice semantic tree:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;[{:text=&amp;gt;&quot;Hello, my name is&quot;}, {:ruby=&amp;gt;&quot; name &quot;}, {:text=&amp;gt;&quot;!&quot;}]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let me go through it in steps. I&amp;#8217;ve found out that it is easiest for me to go top-down to define a parser. I hope this suits you too.&lt;/p&gt;
&lt;p&gt;We define the starting point, aka the &lt;code&gt;root&lt;/code&gt; of the parser with the &lt;code&gt;root&lt;/code&gt; method:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;root(:text_with_ruby_expressions)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This just says, start with the &lt;code&gt;rule(:text_with_ruby_expressions)&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;So, now what we know about our simple-&lt;span class=&quot;caps&quot;&gt;ERB&lt;/span&gt; language is that it is basically a sequence of text and ruby expressions, repeating. So let&amp;#8217;s define that:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;rule(:text_with_ruby_expressions) { (text | erb_with_tags).repeat }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So either we have text OR (&lt;code&gt;|&lt;/code&gt;) a ruby expression. And we have that in a repeating fashion. Just as the rule says.&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s look at the text rule we just used:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;rule(:text) { (str('&amp;lt;%=').absnt? &amp;gt;&amp;gt; any).repeat(1).as(:text) }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This means: As long as you don&amp;#8217;t encounter an &lt;span class=&quot;caps&quot;&gt;ERB&lt;/span&gt; start tag (&lt;code&gt;&amp;lt;%=&lt;/code&gt;), keep taking everything as text. This will stop if it encounters a &lt;code&gt;&amp;lt;%=&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;At which point Parslet will try to apply the other rule:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;rule(:erb_with_tags) { str('&amp;lt;%=') &amp;gt;&amp;gt; ruby_expression &amp;gt;&amp;gt; str('%&amp;gt;') }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This rule just matches anything with erb start &lt;code&gt;&amp;lt;%=&lt;/code&gt; and end tags &lt;code&gt;%&amp;gt;&lt;/code&gt; around it, with a ruby expression inside.&lt;/p&gt;
&lt;p&gt;The ruby expression is simple:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;rule(:ruby_expression) { (str('%&amp;gt;').absnt? &amp;gt;&amp;gt; any).repeat.as(:ruby) }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We know this already: As long as you don&amp;#8217;t encounter an &lt;span class=&quot;caps&quot;&gt;ERB&lt;/span&gt; end tag, keep consuming as ruby code.&lt;/p&gt;
&lt;p&gt;Got it?&lt;/p&gt;
&lt;p&gt;Again, if you run it, you get:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;[{:text=&amp;gt;&quot;Hello, my name is&quot;}, {:ruby=&amp;gt;&quot; name &quot;}, {:text=&amp;gt;&quot;!&quot;}]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Niiice.&lt;/p&gt;
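&lt;p&gt;Since the intermediate tree is just arrays and hashes, you can already poke at it with plain Ruby before writing any transformer. The following is only a rough sketch and not how Parslet transforms things: the tree is typed in by hand with plain strings as values, and &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;scope&lt;/code&gt; and &lt;code&gt;rendered&lt;/code&gt; are made-up names for the illustration.&lt;/p&gt;

```ruby
# A hand-written stand-in for the parser output shown above
# (assumption: plain strings as values).
tree = [{ text: 'Hello, my name is' }, { ruby: ' name ' }, { text: '!' }]

# The variable the template refers to, and a binding that can see it.
name = 'Florian'
scope = binding

# Text nodes are passed through, ruby nodes are evaluated.
rendered = tree.map do |node|
  if node.key?(:text)
    node[:text]
  else
    eval(node[:ruby], scope).to_s
  end
end.join

puts rendered
```

&lt;p&gt;A real transformer does this kind of walk for you, with rules instead of an if.&lt;/p&gt;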
&lt;p&gt;Let&amp;#8217;s not think about the transform step for a second and look at some of the good shit.&lt;/p&gt;
&lt;h2&gt;Goodies that will blow your mind.&lt;/h2&gt;
&lt;p&gt;Parslet doesn&amp;#8217;t force you to use a class. It&amp;#8217;s totally ok to just do this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;include Parslet
parser = (str('Hello') | str('Hi')).as(:greeting)
p parser.parse('Hello')&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In Parslet, you can run the parser with a subset of its rules:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;p ErbParser.new.erb_with_tags.parse(&quot;&amp;lt;%= name %&amp;gt;&quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;while&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;p ErbParser.new.erb_with_tags.parse(&quot;Hello, &amp;lt;%= name %&amp;gt;!&quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;would fail since the &lt;code&gt;erb_with_tags&lt;/code&gt; rule just covers text which starts with &lt;code&gt;&amp;lt;%=&lt;/code&gt; and ends with &lt;code&gt;%&amp;gt;&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Running a &lt;code&gt;parse&lt;/code&gt; on a subrule works because a parser is composed of &lt;em&gt;Parslets&lt;/em&gt;, or parser atoms, hence the name. &lt;code&gt;str('hello')&lt;/code&gt; is one of these atoms, and so is a sequence of atoms, like &lt;code&gt;str('no') &amp;gt;&amp;gt; str('kidding')&lt;/code&gt;. And you can do a parse directly with one of these, if you want, &lt;code&gt;(str('Hello') | str('Hi')).parse('Hello')&lt;/code&gt; as we have seen before.&lt;/p&gt;
&lt;p&gt;Did I say it&amp;#8217;s pure Ruby? Why, yes! Let&amp;#8217;s harness the power of Ruby, and combine it with the power of Parslet parser atoms.&lt;/p&gt;
&lt;p&gt;I need a parser that is case insensitive regarding the string.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;def case_insensitive string
  chars = string.split //
  chars.inject(str('')) do |parslet, char|
    # No | between the two characters: inside a character class
    # it would be matched as a literal pipe.
    parslet &amp;gt;&amp;gt; match(&quot;[#{char.downcase}#{char.upcase}]&quot;)
  end
end

p case_insensitive('hello').parse('HeLLo')&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;What it does is generate this:
&lt;code&gt;match('[hH]') &amp;gt;&amp;gt; match('[eE]') &amp;gt;&amp;gt; match('[lL]') &amp;gt;&amp;gt; match('[lL]') &amp;gt;&amp;gt; match('[oO]')&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;This returns me a case insensitive parser that I can directly use to parse the &lt;code&gt;HeLLo&lt;/code&gt;. Or why not combine it with other parslets?&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;p (case_insensitive('hello') &amp;gt;&amp;gt; str(' ') &amp;gt;&amp;gt; str('Florian')).parse('HeLLo Florian')&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Transforming&lt;/h2&gt;
&lt;p&gt;Can you take a quick look at the &lt;a href=&quot;https://github.com/kschiess/parslet/blob/master/example/erb.rb&quot;&gt;&lt;span class=&quot;caps&quot;&gt;ERB&lt;/span&gt; parser&lt;/a&gt;, copy it into a script and give it a go?&lt;/p&gt;
&lt;p&gt;As you can see, it&amp;#8217;s not just able to parse text and ruby expressions (&lt;code&gt;&amp;lt;%= ruby expression %&amp;gt;&lt;/code&gt;), but also comments (&lt;code&gt;&amp;lt;%# comment %&amp;gt;&lt;/code&gt;) and normal ruby code (&lt;code&gt;&amp;lt;% ruby %&amp;gt;&lt;/code&gt;) that both will not be inserted into the rendered text.&lt;/p&gt;
&lt;p&gt;Now we&amp;#8217;ll have a look at the transformer that will spit out rendered text:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;evaluator = Parslet::Transform.new do
  
  erb_binding = binding
  
  rule(:code =&amp;gt; { :ruby =&amp;gt; simple(:ruby) }) { eval(ruby, erb_binding); '' }  
  rule(:expression =&amp;gt; { :ruby =&amp;gt; simple(:ruby) }) { eval(ruby, erb_binding) }
  rule(:comment =&amp;gt; { :ruby =&amp;gt; simple(:ruby) }) { '' }
  
  rule(:text =&amp;gt; simple(:text)) { text }
  rule(:text =&amp;gt; sequence(:texts)) { texts.join }
  
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Ignore for now the part where bindings are used.&lt;/p&gt;
&lt;p&gt;A transformer consists of a number of rules. And a rule consists of a part that recognizes structure in the semantic tree, and a block which tells the transformer what to do with the recognized thing. Got it? So this rule,&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;rule(:text =&amp;gt; sequence(:texts)) { texts.join }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;recognizes hashes that look like &lt;code&gt;:text =&amp;gt; sequence(:texts)&lt;/code&gt;, sequences of things that are denoted &lt;code&gt;as&lt;/code&gt; text. The identifier &lt;code&gt;:texts&lt;/code&gt; is used in the block where we tell the transformer what to do: &lt;code&gt;{ texts.join }&lt;/code&gt;. So what we do is simple, we just join a sequence of texts together.&lt;/p&gt;
&lt;p&gt;Another rule, the comment rule,&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;rule(:comment =&amp;gt; { :ruby =&amp;gt; simple(:ruby) }) { '' }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;will return just nothing.&lt;/p&gt;
&lt;p&gt;Now, if we want to parse and transform something like this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;The &amp;lt;% a = 2 %&amp;gt;not printed result of &quot;a = 2&quot;.
The &amp;lt;%# a = 1 %&amp;gt;not printed non-evaluated comment &quot;a = 1&quot;, see the value of a below.
The &amp;lt;%= 'nicely' %&amp;gt; printed result.
The &amp;lt;% b = 3 %&amp;gt;value of a is &amp;lt;%= a %&amp;gt;, and b is &amp;lt;%= b %&amp;gt;.&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It gets a little more complicated. If you look at line 1, you see that &lt;code&gt;a&lt;/code&gt; is given a value of &lt;code&gt;2&lt;/code&gt;. And then we will use that value in line 4, where we put the value of &lt;code&gt;a&lt;/code&gt;, &lt;code&gt;2&lt;/code&gt;, into the rendered template. Have you tried it? No? Run it and see :)&lt;/p&gt;
&lt;h2&gt;Remembering State&lt;/h2&gt;
&lt;p&gt;If you want the transformer rules to remember values in between transformations – like the &lt;code&gt;a&lt;/code&gt; that is set to &lt;code&gt;2&lt;/code&gt;, above, you&amp;#8217;ll need state of some sort.&lt;/p&gt;
&lt;p&gt;I can show you the way I did it with the &lt;span class=&quot;caps&quot;&gt;ERB&lt;/span&gt; transformer. I&amp;#8217;m sure you can think of many others that are perhaps safer, more powerful, or simply cleaner. But for now, we&amp;#8217;ll have a look at this:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;evaluator = Parslet::Transform.new do
  
  erb_binding = binding
  
  rule(:code =&amp;gt; { :ruby =&amp;gt; simple(:ruby) }) { eval(ruby, erb_binding); '' }
  
  ...

end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;What happens here?
First, I assign the binding of the block to &lt;code&gt;erb_binding&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;erb_binding = binding&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is the object where we will save the state.&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s a good thing for me that the &lt;code&gt;rule&lt;/code&gt; method uses a block to define what to do when encountering a rule. Why? Well, since it is a block, the local variable &lt;code&gt;erb_binding&lt;/code&gt; is bound in the context of the block, which means that I have easy access to it in &lt;code&gt;{ eval(ruby, erb_binding); '' }&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;So what I do with&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;eval(ruby, erb_binding); ''&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;is: Evaluate the code piece that I get in the variable ruby, and evaluate it with the binding I have saved. Then, I return an empty string since &lt;code&gt;&amp;lt;% ruby code %&amp;gt;&lt;/code&gt; should not write anything into the resulting rendered template.&lt;/p&gt;
&lt;p&gt;Not so in the expression:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;rule(:expression =&amp;gt; { :ruby =&amp;gt; simple(:ruby) }) { eval(ruby, erb_binding) }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here I return whatever the evaluation returned to be inserted into the rendered result.&lt;/p&gt;
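&lt;p&gt;If you want to convince yourself that a binding really remembers things between two &lt;code&gt;eval&lt;/code&gt; calls, here is a tiny sketch in plain Ruby, without Parslet. The code snippets passed to &lt;code&gt;eval&lt;/code&gt; and the name &lt;code&gt;rendered&lt;/code&gt; are just made up for the illustration.&lt;/p&gt;

```ruby
# One binding, shared by all evaluations.
erb_binding = binding

# Like a code tag: evaluate for the side effect, insert nothing.
eval('a = 2', erb_binding)
nothing = ''

# Like an expression tag: evaluate and insert the result.
rendered = nothing + eval('a + 1', erb_binding).to_s

puts rendered
```

&lt;p&gt;The second &lt;code&gt;eval&lt;/code&gt; can still see &lt;code&gt;a&lt;/code&gt;, because both ran against the same binding object.&lt;/p&gt;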
&lt;p&gt;Isn&amp;#8217;t it nice? And between parser and transformer I was able to look at my nice semantic tree, to check that everything is a-ok.&lt;/p&gt;
&lt;p&gt;Writing tests, as everything is in Ruby, is a breeze, as you can imagine!&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;My personal conclusion is that this thing is here to stay.&lt;/p&gt;
&lt;p&gt;Not only is it easy to use, but you have the full power of Ruby available to write parsers, comfortably.&lt;/p&gt;
&lt;p&gt;It already has garnered the attention of quite a few excellent Rubyists – the hard core of parslet users – who hang out at the #parslet &lt;span class=&quot;caps&quot;&gt;IRC&lt;/span&gt; channel.&lt;/p&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;that Parslet harnesses Ruby&amp;#8217;s powers for success and profit.&lt;/li&gt;
	&lt;li&gt;that it offers a parser constructor &lt;span class=&quot;caps&quot;&gt;AND&lt;/span&gt; a transformer constructor, which is a good thing.&lt;/li&gt;
	&lt;li&gt;that trying it yourself is fun and a piece of cake.&lt;/li&gt;
	&lt;li&gt;And: That using bindings is crazy fun when used at the right place :)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new :)&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Searching with Picky&amp;#58; Live Parameters Part 2</title>
   <link href="http://florianhanke.com/blog/2011/01/27/searching-with-picky-live-parameters-part-2.html"/>
   <updated>2011-01-27T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/01/27/searching-with-picky-live-parameters-part-2</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/index.html&quot;&gt;Picky&lt;/a&gt; series on its workings. If you haven&amp;#8217;t tried it yet, do so in the &lt;a href=&quot;http://florianhanke.com/picky/getting_started.html&quot;&gt;Getting Started&lt;/a&gt; section. It&amp;#8217;s quick and painless :)&lt;/p&gt;
&lt;p&gt;This is the second part of the Live Parameters blog post that deals with the problem of hot replacing a configuration of a search server like Picky running in a multiprocessing server like &lt;a href=&quot;http://unicorn.bogomips.org/&quot;&gt;unicorn&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;tl;dr&lt;/h2&gt;
&lt;ol&gt;
	&lt;li&gt;&lt;code&gt;gem install picky-live&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;Server: In &lt;code&gt;app/application.rb&lt;/code&gt;, insert &lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;route %r{\A/admin\Z} =&amp;gt; LiveParameters.new&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
	&lt;li&gt;Enter &lt;code&gt;picky live&lt;/code&gt; on the command line.&lt;/li&gt;
	&lt;li&gt;Open &lt;a href=&quot;http://localhost:4568/&quot;&gt;The Suckerfish Interface&lt;/a&gt;.&lt;/li&gt;
	&lt;li&gt;Have fun!&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;What was the problem, again?&lt;/h2&gt;
&lt;p&gt;The goal is that we want to update Picky&amp;#8217;s config while it is answering search requests.&lt;/p&gt;
&lt;p&gt;The problem is that we need to update the config in the master process, but most multiprocessing servers don&amp;#8217;t allow easy access. And it&amp;#8217;s good like that.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/suckerfish2.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;What I&amp;#8217;d like to do is provide access for a suckerfish. But since it isn&amp;#8217;t easy or a good idea to open direct access to the parent, the suckerfish must go through the child.&lt;/p&gt;
&lt;p&gt;The child would accept data incoming from the suckerfish, process it and tell the parent what to change.&lt;/p&gt;
&lt;p&gt;So what we&amp;#8217;d need is for the child to be able to write the parent. It&amp;#8217;s actually quite easy to do in Ruby. But how?&lt;/p&gt;
&lt;h2&gt;The simplest way to write your parents.&lt;/h2&gt;
&lt;p&gt;… apart from picking up a pen once in a while? Your mother didn&amp;#8217;t spend 20 hours of her life in labor just for fun, you know!&lt;/p&gt;
&lt;p&gt;Heh.&lt;/p&gt;
&lt;p&gt;First, you open an &lt;a href=&quot;http://ruby.runpaint.org/io#pipes&quot;&gt;IO.pipe&lt;/a&gt;. Then, in the &lt;code&gt;fork&lt;/code&gt; (the child), you &lt;code&gt;close&lt;/code&gt; off the &amp;#8220;child&amp;#8221; end (the reading end of the pipe) and then you are ready to &lt;code&gt;write&lt;/code&gt; on the &amp;#8220;parent&amp;#8221; end.&lt;/p&gt;
&lt;p&gt;In the parent, you do the opposite: you close off the &amp;#8220;parent&amp;#8221; end, call &lt;code&gt;gets&lt;/code&gt; (for example), and wait for a message from the child.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;child, parent = IO.pipe

fork do
  # In child.
  #
  child.close
  puts &quot;#{Process.pid}: I'll write soon.&quot;
  parent.write &quot;Hello from child!&quot;
end

# In parent.
#
parent.close
message = child.gets '!'
puts &quot;#{Process.pid}: #{message}&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It&amp;#8217;s copy-and-try!&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Process.pid&lt;/code&gt; returns the current process id, which is different in the child and the parent, as you can see after trying the example.
In the parent, the &lt;code&gt;child.gets&lt;/code&gt; with a parameter will read up until having received that string, then return whatever has been read so far.&lt;/p&gt;
&lt;p&gt;I always look at child and parent as if the child was a perfect copy of the parent. And anything you change in the child won&amp;#8217;t affect the parent. But if you change something in the parent, it will affect all future children.&lt;/p&gt;
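&lt;p&gt;The copy idea is easy to try out. This is a small sketch in plain Ruby (it needs a system that supports &lt;code&gt;fork&lt;/code&gt;); the &lt;code&gt;config&lt;/code&gt; hash and the &lt;code&gt;ask_child&lt;/code&gt; helper are hypothetical names for the illustration and have nothing to do with Picky itself.&lt;/p&gt;

```ruby
config = { splits_text_on: 'whitespace' }

# Hypothetical helper: fork a child, run the block in it,
# and return whatever the child wrote back through a pipe.
def ask_child
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write yield
  end
  writer.close
  answer = reader.read
  Process.wait pid
  answer
end

# The child changes its own copy of the config.
first = ask_child do
  config[:splits_text_on] = 'comma'
  config[:splits_text_on]
end
after_first = config[:splits_text_on] # the parent is unaffected

# A change in the parent is seen by children forked afterwards.
config[:splits_text_on] = 'semicolon'
second = ask_child { config[:splits_text_on] }

puts [first, after_first, second].join(', ')
```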
&lt;h2&gt;How Picky does it&lt;/h2&gt;
&lt;p&gt;Five steps:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;The Picky child receives the config update request.&lt;/li&gt;
	&lt;li&gt;It tries to update its config (more on that below).&lt;/li&gt;
	&lt;li&gt;If successful, it tells the parent. If not, it kills itself, and tells Suckerfish which config was wrong.&lt;/li&gt;
	&lt;li&gt;The parent, on receiving the message, updates itself and kills off all other children (more on that below).&lt;/li&gt;
	&lt;li&gt;The child will answer Suckerfish with the current configuration.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The messaging is basically the same as above, but a bit more elaborate in Picky, since:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Picky doesn&amp;#8217;t have control over the forking.
This means Picky doesn&amp;#8217;t know when to close the &lt;code&gt;child&lt;/code&gt;, which is why on each call received on the &lt;span class=&quot;caps&quot;&gt;API&lt;/span&gt;, we just do a &lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;@child.close unless @child.closed?&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
	&lt;li&gt;The server inside which Picky is running will fork off the parent multiple times, and not just at the beginning.
So, if the parent would do a &lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;@parent.close&lt;/code&gt;&lt;/pre&gt; as in the example, then yes, it would work fine. Up until the next time a child is forked. What happens when a child is forked?
The connection to the parent would already have been closed off by the parent itself, and the child would be unable to &lt;code&gt;write&lt;/code&gt; on it. Solution? I just leave it open, since the parent doesn&amp;#8217;t need to talk to the child.
(Ensuring years of therapy for the child)&lt;/li&gt;
&lt;/ul&gt;
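&lt;p&gt;The second point can be sketched in a few lines of plain Ruby. This is an assumption-laden toy and not Picky&amp;#8217;s code: the parent forks three children one after the other, never closes the writing end, and so every later child can still write on it. The names &lt;code&gt;reader&lt;/code&gt;, &lt;code&gt;writer&lt;/code&gt; and &lt;code&gt;messages&lt;/code&gt; are made up.&lt;/p&gt;

```ruby
reader, writer = IO.pipe

messages = []
3.times do |i|
  pid = fork do
    # In the child: close the reading end if it is still open.
    reader.close unless reader.closed?
    writer.puts format('child %d reporting', i)
  end
  Process.wait pid
  # In the parent: the writing end stays open, so we read line by
  # line instead of waiting for an end-of-file that never comes.
  messages.push reader.gets.chomp
end

puts messages
```

&lt;p&gt;Had the parent closed &lt;code&gt;writer&lt;/code&gt; after the first fork, the second child would have inherited a closed writing end and failed on &lt;code&gt;puts&lt;/code&gt;.&lt;/p&gt;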
&lt;h2&gt;How does Picky ensure there will be no problems in the parent process?&lt;/h2&gt;
&lt;p&gt;What would happen if the Suckerfish had direct access to the master&amp;#8217;s configuration?&lt;/p&gt;
&lt;p&gt;We assume that the child is a close to perfect copy of the parent process. So what we do is try updating the configuration in the child first.&lt;/p&gt;
&lt;p&gt;If that works, we can assume that in the parent, it will work too (no malformed configuration input). So we just send the parent the data and the parent will use the exact same method as the child to update itself.&lt;/p&gt;
&lt;p&gt;Now we have the problem that there are still children hanging around with the old config. So what the parent process – any good parent ;) – does is kill all of these. The one giving it the ok config is spared, since it has the new config already. After that, new children are forked with the correct config.&lt;/p&gt;
&lt;p&gt;What happens if the config is malformed? The child that accepted the suckerfish request needs to die, since its config might now be malformed. So what it does is prepare for an honorable &lt;em&gt;Harakiri&lt;/em&gt;, tell the Suckerfish what is wrong, and perform a horizontal cut through its stomach, using &lt;code&gt;Process.kill(:QUIT, 0)&lt;/code&gt;.&lt;/p&gt;
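&lt;p&gt;The &amp;#8220;try it in the child first&amp;#8221; idea can be sketched like this. It is a simplified toy under my own assumptions, not Picky&amp;#8217;s implementation: the config is a plain hash, the hypothetical &lt;code&gt;update&lt;/code&gt; method turns a string into a regexp, and a child that fails simply reports &lt;code&gt;ERROR&lt;/code&gt; and exits instead of signalling itself.&lt;/p&gt;

```ruby
# Hypothetical update method, used by child and parent alike.
# Regexp.new raises a RegexpError on malformed input.
def update(config, key, pattern)
  copy = config.dup
  copy[key] = Regexp.new(pattern)
  copy
end

config = {}
answers = []
reader, writer = IO.pipe

['\s', '['].each do |pattern|
  pid = fork do
    begin
      update(config, :splits_text_on, pattern)
      writer.puts pattern # worked in the child, so tell the parent
    rescue RegexpError
      writer.puts 'ERROR' # malformed, the parent is left alone
    end
  end
  Process.wait pid
  answer = reader.gets.chomp
  answers.push answer
  config = update(config, :splits_text_on, answer) unless answer == 'ERROR'
end

p config
```

&lt;p&gt;The malformed pattern only ever blows up inside a child. The parent applies just the input that a child has already survived.&lt;/p&gt;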
&lt;p&gt;But… how do I get it to work in Picky?&lt;/p&gt;
&lt;h2&gt;How you configure it in Picky&lt;/h2&gt;
&lt;p&gt;Simple – you open a http interface in &lt;code&gt;app/application.rb&lt;/code&gt; the same way as you would for a query. But this time, instead of a query, you have it point to an instance of &lt;code&gt;LiveParameters&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Like that:
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;route %r{\A/admin\Z} =&amp;gt; LiveParameters.new&lt;/code&gt;&lt;/pre&gt;&lt;/p&gt;
&lt;p&gt;And then, you have to…&lt;/p&gt;
&lt;p&gt;No, wait. That&amp;#8217;s it.&lt;/p&gt;
&lt;p&gt;This opens a &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt; interface into the heart of your Picky configuration.&lt;/p&gt;
&lt;h2&gt;The interface&lt;/h2&gt;
&lt;ol&gt;
	&lt;li&gt;&lt;span class=&quot;caps&quot;&gt;HTTP&lt;/span&gt; query params in, &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt; hash out.&lt;/li&gt;
	&lt;li&gt;On success, it returns the complete config, always.&lt;/li&gt;
	&lt;li&gt;On failure, it returns the offending key with the value &amp;#8220;&lt;span class=&quot;caps&quot;&gt;ERROR&lt;/span&gt;&amp;#8221;.&lt;/li&gt;
	&lt;li&gt;If you pass in no query params, nothing will get updated, but you still get the config hash.&lt;/li&gt;
	&lt;li&gt;If you pass in something like &amp;#8230;?querying_splits_text_on=\s, it will update its config to split text on whitespaces.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Beware&lt;/h2&gt;
&lt;p&gt;Just one thing: Be sure to not let your users have access to the live params url.&lt;/p&gt;
&lt;p&gt;And also, be sure not to let your users have access to the live params url.&lt;/p&gt;
&lt;h2&gt;The picky-live gem&lt;/h2&gt;
&lt;p&gt;Because sending the server configuration messages per &lt;span class=&quot;caps&quot;&gt;HTTP&lt;/span&gt; by hand is very tedious, Picky offers a much nicer interface, the picky-live gem.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;gem install picky-live&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Then, just enter&lt;/p&gt;
&lt;p&gt;&lt;code&gt;picky live&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;This will start up the Suckerfish web interface on a default port, &lt;a href=&quot;http://localhost:4568&quot;&gt;localhost:4568&lt;/a&gt;, going through the default Suckerfish interface on &lt;a href=&quot;http://localhost:8080/admin&quot;&gt;/admin&lt;/a&gt; in the Picky server.&lt;/p&gt;
&lt;p&gt;If you have customized it to be on &lt;code&gt;/suckerfish&lt;/code&gt; and you don&amp;#8217;t want the Suckerfish web interface on the default port, you&amp;#8217;d type:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;picky live localhost:8080/suckerfish 1234&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;This would start up the interface on &lt;a href=&quot;http://localhost:1234&quot;&gt;localhost:1234&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The interface looks like this:
&lt;img src=&quot;/blog/images/suckerfish_interface.png&quot; style=&quot;margin-left: -200px;&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;What you see are three configs that you are currently able to change on the fly. These are the configs for query text handling and wrangling.&lt;/p&gt;
&lt;p&gt;If I change a config in the interface, it will tell me so (currently by changing the background color of the input): 
&lt;img src=&quot;/blog/images/suckerfish_updating.png&quot; style=&quot;margin-left: -200px;&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Then, as soon as I click on the &amp;#8220;Update server now&amp;#8221; button, a suckerfish speeds off, accesses the child through the right &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt;, tells the child to update. The child will try to update itself, and if that works, tell the master to update.&lt;/p&gt;
&lt;p&gt;In this example, the updating has failed. The child will tell me so, not tell the parent, and kill itself. (Man, this language we&amp;#8217;re using is brutal!)
&lt;img src=&quot;/blog/images/suckerfish_error.png&quot; style=&quot;margin-left: -200px;&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Picky needs the child to perform harakiri, since we do not know if the config is still ok.&lt;/p&gt;
&lt;p&gt;If all goes well, the master kills the &lt;strong&gt;other&lt;/strong&gt; children (since they need the updated config) and lets the one telling him to update the config live. You will get a confirmation message, and the interface will update with the current configuration.&lt;/p&gt;
&lt;p&gt;With suckerfish, children will die.&lt;/p&gt;
&lt;p&gt;Sorry about that. What you get in return is a comfortable way of updating the server config on the fly. And that is worth the tradeoff ;)&lt;/p&gt;
&lt;h2&gt;Performance?&lt;/h2&gt;
&lt;p&gt;I bombarded the search server with 100&amp;#8217;000 requests, concurrency 100 using ab:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;ab -n 100000 -c 100 127.0.0.1:8080/all/full?query=s&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Then, I started a Suckerfish and updated the config.&lt;/p&gt;
&lt;p&gt;Result: Not really noticeable. A short hiccup when the master reforks, but not really noticeable.&lt;/p&gt;
&lt;p&gt;If the config update fails, since only one worker child dies, the effect is almost not noticeable.&lt;/p&gt;
&lt;p&gt;If the update works, one worker child remains, and the others need to be forked. But Unicorn handles this exceptionally gracefully. Thanks, Unicorn! Really proud of ya. Love you. Still, the harakiri stays.&lt;/p&gt;
&lt;h2&gt;Disclaimer&lt;/h2&gt;
&lt;p&gt;Updating everything on the fly is nice. But beware: The configuration in &lt;code&gt;app/application.rb&lt;/code&gt; will not be updated. After experimenting with Suckerfish, you still need to update the config by hand.&lt;/p&gt;
&lt;p&gt;That&amp;#8217;s syntax pepper.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;that we can&amp;#8217;t just update a config in a child (in a multiprocessing server)&lt;/li&gt;
	&lt;li&gt;how a child can communicate with its parent&lt;/li&gt;
	&lt;li&gt;how Picky does it&lt;/li&gt;
	&lt;li&gt;how the picky-live gem looks and works&lt;/li&gt;
	&lt;li&gt;how you can try it yourself&lt;/li&gt;
	&lt;li&gt;that it is fast&lt;/li&gt;
	&lt;li&gt;that it can be dangerous if you don&amp;#8217;t know what to do&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new :)&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Searching with Picky&amp;#58; Live Parameters Part 1</title>
   <link href="http://florianhanke.com/blog/2011/01/25/searching-with-picky-live-parameters-part-1.html"/>
   <updated>2011-01-25T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/01/25/searching-with-picky-live-parameters-part-1</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/index.html&quot;&gt;Picky&lt;/a&gt; series on its workings. If you haven&amp;#8217;t tried it yet, do so in the &lt;a href=&quot;http://florianhanke.com/picky/getting_started.html&quot;&gt;Getting Started&lt;/a&gt; section. It&amp;#8217;s quick and painless :)&lt;/p&gt;
&lt;p&gt;This time I want to do a two-part post on live parameters.&lt;/p&gt;
&lt;h2&gt;What are Live Parameters?&lt;/h2&gt;
&lt;p&gt;Imagine this situation:&lt;/p&gt;
&lt;p&gt;You are sitting at your desk. A few levels below is an array of Picky servers, contentedly humming at a bagillion requests per second…&lt;/p&gt;
&lt;p&gt;Ok, this is actually a fantasy of mine, but bear with me.&lt;/p&gt;
&lt;p&gt;Suddenly, your boss enters, his hair pointier than ever!&lt;/p&gt;
&lt;p&gt;He tells you that a customer&amp;#8217;s space bar is not working anymore and now he&amp;#8217;d like to use the comma &lt;code&gt;,&lt;/code&gt; character to designate where words are separated.&lt;/p&gt;
&lt;p&gt;Of course you roll your eyes, but he doesn&amp;#8217;t give up. The customer needs to be served, no matter what!&lt;/p&gt;
&lt;p&gt;At this point, what would be really good to have is a way of changing how Picky splits queries into words.&lt;/p&gt;
&lt;p&gt;(Btw, the &lt;code&gt;splits_text_on&lt;/code&gt; option, a regexp, defines how Picky splits text into tokens, or words.)&lt;/p&gt;
&lt;p&gt;And you do, but: What you have to do now is change the config, deploy, restart the whole cluster (or send Unicorn the &lt;code&gt;HUP&lt;/code&gt; signal to have it restart), losing a fantastic amount of &lt;span class=&quot;caps&quot;&gt;CPU&lt;/span&gt; cycles that would have been better used for searching with Picky.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/suckerfish1.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;This would be called changing &lt;em&gt;lame parameters&lt;/em&gt;. Live parameters are the cool counterpart of lame parameters, the ones with hair, a sunny disposition, having that certain &lt;em&gt;je-ne-sais-quoi&lt;/em&gt; that only surfers have.&lt;/p&gt;
&lt;p&gt;Live parameters are parameters that can be changed hot – in the running server.&lt;/p&gt;
&lt;p&gt;Now wouldn&amp;#8217;t that be nice? Turns out it isn&amp;#8217;t as easy as I thought.&lt;/p&gt;
&lt;h2&gt;How do I achieve this?&lt;/h2&gt;
&lt;p&gt;The problem is that the master process – in Unicorn, or in any other multiprocessing-based server – holds the original copy of the configuration. You can easily update it in a child, but if the child dies, it will be replaced with a new one forked from the master, which has forgotten everything.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/suckerfish2.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;So let&amp;#8217;s call this thing that updates the configuration a &lt;a href=&quot;http://en.wikipedia.org/wiki/Echeneidae&quot;&gt;Suckerfish&lt;/a&gt;. Suckerfishes – or Remoras – attach to a host (Fig. A), mostly sharks, by sucking onto them. This suckerfish (in the form of a request) would attach itself to a child, and from there open a channel, a pipe, to the master, where it could update the master config.&lt;/p&gt;
&lt;p&gt;So after attaching itself, this fish would then whisper Picky sweet and golden nothings in its ear, causing it to update its master config.&lt;/p&gt;
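&lt;p&gt;To make the idea concrete, here is a minimal sketch in plain Ruby. It is not the picky-live implementation: &lt;code&gt;CONFIG&lt;/code&gt; is a hypothetical stand-in for the server configuration, and a real master would keep listening on the pipe instead of reading a single line.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical stand-in for the server configuration.
CONFIG = { splits_text_on: ' ' }

# The pipe is the channel from the child back to the master.
reader, writer = IO.pipe

pid = fork do
  reader.close
  CONFIG[:splits_text_on] = ',' # Only changes the child's own copy.
  writer.puts ','               # Whisper the new value to the master.
  writer.close
end

writer.close
Process.wait pid

before = CONFIG[:splits_text_on] # The master still has the old value.
CONFIG[:splits_text_on] = reader.gets.chomp
reader.close
after = CONFIG[:splits_text_on]  # Now the master has the new value.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The child only ever changes its own copy. The master config changes only because the master reads the new value from the pipe and applies it itself, so every child forked afterwards starts with the updated setting.&lt;/p&gt;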
&lt;h2&gt;That&amp;#8217;s fine, but where can I try it?&lt;/h2&gt;
&lt;p&gt;Suckerfish is ready, but not release-ready yet. So you could &lt;a href=&quot;http://github.com/floere/picky&quot;&gt;clone picky&lt;/a&gt;, and call &lt;code&gt;./install&lt;/code&gt; in the top level directory to install all 1.3.0 gems locally.&lt;/p&gt;
&lt;p&gt;But bear with me, for in part 2 (after the release of 1.3.0 and the picky-live gem, the &amp;#8220;Suckerfish&amp;#8221; gem) I&amp;#8217;ll show how this can be done and how you can use Suckerfish as a weapon against pointy-haired bosses, or just for easy experimentation with your search parameters.&lt;/p&gt;
&lt;p&gt;Don&amp;#8217;t worry, it will get technical soon ;)&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Searching with Picky&amp;#58; Data Sources</title>
   <link href="http://florianhanke.com/blog/2011/01/20/searching-with-picky-data-sources.html"/>
   <updated>2011-01-20T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/01/20/searching-with-picky-data-sources</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/index.html&quot;&gt;Picky&lt;/a&gt; series on its configuration. If you haven&amp;#8217;t tried it yet, do so in the &lt;a href=&quot;http://florianhanke.com/picky/getting_started.html&quot;&gt;Getting Started&lt;/a&gt; section. It&amp;#8217;s quick and painless :)&lt;/p&gt;
&lt;h2&gt;What is a Data Source in Picky?&lt;/h2&gt;
&lt;p&gt;A data source is where the indexes get their data. Every index needs a data source.&lt;/p&gt;
&lt;p&gt;The way to do this is to pass a source instance as the source param of the &lt;code&gt;index(identifier, source)&lt;/code&gt; method, like so (in &lt;code&gt;app/application.rb&lt;/code&gt;):
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;books_index = index :books, Sources::DB.new('SELECT id, title, author FROM books', file: 'app/db.yml')&lt;/code&gt;&lt;/pre&gt;
Here we passed a database source that uses a simple select. Which database the source uses is defined in the file &lt;code&gt;app/db.yml&lt;/code&gt; and follows the configuration structure of Active Record. You could, instead of passing in a &lt;code&gt;file&lt;/code&gt; option, just pass in the Active Record config hash.&lt;/p&gt;
&lt;p&gt;There are various data sources already defined beside the DB source (see below), but if the one you need is missing, writing your own is easy.&lt;/p&gt;
&lt;p&gt;After that comes the most important part in Picky! :) No, really. Because what we are now going to do is categorize the data we got from the source.&lt;/p&gt;
&lt;p&gt;Categorizing the data is so important because it allows Picky to make guesses as to which category a query word is in and get better feedback from the user. Say you categorized both first name and last name in a single category &lt;code&gt;name&lt;/code&gt;: Picky would not be able to help your users find what they are looking for, since it can&amp;#8217;t ask back specifically what they mean, like &amp;#8220;Did you mean Florian as first name or last name?&amp;#8221;.&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s best if you just &lt;a href=&quot;http://florianhanke.com/picky/getting_started.html&quot;&gt;get started&lt;/a&gt;, and see for yourself. Picky is best experienced, and not told.&lt;/p&gt;
&lt;p&gt;Back to the example: Now that we have defined a data source, it&amp;#8217;s easy to define a category on it. If you define a &lt;code&gt;title&lt;/code&gt; category
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;books_index.define_category :title&lt;/code&gt;&lt;/pre&gt;
it will use whatever data came back from the database.&lt;/p&gt;
&lt;p&gt;If your database doesn&amp;#8217;t have nice column names, don&amp;#8217;t worry, you have two options:
Do a &lt;code&gt;SELECT id, t_01 as title ...&lt;/code&gt; or use the &lt;code&gt;from&lt;/code&gt; option when you define the category:
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;books_index.define_category :title, :from =&amp;gt; :t_01&lt;/code&gt;&lt;/pre&gt;&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;from&lt;/code&gt; option is quite cool, as it allows you to have multiple categories on the same data! Say you wanted a similarity search in one category and none on the other:
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;books_index.define_category :title, :from =&amp;gt; :t_01
books_index.define_category :similar_title, :from =&amp;gt; :t_01, similarity: Similarity::Phonetic.new(3)&lt;/code&gt;&lt;/pre&gt;
Lots of possibilities, I&amp;#8217;m sure you&amp;#8217;ll find more useful ones!&lt;/p&gt;
&lt;p&gt;There&amp;#8217;s more. You can have crazy indexes where every category has its own data source:
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;books_index.define_category :title, source: Sources::CSV.new(:title, :author, file: 'data/library.csv', col_sep: ',')&lt;/code&gt;&lt;/pre&gt;
Now the title category takes its data from a library.csv. If you do this, be careful that all data sources use the same ids or Picky&amp;#8217;s core mechanism won&amp;#8217;t work.&lt;/p&gt;
&lt;h2&gt;Currently available data sources&lt;/h2&gt;
&lt;p&gt;Picky offers a few data sources: &lt;code&gt;DB&lt;/code&gt; for databases, &lt;code&gt;CSV&lt;/code&gt; for comma-separated files, &lt;code&gt;Couch&lt;/code&gt; for CouchDB, and &lt;code&gt;Delicious&lt;/code&gt; for Delicious bookmarks. Mmh.&lt;/p&gt;
&lt;p&gt;This is how you use them. We&amp;#8217;ve already seen the database source:
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Sources::DB.new('SELECT id, title, author FROM books', file: 'app/db.yml')&lt;/code&gt;&lt;/pre&gt;
Don&amp;#8217;t hesitate to use JOINs or other &lt;span class=&quot;caps&quot;&gt;SQL&lt;/span&gt; expressions for some extreme databasing!
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Sources::CSV.new(:title, :author, :isbn, :year, :publisher, :subjects, file: 'data/books.csv')&lt;/code&gt;&lt;/pre&gt;
This source assumes that your first column is the id column. It takes its data from the file given in the &lt;code&gt;file&lt;/code&gt; option.
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Sources::Couch.new(:title, :author, :isbn, url: 'http://localhost:5984/picky', keys: Sources::Couch::UUIDKeys.new)&lt;/code&gt;&lt;/pre&gt;
The CouchDB source takes the url where CouchDB serves its data. By default it assumes that you are using hex keys, but you can pass in one of &lt;code&gt;Sources::Couch::HexKeys.new&lt;/code&gt;, &lt;code&gt;Sources::Couch::UUIDKeys.new&lt;/code&gt;, or &lt;code&gt;Sources::Couch::IntegerKeys.new&lt;/code&gt; in the &lt;code&gt;keys&lt;/code&gt; option to tell Picky what keys you have.
I&amp;#8217;m afraid that currently you have to recalculate your keys in the client to get back the original keys. I am working on non-integer keys, but it takes time. Sorry about that.
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Sources::Delicious.new(:username, :password)&lt;/code&gt;&lt;/pre&gt;
Delicious is the easiest source, since it comes with fixed data categories &lt;code&gt;title&lt;/code&gt;, &lt;code&gt;tags&lt;/code&gt;, &lt;code&gt;url&lt;/code&gt; that you can categorize.&lt;/p&gt;
&lt;h2&gt;How do I define my own Data Source?&lt;/h2&gt;
&lt;p&gt;Defining your own source is easy. The CouchDB source, for example, has actually been sent in by &lt;a href=&quot;http://github.com/stanley&quot;&gt;Stanley&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This piece of code is the superclass of all sources in Picky and is there simply for illustrative purposes, so you can see what methods should be implemented:
&lt;a href=&quot;http://github.com/floere/picky/blob/master/server/lib/picky/sources/base.rb&quot;&gt;http://github.com/floere/picky/blob/master/server/lib/picky/sources/base.rb&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I recommend making your source a subclass of it as well, since it implements empty methods that are called by the indexer. But it actually just needs one worker method. This one:
&lt;code&gt;harvest(index, category)&lt;/code&gt;
It gets the index and the current category and should &lt;code&gt;yield(id, text_data_for_id)&lt;/code&gt;. It is called by the indexer when it needs the data.&lt;/p&gt;
&lt;p&gt;The two other methods that are called by the indexer are
&lt;code&gt;connect_backend&lt;/code&gt;, which is called once per index/category, and &lt;code&gt;take_snapshot&lt;/code&gt;, which is called once for each index, before &lt;code&gt;harvest&lt;/code&gt;-ing the data. Use it to create temporary tables etc.&lt;/p&gt;
&lt;p&gt;So if your duck subclasses &lt;code&gt;Sources::Base&lt;/code&gt;, quacks &lt;code&gt;#harvest&lt;/code&gt; and yields &lt;code&gt;id, text_data_for_id&lt;/code&gt; your data source is set to go!&lt;/p&gt;
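&lt;p&gt;As a rough sketch, such a source could look like the following. &lt;code&gt;ArraySource&lt;/code&gt; and its data are made up for illustration, the category is assumed to be a plain symbol, and the class does not subclass &lt;code&gt;Sources::Base&lt;/code&gt; here only so that the snippet runs without Picky loaded.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical in-memory source. In a Picky server, make it a
# subclass of Sources::Base to inherit the empty indexer hooks.
class ArraySource
  # rows is an array of [id, { category_name: text }] pairs.
  def initialize(rows)
    @rows = rows
  end

  # The one worker method: yields an id and the text for the category.
  def harvest(index, category)
    @rows.each do |id, fields|
      yield id, fields[category]
    end
  end
end

source = ArraySource.new([
  [1, { title: 'Moby Dick', author: 'Herman Melville' }],
  [2, { title: 'Ulysses',   author: 'James Joyce' }]
])

harvested = []
source.harvest(:books, :title) { |id, text| harvested.push [id, text] }
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After the call, &lt;code&gt;harvested&lt;/code&gt; holds one id and title pair per row, which is all the indexer needs from a source.&lt;/p&gt;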
&lt;p&gt;Simple and easy to understand, isn&amp;#8217;t it?&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;what a data source in Picky is.&lt;/li&gt;
	&lt;li&gt;what data sources are currently available.&lt;/li&gt;
	&lt;li&gt;how you write your own.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new :)&lt;/p&gt;
&lt;h2&gt;Contributing one to Picky&lt;/h2&gt;
&lt;p&gt;If you write your own data source, please let me know!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Searching with Picky&amp;#58; Partial Search</title>
   <link href="http://florianhanke.com/blog/2011/01/17/searching-with-picky-partial-search.html"/>
   <updated>2011-01-17T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/01/17/searching-with-picky-partial-search</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/index.html&quot;&gt;Picky&lt;/a&gt; series on its configuration. If you haven&amp;#8217;t tried it yet, do so in the &lt;a href=&quot;http://florianhanke.com/picky/getting_started.html&quot;&gt;Getting Started&lt;/a&gt; section. It&amp;#8217;s quick and painless :)&lt;/p&gt;
&lt;h2&gt;What is a Partial Search?&lt;/h2&gt;
&lt;p&gt;Partial searching is when the user only enters part of a query word, but the search engine still manages to find the whole word.&lt;/p&gt;
&lt;p&gt;Example:
We want to find all &lt;code&gt;chunky bacon&lt;/code&gt;. If the search engine supports a partial search, we should be able to search for just &lt;code&gt;chunky ba&lt;/code&gt; and &lt;code&gt;chunky bacon&lt;/code&gt; will still be found.&lt;/p&gt;
&lt;p&gt;Note that &lt;code&gt;chunky bards&lt;/code&gt; will also be found, and so will &lt;code&gt;chunky babes&lt;/code&gt;. So beware.&lt;/p&gt;
&lt;p&gt;Usually, the character used for partial searches is the asterisk, &lt;code&gt;*&lt;/code&gt;.
So you would search for &lt;code&gt;chunky ba*&lt;/code&gt; to have the search engine look for &lt;code&gt;ba&lt;/code&gt; followed by anything.&lt;/p&gt;
&lt;h2&gt;In Picky&lt;/h2&gt;
&lt;p&gt;At the time of writing, Picky offers a postfix partial search, meaning that only the &lt;em&gt;ending&lt;/em&gt; of a word can be left open. (Or a &lt;code&gt;Partial::None&lt;/code&gt; partial search that just ignores the &lt;code&gt;*&lt;/code&gt;.)&lt;/p&gt;
&lt;p&gt;The thing you use is &lt;code&gt;Partial::Substring&lt;/code&gt;, like this:
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;some_index = index :main, Sources::DB.new('SELECT id, title FROM books', file: 'app/db.yml')
some_index.define_category :title, partial: Partial::Substring.new(from: 1)&lt;/code&gt;&lt;/pre&gt;&lt;/p&gt;
&lt;p&gt;So you define a data category on the index and give it the &lt;code&gt;partial&lt;/code&gt; option. With this option you tell Picky to use the following class for generating the index in a special way to support partial indexing and querying.&lt;/p&gt;
&lt;p&gt;What we want in the example above is to have Picky use a &lt;code&gt;Partial::Substring&lt;/code&gt;, and have a query word match &lt;code&gt;from&lt;/code&gt; the first position (position &lt;code&gt;1&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;Example:
A word like &lt;code&gt;picky&lt;/code&gt; would match on &lt;code&gt;p&lt;/code&gt;, &lt;code&gt;pi&lt;/code&gt;, &lt;code&gt;pic&lt;/code&gt;, &lt;code&gt;pick&lt;/code&gt; and &lt;code&gt;picky&lt;/code&gt;. If you defined &lt;code&gt;from: 3&lt;/code&gt;, then it would only match &lt;code&gt;pic&lt;/code&gt;, &lt;code&gt;pick&lt;/code&gt;, &lt;code&gt;picky&lt;/code&gt;. Setting &lt;code&gt;from&lt;/code&gt; to &lt;code&gt;1&lt;/code&gt; is indexing intensive, but will find everything.&lt;/p&gt;
&lt;p&gt;It is super-easy to write your own partial search. See below for that. The sky is the limit, basically.&lt;/p&gt;
&lt;p&gt;On a side-note: Picky will always search the last word of a query with a &lt;code&gt;*&lt;/code&gt;, except if you use double quotes, like so: &lt;code&gt;&quot;chunky bac&quot;&lt;/code&gt;. This will really only find &lt;code&gt;chunky bac&lt;/code&gt;, not &lt;code&gt;chunky bacon&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;How does Picky do this?&lt;/h2&gt;
&lt;p&gt;Picky aims to be very extensible, so what it does is very simple.&lt;/p&gt;
&lt;p&gt;Picky uses a partial generator, like &lt;code&gt;Partial::Substring&lt;/code&gt; which takes an exact index (more below) and returns a partial index.&lt;/p&gt;
&lt;p&gt;An exact index in Picky is just a hash that maps words to an array of ids.&lt;/p&gt;
&lt;p&gt;So &lt;code&gt;Partial::Substring.new(from: 3)&lt;/code&gt; takes something like this:
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;{
  :picky =&amp;gt; [1, 16, 3, 999],
  :pickle =&amp;gt; [800, 3, 55]
}
&lt;/code&gt;&lt;/pre&gt;
(the index for exact matches) and transforms it into something like that:
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;{
  :pickle =&amp;gt; [800, 3, 55],
  :pickl  =&amp;gt; [800, 3, 55],
  :picky =&amp;gt; [1, 16, 3, 999],
  :pick  =&amp;gt; [1, 16, 3, 999, 800, 3, 55],
  :pic  =&amp;gt; [1, 16, 3, 999, 800, 3, 55]
}
&lt;/code&gt;&lt;/pre&gt;
So in &lt;code&gt;pic&lt;/code&gt;, there are both the ids from &lt;code&gt;picky&lt;/code&gt; and the ids from &lt;code&gt;pickle&lt;/code&gt;. If someone looks for &lt;code&gt;pic&lt;/code&gt;, we return a mix of both ids.&lt;/p&gt;
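&lt;p&gt;The transformation above can be sketched in a few lines of plain Ruby. This is not Picky&amp;#8217;s actual &lt;code&gt;Partial::Substring&lt;/code&gt;: the class name &lt;code&gt;SubstringSketch&lt;/code&gt; is made up, and keeping words that are shorter than &lt;code&gt;from&lt;/code&gt; as whole words is my assumption.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical, simplified substring partial generator.
class SubstringSketch
  def initialize(options = {})
    @from = options[:from] || 1
  end

  # Maps every prefix of at least @from characters to the ids of its word.
  def generate_from(exact_index)
    partial_index = {}
    exact_index.each_pair do |word, ids|
      text = word.to_s
      # Words shorter than @from are still indexed as whole words.
      shortest = [@from, text.size].min
      shortest.upto(text.size) do |length|
        key = text[0, length].to_sym
        partial_index[key] ||= []
        partial_index[key] += ids
      end
    end
    partial_index
  end
end

partial = SubstringSketch.new(from: 3).generate_from(picky: [1, 16], pickle: [800])
partial[:pic] # Holds the ids of both words: [1, 16, 800]
&lt;/code&gt;&lt;/pre&gt;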
&lt;h2&gt;How do I define my own Partial Search?&lt;/h2&gt;
&lt;p&gt;It is extremely simple. A partial search just needs to implement a &lt;code&gt;generate_from(exact_index)&lt;/code&gt; method that returns the new partial index.&lt;/p&gt;
&lt;p&gt;You could for example implement a partial index that has &lt;em&gt;random&lt;/em&gt; substring matches of up to 3 characters (silly, I know :)):
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;class Partial::Random
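  # Takes the exact index (words mapped to ids) and returns the partial index.
  def generate_from exact_index
    exact_index.inject({}) do |partial_index, word_and_ids|
      word, ids = *word_and_ids
      start  = rand word.size # Random start position in the word.
      length = rand(3) + 1    # Random length of 1 to 3 characters.
      random_substring = word.to_s[start, length].to_sym # Index keys are symbols.
      partial_index[random_substring] ||= []
      partial_index[random_substring] += ids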
      partial_index
    end
  end
end
&lt;/code&gt;&lt;/pre&gt;
This method returns a new index that might look like this:
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;Partial::Random.new.generate_from(:picky =&amp;gt; [1,2,3]) # =&amp;gt; { :ick =&amp;gt; [1,2,3] }
&lt;/code&gt;&lt;/pre&gt;
Of course, the example is not very performant – but legible for you.&lt;/p&gt;
&lt;p&gt;Finally, you&amp;#8217;d use it for your data categories in &lt;code&gt;app/application.rb&lt;/code&gt; like this:
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;some_index = index :main, Sources::DB.new('SELECT id, title FROM books', file: 'app/db.yml')
some_index.define_category :title, partial: Partial::Random.new
&lt;/code&gt;&lt;/pre&gt;&lt;/p&gt;
&lt;p&gt;A better idea might be to create a substring partial that generates a partial index where the asterisk is actually at the front of the word:
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;{
  :picky =&amp;gt; [1,2,3],
  :icky  =&amp;gt; [1,2,3],
  :cky   =&amp;gt; [1,2,3],
  :ky    =&amp;gt; [1,2,3],
  :y     =&amp;gt; [1,2,3]
}
&lt;/code&gt;&lt;/pre&gt;
This will match &lt;code&gt;picky&lt;/code&gt; if you enter just a &lt;code&gt;y&lt;/code&gt;!&lt;/p&gt;
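&lt;p&gt;One way this could be written, as a sketch only (the class name &lt;code&gt;SuffixSketch&lt;/code&gt; is made up, and it is not part of Picky):&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# Hypothetical partial generator where the asterisk is at the front.
class SuffixSketch
  # Maps every suffix of a word to the ids of that word.
  def generate_from(exact_index)
    partial_index = {}
    exact_index.each_pair do |word, ids|
      text = word.to_s
      text.size.times do |start|
        key = text[start, text.size].to_sym
        partial_index[key] ||= []
        partial_index[key] += ids
      end
    end
    partial_index
  end
end

suffixes = SuffixSketch.new.generate_from(picky: [1, 2, 3])
suffixes.keys # [:picky, :icky, :cky, :ky, :y]
&lt;/code&gt;&lt;/pre&gt;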
&lt;p&gt;Picky is very flexible – do what you want however you want it.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;what a partial search is.&lt;/li&gt;
	&lt;li&gt;how Picky does a partial search.&lt;/li&gt;
	&lt;li&gt;how a partial search is configured in Picky.&lt;/li&gt;
	&lt;li&gt;how you can write your own.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new :)&lt;/p&gt;
&lt;h2&gt;Contributing one to Picky&lt;/h2&gt;
&lt;p&gt;If you write your own, please let me know!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Searching with Picky&amp;#58; Character&amp;nbsp;Substitution</title>
   <link href="http://florianhanke.com/blog/2011/01/13/searching-with-picky-character-substituters.html"/>
   <updated>2011-01-13T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2011/01/13/searching-with-picky-character-substituters</id>
   <content type="html">&lt;p&gt;This is a post in the &lt;a href=&quot;http://florianhanke.com/picky/index.html&quot;&gt;Picky&lt;/a&gt; series on its configuration. If you haven&amp;#8217;t tried it yet, do so in the &lt;a href=&quot;http://florianhanke.com/picky/getting_started.html&quot;&gt;Getting Started&lt;/a&gt; section. It&amp;#8217;s quick and painless :)&lt;/p&gt;
&lt;h2&gt;What is Character Substitution?&lt;/h2&gt;
&lt;p&gt;Character substitution in a search engine is one of the first steps in the process of sanitizing your users&amp;#8217; input.&lt;/p&gt;
&lt;p&gt;Examples:
&lt;code&gt;ä =&amp;gt; ae&lt;/code&gt;,
&lt;code&gt;ø =&amp;gt; o&lt;/code&gt;,
&lt;code&gt;é =&amp;gt; e&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;This is used to make the search engine indifferent to a user&amp;#8217;s origin or way of writing.&lt;/p&gt;
&lt;p&gt;For example, my hometown is called &lt;code&gt;Zürich&lt;/code&gt;, with an &lt;em&gt;umlaut&lt;/em&gt; character, &lt;code&gt;ü&lt;/code&gt;.
German users will search with an ü. However, most users of the world don&amp;#8217;t know this character, and will simply type &lt;code&gt;Zurich&lt;/code&gt;. So what we want is to make the search engine ignore the &lt;em&gt;umlaut diacritics&lt;/em&gt;, the two dots over the u.&lt;/p&gt;
&lt;h2&gt;How do we do this?&lt;/h2&gt;
&lt;p&gt;Usually, what search engines do is perform a sort of &lt;em&gt;character substitution&lt;/em&gt; before putting text into the index, so &lt;code&gt;Zürich&lt;/code&gt; will go into the index as &lt;code&gt;zurich&lt;/code&gt;. For that, we character substituted &lt;code&gt;ü =&amp;gt; u&lt;/code&gt;. I also &lt;em&gt;lowercased&lt;/em&gt; it, since that is what search engines also do, to significantly save index space.&lt;/p&gt;
&lt;p&gt;So now we have &lt;code&gt;zurich&lt;/code&gt; in the index. If a user now searched for &lt;code&gt;Zürich&lt;/code&gt;, the search engine wouldn&amp;#8217;t find it.&lt;/p&gt;
&lt;p&gt;So what we do is also perform this character substitution (and the lowercasing) in a query, so that if the user enters an &lt;code&gt;ü&lt;/code&gt;, it is replaced by a &lt;code&gt;u&lt;/code&gt;, making &lt;code&gt;zurich&lt;/code&gt; out of &lt;code&gt;Zürich&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;In a nutshell, the indexing and the querying map both &lt;code&gt;Zürich&lt;/code&gt; and &lt;code&gt;Zurich&lt;/code&gt; to &lt;code&gt;zurich&lt;/code&gt;, and a user will find it regardless of whether they searched for my hometown with or without umlaut.&lt;/p&gt;
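&lt;p&gt;In plain Ruby, the nutshell version might look like this. The &lt;code&gt;normalize&lt;/code&gt; helper is hypothetical and only knows about the &lt;code&gt;ü&lt;/code&gt;. The point is merely that indexing and querying run the same substitution.&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# encoding: utf-8

# Hypothetical helper used for both indexing and querying.
def normalize(text)
  text.tr('ü', 'u').downcase
end

# Indexing: the text goes into the index in its normalized form.
index = {}
index[normalize('Zürich')] = [1]

# Querying: both spellings end up at the same index entry.
with_umlaut    = index[normalize('Zürich')]
without_umlaut = index[normalize('Zurich')]
&lt;/code&gt;&lt;/pre&gt;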
&lt;h2&gt;How do we do this in Picky?&lt;/h2&gt;
&lt;p&gt;Picky offers two class methods in a Picky &lt;code&gt;Application&lt;/code&gt; where you can define how characters are substituted, amongst other things:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;&lt;code&gt;default_indexing options = {}&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;default_querying options = {}&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The &lt;code&gt;default_&lt;/code&gt; in the method name comes from the fact that whatever options are given, will be used for all indexing and querying unless overridden. So most of the time you will be configuring it there.&lt;/p&gt;
&lt;p&gt;One of the options is &lt;code&gt;substitutes_characters_with&lt;/code&gt; and you give it a character substituter object that has a &lt;code&gt;#substitute(text)&lt;/code&gt; method.&lt;/p&gt;
&lt;p&gt;Picky already includes one for west european character sets. You use it as follows:&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;default_indexing substitutes_characters_with: CharacterSubstituters::WestEuropean.new&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I use the Ruby 1.9 hash style, &lt;code&gt;key: value&lt;/code&gt;, for that. The rocket I use for mapping things, &lt;code&gt;map '/some/path' =&amp;gt; controller&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;What the west european character substituter does is this:
&lt;code&gt;ä =&amp;gt; ae&lt;/code&gt;,
&lt;code&gt;Ä =&amp;gt; Ae&lt;/code&gt;,
&lt;code&gt;ë =&amp;gt; e&lt;/code&gt;,
&lt;code&gt;Ë =&amp;gt; E&lt;/code&gt;,
&lt;code&gt;ï =&amp;gt; i&lt;/code&gt;,
&lt;code&gt;Ï =&amp;gt; I&lt;/code&gt;,
&lt;code&gt;ö =&amp;gt; oe&lt;/code&gt;,
&lt;code&gt;Ö =&amp;gt; Oe&lt;/code&gt;,
&lt;code&gt;ü =&amp;gt; ue&lt;/code&gt;,
&lt;code&gt;Ü =&amp;gt; Ue&lt;/code&gt;,
and 22 others. See &lt;a href=&quot;http://github.com/floere/picky/blob/master/server/spec/lib/character_substituters/west_european_spec.rb&quot;&gt;the spec&lt;/a&gt; if you&amp;#8217;d like to know more.&lt;/p&gt;
&lt;p&gt;So a query like &lt;code&gt;Hände Nüsse&lt;/code&gt; will be sanitized to &lt;code&gt;haende nuesse&lt;/code&gt; before being further processed. (I lowercased it again here, since that is usually done as well.)&lt;/p&gt;
&lt;h2&gt;How do I define my own character substituter?&lt;/h2&gt;
&lt;p&gt;It is extremely simple. A character substituter just needs to implement a &lt;code&gt;substitute(text)&lt;/code&gt; method that returns the substituted text.&lt;/p&gt;
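&lt;p&gt;A deliberately tiny example, assuming you only care about German umlauts and want them reduced to their base letters (the class name is made up):&lt;/p&gt;
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;# encoding: utf-8

# Hypothetical substituter. It only needs a substitute(text) method.
class UmlautFreeSubstituter
  # Returns a copy of the text with umlauts replaced by their base letters.
  def substitute(text)
    text.tr 'äöüÄÖÜ', 'aouAOU'
  end
end

substituted = UmlautFreeSubstituter.new.substitute('Zürich')
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You would then hand an instance to the &lt;code&gt;substitutes_characters_with&lt;/code&gt; option, just like the included one.&lt;/p&gt;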
&lt;p&gt;See &lt;a href=&quot;http://github.com/floere/picky/blob/master/server/lib/picky/character_substituters/west_european.rb&quot;&gt;the source of the west european substituter&lt;/a&gt; if you want to see how I did it.&lt;/p&gt;
&lt;p&gt;Why is it so illegibly written?&lt;/p&gt;
&lt;p&gt;It is heavily optimized. Since this method will be called for all indexed data, and for each query, it should be performant.&lt;/p&gt;
&lt;p&gt;The west european spec includes two performance specs for that:
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;describe &quot;speed&quot; do
  it &quot;is fast&quot; do
    result = performance_of { @substituter.substitute('ä') }
    result.should &amp;lt; 0.00009
  end
  it &quot;is fast&quot; do
    result = performance_of { @substituter.substitute('abcdefghijklmnopqrstuvwxyz1234567890') }
    result.should &amp;lt; 0.00015
  end
end
&lt;/code&gt;&lt;/pre&gt;&lt;/p&gt;
&lt;p&gt;The method &lt;code&gt;performance_of&lt;/code&gt; is used in Picky quite often to maintain performance and notify me should anything get slower. It looks like this:
&lt;pre class=&quot;sh_ruby&quot;&gt;&lt;code&gt;def performance_of &amp;amp;block
  GC.disable
  result = Benchmark.realtime &amp;amp;block
  GC.enable
  result
end
&lt;/code&gt;&lt;/pre&gt;&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So we&amp;#8217;ve seen&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;that most search engines need a character substituter.&lt;/li&gt;
	&lt;li&gt;that character substituters help your international users find things.&lt;/li&gt;
	&lt;li&gt;how they are configured in Picky.&lt;/li&gt;
	&lt;li&gt;how you can write your own.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope you learnt something new :)&lt;/p&gt;
&lt;h2&gt;Contributing one to Picky&lt;/h2&gt;
&lt;p&gt;If you write your own, please let me know!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Speccing methods called in initialize</title>
   <link href="http://florianhanke.com/blog/2010/10/27/speccing-methods-called-in-initialize.html"/>
   <updated>2010-10-27T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2010/10/27/speccing-methods-called-in-initialize</id>
   <content type="html">&lt;p&gt;Recently when writing &lt;a href=&quot;http://floere.github.com/picky/&quot;&gt;Picky, the clever small text search engine&lt;/a&gt;, I encountered the following problem: How do I test methods that are called in an initializer?&lt;/p&gt;
&lt;p&gt;(Of course I could call &lt;code&gt;Testee.new&lt;/code&gt; in the spec and then just call the method again. But what if that method sets a state?)&lt;/p&gt;
&lt;p&gt;In code:
&lt;script src=&quot;http://gist.github.com/648878.js&quot;&gt;&lt;/script&gt;&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Why open sourcing security critical software is important</title>
   <link href="http://florianhanke.com/blog/2010/10/06/why-open-sourcing-security-critical-software-is-important.html"/>
   <updated>2010-10-06T00:00:00+11:00</updated>
   <id>http://florianhanke.com/blog/2010/10/06/why-open-sourcing-security-critical-software-is-important</id>
   <content type="html">&lt;p&gt;&lt;a href=&quot;http://www.freedom-to-tinker.com/blog/jhalderm/hacking-dc-internet-voting-pilot&quot;&gt;Why open sourcing security critical software is important&lt;/a&gt;&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Profiling MySQL Queries</title>
   <link href="http://florianhanke.com/blog/2010/09/27/in-detail-performance-measurements-for-MySQL.html"/>
   <updated>2010-09-27T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/09/27/in-detail-performance-measurements-for-MySQL</id>
   <content type="html">&lt;p&gt;&lt;a href=&quot;http://dev.mysql.com/tech-resources/articles/using-new-query-profiler.html&quot;&gt;Profiling MySQL Queries&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;In-detail performance measurements for MySQL queries.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Drawing in the browser</title>
   <link href="http://florianhanke.com/blog/2010/05/22/drawing-in-the-browser.html"/>
   <updated>2010-05-22T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/05/22/drawing-in-the-browser</id>
   <content type="html">&lt;p&gt;&lt;a href=&quot;http://mugtug.com/sketchpad/&quot;&gt;Drawing in the Browser&lt;/a&gt;&lt;/a&gt;
&amp;#8230; using HTML5.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Most important Ruby method 2010</title>
   <link href="http://florianhanke.com/blog/2010/05/21/most-important-ruby-method.html"/>
   <updated>2010-05-21T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/05/21/most-important-ruby-method</id>
   <content type="html">&lt;p&gt;It&amp;#8217;s &lt;code&gt;squeeze&lt;/code&gt;!
&lt;script src=&quot;http://gist.github.com/408937.js&quot;&gt;&lt;/script&gt;&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Programming Applied Mathematics</title>
   <link href="http://florianhanke.com/blog/2010/05/13/programming-applied-mathematics.html"/>
   <updated>2010-05-13T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/05/13/programming-applied-mathematics</id>
   <content type="html">&lt;p&gt;&lt;code&gt;Programming is one of the most difficult branches of applied mathematics / the poorer mathematicians had better remain pure mathematicians.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;– Edsger W. Dijkstra&lt;/p&gt;
&lt;p&gt;(via &lt;a href=&quot;http://fuckyeahcomputerscience.tumblr.com/&quot;&gt;fuckyeahcomputerscience&lt;/a&gt;)&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">New programming jargon</title>
   <link href="http://florianhanke.com/blog/2010/05/11/new-programming-jargon.html"/>
   <updated>2010-05-11T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/05/11/new-programming-jargon</id>
   <content type="html">&lt;p&gt;&lt;a href=&quot;http://www.globalnerdy.com/2010/05/09/new-programming-jargon/&quot;&gt;New Programming Jargon&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Need to remember Bugfoot and Shrug Report… And especially Duck!&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Fat slows you down</title>
   <link href="http://florianhanke.com/blog/2010/05/09/fat-slows-you-down.html"/>
   <updated>2010-05-09T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/05/09/fat-slows-you-down</id>
   <content type="html">&lt;p&gt;Fat slows you down.&lt;/p&gt;
&lt;p&gt;If you really need speed in Ruby 1.9, consider this example:
&lt;script src=&quot;http://gist.github.com/395419.js&quot;&gt;&lt;/script&gt;&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">You already knew that, right? (Assigning with splats)</title>
   <link href="http://florianhanke.com/blog/2010/04/30/you-already-knew-that-right.html"/>
   <updated>2010-04-30T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/04/30/you-already-knew-that-right</id>
   <content type="html">&lt;script src=&quot;http://gist.github.com/385121.js&quot;&gt;&lt;/script&gt;&lt;p&gt;Referring to the fact that I want to sleep with the splat operator…&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">97% is as good as a 100%</title>
   <link href="http://florianhanke.com/blog/2010/04/30/if-you-re-in-a-hurry.html"/>
   <updated>2010-04-30T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/04/30/if-you-re-in-a-hurry</id>
   <content type="html">&lt;p&gt;&lt;code&gt;If you're in a hurry and you need to pack up your bags and go, 97% is as good as a 100%. The 100% mark does not have the same (show-stopping) magic as 0%, where the difference between 3% and 0% really is important.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;– &lt;a href=&quot;http://www.two-sdg.demon.co.uk/curbralan/papers/minimalism/OmitNeedlessCode.html&quot;&gt;Omit Needless Code&lt;/a&gt;&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">elements.each(*p)</title>
   <link href="http://florianhanke.com/blog/2010/04/30/elements-each-p.html"/>
   <updated>2010-04-30T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/04/30/elements-each-p</id>
   <content type="html">&lt;p&gt;I often use&lt;/p&gt;
&lt;p&gt;&lt;code&gt;ary.map(&amp;amp;:upcase)&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;instead of&lt;/p&gt;
&lt;p&gt;&lt;code&gt;ary.map { |a| a.upcase }&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;But what can I do when each element should be passed as a parameter, as in the following code?&lt;/p&gt;
&lt;p&gt;&lt;code&gt;ary.each { |a| p a }&lt;/code&gt;&lt;/p&gt;
&lt;script src=&quot;http://gist.github.com/385117.js&quot;&gt;&lt;/script&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Strategy pattern pattern pattern pattern</title>
   <link href="http://florianhanke.com/blog/2010/04/29/strategy-pattern-pattern-pattern-pattern.html"/>
   <updated>2010-04-29T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/04/29/strategy-pattern-pattern-pattern-pattern</id>
   <content type="html">&lt;p&gt;A pattern that I often see cropping up in my &lt;a href=&quot;http://github.com/floere/gosu_extensions&quot;&gt;game framework&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It can be used for configuring subclasses that act according to an order of calls defined in the superclass. How exactly the calls work can be defined in the subclasses (or in an external configuration) using the class methods.&lt;/p&gt;
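&lt;p&gt;As a rough illustration of the idea (a minimal sketch, not the code from the game framework; the names Thing, movement, rendering, Rocket and Rock are made up for this example): the superclass fixes the order of calls in update, while class methods let each subclass configure what the individual calls do.&lt;/p&gt;

```ruby
# Hypothetical names throughout; none of this is taken from gosu_extensions.
class Thing
  # Class-level configuration methods: each call (re)defines the
  # instance method that the fixed sequence in #update will invoke.
  def self.movement(kind)
    define_method(:move) { "moves #{kind}" }
  end

  def self.rendering(kind)
    define_method(:draw) { "draws #{kind}" }
  end

  # Defaults, used when a subclass configures nothing.
  movement :nowhere
  rendering :nothing

  # The superclass decides the order of calls.
  def update
    [move, draw]
  end
end

# Class.new(Thing) creates a subclass of Thing; the block is its class body.
Rocket = Class.new(Thing) do
  movement :fast
  rendering :as_a_sprite
end

Rock = Class.new(Thing) do
  rendering :as_a_circle
end

p Rocket.new.update # prints ["moves fast", "draws as_a_sprite"]
p Rock.new.update   # prints ["moves nowhere", "draws as_a_circle"]
```

&lt;p&gt;Rock only configures rendering, so it falls back to the default movement set in the superclass.&lt;/p&gt;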
&lt;script src=&quot;http://gist.github.com/383325.js&quot;&gt;&lt;/script&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Mastery is a mindset</title>
   <link href="http://florianhanke.com/blog/2010/04/26/mastery-is-a-mindset.html"/>
   <updated>2010-04-26T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/04/26/mastery-is-a-mindset</id>
   <content type="html">&lt;p&gt;&lt;code&gt;Mastery is a mindset.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;From the book &amp;#8220;Drive&amp;#8221;, by Pink.&lt;/p&gt;
&lt;p&gt;I&amp;#8217;d reformulate it as: &amp;#8220;Mastery is neither a question of time nor of experience, but a mindset.&amp;#8221;&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Ruby 1.9 params</title>
   <link href="http://florianhanke.com/blog/2010/04/20/ruby-19-params.html"/>
   <updated>2010-04-20T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/04/20/ruby-19-params</id>
   <content type="html">&lt;script src=&quot;http://gist.github.com/372372.js&quot;&gt;&lt;/script&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Riddle</title>
   <link href="http://florianhanke.com/blog/2010/04/18/riddle.html"/>
   <updated>2010-04-18T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/04/18/riddle</id>
   <content type="html">&lt;p&gt;&lt;code&gt;3735928559&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Why is this number unappealing to vegetarians?&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">A hole in the wall</title>
   <link href="http://florianhanke.com/blog/2010/04/18/a-hole-in-the-wall.html"/>
   <updated>2010-04-18T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/04/18/a-hole-in-the-wall</id>
   <content type="html">&lt;p&gt;His stool leaned back at a dangerous angle, he displays a pair of jamaica-colored sneakers to the public. Them sticking out of his business hole seems rather odd, considering the sober surroundings of the Niederdorf, or &amp;#8220;nether village&amp;#8221;, as this particular place in Zürich is called.&lt;/p&gt;
&lt;p&gt;Slurping a botanic tea, idly facebooking and tumbling through the depths, no, shallows of the net, waiting for customers. It&amp;#8217;s been that way now for more than a day, and he starts to wonder if the customer specific context ads are just a fluke.&lt;/p&gt;
&lt;p&gt;An abrupt &amp;#8220;Oh hey&amp;#8221; directed his way throws him out of the structural code improvements that have been waiting for him at the back of his mind. &amp;#8220;Hey&amp;#8221;, a burly businessman with slightly high blood pressure &amp;#8211; he surmises from the corona of hair still clinging on – asks: &amp;#8220;Are you the man that types?&amp;#8221; &amp;#8220;Yes, yes I do, I code.&amp;#8221; &amp;#8220;Oh, code. Yeah, sorry, my bad. Well, look, I need a small program that does a few calculations based on this.&amp;#8221;&lt;/p&gt;
&lt;p&gt;And he whips out a napkin with a few calculations on it, in black lines that look to be from an eyeliner, or a piece of coal. &amp;#8220;Don&amp;#8217;t mind the looks – how long do you think this takes?&amp;#8221;&lt;/p&gt;
&lt;p&gt;&amp;#8220;Hmm, well. I think the design might take me a few hours. Then we&amp;#8217;d need to meet again to see if we&amp;#8217;re on the right track. Then I&amp;#8217;ll have to code it, and clean it up a little. Might take me another 2 hours.&amp;#8221;&lt;/p&gt;
&lt;p&gt;&amp;#8220;For 90 an hour, right?&amp;#8221; &amp;#8220;That&amp;#8217;s right, as advertised.&amp;#8221; &amp;#8220;Ok, well. See you in three.&amp;#8221;&lt;/p&gt;
&lt;p&gt;He rights his stool, leans forward, sketches boxes and lines, boxes and lines, lines and boxes. Then he goes for a quick walk, takes in the morning, letting the cogs turn. Half an hour of showing tourists the view, and a hot chocolate at the riverside. Finally, he plumps down in front of his sleek, metal-clad machine and types.&lt;/p&gt;
&lt;p&gt;What he did was transform the mascara lines into byroliner lines and boxes as a straw the mind can cling to, and from there to typed text on a luminescent screen, for him to read and others to understand, finally into the core of the machine, and the zeros and ones people who have no understanding regurgitate so often.&lt;/p&gt;
&lt;p&gt;Entering the formula was pretty straightforward. But there are other things to consider: What is the best user interface for a burly businessman? Will it be used repeatedly? As if on cue, burly biz arrives and asks &amp;#8220;Done yet?&amp;#8221; &amp;#8220;Oh hi.&amp;#8221;&lt;/p&gt;
&lt;p&gt;Back and forth: The customer starts with a lot of questions, have you put this in? He cuts him short, and explains what he will see, his understanding of the formula. There is much going on, but it boils down to this: The clearing of misunderstandings. And they get cleared. It must be his happy day, the businessman knows the power of an ad-hoc team, and how it should work, how progress can come from it.&lt;/p&gt;
&lt;p&gt;The discussion dies down, lots of nodding all around, and smiles emerge. A handshake, and both are off – shorty no doubt to a meeting, where money and hand sweat is moved, our coder off to the plane of lines and boxes. A prototype stands, but this is not where it ends. He wants it to be perfect. After all, he is a craftsman, and craft is what defines him. The table might look nice to an outsider, but the craft is inside: The distribution of weight, the structure of the wood: What holds the thing together and doesn&amp;#8217;t make it bend, for year after year.&lt;/p&gt;
&lt;p&gt;Before he cleans up however, there is yoga waiting for him, and another stroll along the riverside. Can it be improved? How? The response comes to him during the most innocent of activities, stroking a cat that has found, purring, a new home around his legs. He leaves the cat slightly shocked behind – but she improves the situation by licking her paw – and runs up the street, repeating and repeating the idea, urging it not to leave his head.&lt;/p&gt;
&lt;p&gt;Panting, he types it in. The tests run, the code checker tool gives him a green light. He opens it, it works. Puts it on a stick, wraps it in a package, puts it into a nice box which brandishes his logo – doodled on the back of a napkin by his sister, three years ago – and puts it aside for the customer, due to arrive in an hour.&lt;/p&gt;
&lt;p&gt;And finally. Finally the sneakers rest again on the sill of the hole.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Oh yeah, Amazon?</title>
   <link href="http://florianhanke.com/blog/2010/04/14/oh-yeah-amazon-question-mark.html"/>
   <updated>2010-04-14T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/04/14/oh-yeah-amazon-question-mark</id>
   <content type="html">&lt;p&gt;From the latest Newsletter:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Support for Session Stickiness in Elastic Load Balancing Amazon Elastic MapReduce Introduces Custom Cluster Configuration Option&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;They also have Gurble Blurble Fickleness, introducing Jambawambing Lordle Figuconation Schnorptions.&lt;/p&gt;
&lt;p&gt;At least that&amp;#8217;s what I hear when I read stuff like that.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">IE didn't get the CSS3 memo?</title>
   <link href="http://florianhanke.com/blog/2010/04/13/ie-didnt-get-the-css3-memo-question-mark.html"/>
   <updated>2010-04-13T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/04/13/ie-didnt-get-the-css3-memo-question-mark</id>
   <content type="html">&lt;p&gt;&lt;a href=&quot;http://kimblim.dk/css-tests/selectors/&quot;&gt;IE didn&amp;#8217;t get the CSS3 memo?&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Or, as is my guess: The code they based the new browsers on was fully untested, totally disorganized, and thus brutally hard to extend. IE9, though, gives one hope.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Challenged</title>
   <link href="http://florianhanke.com/blog/2010/04/12/challenged.html"/>
   <updated>2010-04-12T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/04/12/challenged</id>
   <content type="html">&lt;p&gt;The framework looms in front of you. Clouds cover the gray sky. You plunge in. Full unit test rewrite, nothing is where it was before, but right: The mailbox is in front of the house, the bathtub is finally in the bath, the fridge contains organic food. There is a pot on the fire, full of juicy stuff.&lt;/p&gt;
&lt;p&gt;But you are wearing glasses that let you see only 10 centimeters. You set wild eyes on the integration tests: Guests are entering the house, trying to eat from the toilet, sleeping in the oven, or jumping out of windows. It is fail, fail, fail, wherever you happen to look.&lt;/p&gt;
&lt;p&gt;You are close to despair. Everything is right. Right? You trudge on, teeth gnashing.&lt;/p&gt;
&lt;p&gt;Then, somehow, you adjust the doormat ever so slightly, piece in the last crumb of information. And magically, it just works. Everything. Just. Works. The gargantuan task is finished. For minutes, you revel in the sun&amp;#8217;s rays. The clouds, they never reappear.&lt;/p&gt;
&lt;p&gt;It is done.&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Stuttering Proc</title>
   <link href="http://florianhanke.com/blog/2010/04/07/stuttering-proc.html"/>
   <updated>2010-04-07T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/04/07/stuttering-proc</id>
   <content type="html">&lt;script src=&quot;http://gist.github.com/358947.js&quot;&gt;&lt;/script&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Reloading a running Ruby application</title>
   <link href="http://florianhanke.com/blog/2010/04/05/reloading-a-running-ruby-application.html"/>
   <updated>2010-04-05T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/04/05/reloading-a-running-ruby-application</id>
   <content type="html">&lt;p&gt;Here&amp;#8217;s how I do it:
&lt;script src=&quot;http://gist.github.com/351776.js&quot;&gt;&lt;/script&gt;&lt;/p&gt;</content>
   
 </entry>
 
 <entry>
   <title type="html">Javuby?</title>
   <link href="http://florianhanke.com/blog/2010/04/05/javuby-question-mark.html"/>
   <updated>2010-04-05T00:00:00+10:00</updated>
   <id>http://florianhanke.com/blog/2010/04/05/javuby-question-mark</id>
   <content type="html">&lt;script src=&quot;http://gist.github.com/356227.js&quot;&gt;&lt;/script&gt;</content>
   
 </entry>
 

</feed>