Searching with Picky: Range/Area/Volume etc. Search

ruby / picky / gems

This is a post in the Picky series on its workings. If you haven’t tried it yet, do so in the Getting Started section. It’s quick and painless :)

This post is all about searching areas, volumes, space and time – and more!

tl;dr

Using ranged_category instead of category in index definition lets you search inside numeric ranges (instead of exact or partial strings). Example:

ranged_category :height,
                50,          # units "around" the searched value, here: meters
                precision: 5 # very high precision, 1% error

Warp Area

Range Search

Range, how?

Area Search

Volume Search

“Find all locations in a thin slice of N47.11 to N47.13, whose names start with F, that are in height 362m to 462m”

Space & Time Search

Different Radiuses/Volume Sizes etc.

Caveats

Conclusion

Range Search

Picky is good at intersecting stuff – and guessing which of the intersections you actually are looking for.

The pink part is where e.g. “name:eisenhower” and “title:wa”® intersects in a speech database, and Picky finds it. The blue part is where “name:eisenhower” and “title:wa”(rthog) intersect. Less interesting, and Picky thinks so too.

Usually, what Picky does is intersecting these circles of words you are looking for, resulting in funky Venn diagrams that have so successfully been used in 60s style living rooms.

Hey, doesn’t a map have grids that intersect somehow? What if Picky could intersect the area between the x lines (light blue) with the area between the y lines (also light blue)?

What we’d get is the results in the pinkish area.

This type of diagram has been successfully used by Piet Mondrian at the beginning of last century.

Now, if we could pass Picky the median x value, and the median y value and get it to return the results in the pink area, wouldn’t that be something?

Indeed it would, and indeed it already can. You probably just didn’t know.

But how can I do a range search?

Apart from searching exact or partial strings with the #category method, Picky offers a #ranged_category method for numerical values.

Let me show you how it works. Let’s say that I have a CSV file, mountains.csv, with the mountains of the world, from lowest to highest, in meters:

1, Tokelau (NZ), 5.0
...
124, Vaalserberg (NL), 321.9
...
78513, Mount Everest (NP), 8850.0

Now we want the user to be able to enter

and get all the mountains that are +/- 50 meters in height away from 200.

For that you use ranged_category(name, units_around, options = {}):

data_source = Sources::CSV.new(:location, :height, file: 'data/mountains.csv')
mountains = Index::Memory.new(:mountains, data_source) do
  category        :name
  ranged_category :height, 50, precision: 3 # 50 is the units around the searched height
end

So we’d have a name (that is searched with the default config, like text) and a height that is searched with a precision of 3, 50 meters around the number the user enters.

What does the precision mean?

Precision 1, the default, has 5% error and is really, really fast, and precision 5 has 1% error and is just fast. You can go up to wherever you want, but 5 is a good tradeoff if you need a precise result.

Note that – since Picky does intersections – you can also search for height AND name at the same time. If you add a full partial search option to the name category, category :name, partial: Partial::Substring.new(from: 1) then when you search for example for

you will find all the mountains from height 250 to 350 whose name starts with “va”. Nice eh?

Nice indeed, but can I use this for an area search?

Let’s say I have a CSV file, swiss_places.csv, with all places, 20910 in all, in Switzerland, like so:

1,Zuger See,47.11667,8.48333
2,Zwischbergental,46.16667,8.13333
3,Zwischbergen,46.16667,8.11667
4,Zwingen,47.43825,7.53027
...
20910,Les 4 Vallées,46.17572,7.32142

This is the data. Then I tell Picky where to find the data (in the CSV) and how to index it:

data_source = Sources::CSV.new(:location, :north, :east, file: 'data/swiss_places.csv')
swiss_places = Index::Memory.new(:swiss_places, data_source) do
  category        :location
  ranged_category :north, 0.01, precision: 3
  ranged_category :east,  0.01, precision: 3
end

This means that we can search for the location, and the north and the east value, with 0.01 leeway around the searched number. So entering 47.12 would find numbers in the range 47.11..47.13.

Now, if you search for

you find the “Zuger See”.

Since for Switzerland, the north and east coordinates are exclusive (one around 47, the other around 8.4), Picky knows what is what by itself.

If your values aren’t exclusive, for example both are in the range 1..3, then entering the search

might make Picky ask you which one is what. It’s not clear if you want 1.3 from the one and 2.4 from the other, and voice versa. This can be remedied by exclusively specifying what is what:

The best thing is that you don’t need to use the Picky interface. You could whip up a Javascript interface (of some area) where you click into and run searches on Picky, then returning results that are displayed in the area.

But now, let’s go a little crazy!

Volumetric Search

Say, the swiss data also had heights:

1,Zuger See,47.11667,8.48333,410.0
2,Zwischbergental,46.16667,8.13333,
...
20910,Les 4 Vallées,46.17572,7.32142,1205.3

Just add the new line in the index definition, and in the source:

data_source = Sources::CSV.new(:location, :north, :east, :height, file: 'data/swiss_places.csv')
swiss_places = Index::Memory.new(:swiss_places, data_source) do
  ...
  ranged_category :height, 50
end

Voilà!

This would make you find the “Zugerberg”, while using a height of 500 wouldn’t.

Let’s get funky!

We don’t need to use all categories:

Funky search, but this would find all locations in a thin band of north 47.11..47.13, whose names start with f, and that are in height 362..462.

Let’s add more dimensions.

Space and Time

So how would we search in space and time? Space is easy, that is just a volumetric search.

Now: How would you add in time?

Probably you’d index it in seconds from January 1st, 1970 or something like that, then define a ranged search with “radius” 1800. This would make Picky find things in the hour around the searched seconds since 1970.

I want to be able to search in 1m, 10m, 100m

Now, as you saw, we looked for heights 50 meters around it using:

ranged_category :height, 50

What if we want to search 1 meter, 10 meters, 100 meters around it, choosing as we go?

This is accomplished by adding more searchable categories, like so. You name the category specifically, and tell Picky from where in the data source it should get the data, using the from option.

ranged_category :height1,     1, from: height
ranged_category :height10,   10, from: height
ranged_category :height100, 100, from: height

Choosing from the categories is done as usual. If you want 10 meters, search like this:

This will find locations of heights 402..422.

Caveats

Actually, if you use the ranged_category on a larger area on a ball, like earth. For example in Australia – the place I am staying in, currently – what you will find is that the more south you go, towards the pole, the less square and more rectangular your search area will get. This is because Picky does not correct the ball’s sphere. I’m working on it.

So, Picky cannot handle your balls yet.

For small countries it is still useful, and of course for lots of graph searches etc.

Flat things it does marvellously. And super fast!

Conclusion

So we’ve seen

  1. how Picky can search areas.
  2. how Picky can search volumes.
  3. how Picky can search any number of dimensions.
  4. how you can choose any combination of areas and other features.
  5. how you search in different ranges on the same thing/category.
  6. that you cannot quite search on a ball, like earth.

Hope you learnt something new!

Next Searching with Picky: In the Terminal

Share


Previous

Comments?