Searching with Picky: Range/Area/Volume etc. Search Tweet
This is a post in the Picky series on its workings. If you haven’t tried it yet, do so in the Getting Started section. It’s quick and painless :)
This post is all about searching areas, volumes, space and time – and more!
tl;dr
Using ranged_category
instead of category
in index definition lets you search inside numeric ranges (instead of exact or partial strings). Example:
ranged_category :height,
50, # units "around" the searched value, here: meters
precision: 5 # very high precision, 1% error
Warp Area
Different Radiuses/Volume Sizes etc.
Range Search
Picky is good at intersecting stuff – and guessing which of the intersections you actually are looking for.
The pink part is where e.g. “name:eisenhower” and “title:wa”® intersects in a speech database, and Picky finds it. The blue part is where “name:eisenhower” and “title:wa”(rthog) intersect. Less interesting, and Picky thinks so too.
Usually, what Picky does is intersecting these circles of words you are looking for, resulting in funky Venn diagrams that have so successfully been used in 60s style living rooms.
Hey, doesn’t a map have grids that intersect somehow? What if Picky could intersect the area between the x lines (light blue) with the area between the y lines (also light blue)?
What we’d get is the results in the pinkish area.
This type of diagram has been successfully used by Piet Mondrian at the beginning of last century.
Now, if we could pass Picky the median x value, and the median y value and get it to return the results in the pink area, wouldn’t that be something?
Indeed it would, and indeed it already can. You probably just didn’t know.
But how can I do a range search?
Apart from searching exact or partial strings with the #category
method, Picky offers a #ranged_category
method for numerical values.
Let me show you how it works. Let’s say that I have a CSV file, mountains.csv
, with the mountains of the world, from lowest to highest, in meters:
1, Tokelau (NZ), 5.0
...
124, Vaalserberg (NL), 321.9
...
78513, Mount Everest (NP), 8850.0
Now we want the user to be able to enter
200
and get all the mountains that are +/- 50 meters in height away from 200.
For that you use ranged_category(name, units_around, options = {})
:
data_source = Sources::CSV.new(:location, :height, file: 'data/mountains.csv')
mountains = Index::Memory.new(:mountains, data_source) do
category :name
ranged_category :height, 50, precision: 3 # 50 is the units around the searched height
end
So we’d have a name (that is searched with the default config, like text) and a height that is searched with a precision of 3, 50 meters around the number the user enters.
What does the precision mean?
Precision 1, the default, has 5% error and is really, really fast, and precision 5 has 1% error and is just fast. You can go up to wherever you want, but 5 is a good tradeoff if you need a precise result.
Note that – since Picky does intersections – you can also search for height AND name at the same time. If you add a full partial search option to the name category, category :name, partial: Partial::Substring.new(from: 1)
then when you search for example for
300 va
you will find all the mountains from height 250 to 350 whose name starts with “va”. Nice eh?
Nice indeed, but can I use this for an area search?
Let’s say I have a CSV file, swiss_places.csv
, with all places, 20910 in all, in Switzerland, like so:
1,Zuger See,47.11667,8.48333
2,Zwischbergental,46.16667,8.13333
3,Zwischbergen,46.16667,8.11667
4,Zwingen,47.43825,7.53027
...
20910,Les 4 Vallées,46.17572,7.32142
This is the data. Then I tell Picky where to find the data (in the CSV) and how to index it:
data_source = Sources::CSV.new(:location, :north, :east, file: 'data/swiss_places.csv')
swiss_places = Index::Memory.new(:swiss_places, data_source) do
category :location
ranged_category :north, 0.01, precision: 3
ranged_category :east, 0.01, precision: 3
end
This means that we can search for the location, and the north and the east value, with 0.01 leeway around the searched number. So entering 47.12 would find numbers in the range 47.11..47.13.
Now, if you search for
47.12, 8.48
you find the “Zuger See”.
Since for Switzerland, the north and east coordinates are exclusive (one around 47, the other around 8.4), Picky knows what is what by itself.
If your values aren’t exclusive, for example both are in the range 1..3, then entering the search
1.3, 2.4
might make Picky ask you which one is what. It’s not clear if you want 1.3 from the one and 2.4 from the other, and voice versa. This can be remedied by exclusively specifying what is what:
north:1.3, east:2.4
The best thing is that you don’t need to use the Picky interface. You could whip up a Javascript interface (of some area) where you click into and run searches on Picky, then returning results that are displayed in the area.
But now, let’s go a little crazy!
Volumetric Search
Say, the swiss data also had heights:
1,Zuger See,47.11667,8.48333,410.0
2,Zwischbergental,46.16667,8.13333,
...
20910,Les 4 Vallées,46.17572,7.32142,1205.3
Just add the new line in the index definition, and in the source:
data_source = Sources::CSV.new(:location, :north, :east, :height, file: 'data/swiss_places.csv')
swiss_places = Index::Memory.new(:swiss_places, data_source) do
...
ranged_category :height, 50
end
Voilà!
47.12, 8.48, 400
This would make you find the “Zugerberg”, while using a height of 500 wouldn’t.
Let’s get funky!
We don’t need to use all categories:
47.12, f*, 412
Funky search, but this would find all locations in a thin band of north 47.11..47.13, whose names start with f, and that are in height 362..462.
Let’s add more dimensions.
Space and Time
So how would we search in space and time? Space is easy, that is just a volumetric search.
Now: How would you add in time?
Probably you’d index it in seconds from January 1st, 1970 or something like that, then define a ranged search with “radius” 1800. This would make Picky find things in the hour around the searched seconds since 1970.
I want to be able to search in 1m, 10m, 100m
Now, as you saw, we looked for heights 50 meters around it using:
ranged_category :height, 50
What if we want to search 1 meter, 10 meters, 100 meters around it, choosing as we go?
This is accomplished by adding more searchable categories, like so. You name the category specifically, and tell Picky from where in the data source it should get the data, using the from
option.
ranged_category :height1, 1, from: height
ranged_category :height10, 10, from: height
ranged_category :height100, 100, from: height
Choosing from the categories is done as usual. If you want 10 meters, search like this:
height10:412
This will find locations of heights 402..422.
Caveats
Actually, if you use the ranged_category
on a larger area on a ball, like earth. For example in Australia – the place I am staying in, currently – what you will find is that the more south you go, towards the pole, the less square and more rectangular your search area will get. This is because Picky does not correct the ball’s sphere. I’m working on it.
So, Picky cannot handle your balls yet.
For small countries it is still useful, and of course for lots of graph searches etc.
Flat things it does marvellously. And super fast!
Conclusion
So we’ve seen
- how Picky can search areas.
- how Picky can search volumes.
- how Picky can search any number of dimensions.
- how you can choose any combination of areas and other features.
- how you search in different ranges on the same thing/category.
- that you cannot quite search on a ball, like earth.
Hope you learnt something new!
Next Searching with Picky: In the TerminalShare
Previous On Searching