Stretchy

Stretchy is a query builder for Elasticsearch. It helps you quickly construct the JSON request bodies you send to Elasticsearch, which can get rather complicated when written by hand.

Stretchy is modeled after ActiveRecord's interface and architecture - query objects are immutable and chainable, which makes quickly building the right query and caching the results easy. The goals are:

Intuitive - If you've used ActiveRecord, Mongoid, or other query builders, Stretchy shouldn't be a stretch
Less Typing - Queries built here should be way fewer characters than building by hand
Easy - Implementing the right algorithms for your search needs should be simple

Stretchy is not:

an integration with ActiveModel to help you index your data - too application specific
a way to manage Elasticsearch configuration - see waistband
a general-purpose Elasticsearch API client - see the elasticsearch gem

Installation

Add this line to your application's Gemfile:

gem 'stretchy'

And then execute:

$ bundle

Or install it yourself as:

$ gem install stretchy

Usage

Stretchy is still in early development, so it does not yet support the full feature set of the Elasticsearch API. There may be bugs, though we try for solid spec coverage. We may introduce breaking changes in minor versions, though we try to stick with semantic versioning.

It does support fairly basic queries in an ActiveRecord-ish style.

Documentation

See the Stretchy docs on rubydocs for fairly detailed documentation on the API. Specifically, you'll probably want the docs for Stretchy Clauses, which make up the basis of the query builder.

Configuration

Stretchy.configure do |c|
  c.index_name = 'my_index'                       # REQUIRED
  c.client     = $my_client                       # ignore below, use a custom client
  c.url        = 'https://user:pw@my.elastic.url' # default is ENV['ELASTICSEARCH_URL']
  c.adapter    = :patron                          # default is :excon

  c.logger     = Logger.new(STDOUT)               # passed to elasticsearch-api gem
                                                  # Stretchy will also log, with the params
                                                  # specified below
  c.log_level  = :debug                           # default is :silence
  c.log_color  = :green                           # default is :blue
end

Base

query = Stretchy.query(type: 'model_name')

From here, you can chain the methods to build your desired query.
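To make the "immutable and chainable" idea concrete, here is a minimal sketch in plain Ruby. TinyQuery is a hypothetical stand-in written for this example, not Stretchy's real class. It only shows the pattern: each chained call returns a new object and leaves the receiver untouched.

```ruby
# Hypothetical stand-in for a query object; this is NOT Stretchy's real class.
class TinyQuery
  attr_reader :clauses

  def initialize(clauses = [])
    @clauses = clauses.dup.freeze
    freeze
  end

  # Each chain method returns a new object instead of mutating the receiver.
  def where(conditions)
    TinyQuery.new(clauses + [[:where, conditions]])
  end

  def match(text)
    TinyQuery.new(clauses + [[:match, text]])
  end
end

base     = TinyQuery.new
searched = base.where(status: 'published').match('elastic')

puts base.clauses.length     # prints 0, the base query is unchanged
puts searched.clauses.length # prints 2
```

Because no object ever changes after creation, a partially built query can be kept around and extended in several directions without one branch affecting another.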

Chainable Query Methods

The following query methods are available:

fulltext - generic fulltext search with proximity relevance
match - Elasticsearch match query
query - Add arbitrary JSON fragment to the query section
more_like - Get documents similar to a string or other documents
where - Filter based on fields in the document
terms - Filter without analyzing strings or symbols
filter - Add arbitrary JSON fragment to the filter section
range - Filter for a range of values
geo - Filter on geo_point fields within a specified distance
not - Get documents not matching passed conditions
should - Increase document score for matching documents
boost - Increase document score based on different factors
near - Boost score based on how close a number / date / geo point is to an origin
field - Boost based on the numeric value of the passed field
random - Add a deterministic random factor to the document score
explain - Return score explanations along with documents
fields - Only return the specified fields
page - Limit, Offset, and Page to define which results to return

Fulltext

query = query.fulltext('Generic user-input phrase')
             .fulltext(author: 'John Romero')

Performs a query for the given string, either anywhere in the document or in specific fields. At least one of the terms must match, and the closer a document is to having the exact phrase, the higher its score. See the Elasticsearch guide's article on proximity scoring for more info on how this works.
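For a rough picture of what proximity scoring looks like on the Elasticsearch side, the sketch below builds a request body by hand. The helper name proximity_query and the slop value are assumptions made for this example, and the JSON Stretchy actually generates may differ.

```ruby
require 'json'

# Hand-written request body (hypothetical helper, not Stretchy's output).
def proximity_query(text, slop: 50)
  {
    query: {
      bool: {
        # At least one term must match for a document to be returned.
        must:   { match: { '_all' => { query: text, operator: 'or' } } },
        # Documents with the terms close together get a higher score.
        should: { match_phrase: { '_all' => { query: text, slop: slop } } }
      }
    }
  }
end

puts JSON.pretty_generate(proximity_query('Generic user-input phrase'))
```

The must clause decides which documents come back, while the should clause only affects ranking.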

Match

query = query.match('welcome to my web site')
             .match(title: 'welcome to my web site')
             .match(image: 'loading construction flash', operator: 'or')

Performs a match query for the given string. If given a hash, it will use a match query on the specified fields, otherwise it will default to '_all'. By default, a match query searches for any of the analyzed terms in the document, and scores them using Lucene's practical scoring formula, which combines TF/IDF, the vector space model, and a few other niceties.

Query

query = query.match.query(
          multi_match: {
            query: 'super smash bros',
            fields: ['developer.games', 'developer.bio']
          }
        )

query = query.match.not.match.query(
          multi_match: {
            query: 'rez',
            fields: ['developer.games', 'developer.bio']
          }
        )

Adds arbitrary JSON to the query section of the final query. If you want to use a query type not currently supported by Stretchy, you can call this method and pass in the requisite JSON fragment. You can also prefix this with .not and .should to add your JSON to those sections of the query instead.

Caution

Stretchy tries to merge together matches on the same fields to optimize the final query to be sent to Elastic, but will not try to optimize any JSON added via the .query method.

More Like

query = query.more_like(ids: [1, 2, 3])
             .more_like(docs: other_search.results)
             .more_like(like_text: 'puppies and kittens are great', fields: :about_me)

Finds documents similar to a list of input documents. You must pass in one of the :ids, :docs or :like_text parameters, but everything else is optional. This method accepts any of the params available in the Elasticsearch more_like_this query. It can also be chained with .not and .should.

Where

query = query.where(
  name: 'alice',
  email: [
    'alice@company.com',
    'beatrice.christine@other_company.com'
  ],
  commit_count: 27..33,
  is_robot: nil
)

Allows passing a hash of matchable options similar to ActiveRecord's where method. To be returned, the document must match each of the parameters. If you pass an array of parameters for a field, the document must match at least one of those parameters.
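One way to picture how different Ruby values could translate into Elasticsearch clauses is the sketch below. clause_for is a hypothetical helper written for this example. The mapping it uses, especially the handling of nil and of non-string scalars, is an assumption and not a description of Stretchy's internals.

```ruby
# Hypothetical helper, not part of Stretchy: maps a Ruby value to a
# hand-written Elasticsearch clause.
def clause_for(field, value)
  name = field.to_s
  case value
  when nil
    # Assumption: nil means "the field has no value".
    { bool: { must_not: { exists: { field: name } } } }
  when Range
    upper = value.exclude_end? ? :lt : :lte
    { range: { name => { gte: value.begin, upper => value.end } } }
  when Array
    # A document may match any one of the listed values.
    { bool: { should: value.map { |item| clause_for(field, item) } } }
  when String, Symbol
    # Strings and symbols become analyzed match queries.
    { match: { name => value.to_s } }
  else
    { term: { name => value } }
  end
end

p clause_for(:commit_count, 27..33)
p clause_for(:name, 'alice')
```

A full where call would combine one such clause per field so that every one of them has to match.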

Gotcha

If you pass a string or symbol for a field, it will be converted to a Match Query for the specified field. Because Elasticsearch analyzes text by default, string and symbol values are searched with an analyzed query rather than matched exactly.

To query for exact matches against strings or symbols with underscores and punctuation intact, use the .where.terms clause.

Terms

query = query.where.terms(
          email: 'happy.developer@company.com',
          status: :awesome
        )

Sometimes you store values with punctuation, underscores, or other characters Elasticsearch would normally split into separate terms. If you want to query all comments that match a specific email address, you need to make sure that Elasticsearch doesn't analyze the query terms you send it before running the query. This clause allows you to do that.
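The difference between the two styles is easiest to see in the raw Elasticsearch clauses. The helpers below are hypothetical and build the clauses by hand. They are a sketch of the distinction, not Stretchy's generated output.

```ruby
require 'json'

# Hypothetical helpers that build hand-written Elasticsearch clauses.
def analyzed_clause(field, value)
  # match: the value is run through an analyzer before searching.
  { match: { field.to_s => value.to_s } }
end

def exact_clause(field, value)
  # term: the value is compared as-is against the indexed terms.
  { term: { field.to_s => value.to_s } }
end

email = 'happy.developer@company.com'
puts JSON.generate(analyzed_clause(:email, email))
puts JSON.generate(exact_clause(:email, email))
```

Whether the exact form finds anything also depends on how the field was indexed, which is decided by your mapping and is outside what this sketch covers.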

Filter

query = query.filter(
          geo_polygon: {
              'person.location' => {
                  points: [
                      {lat: 40, lon: -70},
                      {lat: 30, lon: -80},
                      {lat: 20, lon: -90}
                  ]
              }
          }
        )

Adds arbitrary JSON to the filter section of the final query. If you want to use a filter type not currently supported by Stretchy, you can call this method and pass in the requisite JSON fragment. You can also prefix this with .not and .should to add your JSON to those sections of the filters instead.

Caution

Stretchy tries to merge together filters on the same fields to optimize the final query to be sent to Elastic, but will not try to optimize any JSON added via the .filter method.

Range

query = query.range(:rating, min: 3, max: 5)
             .range(:released, min: Time.now - 60*60*24*100)
             .range(:quantity, max: 100, exclusive: true)

Matches only documents that have the specified field with a value inside the given range. You can also pass dates and times as bounds. While you could pass a normal Ruby Range object to .where, .range lets you specify only a minimum or only a maximum. Range filters are inclusive by default, but you can also pass :exclusive, :exclusive_min, or :exclusive_max.
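As an illustration of how inclusive and exclusive bounds map onto Elasticsearch's gte/gt and lte/lt operators, here is a small sketch. range_clause is a hypothetical helper that mirrors the option names above. It is not Stretchy's implementation.

```ruby
# Hypothetical helper: builds an Elasticsearch range clause by hand.
def range_clause(field, min: nil, max: nil, exclusive: false,
                 exclusive_min: exclusive, exclusive_max: exclusive)
  bounds = {}
  # gte/lte include the bound itself, gt/lt leave it out.
  bounds[exclusive_min ? :gt : :gte] = min unless min.nil?
  bounds[exclusive_max ? :lt : :lte] = max unless max.nil?
  { range: { field.to_s => bounds } }
end

p range_clause(:rating, min: 3, max: 5)
p range_clause(:quantity, max: 100, exclusive: true)
```

Leaving out min or max simply omits that bound, which is what makes open-ended ranges possible.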

Geo Distance

query = query.geo('coords', distance: '20mi', lat: 35.0117, lng: 135.7683)

Filters for documents where the specified geo_point field is within the given range.

Gotcha

The field must be mapped as a geo_point field. See Elasticsearch types for more info.
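For reference, a hand-written geo_distance filter looks roughly like the sketch below. geo_distance_clause is a hypothetical helper, and the exact JSON Stretchy emits may differ.

```ruby
require 'json'

# Hypothetical helper: builds an Elasticsearch geo_distance filter by hand.
def geo_distance_clause(field, distance:, lat:, lng:)
  # Elasticsearch names the longitude key "lon", not "lng".
  { geo_distance: { distance: distance, field.to_s => { lat: lat, lon: lng } } }
end

puts JSON.pretty_generate(
  geo_distance_clause('coords', distance: '20mi', lat: 35.0117, lng: 135.7683)
)
```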

Not

query = query.where.not(rating: 0)
             .match.not('angry')
             .where.not.geo(field: 'coords', distance: '20mi', lat: 35.0117, lng: 135.7683)

When called after where or match, .not lets you apply inverted filters. Any documents that match those filters will be excluded.

Should

query = query.should(name: 'Sarah', awesomeness: 1000).should.not(awesomeness: 0)

Should filters work similarly to .where. Documents that do not match are still returned, but they have a lower relevancy score and will appear after documents that do match in the results. See Elastic's documentation for BoolQuery and BoolFilter for more info.
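The bool structure that these sections refer to can be sketched by hand as follows. bool_clause is a hypothetical helper, and how Stretchy distributes your conditions across the sections is not shown here.

```ruby
require 'json'

# Hypothetical helper: assembles an Elasticsearch bool clause and drops
# any section that has no conditions.
def bool_clause(must: [], must_not: [], should: [])
  sections = { must: must, must_not: must_not, should: should }
  { bool: sections.reject { |_, clauses| clauses.empty? } }
end

puts JSON.pretty_generate(
  bool_clause(
    must:   [{ term: { 'status' => 'published' } }],  # required to match
    should: [{ match: { 'name' => 'Sarah' } }]        # optional, raises the score
  )
)
```

Documents failing a must condition are dropped, while documents failing a should condition are kept with a lower score.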

Boost

query = query.boost.where(category: 3, weight: 100)
             .boost.range(:awesomeness, min: 10, weight: 10)
             .boost.match.not('sucks')

Boosts use a Function Score Query with filters to allow you to affect the score for the document. Each condition will be applied as a filter with an optional weight.
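A function score query with weighted filters has roughly the shape sketched below. The helper name boosted and the default weight of 1 are assumptions for this example, and the structure Stretchy generates may differ in detail.

```ruby
require 'json'

# Hypothetical helper: wraps a query in a function_score with one
# weighted filter function per boost.
def boosted(query, boosts)
  {
    function_score: {
      query: query,
      functions: boosts.map do |boost|
        { filter: boost[:filter], weight: boost.fetch(:weight, 1) }
      end
    }
  }
end

puts JSON.pretty_generate(
  boosted(
    { match_all: {} },
    [{ filter: { term: { 'category' => 3 } }, weight: 100 }]
  )
)
```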

Near

query = query.boost.near(field: :published_at, origin: Time.now, scale: '5d')
             .boost.near(field: :coords, lat: 35.0117, lng: 135.7683, scale: '10mi', decay: 0.33, weight: 1000)

Boosts a document by how close a given field is to a given :origin. Accepts dates, times, numbers, and geographical points. Unlike .where.range or .boost.geo, .boost.near is not a binary operation. All documents get a score for that field, and that score decays the further the field's value is from the origin point.

The :scale param determines how quickly the value falls off. In the example above, if a document's :coords field is 10 miles away from the starting point, its score is about 1/3 that of a document at the origin point.

See the Function Score Query section on Decay Functions for more info.
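To check the "about 1/3 at 10 miles" claim numerically, the sketch below evaluates the Gaussian decay formula from the Elasticsearch decay function documentation. It assumes the gauss function is the one in use, which the text above does not state. gauss_decay is a hypothetical helper.

```ruby
# Gaussian decay as documented for Elasticsearch decay functions.
# Assumption: the gauss variant is used; linear and exp decay differ.
def gauss_decay(distance, scale:, decay: 0.5, offset: 0.0)
  sigma_sq  = -(scale**2) / (2.0 * Math.log(decay))
  effective = [0.0, distance.abs - offset].max
  Math.exp(-(effective**2) / (2.0 * sigma_sq))
end

puts gauss_decay(0,  scale: 10, decay: 0.33) # score at the origin
puts gauss_decay(10, scale: 10, decay: 0.33) # score at exactly one scale away
puts gauss_decay(20, scale: 10, decay: 0.33) # score two scales away
```

By construction the score equals the decay value at a distance of exactly one scale, so the second line works out to 0.33 apart from floating point rounding.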

Field

query = query.boost.field(:popularity)
             .boost.field(:timestamp, factor: 0.5, modifier: :sqrt)
             .boost.field(:votes, :bookmarks, :comments)

Boosts a document by a numeric value contained in the specified fields. You can also specify a factor (an amount to multiply the field value by) and a modifier (a function for normalizing values).

See the Boosting By Popularity Guide and the Field Value Factor documentation for more info.
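The arithmetic behind factor and modifier can be tried out in plain Ruby. This sketch follows the field_value_factor description in the Elasticsearch documentation, where the modifier is applied to factor times the field value. FIELD_MODIFIERS and field_value_score are hypothetical names, and only a few modifiers are included.

```ruby
# A few of the modifiers Elasticsearch documents for field_value_factor.
FIELD_MODIFIERS = {
  none:   ->(x) { x },
  sqrt:   ->(x) { Math.sqrt(x) },
  log1p:  ->(x) { Math.log10(x + 1) },
  square: ->(x) { x**2 }
}.freeze

# Hypothetical helper: computes modifier(factor * value).
def field_value_score(value, factor: 1.0, modifier: :none)
  FIELD_MODIFIERS.fetch(modifier).call(factor * value)
end

puts field_value_score(8, factor: 0.5, modifier: :sqrt) # prints 2.0
puts field_value_score(99, modifier: :log1p)            # prints 2.0
```

Modifiers such as sqrt and log1p flatten large values, so a document with 10,000 votes does not completely drown out one with 100.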

Random

query = query.boost.random(user.id, 50)

Gives each document a randomized boost with a given seed and optional weight. This allows you to show slightly different result sets to different users, but show the same result set to that user every time.
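The idea of a seeded, repeatable random factor can be illustrated without Elasticsearch. The sketch below derives a number from the seed and a document id using a hash. This is not how Elasticsearch computes its random scores. It only demonstrates why the same seed gives the same ordering every time. random_factor is a hypothetical helper.

```ruby
require 'digest'

# Hypothetical helper: a repeatable pseudo-random factor between 0 and 1
# derived from a seed and a document id. Illustration only.
def random_factor(seed, doc_id)
  digest = Digest::MD5.hexdigest("#{seed}:#{doc_id}")
  digest[0, 8].to_i(16) / 0xFFFFFFFF.to_f
end

doc_ids = %w[doc-1 doc-2 doc-3]

# The same seed always produces the same ordering of documents.
puts doc_ids.sort_by { |id| random_factor(42, id) }.inspect
puts doc_ids.sort_by { |id| random_factor(42, id) }.inspect
```

Using something stable like a user id as the seed is what lets each user see a consistent ordering across requests.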

Fields

query = query.fields(:name, :email, :id)

Instead of returning the entire document, only return the specified fields.

Limit, Offset, and Page

query = query.limit(20).offset(1000)
# or...
query = query.page(50, per_page: 20)

Works the same way as ActiveRecord's limit and offset methods, which are analogous to Elasticsearch's size and from parameters respectively. The .page method allows you to set both at once, and is compatible with the Kaminari gem.
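The relationship between a page number and the from and size parameters is simple arithmetic. The sketch below assumes pages are numbered from 1, as in Kaminari. page_to_from_size and its default per_page are hypothetical, and Stretchy's own handling is not shown.

```ruby
# Hypothetical helper: converts a 1-based page number into the
# from/size pair Elasticsearch expects.
def page_to_from_size(page, per_page: 25)
  page = [page.to_i, 1].max # treat anything below 1 as the first page
  { from: (page - 1) * per_page, size: per_page }
end

p page_to_from_size(1, per_page: 20)
p page_to_from_size(3, per_page: 20)
```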

Explain

query = query.explain.where()

Tells Elasticsearch to return an explanation of the score for each document. See the explain parameter for how this is used, and the explain API for what the explanations will look like.

Result Methods

results - Result documents from this query
ids - Ids of result documents instead of the full source
response - Raw response data from Elasticsearch
total - Total number of matching documents
explanations - Explanations for document scores
per_page - Included with .limit_value for Kaminari compatibility

Results

query.results

Executes the query and provides the parsed json for each hit returned by Elasticsearch, along with _index, _type, _id, and _score fields.
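As a rough picture of that shape, the sketch below flattens a made-up search response into one hash per hit. flatten_hits is a hypothetical helper and the sample data is invented for the example. The real result objects may carry more than this.

```ruby
# Hypothetical helper: merges each hit's _source with its metadata fields.
def flatten_hits(response)
  response.fetch('hits', {}).fetch('hits', []).map do |hit|
    meta = hit.select { |key, _| %w[_index _type _id _score].include?(key) }
    hit.fetch('_source', {}).merge(meta)
  end
end

# Invented sample response, trimmed to the parts used here.
sample = {
  'hits' => {
    'total' => 1,
    'hits'  => [
      { '_index' => 'my_index', '_type' => 'model_name', '_id' => '7',
        '_score' => 1.5, '_source' => { 'name' => 'alice' } }
    ]
  }
}

p flatten_hits(sample)
```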

Ids

query.ids

Provides only the ids for each hit. If your document ids are numeric (as is the case for many ActiveRecord integrations), they will be converted to integers.
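A conversion along those lines could look like the sketch below. coerce_ids is a hypothetical helper, and the rule it uses (only strings made up entirely of digits are converted) is an assumption about what "numeric" means here.

```ruby
# Hypothetical helper: turns all-digit ids into integers and leaves
# every other id untouched.
def coerce_ids(ids)
  ids.map { |id| id.to_s.match?(/\A\d+\z/) ? id.to_i : id }
end

p coerce_ids(['12', '7', 'abc-3'])
```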

Response

query.response

Executes the query, returns the raw JSON response from Elasticsearch and caches it. Use this to get at search API data not in the source documents.

Total

query.total

Returns the total number of matches returned by the query - not just the current page. Makes plugging into Kaminari a snap.

Explanations

query.explanations

Collects the '_explanation' field for each result, so you can easily see how the document scores were computed.

Per Page, Limit Value, and Total Pages

results = query.query_results
results.per_page
results.limit_value
results.total_pages

Included in the Results object for Kaminari compatibility.
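For orientation, the page count that pagination helpers usually expect is the total divided by the page size, rounded up. The sketch below shows that calculation. total_pages here is a standalone hypothetical function, not the method on the Results object.

```ruby
# Hypothetical helper: number of pages needed to show `total` results.
def total_pages(total, per_page)
  (total.to_f / per_page).ceil
end

puts total_pages(101, 20) # prints 6
puts total_pages(100, 20) # prints 5
```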

Development

After checking out the repo, run bundle install to install dependencies. Then, run pry for an interactive prompt that will allow you to experiment.

Contributing

For bugs and feature requests, please open a new issue.

Please see the CONTRIBUTING guide for guidelines on contributing to Stretchy.
