## DESCRIPTION

This Ruby gem adds a `PgSearchable` model concern that can be included in Rails models. It allows easy (simple and limited) use of English-like queries backed by PostgreSQL's built-in full-text search capabilities.

## REQUIREMENTS

- Ruby 2.1+
- ActiveRecord 4.2+
- PostgreSQL 9.2+

## INSTALL

Add this to your Gemfile:

```ruby
gem 'pg_searchable'
```

## USAGE

To add search to an Active Record model, include the `PgSearchable` module and call the `pg_search` method it adds:

```ruby
class Model < ActiveRecord::Base
  include PgSearchable
  pg_search fields: %i[name description]
end

# Finds models that have either the 'cat' or the 'dog' keyword in their name or description
Model.scope_search('cats or dogs')
```

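To give a feel for what an English-like query means in PostgreSQL terms, here is a minimal sketch of how such a query could map onto `to_tsquery` syntax, where `|` means OR, `&` means AND and a `:*` suffix requests prefix matching. The `to_tsquery_sketch` helper is hypothetical and is not part of the gem; the gem uses its own lexer and parser, which presumably handle more cases than this.

```ruby
# Hypothetical helper, NOT the gem's implementation: turns a small subset of
# English-like queries into a string suitable for PostgreSQL's to_tsquery.
def to_tsquery_sketch(query, wildcard: true)
  operators = { 'or' => '|', 'and' => '&' }
  parts = []

  query.downcase.scan(/[[:alnum:]]+/).each do |token|
    last_is_operator = operators.value?(parts.last)

    if operators.key?(token)
      # Ignore operators that have no left-hand operand or that repeat.
      parts << operators[token] unless parts.empty? || last_is_operator
    else
      # Adjacent keywords are assumed to be joined with AND.
      parts << '&' unless parts.empty? || last_is_operator
      parts << (wildcard ? "#{token}:*" : token)
    end
  end

  # Drop a dangling operator such as the one in 'cats or'.
  parts.pop while operators.value?(parts.last)
  parts.join(' ')
end

puts to_tsquery_sketch('cats or dogs')
puts to_tsquery_sketch('black cats', wildcard: false)
```

With the `english` dictionary, PostgreSQL stems the keywords itself, which is why a search for 'cats' can also match 'cat'.
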
The `pg_search` method accepts the following properties:

- `fields`: An array of fields that will be matched by the query. The field type needs to be convertible to text (using the PostgreSQL cast `field::text`)
- `scope`: The name of the scope method that will be added to the model. Defaults to `scope_search`
- `cache`: If present, the given tsvector field is used instead of calculating the tsvector dynamically at runtime. See "USING CACHE FIELD" below
- `skip_callback`: When a cache field is defined and this value is true, the cache field will not be updated automatically
- `language`: The language of the dictionary used to generate the lexemes. Defaults to `english`, which compares stemmed lexemes. Another useful value is `simple`, which skips stemming and compares lower-cased words exactly (note that PostgreSQL's stock `simple` configuration does not remove stop words)
- `wildcard`: Enables or disables the wildcard search feature for the keywords. Defaults to `true`
- `external_cache_data`: The name of a method that returns a String or an Array of Strings with extra data to be added to the cache value whenever the cache is updated
- `joins`: An array of relation names, or a hash (for nested associations), that will be left outer joined so that their fields can be searched dynamically. It can only be used without a cache field; if relation data is required while using the cache, use `external_cache_data` to populate the cache field instead. When using joins, it's advisable to prefix the field names in the `fields` option with the table names to avoid collisions (which would result in errors). Example:

```ruby
class Product < ActiveRecord::Base
  include PgSearchable
  pg_search fields: %i[products.name tags.value categories.name sections.name],
            joins: [{ tags: :category }, :sections]
  has_many :tags
  has_many :sections, through: :tags
end

class Tag < ActiveRecord::Base
  belongs_to :product
  belongs_to :category
  has_many :sections
end

class Section < ActiveRecord::Base
  belongs_to :tag
end

class Category < ActiveRecord::Base
  has_many :tags
end
```

Note that table names given in the `fields` property must be the actual table names (usually pluralized in Rails), while the `joins` property uses the relation names (which can be singular or plural depending on the relation type).

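To make the `fields` and `language` options more concrete, the sketch below builds the kind of SQL expression that a dynamic (non-cached) search could evaluate: every field is cast to text, NULLs are replaced by empty strings, and the concatenation is passed to `to_tsvector`. The `tsvector_expression` helper is hypothetical and the SQL the gem actually generates may differ.

```ruby
# Hypothetical helper, NOT the gem's implementation: builds a tsvector
# expression from a list of (optionally table-prefixed) field names.
# The names are interpolated as-is, so only pass trusted identifiers.
def tsvector_expression(fields, language: 'english')
  document = fields
             .map { |field| "coalesce(#{field}::text, '')" } # cast and guard against NULL
             .join(" || ' ' || ")                            # separate values with a space
  "to_tsvector('#{language}', #{document})"
end

puts tsvector_expression(%i[name description])
puts tsvector_expression(%i[products.name tags.value], language: 'simple')
```

Prefixing the fields with table names, as in the second call, is what keeps the expression unambiguous once other tables are joined in.
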
## USING CACHE FIELD

To improve performance, a cache field can be added and configured so that `pg_search` uses it instead of generating the tsvector dynamically at runtime.

To use it, first add the vector field to the model with a migration:

```ruby
# On Rails 5+, inherit from a versioned class instead, e.g. ActiveRecord::Migration[5.0]
class AddSearchVectorToModels < ActiveRecord::Migration
  def change
    add_column :models, :search_cache, :tsvector
    add_index :models, :search_cache, using: :gin
  end
end
```

Then set the `cache` property on the `pg_search` call:

```ruby
pg_search fields: %i[name description], cache: :search_cache
```

This adds an `after_save` callback to the model that automatically updates the cache field with the new values every time the record is saved. If you want to search the vector field but update the cache manually, set `skip_callback` to `true` and call the `update_pg_search_cache` method on a model instance yourself.

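Records that were created before the cache column existed, or that are managed with `skip_callback` enabled, need their cache filled explicitly. A minimal sketch of such a backfill is shown below; `backfill_search_cache` is a hypothetical helper, and it assumes that `update_pg_search_cache` persists the new value by itself.

```ruby
# Hypothetical helper, not provided by the gem: refreshes the search cache of
# every record of a model. `find_each` loads the records in batches, so the
# whole table is never held in memory at once.
def backfill_search_cache(model_class)
  updated = 0
  model_class.find_each do |record|
    record.update_pg_search_cache
    updated += 1
  end
  updated
end
```

It could be called from a console or a one-off task, for example `backfill_search_cache(Model)`.
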
If you need to index data from external relationships, use the cache field together with the `external_cache_data` option, passing the name of an instance method of the model that retrieves the external data.

For example, given a Tag model that has a `value` column and a Product model that has many tags, this can be achieved with:

```ruby
class Product < ActiveRecord::Base
  include PgSearchable
  pg_search fields: %i[name], cache: :search_cache, external_cache_data: :tag_values
  has_many :tags

  def tag_values
    tags.pluck(:value)
  end
end

class Tag < ActiveRecord::Base
  belongs_to :product
  after_save { product.update_pg_search_cache if product.present? }
end
```

Since this is external data, every time it changes you need to call the `update_pg_search_cache` method, or perform a save/update that triggers it, so that the new value is cached and searchable.

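Because the `external_cache_data` method may return either a String or an Array of Strings, both shapes have to be normalized before they are merged with the model's own field values. The sketch below shows one way this could be done; `cache_document` is a hypothetical helper and does not reflect how the gem builds the cache internally.

```ruby
# Hypothetical helper, NOT the gem's implementation: merges the model's own
# field values with external data into one text to be indexed.
def cache_document(field_values, external_data = nil)
  # Array() wraps a single String, leaves an Array alone and turns nil into [].
  (field_values + Array(external_data))
    .compact
    .map(&:to_s)
    .reject(&:empty?)
    .join(' ')
end

puts cache_document(['Laptop'], %w[electronics sale])
puts cache_document(['Laptop', nil], 'sale')
```
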
## DEVELOPING

To run the test suite, create a `.env.test.local` file containing the same entries as `.env.test` but with the correct local settings for a Postgres database, run `bundle install` to download the dependencies, and then run `rake`.

## CONTRIBUTING

Make sure the test coverage remains at 100% and that there are no RuboCop complaints (`bundle exec rubocop`), then open a Pull Request.

## MODIFYING PARSER AND LEXER

- `rake lexer` - generates the `lexer.rb` file from the `specification.rex` file
- `rake parser` - generates the `parser.rb` file from the `grammar.y` file
- `rake generate` - generates both the `lexer.rb` and `parser.rb` files