## DESCRIPTION

This Ruby gem adds a `PgSearchable` model concern that can be included in Rails models. It allows easy (simple and limited) use of English-like queries on top of PostgreSQL's built-in full-text search capabilities.

## REQUIREMENTS

- Ruby 2.1+
- ActiveRecord 4.2+
- PostgreSQL 9.2+

## INSTALL

Add this to your Gemfile:

```ruby
gem 'pg_searchable'
```

## USAGE

To add search to an Active Record model, include the `PgSearchable` module and call the added `pg_search` method:

```ruby
class Model < ActiveRecord::Base
  include PgSearchable

  pg_search fields: %i[name description]
end

# Finds models that have either the 'cat' or the 'dog' keyword
# in their name or description
Model.scope_search('cats or dogs')
```

The `pg_search` method accepts the following properties:

- `fields`: an array of fields that will be matched by the query. Each field's type must be convertible to text (using the PostgreSQL cast `field::text`).
- `scope`: the name of the scope method that will be added to the model. Defaults to `scope_search`.
- `cache`: if present, the given tsvector field is used instead of calculating the tsvector dynamically at runtime. See "USING CACHE FIELD" below.
- `skip_callback`: when a cache field is defined and this value is true, the cache field is not updated automatically.
- `language`: the language of the dictionary used to generate the lexemes. Defaults to `english`, which compares lexemes. Another useful value is `simple`, which does no stemming and compares lowercased words exactly.
- `wildcard`: enables or disables the wildcard search feature for the keywords. Defaults to true.
- `external_cache_data`: a method that returns a String or an Array of Strings with extra data to be added to the cache value whenever the cache is updated.
- `joins`: an array of relation names, or a hash (for nested associations), that are joined with a left outer join so that their fields can be searched dynamically.
  It can only be used without a cache field; if relation data is needed together with a cache, use `external_cache_data` instead to populate the cache field.

When using `joins`, it is advisable to prefix the field names in the `fields` property with their table names to avoid collisions (which would result in errors). Example:

```ruby
class Product < ActiveRecord::Base
  include PgSearchable

  pg_search fields: %i[products.name tags.value categories.name sections.name],
            joins: [{ tags: :category }, :sections]

  has_many :tags
  has_many :sections, through: :tags
end

class Tag < ActiveRecord::Base
  belongs_to :product
  belongs_to :category
  has_many :sections
end

class Section < ActiveRecord::Base
  belongs_to :tag
end

class Category < ActiveRecord::Base
  has_many :tags
end
```

Note that prefixes in the `fields` property must be table names (usually plural in Rails), while the `joins` property uses relation names (singular or plural, depending on the relation type).

## USING CACHE FIELD

To improve performance, a cache field can be added and configured so that `pg_search` uses it instead of generating the tsvector dynamically at runtime. To use it, first add the vector field to the model with a migration:

```ruby
class AddSearchVectorToModels < ActiveRecord::Migration
  def change
    add_column :models, :search_cache, :tsvector
    add_index :models, :search_cache, using: :gin
  end
end
```

Then set the `cache` property on the `pg_search` call:

```ruby
pg_search fields: %i[name description], cache: :search_cache
```

This adds an `after_save` callback to the model that automatically updates the cache field with the new values every time the record is saved. If you want to search the vector field but update the cache manually, set `skip_callback` to true and call the `update_pg_search_cache` method on a model instance yourself.

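
As a rough illustration of what the cache column saves: without a cache, a tsvector has to be computed from the searched fields at query time. The sketch below builds the kind of SQL expression that could do this. It is an assumption made for illustration, not the gem's actual implementation, and `tsvector_sql` is a hypothetical helper that is not part of `pg_searchable`.

```ruby
# Hypothetical helper (NOT part of pg_searchable): builds a SQL expression
# that combines several columns into a single tsvector.
# The field names and language are assumed to come from trusted,
# developer-written configuration, never from user input.
def tsvector_sql(fields, language: 'english')
  # Cast each column to text and replace NULL with '' so that a single
  # NULL column does not turn the whole concatenation into NULL.
  parts = fields.map { |field| "coalesce(#{field}::text, '')" }

  # Join the columns with a space so words from adjacent columns stay separate.
  "to_tsvector('#{language}', #{parts.join(" || ' ' || ")})"
end

puts tsvector_sql(%i[name description])
# => to_tsvector('english', coalesce(name::text, '') || ' ' || coalesce(description::text, ''))
```

Storing the result of such an expression in an indexed `tsvector` column means it does not have to be recomputed for each row on every search, which is the trade-off the `cache` property is meant to address.
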
If you need to index data from external relationships, use the cache field together with the `external_cache_data` option, passing the name of an instance method on the model that retrieves the external data. For example, given a Tag model that has a `value` column and belongs to a Product (one product has many tags), this can be achieved with:

```ruby
class Product < ActiveRecord::Base
  include PgSearchable

  pg_search fields: %i[name], cache: :search_cache, external_cache_data: :tag_values

  has_many :tags

  # Extra strings to be added to the search cache
  def tag_values
    tags.pluck(:value)
  end
end

class Tag < ActiveRecord::Base
  belongs_to :product

  # Refresh the product's cache whenever one of its tags changes
  after_save { product.update_pg_search_cache if product.present? }
end
```

Because this data is external to the model, every time it changes you need to call `update_pg_search_cache`, or perform a save/update that triggers it, so that the new value is cached and searchable.

## DEVELOPING

To run the test suite, create a `.env.test.local` file containing the same entries as `.env.test` but with the correct local settings for a PostgreSQL database, run `bundle install` to download the dependencies, and then run `rake`.

## CONTRIBUTING

Make sure test coverage remains at 100% and there are no RuboCop complaints (`bundle exec rubocop`), then open a pull request.

## Modifying parser and lexer

* `rake lexer` - generates the `lexer.rb` file from the `specification.rex` file
* `rake parser` - generates the `parser.rb` file from the `grammar.y` file
* `rake generate` - generates both the `lexer.rb` and `parser.rb` files