Using the PostgreSQL hstore extension in Rails
by Robert Johansson - 2013-11-29
Hstore is an extension of PostgreSQL that allows you to store key-value pairs in a single PostgreSQL column. In this article we go through how to set up hstore withing a Rails app. We will also look on some use-cases, see how one can select and group on an hstore column and also have a quick look on the different ways to index them.
Introduction
The hstore in PostgreSQL stores key-value pairs of strings. Currently mySQL does not have any equivalent structure (2013-11-25) so if you choose to use hstore you are locking yourself to PostgreSQL. You can find the PostgreSQL reference for hstore here.
If you just need to serialize a hash you will not get any super gains from using hstore, in fact, you get some extra hassle from the fact that the keys and values are always converted to strings. If you for example want to store boolean values you will need to do some tweaks.
The real win comes if you need to query on the hash values. This would be really tricky if you where just using a serialized hash column but if you use hstore it is super simple. You can also use group statements on the values of a specific key inside the hstore.
How to add an hstore column to your Rails app
The hstore support is built in to Rails 4, if you are using Rails 3, take a look at the gem activerecord-postgres-hstore.
First thing is to add the Hstore extension to your PostgreSQL database, there are a few different ways you can do, I find this database migration works best for me:
class AddHstoreExtension < ActiveRecord::Migration def self.up execute "CREATE EXTENSION IF NOT EXISTS hstore" end def self.down execute "DROP EXTENSION hstore" end end
The next step is to add model with an hstore attribute, or you could add an hstore attribute to an existing model
class CreateMyModels < ActiveRecord::Migration def change create_table :my_models do |t| # Default to empty string and dont allow null values means you will start with an empty hash instead of nil t.hstore :my_field, default: '', null: false end end end class MyModel < ActiveRecord::Base end
For Rails 4 you don't need to add anything else, your column my_column
will behave as a serialized hash, in Rails 3, follow the instructions for activerecord-postgres-hstore . After that you can do things like:
m = MyModel.new m.my_field = {"my_key" => "my_value"}One gotcha worth mentioning is the following
m = MyModel.new m.my_field["new_key"] = "new_value" m.save m.my_field["new_key"] # => "new_value" m.reload m.my_field["new_key"] # => nil
This is due to the that field does not get marked as changed if you edit the hash in this way, so make sure to assign directly to the field to avoid the problem.
Querying
You can query on the keys and value of the hash, for example:
# All models where the myfield column has the key 'my_key' set to 'my_value' MyModel.where("my_field -> 'my_key' = ?" , "my_value") # All models where the myfield column has the key 'my_key' with any value MyModel.where("my_field ? 'my_key'")
You can also group on the values of the hash. This works like a normal group by but you can group on a specific hash key:
.group( "my_field -> 'my_key'" )
It is also possible to sum the values of the a certain hash key, note that the values are strings so if you for example are storing integers you need to convert them before you sum them.
.select("SUM((my_field->'my_key')::integer) as sum_value")
Things you can't do
It is currently (2013-11-27) not possible to store anything else than simple hashes with string keys and values. That means you cannot store nested hashes, like in for example MongoDB. Also, if you try to store anything else than strings in an hstore, the values will be automatically converted to strings, so you need to handle those cases if you for example need to store integers or booleans.
Using store_accessor
ActiveRecord::Store is a wrapper around a serialized field, such as the hstore, to use it you just add a line to your model.
class MyModel < ActiveRecord::Base store_accessor :my_field, [:color, :locale] end
Then you can use these fields as any other field (but beware) , the content of your fields are still converted to strings, this is done when the object is saved.
m = MyModel.new m.color = "blue"
Use-cases
I think one of the best use-cases is for storing attributes that vary between different instances. For example in a store-application where products might have a set of different attributes, like size, color and so on. If you want users to be able to add their own custom attributes, then you could use a hstore column. The main advantage compared to using a text column with a serialized hash is the possibility to search on the hash values. A demo of this use-case can be found here. If you add some additional models, you could make something that solves the same type of problems as the EAV (Entity Attribute Value) design.
Indexing an hstore column
You can add indexes to an hstore column to speed up queries. There are a few different options.
If you add a BTREE index to an hstore column you will make the = operator fast, however this is not usually the type of query you would want to do.
It is also possible to add a BTREE index on a specific key of the hstore, that will speed up your queries on that key. This could be useful, but if you are considering this type of index, think about if you should extract this key of the hstore into its own column.
You can also add GIN or GIST indexes on the hstore columns. These indexes support the @>, ?, ?& and ?| operators. So if you add this you can rewrite your queries that check for equality on a specific key like this
# Original MyModel.where("my_field -> 'my_key' = ?", "my_value") # Rewritten with better performance if you have added a GIN or GIST index MyModel.where("my_field @> 'my_key=>my_value'")
According to this page the second way is at least two times faster.
Should one use a GIN or a GIST index
As a rule of thumb, GIN indexes are best for static data because lookups are faster. For dynamic data, GiST indexes are faster to update. More detailed documentation can be found here.
Links
Some other code I found, I have not used any of these personally.
Back
Comments
comments powered by Disqus