Composite Primary Keys are Good

Aug 26, 2006 4 comments

Composite Primary Keys (CPKs) haven’t ever been supported by the Rails core. Dr. Nic’s composite_primary_keys gem was promoted on the Rails’ Blog, but only as a solution for legacy db schemas.

What I alluded to over on Err, after having built a plugin to create CPKs in your migrations, was that our CNET team actively uses them and supports their use…from scratch.

The heresy, I know. The bottom line is that if you don’t agree with all of the assumptions DHH has made, that’s fine, you just make little more work for yourself. So, we do.

Lets take a look at a simple enough use case for CPKs on a site that gets ~7 million page views a month.

On the site, every database call is stored in memcached, except for one piece of SQL. (For more info on scaling to the moon, I strongly suggest attending Chris Wanstrath’s talk at the San Francisco Ruby Meetup next month)

The SQL is so we can keep track of the last post you’ve read on every topic. It’s a pretty standard practice for bulletin boards to offer this, but it’s wildly important to make sure it’s done right. Lets take a look at how Rails would have you do it by default.

Here’s the migration:


class AddViewsTable < ActiveRecord::Migration
  def self.up
    create_table :views do |t|
      t.column :topic_id, :integer
      t.column :user_id,  :integer
      t.column :post_id,  :integer
    end
  end
end

And the code that is being called constantly:


class View < ActiveRecord::Base
  def self.hit!(topic_id, user_id, post_id)
    view = View.find_by_topic_id_and_user_id(topic_id, user_id)

    if view.nil?
      View.create(:topic_id => topic_id, :user_id => user_id, :post_id => post_id)
    else
      view.update_attributes(:post_id => post_id)
    end
  end
end

Garbage, we can do better. Here’s our migration:


class AddViewsTable < ActiveRecord::Migration
  def self.up
    create_table :views, :primary_key => [:topic_id, :user_id] do |t|
      t.column :topic_id, :integer
      t.column :user_id,  :integer
      t.column :post_id,  :integer
    end
  end
end

And this is where CPKs come into their own:


class View < ActiveRecord::Base
  def self.hit!(topic_id, user_id, post_id)
    sql = %{INSERT INTO views (topic_id,user_id,post_id) 
            VALUES (#{topic_id},#{user_id},#{post_id}) 
            ON DUPLICATE KEY UPDATE post_id='#{post_id}'}
    self.connection.insert(sql)
  end
end

The bottom line is you save yourself one call to the database which is always the bottleneck in larger sites. To put this into perspective on how important this is, in two months our table has grown to just over 1,000,000 rows.

4 comments


Labrat said 1 day later:

Is this really a composite primary key or a join table in disguise?

I’m still researching the issue myself. I wish there was a more theoretical overview of why Rails considers composite primary keys bad.

Dr Nic said 7 days later:

@Labrat – its a join table that uses composite keys :)

@PJ – SQL databases have been around since the 70s – you are right, they do a great number of very useful things right next to the data. ActiveRecords might never have a friendly relationship with other DB mechanisms like Stored Procedures, but I think CPKs are definitely worth investigating for any project needing performance tweeking.

And its so easy to migrate from a table with a redundant ‘id’ primary key to a composite key: you just delete the redundant key :) (ok… one or two other tables might be affected, but in your example above, only the join table cares about its primary key which is nice encapsulation).

Joe said 19 days later:

I’m fairly certain that all of you can finish this simple test. Right? RIGHT? Your motivation: she’s hot :-) NaughtyAmericanHistory.com

Mike H said 5 months later:

“DUPLICATE KEY UPDATE” will also update an existing row if the insert would have violated a unique index (assuming we’re talking about MySQL)

http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html

So have a unique integer key, put a unique index on [post_id,user_id], and you can do the same thing.

Name
Url