Composite Primary Keys are Good
Composite Primary Keys (CPKs) haven’t ever been supported by the Rails core. Dr. Nic’s composite_primary_keys gem was promoted on the Rails’ Blog, but only as a solution for legacy db schemas.
What I alluded to over on Err, after having built a plugin to create CPKs in your migrations, was that our CNET team actively uses them and supports their use…from scratch.
The heresy, I know. The bottom line is that if you don’t agree with all of the assumptions DHH has made, that’s fine, you just make little more work for yourself. So, we do.
Lets take a look at a simple enough use case for CPKs on a site that gets ~7 million page views a month.
On the site, every database call is stored in memcached, except for one piece of SQL. (For more info on scaling to the moon, I strongly suggest attending Chris Wanstrath’s talk at the San Francisco Ruby Meetup next month)
The SQL is so we can keep track of the last post you’ve read on every topic. It’s a pretty standard practice for bulletin boards to offer this, but it’s wildly important to make sure it’s done right. Lets take a look at how Rails would have you do it by default.
Here’s the migration:
class AddViewsTable < ActiveRecord::Migration
def self.up
create_table :views do |t|
t.column :topic_id, :integer
t.column :user_id, :integer
t.column :post_id, :integer
end
end
end
And the code that is being called constantly:
class View < ActiveRecord::Base
def self.hit!(topic_id, user_id, post_id)
view = View.find_by_topic_id_and_user_id(topic_id, user_id)
if view.nil?
View.create(:topic_id => topic_id, :user_id => user_id, :post_id => post_id)
else
view.update_attributes(:post_id => post_id)
end
end
end
Garbage, we can do better. Here’s our migration:
class AddViewsTable < ActiveRecord::Migration
def self.up
create_table :views, :primary_key => [:topic_id, :user_id] do |t|
t.column :topic_id, :integer
t.column :user_id, :integer
t.column :post_id, :integer
end
end
end
And this is where CPKs come into their own:
class View < ActiveRecord::Base
def self.hit!(topic_id, user_id, post_id)
sql = %{INSERT INTO views (topic_id,user_id,post_id)
VALUES (#{topic_id},#{user_id},#{post_id})
ON DUPLICATE KEY UPDATE post_id='#{post_id}'}
self.connection.insert(sql)
end
end
The bottom line is you save yourself one call to the database which is always the bottleneck in larger sites. To put this into perspective on how important this is, in two months our table has grown to just over 1,000,000 rows.
Is this really a composite primary key or a join table in disguise?
I’m still researching the issue myself. I wish there was a more theoretical overview of why Rails considers composite primary keys bad.
@Labrat – its a join table that uses composite keys :)
@PJ – SQL databases have been around since the 70s – you are right, they do a great number of very useful things right next to the data. ActiveRecords might never have a friendly relationship with other DB mechanisms like Stored Procedures, but I think CPKs are definitely worth investigating for any project needing performance tweeking.
And its so easy to migrate from a table with a redundant ‘id’ primary key to a composite key: you just delete the redundant key :) (ok… one or two other tables might be affected, but in your example above, only the join table cares about its primary key which is nice encapsulation).
I’m fairly certain that all of you can finish this simple test. Right? RIGHT? Your motivation: she’s hot :-) NaughtyAmericanHistory.com
“DUPLICATE KEY UPDATE” will also update an existing row if the insert would have violated a unique index (assuming we’re talking about MySQL)
http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html
So have a unique integer key, put a unique index on [post_id,user_id], and you can do the same thing.