Opened 10 years ago
Closed 10 years ago
#920 closed defect (fixed)
Broken migration for #894
Reported by: | Elrond | Owned by: | |
---|---|---|---|
Priority: | blocker | Milestone: | 0.7.0 |
Component: | programming | Keywords: | sql |
Cc: | Parent Tickets: |
Description
tl;dr the index-add on User.username might be broken. If it is, this is a blocker.
Intro
We used to have unique=True
on User.username
. In #894 (commit 892eed590fb30131ea2e8612da5ba22fa24f690c) we decided to add an extra index=True
(discussion, on whether this was is needed, is not the scope here).
On most (if not all) databases, a UniqueConstraint will create an internal index, so that checking the constraint is reasonably performant. This is different from an actual index (with or without uniqueness attribute) in a concptual way. Whether this is a real difference on a database, is another story.
According to the sqlalchemy docs unique=True, index=True
column creates only an Index with a uniqueness attribute. This is actually exactly what we would like after all.
Situations
(These are the situations I expect after reading our code and the docs.)
Old schema::
unique=True
- One Constraint on the table.
Old schema + migration
- The migration adds an Index (without uniqueness).
- One Constraint on the table (from before)
- One Index (from the migration)
New schema, when creating database anew::
unique=True, index=True
- One Index(with uniqueness attribute) on the table
As you can see, the situation with migration or building from scratch are different.
How to verify
- Create a fresh database before the migration.
- Run the migration
- Investigate/safe variant A
pg_dump MEDIAGOBLINDB >variant_A.sql
psql MEDIAGOBLINDB
, then\d core__users
, look at the Indexes-list
- Create a fresh database after the migration
- Investigate/safe variant B
pg_dump MEDIAGOBLINDB >variant_B.sql
psql MEDIAGOBLINDB
, then\d core__users
, look at the Indexes-list
diff -u variant_A.sql variant_B.sql
If the indexes / pg_dump differ, then something is wrong and this bug is a valid. Otherwise, I'd love to know, what's wrong with my assumptions.
Change History (8)
comment:1 by , 10 years ago
comment:2 by , 10 years ago
And now: postgres.
*pre-the-new-migration on postgresql*
Table "public.core__users" Column | Type | Modifiers | Storage | Stats target | Description ----------------------------+-----------------------------+----------------------------------------------------------+----------+--------------+------------- id | integer | not null default nextval('core__users_id_seq'::regclass) | plain | | username | character varying | not null | extended | | email | character varying | not null | extended | | pw_hash | character varying | | extended | | created | timestamp without time zone | not null | plain | | wants_comment_notification | boolean | | plain | | wants_notifications | boolean | | plain | | license_preference | character varying | | extended | | url | character varying | | extended | | bio | text | | extended | | uploaded | integer | | plain | | upload_limit | integer | | plain | | Indexes: "core__users_pkey" PRIMARY KEY, btree (id) "core__users_username_key" UNIQUE CONSTRAINT, btree (username)
*post-the-new-migration on postgresql*
Table "public.core__users" Column | Type | Modifiers | Storage | Stats target | Description ----------------------------+-----------------------------+----------------------------------------------------------+----------+--------------+------------- id | integer | not null default nextval('core__users_id_seq'::regclass) | plain | | username | character varying | not null | extended | | email | character varying | not null | extended | | pw_hash | character varying | | extended | | created | timestamp without time zone | not null | plain | | wants_comment_notification | boolean | | plain | | wants_notifications | boolean | | plain | | license_preference | character varying | | extended | | url | character varying | | extended | | bio | text | | extended | | uploaded | integer | | plain | | upload_limit | integer | | plain | | Indexes: "core__users_pkey" PRIMARY KEY, btree (id) "core__users_username_key" UNIQUE CONSTRAINT, btree (username) "ix_core__users_uploader" btree (username)
*brand new latest database, using postgresql and on master*
Table "public.core__users" Column | Type | Modifiers | Storage | Stats target | Description ----------------------------+-----------------------------+----------------------------------------------------------+----------+--------------+------------- id | integer | not null default nextval('core__users_id_seq'::regclass) | plain | | username | character varying | not null | extended | | email | character varying | not null | extended | | pw_hash | character varying | | extended | | created | timestamp without time zone | not null | plain | | wants_comment_notification | boolean | | plain | | wants_notifications | boolean | | plain | | license_preference | character varying | | extended | | url | character varying | | extended | | bio | text | | extended | | uploaded | integer | | plain | | upload_limit | integer | | plain | | Indexes: "core__users_pkey" PRIMARY KEY, btree (id) "ix_core__users_username" UNIQUE, btree (username)
comment:3 by , 10 years ago
So, it looks like Elrond is right about how the indexes are created.
So I guess it's on to figuring out what to do next. Should we set things to the way they were before this migration or not? Do we have confirmation that a "unique" constraint is enough for both postgres and sqlite? Or is the way we set things up (not including the way the migration does it, but the "brand new db" best?
comment:4 by , 10 years ago
Status: | new → accepted |
---|
comment:5 by , 10 years ago
Okay, so here's the rundown from my conversation with Elrond:
- The difference between
unique=True}} and {{{unique=True, index=True
is mostly conceptualunique=True
yields an implicit index on all relevant databases, it seemsunique=True, index=True
yields a unique-index.- These state different intents, but the actual result is they work about the same, as far as we can tell. In the future, when we intend to express a
unique=True, index=True
type scenario on *new* models, we should do that, just because it's more clearly expressed. But in reality it does not matter much.
- We have a couple of options for migrationss, though Elrond and I agree on which one we should do... though I'll list both
- The one we should do: make migration null for new people who run it, make a new migration to see if the old migration ran, if so, "set it back".
- The other one, too much work for no real gains for those who haven't run the migration yet: also mark the problematic migration as null, add a new migration setting things up as
index=True, unique=True
for both people who did and didn't run the previous migration.
Time to try and see if that works, anyway!
comment:6 by , 10 years ago
This might not be so easy. The whole reason behind the specified plan was that there was a "unique" restriction that came bundled with an index that made it Just Work (TM). But if you set up a database from master, that "unique" constraint that does not come bundled with the table's CREATE statement, at least on sqlite.
fuuuuuuuuuu
comment:7 by , 10 years ago
Status: | accepted → review |
---|
Okay. So I pushed a branch that seems to fix everything.
https://gitorious.org/mediagoblin/cwebbers-mediagoblin/commits/920_fix_index
Jessica has reviewed, gives it a thumbs up. If possible, I'd like Elrond (who caught these things in the first place) to review before merging and pushing, but not critical.
comment:8 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | review → closed |
Elrond has reviewed! I have made changes! Merged into master!
Good riddance to this one!
Okay, I'm going to paste the results one after another in here, before analysis. First, here's the sqlite results:
pre-the-new-migration on sqlite
post-the-new-migration on sqlite
brand new latest database, using sqlite and on master