Opened 9 years ago

Closed 8 years ago

#920 closed defect (fixed)

Broken migration for #894

Reported by: Elrond Owned by:
Priority: blocker Milestone: 0.7.0
Component: programming Keywords: sql
Cc: Parent Tickets:

Description

tl;dr the index-add on User.username might be broken. If it is, this is a blocker.

Intro

We used to have unique=True on User.username. In #894 (commit 892eed590fb30131ea2e8612da5ba22fa24f690c) we decided to add an extra index=True (discussion, on whether this was is needed, is not the scope here).

On most (if not all) databases, a UniqueConstraint will create an internal index, so that checking the constraint is reasonably performant. This is different from an actual index (with or without uniqueness attribute) in a concptual way. Whether this is a real difference on a database, is another story.

According to the sqlalchemy docs unique=True, index=True column creates only an Index with a uniqueness attribute. This is actually exactly what we would like after all.

Situations

(These are the situations I expect after reading our code and the docs.)

Old schema::

  • unique=True
  • One Constraint on the table.

Old schema + migration

  • The migration adds an Index (without uniqueness).
  • One Constraint on the table (from before)
  • One Index (from the migration)

New schema, when creating database anew::

  • unique=True, index=True
  • One Index(with uniqueness attribute) on the table

As you can see, the situation with migration or building from scratch are different.

How to verify

  1. Create a fresh database before the migration.
  2. Run the migration
  3. Investigate/safe variant A
    • pg_dump MEDIAGOBLINDB >variant_A.sql
    • psql MEDIAGOBLINDB, then \d core__users, look at the Indexes-list
  4. Create a fresh database after the migration
  5. Investigate/safe variant B
    • pg_dump MEDIAGOBLINDB >variant_B.sql
    • psql MEDIAGOBLINDB, then \d core__users, look at the Indexes-list
  6. diff -u variant_A.sql variant_B.sql

If the indexes / pg_dump differ, then something is wrong and this bug is a valid. Otherwise, I'd love to know, what's wrong with my assumptions.

Subtickets

Change History (8)

comment:1 Changed 9 years ago by Christopher Allan Webber

Okay, I'm going to paste the results one after another in here, before analysis. First, here's the sqlite results:

pre-the-new-migration on sqlite

sqlite> .schema core__users
CREATE TABLE core__users (
	id INTEGER NOT NULL, 
	username VARCHAR NOT NULL, 
	email VARCHAR NOT NULL, 
	pw_hash VARCHAR, 
	created DATETIME NOT NULL, 
	wants_comment_notification BOOLEAN, 
	wants_notifications BOOLEAN, 
	license_preference VARCHAR, 
	url VARCHAR, 
	bio TEXT, 
	uploaded INTEGER, 
	upload_limit INTEGER, 
	PRIMARY KEY (id), 
	UNIQUE (username), 
	CHECK (wants_comment_notification IN (0, 1)), 
	CHECK (wants_notifications IN (0, 1))
);

post-the-new-migration on sqlite

sqlite> .schema core__users
CREATE TABLE core__users (
	id INTEGER NOT NULL, 
	username VARCHAR NOT NULL, 
	email VARCHAR NOT NULL, 
	pw_hash VARCHAR, 
	created DATETIME NOT NULL, 
	wants_comment_notification BOOLEAN, 
	wants_notifications BOOLEAN, 
	license_preference VARCHAR, 
	url VARCHAR, 
	bio TEXT, 
	uploaded INTEGER, 
	upload_limit INTEGER, 
	PRIMARY KEY (id), 
	UNIQUE (username), 
	CHECK (wants_comment_notification IN (0, 1)), 
	CHECK (wants_notifications IN (0, 1))
);
CREATE INDEX ix_core__users_uploader ON core__users (username);

brand new latest database, using sqlite and on master

sqlite> .schema core__users
CREATE TABLE core__users (
	id INTEGER NOT NULL, 
	username VARCHAR NOT NULL, 
	email VARCHAR NOT NULL, 
	pw_hash VARCHAR, 
	created DATETIME NOT NULL, 
	wants_comment_notification BOOLEAN, 
	wants_notifications BOOLEAN, 
	license_preference VARCHAR, 
	url VARCHAR, 
	bio TEXT, 
	uploaded INTEGER, 
	upload_limit INTEGER, 
	PRIMARY KEY (id), 
	CHECK (wants_comment_notification IN (0, 1)), 
	CHECK (wants_notifications IN (0, 1))
);
CREATE UNIQUE INDEX ix_core__users_username ON core__users (username);

comment:2 Changed 9 years ago by Christopher Allan Webber

And now: postgres.

*pre-the-new-migration on postgresql*

                                                                 Table "public.core__users"
           Column           |            Type             |                        Modifiers                         | Storage  | Stats target | Description 
----------------------------+-----------------------------+----------------------------------------------------------+----------+--------------+-------------
 id                         | integer                     | not null default nextval('core__users_id_seq'::regclass) | plain    |              | 
 username                   | character varying           | not null                                                 | extended |              | 
 email                      | character varying           | not null                                                 | extended |              | 
 pw_hash                    | character varying           |                                                          | extended |              | 
 created                    | timestamp without time zone | not null                                                 | plain    |              | 
 wants_comment_notification | boolean                     |                                                          | plain    |              | 
 wants_notifications        | boolean                     |                                                          | plain    |              | 
 license_preference         | character varying           |                                                          | extended |              | 
 url                        | character varying           |                                                          | extended |              | 
 bio                        | text                        |                                                          | extended |              | 
 uploaded                   | integer                     |                                                          | plain    |              | 
 upload_limit               | integer                     |                                                          | plain    |              | 
Indexes:
    "core__users_pkey" PRIMARY KEY, btree (id)
    "core__users_username_key" UNIQUE CONSTRAINT, btree (username)

*post-the-new-migration on postgresql*

                                                                 Table "public.core__users"
           Column           |            Type             |                        Modifiers                         | Storage  | Stats target | Description 
----------------------------+-----------------------------+----------------------------------------------------------+----------+--------------+-------------
 id                         | integer                     | not null default nextval('core__users_id_seq'::regclass) | plain    |              | 
 username                   | character varying           | not null                                                 | extended |              | 
 email                      | character varying           | not null                                                 | extended |              | 
 pw_hash                    | character varying           |                                                          | extended |              | 
 created                    | timestamp without time zone | not null                                                 | plain    |              | 
 wants_comment_notification | boolean                     |                                                          | plain    |              | 
 wants_notifications        | boolean                     |                                                          | plain    |              | 
 license_preference         | character varying           |                                                          | extended |              | 
 url                        | character varying           |                                                          | extended |              | 
 bio                        | text                        |                                                          | extended |              | 
 uploaded                   | integer                     |                                                          | plain    |              | 
 upload_limit               | integer                     |                                                          | plain    |              | 
Indexes:
    "core__users_pkey" PRIMARY KEY, btree (id)
    "core__users_username_key" UNIQUE CONSTRAINT, btree (username)
    "ix_core__users_uploader" btree (username)

*brand new latest database, using postgresql and on master*

                                                                 Table "public.core__users"
           Column           |            Type             |                        Modifiers                         | Storage  | Stats target | Description 
----------------------------+-----------------------------+----------------------------------------------------------+----------+--------------+-------------
 id                         | integer                     | not null default nextval('core__users_id_seq'::regclass) | plain    |              | 
 username                   | character varying           | not null                                                 | extended |              | 
 email                      | character varying           | not null                                                 | extended |              | 
 pw_hash                    | character varying           |                                                          | extended |              | 
 created                    | timestamp without time zone | not null                                                 | plain    |              | 
 wants_comment_notification | boolean                     |                                                          | plain    |              | 
 wants_notifications        | boolean                     |                                                          | plain    |              | 
 license_preference         | character varying           |                                                          | extended |              | 
 url                        | character varying           |                                                          | extended |              | 
 bio                        | text                        |                                                          | extended |              | 
 uploaded                   | integer                     |                                                          | plain    |              | 
 upload_limit               | integer                     |                                                          | plain    |              | 
Indexes:
    "core__users_pkey" PRIMARY KEY, btree (id)
    "ix_core__users_username" UNIQUE, btree (username)

comment:3 Changed 9 years ago by Christopher Allan Webber

So, it looks like Elrond is right about how the indexes are created.

So I guess it's on to figuring out what to do next. Should we set things to the way they were before this migration or not? Do we have confirmation that a "unique" constraint is enough for both postgres and sqlite? Or is the way we set things up (not including the way the migration does it, but the "brand new db" best?

comment:4 Changed 9 years ago by Elrond

Status: newaccepted

comment:5 Changed 8 years ago by Christopher Allan Webber

Okay, so here's the rundown from my conversation with Elrond:

  • The difference between unique=True}} and {{{unique=True, index=True is mostly conceptual
    • unique=True yields an implicit index on all relevant databases, it seems
    • unique=True, index=True yields a unique-index.
    • These state different intents, but the actual result is they work about the same, as far as we can tell. In the future, when we intend to express a unique=True, index=True type scenario on *new* models, we should do that, just because it's more clearly expressed. But in reality it does not matter much.
  • We have a couple of options for migrationss, though Elrond and I agree on which one we should do... though I'll list both
    • The one we should do: make migration null for new people who run it, make a new migration to see if the old migration ran, if so, "set it back".
    • The other one, too much work for no real gains for those who haven't run the migration yet: also mark the problematic migration as null, add a new migration setting things up as index=True, unique=True for both people who did and didn't run the previous migration.

Time to try and see if that works, anyway!

comment:6 Changed 8 years ago by Christopher Allan Webber

This might not be so easy. The whole reason behind the specified plan was that there was a "unique" restriction that came bundled with an index that made it Just Work (TM). But if you set up a database from master, that "unique" constraint that does not come bundled with the table's CREATE statement, at least on sqlite.

fuuuuuuuuuu

comment:7 Changed 8 years ago by Christopher Allan Webber

Status: acceptedreview

Okay. So I pushed a branch that seems to fix everything.

https://gitorious.org/mediagoblin/cwebbers-mediagoblin/commits/920_fix_index

Jessica has reviewed, gives it a thumbs up. If possible, I'd like Elrond (who caught these things in the first place) to review before merging and pushing, but not critical.

comment:8 Changed 8 years ago by Christopher Allan Webber

Resolution: fixed
Status: reviewclosed

Elrond has reviewed! I have made changes! Merged into master!

Good riddance to this one!

Note: See TracTickets for help on using tickets.