I have a table that stores a person's information with close to 10 million rows.
Currently State is a char(2) field on the person table. As you would expect, this leads to a lot of duplicated data. If I normalize the State data into its own table and add an FK to it in the person table, would this result in faster query times?
Before:
SELECT Name, City, State FROM Person WHERE State = 'WI'
After:
SELECT p.Name, p.City, s.Name as State
FROM Person p
INNER JOIN State s ON p.State = s.Id
WHERE s.Name = 'WI'
It seems to me that this would improve performance, but I am far from an expert when it comes to optimizing queries.
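One way to check this rather than guess is to load both layouts into a scratch database and compare the results and query plans. The following is a minimal sketch using Python's built-in sqlite3 module, not a benchmark of the real 10-million-row table. The sample rows, the `PersonNorm` table name, and the integer `Id` surrogate key are assumptions made for illustration, and plans on another database engine will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# "Before" layout: the two-character state code is stored inline.
cur.execute("CREATE TABLE Person (Name TEXT, City TEXT, State CHAR(2))")

# "After" layout: state codes live in a lookup table with an integer key.
# PersonNorm is a hypothetical name so both layouts can coexist here.
cur.execute("CREATE TABLE State (Id INTEGER PRIMARY KEY, Name CHAR(2) UNIQUE)")
cur.execute(
    "CREATE TABLE PersonNorm (Name TEXT, City TEXT, "
    "State INTEGER REFERENCES State(Id))"
)

# Made-up sample data, loaded into both layouts.
people = [
    ("Ann", "Madison", "WI"),
    ("Bob", "Chicago", "IL"),
    ("Cid", "Milwaukee", "WI"),
]
codes = sorted({code for _, _, code in people})
cur.executemany("INSERT INTO State (Name) VALUES (?)", [(c,) for c in codes])
state_ids = {name: id_ for id_, name in cur.execute("SELECT Id, Name FROM State")}
cur.executemany("INSERT INTO Person VALUES (?, ?, ?)", people)
cur.executemany(
    "INSERT INTO PersonNorm VALUES (?, ?, ?)",
    [(n, c, state_ids[s]) for n, c, s in people],
)

BEFORE_SQL = "SELECT Name, City, State FROM Person WHERE State = 'WI' ORDER BY Name"
AFTER_SQL = (
    "SELECT p.Name, p.City, s.Name AS State "
    "FROM PersonNorm p INNER JOIN State s ON p.State = s.Id "
    "WHERE s.Name = 'WI' ORDER BY p.Name"
)

before = cur.execute(BEFORE_SQL).fetchall()
after = cur.execute(AFTER_SQL).fetchall()

# Print the plan SQLite chooses for each query; the fourth column of
# EXPLAIN QUERY PLAN output is the human-readable step description.
for label, sql in (("before", BEFORE_SQL), ("after", AFTER_SQL)):
    for row in cur.execute("EXPLAIN QUERY PLAN " + sql).fetchall():
        print(label, row[3])
```

Both queries return the same rows, but the second one has to visit an extra table to do so, which is the cost to weigh against any savings from a smaller key.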
[...] State column may help you better instead of an FK, I guess.

Since State is already a CHAR(2), you seem to have no other data associated with states (e.g. population), and we don't expect states' two-character codes to change anytime soon, I would hardly even consider making a separate states table "normalization." If you do want to normalize, consider making a separate City table with columns for name and state.

[...] CHAR(2) (which never changes) is not. I don't think there are any more repeating values using a CHAR(2) than if using some numeric foreign key.

[...] State table with fields code, name, population, area, etc., where the code is a primary key of type CHAR(2) using values like 'VA' and 'NC'. But in this case, there seems to be no additional information associated with a state, so there's no actual need for a State table, as it would only have the code with no additional fields.
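If a State table is wanted anyway, the design described above, with the CHAR(2) code itself as the primary key, can be sketched as follows. This is an illustration using Python's sqlite3 module under stated assumptions: the sample people, the `name` column values, and the index name `idx_person_state` are invented here, and the index is an addition for the lookup by state rather than something the design requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
cur = conn.cursor()

# The two-character code is the primary key; no surrogate integer is needed.
cur.execute("CREATE TABLE State (code CHAR(2) PRIMARY KEY, name TEXT)")
cur.execute(
    "CREATE TABLE Person (Name TEXT, City TEXT, "
    "State CHAR(2) REFERENCES State(code))"
)
# Assumed index name; it supports filtering Person rows by state code.
cur.execute("CREATE INDEX idx_person_state ON Person (State)")

cur.executemany(
    "INSERT INTO State VALUES (?, ?)",
    [("VA", "Virginia"), ("NC", "North Carolina")],
)
# Made-up sample rows.
cur.executemany(
    "INSERT INTO Person VALUES (?, ?, ?)",
    [("Dana", "Richmond", "VA"), ("Eli", "Raleigh", "NC")],
)

# Filtering by state still needs no join, because Person holds the code itself.
rows = cur.execute("SELECT Name FROM Person WHERE State = 'VA'").fetchall()

# The FK now rejects codes that are not present in State.
try:
    cur.execute("INSERT INTO Person VALUES ('Fay', 'Nowhere', 'ZZ')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rows, rejected)
```

With this layout the original single-table query keeps working unchanged, and the State table only earns its keep once there are extra columns, or validation of codes, worth having.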