Using single column table

Question

I'm creating a database to store events from mobile apps, collected from multiple sources. The problem is that rows in the event table don't mean much to the user, since they are mostly a succession of integers. This forces users to write multiple joins or run multiple queries.

CREATE TABLE source (
    id              serial PRIMARY KEY,
    value           string NOT NULL
); 

CREATE TABLE application (
    id              serial PRIMARY KEY,
    value           string NOT NULL
);

CREATE TABLE platform (
    id              serial PRIMARY KEY,
    value           string NOT NULL
);

CREATE TABLE country (
    id              serial PRIMARY KEY,
    value           string NOT NULL
);

CREATE TABLE event (
    id              serial PRIMARY KEY,
    source_id       integer REFERENCES source(id),
    application_id  integer REFERENCES application(id),
    platform_id     integer REFERENCES platform(id),
    country_id      integer REFERENCES country(id),

    ...

    updated_at      date NOT NULL,
    value           decimal(100, 2) NOT NULL
);

I thought of using the value of the "secondary" tables directly as their primary key (it's unique and not null) and referencing it from the event table. It would look like this:

CREATE TABLE source (
    value              string PRIMARY KEY
); 

CREATE TABLE application (
    value              string PRIMARY KEY
);

CREATE TABLE platform (
    value              string PRIMARY KEY
);

CREATE TABLE country (
    value              string PRIMARY KEY
);

CREATE TABLE event (
    id              serial PRIMARY KEY,
    source          string REFERENCES source(value),
    application     string REFERENCES application(value),
    platform        string REFERENCES platform(value),
    country         string REFERENCES country(value),

    ...

    updated_at      date NOT NULL,
    value           decimal(100, 2) NOT NULL
);

I think this design could work just as well, since I don't see what a surrogate key adds in this situation. It would also spare me from using views, which might perform worse because the view's query is executed every time I reference the view in a query.

What do you think of this option?

If the value columns in your source, platform, application, and country tables satisfy the criteria for a primary key (not null, unique, and unchanging) then I like the second design better.
– Bob Jarvis - Слава Україні, Commented Mar 29, 2017 at 11:09
The downside of the second approach is that you will duplicate the text values in event table. If the values are long, it will take up much space, which can be avoided if you were using integer keys.
– pumbo, Commented Mar 29, 2017 at 11:10
If the strings are codes then use them as keys. Simple beats complex, if it is correct.
– Clodoaldo Neto, Commented Mar 29, 2017 at 12:10
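
The storage concern raised in the comments can be checked directly. The sketch below assumes PostgreSQL (the question does not name its database) and uses its pg_column_size function; the label 'android' is a hypothetical platform value, not one taken from the question.

```sql
-- Assumption: PostgreSQL. Compare the size in bytes of an integer key
-- with the size of a short text key.
SELECT pg_column_size(1::integer)      AS integer_key_bytes,
       pg_column_size('android'::text) AS text_key_bytes;  -- 'android' is hypothetical
```

The size reported for a literal can differ slightly from the size of the same value stored in a table, so this is only a rough guide. For short codes the per-row difference is small; it grows with the length of the strings.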

Gordon Linoff · Accepted Answer · 2017-03-29 11:12:19Z

4

"real" systems usually use surrogate keys. There are multiple reasons why:

Integers are more efficient for indexes, because they are fixed length.
Integers are more efficient for foreign key references, because they are only four bytes (strings are often larger).
String values may change and then referring tables need to be updated.
Auto-generated primary keys carry extra information, such as the order of insertion.
End users do not directly access tables. If such functionality is needed, then views are fine.
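
As a rough sketch of the view suggested in the last point, built on the first schema from the question: the view name event_readable is hypothetical, the columns elided with "..." in the question are left out, and PostgreSQL is assumed.

```sql
-- Hypothetical view exposing readable labels instead of integer keys.
-- LEFT JOIN is used because the foreign key columns in event are nullable.
CREATE VIEW event_readable AS
SELECT e.id,
       s.value AS source,
       a.value AS application,
       p.value AS platform,
       c.value AS country,
       e.updated_at,
       e.value
FROM event e
LEFT JOIN source      s ON s.id = e.source_id
LEFT JOIN application a ON a.id = e.application_id
LEFT JOIN platform    p ON p.id = e.platform_id
LEFT JOIN country     c ON c.id = e.country_id;

-- Users then query the view as if it were the second design.
SELECT * FROM event_readable WHERE country = 'FR';  -- 'FR' is a hypothetical value
```

In PostgreSQL a plain (non-materialized) view is expanded into the query that references it and planned as a single statement, so filters such as the one on country can still be used by the planner rather than being applied to a fully computed result.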

There is nothing per se wrong with using strings. But in practice, they are not used for this purpose.
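
If strings are used as keys anyway, the "values may change" point can be handled with ON UPDATE CASCADE, at the cost of rewriting every referencing row. This is a minimal sketch, assuming PostgreSQL, with text standing in for the question's string type, only the platform reference shown, and hypothetical sample values.

```sql
-- Assumption: PostgreSQL; text stands in for the question's string type.
CREATE TABLE platform (
    value text PRIMARY KEY
);

CREATE TABLE event (
    id         serial PRIMARY KEY,
    platform   text REFERENCES platform(value) ON UPDATE CASCADE,
    updated_at date NOT NULL,
    value      decimal(100, 2) NOT NULL
);

-- Hypothetical sample data.
INSERT INTO platform (value) VALUES ('iOS');
INSERT INTO event (platform, updated_at, value)
VALUES ('iOS', '2017-03-29', 1.00);

-- Renaming the key also updates event.platform in every referencing row.
UPDATE platform SET value = 'iPhone OS' WHERE value = 'iOS';
```

With a surrogate key the same rename touches a single row in platform, which is the update cost the answer refers to.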

