0

I want to create a database for my blog website where I have many posts and categories. A post can have many categories and a category can have many posts. I couldn't figure out how to implement it in PostgreSQL.

This is the category table:

CREATE TABLE category (
   category_name VARCHAR(255) PRIMARY KEY
);

And this is the Post table:

CREATE TABLE post (
   post_id INTEGER NOT NULL PRIMARY KEY,
   title VARCHAR(255),
   category_name VARCHAR(255) REFERENCES category (category_name)
                 ON UPDATE CASCADE ON DELETE CASCADE,
   date DATE
);

I find it difficult to INSERT a record.

The following doesn't work:

INSERT INTO post(post_id, title, category_name, date)
VALUES (1, 'How to create a website',['technology','website'], '2025-06-04');

I could've inserted some data into the category table, but only 1 value works at a time. How do I add multiple values?

1
  • While many-to-many relations require a third bridging table as @hardillb showed, in postgresql you could also create it with arrays. But in that case you would need to keep the content of array(s) synced yourself. That doesn't need an array on categories table and you wouldn't have a REFERENCES constraint (and type would be an array). Commented Jul 5 at 9:59

2 Answers 2

4

Learn about different levels of database normalisation .

Basically you create an extra table that just holds the post ID and a category ID (you also need to update the category table to have an ID column).

You can then do joins to find all the posts in a category or all the categories for a post

e.g.

CREATE TABLE post
(
   post_id INTEGER NOT NULL PRIMARY KEY,
   title VARCHAR(255),
   date DATE
);

CREATE TABLE category
(
   id INTEGER NOT NULL PRIMARY KEY,
   category_name VARCHAR(255) UNIQUE
);

CREATE TABLE post_category (
  post_id INTEGER NOT NULL REFERENCES post (id),
  category_id INTEGER NOT NULL REFERENCES category (id)
  PRIMARY KEY (post_id, category_id)
);
Sign up to request clarification or add additional context in comments.

4 Comments

Without an example, this isn't an answer. It's a comment.
true...............
I think this is the answer.
Thanks for the answer, I learnt I need to learn a bit more of Database Design.
0

With arrays

PostgreSQL supports arrays of values. This will be the simplest way to implement your needs:
You'll just have to declare your column as category_name VARCHAR(255)[],
and insert as ARRAY['technology','website'] (instead of the ['technology','website'] you put in your question).

However, even if PostgreSQL provides a bunch of functions that make arrays easier to handle, quicker, and cleaner than comma-separating your values, this is still a shaky model that won't be helped by the RDBMS, and especially, as of 2025, you can't get them be FOREIGN KEYS.
Maintaining integrity will rely on defining some triggers on both tables (INSERT OR UPDATE OF category_name ON post, UPDATE OR DELETE ON category). Hint: using serials instead of the name could remove the need for UPDATE ON category, and make it lighter on storage, on the other hand SELECTing the categories will then of course require one additional indirection.

You can get a glimpse at it in a small fiddle to play with.

With an n..n link table

Then of course you've got the all-terrain join table, that you could call post_category, to remove that multivalued column and at least satisfy 1NF.

This will require a two-steps INSERT, or using PostgreSQL's ability to include INSERT … RETURNING in CTES:

WITH
i AS
(
    INSERT INTO post(post_id, title, date)
    VALUES (1, 'How to create a website', '2025-06-04')
    RETURNING post_id
)
INSERT INTO post_category
SELECT post_id, category_name
FROM i, category
WHERE category_name in ('technology','website');

This is shown in its own fiddle.

Of course with CTEs you can add a lot of other things, for example:

  • having a first CTE where all input parameters stay together, so you can input multiple VALUES each with its own categories
  • dynamically insert missing categories to category before using them in the just inserted post.
Handling SELECTs in an 1NF model

You perhaps already noticed that having a clean model forces to think of a clean implementation, even for SELECTs.

For example, if you do not pay attention, with a blog containing 2 posts (post 1 with categories A and B, and post 2 with categories A, B and C), a summary SELECT 'Over the whole blog, you posted '||COUNT(post.post_id)||' posts over '||COUNT(category)||' different categories' FROM post LEFT JOIN post_category USING (post_id);, would return "you posted 5 posts over 5 categories" instead of "2 posts over 3 categories".

DISTINCT is a quick-and-really-dirty way (that will not work with joins of more that two 1-to-n or n-to-n tables),
so I'd advize you to always think in sets of rows that each represent 1 physical entity (while DISTINCT is just hiding that you got 2 or 3 rows to represent each "physical unit" which is a post).

Better strategies would be based on grouped subqueries:

  • CTEs to reduce the n side of an 1-to-n relation to 1 thanks to aggregates
    For example, 1 post having 3 categories has only 1 count of categories (whose value is 3) (or subquery-defined tables: SELECT … FROM (SELECT … GROUP BY post_id) AS agg_cats_of_post which is equivalent to CTE WITH agg_cats_of_post AS (SELECT … GROUP BY post_id) … FROM agg_cats_of_post, possibly correlated (LATERAL join))
  • correlated subqueries as columns
    (SELECT (SELECT COUNT(*) FROM post_cat pc WHERE pc.post_id = p.post_id) AS count_of_cats FROM post p)
  • EXISTS filters
    (FROM post p WHERE EXISTS (SELECT 1 FROM post_cat pc WHERE pc.post_id = p.post_id AND p.category_name = 'website'))

4 Comments

I have to downvote an answer that calls arrays to implement an m-to-n relationship "clean".
@LaurenzAlbe you're right, my answer was quite ambiguous; I just wanted to express that arrays are cleaner than storing a CSV, as can frequently be seen. Here OP, by asking about arrays and declaring foreign keys in the question, shown an interest in having data well organized, that's something that I wanted to highlight. I modified the answer to better reflect the unfinished nature of a model relying on arrays; still, I see arrays as a potential pragmatic first solution for a personal blog, even if the solution I insist on is the second one.
Thank you so much for helping out especially with the insert code. I was stuck for a month because of this problem. I couldn't find the answer I want. But now it seems this is the answer, it works.
Fair enough; undownvoted.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.