MySQL database structure: more columns or more rows?

Question

I'm creating an online dictionary and I have to use three different dictionaries for this purpose: everyday terms, chemical terms, computer terms. I have three options:

1) Create three different tables, one table for each dictionary

2) Create one table with extra columns, i.e.:

id    term    dic_1_definition    dic_2_definition    dic_3_definition
----------------------------------------------------------------------
1     term1   definition
----------------------------------------------------------------------
2     term2                       definition
----------------------------------------------------------------------
3     term3                                           definition
----------------------------------------------------------------------
4     term4                       definition
----------------------------------------------------------------------
5     term5   definition                              definition
----------------------------------------------------------------------
etc.

3) Create one table with an extra "tag" column and tag each term according to its dictionary, i.e.:

id    term     definition    tag
------------------------------------
1     term1    definition    dic_1
2     term2    definition    dic_2
3     term3    definition    dic_3
4     term4    definition    dic_2
5     term1    definition    dic_2
etc.

A term can belong to one or more dictionaries but have a different definition in each; for example, a term in everyday use can differ from the same term in the IT field. That's why term1 (in my last table) appears under two tags: dic_1 (id 1) and dic_2 (id 5).

In the future I'll add more dictionaries, so there will probably be more than three. I think that if I use option 2 (with extra columns) I'll end up with one table and many, many columns. I don't know whether that's bad for performance.

Which option is the best approach in my case? Which one is faster? Why? Any suggestions and other options are greatly appreciated.

Thank you.

How much data is being loaded into this, a full dictionary or a couple of hundred to a thousand words?
– Zyris Development Team, Commented Nov 30, 2009 at 13:07
For example, the first table has more than 200,000 rows, so I suppose it'll be around 500,000 rows.
– Anthony, Commented Nov 30, 2009 at 13:11
The third approach is better, in my view. I have made a small modification to my post below.
– Orson, Commented Nov 30, 2009 at 13:53

James Goodwin · Accepted Answer · 2009-11-30 14:22:29Z

6

2) Create one table with extra column

You definitely shouldn't be using the 2nd approach. What if in the future you decide that you want 10 dictionaries? You would have to create an additional 10 columns, which is madness.

What you should do is create a single table for all your dictionaries, a single table for all your terms, and a single table for all your definitions. That way all your data is grouped together in a logical fashion.

Then you can create a unique ID for each of your dictionaries, which is referenced in the terms table. Then all you need is a simple query to obtain the terms for a particular dictionary.
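
A minimal sketch of that layout, using Python's built-in sqlite3 module as a stand-in for MySQL so it runs anywhere. The table names, column names, and sample words are assumptions made for illustration, not part of the answer. It shows that a fourth dictionary arrives as a new row rather than a new column.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; every name below is illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dictionaries (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE terms (
        id            INTEGER PRIMARY KEY,
        dictionary_id INTEGER NOT NULL REFERENCES dictionaries(id),
        term          TEXT NOT NULL
    );
    CREATE TABLE definitions (
        id         INTEGER PRIMARY KEY,
        term_id    INTEGER NOT NULL REFERENCES terms(id),
        definition TEXT NOT NULL
    );
""")
conn.executemany(
    "INSERT INTO dictionaries (id, name) VALUES (?, ?)",
    [(1, "everyday"), (2, "chemical"), (3, "computer")],
)
conn.executemany(
    "INSERT INTO terms (id, dictionary_id, term) VALUES (?, ?, ?)",
    [(1, 1, "mouse"), (2, 3, "mouse"), (3, 2, "mole")],
)
conn.execute(
    "INSERT INTO definitions (term_id, definition) VALUES (2, 'a pointing device')"
)

def terms_in(dictionary_name):
    """Return the terms that belong to one dictionary, sorted by name."""
    rows = conn.execute(
        """
        SELECT t.term
        FROM terms AS t
        JOIN dictionaries AS d ON d.id = t.dictionary_id
        WHERE d.name = ?
        ORDER BY t.term
        """,
        (dictionary_name,),
    )
    return [row[0] for row in rows]

# A fourth dictionary is just more rows; the schema does not change.
conn.execute("INSERT INTO dictionaries (id, name) VALUES (4, 'medical')")
conn.execute("INSERT INTO terms (id, dictionary_id, term) VALUES (4, 4, 'mole')")

print(terms_in("computer"))
print(terms_in("medical"))
```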

edited Nov 30, 2009 at 14:22

answered Nov 30, 2009 at 13:08

James Goodwin

Orson · Accepted Answer · 2009-11-30 13:08:30Z

5

I think you should have a lookup table for your dictionary types

DictionaryType(DTId, DTName)

Have another table for your terms

Terms(TermID, TermName)

Then your definitions

Definitions(DefinitionId, TermID, Definition, DTId)

This should work.
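
As a rough sketch, the three tables above could be created and queried as follows. SQLite (via Python's sqlite3) is used here in place of MySQL so the example is self-contained; the column types, the sample term "bond", and its definitions are assumptions for illustration. One term row carries three definition rows, one per dictionary type.

```python
import sqlite3

# SQLite in memory stands in for MySQL; column types and sample data are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DictionaryType (
        DTId   INTEGER PRIMARY KEY,
        DTName TEXT NOT NULL
    );
    CREATE TABLE Terms (
        TermID   INTEGER PRIMARY KEY,
        TermName TEXT NOT NULL UNIQUE
    );
    CREATE TABLE Definitions (
        DefinitionId INTEGER PRIMARY KEY,
        TermID       INTEGER NOT NULL REFERENCES Terms(TermID),
        Definition   TEXT NOT NULL,
        DTId         INTEGER NOT NULL REFERENCES DictionaryType(DTId)
    );
    INSERT INTO DictionaryType (DTId, DTName) VALUES
        (1, 'everyday'), (2, 'chemical'), (3, 'computer');
    INSERT INTO Terms (TermID, TermName) VALUES (1, 'bond');
    INSERT INTO Definitions (TermID, Definition, DTId) VALUES
        (1, 'a close connection between people', 1),
        (1, 'an attraction holding atoms together', 2),
        (1, 'combining network interfaces into one link', 3);
""")

def definitions_of(term_name):
    """Return (dictionary name, definition) pairs for one term."""
    rows = conn.execute(
        """
        SELECT dt.DTName, d.Definition
        FROM Terms AS t
        JOIN Definitions AS d ON d.TermID = t.TermID
        JOIN DictionaryType AS dt ON dt.DTId = d.DTId
        WHERE t.TermName = ?
        ORDER BY dt.DTId
        """,
        (term_name,),
    )
    return rows.fetchall()

for dictionary_name, definition in definitions_of("bond"):
    print(dictionary_name, "->", definition)
```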

answered Nov 30, 2009 at 13:08

Orson

9 Comments

Zyris Development Team Over a year ago

What's the DictionaryType table for? Your answer is the best, but I don't see why that table is needed at all.

Orson Over a year ago

The DictionaryType table contains all the dictionary names. He said, "I'm creating an online dictionary and I have to use three different dictionaries"

Anthony Over a year ago

What if the same term appears three times with different definitions? Will this term have three IDs, or one ID and three definitions?

Guffa Over a year ago

@Anthony: The term would have one record in the Terms table and thus one identifier, and three records in the Definitions table, one for each dictionary type.

markus Over a year ago

This is the way to go, if you want to be flexible with all three objects in the future. This way you can use ORM easily and add properties to all your three object types independently of the other object types.

Vincent Ramdhanie · Accepted Answer · 2009-11-30 13:05:00Z

2

Option 3 sounds like the most appropriate choice for your scenario. It makes the queries a little simpler and is definitely more maintainable in the long run.

Option 2 is definitely not the way to go because you will end up with a lot of null values, and writing queries against such a table will be a nightmare.

Option 1 is not bad, but before your application can run a query it has to decide which table to query, and that could be a problem.

So option 3 would result in simple queries like:

SELECT term, definition FROM terms WHERE tag = 'dic_1';

You may even create another tag table to keep info about the tags themselves.
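
A small sketch of option 3 with the optional tag table, loaded with the sample rows from the question. SQLite through Python's sqlite3 stands in for MySQL; the table names `entries` and `tags`, the index, and the mapping of dic_1..dic_3 to the three dictionaries are assumptions for illustration.

```python
import sqlite3

# SQLite in memory stands in for MySQL; table and index names are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Optional lookup table holding information about each tag.
    CREATE TABLE tags (
        tag         TEXT PRIMARY KEY,
        description TEXT NOT NULL
    );
    CREATE TABLE entries (
        id         INTEGER PRIMARY KEY,
        term       TEXT NOT NULL,
        definition TEXT NOT NULL,
        tag        TEXT NOT NULL REFERENCES tags(tag)
    );
    -- Index the column that per-dictionary queries filter on.
    CREATE INDEX idx_entries_tag ON entries (tag);
""")
conn.executemany(
    "INSERT INTO tags (tag, description) VALUES (?, ?)",
    [
        ("dic_1", "everyday terms"),  # assumed mapping of tags to dictionaries
        ("dic_2", "chemical terms"),
        ("dic_3", "computer terms"),
    ],
)
conn.executemany(
    "INSERT INTO entries (id, term, definition, tag) VALUES (?, ?, ?, ?)",
    [
        (1, "term1", "definition", "dic_1"),
        (2, "term2", "definition", "dic_2"),
        (3, "term3", "definition", "dic_3"),
        (4, "term4", "definition", "dic_2"),
        (5, "term1", "definition", "dic_2"),
    ],
)

def terms_for(tag):
    """Return (term, definition) pairs for one dictionary tag, in id order."""
    rows = conn.execute(
        "SELECT term, definition FROM entries WHERE tag = ? ORDER BY id",
        (tag,),
    )
    return rows.fetchall()

print(terms_for("dic_1"))
print(terms_for("dic_2"))
```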

answered Nov 30, 2009 at 13:05

Vincent Ramdhanie

1 Comment

Peter Stuifzand Over a year ago

Instead of using a tag, he can create a new dictionary table (id, name) and use the id in the table. Takes less memory and is faster to check and join on.

Emre Yazici · Accepted Answer · 2009-11-30 14:46:24Z

2

I developed a similar project, and my design was as follows. Storing words, definitions and dictionaries in different tables is a flexible choice, especially if you will add new dictionaries in the future.

alt text http://img300.imageshack.us/img300/6550/worddict.png

edited Nov 30, 2009 at 14:46

answered Nov 30, 2009 at 13:50

Emre Yazici

2 Comments

Whimusical Over a year ago

Could I ask the name of the UML tool you used?

Emre Yazici Over a year ago

Sure, I use MySQL Workbench for that purpose.

DRapp · Accepted Answer · 2009-11-30 13:04:25Z

1

Data normalization: I would go with 3; then you don't have to write any fancy queries to find how many definitions apply to a given term.

answered Nov 30, 2009 at 13:04

DRapp

Steve De Caux · Accepted Answer · 2009-11-30 13:18:14Z

1

There's always an "it depends..."

Having said that, option 2 will usually be a bad choice - both from the purist perspective (Data Normalisation) and the practical perspective - you have to alter the table definition to add a new dictionary (or remove an old one)

If your main access is always going to be looking for a matching term, and the dictionary name ('everyday', 'chemical', 'geek') is an attribute, then option 3 makes sense.

If on the other hand your access is always primarily by dictionary type as well as term, and dictionary 1 is huge but rarely used, while dictionaries 2..n are small but commonly used, then option 1 might make more sense (or option 1a => 1 table for rarely used dictionaries, another for heavily used dictionaries)... this is a very hypothetical case !
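
The practical objection to option 2 can be illustrated with a short sketch: adding a dictionary changes the table definition, whereas under option 3 it only adds rows. SQLite via Python's sqlite3 is used in place of MySQL, and the table names `terms_wide` and `terms_tagged` are assumptions for illustration.

```python
import sqlite3

# SQLite in memory stands in for MySQL; both table names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Option 2: one definition column per dictionary.
    CREATE TABLE terms_wide (
        id               INTEGER PRIMARY KEY,
        term             TEXT NOT NULL,
        dic_1_definition TEXT,
        dic_2_definition TEXT,
        dic_3_definition TEXT
    );
    -- Option 3: the dictionary is a value in the tag column.
    CREATE TABLE terms_tagged (
        id         INTEGER PRIMARY KEY,
        term       TEXT NOT NULL,
        definition TEXT NOT NULL,
        tag        TEXT NOT NULL
    );
""")

def column_names(table):
    """List a table's column names (PRAGMA table_info puts the name at index 1)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

wide_before = column_names("terms_wide")
tagged_before = column_names("terms_tagged")

# Option 2: a fourth dictionary requires altering the table definition.
conn.execute("ALTER TABLE terms_wide ADD COLUMN dic_4_definition TEXT")

# Option 3: a fourth dictionary is only new data.
conn.execute(
    "INSERT INTO terms_tagged (term, definition, tag) "
    "VALUES ('term6', 'definition', 'dic_4')"
)

wide_after = column_names("terms_wide")
tagged_after = column_names("terms_tagged")

print(len(wide_before), "->", len(wide_after), "columns in terms_wide")
print(len(tagged_before), "->", len(tagged_after), "columns in terms_tagged")
```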

answered Nov 30, 2009 at 13:18

Steve De Caux

1 Comment

Disillusioned Over a year ago

+1 I agree with you. The requirements here are far too vague, resulting in the 'accepted answer' being totally over-'solved'. That said, working off the little provided, I'd go with a variation on #3.

Donnie · Accepted Answer · 2009-11-30 13:56:30Z

1

Your database structure should contain data, the structure itself should not be data. This rules out option 2 immediately, unless you create the different tables in order to build separate applications running on the different dictionaries. If they are being shared, then it is the wrong way to do it.

Option 1 requires a database modification and queries to be rewritten in order to accommodate addition of new dictionaries. It also adds excessive complication to simple queries, such as "what dictionaries are this word in?"

Option 3 is the most flexible and best choice here. If your data grows too large, you can eventually use database-side features such as table partitioning to speed things up.
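
To make the "what dictionaries is this word in?" comparison concrete, here is a hedged sketch of that query under option 1 and under option 3. It uses SQLite through Python's sqlite3 rather than MySQL; the table names and sample words are assumptions for illustration. Under option 1 the query has to name every table, so it must be rewritten whenever a dictionary is added.

```python
import sqlite3

# SQLite in memory stands in for MySQL; all names and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Option 1: one table per dictionary.
    CREATE TABLE everyday_terms (term TEXT, definition TEXT);
    CREATE TABLE chemical_terms (term TEXT, definition TEXT);
    CREATE TABLE computer_terms (term TEXT, definition TEXT);
    INSERT INTO everyday_terms VALUES ('mouse', 'a small rodent');
    INSERT INTO computer_terms VALUES ('mouse', 'a pointing device');
    INSERT INTO chemical_terms VALUES ('mole', 'a unit of amount of substance');

    -- Option 3: one table, with the dictionary stored as data.
    CREATE TABLE entries (term TEXT, definition TEXT, tag TEXT);
    INSERT INTO entries VALUES
        ('mouse', 'a small rodent', 'everyday'),
        ('mouse', 'a pointing device', 'computer'),
        ('mole', 'a unit of amount of substance', 'chemical');
""")

def dictionaries_option_1(term):
    """Option 1: every dictionary table must be listed in the query."""
    sql = """
        SELECT 'everyday' AS tag FROM everyday_terms WHERE term = :t
        UNION ALL
        SELECT 'chemical' FROM chemical_terms WHERE term = :t
        UNION ALL
        SELECT 'computer' FROM computer_terms WHERE term = :t
    """
    return sorted(row[0] for row in conn.execute(sql, {"t": term}))

def dictionaries_option_3(term):
    """Option 3: one query that keeps working as dictionaries are added."""
    rows = conn.execute("SELECT tag FROM entries WHERE term = ?", (term,))
    return sorted(row[0] for row in rows)

print(dictionaries_option_1("mouse"))
print(dictionaries_option_3("mouse"))
```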

answered Nov 30, 2009 at 13:56

Donnie

Guffa · Accepted Answer · 2009-11-30 13:57:20Z

1

You want to fetch data based on the dictionary type, that means that the dictionary type is data.

Data should be in the fields of the tables, not in table names or field names. If you don't have the data in the fields, you have a data model that needs changes if the data changes, and you need to create queries dynamically to get the data.

The first option uses the dictionary type as table names.

The second option uses the dictionary type as field names.

The third option correctly places the dictionary type as data in a field.

However, the term and the tag should not be strings, they should rather be foreign keys to tables where the terms and dictionary types are defined.
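
A minimal sketch of the foreign-key variant, showing that a definition pointing at a non-existent dictionary type is rejected. SQLite via Python's sqlite3 stands in for MySQL (where InnoDB is the engine that enforces foreign keys); the table and column names are assumptions for illustration.

```python
import sqlite3

# SQLite in memory stands in for MySQL; names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- SQLite only enforces foreign keys when this pragma is switched on.
    PRAGMA foreign_keys = ON;

    CREATE TABLE dictionary_types (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE terms (
        id   INTEGER PRIMARY KEY,
        term TEXT NOT NULL UNIQUE
    );
    CREATE TABLE definitions (
        term_id            INTEGER NOT NULL REFERENCES terms(id),
        dictionary_type_id INTEGER NOT NULL REFERENCES dictionary_types(id),
        definition         TEXT NOT NULL
    );
    INSERT INTO dictionary_types (id, name) VALUES (1, 'everyday'), (2, 'computer');
    INSERT INTO terms (id, term) VALUES (1, 'mouse');
""")

def try_insert(term_id, dictionary_type_id, definition):
    """Insert a definition; return False if a foreign key is violated."""
    try:
        conn.execute(
            "INSERT INTO definitions (term_id, dictionary_type_id, definition) "
            "VALUES (?, ?, ?)",
            (term_id, dictionary_type_id, definition),
        )
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert(1, 1, "a small rodent")
also_accepted = try_insert(1, 2, "a pointing device")
rejected = try_insert(1, 99, "points at a dictionary type that does not exist")

print(accepted, also_accepted, rejected)
```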

edited Nov 30, 2009 at 13:57

answered Nov 30, 2009 at 13:25

Guffa

Disillusioned · Accepted Answer · 2009-12-06 19:10:22Z

The requirements here are far too vague, resulting in the 'accepted answer' being totally over-'solved'. The requirements need to provide more information about how the dictionaries will be used.

That said, working off the little provided; I'd go with a variation on #3.

Number 1 is perfectly viable if the dictionaries will be used entirely independently, and the only reason the concept of shared terms was mentioned is that it just happens to be a coincidental possibility.
Ditch 2; it unnecessarily leads to NULL values in columns, and DB designs don't like that.
Number 3 is the best, but ditch the artificial key and key on Term + Tag instead. The artificial key creates the possibility of duplicate entries (by Term + Tag). Beyond that, if no other tables reference TermDefinitions, the key is a waste; if something does, then it says (for example) "I'm referencing TermDefinition #3... Uhhm, whatever that is. :S"

In a nutshell, nothing provided so far in the requirement indicates any need for anything more complicated than option 3.
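
The suggestion to key on Term + Tag can be sketched as a composite primary key, which lets the database itself refuse a second definition for the same term in the same dictionary. SQLite via Python's sqlite3 stands in for MySQL; the table name `term_definitions` and the column names are assumptions for illustration.

```python
import sqlite3

# SQLite in memory stands in for MySQL; table and column names are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE term_definitions (
        term       TEXT NOT NULL,
        tag        TEXT NOT NULL,
        definition TEXT NOT NULL,
        -- No artificial id: the natural key is the (term, tag) pair.
        PRIMARY KEY (term, tag)
    );
""")

def add_definition(term, tag, definition):
    """Insert a definition; return False if (term, tag) already exists."""
    try:
        conn.execute(
            "INSERT INTO term_definitions (term, tag, definition) VALUES (?, ?, ?)",
            (term, tag, definition),
        )
        return True
    except sqlite3.IntegrityError:
        return False

first = add_definition("term1", "dic_1", "definition")
second = add_definition("term1", "dic_2", "definition")         # same term, other dictionary
duplicate = add_definition("term1", "dic_2", "another definition")  # same pair again

print(first, second, duplicate)
```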
