 |
|
 |
|
Next: Single or Multiple Tables?
|
| Author |
Message |
External

Since: Aug 15, 2008 Posts: 3
|
(Msg. 1) Posted: Fri Aug 15, 2008 1:13 pm
Post subject: Modeling independent attributes on m:n links Archived from groups: comp>databases (more info?)
|
|
|
Consider you have two tables - for example, "user" and "url". Now there
are several attributes for each m:n relation that can be set
independently; for example, "visited", "bookmarked" and "open on
startup". Some of these are only set for few of the m:n relations,
others can be set for most of them. Most are boolean, but a few might
carry data as well.
I see two basic ways of modeling this, both with advantages and
disadvantages.
1.) Build a standard m:n link table and add the attributes as additional
columns.
- Pro: Read queries stay simple
- Con: Write operations always have to consider that the row might not
exist yet and needs to be inserted first (using triggers or "on
duplicate key" constructs)
2.) For each attribute, add a separate m:n table. For boolean
attributes, an existing row means that the attribute is set. For
non-boolean attributes, add an extra data row.
- Pro: Clean separation of attributes, update operations are straightforward
- Con: Queries require more complex queries or extra views; possibly
reduced performance because more joins are required?
Which way of modeling do you prefer? Why, and/or in which situation? >> Stay informed about: Modeling independent attributes on m:n links |
|
| Back to top |
|
 |  |
External

Since: Apr 02, 2008 Posts: 68
|
(Msg. 2) Posted: Mon Aug 18, 2008 4:25 am
Post subject: Re: Modeling independent attributes on m:n links [Login to view extended thread Info.] Archived from groups: per prev. post (more info?)
|
|
|
Robert,
the way presented under 1) is called EAV design and is having a lot of
disadvantages - just search in this group and the sql server groups
and find a lot of discussions on this.
Number 2) is the usually accepted one. You should make sure to have
the proper primary keys - here I would use "URL" - and then you might
be able to omit a lot of joins if you just want to query the
corresponding web address.
brgds
Philipp Post >> Stay informed about: Modeling independent attributes on m:n links |
|
| Back to top |
|
 |  |
External

Since: Jan 11, 2008 Posts: 1089
|
(Msg. 3) Posted: Mon Aug 18, 2008 2:23 pm
Post subject: Re: Modeling independent attributes on m:n links [Login to view extended thread Info.] Archived from groups: per prev. post (more info?)
|
|
|
On Aug 15, 7:13 am, "Robert M." wrote:
> Consider you have two tables - for example, "user" and "url". Now there
> are several attributes for each m:n relation that can be set
> independently; for example, "visited", "bookmarked" and "open on
> startup". Some of these are only set for few of the m:n relations,
> others can be set for most of them. Most are boolean, but a few might
> carry data as well.
>
> I see two basic ways of modeling this, both with advantages and
> disadvantages.
>
> 1.) Build a standard m:n link table and add the attributes as additional
> columns.
>
> - Pro: Read queries stay simple
>
> - Con: Write operations always have to consider that the row might not
> exist yet and needs to be inserted first (using triggers or "on
> duplicate key" constructs)
>
> 2.) For each attribute, add a separate m:n table. For boolean
> attributes, an existing row means that the attribute is set. For
> non-boolean attributes, add an extra data row.
>
> - Pro: Clean separation of attributes, update operations are straightforward
>
> - Con: Queries require more complex queries or extra views; possibly
> reduced performance because more joins are required?
>
> Which way of modeling do you prefer? Why, and/or in which situation?
CREATE TABLE Users
(user_id CHAR(255) NOT NULL PRIMARY KEY,
..);
CREATE TABLE URLs
(url CHAR(255) NOT NULL PRIMARY KEY,
..);
CREATE TABLE <<table name here>>
(user_id CHAR(255) NOT NULL
REFERENCES Users,
url CHAR(255) NOT NULL
REFERENCES URLs,
url_status VARCHAR (20) NOT NULL
CHECK url_status IN ('visited', 'bookmarked' , 'open on
startup', ..),
PRIMARY KEY (user_id, url, url_status)); >> Stay informed about: Modeling independent attributes on m:n links |
|
| Back to top |
|
 |  |
External

Since: Jan 11, 2008 Posts: 1089
|
(Msg. 4) Posted: Wed Aug 20, 2008 7:48 am
Post subject: Re: Modeling independent attributes on m:n links [Login to view extended thread Info.] Archived from groups: per prev. post (more info?)
|
|
|
>> This is basically the EAV design Philipp Post mentioned, but without an extra status value table, right? <<
NO! DEAR GOD NO! This is a normalized skeleton schema in which
entities (users, urls) have their own tables, and the relationship
between them is its own table. Please Google up EAV so you will know
it when you see it.
>> The "value" column is omitted here,because there are only boolean parameters (so an existing row implicitly means "true")? <<
There are no Booleans in SQL; this is a predicate language (high
level, abstract, we detect fact from data) and not a Boolean flag
language (low level, very physical bit flags and Booleans set by
external users without regard to the actual state of the data).
You are supposed to sit down and actually design your encoding schemes
or (better) find industry standards for the attributes of the entities
or relationships you are modeling. >> Stay informed about: Modeling independent attributes on m:n links |
|
| Back to top |
|
 |  |
External

Since: Aug 15, 2008 Posts: 3
|
(Msg. 5) Posted: Wed Aug 20, 2008 9:20 am
Post subject: Re: Modeling independent attributes on m:n links [Login to view extended thread Info.] Archived from groups: per prev. post (more info?)
|
|
|
--CELKO-- wrote:
> CREATE TABLE Users
> (user_id CHAR(255) NOT NULL PRIMARY KEY,
> ..);
>
> CREATE TABLE URLs
> (url CHAR(255) NOT NULL PRIMARY KEY,
> ..);
>
> CREATE TABLE <<table name here>>
> (user_id CHAR(255) NOT NULL
> REFERENCES Users,
> url CHAR(255) NOT NULL
> REFERENCES URLs,
> url_status VARCHAR (20) NOT NULL
> CHECK url_status IN ('visited', 'bookmarked' , 'open on
> startup', ..),
> PRIMARY KEY (user_id, url, url_status));
This is basically the EAV design Philipp Post mentioned, but without an
extra status value table, right? The "value" column is omitted here,
because there are only boolean parameters (so an existing row implicitly
means "true")?
- Robert >> Stay informed about: Modeling independent attributes on m:n links |
|
| Back to top |
|
 |  |
External

Since: Aug 15, 2008 Posts: 3
|
(Msg. 6) Posted: Wed Aug 20, 2008 9:20 am
Post subject: Re: Modeling independent attributes on m:n links [Login to view extended thread Info.] Archived from groups: per prev. post (more info?)
|
|
|
Philipp Post wrote:
> Robert,
>
> the way presented under 1) is called EAV design and is having a lot of
> disadvantages - just search in this group and the sql server groups
> and find a lot of discussions on this.
>
> Number 2) is the usually accepted one. You should make sure to have
> the proper primary keys - here I would use "URL" - and then you might
> be able to omit a lot of joins if you just want to query the
> corresponding web address.
>
Thanks for your reply
- Robert >> Stay informed about: Modeling independent attributes on m:n links |
|
| Back to top |
|
 |  |
External

Since: Apr 02, 2008 Posts: 68
|
(Msg. 7) Posted: Thu Aug 21, 2008 2:37 am
Post subject: Re: Modeling independent attributes on m:n links [Login to view extended thread Info.] Archived from groups: per prev. post (more info?)
|
|
|
For good order sake, here the design I had in mind. Although it was
not mentioned by Robert, I assumed a data management for a web-browser
like application where the relationships between user <-> url have its
own business rules and further attributes:
CREATE TABLE Users
(user_id CHAR(  NOT NULL PRIMARY KEY,
first_name VARCHAR(30) NOT NULL,
last_name VARCHAR(30) NOT NULL);
CREATE TABLE URLs
(url VARCHAR(255) NOT NULL PRIMARY KEY
CHECK (url LIKE 'http://%' OR url LIKE 'https://%' OR url LIKE 'ftp://
%'),
url_description VARCHAR(50) NOT NULL); -- can have a global
description
-- each user can just have one startup URL
CREATE TABLE Startup_URLs
(user_id CHAR(  NOT NULL UNIQUE
REFERENCES Users(user_id)
ON UPDATE CASCADE
ON DELETE CASCADE,
url VARCHAR(255) NOT NULL
REFERENCES URLs(url)
ON UPDATE CASCADE
ON DELETE CASCADE,
-- no further columns needed here
PRIMARY KEY (user_id, url));
-- user can bookmark as much urls as he like, but not the same twice
CREATE TABLE Bookmarked_URLs
(user_id CHAR(  NOT NULL
REFERENCES Users(user_id)
ON UPDATE CASCADE
ON DELETE CASCADE,
url VARCHAR(255) NOT NULL
REFERENCES URLs(url)
ON UPDATE CASCADE
ON DELETE CASCADE,
bookmark_url_description VARCHAR(50) NOT NULL, -- user can set its
own description for display
bookmark_date DATETIME DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (user_id, url)); -- not the same twice enforced here
-- this is the history what web sites were visited chronologically
-- each site can be visited more than once
CREATE TABLE Visited_URLs
(user_id CHAR(  NOT NULL
REFERENCES Users(user_id)
ON UPDATE CASCADE
ON DELETE CASCADE,
url VARCHAR(255) NOT NULL
REFERENCES URLs(url)
ON UPDATE CASCADE
ON DELETE CASCADE,
visit_date DATETIME DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (user_id, url, visit_date));
brgs
Philipp Post >> Stay informed about: Modeling independent attributes on m:n links |
|
| Back to top |
|
 |  |
External

Since: Aug 21, 2008 Posts: 6
|
(Msg. 8) Posted: Thu Aug 21, 2008 6:25 pm
Post subject: Re: Modeling independent attributes on m:n links [Login to view extended thread Info.] Archived from groups: per prev. post (more info?)
|
|
|
On Aug 20, 11:48 am, --CELKO-- wrote:
> >> This is basically the EAV design Philipp Post mentioned, but without an extra status value table, right? <<
>
> NO! DEAR GOD NO! This is a normalized skeleton schema in which
> entities (users, urls) have their own tables, and the relationship
> between them is its own table. Please Google up EAV so you will know
> it when you see it.
>
> >> The "value" column is omitted here,because there are only boolean parameters (so an existing row implicitly means "true")? <<
>
> There are no Booleans in SQL; this is a predicate language (high
> level, abstract, we detect fact from data)
Isn't that compatible with the idea expressed above that the existence
of a row implies a [in this case Boolean] fact?
> and not a Boolean flag
> language (low level, very physical bit flags and Booleans set by
> external users without regard to the actual state of the data).
>
> You are supposed to sit down and actually design your encoding schemes
> or (better) find industry standards for the attributes of the entities
> or relationships you are modeling. >> Stay informed about: Modeling independent attributes on m:n links |
|
| Back to top |
|
 |  |
External

Since: Jan 11, 2008 Posts: 1089
|
(Msg. 9) Posted: Fri Aug 22, 2008 8:04 am
Post subject: Re: Modeling independent attributes on m:n links [Login to view extended thread Info.] Archived from groups: per prev. post (more info?)
|
|
|
This is often a subtle thing for people who did not get a dose of
"Scales and Measurements" theory in their database class. The columns
in an RDBMS are assumed to be scalar values. That means they have a
scale, the scale's values are drawn from a domain and a few other
things.
A "scalar value" is not the same as "atomic value" and this also
causes confusion in data modeling. An atomic value cannot be
decomposed without destroying meaning. Therefore a scalar value is
always atomic. But consider (longitude, latitude), or (x,y,z) co-
ordinates, or (start_time, end_time) durations. They have scalar
components, but the tuple is atomic; pull out latitude and all you can
say is that your position is one point of infinitely many points
between the North and South Poles.
Flags are not the same as a two-valued scale. For example, Rh blood
factor is either positive or negative. Nobody would keep medical
records with a "is_positive" and/or "is_negative" column(s). Those
would be flags and not scales. See the difference?
Another disadvantage is that flags become insanely complex even for
simple situations. How many times have you seen "is_male" in poorly
designed schemas? My favorite is to pull out the ISO sex_code, which
uses (0= unknown, 1= male, 2= female, 9= lawful person (corporations,
organizations, etc.)) Suddenly, "is_male" does not work when we are
doing business with IBM. Add an "is_corporation" flag and you need
rules about ignoring "is_male" when ("is_corporation" = TRUE). Soon
you are engaged in playing "20-Questions" for even the simplest
query. Instead of adding a new value to the scale, you add 2^n
possible check constraints. Or do what most Newbies do; forget about
data integrity and pray that all applications -- present and future --
are perfect.
Did you notice that in this example I have to implement a business
rule that I must visit a website before I can bookmark it? I can
enforce that rule with a state transition constraint without much
trouble and easily change my state transitions. Now do that with
flags. You wind up with triggers and a ton of procedural code that is
a bitch to maintain.
This is really too much to put into a newsgroup posting; you really
need a class or a book for this. IT does not fit into 500 words or
less. >> Stay informed about: Modeling independent attributes on m:n links |
|
| Back to top |
|
 |  |
External

Since: Apr 02, 2008 Posts: 68
|
(Msg. 10) Posted: Tue Aug 26, 2008 11:32 am
Post subject: Re: Modeling independent attributes on m:n links [Login to view extended thread Info.] Archived from groups: per prev. post (more info?)
|
|
|
Axel,
> That is certainly not an EAV design <
Indeed this was not a good comparison. My apologies if this introduced
some confusion.
To come back to Roberts question as to which design I would prefer
when (without boolean flags!). I can basically see two situations:
1) Some simple m:n relationships which just differ by some sort of
"relation_type" column and carry no further attributes with them. This
will fit well into one table (see Joes example)
2) Several independent relationships between the same tables which are
very different creatures (see the example I was giving) having
different business rules and their own attributes. They might not fit
well into one table, unless you get rid of most constraints and might
also be unable to assign a proper primary key. Further you would
introduce columns which apply to one relationship, but not to the
other. That was how I was understanding the initial post and therefore
was thinking at the mindset of "one-table-does-it-all" which also
drives EAV and OTLT/MUCT.
brgds
Philipp Post >> Stay informed about: Modeling independent attributes on m:n links |
|
| Back to top |
|
 |  |
External

Since: Aug 26, 2008 Posts: 1
|
(Msg. 11) Posted: Tue Aug 26, 2008 1:38 pm
Post subject: Re: Modeling independent attributes on m:n links [Login to view extended thread Info.] Archived from groups: per prev. post (more info?)
|
|
|
Philipp Post wrote:
> Robert,
>
> the way presented under 1) is called EAV design and is having a lot of
> disadvantages - just search in this group and the sql server groups
> and find a lot of discussions on this.
That is certainly not an EAV design, as far as I can see, since each
attribute would have a dedicated, well um, attribute. In an EAV design
you store attribute values as name-value pairs.
Axel Hallez >> Stay informed about: Modeling independent attributes on m:n links |
|
| Back to top |
|
 |  |
| Related Topics: | Erwin modeling basic question - Hi, Can anyone tell me when doing db design in ERWin, how do you make the table column area longer so more attributes can be shown in the diagram? I thought it'd be as simple as dragging the lower corner of the table, but it didn't work. And I didn't....
number of attributes - Is there a maximum to how many attributes a table should have? I'm implementing a user-based web-application for school. Should I, for example, put all the user characteristics and preferences in one table or should I split it in multiple tables.
Call for Papers: The 2008 International Conference on Mode.. - Dear Colleagues: I would appreciate if you would share the announcement below with those who might be interested. Best regards, A. M. G. Solo Publicity Chair, MSV'08 ----------------------- Extended Paper Submission Deadline: March 10, 2008 ...
Multi part attributes design question - I am trying to improve the design of a table and was hoping this might be the correct group in which to ask for help. The table is called PartCodes and has three columns: ProductID, PartCode, SequenceNumber Each product may have multiple codes. Each..
History function - G'day, First of all, I'm using PostgreSQL 8.0.1 on a Debian Linux machine. I've got to set up a database which as well features some sort of history function. For example, if there's a table tbl_persons like: id | name | surname | ... there should b... |
|
You can post new topics in this forum You can reply to topics in this forum You can edit your posts in this forum You can delete your posts in this forum You can vote in polls in this forum
|
|
|
|
 |
|
|