Welcome to dbFreaks.com!
FAQFAQ    SearchSearch      ProfileProfile    Private MessagesPrivate Messages   Log inLog in

Best way to design table to store attributes?

 
   Database Help (Home) -> Technology and Theory RSS
Next:  Getting function return type(s) from SMI  
Author Message
carl.henthorn

External


Since: Jan 22, 2009
Posts: 2



(Msg. 1) Posted: Thu Jan 22, 2009 12:35 pm
Post subject: Best way to design table to store attributes?
Archived from groups: comp>databases>theory (more info?)

I have an entity that along with all of the usual attributes, has a
set of 140 bit flag attributes that I need to track all yes/no values
of. I started going down the path of creating a table 150 columns
wide, but after typing a few of them thought there may be a better
way. Smile
I am thinking of creating a table like this:
Create table details (
DetailsID int identity(1000,1),
Name nvarchar(40),
Descr nvarchar(60),
Category nvarchar(40) )
Thus I can take what would have been my column headers and use them in
Name field, throw what would have been the bit flag value and throw
that into the description field. I would tie them together with a
category value to combine them into logical groups.
I have around 1.2 million entities to store this info for, so this new
table will get long fast.

Another passing thought was to store these sets of values as XML for
each entity. I have to admit I dont know what the advantage is for
that other than the row count savings. how easy would it be to search
through a 140 field xml blob for 1 million entites?

Any advice is appreciated on how to tackle this.

 >> Stay informed about: Best way to design table to store attributes? 
Back to top
Login to vote
carl.henthorn

External


Since: Jan 22, 2009
Posts: 2



(Msg. 2) Posted: Thu Jan 22, 2009 4:18 pm
Post subject: Re: Best way to design table to store attributes? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

On Jan 22, 12:53 pm, Bob Badour wrote:
> wrote:
> > I have an entity that along with all of the usual attributes, has a
> > set of 140 bit flag attributes that I need to track all yes/no values
> > of. I started going down the path of creating a table 150 columns
> > wide, but after typing a few of them thought there may be a better
> > way. Smile
> > I am thinking of creating a table like this:
> > Create table details (
> >   DetailsID int identity(1000,1),
> >   Name nvarchar(40),
> >   Descr nvarchar(60),
> >   Category nvarchar(40) )
> > Thus I can take what would have been my column headers and use them in
> > Name field, throw what would have been the bit flag value and throw
> > that into the description field. I would tie them together with a
> > category value to combine them into logical groups.
> > I have around 1.2 million entities to store this info for, so this new
> > table will get long fast.
>
> > Another passing thought was to store these sets of values as XML for
> > each entity. I have to admit I dont know what the advantage is for
> > that other than the row count savings. how easy would it be to search
> > through a 140 field xml blob for 1 million entites?
>
> > Any advice is appreciated on how to tackle this.
>
> Before folks start pointing and laughing and snickering about you
> re-inventing EAV yet again, I thought I would point out that the
> simplest logical representation for a boolean in a relational system is
> simply the existence of a tuple in a relation.
>
> Thus, a design that has 150 tables each with a single column
> theoretically suffices.
>
> On the other hand, such a design would clearly violate the POOD. Some
> might question the principle of orthogonal design, but I think one
> ignores it at one's own risk when creating 150 tables with identical
> predicates.
>
> At a more basic level, though, are you sure you have correctly modelled
> your problem? 150 independent booleans creates a state machine with
> somewhere around 10^45 states. That's a big state machine. Without any
> sort of transition constraints, that creates a fully connected state
> machine with 10^45 states and somewhere around 10^90 allowable transitions.
>
> Those are big numbers, and it seems unlikely you really need such an
> unwieldy state machine for each row of your table.- Hide quoted text -
>
> - Show quoted text -

Thank you for bypassing the hazing of the rookie, I appreciate it
greatly! Smile
For the purposes of this table, my entity is a person, and the 150 bit
fields are a combination of columns describing all of the travel
interests that they may have in a "yes I am interested", or "No, I am
not interested" format. While it is possible to have a person
interested in all 150 travel destinations, it is very unlikely, thus
giving me a subset to work with. Since the set of "yes" values
determines which marketing pieces get sent out, I do need to keep
track of those, obviously, but I do not see the reason for keeping
track of the "no's" since marketing would not be sent to them anyways.
In my proposed table the absense of a row is the same as a "No" value.

Being moslty self taught, I was not familier with the term EAV. But
after looking it up, I can see that my issue is very simualr to a
patient-symptom set of data for a doctor. Sounds like this is the way
to go after all.

I still have to wonder if there is a better way, however, to store/
read that info. Perhaps a quasi-binary where each bit represents a
field in a footprint? or even the XML blob.
thank you again for the response!

 >> Stay informed about: Best way to design table to store attributes? 
Back to top
Login to vote
Bob Badour

External


Since: Jan 15, 2008
Posts: 1017



(Msg. 3) Posted: Thu Jan 22, 2009 4:25 pm
Post subject: Re: Best way to design table to store attributes? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

wrote:

> I have an entity that along with all of the usual attributes, has a
> set of 140 bit flag attributes that I need to track all yes/no values
> of. I started going down the path of creating a table 150 columns
> wide, but after typing a few of them thought there may be a better
> way. Smile
> I am thinking of creating a table like this:
> Create table details (
> DetailsID int identity(1000,1),
> Name nvarchar(40),
> Descr nvarchar(60),
> Category nvarchar(40) )
> Thus I can take what would have been my column headers and use them in
> Name field, throw what would have been the bit flag value and throw
> that into the description field. I would tie them together with a
> category value to combine them into logical groups.
> I have around 1.2 million entities to store this info for, so this new
> table will get long fast.
>
> Another passing thought was to store these sets of values as XML for
> each entity. I have to admit I dont know what the advantage is for
> that other than the row count savings. how easy would it be to search
> through a 140 field xml blob for 1 million entites?
>
> Any advice is appreciated on how to tackle this.

Before folks start pointing and laughing and snickering about you
re-inventing EAV yet again, I thought I would point out that the
simplest logical representation for a boolean in a relational system is
simply the existence of a tuple in a relation.

Thus, a design that has 150 tables each with a single column
theoretically suffices.

On the other hand, such a design would clearly violate the POOD. Some
might question the principle of orthogonal design, but I think one
ignores it at one's own risk when creating 150 tables with identical
predicates.

At a more basic level, though, are you sure you have correctly modelled
your problem? 150 independent booleans creates a state machine with
somewhere around 10^45 states. That's a big state machine. Without any
sort of transition constraints, that creates a fully connected state
machine with 10^45 states and somewhere around 10^90 allowable transitions.

Those are big numbers, and it seems unlikely you really need such an
unwieldy state machine for each row of your table.
 >> Stay informed about: Best way to design table to store attributes? 
Back to top
Login to vote
Brian Selzer

External


Since: Jan 15, 2008
Posts: 527



(Msg. 4) Posted: Thu Jan 22, 2009 7:58 pm
Post subject: Re: Best way to design table to store attributes? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

wrote in message

On Jan 22, 12:53 pm, Bob Badour wrote:
> > wrote:
> > > I have an entity that along with all of the usual attributes, has a
> > > set of 140 bit flag attributes that I need to track all yes/no values
> > > of. I started going down the path of creating a table 150 columns
> > > wide, but after typing a few of them thought there may be a better
> > > way. Smile
> > > I am thinking of creating a table like this:
> > > Create table details (
> > > DetailsID int identity(1000,1),
> > > Name nvarchar(40),
> > > Descr nvarchar(60),
> > > Category nvarchar(40) )
> > > Thus I can take what would have been my column headers and use them in
> > > Name field, throw what would have been the bit flag value and throw
> > > that into the description field. I would tie them together with a
> > > category value to combine them into logical groups.
> > > I have around 1.2 million entities to store this info for, so this new
> > > table will get long fast.
> >
> > > Another passing thought was to store these sets of values as XML for
> > > each entity. I have to admit I dont know what the advantage is for
> > > that other than the row count savings. how easy would it be to search
> > > through a 140 field xml blob for 1 million entites?
> >
> > > Any advice is appreciated on how to tackle this.
> >
> > Before folks start pointing and laughing and snickering about you
> > re-inventing EAV yet again, I thought I would point out that the
> > simplest logical representation for a boolean in a relational system is
> > simply the existence of a tuple in a relation.
> >
> > Thus, a design that has 150 tables each with a single column
> > theoretically suffices.
> >
> > On the other hand, such a design would clearly violate the POOD. Some
> > might question the principle of orthogonal design, but I think one
> > ignores it at one's own risk when creating 150 tables with identical
> > predicates.
> >
> > At a more basic level, though, are you sure you have correctly modelled
> > your problem? 150 independent booleans creates a state machine with
> > somewhere around 10^45 states. That's a big state machine. Without any
> > sort of transition constraints, that creates a fully connected state
> > machine with 10^45 states and somewhere around 10^90 allowable
> > transitions.
> >
> > Those are big numbers, and it seems unlikely you really need such an
> > unwieldy state machine for each row of your table.- Hide quoted text -
> >
> > - Show quoted text -
>
> Thank you for bypassing the hazing of the rookie, I appreciate it
> greatly! Smile
> For the purposes of this table, my entity is a person, and the 150 bit
> fields are a combination of columns describing all of the travel
> interests that they may have in a "yes I am interested", or "No, I am
> not interested" format. While it is possible to have a person
> interested in all 150 travel destinations, it is very unlikely, thus
> giving me a subset to work with. Since the set of "yes" values
> determines which marketing pieces get sent out, I do need to keep
> track of those, obviously, but I do not see the reason for keeping
> track of the "no's" since marketing would not be sent to them anyways.
> In my proposed table the absense of a row is the same as a "No" value.
>
> Being moslty self taught, I was not familier with the term EAV. But
> after looking it up, I can see that my issue is very simualr to a
> patient-symptom set of data for a doctor. Sounds like this is the way
> to go after all.
>
> I still have to wonder if there is a better way, however, to store/
> read that info. Perhaps a quasi-binary where each bit represents a
> field in a footprint? or even the XML blob.
> thank you again for the response!

Use three tables, one for persons, one for destinations, and one for
interests which maps persons to destinations. The interests table would
have one row {person, destination} for each destination that a person is
actually interested in.
 >> Stay informed about: Best way to design table to store attributes? 
Back to top
Login to vote
patrick61z

External


Since: Oct 31, 2008
Posts: 27



(Msg. 5) Posted: Fri Jan 23, 2009 12:27 pm
Post subject: Re: Best way to design table to store attributes? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

On Jan 22, 3:53 pm, Bob Badour wrote:
> At a more basic level, though, are you sure you have correctly modelled
> your problem? 150 independent booleans creates a state machine with
> somewhere around 10^45 states. That's a big state machine. Without any
> sort of transition constraints, that creates a fully connected state
> machine with 10^45 states and somewhere around 10^90 allowable transitions.
>
> Those are big numbers, and it seems unlikely you really need such an
> unwieldy state machine for each row of your table.

I've often wondered about this line of thinking, that if your system
isn't implemented in a purely relational methodology, you have no
choice but to implement it as a state machine. However, I think that
I'm settling on the idea that its more the malfunction of the
relational advocate that they aren't able to handle situations outside
of their admittedly good relational foundations.

Just because the original design used what appears to be a perfectly
decent instance of repeating groups does not mean the designer is then
condemned to the complexity of Mr. Badour's imaginary zillion states.

While the DATA in these tables could have a zillion states, thats like
saying that an ssn field has 10^9 states because it has nine
characters that can range from zero to nine. Yes, there may be many
many possible values, but simple payroll programs for example are not
burdened with a zillion states as either entire ranges of values are
handled identically, or the unneeded values are simply never entered
into storage.

C'mon people, programming 101 here.

Does anybody have knowledge of the origin of this silly "state
machine" argument? I'm really curious why it carries any weight in
these sort of discussions but I also imagine that it originally was a
valid point offered by someone a bit more academically inclined than
Mr. Badour, and I'm genuinely interested in reading about it.
 >> Stay informed about: Best way to design table to store attributes? 
Back to top
Login to vote
patrick61z

External


Since: Oct 31, 2008
Posts: 27



(Msg. 6) Posted: Fri Jan 23, 2009 12:39 pm
Post subject: Re: Best way to design table to store attributes? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

On Jan 22, 7:18 pm, wrote:
> On Jan 22, 12:53 pm, Bob Badour wrote:
>
>
>
> > wrote:
> > > I have an entity that along with all of the usual attributes, has a
> > > set of 140 bit flag attributes that I need to track all yes/no values
> > > of. I started going down the path of creating a table 150 columns
> > > wide, but after typing a few of them thought there may be a better
> > > way. Smile
> > > I am thinking of creating a table like this:
> > > Create table details (
> > > DetailsID int identity(1000,1),
> > > Name nvarchar(40),
> > > Descr nvarchar(60),
> > > Category nvarchar(40) )
> > > Thus I can take what would have been my column headers and use them in
> > > Name field, throw what would have been the bit flag value and throw
> > > that into the description field. I would tie them together with a
> > > category value to combine them into logical groups.
> > > I have around 1.2 million entities to store this info for, so this new
> > > table will get long fast.
>
> > > Another passing thought was to store these sets of values as XML for
> > > each entity. I have to admit I dont know what the advantage is for
> > > that other than the row count savings. how easy would it be to search
> > > through a 140 field xml blob for 1 million entites?
>
> > > Any advice is appreciated on how to tackle this.
>
> > Before folks start pointing and laughing and snickering about you
> > re-inventing EAV yet again, I thought I would point out that the
> > simplest logical representation for a boolean in a relational system is
> > simply the existence of a tuple in a relation.
>
> > Thus, a design that has 150 tables each with a single column
> > theoretically suffices.
>
> > On the other hand, such a design would clearly violate the POOD. Some
> > might question the principle of orthogonal design, but I think one
> > ignores it at one's own risk when creating 150 tables with identical
> > predicates.
>
> > At a more basic level, though, are you sure you have correctly modelled
> > your problem? 150 independent booleans creates a state machine with
> > somewhere around 10^45 states. That's a big state machine. Without any
> > sort of transition constraints, that creates a fully connected state
> > machine with 10^45 states and somewhere around 10^90 allowable transitions.
>
> > Those are big numbers, and it seems unlikely you really need such an
> > unwieldy state machine for each row of your table.- Hide quoted text -
>
> > - Show quoted text -
>
> Thank you for bypassing the hazing of the rookie, I appreciate it
> greatly! Smile
> For the purposes of this table, my entity is a person, and the 150 bit
> fields are a combination of columns describing all of the travel
> interests that they may have in a "yes I am interested", or "No, I am
> not interested" format. While it is possible to have a person
> interested in all 150 travel destinations, it is very unlikely, thus
> giving me a subset to work with. Since the set of "yes" values
> determines which marketing pieces get sent out, I do need to keep
> track of those, obviously, but I do not see the reason for keeping
> track of the "no's" since marketing would not be sent to them anyways.
> In my proposed table the absense of a row is the same as a "No" value.
>
> Being moslty self taught, I was not familier with the term EAV. But
> after looking it up, I can see that my issue is very simualr to a
> patient-symptom set of data for a doctor. Sounds like this is the way
> to go after all.
>
> I still have to wonder if there is a better way, however, to store/
> read that info. Perhaps a quasi-binary where each bit represents a
> field in a footprint? or even the XML blob.
> thank you again for the response!

I would use repeating groups, or simply use a blob. I ran into the
same situation and had 150 or so "fields" with a range of a few values
(I went ahead and simple used a character position and it worked
fine).

You don't have to do everything in your database layer, and even if
you did, many sql implementations give you tools to programmatically
dissect the field to extract further values that could then be used in
sql queries themselves.

If all you have knowledge of is RDBMS tools then yes, you are going to
not do well with problem domains that aren't as well served by RDBMS
tools. But then again you don't see SQL specialists designing video
codecs either, so theres no real dilemma despite some of the noise I
see posted in this group protesting otherwise.
 >> Stay informed about: Best way to design table to store attributes? 
Back to top
Login to vote
-CELKO-

External


Since: Jan 31, 2008
Posts: 63



(Msg. 7) Posted: Sat Jan 24, 2009 8:26 am
Post subject: Re: Best way to design table to store attributes? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

-As I recall, there is a standard code for destinations in the travel
industry, so you would have a two-column, 5NF table with (customer_id,
destination_code) as the key and name it something like Vacations.
 >> Stay informed about: Best way to design table to store attributes? 
Back to top
Login to vote
patrick61z

External


Since: Oct 31, 2008
Posts: 27



(Msg. 8) Posted: Sat Jan 24, 2009 1:44 pm
Post subject: Re: Best way to design table to store attributes? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

On Jan 24, 6:57 pm, Bob Badour wrote:

>>> Those are big numbers, and it seems unlikely you really need such an
>>> unwieldy state machine for each row of your table.

then I ask

>> Does anybody have knowledge of the origin of this silly "state
>> machine" argument?

then he wrote

> Yes, of course, we all do. It originated in Patrick's skull as a figment
> of his imagination. I doubt any argument exists beyond that scope.

Oh well, I'm not sure Bob's willing to educate me on this "state
machine" business. But I'm willing to read if anyone else wants to
dumb Bob's text down for me since I'm willing to accept that I'm so
dumb that Bob cannot stoop to my level. So what are we really talking
about here?

>At a more basic level, though, are you sure you have correctly modelled
>your problem? 150 independent booleans creates a state machine with
>somewhere around 10^45 states. That's a big state machine. Without any
>sort of transition constraints, that creates a fully connected state
>machine with 10^45 states and somewhere around 10^90 allowable transitions.

Anyone else? Why are we worried about state here? Carl, if you're
still reading, why are you writing a state machine? Or are you?
 >> Stay informed about: Best way to design table to store attributes? 
Back to top
Login to vote
David BL

External


Since: Jan 22, 2008
Posts: 177



(Msg. 9) Posted: Sun Jan 25, 2009 5:22 pm
Post subject: Re: Best way to design table to store attributes? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

On Jan 25, 3:57 am, Bob Badour wrote:
> paul c wrote:
> > wrote:
>
> >> While the DATA in these tables could have a zillion states, thats like
> >> saying that an ssn field has 10^9 states because it has nine
> >> characters that can range from zero to nine.
>
> Yes, exactly like that. One of my high school french teachers would have
> reponded to that with "You have a fine grasp of the obvious." Except
> elsewhere Patrick utterly fails to demonstrate any such grasp.
>
> I suggest one stop to ponder that a billion seems like a large number
> but it is infinitesimally small compared to 10^45, which is itself
> infinitesimally small comparet to 10^90.
>
> >> Yes, there may be many
> >> many possible values, but simple payroll programs for example are not
> >> burdened with a zillion states as either entire ranges of values are
> >> handled identically, or the unneeded values are simply never entered
> >> into storage.
>
> That depends on the payroll program. ADP regularly handles on the order
> of 10^7 social security numbers, and the IRS handles on the order of
> 10^8 or 10^9 of them--when one stops and considers that even
> non-resident aliens like myself sometimes have one.
>
> Nevertheless, 10^45 states is a vast number of states. It is on the
> order of the mass of the sun in picograms. However, Carl's explanation
> suggests the first thing anyone will do with that data is dimension
> reduction.

Bob's 10^45 states argument appeared in his first response to Carl's
original post and in that post I can't see any suggestion that "the
first thing anyone will do is data reduction". Nevertheless, I don't
have a problem with Bob asking the question of whether it really makes
sense to record so many independent boolean attributes.

Bob needs to refine his apparent claim that an excessive number of
possible states suggest poor design. For example, a 1 mega-pixel 24
bpp image has about 10^(7 million) possible states.
 >> Stay informed about: Best way to design table to store attributes? 
Back to top
Login to vote
DBMS_Plumber

External


Since: Mar 07, 2008
Posts: 7



(Msg. 10) Posted: Tue Jan 27, 2009 8:48 pm
Post subject: Re: Best way to design table to store attributes? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

On Jan 24, 1:44 pm, wrote:

> Anyone else? Why are we worried about state here? Carl, if you're
> still reading, why are you writing a state machine? Or are you?

You can say:

"it's a graph"

or

"it's a sparse matrix"

But the point is, it's a mapping from set A to set B.

And Bob's point is that there are other ways to model mappings,
especially the kind of sparse mapping associated with human
preferences among 150 choices, than as a dense matrix.
 >> Stay informed about: Best way to design table to store attributes? 
Back to top
Login to vote
Display posts from previous:   
Related Topics:
Little design mistakes that can be easily avoided (1): Con.. - Hi, One of the most frequent misconception I face when working is about how concatenation became a synonym for addition of attributes. Lately, I have had the following example in a board... COL1 COL2 ABC DEF AB CDEF It came to my...

Newbie question on table design. - Hi, I have a table T with -- let's say -- a million+ records. The application allows logical deletion of a small fraction of these records. (Un-deletion of deleted records is allowed if you have admin privs via a Priveleges table.) I can easily think...

RA with MV attributes - Here is a partial formalization of a relational algebra based on MV attributes. The approach appears simple and intuitive. In particular the join of two relations is rather elegant. Definition: An attribute a consists of an attribute-name N(a) and a....

Cardinality of relation with empty set of attributes is 1 - Proof: MVFD with empty antecedent, for example {} -> {X} | {Y,Z} claims that cardinality of the XYZ relation is the product of cardinalities of its projection onto the sets X and YZ of attributes. Take {X} to be an empty set, then: {} -> {} | {X...

Design question - Hi all, not sure this is right group. Hoping someone could give me so advice on my database design. What is the best way to join these tables Short version goes like this.... I have tables : Person or Employee, Dependent (person is fk in dep table) ....
   Database Help (Home) -> Technology and Theory All times are: Pacific Time (US & Canada)
Page 1 of 1

 
You can post new topics in this forum
You can reply to topics in this forum
You can edit your posts in this forum
You can delete your posts in this forum
You can vote in polls in this forum



[ Contact us | Terms of Service/Privacy Policy ]