Welcome to dbFreaks.com!
FAQFAQ    SearchSearch      ProfileProfile    Private MessagesPrivate Messages   Log inLog in

can pivot be applied to group by ?

 
   Database Help (Home) -> Programming RSS
Next:  Copying whole field of data from a table to other..  
Author Message
John A Grandy

External


Since: Jun 26, 2008
Posts: 80



(Msg. 1) Posted: Mon Jul 14, 2008 12:53 pm
Post subject: can pivot be applied to group by ?
Archived from groups: microsoft>public>sqlserver>programming (more info?)

For this table

Id bigint identity(1,1)
Attribute nvarchar(50) not null
PeriodNumber bigint not null
DataDateTime DateTime not null
Value decimal(18,5) not null



I need to retrieve each period as a row with its columns as a set of
attributes' values.

For example, if the attributes are 'Attr1' , 'Attr2' , 'Attr3' , 'Attr4' ,
my returned rowset should be :

PeriodNumber DataDateTime Attr1 Attr2 Attr3 Attr4
1 7/1/2008 1.45 1.75 1.20 1.59
2 7/2/2008 1.59 1.80 1.30 1.80
etc.




PIVOT seemed like a likely candidate, but I would need to GROUP BY
PeriodNumber, and it does not appear that PIVOT can be applied to the
individual groups.

I suppose a temp table is a possibility, but would involve initial insert
sql and update sql for each additional attribute. Could this be performant
?

Is there some way to accomplish this in a single sql statement ?



( And yes I realize this is an example of an EAV (Entity-Attribute-Value)
table which apparently is frowned upon but what other choice do I have if I
need to support a set of attributes only known at data-load-time ? )

 >> Stay informed about: can pivot be applied to group by ? 
Back to top
Login to vote
John A Grandy

External


Since: Jun 26, 2008
Posts: 80



(Msg. 2) Posted: Mon Jul 14, 2008 1:54 pm
Post subject: Re: can pivot be applied to group by ? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

I've coded 2 solutions :

1. using a temp table
2. using inner-joins of multiple derived tables (one for each attribute).

Performance is the issue.

I've read that using derived tables where possible can be twice as
performant as using temp tables ( in-memory ram vs tempdb on disk ).

However, could the reverse be true for very large resultsets ? If the total
memory needed by the derived tables forces paging to occur , could it be
that using an indexed temp table is actually faster ?

My resultsets could potentially be in the 100s of millions ...



"John A Grandy" <johnagrandy-at-gmail-dot-com> wrote in message

> For this table
>
> Id bigint identity(1,1)
> Attribute nvarchar(50) not null
> PeriodNumber bigint not null
> DataDateTime DateTime not null
> Value decimal(18,5) not null
>
>
>
> I need to retrieve each period as a row with its columns as a set of
> attributes' values.
>
> For example, if the attributes are 'Attr1' , 'Attr2' , 'Attr3' , 'Attr4' ,
> my returned rowset should be :
>
> PeriodNumber DataDateTime Attr1 Attr2 Attr3 Attr4
> 1 7/1/2008 1.45 1.75 1.20 1.59
> 2 7/2/2008 1.59 1.80 1.30 1.80
> etc.
>
>
>
>
> PIVOT seemed like a likely candidate, but I would need to GROUP BY
> PeriodNumber, and it does not appear that PIVOT can be applied to the
> individual groups.
>
> I suppose a temp table is a possibility, but would involve initial insert
> sql and update sql for each additional attribute. Could this be
> performant ?
>
> Is there some way to accomplish this in a single sql statement ?
>
>
>
> ( And yes I realize this is an example of an EAV (Entity-Attribute-Value)
> table which apparently is frowned upon but what other choice do I have if
> I need to support a set of attributes only known at data-load-time ? )
>

 >> Stay informed about: can pivot be applied to group by ? 
Back to top
Login to vote
--CELKO--

External


Since: Jan 11, 2008
Posts: 1089



(Msg. 3) Posted: Mon Jul 14, 2008 2:04 pm
Post subject: Re: can pivot be applied to group by ? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

>> ( And yes I realize this is an example of an EAV (Entity-Attribute-Value) table which apparently is frowned upon <<

Frowned? More like "spit upon", regarded as Cancer, etc.

>> .. but what other choice do I have if I need to support a set of attributes only known at data-load-time?) <<

0) RDBMS is not meant for a "I have no idea what the hell is going on
and I am not going to figure it out first!" environment. A better
question is why don't you know? This sounds like web-surfing random
porno sites to see what kind of new virus you can get into your
system.

1) Free text. This is what we did in the old days with pens and field
books when researching a totally new subject area. The idea was to
come up with the scales, measurements, data models, etc. to build a
file system or DBMS for it.

2) XML. This assumes that you have a hierarchical problem and don't
need the performance and power of SQL.

You keep posting HOW you want to kludge something together, but you
never post WHAT you are actually trying to do. What is the actual
problem you are trying to model?
 >> Stay informed about: can pivot be applied to group by ? 
Back to top
Login to vote
John A Grandy

External


Since: Jun 26, 2008
Posts: 80



(Msg. 4) Posted: Mon Jul 14, 2008 2:31 pm
Post subject: Re: can pivot be applied to group by ? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

Thank you so much for expanding my consciousness. I was not aware that I was
"kludging", rather I was under the apparently severe misimpression that I
was researching and designing.

I'm designing a system for working with per-period financial data where the
set of attribute values is unknown until load time of various data-sets.

For example, it might be bar data ( open, high, low, close, volume ).

Or it might be derivatives data ( equity price, near-strike near-month call
option price, near-strike near-month put option price, implied volatility,
etc.)

Ad-hoc design/add of additional tables for the particulars of a each set of
new data to be loaded is not an option.



"--CELKO--" wrote in message

>>> ( And yes I realize this is an example of an EAV
>>> (Entity-Attribute-Value) table which apparently is frowned upon <<
>
> Frowned? More like "spit upon", regarded as Cancer, etc.
>
>>> .. but what other choice do I have if I need to support a set of
>>> attributes only known at data-load-time?) <<
>
> 0) RDBMS is not meant for a "I have no idea what the hell is going on
> and I am not going to figure it out first!" environment. A better
> question is why don't you know? This sounds like web-surfing random
> porno sites to see what kind of new virus you can get into your
> system.
>
> 1) Free text. This is what we did in the old days with pens and field
> books when researching a totally new subject area. The idea was to
> come up with the scales, measurements, data models, etc. to build a
> file system or DBMS for it.
>
> 2) XML. This assumes that you have a hierarchical problem and don't
> need the performance and power of SQL.
>
> You keep posting HOW you want to kludge something together, but you
> never post WHAT you are actually trying to do. What is the actual
> problem you are trying to model?
>
 >> Stay informed about: can pivot be applied to group by ? 
Back to top
Login to vote
--CELKO--

External


Since: Jan 11, 2008
Posts: 1089



(Msg. 5) Posted: Mon Jul 14, 2008 3:10 pm
Post subject: Re: can pivot be applied to group by ? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

>> I'm designing a system for working with per-period financial data where the set of attribute values is unknown until load time of various data-sets. <<

You ought to know the attributes from each source and something about
the scales and units of measurement used.

>> For example, it might be bar data ( open, high, low, close, volume ). <<

That is one kind of table and you seem to have a start on the column
names for it ( I was looking for an ISIN, CUSIP or something for a
key).

>> Or it might be derivatives data .. <<

That is a second kind of table and you seem to have a start on the
column names for it, too.

I hope that it doesn't bother you that you don't put your squids and
your automobiles in the same table. But for your database, are these
really different creatures? That is, do you want to put differently
formatted data of the same kind into one table? Statisticians have
had this problem for centuries and have tools for it. There are some
simple things like converting time to UTC, measurements to common
scales (pounds to kilograms, gallons to liters, etc.), or time based
conversions (dollars to yen on a particular day). Then there is the
complex stuff to adjust for missing or erroneous data.

What you want is a data warehouse and some good data mining tools.
And with 100's of millions of data points, you want a RDBMS designed
for this kind of work.

>> Ad-hoc design/add of additional tables for the particulars of a each set of
new data to be loaded is not an option. <<

Option? It is a necessity for the staging tables, so you can scrub
and load it. How many different data sources do you have; my guess
is less than 100 -- there are not that many providers for financial
data.
 >> Stay informed about: can pivot be applied to group by ? 
Back to top
Login to vote
John A Grandy

External


Since: Jun 26, 2008
Posts: 80



(Msg. 6) Posted: Mon Jul 14, 2008 3:51 pm
Post subject: Re: can pivot be applied to group by ? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

It's not a question of standard financial data sources per se , it's a
question of end-users providing custom data in custom formatted files.

I have abandoned the idea of multiple datatypes. The only data I will handle
is decimal(18,5).

I may very well be proved to be wrong, but it seems possible to design a
performant system handling series data consisting of an arbitray number of
attributes all of identical datatype.

Perhaps 100 million row rowsets will not be performant but million row
rowsets should be.

If I use temp tables and/or joined derived tables to "pivot out my attribute
set", approximately how much less performant is this compared with a flat
table design ? e.g.

Id bigint identity(1,1)
BarNumber bigint
DataDateTime DateTime not null
Open decimal(18,5)
High decimal(18,5)
Low decimal(18,5)
Close decimal(18,5)
Volume decimal(18,5)





"--CELKO--" wrote in message

>>> I'm designing a system for working with per-period financial data where
>>> the set of attribute values is unknown until load time of various
>>> data-sets. <<
>
> You ought to know the attributes from each source and something about
> the scales and units of measurement used.
>
>>> For example, it might be bar data ( open, high, low, close, volume ).
>>> <<
>
> That is one kind of table and you seem to have a start on the column
> names for it ( I was looking for an ISIN, CUSIP or something for a
> key).
>
>>> Or it might be derivatives data .. <<
>
> That is a second kind of table and you seem to have a start on the
> column names for it, too.
>
> I hope that it doesn't bother you that you don't put your squids and
> your automobiles in the same table. But for your database, are these
> really different creatures? That is, do you want to put differently
> formatted data of the same kind into one table? Statisticians have
> had this problem for centuries and have tools for it. There are some
> simple things like converting time to UTC, measurements to common
> scales (pounds to kilograms, gallons to liters, etc.), or time based
> conversions (dollars to yen on a particular day). Then there is the
> complex stuff to adjust for missing or erroneous data.
>
> What you want is a data warehouse and some good data mining tools.
> And with 100's of millions of data points, you want a RDBMS designed
> for this kind of work.
>
>>> Ad-hoc design/add of additional tables for the particulars of a each set
>>> of
> new data to be loaded is not an option. <<
>
> Option? It is a necessity for the staging tables, so you can scrub
> and load it. How many different data sources do you have; my guess
> is less than 100 -- there are not that many providers for financial
> data.
>
>
 >> Stay informed about: can pivot be applied to group by ? 
Back to top
Login to vote
Plamen Ratchev

External


Since: Jan 10, 2008
Posts: 1007



(Msg. 7) Posted: Mon Jul 14, 2008 5:12 pm
Post subject: Re: can pivot be applied to group by ? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

Yes, you can group and pivot at the same time:

CREATE TABLE Attributes (
id BIGINT IDENTITY(1,1) PRIMARY KEY,
attribute NVARCHAR(50) NOT NULL,
periodnumber BIGINT NOT NULL,
datadatetime DATETIME NOT NULL,
value DECIMAL(18,5) NOT NULL);

INSERT INTO Attributes VALUES ('Attr1', 1, '20080701', 1.45);
INSERT INTO Attributes VALUES ('Attr2', 1, '20080701', 1.75);
INSERT INTO Attributes VALUES ('Attr3', 1, '20080701', 1.20);
INSERT INTO Attributes VALUES ('Attr4', 1, '20080701', 1.59);
INSERT INTO Attributes VALUES ('Attr1', 2, '20080702', 1.59);
INSERT INTO Attributes VALUES ('Attr2', 2, '20080702', 1.80);
INSERT INTO Attributes VALUES ('Attr3', 2, '20080702', 1.30);
INSERT INTO Attributes VALUES ('Attr4', 2, '20080702', 1.80);

SELECT periodnumber,
datadatetime,
MAX([Attr1]) AS Attr1,
MAX([Attr2]) AS Attr2,
MAX([Attr3]) AS Attr3,
MAX([Attr4]) AS Attr4
FROM Attributes
PIVOT
(MAX(value) FOR attribute IN
([Attr1], [Attr2], [Attr3], [Attr4])) AS P
GROUP BY periodnumber, datadatetime;

HTH,

Plamen Ratchev
http://www.SQLStudio.com
 >> Stay informed about: can pivot be applied to group by ? 
Back to top
Login to vote
--CELKO--

External


Since: Jan 11, 2008
Posts: 1089



(Msg. 8) Posted: Tue Jul 15, 2008 7:03 am
Post subject: Re: can pivot be applied to group by ? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

>> It's not a question of standard financial data sources per se , it's a question of end-users providing custom data in custom formatted files. <<

Have you looked at ETL tools? That would load your warehouse and
scrub the data fairly fast.

>> The only data I will handle is decimal(18,5). <<

Sound right for currency -- enough decimal places for calculations and
so big that you can hold amounts of Yen.

>> I may very well be proved to be wrong, but it seems possible to design a performant system handling series data consisting of an arbitrary number of attributes all of identical data type. <<

But how do you know that all of the attributes are of the same data
type? You cannot reasonably model currency amounts with temporal
data; look at the number of idiots on these newsgroups who keep
temporal data in strings, etc.

Picking the right data type is the first part of the data integrity
constraints. Screw it up and life is hell for the rest of the lifetime
of the system.

>> If I use temp tables and/or joined derived tables to "pivot out my attribute set", approximately how much less performant is this compared with a flat table design ? e.g. <<

Why not use a real as hoc spreadsheet instead of faking it in SQL?
 >> Stay informed about: can pivot be applied to group by ? 
Back to top
Login to vote
Display posts from previous:   
Related Topics:
What happens when a "NULL" condition is applied to the - Hi, I have below two tables "TAB1" & "TAB2": TAB1 ---- T1C1 T1C2 ---- ---- 1 A 2 B 3 C TAB2 ---- T2C1 T2C2 ---- ---- 2 D 3 E 4 F When I executed the below sql statement, it displayed the Cartesian product ...

error adding user when audit trigger applied - Hi I have a trigger that will audit all ddl events. When adding a user through SSMS and give the user permission to certain databases only not all databases I get the error below. If I grant access using the code below CREATE USER [Titan\john.dow] FOR...

select/isolate group from... where max(date in group) = so.. - In the following data I need to select (or isolate) the codes (code groups) which contain a max codeDate of '12/31/08'. Of the codes a, b, c, d -- only codes b and c contain a max codeDate of '12/31/08' in their respective groups. What would be the...

"IS NULL" check applied to a LOB field causes "lob logical.. - It looks like an "IS NULL" predicate, when applied to a LOB field, causes that LOB's pages to be read. This results in a very slow processing of queries like: SELECT (case when [myColumn] is not null then 1 else 0 end) as test1 FROM myTable ...

select rowcount per group before value in group = null - I need to select the top 3 rows per each group (xID -- order by xID). No problem here. But I need to select the count of rows in each group before a null value is encountered in the stat column -- ordering by xID CREATE TABLE #tmp1(rowID int..
   Database Help (Home) -> Programming All times are: Pacific Time (US & Canada)
Page 1 of 1

 
You can post new topics in this forum
You can reply to topics in this forum
You can edit your posts in this forum
You can delete your posts in this forum
You can vote in polls in this forum



[ Contact us | Terms of Service/Privacy Policy ]