Welcome to dbFreaks.com!
FAQFAQ    SearchSearch      ProfileProfile    Private MessagesPrivate Messages   Log inLog in

calculating correlation coefficient

 
   Database Help (Home) -> Datamining RSS
Next:  sp_prepexec syscomments?  
Author Message
csmba

External


Since: Feb 15, 2008
Posts: 1



(Msg. 1) Posted: Fri Feb 15, 2008 2:16 pm
Post subject: calculating correlation coefficient
Archived from groups: microsoft>public>sqlserver>datamining (more info?)

The end game is that I want a report to show me the correlation (number
between -1 to 1) between each 2 items in my transactional database.

1. Is there a way to have an SP or function that calculates correlation for
2 clms in a table?
2. is there a better way, so that I don't need to create fake tables (cause
I want correlation between lets say x,y,z and a,b,c, so that is:
x-a,x-b,x-c,y-a....)
3. would you tell me that the only way is to use Analysis Services?

What would be the recommendation that will let me at the end, deliver such
report?

thanks

 >> Stay informed about: calculating correlation coefficient 
Back to top
Login to vote
Dejan Sarka

External


Since: Mar 18, 2004
Posts: 103



(Msg. 2) Posted: Mon Feb 18, 2008 7:51 am
Post subject: Re: calculating correlation coefficient [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

Hi!

It is quite simple to calculate covariance, correlation coefficient and
coefficient of determination (squared correlation) with T-SQL queries. Here
is an example using vTargetMail view from AdventureWorksDW demo db:

/* Linear relationship */
USE AdventureWorksDW;
GO
-- Covariance
DECLARE @mean1 decimal(20,6)
DECLARE @mean2 decimal(20,6)
SELECT @mean1=AVG(YearlyIncome*1.0)
,@mean2=AVG(NumberCarsOwned*1.0)
FROM vTargetMail
SELECT CoVar=
SUM((YearlyIncome*1.0-@mean1)*(NumberCarsOwned*1.0-@mean2))/COUNT(*)
FROM dbo.vTargetMail
GO
-- Correlation and CD
DECLARE @mean1 decimal(20,6)
DECLARE @mean2 decimal(20,6)
SELECT @mean1=AVG(YearlyIncome*1.0)
,@mean2=AVG(NumberCarsOwned*1.0)
FROM vTargetMail
SELECT Correl=
(SUM((YearlyIncome*1.0-@mean1)*(NumberCarsOwned*1.0-@mean2))/COUNT(*))
/((STDEVP(YearlyIncome*1.0)*STDEVP(NumberCarsOwned*1.0))),
CD=SQUARE(
(SUM((YearlyIncome*1.0-@mean1)*(NumberCarsOwned*1.0-@mean2))/COUNT(*))
/((STDEVP(YearlyIncome*1.0)*STDEVP(NumberCarsOwned*1.0))))
FROM vTargetMail
GO

--
Dejan Sarka
http://blogs.solidq.com/EN/dsarka/default.aspx

"csmba" wrote in message

> The end game is that I want a report to show me the correlation (number
> between -1 to 1) between each 2 items in my transactional database.
>
> 1. Is there a way to have an SP or function that calculates correlation
> for 2 clms in a table?
> 2. is there a better way, so that I don't need to create fake tables
> (cause I want correlation between lets say x,y,z and a,b,c, so that is:
> x-a,x-b,x-c,y-a....)
> 3. would you tell me that the only way is to use Analysis Services?
>
> What would be the recommendation that will let me at the end, deliver such
> report?
>
> thanks
>
>

 >> Stay informed about: calculating correlation coefficient 
Back to top
Login to vote
Display posts from previous:   
Related Topics:
Collation Problem any help - Hello all : My problem is like the following I have an online databse with a latin collation, this database is using ntext and nvarchar values that was with our previous admin ok the problem is in the data is stored should use another collation (the...

Reprocessing a mining model - I'm currently developing a user registration prototype. The aim is to classify certain types of users. However, after the registration I query a decision tree model in Analysis Services with DMX and I want to send the user data into the warehouse to....

Trying to cut down query times - I have a query that is based on two other queries that are based on tables in a database. All of the queries that are used for this final query use grouping and totals. In the application I am making, I need to choose a date range to select the subset....

Update serial number in database - Hi, I want to update specific fields in a existing table in SQL database as below example: L240100001-LG L240100002-LG *serial number is difficult to update one by one L240100003-LG L240100004-LG L240100005-LG change to L240100001-L L240100002-L..

Newbie question - I have SQL Server 2000. What add-ons do I need to try data mining, and where do I best go for resources/references?
   Database Help (Home) -> Datamining All times are: Pacific Time (US & Canada)
Page 1 of 1

 
You can post new topics in this forum
You can reply to topics in this forum
You can edit your posts in this forum
You can delete your posts in this forum
You can vote in polls in this forum



[ Contact us | Terms of Service/Privacy Policy ]