agree that cheating is a major problem, but merely asking questions
on a usenet group hardly qualifies as cheating. In fact, most
universities unfortunately do a bad job of teaching database
concepts. I think the OP is doing a good thing by asking outside
sources for clarification of concepts. Asking someone else to do the
work for you is cheating, but that isn't what the OP is asking. As
far as I can see, the OP is just asking if it is possible to have the
database engine create keys in the format 'C'<last two digits of
year><local counter>. Where <local counter> is a number that is unique
within a year but resets each year, and is thus not unique across
years.
To the OP: The answer to your question is 'yes', but it I recommend
against it. Keys of this type (called 'intelligent' or 'hierarchical'
keys) have lots of problems. I'll give you some tips on where you
can look to get the tools needed for building the key the way you
want, then talk a little about better alternatives.
1 How to build the ugly (intelligent) key:
First, I don't think you need to build a stored procedure, you can do
it all with a SELECT statement and built in SQL Server functions. Do a
google search for "string concatenation" in SQL Server. Your key has
three pieces, and there is a simple operator in SQL Server that allows
you to concatenate the separate pieces and make a single string like
you want.
You still need to generate the different pieces of your key so that
you have something to concatenate. The first part (lettter C) appears
to be a literal. In just about every flavor of SQL, you can just put
a literal in single quotes and select it, like this: SELECT 'C
FROM . . .';
For the next piece, you have to get the system date (I assume), then
extract the year portion, and finally get to the last two digits of
the year. google sql server getdate() function to see about system
date. The DATEPART() function will help you get different parts of
the date. So say you get the integer 2008, you want to get to '08'.
There may be better ways, but one idea is to convert the integer to a
string (look up CAST or CONVERT) and then grab the last two characters
(look up SUBSTRING). You could also do modulus arithmetic on the full
year (lookup SQL SERVER MODULO OPERATOR), but the modulo operator
returns an integer. You would have to convert this to a string anyway
and conditionally pad it with a leading zero, depending on the size of
the number. That's easy with LPAD in Oracle, but a pain in the butt
(as far as I know) with SQL Server.
For the final piece, you should lookup the ROWNUMBER() function, which
is part of the window functions in sql server. In this case the word
'window' has nothing to do with the Microsoft O/S. 'Window' refers to
a particular partition of your data (nothingg to do with physically
partitioned tables) over which you want to perform an operation.
Works great for subtotals, averages on partitions, etc. You can use
ROWNUMBER() to create a number that would be unique with respect to a
partition, for instance unique within a year, etc.
2: Why you should take another approach:
Intelligent keys always lead to big trouble sooner or later, usually
sooner. By "intelligent" I mean a key that has any embedded
meaning.
Here's a great example of an intelligent key causing big problems: I
once did some work for a retail chain that had grown by acquisition
over the years. In cases like this, the original legal entity of each
store is important for regulatory and tax reporting purposes. So
let's say the chain had a total of 1,000 stores. Maybe 500 of the
stores belonged to entity 'X', 200 of the stores belonged to entity
'Y' and 300 of the stores belonged to entity 'Z'. You have to keep
track of this, because each legal entity pays taxes bases on its
revenues, etc. Someone had the horrible idea of relating the store
number to the legal entity. Thus stores with numbers in the range 1 -
500 were in legal entity X, numbers in the range 501 - 700 were in
legal entity 'Y' etc. As the chain grew, they found that they had
trouble assigning new store numbers. Example: The chain has one
thousand stores and acquires another one. Here is what would happen
Person 1: What should be the store number for the store we just
acquired? How about 1001?
Person 2: No, wait! Numbers in the range 1000 to 2500 are reserved
for stores that belong to entity 'AA', and this one came from entity
'Y'.
Person 1: Oh, okay, well then what is an available number for entity
'Y'.
Person 2: Oops! We're out of numbers in the range for entity 'Y'.
Person 1: Darn! What do I do now?
Person 2: It turns out we closed a store a year ago that originally
came from entity 'Y'. We can re-use that old store number!
They really did this, and it was a freakin' mess. It got worse: These
stores had physical characteristics that had to be tracked for
regulatory purposes for a very long time, sometimes for years after
the stores closed. However, some numbers had been re-used two and
even three times. Tracking the stores was a horrible mess.
This whole problem could have been avoided by just using an arbitrary
unique number with no hidden meaning to identify each store. The
legal entity info should have been stored in an "OWNING_LEGAL_ENTITY"
column in the store table.
Your customer data is an analogous issue. Make your customer number a
simple integer and put the date that the customer was created in its
own column. This info could be very useful for other purposes. The
database owner could see if there are any correlations between
customer age and customer sales, see if customers tend to leave after
a certain age, etc. Just make your customer key a simple number, and
put the other meaningful pieces in their own columns. With the
customer CREATE_DT in it's own column, it's extremely simple to use
the built in data arithmetic functions of SQL Server to calculate the
age of the customer in days, months, years or whatever, correlate
sales to age, etc.. Even if you had the entire date embedded in the
key, it would be a pain (and perform poorly) to extract the date from
the key and use it for queries.
As far as the 'C', I would not include that in the key. It's not
uncommon for business to have suppliers who are also customers, or who
used to be customers, or who may become customers in the future.
Let's say customer (Call it company 'XYZ') C0804 starts selling stuff
to the company. Will you make a vendor id for him (maybe V0912)? If
you do business with someone who is both a supplier and a customer,
its natural to want to know something akin to the "balance of trade"
between you and the other party. But now, the same company XYZ has
two different identifiers. How do you know that they they both point
to the same entity? You could make a cross reference table, but that
gets ugly. Why not just give the same type of key to anyone with whom
you do business (call it a "Party" or maybe a "Business Partner")?
Then it's easy to uniquely identify the Party. Maybe the business in
this scenario would say "our customers and suppliers never overlap".
That's a big statement, but for the sake of argument, let's assume
it's true. Even in this case, it's very possible that other parties
overlap. Most business set up their employees as vendors when they
reimburse them for business expenses. If you're stuck with a typical
accounting system, you have to cross reference the employee and vendor
to the same party in your own external database, but if you are
designing the system form the ground up, you can set up the system to
just have parties playing roles. Setting up a party system can get
pretty detailed, but I'd say for sure that including anything in the
key that specifies the role of the party in question is not a good
idea.
For repairs, consider something similar. Make the repair primary key
a simple integer. That integer could also printed on the repair
ticket. Don't try to encode the date or customer number into the
repair ticket number. Those pieces of info belong in their own
columns.
Thanks,
Bill
>> Stay informed about: customise primary Key