Welcome to dbFreaks.com!
FAQFAQ    SearchSearch      ProfileProfile    Private MessagesPrivate Messages   Log inLog in

Idempotent ODBMS iterators

 
   Database Help (Home) -> Object-Oriented RSS
Next:  What is stored in the Favorites folder?  
Author Message
Paul Chapman

External


Since: Feb 17, 2005
Posts: 3



(Msg. 1) Posted: Wed Feb 16, 2005 8:40 pm
Post subject: Idempotent ODBMS iterators
Archived from groups: comp>databases>object (more info?)

I am building a client/server ODBMS in Java.

It occurred to me that one way of helping to guarantee the consistency of
the database was to ensure that all atomic requests were idempotent -- that
is, that if a particular request is performed more than once, it will leave
the database in the same (logical) state as it would if the operation had
been performed once only.

This occured to me when I discovered that, by chance or unconscious design,
almost all of the operations I had designed were idempotent.

For example, rather than an atomic "append" operation, which would not be
idempotent for obvious reasons, my implementation requires that the client
(in effect) perform the following three idempotent operations (while the
table is locked, of course) [pseudocode]:

size = table.size;
table.redim(size + 1);
table[size] = item;

Note that the following code would have the same effect:

size = table.size;
size = table.size;
table.redim(size + 1);
table.redim(size + 1);
table[size] = item;
table[size] = item;

(Of course one could provide an "append" operation to the client, but
execute it in the server as three idempotent operations like those above,
logging each one separately. The advantage in having idempotent *client*
operations is that the client, too, can use logging to allow resumption
after a catastrophic failure.)

Now, after a brief survey of my code, I discovered that just one operation
is not
idempotent: the "next method" for a map iterator. I'd like to rectify this.

A database map maintains an expandable array of keys ("key-array") in
insertion order. A map iterator, which is itself always stored as an object
in the database (while there is a reference to it), keeps an index into the
key-array. If a key is removed, its entry in the key-array is replaced with
null. If an entry is added, the key is appended to the key-array. An
iterator automatically skips over null entries; it fails graceully when the
end of the key-array is reached.

Since database access is generally asynchronous, a map's keys may change
while an iterator is running over it. (An iteration could, in principle, be
done incrementally by a client over days or weeks, without locking. So no
"ConcurrentModificationException" should be "thrown".) This produces
worst-case behaviour in which a key might be encountered more than once,
when it has been removed and then re-inserted. Clients are (supposed to be)
aware of this, and it is the best solution I can think of which does not
require a potentially expensive key-array copy operation to be made when an
iterator is created. (Most long-term client iterations are expected to be
maintenance tasks, each of whose steps should itself be idempotent, so a
repeated key shouldn't be a problem.)

Now, I could remove the iterator facility, and require clients to maintain
and increment their own key-array index (and check for null entries
explicitly). Trouble is, during concurrent server maintenance of the
database, I want to be able to compact the key-arrays of maps from time to
time, and to do so safely requires that all iterators currently "in use" by
clients are in the database (and recognizable as such) and can therefore
have their indices changed (atomically) when keys are moved during
compaction.

One possible solution is to have the "next method" return a brand new
iterator with an incremented index. Performing this operation twice on the
same original iterator would obviously produce two iterators both at the
same iteration point. But this is going to create a huge amount of garbage.

(Again, I could implement the "next method" in the server as a sequence of
locally idempotent operations, but I'd like client-side idempotency as well
if possible.)

Can anyone think of any other solution, which minimizes the work done by the
server both to create and to increment the iterator?

Cheers, Paul

 >> Stay informed about: Idempotent ODBMS iterators 
Back to top
Login to vote
Carl Rosenberger

External


Since: Feb 03, 2005
Posts: 2



(Msg. 2) Posted: Sun Feb 20, 2005 3:36 pm
Post subject: Re: Idempotent ODBMS iterators [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

Paul Chapman wrote:

[...]

 > Since database access is generally asynchronous, a map's keys may
 > change while an iterator is running over it.

[...]

 > Can anyone think of any other solution, which minimizes the work
 > done by the server both to create and to increment the iterator?


Four possible approaches:

(1) Read-back-in-time isolation between clients. Every
transaction sees all objects in the state they were in
when the transaction started.

(2) Locking collections. When an iterator is running on
a collection, lock it for updates.

(3) Server-Side iterator failure. Throw an exception
back at the client when the client tries to continue
an iterator that does not work anymore, because the
next element has been deleted.

(4) Server-To-Client notification. Send a message to the
client when changes to a Map are committed.


This sure isn't easy and there is no "right" solution.
The best solution will depend upon the application.


Good luck with your object database!


Best,
Carl
--
Carl Rosenberger
Chief Software Architect
db4objects Inc.
<a rel="nofollow" style='text-decoration: none;' href="http://www.db4o.com" target="_blank">http://www.db4o.com</a>

 >> Stay informed about: Idempotent ODBMS iterators 
Back to top
Login to vote
Paul Chapman

External


Since: Feb 17, 2005
Posts: 3



(Msg. 3) Posted: Sun Feb 20, 2005 7:40 pm
Post subject: Re: Idempotent ODBMS iterators [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

Carl,

db4o was one of the sites I looked at when researching this project. Smile

 > Four possible approaches:
 >
 > (1) Read-back-in-time isolation between clients. Every
 > transaction sees all objects in the state they were in
 > when the transaction started.
 >
 > (2) Locking collections. When an iterator is running on
 > a collection, lock it for updates.

Remember, though, that I expect some users will want to run iterations over
days or weeks. Reason: to do low-priority verification of some of the data.
They won't be able to hold a session open all that time, let alone a
transaction. Smile

 > (3) Server-Side iterator failure. Throw an exception
 > back at the client when the client tries to continue
 > an iterator that does not work anymore, because the
 > next element has been deleted.

Sounds like what Windows does when you're trying to do a ScanDisk. Wink

 > (4) Server-To-Client notification. Send a message to the
 > client when changes to a Map are committed.

ATM the client/server comms is synchronous. But I might look at adding
async behaviour later.

It's not an urgent or important problem. Thanks for your help.

Cheers, Paul
 >> Stay informed about: Idempotent ODBMS iterators 
Back to top
Login to vote
Paul Chapman

External


Since: Feb 17, 2005
Posts: 3



(Msg. 4) Posted: Tue Feb 22, 2005 8:40 pm
Post subject: Re: Idempotent ODBMS iterators [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

Carl,

If you have a chance to look at this, be my guest. Anyone is free to
comment, of course.

 > (1) Read-back-in-time isolation between clients. Every
 > transaction sees all objects in the state they were in
 > when the transaction started.

I've been looking at this in more detail, not particularly from the point of
view of iterators, but as a general approach to concurrent database sessions
belonging to different clients.

Each client is connected to the database via a "session" which is maintained
by the server.

The session is itself an object in the database, having a path from the
root. The session maintains as a field a set of all of the objects the
client has been told about (in this session). This ensures that the
concurrent garbage collector doesn't throw away objects which the client has
created, but hasn't yet stored in fields of other objects in the database
reachable from the root: all objects known to all logged-on client are safe
from GC.

A client (person, local process, remote batch program, etc) may interrogate
the database in two ways: (1) "browsing", during which other clients may be
modifying the database concurrently, and so successive enquiries may see
changing values, and (2) "transacting", which is the only time mutating
requests may be made, and during which the database will appear to the
transacting client to be exclusive. A client is browsing until a transation
is started, then is transacting until the transaction is either committed or
abandoned, then is browsing again, etc.

All changes made during a transaction are visible only to the client
performing the transaction. When a transaction is committed, all changes
requested by the client are made in a single, failproof, atomic action by
the server.

Things get interesting when the following sequence of events takes place:

(1) Client A begins a transaction
(2) Client A requests some info about object X
(3) Client B begins a transaction
(4) Client B requests that object X be mutated
(5) Client B commits
(6) Client A requests some more information about object X
(7) Client A requests that object X be mutated
(Cool Client A commits

Client A must see the same X in steps (2) and (6), despite the fact the B
has mutated it. When client A commits, B's changes to X will be wiped out,
but that's what we'd expect.

OK, so that's the theory. How is it implemented in practice? Forgive me,
but I haven't studied ODBMS. What I now present (if correct!) is probably
well-known technology, but I've always enjoyed algorithmics, and so it's fun
to try to find my own solutions.

A client knows an object by a handle. The same handle always refers to the
same object (during a session, anyway -- I haven't yet decided whether
handles can change in the long term).

During a transaction, the server session object for that client maintains a
"substition map" (SM), initially empty. Whenever the client make a mutating
request, the server makes a copy of the object to be mutated (if it hasn't
already), and mutates the copy. It then stores an entry in the SM whose key
is the handle of the original object, and whose value is handle of the new
object.

All handles referred to by the client in requests during a transaction are
"translated" via the SM if necessary, so that the requests are actually
carried out on the copies.

If the transaction is abandoned, the copies are simply thrown away, and the
SM cleared. (The copies will probably hang around until the GC discards
them, although I might try and improve that later -- when I write the GC!)

If the transaction is committed, the server atomically *exchanges the
identities of each of the key-value pairs in the SM*, and then clears the
SM. That is to say, for an entry
handle-of-X : handle-of-X-copy
the server makes the first handle point to the mutated copy and the second
to the original. (This is a mutual become: in Smalltalk language.)

But that isn't quite the end of the story.

If some other client A is mid-transaction when a transaction by client B is
committed, client A may perceive a change in the database, which isn't
allowed. In particular, any object which A *hasn't* yet mutated will appear
to have changed when B's transaction exchanges identities.

So before B exchanges each pair of identities in its SM, it must show its SM
to all other sessions with transactions in progress. Each of those session
scans B's SM, and adds any entry whose key is found in its
"user-knows-about-this-object" set but not among the keys of its own SM to a
secondary substitution map.

Ie, if B's SM contains
handle-of-X : handle-of-X-copy
A adds
handle-of-X : handle-of-X-copy
to its secondary substition map. Then when B exchanges identities of X and
X-copy, A's entry will effectively say:
handle-of-X-copy : handle-of-X
which will ensure that if client A uses the only handle to X it knows about
(the first), the server will ensure that the *original* is still referred
to.

I need to work out some details yet about what happens to entries in the
secondary list when A attempts to mutate them, and also what happens if yet
another session attempts to modify X: some some of transitive operations are
going on here. Smile

Does all this seem familiar? Does it look right?

Cheers, Paul
 >> Stay informed about: Idempotent ODBMS iterators 
Back to top
Login to vote
Display posts from previous:   
Related Topics:
oops help - Question 1:- Describe the following with the help of examples: 1)Generalization and its role in Inheritance. 2)abstract Classes 3)State Diagrams. 4)OMT and its impact in programming. 5)future of Object Oriented languages. Qusetion2: -Identify the..

Modelling Books (with XDb2) -

Help Understanding OODBMS' - <font color=purple> ;> 3) Are relationships recorded in the same way as in an RDB - 1:M and</font> <font color=purple> ;> M:M? I assume that program code has to place the IDs in a "table" to</font> ...

CfP Reminder: The Second Scala Workshop - Scala Days 2011 - The Second Scala Workshop ========================= Call for Papers --------------- Scala is a general purpose programming language designed to express common programming patterns in a concise, elegant, and type-safe way. It smoothly integrates..

CGVCVIP 2011 (Rome, Italy) - 1st call extension: until 28 .. - Apologies for cross-postings. Please send to interested colleagues and students -- CALL FOR PAPERS - Deadline for submissions (1st call extension): 28 March 2011 -- IADIS INTERNATIONAL CONFERENCE ON COMPUTER GRAPHICS, VISUALIZATION, COMPUTER VISION..
   Database Help (Home) -> Object-Oriented All times are: Pacific Time (US & Canada)
Page 1 of 1

 
You can post new topics in this forum
You can reply to topics in this forum
You can edit your posts in this forum
You can delete your posts in this forum
You can vote in polls in this forum



[ Contact us | Terms of Service/Privacy Policy ]