 |
CVM WeakRefs
What are weakrefs?
Weakrefs are references to objects that won't necessarily keep a Java object
alive. If there are other strong/normal references to the object, then the
object will remain alive during GC, and the weakref will continue to point
to it. If there are no strong/normal references to the object during a GC, then
the weakref may be nullified, and the object may be collected.
Alternatively, instead of nullifying the weakref and collecting the object
immediately, the weakref may undergo special handling. An example of this is
FinalReference, which is used to ensure that objects get finalized once they
are no longer reachable from strong/normal references.
How do weakrefs work?
There are 5 containers for the 5 strengths of weak references:
- CVMglobals.discoveredSoftRefs queue for instances of SoftReference
- CVMglobals.discoveredWeakRefs queue for instances of WeakReference
- CVMglobals.discoveredFinalRefs queue for instances of FinalReference
- CVMglobals.discoveredPhantomRefs queue for instances of PhantomReference
- CVMglobals.weakGlobalRoots stack for JNI weakrefs
The first 4 correspond to the respective java.lang.ref.* subclasses
of java.lang.ref.Reference. The 5th corresponds to the JNI weakrefs,
which are allocated and deallocated as needed. Hence, despite its name,
it is not really a stack.
Unlike the other references, the JNI weakrefs have no Java object
instances associated with them. JNI weakrefs are not part of the Java
language specification but are part of the JNI specification. Their
behavior is similar to that of WeakReference except that they are not
associated with any Reference object.
The idea of weakrefs is essentially that a weakref can reference an
object but the garbage collector may still choose to nullify this
reference and collect the referent object (object being referred to)
under certain conditions.
How does Garbage Collection handle weakrefs?
There are 3 main functions in the weakrefs code that the GC calls:
- CVMweakrefDiscover()
- CVMweakrefProcessNonStrong()
- CVMweakrefFinalizeProcessing()
CVMweakrefDiscover()
During a GC cycle, as GC scanning finds objects that are live, GC checks
if the object being scanned is a subclass of Reference. If so, GC calls
this function to declare that a Reference has been found. Every instance
of Reference (and its subclasses) has 2 important fields:
- next
- referent
CVMweakrefDiscover() is only called on a Reference object if its next field
is NULL. This indicates that the Reference object is not in any queue.
CVMweakrefDiscover() checks to see if the referent field is NULL. If it is,
then this is a NULL Reference. Hence, there is no garbage collection
activity that needs to be performed on it. If the referent is not NULL,
the object is added to the queue for its type.
The act of enqueuing will set the next field in the Reference to a non-NULL
value. The next field is used as the link in the queue. The queue is
not NULL terminated. The last element in the queue will point to itself in
its next field. This ensures that no Reference that has been enqueued will
have a NULL in its next field. A non-NULL next field is also used to mark
that the Reference has already been discovered.
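The discovery check and the self-terminated queue described above can be sketched as follows. This is a minimal illustration, not the CVM source: the `Ref` and `RefQueue` types and the functions `refEnqueue`, `refDiscover`, and `refQueueLength` are hypothetical names, and enqueuing at the head of the queue is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a java.lang.ref.Reference instance. */
typedef struct Ref Ref;
struct Ref {
    Ref  *next;      /* NULL means "not on any queue" */
    void *referent;  /* object being referred to, or NULL */
};

/* A discovery queue is modelled as a head pointer; NULL means empty. */
typedef struct { Ref *head; } RefQueue;

/* Push ref at the head (an assumption). The last element links to itself,
 * so an enqueued Reference never has a NULL next field. */
void refEnqueue(RefQueue *q, Ref *ref)
{
    assert(ref->next == NULL);            /* must not already be queued */
    ref->next = (q->head == NULL) ? ref : q->head;
    q->head = ref;
}

/* Sketch of the discovery check. Returns 1 if ref was enqueued. */
int refDiscover(RefQueue *q, Ref *ref)
{
    if (ref->next != NULL) return 0;      /* already discovered */
    if (ref->referent == NULL) return 0;  /* NULL Reference: nothing to do */
    refEnqueue(q, ref);
    return 1;
}

/* Count elements by walking until the self-linked tail is reached. */
int refQueueLength(const RefQueue *q)
{
    int n = 0;
    const Ref *r = q->head;
    while (r != NULL) {
        n++;
        r = (r->next == r) ? NULL : r->next;
    }
    return n;
}
```

In this sketch, a second call to `refDiscover()` on the same Reference is a no-op because its next field is already non-NULL, which is the "already discovered" marker the text describes.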
Note: During live object scanning, GC will ignore the referent and next
fields of Reference objects (see CVMobjectWalkRefsAux()). This is
because the GC doesn't know yet how the Reference wants to treat
these references. These fields are scanned later as needed in
CVMweakrefProcessNonStrong().
Note: Since CVMweakrefDiscover() is called on object instances, it does not
apply to JNI weakrefs.
Note: Before GC begins, the 4 Reference object queues are NULL and empty.
During live object scanning, Reference instances are added to these
queues.
CVMweakrefProcessNonStrong()
At some point when GC has found all the objects that it thinks are live by
reachability, it will call CVMgcProcessSpecialWithLivenessInfo() (or its
equivalent) which calls CVMweakrefProcessNonStrong().
CVMweakrefProcessNonStrong() will iterate through the 4 ref queues and
determine whether to keep the References' referent objects alive or not.
The decision criteria vary based on the type of Reference object.
If the referent object is to be kept alive, it will also call back into
the GC to scan the network of objects that would be kept alive by the
referent. The following is how CVMweakrefProcessNonStrong() works in
detail:
- It iterates through the 4 queues. For each enqueued Reference, it
does the following.
- It checks with the GC if the referent object is being kept alive by
hard references (previously determined by a GC root scan). If so, it
enqueues the Reference object in the CVMglobals.deferredWeakrefs queue,
and it also calls back into the GC to scan the referent object and its
sub-network of objects (i.e. keeping this network of objects alive if
GC hasn't already done so previously).
Note: Queueing the Reference in CVMglobals.deferredWeakrefs is
essentially preparing for the finalize phase to restore the
Reference to its original state before the GC, i.e.
- not queued in any queue
- the next field is NULL
- the referent field still points to the object
The referent will now be scanned by the GC transitively to keep its
sub-network of objects alive.
- If there were no hard references to the referent, then the weakref
gets to determine if it wants to keep the referent object alive.
Depending on the type of Reference, it calls the following handlers
to check if it wants to keep the referent alive:
| Ref Type | handler |
| SoftReference | CVMweakrefClearConditional() |
| WeakReference | CVMweakrefClearUnconditional() |
| FinalReference | CVMweakrefReferentKeep() |
| PhantomReference | CVMweakrefReferentKeep() |
In the current implementation, CVMweakrefClearConditional() simply calls
CVMweakrefClearUnconditional().
CVMweakrefClearUnconditional() enqueues the Reference in
CVMglobals.deferredWeakrefsToClear.
Note: Queueing the Reference in CVMglobals.deferredWeakrefsToClear is
essentially preparing for the finalize phase to nullify the
Reference's referent field, and enqueue the Reference in the
pending queue.
CVMweakrefReferentKeep() enqueues the Reference in
CVMglobals.deferredWeakrefsToAddToPending.
Note: Queueing the Reference in CVMglobals.deferredWeakrefsToAddToPending
is essentially preparing for the finalize phase to enqueue the
Reference in the pending queue.
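The per-Reference decision described above can be summarized as a small dispatch table. The sketch below only models which deferred queue a Reference ends up in. All names (`RefKind`, `DeferredQueue`, `classify`, and the three handler functions) are hypothetical stand-ins, and the real handlers take arguments and do more work than shown here.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { REF_SOFT, REF_WEAK, REF_FINAL, REF_PHANTOM, REF_NUM_KINDS } RefKind;

typedef enum {
    DEFERRED_RESTORE,           /* models CVMglobals.deferredWeakrefs */
    DEFERRED_TO_CLEAR,          /* models CVMglobals.deferredWeakrefsToClear */
    DEFERRED_TO_ADD_TO_PENDING  /* models CVMglobals.deferredWeakrefsToAddToPending */
} DeferredQueue;

typedef DeferredQueue (*RefHandler)(void);

/* Stand-in for CVMweakrefClearUnconditional(). */
DeferredQueue clearUnconditional(void) { return DEFERRED_TO_CLEAR; }

/* Stand-in for CVMweakrefClearConditional(), which per the text
 * currently just defers to the unconditional variant. */
DeferredQueue clearConditional(void) { return clearUnconditional(); }

/* Stand-in for CVMweakrefReferentKeep(). */
DeferredQueue referentKeep(void) { return DEFERRED_TO_ADD_TO_PENDING; }

/* Handler table mirroring the Ref Type / handler table above. */
static const RefHandler handlers[REF_NUM_KINDS] = {
    [REF_SOFT]    = clearConditional,
    [REF_WEAK]    = clearUnconditional,
    [REF_FINAL]   = referentKeep,
    [REF_PHANTOM] = referentKeep,
};

/* Decide which deferred queue a discovered Reference is moved to. */
DeferredQueue classify(RefKind kind, int referentStronglyReachable)
{
    assert((unsigned)kind < (unsigned)REF_NUM_KINDS);
    if (referentStronglyReachable) {
        return DEFERRED_RESTORE;  /* hard references exist: restore later */
    }
    return handlers[kind]();      /* otherwise the Reference type decides */
}
```

Under these assumptions, `classify(REF_SOFT, 0)` and `classify(REF_WEAK, 0)` give the same answer, which reflects the statement that the conditional handler currently behaves like the unconditional one.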
CVMweakrefFinalizeProcessing()
CVMweakrefFinalizeProcessing() carries out the following steps:
1. Executes the respective actions on the References enqueued in
CVMglobals.deferredWeakrefs, CVMglobals.deferredWeakrefsToClear, and
CVMglobals.deferredWeakrefsToAddToPending as described above.
Note: References from CVMglobals.deferredWeakrefsToClear and
CVMglobals.deferredWeakrefsToAddToPending get enqueued in the
pending queue.
2. Nullifies all JNI weakrefs whose referent object isn't being
kept alive by either a hard reference or a Reference object.
3. Resets the 4 Reference discovery queues, i.e. returns them to their
initial empty state prior to the GC cycle.
4. Calls CVMweakrefHandlePendingQueue() to scan the References in the
pending queue with the GC transitive scanner. This is redundant because
each of the referent objects was already scanned in
CVMweakrefDiscoveredQueueCallback() when the References were being
enqueued in deferredWeakrefsToClear and deferredWeakrefsToAddToPending.
The exception is when CVMweakrefFinalizeProcessing() is also expected to
update the object pointers in the next and referent fields of the
references. This is dependent on the GC algorithm. In that case, the GC
transitive scanner will update the object pointers.
5. Calls CVMweakrefUpdate() with the transitiveScanner to update the
object pointers which have been moved. There won't actually be any
transitive scanning because all dead references are nullified in step 2
above, and all live references are due to the existence of strong
references, which in turn means that the object has already been
transitively scanned before weakrefs are processed.
Like step 4, this is not needed unless CVMweakrefFinalizeProcessing()
is also expected to update the object pointers in the next and referent
fields of the references. This is dependent on the GC algorithm. In
that case, the GC transitive scanner will update the object pointers.
Hence, gcOpts->isUpdatingObjectPointers is provided to allow the GC to
bypass steps 4 and 5 when the GC algorithm does not need them.
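The first step and the isUpdatingObjectPointers gate can be illustrated with a small model. This is a sketch under stated assumptions, not the CVM implementation: the `Ref`, `Queues`, and `GcOpts` types and the `pop`, `push`, and `finalizeProcessing` functions are hypothetical, the pending queue is assumed to use the same self-terminated linking as the discovery queues, and the pointer-update passes are reduced to a counter.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Ref Ref;
struct Ref { Ref *next; void *referent; };

typedef struct {
    Ref *deferred;                /* restore to pre-GC state */
    Ref *deferredToClear;         /* clear referent, then add to pending */
    Ref *deferredToAddToPending;  /* keep referent, then add to pending */
    Ref *pending;
} Queues;

typedef struct {
    int isUpdatingObjectPointers;
    int pointerUpdatePasses;      /* placeholder for the pointer-update passes */
} GcOpts;

/* Remove the head of a self-terminated queue and reset its next to NULL. */
Ref *pop(Ref **head)
{
    Ref *r = *head;
    if (r == NULL) return NULL;
    *head = (r->next == r) ? NULL : r->next;
    r->next = NULL;
    return r;
}

/* Add r at the head. The tail element links to itself. */
void push(Ref **head, Ref *r)
{
    assert(r->next == NULL);
    r->next = (*head == NULL) ? r : *head;
    *head = r;
}

void finalizeProcessing(Queues *q, GcOpts *opts)
{
    Ref *r;
    /* Restored References: popping leaves next NULL and referent intact. */
    while ((r = pop(&q->deferred)) != NULL) { /* nothing else to do */ }
    /* Cleared References: nullify the referent, then add to pending. */
    while ((r = pop(&q->deferredToClear)) != NULL) {
        r->referent = NULL;
        push(&q->pending, r);
    }
    /* Kept referents: add to pending without clearing. */
    while ((r = pop(&q->deferredToAddToPending)) != NULL) {
        push(&q->pending, r);
    }
    /* JNI weakref nullification and discovery queue reset are omitted. */
    if (opts->isUpdatingObjectPointers) {
        opts->pointerUpdatePasses += 2;  /* the two pointer-update passes */
    }
}
```

With `isUpdatingObjectPointers` set to 0, the model skips both pointer-update passes, which is the bypass the flag exists to allow.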
Aborting a GC and restoring weakrefs to a consistent state (resetting):
It is assumed that aborting GC means that no object motion has occurred
i.e. previous object pointer values are still valid. GC aborts can happen
using a setjmp/longjmp mechanism. Hence, the abort can happen in the
midst of a callback function to GC. We have to be careful that we leave
the system in a consistent state (that we can clean up after) before
calling back into GC code.
Phase 1: Weakref Discovery
If GC is aborted while weakrefs are being discovered, then there may
be some Reference objects in the 4 weakref queues. To reset these,
set their next field back to NULL.
Phase 2: ProcessNonStrong
During this phase, References were being moved from the 4 weakref queues
to the deferred queues. References in the deferred queues could be easily
restored to their active state by setting their next field to NULL.
Phase 3: FinalizeProcessing
At this point, it is assumed that GC will not abort anymore. This phase
actually changes the state of the References in an irreversible way i.e.
references cannot be reset once we get to this phase.
The work in Phase 1 and 2 basically moved References between queues.
This movement needs to be done in such a way that it will not leave the
queues in a partially modified (i.e. corrupted) state should GC choose
to abort. This ensures that we can reset the references in an abort.
Note: Part of the work of supporting GC aborts is in the use of
CVMweakrefIterateQueue() instead of
CVMweakrefIterateQueueWithoutDequeueing(). CVMweakrefIterateQueue() will
dequeue the reference it is iterating on before calling its callback
function. It is assumed that the callback function will either enqueue the
reference onto another queue or set its "next" field to NULL before going
on to call back into GC code which can abort. Together, both these actions
of CVMweakrefIterateQueue() and its callback functions ensure that the
queues being operated on are left in a consistent state should the GC
choose to abort. Consistency here means that a reference will not appear
on more than one queue when we have to handle clean up for GC abort.
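The dequeue-before-callback discipline and the abort cleanup can be sketched with setjmp/longjmp. This is an illustrative model only: `Ref`, `pop`, `push`, `gcScan`, `callback`, `iterateQueue`, `resetQueue`, and `runGc` are hypothetical names, two queues stand in for the discovery and deferred queues, and the abort is triggered by a simple countdown.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

typedef struct Ref Ref;
struct Ref { Ref *next; void *referent; };

static jmp_buf gcAbort;            /* abort target, set in runGc() */
static Ref *discovered, *deferred; /* self-terminated queues */
static int scansBeforeAbort;       /* countdown until the simulated abort */

/* Remove the head of a self-terminated queue and reset its next to NULL. */
Ref *pop(Ref **head)
{
    Ref *r = *head;
    if (r == NULL) return NULL;
    *head = (r->next == r) ? NULL : r->next;
    r->next = NULL;
    return r;
}

void push(Ref **head, Ref *r)
{
    assert(r->next == NULL);       /* a Reference is on at most one queue */
    r->next = (*head == NULL) ? r : *head;
    *head = r;
}

/* Stand-in for calling back into GC code, which may abort. */
void gcScan(Ref *r)
{
    (void)r;
    if (scansBeforeAbort-- == 0) longjmp(gcAbort, 1);
}

/* The callback re-enqueues first, and only then calls into the GC. */
void callback(Ref *r)
{
    push(&deferred, r);
    gcScan(r);
}

/* Dequeue each Reference before handing it to the callback. */
void iterateQueue(Ref **queue, void (*cb)(Ref *))
{
    Ref *r;
    while ((r = pop(queue)) != NULL) cb(r);
}

/* Abort cleanup: return every queued Reference to next == NULL. */
void resetQueue(Ref **queue)
{
    while (pop(queue) != NULL) { /* pop() already resets next */ }
}

/* Returns 1 if the simulated GC aborted, 0 if it ran to completion. */
int runGc(int abortAfter)
{
    scansBeforeAbort = abortAfter;
    if (setjmp(gcAbort) != 0) {
        resetQueue(&discovered);
        resetQueue(&deferred);
        return 1;
    }
    iterateQueue(&discovered, callback);
    return 0;
}
```

Because each Reference is popped before the callback runs and pushed onto the other queue before `gcScan()` can jump away, every Reference is on exactly one queue at the moment of the abort, so resetting both queues visits each one once.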
-- Main.prasadsanagavarapu - 12 Nov 2006
|