CVM WeakRefs

What are weakrefs?

Weakrefs are references that won't necessarily keep the Java object they point to alive. If there are other strong/normal references to the object, then the object will remain alive during GC, and the weakref will continue to point to it. If there are no strong/normal references to the object during a GC, then the weakref may be nullified, and the object may be collected.

Alternatively, instead of nullifying the weakref and collecting the object immediately, the weakref may undergo special handling. An example of this is FinalReference, which is used to ensure that objects get finalized once they are no longer reachable from strong/normal references.

How do weakrefs work?

There are 5 containers for the 5 strengths of weak references:

  1. CVMglobals.discoveredSoftRefs queue for instances of SoftReference
  2. CVMglobals.discoveredWeakRefs queue for instances of WeakReference
  3. CVMglobals.discoveredFinalRefs queue for instances of FinalReference
  4. CVMglobals.discoveredPhantomRefs queue for instances of PhantomReference
  5. CVMglobals.weakGlobalRoots stack for JNI weakrefs

The first 4 correspond to the respective java.lang.ref.* subclasses of java.lang.ref.Reference. The 5th holds the JNI weakrefs, which are allocated and deallocated, so despite its name it is not really a stack. Unlike the other references, JNI weakrefs have no Java object instances associated with them. JNI weakrefs are not part of the Java language specification but are part of the JNI specification. Their behavior is similar to that of WeakReference except that they are not associated with any Reference object.

The idea of weakrefs is essentially that a weakref can reference an object but the garbage collector may still choose to nullify this reference and collect the referent object (object being referred to) under certain conditions.

How does Garbage Collection handle weakrefs?

There are 3 main functions in the weakrefs code that the GC calls:

  1. CVMweakrefDiscover()
  2. CVMweakrefProcessNonStrong()
  3. CVMweakrefFinalizeProcessing()

CVMweakrefDiscover()

During a GC cycle, as GC scanning finds objects that are live, GC checks if the object being scanned is an instance of Reference or one of its subclasses. If so, GC calls this function to declare that a Reference has been found. Every instance of Reference (and its subclasses) has 2 important fields:

  1. next
  2. referent

CVMweakrefDiscover() is only called on a Reference object if its next field is NULL. This indicates that the Reference object is not in any queue. CVMweakrefDiscover() checks to see if the referent field is NULL. If it is, then this is a NULL Reference. Hence, there is no garbage collection activity that needs to be performed on it. If the referent is not NULL, the object is added to the queue for its type.

The act of enqueuing will set the next field in the Reference to a non-NULL value. The next field is used as the link in the queue. The queue is not NULL terminated. Instead, the last element in the queue points to itself in its next field. This ensures that no Reference that has been enqueued will have a NULL in its next field. A non-NULL next field therefore also marks the Reference as already discovered.
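The self-terminated queue described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the CVM source: the Ref and RefQueue types and the refEnqueue(), refDiscover(), and refQueueLength() functions are hypothetical names, the sketch pushes at the head of the queue (the original does not say which end is used), and it folds the caller's "next is NULL" check into the discovery function itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a java.lang.ref.Reference instance. */
typedef struct Ref {
    struct Ref *next;      /* NULL means "not on any queue" */
    void       *referent;  /* NULL means there is nothing for GC to do */
} Ref;

/* Hypothetical discovery queue: a head pointer, NULL when empty. */
typedef struct RefQueue {
    Ref *head;
} RefQueue;

/* Push ref on the front of the queue. The last element links to itself,
 * so an enqueued Reference never has a NULL next field. */
static void refEnqueue(RefQueue *q, Ref *ref)
{
    assert(ref->next == NULL);  /* must not already be queued */
    ref->next = (q->head != NULL) ? q->head : ref;
    q->head = ref;
}

/* Sketch of discovery: returns 1 if ref was queued, 0 if it was skipped. */
static int refDiscover(RefQueue *q, Ref *ref)
{
    if (ref->next != NULL) {
        return 0;  /* already discovered or otherwise queued */
    }
    if (ref->referent == NULL) {
        return 0;  /* NULL Reference: no GC work needed */
    }
    refEnqueue(q, ref);
    return 1;
}

/* Count elements by following next until an element points to itself. */
static size_t refQueueLength(const RefQueue *q)
{
    size_t n = 0;
    const Ref *r = q->head;
    while (r != NULL) {
        n++;
        if (r->next == r) {
            break;  /* self-link marks the end of the queue */
        }
        r = r->next;
    }
    return n;
}
```

Because the tail links to itself, a single test of the next field is enough to tell a queued Reference from an unqueued one, which is what lets discovery skip a Reference it has already seen.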

Note: During live object scanning, GC will ignore the referent and next fields of Reference objects (see CVMobjectWalkRefsAux()). This is because the GC doesn't know yet how the Reference wants to treat these references. These fields are scanned later as needed in CVMweakrefProcessNonStrong().

Note: Since CVMweakrefDiscover() is called on object instances, it does not apply to JNI weakrefs.

Note: Before GC begins, the 4 Reference object queues are NULL and empty. During live object scanning, Reference instances are added to these queues.

CVMweakrefProcessNonStrong()

At some point when GC has found all the objects that it thinks are live by reachability, it will call CVMgcProcessSpecialWithLivenessInfo() (or its equivalent), which calls CVMweakrefProcessNonStrong().

CVMweakrefProcessNonStrong() will iterate through the 4 ref queues and determine whether to keep the References' referent objects alive or not. The decision criteria vary based on the type of Reference object.

If the referent object is to be kept alive, it will also call back into the GC to scan the network of objects that would be kept alive by the referent. The following is how CVMweakrefProcessNonStrong() works in detail:

  1. It iterates through the 4 queues. For each enqueued Reference, it does the following.

  2. It checks with the GC if the referent object is being kept alive by hard references (previously determined by a GC root scan). If so, it enqueues the Reference object in the CVMglobals.deferredWeakrefs queue, and it also calls back into the GC to scan the referent object and its sub-network of objects (i.e. keeping this network of objects alive if GC hasn't already done so previously).

    Note: Queueing the Reference in CVMglobals.deferredWeakrefs is essentially preparing for the finalize phase to restore the Reference to its original state before the GC i.e.

    1. not queued in any queues
    2. the next field is NULL.
    3. the referent field remains pointing to the object. The referent will now be scanned by the GC transitively to keep its sub-network of objects alive.

  3. If there were no hard references to the referent, then the weakref gets to determine if it wants to keep the referent object alive. Depending on the type of Reference, it calls the following handlers to check if it wants to keep the referent alive:

    Ref Type           Handler
    SoftReference      CVMweakrefClearConditional()
    WeakReference      CVMweakrefClearUnconditional()
    FinalReference     CVMweakrefReferentKeep()
    PhantomReference   CVMweakrefReferentKeep()

    In the current implementation, CVMweakrefClearConditional() simply calls CVMweakrefClearUnconditional().

    CVMweakrefClearUnconditional() enqueues the Reference in CVMglobals.deferredWeakrefsToClear.
    Note: Queueing the Reference in CVMglobals.deferredWeakrefsToClear is essentially preparing for the finalize phase to nullify the Reference's referent field, and enqueue the Reference in the pending queue.

    CVMweakrefReferentKeep() enqueues the Reference in CVMglobals.deferredWeakrefsToAddToPending.
    Note: Queueing the Reference in CVMglobals.deferredWeakrefsToAddToPending is essentially preparing for the finalize phase to enqueue the Reference in the pending queue.
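The decision made for each discovered Reference in steps 2 and 3 can be summarized as a small classification function. The sketch below is a hedged illustration, not the CVM code: RefType, DeferredQueue, and classifyRef() are hypothetical names, and the three enum values merely stand in for the deferred queues named in the text. It models only which queue a Reference lands on, not the callback into the GC scanner.

```c
#include <assert.h>

/* Hypothetical tags for the four java.lang.ref.Reference flavors. */
typedef enum { REF_SOFT, REF_WEAK, REF_FINAL, REF_PHANTOM } RefType;

/* Hypothetical tags standing in for the three deferred queues. */
typedef enum {
    DEFERRED_RESTORE,        /* models CVMglobals.deferredWeakrefs */
    DEFERRED_CLEAR,          /* models CVMglobals.deferredWeakrefsToClear */
    DEFERRED_ADD_TO_PENDING  /* models CVMglobals.deferredWeakrefsToAddToPending */
} DeferredQueue;

/* Pick the deferred queue for one discovered Reference.
 * referentIsStronglyLive is nonzero when the GC root scan already found
 * the referent reachable through hard references. */
static DeferredQueue classifyRef(RefType type, int referentIsStronglyLive)
{
    if (referentIsStronglyLive) {
        return DEFERRED_RESTORE;  /* step 2: Reference goes back to its pre-GC state */
    }
    switch (type) {
    case REF_SOFT:   /* the conditional handler currently just clears */
    case REF_WEAK:
        return DEFERRED_CLEAR;
    case REF_FINAL:
    case REF_PHANTOM:
        return DEFERRED_ADD_TO_PENDING;
    }
    assert(0 && "unknown reference type");
    return DEFERRED_RESTORE;
}
```

Note that soft and weak references share one branch here only because the text says CVMweakrefClearConditional() currently forwards to CVMweakrefClearUnconditional(); a policy that retains soft referents longer would split that case.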

CVMweakrefFinalizeProcessing()

CVMweakrefFinalizeProcessing() carries out the work of:

  1. Executing the respective actions on the References enqueued in CVMglobals.deferredWeakrefs, CVMglobals.deferredWeakrefsToClear, and CVMglobals.deferredWeakrefsToAddToPending as described above.
    Note: References from CVMglobals.deferredWeakrefsToClear and CVMglobals.deferredWeakrefsToAddToPending get enqueued in the pending queue.

  2. Nullifying every JNI weakref whose referent object isn't being kept alive by either a hard reference or a Reference object.

  3. Resetting the 4 Reference discovery queues, i.e. returning them to their initial empty state prior to the GC cycle.

  4. CVMweakrefHandlePendingQueue() is called to scan the References in the pending queue with the GC transitive scanner. This is redundant because each of the referent objects was already scanned in CVMweakrefDiscoveredQueueCallback() when the References were being enqueued in deferredWeakrefsToClear and deferredWeakrefsToAddToPending.

    The exception is when CVMweakrefFinalizeProcessing() is also expected to update the object pointers in the next and referent fields of the References, which depends on the GC algorithm. In that case, the GC transitive scanner will update the object pointers.

  5. CVMweakrefUpdate() is called with the transitiveScanner to update the object pointers which have been moved. There won't actually be any transitive scanning because all dead references are nullified in step 2 above, and all live references are due to the existence of strong references, which in turn means that the object has already been transitively scanned before weakrefs are processed.

    Like step 4, this is not needed unless CVMweakrefFinalizeProcessing() is also expected to update the object pointers in the next and referent fields of the References, which depends on the GC algorithm. In that case, the GC transitive scanner will update the object pointers.

Hence, gcOpts->isUpdatingObjectPointers is provided to allow the GC to bypass steps 4 and 5 when the GC algorithm does not need them.
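Step 1 of the finalize phase amounts to applying one of three fixed actions to each deferred Reference. The following C sketch illustrates those actions only; it is not taken from the CVM source. Ref, DeferredAction, pendingPush(), and finalizeOne() are hypothetical names, and the sketch assumes that the pending queue is linked through the next field and self-terminated in the same way as the discovery queues, which the text does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a java.lang.ref.Reference instance. */
typedef struct Ref {
    struct Ref *next;
    void       *referent;
} Ref;

/* One action per deferred queue described in the text. */
typedef enum { ACT_RESTORE, ACT_CLEAR, ACT_ADD_TO_PENDING } DeferredAction;

/* Assumption: the pending queue is linked through next, tail self-linked. */
static void pendingPush(Ref **pendingHead, Ref *ref)
{
    ref->next = (*pendingHead != NULL) ? *pendingHead : ref;
    *pendingHead = ref;
}

/* Apply the action that ref's deferred queue stands for.
 * ref is assumed to have been taken off its deferred queue already. */
static void finalizeOne(Ref *ref, DeferredAction act, Ref **pendingHead)
{
    assert(ref != NULL && pendingHead != NULL);
    switch (act) {
    case ACT_RESTORE:
        ref->next = NULL;      /* back to the pre-GC state, referent kept */
        break;
    case ACT_CLEAR:
        ref->referent = NULL;  /* nullify, then hand over to the pending queue */
        pendingPush(pendingHead, ref);
        break;
    case ACT_ADD_TO_PENDING:
        pendingPush(pendingHead, ref);  /* referent stays intact */
        break;
    }
}
```

These are the state changes that cannot be undone afterwards: once a referent field has been nullified, the original pointer is gone.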

Aborting a GC and restoring weakrefs to a consistent state (resetting):

It is assumed that aborting GC means that no object motion has occurred i.e. previous object pointer values are still valid. GC aborts can happen using a setjmp/longjmp mechanism. Hence, the abort can happen in the midst of a callback function to GC. We have to be careful that we leave the system in a consistent state (that we can clean up after) before calling back into GC code.

Phase 1: Weakref Discovery
If GC is aborted while weakrefs are being discovered, then there may be some Reference objects in the 4 weakref queues. To reset these, set their next field back to NULL.

Phase 2: ProcessNonStrong
During this phase, References are moved from the 4 weakref queues to the deferred queues. References in the deferred queues can easily be restored to their active state by setting their next field to NULL.

Phase 3: FinalizeProcessing
At this point, it is assumed that GC will not abort anymore. This phase actually changes the state of the References in an irreversible way i.e. references cannot be reset once we get to this phase.

The work in Phases 1 and 2 basically moves References between queues. This movement needs to be done in a way that will not leave the queues in a partially modified (i.e. corrupted) state should GC choose to abort. This ensures that we can reset the references in an abort.
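Resetting a queue after an abort in Phase 1 or 2 is a single walk that sets every next field back to NULL. The C sketch below shows one way to do that for a self-terminated queue; Ref, RefQueue, and refQueueReset() are hypothetical names used for illustration and are not claimed to match the CVM source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a java.lang.ref.Reference instance. */
typedef struct Ref {
    struct Ref *next;      /* NULL means "not on any queue" */
    void       *referent;
} Ref;

/* Hypothetical queue: a head pointer, NULL when empty. */
typedef struct RefQueue {
    Ref *head;
} RefQueue;

/* Return every queued Reference to its unqueued state and empty the queue.
 * Referent fields are left untouched, which is valid because an aborted GC
 * is assumed not to have moved any objects. Returns the number reset. */
static size_t refQueueReset(RefQueue *q)
{
    size_t n = 0;
    Ref *r = q->head;
    while (r != NULL) {
        Ref *following;
        assert(r->next != NULL);  /* queued References never have NULL next */
        following = (r->next == r) ? NULL : r->next;  /* self-link ends the queue */
        r->next = NULL;
        r = following;
        n++;
    }
    q->head = NULL;
    return n;
}
```

The same routine would serve both the discovery queues and the deferred queues, since in either case restoring a Reference means nothing more than clearing its next field.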

Note: Part of the work of supporting GC aborts is in the use of CVMweakrefIterateQueue() instead of CVMweakrefIterateQueueWithoutDequeueing(). CVMweakrefIterateQueue() will dequeue the reference it is iterating on before calling its callback function. It is assumed that the callback function will either enqueue the reference onto another queue or set its "next" field to NULL before going on to call back into GC code which can abort. Together, both these actions of CVMweakrefIterateQueue() and its callback functions ensure that the queues being operated on are left in a consistent state should the GC choose to abort. Consistency here means that a reference will not appear on more than one queue when we have to handle clean up for GC abort.
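The dequeue-before-callback discipline can be illustrated with a short C sketch. It is written under assumptions and is not the CVM implementation: Ref, RefQueue, RefCallback, refEnqueue(), moveToQueue(), and refQueueIterate() are hypothetical names. The iterator unlinks the head element before invoking the callback, and the example callback re-queues the Reference on a destination queue before doing anything that could abort.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a java.lang.ref.Reference instance. */
typedef struct Ref {
    struct Ref *next;
    void       *referent;
} Ref;

/* Hypothetical queue: a head pointer, NULL when empty. */
typedef struct RefQueue {
    Ref *head;
} RefQueue;

typedef void (*RefCallback)(Ref *ref, void *data);

/* Push on the front; the last element links to itself. */
static void refEnqueue(RefQueue *q, Ref *ref)
{
    ref->next = (q->head != NULL) ? q->head : ref;
    q->head = ref;
}

/* Example callback: put the Reference on the destination queue first.
 * Only after this point would it be safe to call GC code that may abort. */
static void moveToQueue(Ref *ref, void *data)
{
    refEnqueue((RefQueue *)data, ref);
}

/* Iterate while dequeueing: each element leaves the source queue before
 * its callback runs, so it is never reachable from two queues at once. */
static void refQueueIterate(RefQueue *q, RefCallback cb, void *data)
{
    while (q->head != NULL) {
        Ref *ref = q->head;
        assert(ref->next != NULL);  /* queued References never have NULL next */
        q->head = (ref->next == ref) ? NULL : ref->next;
        /* ref->next is stale here; the callback must overwrite it. */
        cb(ref, data);
    }
}
```

If the GC aborted partway through such an iteration, every Reference would be on exactly one of the two queues, so resetting both queues restores all of them.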

-- Main.prasadsanagavarapu - 12 Nov 2006

Revision r1 - 15 Apr 2005 - 22:24:42 - Main.prasadsanagavarapu