(DRAFT) Wonderland Client Asset Caching and Database

This document describes the implementation architecture and design of the client-side caching of assets and the database that stores entries in the cache. The asset cache is simply a collection of files in a directory hierarchy on a user's local disk. Each file in the cache has a corresponding entry in an embedded Java DB (aka Apache Derby) database, also found on a user's local disk.

Cache and Database Location

The asset cache and database reside on a user's local disk, typically in the user's home directory, at a location configured via the ${wonderland.user.dir} (or some such) property. Beneath this directory, the asset cache and database define the following directory structure:

v2/
 |---------- AssetDB/
 |---------- cache/

where the v2/ directory reflects the version of the asset cache and database implementation. This additional directory structure was added in this release so that previous or future asset caches may exist concurrently and not interfere with this version's asset cache. The version number is defined within the asset cache source code.

Also, in this release, the ${wonderland.cache.dir} and ${wonderland.derby.dir} configuration parameters are ignored.
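
As an illustration of the layout above, the following is a minimal sketch of how the two directories could be derived from a base directory. The class name CacheLayout, the VERSION_DIR constant, and the fallback to the user's home directory are assumptions made for this example; they are not taken from the Wonderland source code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper that derives the versioned cache layout from a base directory. */
final class CacheLayout {
    // Assumed constant name; the document only states that the version
    // is defined within the asset cache source code.
    static final String VERSION_DIR = "v2";

    final Path assetDbDir;
    final Path cacheDir;

    CacheLayout(Path userDir) {
        Path versionRoot = userDir.resolve(VERSION_DIR);
        this.assetDbDir = versionRoot.resolve("AssetDB");
        this.cacheDir = versionRoot.resolve("cache");
    }

    public static void main(String[] args) {
        // Assumption: fall back to the home directory when the property is unset.
        String base = System.getProperty("wonderland.user.dir",
                                         System.getProperty("user.home"));
        CacheLayout layout = new CacheLayout(Paths.get(base));
        System.out.println("database: " + layout.assetDbDir);
        System.out.println("cache:    " + layout.cacheDir);
    }
}
```

Keeping the version directory in one place means a future layout change only needs a new constant, and the old v2/ tree is left untouched on disk.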

Asset Database

Every asset in the cache has a corresponding entry in the asset database. The asset database is simply an embedded Java DB (aka Apache Derby) database that is created by the Wonderland client if it does not yet exist. It is placed within the AssetDB/ directory. To check whether an asset exists in the cache, the Wonderland client first queries the asset database. When the Wonderland client adds an asset to the cache, it also adds a corresponding entry in the database; when the asset is later removed, its database entry is also removed.

The asset database consists of one table. Its structure is as follows (Table 1):

Table 1. The columns in the asset database table APP.ASSET

| Column Name | Data Type | Description | Properties |
| ASSET_URI | String (8192 max) | The URI that describes the asset. The format of this URI is described below. | Non-null primary key |
| CHECKSUM | String (40 max) | A string encoding of the asset checksum. | Non-null primary key |
| URL | String (8192 max) | The base URL of the repository from which the asset was obtained. | |
| TYPE | String (10 max) | The asset type: IMAGE, MODEL, FILE, OTHER. | |
| LAST_ACCESSED | BigInt (Long) | The time (in milliseconds since the epoch) the asset was last added, accessed, or updated. | |
| SIZE | BigInt (Long) | The size (in bytes) of the asset in the cache. | |

Each entry in the database has a composite primary key made up of two columns: the ASSET_URI and the CHECKSUM. Assets in the cache, therefore, are uniquely identified by an (ASSET_URI, CHECKSUM) pair. This allows the cache to store different versions of the same asset at the same time: each version of the asset may have the same ASSET_URI, but a different CHECKSUM.
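
The effect of identifying an asset by the (ASSET_URI, CHECKSUM) pair can be sketched with a small value class. The class AssetId below is hypothetical and exists only to illustrate the idea; the checksum strings are placeholders, not real hashes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical value class: an asset is identified by URI and checksum together. */
final class AssetId {
    final String assetUri;
    final String checksum;

    AssetId(String assetUri, String checksum) {
        // Both columns are non-null in the table, so reject nulls here too.
        this.assetUri = Objects.requireNonNull(assetUri);
        this.checksum = Objects.requireNonNull(checksum);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof AssetId)) return false;
        AssetId other = (AssetId) o;
        return assetUri.equals(other.assetUri) && checksum.equals(other.checksum);
    }

    @Override
    public int hashCode() {
        return Objects.hash(assetUri, checksum);
    }

    public static void main(String[] args) {
        // Two versions of the same asset: same URI, different (placeholder) checksums.
        Map<AssetId, Long> sizes = new HashMap<>();
        sizes.put(new AssetId("wlm://mpk20/textures/poster.jpg", "checksum-a"), 1024L);
        sizes.put(new AssetId("wlm://mpk20/textures/poster.jpg", "checksum-b"), 2048L);
        System.out.println(sizes.size()); // prints 2: both versions are kept
    }
}
```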

Asset URI

Every asset is defined by a URI that describes where the asset comes from. In Wonderland v0.5, assets may belong to modules that are installed on a Wonderland server -- these assets may be served by a number of asset repositories located over the Internet. The identity of an asset is tied to its module, and not the asset server from which it was downloaded--even though a Wonderland client may download an asset from one of a number of different asset servers, it is still the "same" asset.

The format of the URI describing an asset belonging to a module is:

wlm://<module name>/<asset path>

where <module name> is the name of the module and the <asset path> is the relative path of the asset within the module. Module names are globally unique and the asset path is unique only within its module.

Assets do not necessarily need to be associated with a module: an asset may instead be a single, explicit copy located somewhere on the Internet, such as a document stored on the web. In this case, the asset URI may be a URL, for example:

http://docs.sun.com/app/docs/doc/819-1771-24.pdf

Finally, assets may belong to the "system-wide" asset repository. This mechanism exists for backward compatibility with Wonderland v0.3 and v0.4. In these versions, the base URL of the asset server is defined by a run-time property; each asset URI is specified as a relative path beneath the base URL of the asset server, for example:

models/mpk20.jme.gz
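
The three URI forms described in this section could be told apart roughly as follows. This is a sketch only: the class AssetUris, the Kind enum, and the rule that any other string containing "://" is treated as a definite URL are assumptions for illustration, not the behavior of the Wonderland client.

```java
/** Hypothetical helper that distinguishes the three kinds of asset URI. */
final class AssetUris {
    enum Kind { MODULE, DEFINITE, SYSTEM }

    static final String MODULE_PREFIX = "wlm://";

    /** Classifies a URI string; the "://" test for definite URLs is an assumption. */
    static Kind classify(String uri) {
        if (uri.startsWith(MODULE_PREFIX)) return Kind.MODULE;
        if (uri.contains("://")) return Kind.DEFINITE;
        return Kind.SYSTEM; // a bare relative path such as models/mpk20.jme.gz
    }

    /** Splits wlm://<module name>/<asset path> into {module name, asset path}. */
    static String[] splitModuleUri(String uri) {
        if (!uri.startsWith(MODULE_PREFIX)) {
            throw new IllegalArgumentException("Not a module URI: " + uri);
        }
        String rest = uri.substring(MODULE_PREFIX.length());
        int slash = rest.indexOf('/');
        if (slash <= 0 || slash == rest.length() - 1) {
            throw new IllegalArgumentException("Missing module name or asset path: " + uri);
        }
        return new String[] { rest.substring(0, slash), rest.substring(slash + 1) };
    }

    public static void main(String[] args) {
        String[] parts = splitModuleUri("wlm://mpk20/textures/poster.jpg");
        System.out.println(parts[0] + " | " + parts[1]); // prints "mpk20 | textures/poster.jpg"
        System.out.println(classify("models/mpk20.jme.gz")); // prints SYSTEM
    }
}
```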

Asset checksums

Whether a cached asset is used or whether an asset is downloaded fresh from an asset server depends (in part) upon whether the checksum of the asset currently cached matches the checksum of the asset currently desired. The checksum is a hex string-encoded representation of the SHA-1 hash of the asset's contents (although the specific hash algorithm used by the implementation is not a key detail).
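
For reference, a hex-encoded SHA-1 checksum can be computed with the standard java.security.MessageDigest class, as in the sketch below. The class name AssetChecksum and the choice of lowercase hex digits are assumptions; the document does not specify the letter case of the encoding. A SHA-1 hash is 20 bytes, so its hex encoding is 40 characters, which matches the 40-character limit of the CHECKSUM column.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper that computes a hex-encoded SHA-1 checksum. */
final class AssetChecksum {
    static String sha1Hex(byte[] contents) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            byte[] hash = digest.digest(contents);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                // Mask to an unsigned value so each byte yields exactly two digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every compliant Java platform is required to provide SHA-1.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(sha1Hex(data)); // prints a9993e364706816aba3e25717850c26c9cd0d89d
    }
}
```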

An asset, therefore, is uniquely identified by the (Asset URI, Checksum) pair. This allows different versions of the same asset to exist within the client-side cache. This is useful in the following example: suppose a user is teleporting between two Wonderland servers that have similar worlds, except one has a more recent version of a module installed that includes updated artwork (with different checksums from the assets in the other world). By uniquely identifying an asset by the (Asset URI, Checksum) pair, the assets on each server are identified to be distinct (even though they have the same asset URI) and both may be cached on a user's local disk at the same time to avoid downloading each asset after every teleport.

Cache size and Least-Recently Used (LRU) replacement scheme

To prevent the asset cache from growing too large, the size of the cache has an upper limit, currently hard-coded in the asset management source code. When the client-side asset manager attempts to add a new asset and finds the asset cache near its maximum size, it frees up space in the asset cache by removing the "oldest" entries until it has enough room to add the new asset.

To help implement this scheme, the size of each asset is stored in the database (SIZE column). This size is computed when the asset is first downloaded and cached. The overhead of maintaining the directory structure of the cache is not included in the size calculation--the maximum cache size, therefore, should not be considered a strict limit. The total size of the cache is computed via the SQL SUM() function.

Each cache entry in the database also maintains the date and time (in milliseconds since the epoch) the asset was last accessed (LAST_ACCESSED column). An asset is accessed when it is first added to the cache, when it is updated in the cache (e.g. if an asset is forcibly re-downloaded), or when it is read from the cache.

Entries from the asset cache are removed only when a new entry must be added and there is not enough room to do so. In such a case, the asset with the smallest (i.e. oldest) "last accessed" value is removed and the total size of the cache is recomputed. If there is still not enough room in the cache for the new asset, the asset with the next smallest "last accessed" value is removed. This process repeats until there is enough room in the asset cache for the new asset.
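
The replacement scheme described above can be sketched in memory as follows. This is not the asset manager's code: the class LruEvictionSketch and its Entry type are hypothetical, an in-memory list stands in for the APP.ASSET table, and totalSize() stands in for the SQL SUM() query over the SIZE column.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical in-memory model of the least-recently-used replacement scheme. */
final class LruEvictionSketch {
    static final class Entry {
        final String assetUri;
        final String checksum;
        final long lastAccessed; // milliseconds since the epoch
        final long size;         // bytes

        Entry(String assetUri, String checksum, long lastAccessed, long size) {
            this.assetUri = assetUri;
            this.checksum = checksum;
            this.lastAccessed = lastAccessed;
            this.size = size;
        }
    }

    private final long maxBytes;
    private final List<Entry> entries = new ArrayList<>();

    LruEvictionSketch(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Stand-in for the SUM() query over the SIZE column. */
    long totalSize() {
        long total = 0;
        for (Entry e : entries) total += e.size;
        return total;
    }

    /** Removes the oldest entries until the new one fits, then adds it. */
    List<Entry> add(Entry newEntry) {
        List<Entry> evicted = new ArrayList<>();
        // Recompute the total after each removal, as the text describes.
        // An entry larger than maxBytes is still added once the cache is empty.
        while (!entries.isEmpty() && totalSize() + newEntry.size > maxBytes) {
            Entry oldest = Collections.min(entries,
                    Comparator.comparingLong((Entry e) -> e.lastAccessed));
            entries.remove(oldest);
            evicted.add(oldest);
        }
        entries.add(newEntry);
        return evicted;
    }

    public static void main(String[] args) {
        LruEvictionSketch cache = new LruEvictionSketch(100);
        cache.add(new Entry("a", "c1", 1L, 40L));
        cache.add(new Entry("b", "c2", 2L, 40L));
        List<Entry> evicted = cache.add(new Entry("c", "c3", 3L, 50L));
        System.out.println(evicted.get(0).assetUri + " " + cache.totalSize()); // prints "a 90"
    }
}
```

In a real implementation each evicted entry would also have its file deleted from the cache directory; the sketch leaves that step out.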

Asset Cache

The cached assets are stored beneath the cache/ directory, where each unique asset can be located knowing only the asset URI and checksum. The cache/ directory has the following sub-directory structure for assets belonging to a module (e.g. wlm://...), definite assets identified by a URL (e.g. http://....), and assets belonging to the system-wide asset repository (e.g. models/mpk20.jme.gz).

cache/
  |--------- modules/
  |--------- definite/
  |--------- system/

where assets belonging to a module are stored beneath the modules/ directory, assets identified by a definite URL are stored beneath the definite/ directory, and assets belonging to the system-wide asset repository are stored beneath the system/ directory.

Structure of the modules/ directory

Assets that belong to modules are uniquely identified, in part, by the name of the module in which they reside, and the relative path of the asset within the module. An asset is also identified by its checksum: slightly different versions of an asset from the same module with the same relative path may both be cached. Therefore, the module name, relative path, and checksum are all part of locating an asset within the cache.

The directory structure beneath the modules/ directory takes the following form:

modules/<module name>/<relative path>/<checksum>

where <module name> is the unique name of the module, <relative path> is the relative path of the asset within the module, and <checksum> is the asset's checksum. For example, suppose two different versions of the same asset exist with the URI: wlm://mpk20/textures/poster.jpg (but have different checksums). The directory structure of the cache would be:

modules/
   |-------- mpk20/
               |-------- textures/
                            |-------- poster.jpg/
                                          |-------- ASD673FLSJKWE432342
                                          |-------- DFGERDIDFGIOCB323ZD

where ASD673FLSJKWE432342 and DFGERDIDFGIOCB323ZD are the checksums of the two different versions and are files. Note that "poster.jpg" is a directory within the cache, not a file.
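
Building the file location for a module asset from its three parts might look like the sketch below. The class ModuleCachePath and its method are hypothetical names chosen for this example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for modules/<module name>/<relative path>/<checksum>. */
final class ModuleCachePath {
    static Path resolve(Path cacheDir, String moduleName,
                        String relativePath, String checksum) {
        return cacheDir.resolve("modules")
                       .resolve(moduleName)
                       .resolve(relativePath) // the asset name becomes a directory
                       .resolve(checksum);    // the checksum is the file itself
    }

    public static void main(String[] args) {
        Path file = resolve(Paths.get("cache"), "mpk20",
                            "textures/poster.jpg", "ASD673FLSJKWE432342");
        System.out.println(file);
    }
}
```

Because the checksum is the last path element, a second version of poster.jpg lands in the same poster.jpg/ directory under a different file name, and neither version overwrites the other.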

Structure of the definite/ directory

Since an asset defined by a definite URL is globally unique, the URL defines the directory hierarchy in which the cache file exists. For example, the asset defined by the URL "http://docs.sun.com/app/docs/doc/819-1771-24.pdf" would be stored in "cache/definite/docs.sun.com/app/docs/doc/819-1771-24.pdf".

Structure of the system/ directory

Since only a single, system-wide repository exists, assets that belong to the system-wide asset repository are uniquely identified by the relative path name of the asset within the repository. The directory hierarchy in which the cache file exists is defined by this relative path. For example, the asset defined by the relative path "models/mpk20.jme.gz" would be stored in "cache/system/models/mpk20.jme.gz".
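
The mappings for the definite/ and system/ directories of the cache/ layout can be sketched in the same way. The class OtherCachePaths is hypothetical, and the handling of definite URLs (host name followed by path, with any port, query string, or fragment ignored) is an assumption based only on the single example given above.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helpers for the definite/ and system/ cache directories. */
final class OtherCachePaths {
    /** Maps a URL to definite/<host>/<path>; port and query are ignored (assumption). */
    static Path definite(Path cacheDir, String url) {
        URI uri = URI.create(url);
        if (uri.getHost() == null || uri.getPath() == null) {
            throw new IllegalArgumentException("URL has no host or path: " + url);
        }
        String path = uri.getPath();
        if (path.startsWith("/")) {
            path = path.substring(1); // keep the result relative to the host directory
        }
        return cacheDir.resolve("definite").resolve(uri.getHost()).resolve(path);
    }

    /** Maps a repository-relative path to system/<relative path>. */
    static Path system(Path cacheDir, String relativePath) {
        return cacheDir.resolve("system").resolve(relativePath);
    }

    public static void main(String[] args) {
        Path cache = Paths.get("cache");
        System.out.println(definite(cache, "http://docs.sun.com/app/docs/doc/819-1771-24.pdf"));
        System.out.println(system(cache, "models/mpk20.jme.gz"));
    }
}
```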

Revision r2 - 07 Aug 2008 - 15:31:10 - Main.jslott