Google Wave’s Federation Protocol Under the Hood, Part 5

Posted Feb 22, 2010 by Anthony in Architecture, Blogs, Google Wave


Purpose [Updated 4/3/2010]


This is the fifth and final post in the series introducing some of the internals of the FedOne protocol.  As you probably realize, FedOne does not persist wavelets; they are held only in memory.  The goals of this post are to show where that in-memory storage takes place and to highlight the information that should be captured in order to restore the state of a wave when a server restarts.

The documentation on the Fed protocol states that local AND remote wavelets are all stored in the wave server’s persistent wave store.  In addition to persisting hosted and remote wavelets, the wave server must also maintain additional information related to its clients (covered later).  I have provided the following graph to illustrate the structure of the in-memory wavelet storage containers:



The process of retrieving a wavelet container is initiated in fedone.waveserver.WaveServerImpl.getOrCreateLocalWavelet() and fedone.waveserver.WaveServerImpl.getOrCreateRemoteWavelet().  These methods call localWaveletContainerFactory.create(waveletName) and remoteWaveletContainerFactory.create(waveletName) respectively; both factories are defined by the fedone.waveserver.ServerModule Guice module.  This module simply calls the constructors of the fedone.waveserver.LocalWaveletContainerImpl and fedone.waveserver.RemoteWaveletContainerImpl classes.  Each of these constructors in turn calls the super constructor of the fedone.waveserver.WaveletContainerImpl class.
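The get-or-create flow described above can be sketched roughly as follows. This is a simplified illustration, not FedOne code: SketchContainer, SketchContainerFactory and SketchWaveServer are hypothetical stand-ins, the wavelet name is reduced to a plain String, and the factory is passed in by hand rather than bound through Guice.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a wavelet container (not a FedOne type). */
interface SketchContainer {
    String name();
}

/** Hypothetical stand-in for the container factories bound in the Guice module. */
interface SketchContainerFactory {
    SketchContainer create(String waveletName);
}

/** Simplified picture of the get-or-create step for hosted wavelets. */
class SketchWaveServer {
    private final Map<String, SketchContainer> localWavelets = new HashMap<>();
    private final SketchContainerFactory localFactory;

    SketchWaveServer(SketchContainerFactory localFactory) {
        this.localFactory = localFactory;
    }

    /** Returns the existing container, or asks the factory for a new one. */
    SketchContainer getOrCreateLocalWavelet(String waveletName) {
        return localWavelets.computeIfAbsent(waveletName, localFactory::create);
    }
}
```

Because the factory is the only place a container is constructed, it is also a natural place to hand a persistent store to the container.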

Most of the work of persistence can be narrowed down to the fedone.waveserver.WaveletContainerImpl class.  Specifically, I suggest checking the persistent store as part of the constructor process, so that if the passed WaveletName is contained in the store, the applied and transformed deltas are retrieved to seed the wavelet container.  Wavelet operations can be committed to the store in the commitAppliedDelta() method.  Updates in these sections will apply to both fedone.waveserver.LocalWaveletContainerImpl and fedone.waveserver.RemoteWaveletContainerImpl, which both call fedone.waveserver.WaveletContainerImpl.commitAppliedDelta().  These two classes will need to be updated to invoke the commit method only when a commit-notice is received.  They will then need to maintain a set of pending deltas, corresponding to wavelet updates, that are dequeued when the corresponding commit-notices are sent (note that this is one strategy of many possibilities).
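As a minimal sketch of the constructor-seeding idea, assuming a store keyed by wavelet name that holds serialized deltas: DeltaStore, InMemoryDeltaStore and PersistentWaveletContainer are hypothetical names invented for this illustration, the wavelet name is a plain String, and only applied deltas are shown. A real implementation would also handle transformed deltas and would write to disk or a database.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical store interface (not part of FedOne); deltas are kept as serialized bytes. */
interface DeltaStore {
    /** Returns the applied deltas saved for a wavelet, oldest first (empty if unknown). */
    List<byte[]> loadAppliedDeltas(String waveletName);

    /** Appends one applied delta to the wavelet's saved history. */
    void appendAppliedDelta(String waveletName, byte[] serializedDelta);
}

/** In-memory stand-in so the sketch runs; a real store would be durable. */
class InMemoryDeltaStore implements DeltaStore {
    private final Map<String, List<byte[]>> deltas = new HashMap<>();

    @Override
    public List<byte[]> loadAppliedDeltas(String waveletName) {
        return new ArrayList<>(deltas.getOrDefault(waveletName, new ArrayList<>()));
    }

    @Override
    public void appendAppliedDelta(String waveletName, byte[] serializedDelta) {
        deltas.computeIfAbsent(waveletName, k -> new ArrayList<>()).add(serializedDelta.clone());
    }
}

/** Simplified stand-in for WaveletContainerImpl that seeds itself from the store. */
class PersistentWaveletContainer {
    private final String waveletName;
    private final DeltaStore store;
    private final List<byte[]> appliedDeltas;

    PersistentWaveletContainer(String waveletName, DeltaStore store) {
        this.waveletName = waveletName;
        this.store = store;
        // Restore any history that was saved before the last shutdown.
        this.appliedDeltas = new ArrayList<>(store.loadAppliedDeltas(waveletName));
    }

    /** Plays the role of commitAppliedDelta(): record in memory, then persist. */
    void commitAppliedDelta(byte[] serializedDelta) {
        appliedDeltas.add(serializedDelta);
        store.appendAppliedDelta(waveletName, serializedDelta);
    }

    int appliedDeltaCount() {
        return appliedDeltas.size();
    }
}
```

Constructing a second PersistentWaveletContainer with the same name and store simulates a server restart: the new container starts with the deltas the first one committed.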

It should be noted that in FedOne the RemoteWaveletContainerImpl maintains a set of pending deltas.  This set is populated when it receives a waveletUpdate from a remote Host server.  However, commitAppliedDelta() is invoked before the actual commit-notice is received, and the newly applied delta is removed from the queue of pending deltas at that point.  When the commit-notice is received by wave.federation.xmpp.XmppFederationRemote.update(), it is passed through to the client via fedone.waveserver.ClientFrontendImpl.waveletCommitted().  Clearly, commit-notices must be handled differently in production, because the wavesandbox currently supports sending and queuing hosted wavelet updates before committing them and sending the commit-notice.

The FedOne Host server currently has no mechanism to queue deltas first and commit them at some later time.  When a delta is submitted to the LocalWaveletContainer, it is immediately committed through the call to commitAppliedDelta() (please note that it is not actually committed, but it is treated as such, and a commit-notice is sent to Remote servers as soon as the deltas are applied).  The wavesandbox queues updates to optimize latency; however, this causes a number of issues that are out of the scope of this post.  To read more about commit-notice and its implications for the wave server, I suggest following this thread on the Google group.
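One way to picture the queue-then-commit strategy is a small holding area that releases deltas only when a commit-notice covers them. The sketch below is an assumption-laden illustration: PendingDeltaQueue is a hypothetical class, deltas are plain byte arrays, and the commit-notice is assumed to identify a wavelet version up to which everything is committed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical helper (not a FedOne class): holds applied deltas until they are committed. */
class PendingDeltaQueue {
    // Keyed by the wavelet version reached after applying the delta (an assumption).
    private final TreeMap<Long, byte[]> pending = new TreeMap<>();
    // Stand-in for the persistent store.
    private final List<byte[]> committed = new ArrayList<>();

    /** Called when a delta has been applied but not yet persisted. */
    void enqueue(long resultingVersion, byte[] serializedDelta) {
        pending.put(resultingVersion, serializedDelta);
    }

    /**
     * Called when a commit-notice for committedVersion is received (remote wavelet)
     * or is about to be sent (hosted wavelet). Every pending delta up to and
     * including that version is moved to the committed list, in version order.
     * Returns the number of deltas committed by this call.
     */
    int onCommitNotice(long committedVersion) {
        NavigableMap<Long, byte[]> ready = pending.headMap(committedVersion, true);
        int count = ready.size();
        committed.addAll(ready.values());
        ready.clear(); // the view is backed by the map, so this dequeues them
        return count;
    }

    int pendingCount() {
        return pending.size();
    }

    int committedCount() {
        return committed.size();
    }
}
```

Keeping the pending deltas ordered by version means a single commit-notice can release several queued deltas at once, and a repeated notice for the same version is harmless.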

To properly restore the WaveServer at startup, it is also necessary to add persistence to the implementation of the wave.crypto.CertPathStore.  The wave.crypto.DefaultCertPathStore notes that this default implementation is in-memory only and will lose certificate chains when the server is shut down.
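A persistent certificate store could be as simple as one file per signer. The following is only a sketch under stated assumptions: FileCertPathStore is a hypothetical class that does not implement the real CertPathStore interface, it works on raw bytes rather than SignerInfo objects, and the ".signer" file naming is invented for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical file-backed store; the real CertPathStore works with SignerInfo objects. */
class FileCertPathStore {
    private final Path directory;

    FileCertPathStore(Path directory) {
        try {
            this.directory = Files.createDirectories(directory);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Saves the serialized signer info in a file named after the hex-encoded signer id. */
    void put(byte[] signerId, byte[] serializedSignerInfo) {
        try {
            Files.write(fileFor(signerId), serializedSignerInfo);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the stored bytes, or null if this signer id has never been stored. */
    byte[] get(byte[] signerId) {
        Path file = fileFor(signerId);
        if (!Files.exists(file)) {
            return null;
        }
        try {
            return Files.readAllBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private Path fileFor(byte[] signerId) {
        StringBuilder hex = new StringBuilder();
        for (byte b : signerId) {
            hex.append(String.format("%02x", b));
        }
        return directory.resolve(hex + ".signer");
    }
}
```

Because the data lives on disk, a new FileCertPathStore pointed at the same directory after a restart can still resolve signer ids that were stored earlier.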

Finally, we must provide some persistence for the client. Specifically, we must store information related to the Index Wave.  The Index Wave is a lookup that contains all waves a participant has access to.  Conceptually, the Index Wave can be thought of as follows:




When a user connects to the client, a call is made to fedone.waveserver.ClientFrontendImpl.openRequest() with the Index Wave ID (specified in fedone.common.CommonConstants.INDEX_WAVE_ID).  This requests the history of all wavelets associated with the Index Wave for that user.  As illustrated in the above graphic, each Wavelet ID in the Index Wave is actually a reference to the Wave ID of a wave in which the user is a participant.
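Conceptually, the information behind the Index Wave boils down to a lookup from participant to the waves they can see. The sketch below models only that idea: IndexWaveLookup is a hypothetical class, participants and wave IDs are plain Strings, and it is not how FedOne represents the Index Wave internally.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of the Index Wave lookup (not the FedOne data structure). */
class IndexWaveLookup {
    private final Map<String, Set<String>> wavesByParticipant = new HashMap<>();

    /** Records that a participant was added to a wave. */
    void addParticipant(String participant, String waveId) {
        wavesByParticipant.computeIfAbsent(participant, k -> new LinkedHashSet<>()).add(waveId);
    }

    /** Records that a participant was removed from a wave. */
    void removeParticipant(String participant, String waveId) {
        Set<String> waves = wavesByParticipant.get(participant);
        if (waves != null) {
            waves.remove(waveId);
        }
    }

    /** Returns the wave IDs the participant can access, in the order they were added. */
    Set<String> wavesFor(String participant) {
        return Collections.unmodifiableSet(
                wavesByParticipant.getOrDefault(participant, Collections.emptySet()));
    }
}
```

Since this mapping can in principle be rebuilt by replaying add_participant and remove_participant operations from the stored deltas, persisting it directly is mainly a way to speed up startup.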

In all, the following information should be persisted: the wavelet identification (WaveletName), applied wavelet deltas, transformed wavelet deltas, the certificate store, and Index Wave information.  These can either be stored in place (for instance, as a ProtoBuf-wrapped object), or the data can be extracted and saved in its native format.

Objects decomposed into native representations (see wave.protocol.common.proto for ProtoBuf based objects)

The following is a breakdown of the ProtocolBuffer objects used in FedOne.  The syntax is <name>/<type>: { <internals of structure> }.

ProtocolAppliedWaveletDelta: {
    signed_original_delta/ProtocolSignedDelta: {
        delta/bytes(ProtocolWaveletDelta): {
            hashed_version/ProtocolHashedVersion: {
                version/int64
                ,history_hash/bytes
            } //end ProtocolHashedVersion
            ,author/String
            ,operation/ProtocolWaveletOperation: {
                add_participant/String
                ,remove_participant/String
                ,mutate_document/MutateDocument (see reference for details)
                ,no_op/bool
            } //end ProtocolWaveletOperation
            ,address_path/String
        } //end of ProtocolWaveletDelta
        ,signature/ProtocolSignature: {
            signature_bytes/bytes
            ,signer_id/bytes(ProtocolSignerInfo): {
                /*The certificates present in a ProtocolSignerInfo are encoded in PkiPath format, and then hashed into signer_id using the hash algorithm indicated in the ProtocolSignerInfo*/
                hash_algorithm/enum
                ,domain/String
                ,certificate/bytes
            } //end of ProtocolSignerInfo
            ,signature_algorithm/enum
        } //end ProtocolSignature
    } //end ProtocolSignedDelta
    ,hashed_version_applied_at/ProtocolHashedVersion: {
        version/int64
        ,history_hash/bytes
    } //end ProtocolHashedVersion
    ,operations_applied/int32
    ,application_timestamp/int64
} //end ProtocolAppliedWaveletDelta


ProtocolWaveletDelta for transformedProtocolDelta. (The operations_applied count in the ProtocolAppliedWaveletDelta above refers to the operations contained in this transformedProtocolDelta):


{ //same as above. repeated here for ease of reference
    hashed_version/ProtocolHashedVersion: {
        version/int64
        ,history_hash/bytes
    } //end ProtocolHashedVersion
    ,author/String
    ,operation/ProtocolWaveletOperation: {
        add_participant/String
        ,remove_participant/String
        ,mutate_document/MutateDocument (see reference for details)
        ,no_op/bool
    } //end ProtocolWaveletOperation
    ,address_path/String
} //end of ProtocolWaveletDelta


WaveletName: {
    waveId/WaveId: {
        domain/String
        ,id/String
    } //end WaveId
    ,waveletId/WaveletId: {
        domain/String
        ,id/String
    } //end WaveletId
} //end WaveletName


<? implements CertPathStore> {
    public SignerInfo get(byte[]);
    public void put(ProtocolSignerInfo);
}


Wrap Up

I hope this series has been a useful companion to the FedOne documentation.  When you really dig into the code as a potential contributor, you realize that there is a very complex (yet elegant) architecture under the hood.  Much was not covered in this series, such as the implementation of the OT algorithms or the current (soon to be replaced) client-server protocol.  This will not be my last post about the wave protocol, but I figured five posts were enough for an introduction/overview.

Further Investigation

The best place to go for more in-depth information is the code itself.  I would also HIGHLY recommend joining the Google User Groups for the FedOne project.  There is a wealth of information being shared there on a daily basis.

A few links if you’re just getting started:

Installation Instructions [http://code.google.com/p/wave-protocol/wiki/Installation]
Installation Instructions for Windows users [http://jamespurser.com.au/blog/How_To_-_Install_WRS_On_Windows]
Wave Protocol Google Group [http://groups.google.com/group/wave-protocol/]
Wave Protocol Code Google Group (more focused on code reviews and repository updates) [http://groups.google.com/group/wave-protocol-code-discuss]
