Scatter Gather System Demo
Overview
The Scatter Gather Demo showed how to start the scatter gather components by hand: a set of worker components to which a client submits jobs via a JavaSpace. That approach is useful up to a point, but it does not address scaling or reliability. To start building these properties into the architecture, we will use the concept of a system of components.
This demo makes use of system managers, provisioners and containers. For more information on the relationships between these services, see System Managers, Provisioners and Containers.
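As a reminder of the pattern the workers implement, here is a minimal, framework-free sketch of scatter gather. It is not Newton or JavaSpaces code: an in-process `queue.Queue` stands in for the shared space, threads stand in for worker components, and the names `worker` and `scatter_gather` and the squaring "job" are hypothetical, chosen only for illustration.

```python
import queue
import threading


def worker(jobs, results):
    """Take jobs until a None sentinel arrives; write one result per job."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: no more work for this worker
            break
        job_id, value = job
        results.put((job_id, value * value))  # stand-in computation


def scatter_gather(values, worker_count=2):
    """Scatter one job per value to the workers, then gather results in job order."""
    jobs = queue.Queue()     # stands in for the shared space (assumption)
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(jobs, results))
        for _ in range(worker_count)
    ]
    for thread in threads:
        thread.start()

    # Scatter: one entry per job, tagged with an id so results can be matched up.
    for job_id, value in enumerate(values):
        jobs.put((job_id, value))
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()

    # Gather: results arrive in arbitrary order, so index them by job id.
    gathered = {}
    while not results.empty():
        job_id, result = results.get()
        gathered[job_id] = result
    return [gathered[job_id] for job_id in range(len(values))]


if __name__ == "__main__":
    print(scatter_gather([1, 2, 3, 4]))
```

The client never addresses a particular worker; it only writes to and reads from the shared structure. That decoupling is what lets the number of workers change without the client noticing.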
Discussion
The following XML markup defines a scatter gather worker system:
<?xml version="1.0"?>
<system name="scattergather-workers" boundary="fabric">
  <description>A system that sets up one worker for every container in the grid</description>
  <system.composite name="worker" bundle="sg-worker-bundle" template="workerTemplate" version="0.1">
    <replicator name="scale" />
  </system.composite>
  <system.composite name="space" bundle="blitz-bundle" template="blitzTemplate" version="0.1" />
  <replication.handler name="scale" type="org.cauldron.newton.system.replication.ScalableReplicationHandler">
    <config name="scaleFactor" value="1" type="float"/>
  </replication.handler>
</system>
This defines a system that contains one Blitz JavaSpace and a scalable set of workers. In this case the workers scale to match the number of available containers in the network. Note the boundary="fabric" attribute; see Boundary for more information on system boundaries.
To test this system we will use five containers in separate JVMs (possibly on different machines; it is up to you, though bear in mind multicast visibility). The containers we will use, and the components that will be installed on them, are listed below.
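To make the structure of the definition concrete, the sketch below reads it with Python's standard `xml.etree.ElementTree` and reports which composite is tied to which replication handler. The helper names `describe_system` and `assumed_worker_target` are hypothetical. The scaling rule is an assumption inferred from the description "one worker for every container" with a scaleFactor of 1; the real behaviour of ScalableReplicationHandler, including any rounding, may differ.

```python
import math
import xml.etree.ElementTree as ET

# The system definition from above, with the <description> element omitted for brevity.
SYSTEM_XML = """<?xml version="1.0"?>
<system name="scattergather-workers" boundary="fabric">
  <system.composite name="worker" bundle="sg-worker-bundle" template="workerTemplate" version="0.1">
    <replicator name="scale" />
  </system.composite>
  <system.composite name="space" bundle="blitz-bundle" template="blitzTemplate" version="0.1" />
  <replication.handler name="scale" type="org.cauldron.newton.system.replication.ScalableReplicationHandler">
    <config name="scaleFactor" value="1" type="float"/>
  </replication.handler>
</system>
"""


def describe_system(xml_text):
    """Return (system name, boundary, composite -> replicator name, handler -> config)."""
    root = ET.fromstring(xml_text)
    composites = {}
    handlers = {}
    for element in root:
        if element.tag == "system.composite":
            replicator = element.find("replicator")
            composites[element.get("name")] = (
                replicator.get("name") if replicator is not None else None
            )
        elif element.tag == "replication.handler":
            handlers[element.get("name")] = {
                config.get("name"): config.get("value")
                for config in element
                if config.tag == "config"
            }
    return root.get("name"), root.get("boundary"), composites, handlers


def assumed_worker_target(container_count, scale_factor):
    """ASSUMPTION, not confirmed by the source: workers = floor(containers * scaleFactor)."""
    return math.floor(container_count * scale_factor)


if __name__ == "__main__":
    name, boundary, composites, handlers = describe_system(SYSTEM_XML)
    scale = float(handlers[composites["worker"]]["scaleFactor"])
    print(name, boundary, composites)
    for containers in (0, 1, 2):
        print(containers, "containers ->", assumed_worker_target(containers, scale), "workers")
```

Only the worker composite carries a replicator element, so only the workers are replicated; the space composite has none and is therefore installed once.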
- Container 1: Boot Container
  - Reggie
  - CDS Server
  - Remote System CLI
  - Remote Provisioner CLI
- Container 2: Fabric System Manager
  - Remote System Manager
  - Remote Container Registry
- Container 3: Fabric Provisioner
  - Remote Provisioner
  - Remote Container Registry
- Container 4: Fabric Container
  - Remote Container
- Container 5: Fabric Container
  - Remote Container
If you are running all the containers on the same machine, start each one with a different -instance=N argument, for example bin/container -instance=1, bin/container -instance=2, and so on.
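If it helps to keep track of which instance number plays which role, a small lookup like the following can print the start command for each container. The role names and the `-instance=N` argument come from the list above; the `LAYOUT` table and `start_command` helper are hypothetical conveniences, not part of the distribution.

```python
# Instance number -> role, as listed above.
LAYOUT = {
    1: "Boot Container",
    2: "Fabric System Manager",
    3: "Fabric Provisioner",
    4: "Fabric Container",
    5: "Fabric Container",
}


def start_command(instance):
    """Build the command line that starts one container instance."""
    return f"bin/container -instance={instance}"


if __name__ == "__main__":
    # Run each printed command in its own terminal.
    for instance, role in LAYOUT.items():
        print(f"{start_command(instance)}    # {role}")
```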
Running the demo
So let's get started:
Container 1
$ bin/container -instance=1
...
> installer install etc/instances/reggie.composite
...
> installer install etc/instances/server-cds.composite
...
> installer install etc/instances/remote-manager-cli.composite
...
> installer install etc/instances/remote-provisioner-cli.composite
...
> cds publish remote etc/cds/jini-service-bundles.xml
> cds scan remote examples/scattergather/build/lib
This sets up Reggie, the CDS server and the CLI components in container 1, and scans the Jini service and demo component bundles into the remote CDS repository.
Container 2
$ bin/container -instance=2
...
> installer install etc/instances/remote-registry.composite
> system manage etc/systems/remote-manager.system.xml
This sets up a remote registry and the components that make up the system manager.
Container 3
$ bin/container -instance=3
...
> installer install etc/instances/remote-registry.composite
> system manage etc/systems/remote-provisioner.system.xml
This sets up a remote registry and the remote provisioner components.
Now to submit the scatter gather worker system.
Container 1
> remote-system manage examples/scattergather/build/etc/worker.system
You have now submitted the system, but there are no containers on which to install the components, so the system waits. Let's start up a container.
Container 4
$ bin/container -instance=4
...
> system manage etc/systems/remote-container.system.xml
The system will notice the new remote container and install the Blitz instance and one worker on it. You can check this by using the remote provisioner CLI:
Container 1
> remote-provisioner status
Now let's add a second container:
Container 5
$ bin/container -instance=5
...
> system manage etc/systems/remote-container.system.xml
Again the system will notice the new container and install a second worker instance on it. You can now submit jobs via the client to both of these workers.
Container 1
> installer install examples/scattergather/build/etc/client.composite
> installer install examples/scattergather/build/etc/cli.composite
> scattergather submit
Now to test resilience. You can stop one of the containers, and the system components will be dynamically reprovisioned to the other container. Let's stop the first container, which is hosting Blitz:
Container 4
> system retire etc/systems/remote-container.system.xml
You should see Blitz start up on container 5. If you restart container 4, you should see a second worker installed.
When you are finished with the scatter gather components, you can remove them from the containers using the following commands:
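The sequence just walked through (no containers, container 4 joins, container 5 joins, container 4 retires, container 4 returns) can be summarised with a toy model. This is only a sketch of the observable behaviour described in this demo, not the provisioner's actual algorithm: the `FabricModel` class and its rules (one worker per container, a single Blitz instance placed on the first available container when it has no home) are assumptions made for illustration.

```python
class FabricModel:
    """Toy model: one worker per container, exactly one blitz instance overall."""

    def __init__(self):
        self.containers = {}  # container name -> set of installed component names

    def _reconcile(self):
        # Assumed rule 1: every available container hosts one worker.
        for components in self.containers.values():
            components.add("worker")
        # Assumed rule 2: if blitz has no home and a container exists, place it
        # on the first container by name. With no containers, the system waits.
        hosted = any("blitz" in c for c in self.containers.values())
        if self.containers and not hosted:
            first = sorted(self.containers)[0]
            self.containers[first].add("blitz")

    def add_container(self, name):
        self.containers[name] = set()
        self._reconcile()

    def remove_container(self, name):
        self.containers.pop(name, None)
        self._reconcile()

    def status(self):
        return {name: sorted(c) for name, c in sorted(self.containers.items())}


if __name__ == "__main__":
    fabric = FabricModel()
    print("submitted, no containers:", fabric.status())
    fabric.add_container("container4")
    print("container 4 started:", fabric.status())
    fabric.add_container("container5")
    print("container 5 started:", fabric.status())
    fabric.remove_container("container4")
    print("container 4 retired:", fabric.status())
    fabric.add_container("container4")
    print("container 4 restarted:", fabric.status())
```

In this model, as in the demo, Blitz stays on container 5 after container 4 returns, and the returning container only receives a worker.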
Container 1
> installer uninstall examples/scattergather/build/etc/cli.composite
> installer uninstall examples/scattergather/build/etc/client.composite
> remote-system retire examples/scattergather/build/etc/worker.system


