I - Vertex idV - Vertex dataE - Edge datapublic class BspServiceMaster<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable> extends BspService<I,V,E> implements CentralizedServiceMaster<I,V,E>, ResetSuperstepMetricsObserver
CentralizedServiceMaster.| Modifier and Type | Field and Description |
|---|---|
static int |
DEFAULT_INPUT_SPLIT_THREAD_COUNT
Default number of threads to use when writing input splits to zookeeper
|
static int |
MAX_PRINTABLE_REMAINING_WORKERS
Print worker names only if there are 10 workers left
|
static String |
NUM_MASTER_ZK_INPUT_SPLIT_THREADS
How many threads to use when writing input splits to zookeeper
|
APPLICATION_ATTEMPTS_DIR, applicationAttemptsPath, BASE_DIR, basePath, checkpointBasePath, CLEANED_UP_DIR, cleanedUpPath, COUNTERS_DIR, FORCE_CHECKPOINT_USER_FLAG, HALT_COMPUTATION_NODE, haltComputationPath, INPUT_SPLITS_ALL_DONE_NODE, INPUT_SPLITS_WORKER_DONE_DIR, INPUT_SUPERSTEP, inputSplitsAllDonePath, inputSplitsWorkerDonePath, JSONOBJ_APPLICATION_ATTEMPT_KEY, JSONOBJ_METRICS_KEY, JSONOBJ_NUM_MESSAGE_BYTES_KEY, JSONOBJ_NUM_MESSAGES_KEY, JSONOBJ_STATE_KEY, JSONOBJ_SUPERSTEP_KEY, KRYO_REGISTERED_CLASS_DIR, kryoRegisteredClassPath, MASTER_ELECTION_DIR, MASTER_JOB_STATE_NODE, MASTER_SUFFIX, masterElectionPath, masterJobStatePath, MEMORY_OBSERVER_DIR, memoryObserverPath, METRICS_DIR, PARTITION_EXCHANGE_DIR, savedCheckpointBasePath, SUPERSTEP_DIR, SUPERSTEP_FINISHED_NODE, UNSET_APPLICATION_ATTEMPT, UNSET_SUPERSTEP, WORKER_FINISHED_DIR, WORKER_HEALTHY_DIR, WORKER_SUFFIX, WORKER_UNHEALTHY_DIR, WORKER_WROTE_CHECKPOINT_DIR| Constructor and Description |
|---|
BspServiceMaster(org.apache.hadoop.mapreduce.Mapper.Context context,
GraphTaskManager<I,V,E> graphTaskManager)
Constructor for setting up the master.
|
| Modifier and Type | Method and Description |
|---|---|
void |
addGiraphTimersAndSendCounters(long superstep)
We add the Giraph Timers separately, because we need to include
the time required for shutdown and cleanup
This will fetch the final Giraph Timers, and send all the counters
to the job client
|
boolean |
becomeMaster()
Become the master.
|
List<WorkerInfo> |
checkWorkers()
Check all the
WorkerInfo objects to ensure
that a minimum number of good workers exists out of the total that have
reported. |
void |
cleanup(SuperstepState superstepState)
Clean up the service (no calls may be issued after this)
|
SuperstepState |
coordinateSuperstep()
Master coordinates the superstep
|
int |
createEdgeInputSplits()
Create the
BspInputSplit objects from the index range based on the
user-defined EdgeInputFormat. |
int |
createMappingInputSplits()
Create the
BspInputSplit objects from the index range based on the
user-defined MappingInputFormat. |
int |
createVertexInputSplits()
Create the
BspInputSplit objects from the index range based on the
user-defined VertexInputFormat. |
void |
failureCleanup(Exception e)
Called when the job fails in order to let the Master do any cleanup.
|
AggregatorToGlobalCommTranslation |
getAggregatorTranslationHandler()
Handler for aggregators to reduce/broadcast translation
|
MasterGlobalCommHandler |
getGlobalCommHandler()
Get handler for global communication
|
long |
getLastGoodCheckpoint()
Get the last known good checkpoint
|
MasterCompute |
getMasterCompute()
Get MasterCompute object
|
MasterInfo |
getMasterInfo()
Get master info
|
BspEvent |
getSuperstepStateChangedEvent()
Event that the master watches that denotes if a worker has done something
that changes the state of a superstep (either a worker completed or died)
|
List<WorkerInfo> |
getWorkerInfoList()
Get list of workers
|
BspEvent |
getWorkerWroteCheckpointEvent()
Event that the master watches that denotes when a worker wrote checkpoint
|
void |
newSuperstep(SuperstepMetricsRegistry superstepMetrics)
Starting a new superstep.
|
void |
postApplication()
Application has finished.
|
void |
postSuperstep()
Superstep has finished.
|
boolean |
processEvent(org.apache.zookeeper.WatchedEvent event)
Derived classes that want additional ZooKeeper events to take action
should override this.
|
void |
restartFromCheckpoint(long checkpoint)
Master can decide to restart from the last good checkpoint if a
worker fails during a superstep.
|
void |
setJobState(ApplicationState state,
long applicationAttempt,
long desiredSuperstep)
If the master decides that this job doesn't have the resources to
continue, it can fail the job.
|
void |
setup()
Setup (must be called prior to any other function)
|
getApplicationAttempt, getApplicationAttemptChangedEvent, getCheckpointBasePath, getCleanedUpChildrenChangedEvent, getConfiguration, getContext, getFs, getGraphPartitionerFactory, getGraphTaskManager, getHealthyHostnameIdFromPath, getHostname, getHostnameTaskId, getInputSplitsAllDoneEvent, getInputSplitsWorkerDoneEvent, getJobId, getJobProgressTracker, getJobState, getLastCheckpointedSuperstep, getMasterElectionChildrenChangedEvent, getPartitionExchangePath, getPartitionExchangeWorkerPath, getRestartedSuperstep, getSavedCheckpointBasePath, getSuperstep, getSuperstepFinishedEvent, getSuperstepFinishedPath, getSuperstepFromPath, getSuperstepPath, getTaskId, getWorkerCountersFinishedPath, getWorkerHealthRegistrationChangedEvent, getWorkerId, getWorkerInfoById, getWorkerInfoHealthyPath, getWorkerInfoUnhealthyPath, getWorkerMetricsFinishedPath, getWorkerWroteCheckpointPath, getWrittenCountersToZKEvent, getZkExt, incrCachedSuperstep, process, registerBspEvent, setApplicationAttempt, setCachedSuperstep, setRestartedSuperstepclone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, waitgetConfiguration, getJobProgressTracker, getRestartedSuperstep, getSupersteppublic static final int MAX_PRINTABLE_REMAINING_WORKERS
public static final String NUM_MASTER_ZK_INPUT_SPLIT_THREADS
public static final int DEFAULT_INPUT_SPLIT_THREAD_COUNT
public BspServiceMaster(org.apache.hadoop.mapreduce.Mapper.Context context,
GraphTaskManager<I,V,E> graphTaskManager)
context - Mapper contextgraphTaskManager - GraphTaskManager for this compute nodepublic void newSuperstep(SuperstepMetricsRegistry superstepMetrics)
ResetSuperstepMetricsObservernewSuperstep in interface ResetSuperstepMetricsObserversuperstepMetrics - SuperstepMetricsRegistry being used.public void setJobState(ApplicationState state, long applicationAttempt, long desiredSuperstep)
CentralizedServiceMastersetJobState in interface CentralizedServiceMaster<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>state - State of the application.applicationAttempt - Attempt to start ondesiredSuperstep - Superstep to restart from (if applicable)public List<WorkerInfo> checkWorkers()
CentralizedServiceMasterWorkerInfo objects to ensure
that a minimum number of good workers exists out of the total that have
reported.checkWorkers in interface CentralizedServiceMaster<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>public int createMappingInputSplits()
CentralizedServiceMasterBspInputSplit objects from the index range based on the
user-defined MappingInputFormat. The BspInputSplit objects will
processed by the workers later on during the INPUT_SUPERSTEP.createMappingInputSplits in interface CentralizedServiceMaster<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>public int createVertexInputSplits()
CentralizedServiceMasterBspInputSplit objects from the index range based on the
user-defined VertexInputFormat. The BspInputSplit objects will
processed by the workers later on during the INPUT_SUPERSTEP.createVertexInputSplits in interface CentralizedServiceMaster<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>public int createEdgeInputSplits()
CentralizedServiceMasterBspInputSplit objects from the index range based on the
user-defined EdgeInputFormat. The BspInputSplit objects will
processed by the workers later on during the INPUT_SUPERSTEP.createEdgeInputSplits in interface CentralizedServiceMaster<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>public List<WorkerInfo> getWorkerInfoList()
CentralizedServicegetWorkerInfoList in interface CentralizedService<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>public MasterGlobalCommHandler getGlobalCommHandler()
CentralizedServiceMastergetGlobalCommHandler in interface CentralizedServiceMaster<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>public AggregatorToGlobalCommTranslation getAggregatorTranslationHandler()
CentralizedServiceMastergetAggregatorTranslationHandler in interface CentralizedServiceMaster<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>public MasterCompute getMasterCompute()
CentralizedServiceMastergetMasterCompute in interface CentralizedServiceMaster<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>public void setup()
CentralizedServiceMastersetup in interface CentralizedServiceMaster<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>public boolean becomeMaster()
CentralizedServiceMasterbecomeMaster in interface CentralizedServiceMaster<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>public MasterInfo getMasterInfo()
CentralizedServicegetMasterInfo in interface CentralizedService<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>public void restartFromCheckpoint(long checkpoint)
CentralizedServiceMasterrestartFromCheckpoint in interface CentralizedServiceMaster<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>checkpoint - Checkpoint to restart frompublic long getLastGoodCheckpoint()
throws IOException
CentralizedServiceMastergetLastGoodCheckpoint in interface CentralizedServiceMaster<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>IOExceptionpublic SuperstepState coordinateSuperstep() throws org.apache.zookeeper.KeeperException, InterruptedException
CentralizedServiceMastercoordinateSuperstep in interface CentralizedServiceMaster<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>org.apache.zookeeper.KeeperExceptionInterruptedExceptionpublic void addGiraphTimersAndSendCounters(long superstep)
addGiraphTimersAndSendCounters in interface CentralizedServiceMaster<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>superstep - superstep for which the GiraphTimer will be sentpublic void postApplication()
CentralizedServiceMasterpostApplication in interface CentralizedServiceMaster<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>public void postSuperstep()
CentralizedServiceMasterpostSuperstep in interface CentralizedServiceMaster<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>public void failureCleanup(Exception e)
CentralizedServiceMasterfailureCleanup in interface CentralizedServiceMaster<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>e - Exception job failed from. May be null.public void cleanup(SuperstepState superstepState) throws IOException
CentralizedServiceMastercleanup in interface CentralizedServiceMaster<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>superstepState - what was the state
of the last complete superstep?IOExceptionpublic final BspEvent getWorkerWroteCheckpointEvent()
public final BspEvent getSuperstepStateChangedEvent()
public boolean processEvent(org.apache.zookeeper.WatchedEvent event)
BspServiceprocessEvent in class BspService<I extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable,E extends org.apache.hadoop.io.Writable>event - Event that occurredCopyright © 2011-2020 The Apache Software Foundation. All Rights Reserved.