MapReduce Task Creation and Assignment Flow

cloudeagle_bupt

This article covers the topic well: http://blog.csdn.net/jackydai987/article/details/6227365

Here is a summary of the main flow:

1. JobClient.runJob()

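Splits the input data according to the InputFormat class configured by the user, writes the resulting information into three files (job.jar, job.split, and job.xml), and stores them in HDFS.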
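To make the splitting step concrete, here is a minimal sketch of how a file-based InputFormat might cut one file into byte ranges. It is not the Hadoop implementation: the class SplitSketch, the Split type, and computeSplits are hypothetical names, and the split size is simply passed in. The 1.1 slop factor mirrors the one FileInputFormat uses so that a small tail is folded into the last split instead of becoming a tiny split of its own. Host locations, which the real splits also carry, are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical names, not Hadoop classes.
class SplitSketch {
    // A split is a byte range of the input file.
    static final class Split {
        final long offset;
        final long length;

        Split(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    // A tail of up to 10% of the split size is merged into the last split.
    static final double SPLIT_SLOP = 1.1;

    static List<Split> computeSplits(long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<Split> splits = new ArrayList<>();
        long remaining = fileLength;
        // Emit full-size splits while more than 1.1 split sizes remain.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits.add(new Split(fileLength - remaining, splitSize));
            remaining -= splitSize;
        }
        // Whatever is left (possibly slightly more than splitSize) is the last split.
        if (remaining > 0) {
            splits.add(new Split(fileLength - remaining, remaining));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Prints 0+128, 128+128, 256+44 on separate lines.
        for (Split s : computeSplits(300L, 128L)) {
            System.out.println(s.offset + "+" + s.length);
        }
    }
}
```

Each split produced this way later becomes one map TIP in initTasks().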

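2. JobTracker.submitJob()

Creates a new JIP (JobInProgress) object. During its initialization, the three files job.jar, job.split, and job.xml are copied into a temporary directory on the local file system. After a series of operations by the listeners and related components, JT.JobInitThread (JT being the JobTracker) finally calls the JIP's initTasks() function to initialize the job.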

3. initTasks()

There are four key data structures here:

// NetworkTopology Node to the set of TIPs
Map<Node, List<TaskInProgress>> nonRunningMapCache;

// Map of NetworkTopology Node to set of running TIPs
Map<Node, Set<TaskInProgress>> runningMapCache;

// A list of non-local, non-running maps
final List<TaskInProgress> nonLocalMaps;

// A set of non-local running maps
Set<TaskInProgress> nonLocalRunningMaps;

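First, one TIP object is created for each split obtained, and createCache initializes nonRunningMapCache by building the hostMaps lists, that is, the relationship between nodes and tasks. TIPs for tasks that have no data locality are placed in nonLocalMaps.

The relationship is built from near to far, from the host node up to its rack node: the task is added to the list of the host node and then to the list of its parent (the rack node). Because maxLevel defaults to 2, nonRunningMapCache generally holds node-local tasks and rack-local tasks.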

  private Map<Node, List<TaskInProgress>> createCache(
                                 TaskSplitMetaInfo[] splits, int maxLevel)
                                 throws UnknownHostException {
    Map<Node, List<TaskInProgress>> cache = 
      new IdentityHashMap<Node, List<TaskInProgress>>(maxLevel);
    Set<String> uniqueHosts = new TreeSet<String>();
    for (int i = 0; i < splits.length; i++) {
      String[] splitLocations = splits[i].getLocations();
      if (splitLocations == null || splitLocations.length == 0) {
        nonLocalMaps.add(maps[i]);
        continue;
      }

      for(String host: splitLocations) {
        Node node = jobtracker.resolveAndAddToTopology(host);
        uniqueHosts.add(host);
        LOG.info("tip:" + maps[i].getTIPId() + " has split on node:" + node);
        for (int j = 0; j < maxLevel; j++) {
          List<TaskInProgress> hostMaps = cache.get(node);
          if (hostMaps == null) {
            hostMaps = new ArrayList<TaskInProgress>();
            cache.put(node, hostMaps);
            hostMaps.add(maps[i]);
          }
          //check whether the hostMaps already contains an entry for a TIP
          //This will be true for nodes that are racks and multiple nodes in
          //the rack contain the input for a tip. Note that if it already
          //exists in the hostMaps, it must be the last element there since
          //we process one TIP at a time sequentially in the split-size order
          if (hostMaps.get(hostMaps.size() - 1) != maps[i]) {
            hostMaps.add(maps[i]);
          }
          node = node.getParent();
        }
      }
    }
    
    // Calibrate the localityWaitFactor - Do not override user intent!
    if (localityWaitFactor == DEFAULT_LOCALITY_WAIT_FACTOR) {
      int jobNodes = uniqueHosts.size();
      int clusterNodes = jobtracker.getNumberOfUniqueHosts();
      
      if (clusterNodes > 0) {
        localityWaitFactor = 
          Math.min((float)jobNodes/clusterNodes, localityWaitFactor);
      }
      LOG.info(jobId + " LOCALITY_WAIT_FACTOR=" + localityWaitFactor);
    }
    
    return cache;
  }
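As a worked illustration of what createCache produces, the following self-contained sketch rebuilds the same structure with toy types. Everything in it is hypothetical: tasks are plain strings instead of TaskInProgress objects, Node is a two-field stand-in for Hadoop's topology node, and buildCache is not a Hadoop method. It keeps the two ideas from the code above: each task is registered at its host and at up to maxLevel - 1 ancestors, and a task is not added twice to a rack's list when several hosts in that rack hold its split.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical names, not Hadoop classes.
class LocalityCacheSketch {
    // Minimal topology node: a host points to its rack, a rack has no parent here.
    static final class Node {
        final String name;
        final Node parent;

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    static Map<Node, List<String>> buildCache(String[] tasks, Node[][] locations,
                                              int maxLevel, List<String> nonLocal) {
        Map<Node, List<String>> cache = new IdentityHashMap<>();
        for (int i = 0; i < tasks.length; i++) {
            // Tasks without any split location are tracked separately.
            if (locations[i] == null || locations[i].length == 0) {
                nonLocal.add(tasks[i]);
                continue;
            }
            for (Node host : locations[i]) {
                Node node = host;
                // Level 0 is the host itself, level 1 its rack, and so on.
                for (int level = 0; level < maxLevel && node != null; level++) {
                    List<String> tasksAtNode = cache.get(node);
                    if (tasksAtNode == null) {
                        tasksAtNode = new ArrayList<>();
                        cache.put(node, tasksAtNode);
                    }
                    // Tasks are processed one at a time, so a duplicate can
                    // only be the last element of the list.
                    if (tasksAtNode.isEmpty()
                            || !tasksAtNode.get(tasksAtNode.size() - 1).equals(tasks[i])) {
                        tasksAtNode.add(tasks[i]);
                    }
                    node = node.parent;
                }
            }
        }
        return cache;
    }

    public static void main(String[] args) {
        Node rack1 = new Node("/rack1", null);
        Node rack2 = new Node("/rack2", null);
        Node h1 = new Node("h1", rack1);
        Node h2 = new Node("h2", rack1);
        Node h3 = new Node("h3", rack2);

        List<String> nonLocal = new ArrayList<>();
        Map<Node, List<String>> cache = buildCache(
                new String[] {"m0", "m1", "m2"},
                new Node[][] {{h1, h2}, {h3}, {}},
                2, nonLocal);

        System.out.println("rack1 -> " + cache.get(rack1));
        System.out.println("h3 -> " + cache.get(h3));
        System.out.println("nonLocal -> " + nonLocal);
    }
}
```

In this example m0 has replicas on h1 and h2, both in /rack1, so it appears once under h1, once under h2, and only once under /rack1; m2 has no locations and ends up in the non-local list.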

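4. Task assignment

Locality levels are distinguished by the maxLevel parameter of findNewMapTask: maxLevel=1 schedules node-local tasks only, and maxLevel=2 schedules node-local or rack-local tasks.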
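The effect of maxLevel on the lookup can be sketched as follows. This is a simplified illustration, not the body of findNewMapTask: the names LevelLookupSketch, Node, and findTask are hypothetical, tasks are plain strings, and the real method also deals with failed, speculative, and non-local tasks. The sketch only shows the core idea: start at the TaskTracker's own node and climb at most maxLevel levels of the topology, returning the first pending task found.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical names, not Hadoop classes.
class LevelLookupSketch {
    static final class Node {
        final String name;
        final Node parent;

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    // Returns a pending task reachable within maxLevel levels of trackerNode,
    // or null if none is found.
    static String findTask(Node trackerNode, Map<Node, List<String>> nonRunning,
                           int maxLevel) {
        Node key = trackerNode;
        for (int level = 0; level < maxLevel && key != null; level++) {
            List<String> candidates = nonRunning.get(key);
            if (candidates != null && !candidates.isEmpty()) {
                return candidates.get(0);
            }
            // Nothing at this level: widen the search to the parent (the rack).
            key = key.parent;
        }
        return null;
    }

    public static void main(String[] args) {
        Node rack = new Node("/rack1", null);
        Node h1 = new Node("h1", rack);
        Node h2 = new Node("h2", rack);

        // m0's split lives on h1, so it is registered under h1 and its rack.
        Map<Node, List<String>> nonRunning = new IdentityHashMap<>();
        List<String> atHost = new ArrayList<>();
        atHost.add("m0");
        List<String> atRack = new ArrayList<>();
        atRack.add("m0");
        nonRunning.put(h1, atHost);
        nonRunning.put(rack, atRack);

        System.out.println(findTask(h1, nonRunning, 1)); // m0 (node-local)
        System.out.println(findTask(h2, nonRunning, 1)); // null (nothing node-local on h2)
        System.out.println(findTask(h2, nonRunning, 2)); // m0 (rack-local)
    }
}
```

A tracker on h2 gets nothing with maxLevel=1, but with maxLevel=2 the search reaches /rack1 and picks up m0 as a rack-local task.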

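The main assignment function is scheduleMap. Its job is to take the matching TIP found in nonRunningMapCache (local or non-local) and place it into runningMapCache.

TIPs for tasks without data locality are handled the same way: the task is found in nonLocalMaps and placed into nonLocalRunningMaps.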

  protected synchronized void scheduleMap(TaskInProgress tip) {
    
    if (runningMapCache == null) {
      LOG.warn("Running cache for maps is missing!! " 
               + "Job details are missing.");
      return;
    }
    String[] splitLocations = tip.getSplitLocations();

    // Add the TIP to the list of non-local running TIPs
    if (splitLocations == null || splitLocations.length == 0) {
      nonLocalRunningMaps.add(tip);
      return;
    }

    for(String host: splitLocations) {
      Node node = jobtracker.getNode(host);

      for (int j = 0; j < maxLevel; ++j) {
        Set<TaskInProgress> hostMaps = runningMapCache.get(node);
        if (hostMaps == null) {
          // create a cache if needed
          hostMaps = new LinkedHashSet<TaskInProgress>();
          runningMapCache.put(node, hostMaps);
        }
        hostMaps.add(tip);
        node = node.getParent();
      }
    }
  }

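The above covers task creation and locality-aware assignment. The assignment flow for non-local tasks will be described in a later post when time permits.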