Tuesday 4 August 2015

Hadoop: Partitioner


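  • A custom partitioner can improve application performance by controlling how keys are spread across the reducers
  • The number of reducers is set in the driver code
  • The partitioner decides which reducer receives each key,value pair; all pairs with the same key go to the same reducer.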
  • Partitioner code:

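import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Partitioner;

// Routes each key to a reducer based on the length of the key.
public class MyPartitioner implements Partitioner<Text, IntWritable>
{
    // No job-specific setup is needed here.
    public void configure(JobConf conf)
    {
    }

    // Returns the index of the reducer that receives this key.
    public int getPartition(Text key, IntWritable value, int numReduceTasks)
    {
        String s = key.toString();
        if (s.length() == 1)
        {
            return 0;
        }
        if (s.length() == 2)
        {
            return 1;
        }
        if (s.length() == 3)
        {
            return 2;
        }
        return 3;
    }
}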
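To see what this rule does without running a job, here is a minimal plain-Java sketch that applies the same length-based rule to a few sample keys. The class name LengthPartitionDemo, the helper names and the sample words are made up for illustration; nothing here uses Hadoop classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone demo; it mirrors MyPartitioner but needs no Hadoop classes.
class LengthPartitionDemo {

    // Same rule as getPartition: key length 1, 2, 3 -> reducer 0, 1, 2; anything else -> 3.
    static int partitionFor(String key) {
        int length = key.length();
        if (length >= 1 && length <= 3) {
            return length - 1;
        }
        return 3;
    }

    // Groups the keys by the reducer index they would be sent to.
    static Map<Integer, List<String>> group(List<String> keys) {
        Map<Integer, List<String>> byReducer = new TreeMap<>();
        for (String key : keys) {
            byReducer.computeIfAbsent(partitionFor(key), k -> new ArrayList<>()).add(key);
        }
        return byReducer;
    }

    public static void main(String[] args) {
        // prints {0=[a], 1=[is, of], 2=[the, map], 3=[hadoop]}
        System.out.println(group(Arrays.asList("a", "is", "the", "hadoop", "of", "map")));
    }
}
```

Each reducer ends up with only the words of one length, which is why this partitioner needs four reducers.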


  • Configure the partitioner class in the driver code:

conf.setPartitionerClass(MyPartitioner.class);

  • Configure the number of reduce tasks:

conf.setNumReduceTasks(4);
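The partitioner returns indexes 0 to 3, so the number of reduce tasks has to be at least 4; as far as I know, Hadoop fails the task with an "Illegal partition" error when the returned index is not below the number of reduce tasks. A minimal sketch of one way to stay safe, assuming you are happy to fold the extra buckets together when fewer reducers are configured (SafePartitionDemo and its methods are hypothetical names, not Hadoop API):

```java
// Hypothetical helper, not part of Hadoop: keeps a partition index valid
// for whatever number of reduce tasks the driver configured.
class SafePartitionDemo {

    // The length-based bucket used by MyPartitioner (0..3).
    static int lengthBucket(String key) {
        int length = key.length();
        return (length >= 1 && length <= 3) ? length - 1 : 3;
    }

    // Folds the bucket into the range 0..numReduceTasks-1.
    static int safePartition(String key, int numReduceTasks) {
        return lengthBucket(key) % numReduceTasks;
    }

    public static void main(String[] args) {
        System.out.println(safePartition("hadoop", 4)); // prints 3
        System.out.println(safePartition("hadoop", 2)); // prints 1
    }
}
```

With 4 reduce tasks the result is the same as before; with fewer, several key lengths share a reducer instead of breaking the job.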

  • We can work with up to 10^5 reducers (maximum); each reducer writes its own output file:

part-00000 to part-99999
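The file names follow the reducer index, so with 4 reduce tasks the job writes part-00000 to part-00003. A small sketch of that naming, assuming the five-digit zero-padded pattern shown above (PartFileNameDemo and partFileName are hypothetical names for illustration):

```java
import java.util.Locale;

// Hypothetical helper that rebuilds the output file name for a reducer index,
// assuming the five-digit, zero-padded naming shown above.
class PartFileNameDemo {

    static String partFileName(int reducerIndex) {
        // Locale.ROOT keeps the digits ASCII regardless of the machine's locale.
        return String.format(Locale.ROOT, "part-%05d", reducerIndex);
    }

    public static void main(String[] args) {
        System.out.println(partFileName(0));     // prints part-00000
        System.out.println(partFileName(3));     // prints part-00003
        System.out.println(partFileName(99999)); // prints part-99999
    }
}
```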
