I have a simple static class whose purpose is, given an RDD of Point, to find the median of each dimension and return it as a new Point, using Spark's reduce function.
This is the class:
```java
public class MedianPointFinder {

    public static Point findMedianPoint(JavaRDD<Point> points) {
        Point biggestPointByXDimension = points.reduce((a, b) -> getBiggestPointByXDimension(a, b));
        Point biggestPointByYDimension = points.reduce((a, b) -> getBiggestPointByYDimension(a, b));

        double xDimensionMedian = biggestPointByXDimension.getX() / 2.0;
        double yDimensionMedian = biggestPointByYDimension.getY() / 2.0;

        return new Point(xDimensionMedian, yDimensionMedian);
    }

    private static Point getBiggestPointByXDimension(Point first, Point second) {
        return first.getX() > second.getX() ? first : second;
    }

    private static Point getBiggestPointByYDimension(Point first, Point second) {
        return first.getY() > second.getY() ? first : second;
    }
}
```

The Point class is a simple class for storing an (x, y) point.
1 2 9: I consider the median to be 2, not 4.5.
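To make the difference concrete, here is a minimal plain-Java sketch (no Spark) that compares the two calculations for one dimension with the values 1, 2, 9. The class `MedianSketch` and its methods `median` and `halfOfMax` are hypothetical names introduced only for this illustration; `halfOfMax` mirrors what the reduce-based code effectively computes per dimension, and `median` uses the usual sort-and-take-the-middle definition.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the original class: it only illustrates
// the difference between "largest value / 2" and the actual median.
final class MedianSketch {

    // Median of a non-empty array: sort a copy, then take the middle element
    // (or the mean of the two middle elements when the length is even).
    static double median(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // What the reduce-based approach effectively computes for one dimension:
    // the largest value divided by two.
    static double halfOfMax(double[] values) {
        return Arrays.stream(values).max().getAsDouble() / 2.0;
    }

    public static void main(String[] args) {
        double[] xs = {1, 2, 9};
        System.out.println("median    = " + median(xs));    // prints "median    = 2.0"
        System.out.println("halfOfMax = " + halfOfMax(xs)); // prints "halfOfMax = 4.5"
    }
}
```

This sorts in memory, so it only demonstrates the definition. On an RDD, a real median would need some form of distributed ordering or selection rather than a single reduce over maxima.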