How can I get distinct() to compare some derived value instead of the stream elements themselves?

The distinct() stream operation compares the stream’s elements directly to each other using Object.equals(), but there’s no obvious way to have it compute uniqueness based on some value derived from each stream element.

Here’s a snippet of code to do this. It returns a predicate intended for use in a filter() operation, and it takes a key extractor function that derives a value from each stream element. This derived value is in turn used to determine uniqueness.

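// Needs java.util.Map, java.util.concurrent.ConcurrentHashMap, java.util.function.Function and Predicate.
public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
    // Thread-safe record of the keys seen so far. Note that ConcurrentHashMap rejects null keys.
    Map<Object, Boolean> seen = new ConcurrentHashMap<>();
    // putIfAbsent returns null only for the first element that maps to a given key.
    return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
}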

Here’s an example of its use. Given a collection of persons, this filters the list so that at most one person of any given age is output:

persons.stream()
    .filter(distinctByKey(Person::getAge))
    .forEach(System.out::println);
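
For readers who want to run this, here is a minimal self-contained sketch. The Person class, the DistinctByKeyDemo class, the namesOfFirstPersonPerAge method, and the sample data are all assumptions made for illustration and are not part of the original snippet. The sketch includes its own copy of distinctByKey so that it compiles on its own.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class DistinctByKeyDemo {
    // Hypothetical Person type, assumed for illustration only.
    static final class Person {
        private final String name;
        private final int age;
        Person(String name, int age) { this.name = name; this.age = age; }
        String getName() { return name; }
        int getAge() { return age; }
    }

    // Same idea as the helper above: pass only the first element seen for each key.
    static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Map<Object, Boolean> seen = new ConcurrentHashMap<>();
        return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }

    // Returns the names of the people who pass the filter, using a sequential stream.
    static List<String> namesOfFirstPersonPerAge(List<Person> persons) {
        return persons.stream()
            .filter(distinctByKey(Person::getAge))
            .map(Person::getName)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
            new Person("Alice", 30),
            new Person("Bob", 25),
            new Person("Carol", 30),   // same age as Alice, so filtered out
            new Person("Dave", 25));   // same age as Bob, so filtered out
        System.out.println(namesOfFirstPersonPerAge(persons)); // prints [Alice, Bob]
    }
}
```

Because the returned predicate carries state (the set of keys seen so far), distinctByKey should be called afresh for each stream pipeline rather than stored and reused.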

If the stream is sequential, the output will contain the first stream element encountered for each unique derived value. If the stream is parallel, the output will still contain exactly one element for each unique derived value, but it won’t necessarily be the first one in encounter order.
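
If keeping the first element matters even for a parallel stream, one possible alternative is to collect into a map instead of filtering. This is a sketch of a different technique, not the approach described above. The firstByKey helper and the FirstByKey class are hypothetical names. It relies on Collectors.toMap applying its merge function in encounter order when the stream is ordered, and on LinkedHashMap to remember the order in which keys first appeared.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FirstByKey {
    // Hypothetical helper: returns the first element, in encounter order, for each derived key.
    static <T> List<T> firstByKey(Stream<T> stream, Function<? super T, ?> keyExtractor) {
        Map<Object, T> firstPerKey = stream.collect(Collectors.toMap(
            keyExtractor,
            Function.identity(),
            (first, later) -> first,   // on a key collision, keep the earlier element
            LinkedHashMap::new));      // preserve the order in which keys first appeared
        return new ArrayList<>(firstPerKey.values());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "avocado", "banana", "blueberry", "cherry");
        // The derived key is the first letter; a parallel stream is used on purpose.
        System.out.println(firstByKey(words.parallelStream(), w -> w.charAt(0)));
        // prints [apple, banana, cherry]
    }
}
```

The trade-off is that this is a terminal operation that buffers one element per key before anything is emitted, whereas the filter() approach stays lazy and can be followed by further stream operations.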