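1. Overview

In this tutorial, we'll look at how to avoid inserting duplicate values into a Java ArrayList. We'll cover several approaches:

  • ✅ Using classes built into the JDK (Set, List methods, the Stream API)
  • ✅ Using third-party libraries (Guava, Apache Commons Collections)
  • ⚠️ Comparing which approach suits which situation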

2. Using a Set

The defining property of a Set is that it does not allow duplicate elements, which makes it the most direct solution:

@Test
void givenArrayList_whenUsingSet_thenAvoidDuplicates() {
    Set<String> distinctCities = new HashSet<>(Arrays.asList("Tamassint", "Madrid", "Paris", "Tokyo"));

    String newCity = "Paris";
    distinctCities.add(newCity);
    ArrayList<String> arrayListCities = new ArrayList<>(distinctCities);

    assertThat(arrayListCities).hasSameSizeAs(distinctCities);
}

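Key points

  • Adding the duplicate city "Paris" has no effect: Set#add leaves the set unchanged and returns false
  • The resulting ArrayList therefore has the same size as the Set
  • Rule of thumb: when uniqueness is a requirement, start from a Set. Note that HashSet does not guarantee iteration order, so the ArrayList built from it may not follow insertion order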
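If the order of the cities matters, a LinkedHashSet keeps first-seen order while still rejecting duplicates. The following is a minimal sketch under that assumption; the class name OrderedDistinctCities and the method distinct are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class OrderedDistinctCities {

    // Copies the input into a LinkedHashSet, which drops duplicates
    // and keeps the order in which elements were first seen.
    static List<String> distinct(List<String> cities) {
        Set<String> unique = new LinkedHashSet<>(cities);
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        Set<String> cities = new LinkedHashSet<>(
          Arrays.asList("Tamassint", "Madrid", "Paris", "Tokyo"));

        // add() returns false because "Paris" is already present
        boolean added = cities.add("Paris");
        System.out.println(added); // prints false

        System.out.println(new ArrayList<>(cities)); // prints [Tamassint, Madrid, Paris, Tokyo]
    }
}
```

The boolean returned by add() is a convenient way to tell whether the value was new, without a separate lookup.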

3. Checking With List#contains

The contains() method of the List interface offers a direct way to check whether an element is already present:

@Test
void givenArrayList_whenUsingContains_thenAvoidDuplicates() {
    List<String> distinctCities = Arrays.asList("Tamassint", "Madrid", "Paris", "Tokyo");
    ArrayList<String> arrayListCities = new ArrayList<>(distinctCities);

    String newCity = "Madrid";
    if (!arrayListCities.contains(newCity)) {
        arrayListCities.add(newCity);
    }

    assertThat(arrayListCities).hasSameSizeAs(distinctCities);
}

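How it works

  1. Check whether "Madrid" is already in the list
  2. Because it is, contains() returns true and the negated condition evaluates to false
  3. The add() call is skipped, so no duplicate is inserted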
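The check-then-add steps above can be wrapped in a small reusable helper. This is only a sketch: ListAdditions and addIfAbsent are hypothetical names, not part of the JDK, and the lookup relies on the elements' equals() implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ListAdditions {

    // Adds the element only when the list does not already contain it.
    // Returns true if the list changed, false if the element was a duplicate.
    static <T> boolean addIfAbsent(List<T> list, T element) {
        if (list.contains(element)) {
            return false;
        }
        return list.add(element);
    }

    public static void main(String[] args) {
        List<String> cities = new ArrayList<>(
          Arrays.asList("Tamassint", "Madrid", "Paris", "Tokyo"));

        System.out.println(addIfAbsent(cities, "Madrid")); // prints false
        System.out.println(addIfAbsent(cities, "Rabat"));  // prints true
        System.out.println(cities.size());                 // prints 5
    }
}
```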

4. Using Stream#anyMatch

The anyMatch() method of the Stream API, introduced in Java 8, checks whether any element matches a given predicate:

@Test
void givenArrayList_whenUsingAnyMatch_thenAvoidDuplicates() {
    List<String> distinctCities = Arrays.asList("Tamassint", "Madrid", "Paris", "Tokyo");
    ArrayList<String> arrayListCities = new ArrayList<>(distinctCities);

    String newCity = "Tamassint";
    boolean isCityPresent = arrayListCities.stream()
      .anyMatch(city -> city.equals(newCity));
    if (!isCityPresent) {
        arrayListCities.add(newCity);
    }

    assertThat(arrayListCities).hasSameSizeAs(distinctCities);
}

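How it works

  • The stream is scanned for an element that matches the predicate
  • Because "Tamassint" is already in the list, anyMatch() returns true
  • The duplicate add is therefore skipped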
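Because anyMatch() takes an arbitrary predicate, it can express a notion of "duplicate" that contains() cannot, such as a case-insensitive comparison. The sketch below assumes that city names differing only in case should count as the same city; CaseInsensitiveCities and addIgnoringCase are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CaseInsensitiveCities {

    // Adds newCity unless some existing city equals it ignoring case.
    // Returns true when the city was added.
    static boolean addIgnoringCase(List<String> cities, String newCity) {
        boolean present = cities.stream()
          .anyMatch(city -> city.equalsIgnoreCase(newCity));
        if (!present) {
            cities.add(newCity);
        }
        return !present;
    }

    public static void main(String[] args) {
        List<String> cities = new ArrayList<>(
          Arrays.asList("Tamassint", "Madrid", "Paris", "Tokyo"));

        System.out.println(addIgnoringCase(cities, "tamassint")); // prints false
        System.out.println(cities.size());                        // prints 4
    }
}
```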

5. Using Stream#filter

We can also combine filter() with findFirst() to check whether an element exists:

@Test
void givenArrayList_whenUsingFilterAndFindFirst_thenAvoidDuplicates() {
    List<String> distinctCities = Arrays.asList("Tamassint", "Madrid", "Paris", "Tokyo");
    ArrayList<String> arrayListCities = new ArrayList<>(distinctCities);

    String newCity = "Tamassint";
    Optional<String> optionalCity = arrayListCities.stream()
      .filter(city -> city.equals(newCity))
      .findFirst();
    if (!optionalCity.isPresent()) { // Optional#isEmpty() would require Java 11+
        arrayListCities.add(newCity);
    }

    assertThat(arrayListCities).hasSameSizeAs(distinctCities);
}

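How it works

  1. filter() keeps only the elements equal to the new city "Tamassint"
  2. findFirst() returns the first match wrapped in an Optional
  3. The new city is added only when that Optional is empty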
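Unlike anyMatch(), the filter() and findFirst() combination hands back the matching element itself, which helps when the caller wants the instance that is already stored. The sketch below assumes a hypothetical City type whose identity is its name; CityRegistry and findOrAdd are likewise illustrative names.

```java
import java.util.ArrayList;
import java.util.List;

class CityRegistry {

    // Hypothetical value type used only for this illustration
    static final class City {
        final String name;
        final String country;

        City(String name, String country) {
            this.name = name;
            this.country = country;
        }
    }

    // Returns the stored city with the same name, or adds the candidate
    // and returns it when no such city exists yet.
    static City findOrAdd(List<City> cities, City candidate) {
        return cities.stream()
          .filter(city -> city.name.equals(candidate.name))
          .findFirst()
          .orElseGet(() -> {
              // Runs after the stream has finished, so modifying the list is safe here
              cities.add(candidate);
              return candidate;
          });
    }

    public static void main(String[] args) {
        List<City> cities = new ArrayList<>();
        City paris = new City("Paris", "France");

        findOrAdd(cities, paris);
        City stored = findOrAdd(cities, new City("Paris", "USA"));

        System.out.println(stored == paris); // prints true
        System.out.println(cities.size());   // prints 1
    }
}
```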

6. Using Guava

Guava provides the Iterables utility class, which simplifies the check. First, add the dependency:

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>33.2.0-jre</version>
</dependency>

Usage example:

@Test
void givenArrayList_whenUsingIterablesContains_thenAvoidDuplicates() {
    List<String> distinctCities = Arrays.asList("Tamassint", "Madrid", "Paris", "Tokyo");
    ArrayList<String> arrayListCities = new ArrayList<>(distinctCities);

    String newCity = "Paris";
    boolean isCityPresent = Iterables.contains(arrayListCities, newCity);
    if (!isCityPresent) {
        arrayListCities.add(newCity);
    }

    assertThat(arrayListCities).hasSameSizeAs(distinctCities);
}

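When to use it

  • ✅ Recommended when the project already depends on Guava
  • ❌ Not worth adding as a new dependency to a plain Java project just for this check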

7. Using Apache Commons Collections

The containsAny() method of CollectionUtils checks whether a collection contains any of the given elements. First, add the dependency:

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-collections4</artifactId>
    <version>4.5.0-M2</version>
</dependency>

Implementation:

@Test
void givenArrayList_whenUsingCollectionUtilsContainsAny_thenAvoidDuplicates() {
    List<String> distinctCities = Arrays.asList("Tamassint", "Madrid", "Paris", "Tokyo");
    ArrayList<String> arrayListCities = new ArrayList<>(distinctCities);

    String newCity = "Tokyo";
    boolean isCityPresent = CollectionUtils.containsAny(arrayListCities, newCity);
    if (!isCityPresent) {
        arrayListCities.add(newCity);
    }

    assertThat(arrayListCities).hasSameSizeAs(distinctCities);
}

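How it works

  • Conceptually, containsAny() tests whether the collection and the given elements intersect
  • It returns true when they share at least one element
  • It fits cases where several candidate elements need to be checked at once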
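Where adding Commons Collections is not an option, the JDK's Collections.disjoint() gives a comparable batch check: two collections share an element exactly when they are not disjoint. The sketch below uses that JDK method rather than CollectionUtils; BatchCityCheck, containsAnyOf and addAllAbsent are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

class BatchCityCheck {

    // True when at least one candidate is already present in existing.
    static boolean containsAnyOf(Collection<String> existing, Collection<String> candidates) {
        return !Collections.disjoint(existing, candidates);
    }

    // Adds every candidate that is not yet in the list (including repeats
    // within candidates themselves) and returns how many were added.
    static int addAllAbsent(List<String> cities, Collection<String> candidates) {
        int added = 0;
        for (String candidate : candidates) {
            if (!cities.contains(candidate)) {
                cities.add(candidate);
                added++;
            }
        }
        return added;
    }

    public static void main(String[] args) {
        List<String> cities = new ArrayList<>(
          Arrays.asList("Tamassint", "Madrid", "Paris", "Tokyo"));

        System.out.println(containsAnyOf(cities, Arrays.asList("Tokyo", "Rabat")));         // prints true
        System.out.println(addAllAbsent(cities, Arrays.asList("Tokyo", "Rabat", "Rabat"))); // prints 1
    }
}
```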

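8. Conclusion

Comparison of the approaches:

| Approach | Pros | Cons | Best suited for |
|------|------|------|----------|
| Set | ⭐⭐⭐ Simplest and most efficient | Requires converting the data structure | First choice for new code |
| List#contains | ⭐⭐ Intuitive and easy to use | O(n) time complexity | Small data sets |
| Stream API | ⭐⭐ Functional style | Slightly verbose | Projects already on Java 8+ |
| Guava | ⭐ Friendly API | Requires an extra dependency | Projects that already use Guava |
| Commons | ⭐ Strong at batch checks | Relatively heavy dependency | Complex collection operations |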

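Key recommendations

  1. Prefer a Set when duplicates must be avoided
  2. When a List already exists, check with contains() or a Stream before adding
  3. Consider the third-party libraries only if the project already includes them
  4. Avoid O(n) checks such as contains() or a stream scan in performance-sensitive code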
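One way to reconcile the first and last recommendations is to keep a HashSet alongside the list, so that the duplicate check is a hash lookup while the list still preserves insertion order and index access. The class below is a minimal sketch of that idea; UniqueList is a hypothetical name, and the approach assumes the elements implement equals() and hashCode() consistently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class UniqueList<T> {

    private final List<T> items = new ArrayList<>(); // keeps insertion order
    private final Set<T> seen = new HashSet<>();     // fast membership check

    // Adds the item only if it has not been seen before.
    boolean add(T item) {
        if (!seen.add(item)) {
            return false; // already present, list is left unchanged
        }
        items.add(item);
        return true;
    }

    // Read-only view, so callers cannot bypass the duplicate check.
    List<T> asList() {
        return Collections.unmodifiableList(items);
    }

    public static void main(String[] args) {
        UniqueList<String> cities = new UniqueList<>();
        cities.add("Paris");
        cities.add("Madrid");

        System.out.println(cities.add("Paris")); // prints false
        System.out.println(cities.asList());     // prints [Paris, Madrid]
    }
}
```

The trade-off is extra memory for the second collection, which is usually acceptable when the list is large enough for linear scans to matter.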

The complete source code for the examples is available in the GitHub repository.


Original title: Avoid Inserting Duplicates in ArrayList in Java | Baeldung