1. Overview

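In this tutorial, we'll learn various ways to check whether all the characters in a string, including non-ASCII characters, are unique. All the methods discussed here are case-insensitive.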

2. Brute Force

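This is the most obvious technique, but it's somewhat crude and inefficient: we compare each character with every character that follows it in the string, which takes O(n²) comparisons in the worst case: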

public class UniqueCharChecker {
    public static boolean bruteForceCheck(String str) {
        char[] chars = str.toUpperCase().toCharArray();
        for (int i = 0; i < chars.length; i++) {
            for (int j = i + 1; j < chars.length; j++) {
                if(chars[i] == chars[j]) {
                    return false;
                }
            }
        }
        return true;
    }
}

Let's write some test cases for the above method:

import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import java.util.Arrays;
import org.junit.Test;

public class UniqueCharCheckerUnitTest {
    @Test
    public void givenUnique_whenBruteForceCheck_thenReturnTrue() {
        String[] sampleStrings = new String[]{"Justfewdi123", "$%&Hibusc", "Hibusc%$#", "მშვნიერ"};
        final String MSG = "Duplicate found";
        // Every sample has only distinct characters, so each check must pass
        Arrays.stream(sampleStrings)
          .forEach(sampleStr -> assertTrue(MSG + " in " + sampleStr, UniqueCharChecker.bruteForceCheck(sampleStr)));
    }

    @Test
    public void givenNotUnique_whenBruteForceCheck_thenReturnFalse() {
        String[] sampleStrings = new String[]{"Justfewdif123", "$%&Hibushc", "Hibusuc%$#", "Hi%busc%$#", "მშვენიერი"};
        final String MSG = "Duplicate not found";
        // Every sample repeats at least one character, ignoring case
        Arrays.stream(sampleStrings)
          .forEach(sampleStr -> assertFalse(MSG + " in " + sampleStr, UniqueCharChecker.bruteForceCheck(sampleStr)));
    }
}

3. Sorting

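This approach is similar to brute force, but we first sort the characters in the string. Sorting places equal characters next to each other, so we only need to compare each character with its neighbor rather than with every other character. Let's look at the implementation: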

public static boolean sortAndThenCheck(String str) {
    char[] chars = str.toUpperCase().toCharArray();
    Arrays.sort(chars);
    for (int i = 0; i < chars.length - 1; i++) {
        if(chars[i] == chars[i+1]) {
            return false;
        }
    }
    return true;
}

Let's test it:

@Test
public void givenUnique_whenSortAndThenCheck_thenReturnTrue() {
    String[] sampleStrings = new String[]{"Justfewdi123", "$%&Hibusc", "Hibusc%$#", "მშვნიერ"};
    final String MSG = "Duplicate found";
    Arrays.stream(sampleStrings)
      .forEach(sampleStr -> assertTrue(MSG + " in " + sampleStr, UniqueCharChecker.sortAndThenCheck(sampleStr)));
}
@Test
public void givenNotUnique_whenSortAndThenCheck_thenReturnFalse() {
    String[] sampleStrings = new String[]{"Justfewdif123", "$%&Hibushc", "Hibusuc%$#", "Hi%busc%$#", "მშვენიერი"};
    final String MSG = "Duplicate not found";
    Arrays.stream(sampleStrings)
      .forEach(sampleStr -> assertFalse(MSG + " in " + sampleStr, UniqueCharChecker.sortAndThenCheck(sampleStr)));
}

4. HashSet

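Here, we take advantage of java.util.Set, which doesn't store duplicates: add() returns false as soon as we try to insert a character that is already in the set: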

public static boolean useSetCheck(String str) {
    char[] chars = str.toUpperCase().toCharArray();
    Set<Character> set = new HashSet<>();
    for (char c: chars) {
        if (!set.add(c)) {
            return false;
        }
    }
    return true;
}

Now, let's look at the test cases:

@Test
public void givenUnique_whenUseSetCheck_thenReturnTrue() {
    String[] sampleStrings = new String[]{"Justfewdi123", "$%&Hibusc", "Hibusc%$#", "მშვნიერ" };
    final String MSG = "Duplicate found";
    Arrays.stream(sampleStrings)
      .forEach(sampleStr -> assertTrue(MSG + " in " + sampleStr, UniqueCharChecker.useSetCheck(sampleStr)));
}
@Test
public void givenNotUnique_whenUseSetCheck_thenReturnFalse() {
    String[] sampleStrings = new String[]{"Justfewdif123", "$%&Hibushc", "Hibusuc%$#", "Hi%busc%$#", "მშვენიერი"};
    final String MSG = "Duplicate not found";
    Arrays.stream(sampleStrings)
      .forEach(sampleStr -> assertFalse(MSG + " in " + sampleStr, UniqueCharChecker.useSetCheck(sampleStr)));
}

5. Java Streams

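This approach is similar to the one used in the previous section, but we use the Streams API to build the Set and then compare its size with the length of the string. Let's look at the implementation: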

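public static boolean useStreamCheck(String str) {
    // Uppercase once so that the set size and the length refer to the same string
    // (toUpperCase() can change the length of some strings, e.g. "ß" becomes "SS")
    String upper = str.toUpperCase();
    return upper.chars()
      .mapToObj(c -> (char) c)
      .collect(Collectors.toSet())
      .size() == upper.length();
}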

Next come the unit tests:

@Test
public void givenUnique_whenUseStreamCheck_thenReturnTrue() {
    String[] sampleStrings = new String[]{"Justfewdi123", "$%&Hibusc", "Hibusc%$#", "მშვნიერ" };
    final String MSG = "Duplicate found";
    Arrays.stream(sampleStrings)
      .forEach(sampleStr -> assertTrue(MSG + " in " + sampleStr, UniqueCharChecker.useStreamCheck(sampleStr)));
}
@Test
public void givenNotUnique_whenUseStreamCheck_thenReturnFalse() {
    String[] sampleStrings = new String[]{"Justfewdif123", "$%&Hibushc", "Hibusuc%$#", "Hi%busc%$#", "მშვენიერი"};
    final String MSG = "Duplicate not found";
    Arrays.stream(sampleStrings)
      .forEach(sampleStr -> assertFalse(MSG + " in " + sampleStr, UniqueCharChecker.useStreamCheck(sampleStr)));
}
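
One caveat applies to all the char-based checks above: a Java char is a UTF-16 code unit, so a character outside the Basic Multilingual Plane (most emoji, for example) occupies two chars, and two different such characters can share the same leading surrogate. The following is only a minimal sketch of a code-point-based variant, not part of the original UniqueCharChecker. The class name CodePointUniqueChecker, the method name codePointCheck, and the choice of Locale.ROOT for the case conversion are all assumptions made for illustration:

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; not part of the original UniqueCharChecker
class CodePointUniqueChecker {

    // Returns true when no Unicode code point occurs twice, ignoring case
    static boolean codePointCheck(String str) {
        // codePoints() yields one int per code point, so a surrogate pair counts once
        int[] codePoints = str.toUpperCase(Locale.ROOT).codePoints().toArray();
        Set<Integer> seen = new HashSet<>();
        for (int codePoint : codePoints) {
            // add() returns false when the code point has been seen before
            if (!seen.add(codePoint)) {
                return false;
            }
        }
        return true;
    }
}
```

With this sketch, a string made of two different emoji such as "\uD83D\uDE00\uD83D\uDE01" is reported as unique, whereas a char-by-char comparison would see the shared surrogate \uD83D twice.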

6. StringUtils

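Here, we'll use the containsIgnoreCase() method of the StringUtils class from the Apache Commons Lang library. For each character, we check whether it appears again, in either case, in the rest of the string: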

public static boolean useStringUtilscheck(String str) {
    for (int i = 0; i < str.length(); i++) {
        String curChar = String.valueOf(str.charAt(i));
        String remainingStr = str.substring(i+1);
        if(StringUtils.containsIgnoreCase(remainingStr, curChar)) {
            return false;
        }
    }
    return true;
}

Let's test this method:

@Test
public void givenUnique_whenUseStringUtilscheck_thenReturnTrue() {
    String[] sampleStrings = new String[]{"Justfewdi123", "$%&Hibusc", "Hibusc%$#", "მშვნიერ"};
    final String MSG = "Duplicate found";
    Arrays.stream(sampleStrings)
      .forEach(sampleStr -> assertTrue(MSG + " in " + sampleStr, UniqueCharChecker.useStringUtilscheck(sampleStr)));
}
@Test
public void givenNotUnique_whenUseStringUtilscheck_thenReturnFalse() {
    String[] sampleStrings = new String[]{"Justfewdif123", "$%&Hibushc", "Hibusuc%$#", "Hi%busc%$#", "მშვენიერი"};
    final String MSG = "Duplicate not found";
    Arrays.stream(sampleStrings)
      .forEach(sampleStr -> assertFalse(MSG + " in " + sampleStr, UniqueCharChecker.useStringUtilscheck(sampleStr)));
}
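
If adding Apache Commons Lang just for this check is undesirable, a similar case-insensitive comparison can be sketched with the JDK alone. The snippet below is an illustration under assumptions, not the article's method: the class name RegionMatchChecker and the method name regionMatchesCheck are hypothetical, and it relies on String.regionMatches() with its ignoreCase flag set to true to compare one character at a time:

```java
// Hypothetical helper for illustration; not part of the original UniqueCharChecker
class RegionMatchChecker {

    // Returns true when no character occurs twice, ignoring case
    static boolean regionMatchesCheck(String str) {
        for (int i = 0; i < str.length(); i++) {
            for (int j = i + 1; j < str.length(); j++) {
                // Compare the single character at i with the one at j, ignoring case
                if (str.regionMatches(true, i, str, j, 1)) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

This avoids creating a substring on every iteration, but it is still a quadratic scan like the brute-force approach.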

7. Conclusion

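In this tutorial, we've seen five different ways to check whether the characters in a string are unique. We also concluded that no off-the-shelf library currently solves this problem directly.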

The code snippets used here, along with the related JUnit test cases, are available on GitHub.