2. 问题引入

Map在Java中存储键值对,有时我们需要将文本文件内容加载为Map对象。文件需遵循特定格式才能正确解析。以下示例文件展示了典型结构:

$ cat theLordOfRings.txt
title:The Lord of the Rings: The Return of the King
director:Peter Jackson
actor:Sean Astin
actor:Ian McKellen
Gandalf and Aragorn lead the World of Men against Sauron's
army to draw his gaze from Frodo and Sam as they approach Mount Doom with the One Ring.

文件特点:

  • 多数行遵循KEY:VALUE模式(如director:Peter Jackson
  • 需处理三种特殊情况:
    • 值包含分隔符(如首行标题中的冒号)
    • 重复键(如两个actor键)
    • 不符合模式的行(如末尾两行)

3. 重复键处理策略枚举

针对重复键问题,我们定义三种策略:

  1. 覆盖:保留最新值
  2. 丢弃:保留首次出现的值
  3. 聚合:将值收集为List(单独章节讨论)

先实现前两种策略,创建枚举类:

enum DupKeyOption {
    OVERWRITE, DISCARD
}

4. 使用BufferedReader和FileReader

组合BufferedReader和FileReader可逐行读取文件内容

4.1 创建byBufferedReader方法

public static Map<String, String> byBufferedReader(String filePath, DupKeyOption dupKeyOption) {
    HashMap<String, String> map = new HashMap<>();
    String line;
    try (BufferedReader reader = new BufferedReader(new FileReader(filePath))) {
        while ((line = reader.readLine()) != null) {
            String[] keyValuePair = line.split(":", 2);
            if (keyValuePair.length > 1) {
                String key = keyValuePair[0];
                String value = keyValuePair[1];
                if (DupKeyOption.OVERWRITE == dupKeyOption) {
                    map.put(key, value);
                } else if (DupKeyOption.DISCARD == dupKeyOption) {
                    map.putIfAbsent(key, value);
                }
            } else {
                System.out.println("No Key:Value found in line, ignoring: " + line);
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return map;
}

关键实现点:

  • try-with-resources 确保资源自动关闭
  • split(":", 2) 限制分割次数,保留值中的冒号
  • putIfAbsent() 实现丢弃策略
  • 非键值对行直接跳过

4.2 测试解决方案

初始化预期结果Map:

private static final Map<String, String> EXPECTED_MAP_DISCARD = Stream.of(new String[][]{
    {"title", "The Lord of the Rings: The Return of the King"},
    {"director", "Peter Jackson"},
    {"actor", "Sean Astin"}
  }).collect(Collectors.toMap(data -> data[0], data -> data[1]));

private static final Map<String, String> EXPECTED_MAP_OVERWRITE = Stream.of(new String[][]{
...
    {"actor", "Ian McKellen"}
  }).collect(Collectors.toMap(data -> data[0], data -> data[1]));

测试方法验证结果:

@Test
public void givenInputFile_whenInvokeByBufferedReader_shouldGetExpectedMap() {
    Map<String, String> mapOverwrite = FileToHashMap.byBufferedReader(filePath, FileToHashMap.DupKeyOption.OVERWRITE);
    assertThat(mapOverwrite).isEqualTo(EXPECTED_MAP_OVERWRITE);

    Map<String, String> mapDiscard = FileToHashMap.byBufferedReader(filePath, FileToHashMap.DupKeyOption.DISCARD);
    assertThat(mapDiscard).isEqualTo(EXPECTED_MAP_DISCARD);
}

5. 使用Java Stream

Files.lines方法可便捷获取文件内容的Stream对象

public static Map<String, String> byStream(String filePath, DupKeyOption dupKeyOption) {
    Map<String, String> map = new HashMap<>();
    try (Stream<String> lines = Files.lines(Paths.get(filePath))) {
        lines.filter(line -> line.contains(":"))
            .forEach(line -> {
                String[] keyValuePair = line.split(":", 2);
                String key = keyValuePair[0];
                String value = keyValuePair[1];
                if (DupKeyOption.OVERWRITE == dupKeyOption) {
                    map.put(key, value);
                } else if (DupKeyOption.DISCARD == dupKeyOption) {
                    map.putIfAbsent(key, value);
                }
            });
    } catch (IOException e) {
        e.printStackTrace();
    }
    return map;
}

实现要点:

  • Stream自动关闭 确保文件资源释放
  • filter() 过滤非键值对行
  • forEach() 处理有效行逻辑

测试验证:

@Test
public void givenInputFile_whenInvokeByStream_shouldGetExpectedMap() {
    Map<String, String> mapOverwrite = FileToHashMap.byStream(filePath, FileToHashMap.DupKeyOption.OVERWRITE);
    assertThat(mapOverwrite).isEqualTo(EXPECTED_MAP_OVERWRITE);

    Map<String, String> mapDiscard = FileToHashMap.byStream(filePath, FileToHashMap.DupKeyOption.DISCARD);
    assertThat(mapDiscard).isEqualTo(EXPECTED_MAP_DISCARD);
}

6. 按键聚合值

实现聚合策略,返回Map<String, List<String>>

public static Map<String, List<String>> aggregateByKeys(String filePath) {
    Map<String, List<String>> map = new HashMap<>();
    try (Stream<String> lines = Files.lines(Paths.get(filePath))) {
        lines.filter(line -> line.contains(":"))
          .forEach(line -> {
              String[] keyValuePair = line.split(":", 2);
              String key = keyValuePair[0];
              String value = keyValuePair[1];
              if (map.containsKey(key)) {
                  map.get(key).add(value);
              } else {
                  map.put(key, Stream.of(value).collect(Collectors.toList()));
              }
          });
    } catch (IOException e) {
        e.printStackTrace();
    }
    return map;
}

踩坑提醒:初始化List时避免使用不可变集合(如Collections.singletonList()List.of()),否则后续无法添加新值。

测试准备:

private static final Map<String, List<String>> EXPECTED_MAP_AGGREGATE = Stream.of(new String[][]{
      {"title", "The Lord of the Rings: The Return of the King"},
      {"director", "Peter Jackson"},
      {"actor", "Sean Astin", "Ian McKellen"}
  }).collect(Collectors.toMap(arr -> arr[0], arr -> Arrays.asList(Arrays.copyOfRange(arr, 1, arr.length))));

测试验证:

@Test
public void givenInputFile_whenInvokeAggregateByKeys_shouldGetExpectedMap() {
    Map<String, List<String>> mapAgg = FileToHashMap.aggregateByKeys(filePath);
    assertThat(mapAgg).isEqualTo(EXPECTED_MAP_AGGREGATE);
}

7. 总结

本文介绍了两种将文本文件转换为Java Map的方案:

  1. BufferedReader+FileReader:经典逐行处理
  2. Stream API:函数式简洁实现

针对重复键问题实现了三种策略:

  • 覆盖策略(保留最新值)
  • 丢弃策略(保留首次值)
  • 聚合策略(收集为List)

完整代码示例可在GitHub仓库获取。


原始标题:Read a File Into a Map in Java