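1. Overview

The String class is one of the most widely used classes in Java, which prompted the language designers to treat it specially. This special behavior makes it one of the hottest topics in Java interviews.

In this tutorial, we'll go through some of the most common interview questions about String.

2. String Basics

This section consists of questions that concern String's internal structure and memory.

Q1. What Is a String in Java?

In Java, a String is represented internally by an array of byte values (or char values before JDK 9).

In versions up to and including Java 8, a String was composed of an immutable array of Unicode characters. However, most characters require only 8 bits (1 byte) to represent them instead of 16 bits (the size of a char).

To improve memory consumption and performance, Java 9 introduced compact Strings. This means that if a String contains only 1-byte characters, it is represented using Latin-1 encoding. If a String contains at least one multi-byte character, it is represented using UTF-16 encoding, with 2 bytes per character.

In C and C++, a string is also an array of characters, but in Java, it's a separate object with its own API.

Q2. How Can We Create a String Object in Java?

java.lang.String defines 13 different ways to create a String.

Generally, though, there are two:

  • Through a String literal: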

      String s = "abc";
    
  • Using the new keyword:

      String s = new String("abc");
    

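All String literals in Java are instances of the String class.

Q3. Is String a Primitive or a Derived Type?

A String is a derived type since it has state and behavior. For example, it has methods like substring(), indexOf(), and equals(), which primitives cannot have.

But, since we all use it so often, it has some special characteristics that make it feel like a primitive:

  • While strings are not stored on the call stack like primitives are, they are stored in a special memory region called the string pool
  • Like primitives, we can use the + operator on strings
  • And again, like primitives, we can create an instance of a String without the new keyword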

Q4. What Are the Benefits of Strings Being Immutable?

According to an interview by James Gosling, strings are immutable to improve performance and security.

And actually, we see several benefits to having immutable strings:

  • The string pool is only possible if the strings, once created, are never changed, as they are supposed to be reused
  • The code can safely pass a string to another method, knowing that it can't be altered by that method
  • Immutability automatically makes this class thread-safe
  • Since this class is thread-safe, there is no need to synchronize common data, which in turn improves performance
  • Since they are guaranteed to not change, their hashcode can be easily cached
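
As a minimal sketch of the second point in the list above, the hypothetical helper shout below receives a string and derives a new one from it, while the caller's original value stays untouched. The class and method names are illustrative assumptions and are not part of the article's sample code.

```java
import java.util.Locale;

class ImmutabilityDemo {

    // Hypothetical helper: "changes" its argument by building a new String.
    static String shout(String input) {
        // toUpperCase and + both return new String objects; input itself is never altered
        return input.toUpperCase(Locale.ROOT) + "!";
    }

    public static void main(String[] args) {
        String original = "hello";
        String result = shout(original);

        System.out.println(original); // prints "hello"
        System.out.println(result);   // prints "HELLO!"
    }
}
```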

Q5. How Is a String Stored in Memory?

According to the JVM Specification, String literals are stored in a runtime constant pool, which is allocated from the JVM's method area.

Although the method area is logically part of the heap memory, the specification does not dictate the location, memory size, or garbage collection policies. It can be implementation-specific.

This runtime constant pool for a class or interface is constructed when the class or interface is created by the JVM.

Q6. Are Interned Strings Eligible for Garbage Collection in Java?

Yes, all Strings in the string pool are eligible for garbage collection if there are no references from the program.

Q7. What Is the String Constant Pool?

The string pool, also known as the String constant pool or the String intern pool, is a special memory region where the JVM stores String instances.

It optimizes application performance by reducing how often and how many strings are allocated:

  • The JVM stores only one copy of a particular String in the pool
  • When creating a new String, the JVM searches in the pool for a String having the same value
  • If found, the JVM returns the reference to that String without allocating any additional memory
  • If not found, then the JVM adds it to the pool (interns it) and returns its reference

Q8. Is String Thread-Safe?

Strings are indeed completely thread-safe because they are immutable. Any class which is immutable automatically qualifies for thread-safety because its immutability guarantees that its instances won't be changed across multiple threads.

For example, if a thread changes a string's value, a new String gets created instead of modifying the existing one.

Q9. For Which String Operations Is It Important to Supply a Locale?

The Locale class allows us to differentiate between cultural locales as well as to format our content appropriately.

When it comes to the String class, we need it when rendering strings in format or when lower- or upper-casing strings.

In fact, if we forget to do this, we can run into problems with portability, security, and usability.
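
A well-known illustration of the casing problem is Turkish, where the lowercase letter i upper-cases to a dotted capital İ (U+0130) rather than to I. The sketch below assumes a hypothetical helper named upper and uses Unicode escapes so that it does not depend on the source file's encoding:

```java
import java.util.Locale;

class LocaleCasingDemo {

    // Upper-cases text using an explicitly supplied locale.
    static String upper(String text, Locale locale) {
        return text.toUpperCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // Locale-neutral rules map 'i' to the plain capital 'I'
        System.out.println(upper("title", Locale.ROOT).equals("TITLE"));  // prints "true"
        // Turkish rules map 'i' to the dotted capital I (U+0130) instead
        System.out.println(upper("title", turkish).equals("T\u0130TLE")); // prints "true"
    }
}
```

Code that compares upper-cased identifiers or protocol keywords would therefore typically pass Locale.ROOT, while text shown to users would take the user's own locale.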

Q10. What Is the Underlying Character Encoding for Strings?

According to String's Javadocs for versions up to and including Java 8, Strings are stored in the UTF-16 format internally.

The char data type and java.lang.Character objects are also based on the original Unicode specification, which defined characters as fixed-width 16-bit entities.

Starting with JDK 9, Strings that contain only 1-byte characters use Latin-1 encoding, while Strings with at least 1 multi-byte character use UTF-16 encoding.

3. String API

In this section, we'll discuss some questions related to the String API.

Q11. How Can We Compare Two Strings in Java? What's the Difference Between str1 == str2 and str1.equals(str2)?

We can compare strings in two different ways: by using the equality operator (==) and by using the equals() method.

Both are quite different from each other:

  • The operator (str1 == str2) checks for referential equality
  • The method (str1.equals(str2)) checks for lexical equality

Though, it's true that if two strings are lexically equal, then str1.intern() == str2.intern() is also true.

Typically, for comparing two Strings for their content, we should always use String.equals.
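
The difference is easiest to see with a string that is built at run time. In the sketch below (class and method names are assumptions made for illustration), prefix is not declared final, so prefix + "b" is not a compile-time constant and produces a new object that is equal to, but not the same as, the literal:

```java
class StringEqualityDemo {

    // Referential equality: do both variables point to the same object?
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "ab";
        String prefix = "a";         // not final, so not a compile-time constant
        String built = prefix + "b"; // concatenated at run time, a new object

        System.out.println(sameReference(literal, built));          // prints "false"
        System.out.println(literal.equals(built));                  // prints "true"
        System.out.println(sameReference(literal, built.intern())); // prints "true"
    }
}
```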

Q12. How Can We Split a String in Java?

The String class itself provides us with the String#split method, which accepts a regular expression delimiter. It returns a String[] array:

String[] parts = "john,peter,mary".split(",");
assertArrayEquals(new String[] { "john", "peter", "mary" }, parts);

One tricky thing about split is that when splitting an empty string, we get a non-empty array:

assertArrayEquals(new String[] { "" }, "".split(","));

Of course, split is just one of many ways to split a Java String.
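
Two further details of split tend to surprise people: trailing empty strings are dropped unless a negative limit is passed, and the delimiter is a regular expression, so characters such as "." need escaping. The following sketch shows both; splitLiteral is a hypothetical helper introduced only for this example:

```java
import java.util.regex.Pattern;

class SplitDemo {

    // Splits on a literal delimiter and keeps trailing empty fields.
    static String[] splitLiteral(String text, String delimiter) {
        // Pattern.quote escapes regex metacharacters; a negative limit keeps trailing empties
        return text.split(Pattern.quote(delimiter), -1);
    }

    public static void main(String[] args) {
        System.out.println("a,b,,".split(",").length);         // prints 2: trailing empty strings are dropped
        System.out.println("a,b,,".split(",", -1).length);     // prints 4: a negative limit keeps them
        System.out.println("a.b.c".split(".").length);         // prints 0: "." matches every character
        System.out.println(splitLiteral("a.b.c", ".").length); // prints 3
    }
}
```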

Q13. What Is StringJoiner?

StringJoiner is a class introduced in Java 8 for joining separate strings into one, like taking a list of colors and returning them as a comma-delimited string. We can supply a delimiter as well as a prefix and suffix:

    StringJoiner joiner = new StringJoiner(",", "[", "]");
    joiner.add("Red")
      .add("Green")
      .add("Blue");
    
    assertEquals("[Red,Green,Blue]", joiner.toString());

Q14. What's the Difference Between String, StringBuffer, and StringBuilder?

Strings are immutable. This means that if we try to change or alter its values, then Java creates an absolutely new String.

For example, if we add to a string str1 after it has been created:

String str1 = "abc";
str1 = str1 + "def";

Then the JVM, instead of modifying str1, creates an entirely new String.

However, for most simple cases, the compiler optimizes the above code internally (with a StringBuilder before JDK 9, and with invokedynamic-based concatenation since then).

But for more complex code, such as concatenation inside a loop, every iteration creates an entirely new String and copies the previous contents into it, which hurts performance. This is where StringBuilder and StringBuffer are useful.

Both StringBuilder and StringBuffer in Java create objects that hold a mutable sequence of characters. StringBuffer is synchronized and therefore thread-safe whereas StringBuilder is not.

Since the extra synchronization in StringBuffer is typically unnecessary, we can often get a performance boost by selecting StringBuilder.
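
To make the loop case concrete, here is a small sketch with two hypothetical methods that produce the same text. The first allocates a new String on every iteration, while the second appends to a single mutable buffer:

```java
class ConcatenationDemo {

    // Builds "012...(n-1)" with +=; every iteration allocates a new String.
    static String concatWithPlus(int n) {
        String result = "";
        for (int i = 0; i < n; i++) {
            result += i; // copies the previous contents into a fresh String
        }
        return result;
    }

    // Builds the same text by appending to one mutable buffer.
    static String concatWithBuilder(int n) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < n; i++) {
            builder.append(i);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(concatWithPlus(5));    // prints "01234"
        System.out.println(concatWithBuilder(5)); // prints "01234"
    }
}
```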

Q15. Why Is It Safer to Store Passwords in a char[] Array Rather Than a String?

Since strings are immutable, they don't allow modification. This behavior keeps us from overwriting, modifying, or zeroing out its contents, making Strings unsuitable for storing sensitive information.

We have to rely on the garbage collector to remove a string's contents. Moreover, in Java versions 6 and below, strings were stored in PermGen, meaning that once a String was created, it was never garbage collected.

By using a char[] array, we have complete control over that information. We can modify it or wipe it completely without even relying on the garbage collector.

Using char[] over String doesn't completely secure the information; it's just an extra measure that reduces an opportunity for the malicious user to gain access to sensitive information.
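
A minimal sketch of the wiping step might look like this. The method name verifyAndWipe is an assumption, and the plain Arrays.equals comparison is for illustration only; a real system would normally compare salted hashes instead:

```java
import java.util.Arrays;

class PasswordDemo {

    // Hypothetical check: compares the supplied password, then wipes it.
    static boolean verifyAndWipe(char[] supplied, char[] expected) {
        try {
            return Arrays.equals(supplied, expected);
        } finally {
            // Overwrite the caller's array in place, whatever the outcome
            Arrays.fill(supplied, '\0');
        }
    }

    public static void main(String[] args) {
        char[] supplied = { 's', '3', 'c', 'r', '3', 't' };
        char[] expected = { 's', '3', 'c', 'r', '3', 't' };

        System.out.println(verifyAndWipe(supplied, expected)); // prints "true"
        System.out.println(supplied[0] == '\0');               // prints "true"
    }
}
```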

Q16. What Does String's intern() Method Do?

The method intern() creates an exact copy of a String object in the heap and stores it in the String constant pool, which the JVM maintains.

Java automatically interns all strings created using string literals, but if we create a String using the new operator, for example, String str = new String("abc"), then Java adds it to the heap, just like any other object.

We can call the intern() method to tell the JVM to add it to the string pool if it doesn't already exist there, and return a reference of that interned string:

    String s1 = "Baeldung";
    String s2 = new String("Baeldung");
    String s3 = new String("Baeldung").intern();
    
    assertThat(s1 == s2).isFalse();
    assertThat(s1 == s3).isTrue();

Q17. How Can We Convert Between String and Integer?

The most straightforward approach to convert a String to an Integer is by using Integer#parseInt:

int num = Integer.parseInt("22");

To do the reverse, we can use Integer#toString:

String s = Integer.toString(num);
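
Note that Integer.parseInt throws a NumberFormatException when the text is not a valid integer. One way to handle that, sketched here with a hypothetical parseOrDefault helper, is to fall back to a default value:

```java
class ConversionDemo {

    // Hypothetical helper: parses text as an int, or returns fallback if it isn't a valid number.
    static int parseOrDefault(String text, int fallback) {
        if (text == null) {
            return fallback;
        }
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("22", 0));      // prints 22
        System.out.println(parseOrDefault(" 7 ", 0));     // prints 7
        System.out.println(parseOrDefault("twenty", -1)); // prints -1
        System.out.println(String.valueOf(22));           // prints "22"
    }
}
```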

Q18. What Is String.format() and How Can We Use It?

String#format returns a formatted string using the specified format string and arguments.

String title = "Baeldung"; 
String formatted = String.format("Title is %s", title);
assertEquals("Title is Baeldung", formatted);

We also need to remember to specify the user's Locale, unless we are okay with simply accepting the operating system default:

Locale usersLocale = Locale.ITALY;
assertEquals("There are 1.024 shirts to choose from. Good luck.",
  String.format(usersLocale, "There are %,d shirts to choose from. Good luck.", 1024));

Q19. How Can We Convert a String to Uppercase and Lowercase?

String provides String#toUpperCase to change the casing to uppercase.

Though, the Javadocs remind us that we need to specify the user's Locale to ensure correctness:

String s = "Welcome to Baeldung!";
assertEquals("WELCOME TO BAELDUNG!", s.toUpperCase(Locale.US));

Similarly, to convert to lowercase, we have String#toLowerCase:

String s = "Welcome to Baeldung!";
assertEquals("welcome to baeldung!", s.toLowerCase(Locale.UK));

Q20. How Can We Get a Character Array from a String?

String provides toCharArray, which returns a copy of its internal char array pre-JDK9 (and converts the String to a new char array in JDK9+):

char[] hello = "hello".toCharArray();
assertArrayEquals(new char[] { 'h', 'e', 'l', 'l', 'o' }, hello);

Q21. How Would We Convert a Java String into a Byte Array?

By default, the method String#getBytes() encodes a String into a byte array using the platform’s default charset.

And while the API doesn't require that we specify a charset, we should in order to ensure security and portability:

byte[] byteArray2 = "efgh".getBytes(StandardCharsets.US_ASCII);
byte[] byteArray3 = "ijkl".getBytes("UTF-8");
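
To see why the choice of charset matters, the sketch below encodes the same text with three charsets. The helper names are assumptions made for this example; it relies on the StandardCharsets constants, which also avoid the checked UnsupportedEncodingException thrown by the charset-name overload:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {

    // Number of bytes the text occupies in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    // Encodes and then decodes the text with the same charset.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café", written with an escape for the accented letter

        System.out.println(encodedLength(text, StandardCharsets.UTF_8));      // prints 5
        System.out.println(encodedLength(text, StandardCharsets.ISO_8859_1)); // prints 4
        // US-ASCII cannot represent U+00E9, so getBytes substitutes '?'
        System.out.println(roundTrip(text, StandardCharsets.US_ASCII));       // prints "caf?"
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text)); // prints "true"
    }
}
```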

4. String-Based Algorithms

In this section, we'll discuss some programming questions related to Strings.

Q22. How Can We Check If Two Strings Are Anagrams in Java?

An anagram is a word formed by rearranging the letters of another given word, for example, “car” and “arc”.

To begin, we first check whether both the Strings are of equal length or not.

Then we convert them to char[] array, sort them, and then check for equality.
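
The steps above can be sketched as follows. The class and method names are assumptions, and the comparison is case-sensitive and works on char values, which is enough for simple words like the example:

```java
import java.util.Arrays;

class AnagramDemo {

    // Returns true if both strings contain exactly the same characters.
    static boolean isAnagram(String first, String second) {
        // Strings of different lengths can never be anagrams
        if (first.length() != second.length()) {
            return false;
        }
        char[] a = first.toCharArray();
        char[] b = second.toCharArray();
        // After sorting, anagrams become identical arrays
        Arrays.sort(a);
        Arrays.sort(b);
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("car", "arc")); // prints "true"
        System.out.println(isAnagram("car", "cat")); // prints "false"
    }
}
```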

Q23. How Can We Count the Number of Occurrences of a Given Character in a String?

Java 8 really simplifies aggregation tasks like these:

long count = "hello".chars().filter(ch -> (char)ch == 'l').count();
assertEquals(2, count);

And, there are several other great ways to count the l's, too, including loops, recursion, regular expressions, and external libraries.

Q24. How Can We Reverse a String in Java?

There can be many ways to do this, the most straightforward approach being to use the reverse method from StringBuilder (or StringBuffer):

String reversed = new StringBuilder("baeldung").reverse().toString();
assertEquals("gnudleab", reversed);

Q25. How Can We Check If a String Is a Palindrome or Not?

A palindrome is any sequence of characters that reads the same backward as forward, such as “madam”, “radar” or “level”.

To check if a string is a palindrome, we can start iterating the given string forward and backward in a single loop, one character at a time. The loop exits at the first mismatch.
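
One possible sketch of that loop uses two indexes that move toward each other. The names are assumptions, and the check is case-sensitive and treats spaces and punctuation as ordinary characters:

```java
class PalindromeDemo {

    // Compares characters from both ends, stopping at the first mismatch.
    static boolean isPalindrome(String text) {
        int left = 0;
        int right = text.length() - 1;
        while (left < right) {
            if (text.charAt(left) != text.charAt(right)) {
                return false;
            }
            left++;
            right--;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPalindrome("madam")); // prints "true"
        System.out.println(isPalindrome("hello")); // prints "false"
    }
}
```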

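5. Conclusion

In this article, we went through some of the most common String interview questions.

All the code samples used here are available on GitHub.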