1. Overview
Universally Unique Identifier (UUID) represents a 128-bit number that is designed to be globally unique. In practice, UUIDs are suitable for use in situations that require unique identification, such as creating the primary key for a database table.
Java provides the long primitive, which is a data type that is easy to read and understand by humans. In many cases, using a 64-bit long can provide sufficient uniqueness with a low collision probability. Furthermore, databases like MySQL, PostgreSQL, and many others have been optimized to work efficiently with numeric data types.
In this article, we’ll discuss generating unique positive long values using UUID, focusing on version 4 UUIDs.
2. Generate Unique Positive long
This scenario presents an interesting challenge because UUIDs have a 128-bit range. Meanwhile, a long value is only 64-bit. This means there’s a reduction in the potential for uniqueness. We’ll discuss how to obtain unique positive long values using randomly generated UUIDs and how effective it is as an approach.
2.1. Using getLeastSignificantBits()
The method getLeastSignificantBits() of the UUID class returns the lowest 64 bits of the UUID. This means it only provides half of the 128-bit UUID value.
So, let’s call it after the randomUUID() method:
long randomPositiveLong = Math.abs(UUID.randomUUID().getLeastSignificantBits());
We’ll continue to use Math.abs() in each of the following methods to ensure positive values.
2.2. Using getMostSignificantBits()
Similarly, the getMostSignificantBits() method in the UUID class also returns a 64-bit long. The difference is that it takes the bits that are located in the highest positions in the 128-bit UUID value.
Again, we’ll chain it after the randomUUID() method:
long randomPositiveLong = Math.abs(UUID.randomUUID().getMostSignificantBits());
Let’s assert that the resulting values are positive (actually non-negative because it is possible to generate a value of 0):
assertThat(randomPositiveLong).isNotNegative();
3. How Effective?
Let’s discuss how effective the use of UUIDs is in this case. To find out, we’ll look at several factors.
3.1. Security and Efficiency
UUID.randomUUID() in Java uses the SecureRandom class to generate secure random numbers to produce a version 4 UUID. This is especially important if the resulting UUID will be used in a security or cryptographic context.
To assess the efficiency and relevance of using UUIDs for generating unique positive long values, let’s see the source code:
public static UUID randomUUID() {
SecureRandom ng = Holder.numberGenerator;
byte[] randomBytes = new byte[16];
ng.nextBytes(randomBytes);
randomBytes[6] &= 0x0f; /* clear version */
randomBytes[6] |= 0x40; /* set to version 4 */
randomBytes[8] &= 0x3f; /* clear variant */
randomBytes[8] |= (byte) 0x80; /* set to IETF variant */
return new UUID(randomBytes);
}
The method uses SecureRandom to generate 16 random bytes forming a UUID, then adjusts several bits in those bytes to specify the UUID version (version 4) and UUID variant (IETF).
While UUIDs offer powerful features, a simpler approach can achieve the desired outcome in this specific case. Consequently, alternative methods might offer improved efficiency and better fit.
Additionally, this approach may reduce the randomness of the generated bits.
3.2. Uniqueness and Collision Probability
Although UUID v4 has a range of 128 bits, four bits are used to indicate version 4, and two bits are used to indicate the variant. As we know, in the UUID display format, each character represents a four-bit hexadecimal digit:
xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx
The number 4 indicates the four-bit UUID version (in this case, version 4). Meanwhile, the letter ‘y’ will include the two-bit IETF variant. The remaining x part contains 122 random bits. So, we can have a collision probability of 1/2^122.
The RFC explains in general how UUID v4 is created. To see more clearly how the bit positions are distributed, we can look again at the implementation of UUID.randomUUID():
randomBytes[6] &= 0x0f; /* clear version */
randomBytes[6] |= 0x40; /* set to version 4 */
We can see that in randomBytes[6], there are four bits set as a version marker in the Most Significant Bits (MSB) position. So, there are only 60 truly random bits in this MSB. Therefore, we can calculate the probability of collision on the MSB as 1/2^59.
After adding Math.abs(), the chance of collision is doubled due to overlapping positives and negatives. So, the probability of a collision of a positive long value in the MSB is 1 in 2^58.
Then, in randomBytes[8], there are two bits set as IETF variants in the Least Significant Bit (LSB) position:
randomBytes[8] &= 0x3f; /* clear variant */
randomBytes[8] |= (byte) 0x80; /* set to IETF variant */
So, there are only 62 truly random bits in the LSB. Therefore, we can calculate the probability of collision on the LSB as 1/2^61.
After adding Math.abs(), the chance of collision is doubled due to overlapping positives and negatives. So, the probability of a collision of a positive long value in the LSB is 1 in 2^60.
So, we can see that the probability of collision is small. However, if we ask whether the contents of UUIDs are completely random, then the answer is no.
3.3. Suitability for Purpose
UUIDs are designed for global uniqueness, identification, security, and portability. For generating unique positive long values, more efficient methods exist, making UUIDs unnecessary in this context.
While UUID.randomUUID() boasts 128-bit length, we’ve seen that only 122 bits are actually random, and Java’s long data type only handles 64 bits. When converting 128-bit UUIDs to 64-bit longs, some unique potential is lost. Consider this trade-off if uniqueness is critical.
4. SecureRandom as an Alternative
If we need unique long values, it makes more sense to use a random number generator with an appropriate range (for example, using SecureRandom to generate unique random long values). This will ensure that we have unique long values in the appropriate range, without losing most of the unique bits as happens when we try to use UUIDs.
SecureRandom secureRandom = new SecureRandom();
long randomPositiveLong = Math.abs(secureRandom.nextLong());
It also has a lower probability of collision because it generates completely random 64-bit long values.
To ensure positive values, we simply add Math.abs(). Consequently, the probability of collision is calculated as 1/2^62. In decimal form, this probability is approximately 0.000000000000000000216840434497100900. For most practical applications, we can consider this low probability to be insignificant.
5. Conclusion
In conclusion, although UUIDs provide globally unique identifiers, they may not be the most efficient choice for generating unique positive long values, as significant bit loss occurs.
Utilizing methods like getMostSignificantBits() and getLeastSignificantBits() can still provide low collision probabilities, but using a random number generator like SecureRandom might be more efficient and suitable for generating unique positive long values directly.
As always, the full source code is available over on GitHub.