Generate CUIDs in Rust

A CUID (Collision-resistant Unique Identifier) combines four components: the current timestamp, a counter, a machine fingerprint, and random characters. Below is an example implementation of generating a CUID in Rust.

Steps to Generate a CUID in Rust

  1. Timestamp: Get the current timestamp in milliseconds.
  2. Counter: Use a counter to distinguish multiple CUIDs generated within the same millisecond.
  3. Fingerprint: Generate a machine-specific fingerprint.
  4. Randomness: Add random characters to further reduce the risk of collisions.
  5. Base36 Encoding: Encode the numeric components (timestamp and counter) in Base36 so the ID stays compact and uses only digits and lowercase letters.
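
The steps above can be sketched without any external crates by separating the pure assembly step from the parts that read the clock, hostname, or random generator. This is a minimal illustration, not the article's full implementation: `to_base36` and `assemble_cuid` are hypothetical helper names, and the fingerprint and random parts are passed in as ready-made strings.

```rust
/// Encodes `value` in base 36 using the digits 0-9 and a-z.
fn to_base36(mut value: u64) -> String {
    const DIGITS: &[u8; 36] = b"0123456789abcdefghijklmnopqrstuvwxyz";
    if value == 0 {
        return "0".to_string();
    }
    let mut out = Vec::new();
    while value > 0 {
        // Collect digits least significant first, then reverse.
        out.push(DIGITS[(value % 36) as usize]);
        value /= 36;
    }
    out.reverse();
    // Every byte comes from DIGITS, so the buffer is valid ASCII.
    String::from_utf8(out).expect("base36 digits are ASCII")
}

/// Hypothetical helper: joins already-computed parts in the order
/// prefix, timestamp, counter, fingerprint, random characters.
fn assemble_cuid(timestamp_ms: u64, counter: u64, fingerprint: &str, random: &str) -> String {
    format!(
        "c{}{}{}{}",
        to_base36(timestamp_ms),
        to_base36(counter),
        fingerprint,
        random
    )
}
```

With fixed inputs the result is deterministic, which makes the layout easy to test: `assemble_cuid(36, 0, "ab12", "x9yz")` yields `c100ab12x9yz` (prefix `c`, timestamp `10`, counter `0`, then the fingerprint and random parts).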

Example Code

Here's a Rust implementation to generate a CUID:

// External crates: rust-crypto (imported as `crypto`), rand (0.8 API), and hostname.

use std::time::{SystemTime, UNIX_EPOCH};
use crypto::digest::Digest;
use crypto::md5::Md5;
use rand::Rng;

const BASE36_CHARS: &[u8] = b"0123456789abcdefghijklmnopqrstuvwxyz";

// Converts `value` to a lowercase Base36 string (digits 0-9, then a-z).
fn encode_base36(mut value: u64) -> String {
    if value == 0 {
        return "0".to_string();
    }
    let mut result = Vec::new();
    while value > 0 {
        result.push(BASE36_CHARS[(value % 36) as usize]);
        value /= 36;
    }
    result.reverse();
    String::from_utf8(result).unwrap()
}

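// Returns the first 4 hex characters of the MD5 hash of the hostname,
// falling back to "unknown" if the hostname cannot be read.
fn get_machine_fingerprint() -> String {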
    let hostname = match hostname::get() {
        Ok(host) => host.to_str().unwrap_or("unknown").to_string(),
        Err(_) => "unknown".to_string(),
    };
    let mut hasher = Md5::new();
    hasher.input_str(&hostname);
    let hash = hasher.result_str();
    hash.chars().take(4).collect::<String>()
}

// Returns `length` characters drawn uniformly from BASE36_CHARS.
fn get_random_string(length: usize) -> String {
    let mut rng = rand::thread_rng();
    (0..length).map(|_| {
        let idx = rng.gen_range(0..BASE36_CHARS.len());
        BASE36_CHARS[idx] as char
    }).collect()
}

fn generate_cuid() -> String {
    let timestamp = SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_millis() as u64;

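    // `static mut` state is unsynchronized, hence the `unsafe` blocks below.
    // This is only sound if generate_cuid is never called from more than one thread.
    static mut COUNTER: u64 = 0;
    static mut LAST_TIMESTAMP: u64 = 0;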

    unsafe {
        if timestamp == LAST_TIMESTAMP {
            COUNTER += 1;
        } else {
            LAST_TIMESTAMP = timestamp;
            COUNTER = 0;
        }
    }

    let timestamp_base36 = encode_base36(timestamp);
    let counter_base36 = encode_base36(unsafe { COUNTER });
    let fingerprint = get_machine_fingerprint();
    let random_string = get_random_string(4);

    format!("c{}{}{}{}", timestamp_base36, counter_base36, fingerprint, random_string)
}

fn main() {
    println!("{}", generate_cuid());
}
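
The `static mut` counter in the listing above is a data race waiting to happen if `generate_cuid` is ever called from several threads. One possible alternative, sketched here using only the standard library, keeps the last timestamp and the counter behind a `Mutex`. The names `STATE` and `next_counter` are hypothetical and not part of the original code.

```rust
use std::sync::Mutex;

/// Last timestamp seen and the counter value handed out for it.
/// `None` means no ID has been generated yet.
static STATE: Mutex<Option<(u64, u64)>> = Mutex::new(None);

/// Returns the counter to use for `timestamp_ms`: 0 for a new millisecond,
/// otherwise one more than the previous value for that millisecond.
fn next_counter(timestamp_ms: u64) -> u64 {
    // A poisoned lock only means another thread panicked while holding it;
    // the stored tuple is still usable, so recover the guard.
    let mut state = STATE.lock().unwrap_or_else(|e| e.into_inner());
    let counter = match *state {
        Some((last, count)) if last == timestamp_ms => count + 1,
        _ => 0,
    };
    *state = Some((timestamp_ms, counter));
    counter
}
```

In `generate_cuid`, the counter could then be obtained with `next_counter(timestamp)` in place of the two `unsafe` blocks. The lock makes the compare-and-update step atomic, so two threads that see the same millisecond never receive the same counter value.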

Explanation

  1. Base36 Encoding:

    • The encode_base36 function converts a number to a Base36 encoded string.
  2. Timestamp:

    • SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_millis() as u64 gets the current timestamp in milliseconds since the Unix epoch.
  3. Counter:

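    • A static COUNTER variable distinguishes CUIDs generated within the same millisecond: it is incremented when the timestamp matches LAST_TIMESTAMP and reset to 0 otherwise. Accessing a static mut requires unsafe because the compiler cannot rule out data races on it, so this version is only sound when generate_cuid is called from a single thread.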
  4. Machine Fingerprint:

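    • get_machine_fingerprint fetches the machine's hostname (falling back to "unknown" on failure) and computes an MD5 hash of it, returning the first 4 hexadecimal characters of the hash. MD5 is used here only to spread hostnames over a short string, not for security.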
  5. Random String:

    • get_random_string generates a random string of specified length using characters from BASE36_CHARS.
  6. Combine Components:

    • The generate_cuid function combines the timestamp, counter, fingerprint, and random parts to form the final CUID.
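
Because encode_base36 returns strings of varying length, the timestamp and counter fields have no fixed width, so the boundaries between components cannot be recovered from the ID alone. One possible adjustment is to left-pad the counter with zeros. The sketch below assumes a hypothetical helper named `pad_base36` and an arbitrary width chosen by the caller. Neither is part of the original code.

```rust
/// Hypothetical helper: base-36 encodes `value` into exactly `width` characters,
/// left-padding with '0' and keeping only the low-order digits on overflow.
fn pad_base36(mut value: u64, width: usize) -> String {
    const DIGITS: &[u8; 36] = b"0123456789abcdefghijklmnopqrstuvwxyz";
    let mut out = vec![b'0'; width];
    // Fill from the right so the least significant digit lands last.
    for slot in out.iter_mut().rev() {
        *slot = DIGITS[(value % 36) as usize];
        value /= 36;
    }
    // Every byte comes from DIGITS, so the buffer is valid ASCII.
    String::from_utf8(out).expect("base36 digits are ASCII")
}
```

For example, `pad_base36(35, 4)` gives `000z` and `pad_base36(36, 4)` gives `0010`. A value of 36^4 = 1,679,616 no longer fits in four digits and wraps around to `0000`, which is the trade-off of a fixed width.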

Summary

This Rust implementation of CUID generation uses external crates (rust-crypto for MD5 hashing, rand for randomness, and hostname for the machine name) together with the standard library for timestamps. Combining a timestamp, counter, fingerprint, and random suffix makes collisions unlikely rather than impossible, and the resulting IDs are compact and readable. Adjustments can be made based on specific requirements, such as making the counter thread-safe, handling errors when fetching the machine's hostname, or changing the length of the random string portion.
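
To get a feel for how the length of the random string affects collision risk, the size of the random space and a birthday-bound estimate can be computed directly. This is a rough sketch under stated assumptions: it considers only IDs that already share the same timestamp, counter, and fingerprint (for example, separate processes on one host), it assumes the random characters are uniform, and the function names are hypothetical.

```rust
/// Number of distinct strings of `len` base-36 characters (36^len).
/// Overflows u64 for `len` greater than 12.
fn random_space(len: u32) -> u64 {
    36u64.pow(len)
}

/// Birthday-bound approximation, 1 - exp(-k(k-1) / 2N), of the chance that at
/// least two of `k` uniformly random suffixes of length `len` coincide.
fn collision_probability(k: u64, len: u32) -> f64 {
    let n = random_space(len) as f64;
    let pairs = (k as f64) * (k.saturating_sub(1) as f64) / 2.0;
    // -expm1(-x) equals 1 - exp(-x) but is more accurate for small x.
    -(-pairs / n).exp_m1()
}
```

With the 4 random characters used in the example code there are 36^4 = 1,679,616 possible suffixes. Doubling the length to 8 raises that to 36^8 = 2,821,109,907,456, which lowers the estimated collision probability for the same number of IDs accordingly.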