Mar 18, 2023

Working with Endless Numbers

While I was learning Rust, hex2dec was one of the first programs I wrote.

Idea was simple; take a hex number, spit out its decimal version. Back then, it looked something like this:

fn main() {
  let hex = match std::env::args().nth(1) {
    Some(v) => v,
    None => {
      println!("hex2dec <hex>");
      return;
    }
  };

  // Trim the 0x prefix if it exists
  let hex = if hex.contains("x") {
    &hex.as_str()[2..]
  } else {
    hex.as_str()
  };

  match u128::from_str_radix(hex, 16) {
    Ok(v) => println!("{}", v),
    Err(e) => {
      match e.kind() {
        IntErrorKind::InvalidDigit => println!("Invalid digit"),
        IntErrorKind::Empty => println!("Empty value"),
        IntErrorKind::PosOverflow => println!("Positive overflow"),
        _ => println!("Unknown error"),
      }
      return;
    }
  };
}

Even though it does the job for small numbers, it is nowhere near to what Python offers by default.

# Python 3.10.0 (v3.10.0:b494f5935c, Oct  4 2021, 14:59:19) [Clang 12.0.5 (clang-1205.0.22.11)] on darwin
>>> 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
6668014432879854274079851790721257797144758322315908160396257811764037237817632071521432200871554290742929910593433240445028051694757062217144716328329826087985255648570940054186191833731598150723351284592671484221644804225508737300248046250

Having arbitrary length arithmetic operations like Python would require much more than figuring out how to convert a hex to decimal but I knew from a hunch that there’s should be a straightforward way to convert one base to another.

Searching for this issue, I found this site which has what I wanted. I’m still trying to wrap my head around it but the algorithm is the following:

First, represent a number as an array of its digits. Reverse order makes operations more easier. So 1024 becomes [4, 2, 0, 1] Let’s wrap the array with a costless struct.

#[derive(Debug, Clone)]
struct DigitArray<const BASE: u8>(Vec<u8>);

I’ll stick to the const generics while implementing it, because the behaviour will be same for all bases. Now that we have this representation, it is more effective to implement scalar multiplication and addition. These will be useful later

Addition is same as elementary school addition. Add both numbers digit by digit and store the carry.

impl<const BASE: u8> std::ops::Add<Self> for &DigitArray<BASE> {
  type Output = DigitArray<BASE>;

  fn add(self, rhs: Self) -> Self::Output {
    let mut sum = Vec::<u8>::new();
    let digits = self.0.len().max(rhs.0.len());
    let mut carry: u8 = 0;

    let mut i: usize = 0;
    loop {
      if i >= digits && carry == 0 {
        break;
      }

      let self_i = if i < self.0.len() { self.0[i] } else { 0 };
      let rhs_i = if i < rhs.0.len() { rhs.0[i] } else { 0 };
      let sum_i = self_i + rhs_i + carry;

      sum.push(sum_i % BASE);
      carry = sum_i / BASE;
      i += 1;
    }

    DigitArray::<BASE>(sum)
  }
}

Whereas scalar multiplication works by disecting number to its binary digits and adding 2^n * number to a total sum. Let’s say our number is ([2, 1] base 10 = 12) and we’re multiplying it with 6 (110). It would be same as adding (2^2 * 12) and (2^1 * 12). Once we have this multiplying the number with itself for 2’s powers would be adding it over and over.

impl<const BASE: u8> std::ops::Mul<u8> for DigitArray<BASE> {
  type Output = Self;

  fn mul(self, rhs: u8) -> Self::Output {
    let mut result = DigitArray::new(vec![]);

    if rhs == 0 {
      return result;
    }

    let mut num = rhs;
    let mut power = self;
    loop {
      if num & 1 == 1 {
        result = &result + &power;
      }

      num >>= 1;
      if num == 0 {
        break result;
      }

      power = &power + &power;
    }
  }
}

With those two functions converting number X with base A to number Y with base B has these steps:

Start with sum 0
For each digit of X where di is the digit index and d is the digit: sum += (A^(di+1) * d base B)

We can have a variable power with initial value (1 base B) and multiply it with A each iteration. In this case, algorith becomes:

Start with sum 0 and power (1 base B)
sum += power * d
power += A

Here’s a simple, and most likely non-idimoatic way to do this:

fn convert_base<const FROM_BASE: u8, const TO_BASE: u8>(
  decimal_str: String,
) -> Result<String, &'static str> {
  let digits: DigitArray<FROM_BASE> = match decimal_str.try_into() {
    Ok(v) => v,
    Err(e) => return Err(e),
  };

  let mut out_array = DigitArray::<TO_BASE>::new(vec![]);
  let mut power = DigitArray::<TO_BASE>::new(vec![1]);

  for digit in digits.into_inner() {
    out_array = &out_array + &(power.clone() * digit);
    power = power.clone() * FROM_BASE;
  }

  Ok(out_array
    .into_inner()
    .into_iter()
    .rev()
    .map(|v| char::from_digit(v.into(), TO_BASE.into()).expect("can't convert to char"))
    .collect())
}

For the sake of ergonomy, we can define TryFrom for the DigitArray.

impl<const BASE: u8> DigitArray<BASE> {
  fn new(inner: Vec<u8>) -> Self {
    Self(inner)
  }

  fn inner(&self) -> &Vec<u8> {
    &self.0
  }
}

impl<const BASE: u8> TryFrom<String> for DigitArray<BASE> {
  type Error = &'static str;

  fn try_from(value: String) -> Result<Self, Self::Error> {
    let mut result = Vec::<u8>::new();

    for char in value.chars().rev() {
      let digit = if let Some(digit) = char.to_digit(BASE.into()) {
        digit
      } else {
        return Err("not a valid number");
      };

      let digit: u8 = match digit.try_into() {
        Ok(v) => v,
        Err(_) => return Err("overflow u32"),
      };

      result.push(digit);
    }

    Ok(Self(result))
  }
}

which concludes the implementation. Now we can call the function using:

fn main() {
  let hex_value = String::from("ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
  assert_eq!(
    convert_base::<16,10>(hex_value),
    String::from("6668014432879854274079851790721257797144758322315908160396257811764037237817632071521432200871554290742929910593433240445028051694757062217144716328329826087985255648570940054186191833731598150723351284592671484221644804225508737300248046250")
  );
}