Mahir Exercism • cpp

All Your Base

Ringkasan Tantangan

Convert a number, represented as a sequence of digits in one base, to any other base.

Visit the all-your-base exercise on Exercism to read the full instructions and download the exercise files.

Dig Deeper

Parallel operations

In parallel operations

The key for this group of approaches is to convert one digit at a time. You will use the remainder operator repeatedly in tandem with the division operator or make use of the pow.

You might have heard Donald Knuth's famous words ["Premature Optimization Is the Root of All Evil"][pre-opt-blog].
With all the talk about best performance, please take care of the readability and therefore maintainability of your code.

Keep in mind, that the number of digits you need to process is the most important factor for your algorithm’s performance. A simple serial for-loop can be a lot faster for a small number than a parallel algorithm that has more overhead to be set up.

Converting the `input_digits`

Numeric’s `accumulate`

The for-loop with * and + can be written with the accumulate function from the numeric header.

#include <numeric>
unsigned int intermediate = std::accumulate(input_digits.begin(),
                                            input_digits.end(), 0U,
        [&input_base] (unsigned int sum, unsigned int d) {
            return sum * input_base + d;
        });

accumulate is guaranteed to run in order and thus is strictly serial as well.

Algorithm’s `transform_reduce` in parallel

There are several functions that can be run with a parallel policy in the algorithm header. As you need the position to calculate the correct exponent for the pow function, you do need a second container with the position information. To fill that container we can use numeric’s iota, which will produce something like {0, 1, 2, 3, ... } for us.

std::vector <unsigned int> positions(input_digits.size());
std::iota(positions.begin(), positions.end(), 0);
unsigned int intermediate = std::transform_reduce(
    std::execution::par_unseq,  // execution policy
    input_digits.cbegin(),      // first input container start
    input_digits.cend(),        // first input container end
    positions.crbegin(),        // start second container from the rear
    0U,                          // initial value
    std::plus {},               // reduce function
                                // transformation function:
    [ & input_base](unsigned int d, unsigned int pos) {
        return d * std::pow(input_base, pos);
    });

transform_reduce can work on two containers at the same time. In this case we use the input_digits and the positions from the back as exponents for the pow function in the transformation function. The reduce function is combining each result from the transformation with a simple addition (std::plus{}).

The most important part is the execution policy. The example uses std::execution::par_unseq to indicate that the transform_reduce function can be run in parallel and vectorized. With this statement you promise, that the execution does not violate any data dependencies. The function calls are only reading, we do not need to set any locks.

If the compiler will transform the code into something that actually works in parallel and/or vectorized is a whole different chapter.
There is a lot of ifs and whens that lead to the final code.
One of the more direct ways to parallel processing is the use of [Intel's thread building blocks][tbb].

Converting into `output_digits`

For-loop with `pow`

As discussed in the serial approach and above, you can use pow to move to an independent approach, that does not change the intermediate variable.

To set up the for-loop, you need to calculate the number of digits beforehand with std::log(intermediate) / std::log(output_base).

std::vector <unsigned int> output_digits{};
unsigned int digit_limit{static_cast<unsigned int>(std::log(intermediate) / std::log(output_base) + 1)};
for(unsigned int digit{digit_limit}; digit > 0; --digit) {
    output_digits.emplace_back(intermediate / static_cast<unsigned int>(std::pow(output_base, digit - 1)) % output_base);
}

With this approach you do not need to reverse the resulting vector.

Algorithm’s `for_each` in parallel

With the pow approach, you gained the option to move to parallel territory. You cannot decrement an integer like above, as the std container functions in C++17 expect an iterator. iota helps to set up the vector with the correct values.

unsigned int digit_limit{static_cast<unsigned int>(std::log(intermediate) / std::log(output_base)) + 1};
std::vector <unsigned int> output_digits(digit_limit);
std::iota(output_digits.rbegin(), output_digits.rend(), 1);
std::for_each(std::execution::par_unseq,
                output_digits.begin(),
                output_digits.end(),
                [&](unsigned int digit) {
                output_digits[digit_limit - digit] =
                    intermediate
                        / static_cast<unsigned int>(std::pow(output_base, digit - 1))
                        % output_base;
});

The for-each function then uses the prepared vector in the fashion of approach #4. The most important part is the execution policy. The approach uses std::execution::par_unseq to indicate that the transform_reduce function can be run in parallel and vectorized. With this statement you promise, that the execution does not violate any data dependencies. The function calls are never writing to the same vector element, we do not need to set any locks.

All together

#include <algorithm>
#include <cmath>
#include <execution>
#include <numeric>
#include <stdexcept>

std::vector<unsigned int> convert(unsigned int input_base,
    const std::vector<unsigned int>& input_digits, unsigned int output_base) {
    if (input_base < 2 || output_base < 2) {
        throw std::invalid_argument("Bases must be at least 2");
    }
    if (std::any_of(input_digits.begin(), input_digits.end(),
            [input_base](unsigned int digit) {
                return digit >= input_base;
            }))
        throw std::invalid_argument("Digits cannot be bigger than the base");

    std::vector <unsigned int> positions(input_digits.size());
    std::iota(positions.begin(), positions.end(), 0U);
    unsigned int intermediate = std::transform_reduce(
        std::execution::par_unseq,
        input_digits.cbegin(),
        input_digits.cend(),
        positions.crbegin(),
        0U,
        std::plus {},
        [&input_base](unsigned int d, unsigned int pos) {
            return d * std::pow(input_base, pos);
        });

    unsigned int digit_limit{static_cast<unsigned int>(std::log(intermediate) / std::log(output_base) + 1)};
    std::vector <unsigned int> output_digits(digit_limit);
    std::iota(output_digits.rbegin(), output_digits.rend(), 1);
    std::for_each(
        std::execution::par_unseq,
        output_digits.begin(),
        output_digits.end(),
        [&](unsigned int digit) {
            output_digits[digit_limit - digit] =
                intermediate / static_cast<unsigned int>(std::pow(output_base, digit - 1)) % output_base;
    });

    return output_digits;
}

In sequence operations

In sequence operation

The key for this group of approaches is to convert one digit at a time. You will use the remainder operator repeatedly in tandem with the division operator or make use of the pow.

Reversing standard containers like vectors works the same way as discussed in the reverse-string exercise. Depending on the algorithm to build the outgoing vector, the reversion might be unnecessary.

Converting the `input_digits`

First, you want to build the immediate value that is used for the translation into the outgoing base. If you follow the description word by word, you might get something like this:

For-loop with `pow`

unsigned int intermediate = 0;
for (std::vector<unsigned>::size_type i = 0; i < input_digits.size(); ++i) {
    intermediate += input_digits[i] * std::pow(input_base, input_digits.size() - i - 1);
}

Every digit is multiplied by the base which is raised to the power of its position from the right and added to intermediate. The advantage of this approach is the independent calculation for every digit. That can be a start to write parallelizable code. A downside is the cost of the pow function, opposed to a simple multiplication as seen in the next approach.

For-loop with `*` and `+`

unsigned int intermediate = 0;
for (unsigned int d : input_digits) {
    intermediate = intermediate * input_base + d;
}

This for-loop does not use the pow function and should be much cheaper. As it works through the input_digits vector, every digit is still multiplied by the input_base raised to the power of its position from the right. As the repeated multiplication of each digit is only done when all digits are fed through the loop, the process must work in order and is therefore serial.

Converting into `output_digits`

While-loop with `%` and `/`

This part uses a while-loop to repeatedly divide the intermediate by the output_base, extracting the remainders as digits which are then stored in the output_digits vector.

std::vector <unsigned int> output_digits{};
while (intermediate > 0) {
    output_digits.emplace_back(intermediate % output_base);
    intermediate /= output_base;
}

The approach is simple, straightforward, and easy to understand. The order of operation makes it necessary to revert the final vector. As the loop constantly writes to the intermediate variable and needs to do so in order, the approach is strictly serial.

For-loop with `%` and `/`

Instead of the while-loop from the approach above, you can also use a for-loop.

std::vector <unsigned int> output_digits{};
for (; intermediate > 0; intermediate /= output_base) {
    output_digits.emplace_back(intermediate % output_base);
}

The result is a bit more concise, but has the same properties as the while version.

Using `div`

When you need both the remainder and the quotient from a division operation, you can use std::div.

std::vector <unsigned int> output_digits{};
std::ldiv_t v{intermediate, 0};
while (v.quot > 0) {
    v = std::div(v.quot, output_base);
    output_digits.emplace_back(v.rem);
}

Depending on the processor this might save one operation per loop iteration. You do need to define a variable v of type std::ldiv_t. As the order of the resulting digits has not changed, you still need to reverse the vector.

How to avoid reversing

If you do not have the luxury to use a countdown like in approach #4, you can switch emplace_back to insert or push_front. Depending on the size of your input it might be faster to reverse a vector that was filled with elements via emplace_back, than constantly pushing or inserting them at the front. The reason is, that a new element at the front results in a move operation for every other existing element.

All parts together

If you combine the parts above with reverse from the algorithm header, and error checking code, you might get something like this:

#include <algorithm>
#include <stdexcept>

std::vector<unsigned int> convert(unsigned int input_base,
                                  const std::vector<unsigned int>& input_digits,
                                  unsigned int output_base) {
    if (input_base < 2 || output_base < 2)
        throw std::invalid_argument("Bases must be at least 2");
    unsigned int intermediate = 0;
    for (unsigned int d : input_digits) {
            if (d >= input_base)
                throw std::invalid_argument("Digits cannot be bigger than the base");
        intermediate = intermediate * input_base + d;
    }
    std::vector <unsigned int> output_digits{};
    std::ldiv_t v{intermediate, 0};
    while (v.quot > 0) {
        v = std::div(v.quot, output_base);
        output_digits.emplace_back(v.rem);
    }

    std::reverse(output_digits.begin(), output_digits.end());

    return output_digits;
}

Source: Exercism cpp/all-your-base