Beginner Exercism • python

Roman Numerals

Challenge Overview

Convert modern Arabic numbers into Roman numerals.

Description

Today, most people in the world use Arabic numerals (0–9). But if you travelled back two thousand years, you’d find that most Europeans were using Roman numerals instead.

To write a Roman numeral we use the following Latin letters, each of which has a value:

M	D	C	L	X	V	I
1000	500	100	50	10	5	1

A Roman numeral is a sequence of these letters, and its value is the sum of the letters’ values. For example, XVIII has the value 18 (10 + 5 + 1 + 1 + 1 = 18).

There’s one rule that makes things trickier though, and that’s that the same letter cannot be used more than three times in succession. That means that we can’t express numbers such as 4 with the seemingly natural IIII. Instead, for those numbers, we use a subtraction method between two letters. So we think of 4 not as 1 + 1 + 1 + 1 but instead as 5 - 1. And slightly confusingly to our modern thinking, we write the smaller number first. This applies only in the following cases: 4 (IV), 9 (IX), 40 (XL), 90 (XC), 400 (CD) and 900 (CM).

Order matters in Roman numerals! Letters (and the special compounds above) must be ordered by decreasing value from left to right.

Here are some examples:

 105 => CV
---- => --
 100 => C
+  5 =>  V

 106 => CVI
---- => --
 100 => C
+  5 =>  V
+  1 =>   I

 104 => CIV
---- => ---
 100 => C
+  4 =>  IV

And a final more complex example:

 1996 => MCMXCVI
----- => -------
 1000 => M
+ 900 =>  CM
+  90 =>    XC
+   5 =>      V
+   1 =>       I

Instructions

Introduction

Your task is to convert a number from Arabic numerals to Roman numerals.

For this exercise, we are only concerned about traditional Roman numerals, in which the largest number is MMMCMXCIX (or 3,999).

There are lots of different ways to convert between Arabic and Roman numerals.
We recommend taking a naive approach first to familiarise yourself with the concept of Roman numerals and then search for more efficient methods.

Make sure to check out our Deep Dive video at the end to explore the different approaches you can take!

Dig Deeper

If Else

def roman(number):
    # The notation: I, V, X, L, C, D, M = 1, 5, 10, 50, 100, 500, 1000
    m = number // 1000
    m_rem = number % 1000
    c = m_rem // 100
    c_rem = m_rem % 100
    x = c_rem // 10
    x_rem = c_rem % 10
    i = x_rem

    res = ''

    if m > 0:
        res +=  m * 'M'
    
    if 4 > c > 0:
        res +=  c * 'C'
    elif c == 4:
        res +=  'CD'
    elif 9 > c > 4:
        res +=  'D' + ((c - 5) * 'C')
    elif c == 9:
        res +=  'CM'

    if 4 > x >  0:
        res += x * 'X'
    elif x == 4:
        res += 'XL'
    elif 9 > x > 4:
        res += 'L' + ((x - 5) * 'X')
    elif x == 9:
        res +=  'XC'

    if 4 > i >  0:
        res += i * 'I'
    elif i == 4:
        res += 'IV'
    elif 9 > i > 4:
        res += 'V' + ((i - 5) * 'I')
    elif i == 9:
        res += 'IX'
    
    return res

This gets the job done. Something like it would work in most languages, though Python’s range test (a > x > b) saves some boolean logic.

Refactoring

The code above is quite long and a bit repetitive. We should explore ways to make it more concise.

The first block is just a way to extract the digits from the input number. This can be done with a list comprehension, left-padding with zeros as necessary:

digits = ([0, 0, 0, 0] + [int(d) for d in str(number)])[-4:]

The blocks for hundreds, tens and units are all essentially the same, so we can put that code in a function. We just need to pass in the digit, plus a tuple of translations for (1, 4, 5, 9) or their 10x and 100x equivalents.

It is also unnecessary to keep retesting the lower bounds within an elif, as the code line will only be reached if that is satisfied.

Using return instead of elif is a matter of personal preference. Given that, the code simplifies to:

def roman(number: int) -> str:
    def translate_digit(digit: int, translations: iter) -> str:
        assert isinstance(digit, int) and 0 <= digit <= 9

        units, four, five, nine = translations
        if digit < 4:
            return digit * units
        if digit == 4:
            return four
        if digit < 9:
            return five + (digit - 5) * units
        return nine

    assert isinstance(number, int)
    m, c, x, i = ([0, 0, 0, 0] + [int(d) for d in str(number)])[-4:]
    res = ''

    if m > 0:
        res += m * 'M'
    if c > 0:
        res += translate_digit(c, ('C', 'CD', 'D', 'CM'))
    if x > 0:
        res += translate_digit(x, ('X', 'XL', 'L', 'XC'))
    if i > 0:
        res += translate_digit(i, ('I', 'IV', 'V', 'IX'))

    return res

The last few lines are quite similar and it would be possible to refactor them into a loop, but this is enough to illustrate the principle.

Loop Over Romans

Loop Over Roman Numerals

ROMAN = {1000: 'M', 900: 'CM', 500: 'D', 400: 'CD',
         100: 'C', 90: 'XC', 50: 'L', 40: 'XL',
         10: 'X', 9: 'IX', 5: 'V', 4: 'IV', 1: 'I'}

def roman(number: int) -> str:
    result = ''
    while number:
        for arabic in ROMAN.keys():
            if number >= arabic: 
                result += ROMAN[arabic]
                number -= arabic
                break
    return result

This approach is one of a family, using some mapping from Arabic (decimal) to Roman numbers.

The code above uses a dictionary. With minor changes, we could also use nested tuples:

ROMANS = ((1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
          (90, "XC"), (50, "L"), (40, "XL"), (10, "X"),
          (9, "IX"), (5, "V"), (4, "IV"), (1, "I"))

def roman(number: int) -> str:
    assert(number > 0)

    roman_num = ""
    for (k, v) in ROMANS:
        while k <= number:
            roman_num += v
            number -= k
    return roman_num

Using a pair of lists is also possible, with a shared index from the enumerate().

# Use a translation 
numbers = [1000, 900, 500, 400, 100,  90, 50,  40,  10,  9,  5,    4,   1]
names   = [  'M', 'CM','D','CD', 'C','XC','L','XL', 'X','IX','V','IV', 'I']

def roman(number: int) -> str:
    "Take a decimal number and return Roman Numeral Representation"

    # List of Roman symbols
    res = []

    while (number > 0):
        # Find the largest amount we can chip off
        for i, val in enumerate(numbers):
            if (number >= val):
                res.append(names[i])
                number -= val
                break

    return ''.join(res)

However, for a read-only lookup it may be better to use (immutable) tuples for numbers and names.

As Roman numerals are built up from letters for 1, 5, 10 times powers of 10, it is possible to shorten the lookup and build up most of the digits programmatically:

# The 10's, 5's and 1's position chars for 1, 10, 100, 1000.
DIGIT_CHARS = ["XVI", "CLX", "MDC", "??M"]


def roman(number: int) -> str:
    """Return the Roman numeral for a number."""
    # Generate a mapping from numeric value to Roman numeral.
    mapping = []
    for position in range(len(DIGIT_CHARS) - 1, -1, -1):
        # Values: 1000, 100, 10, 1
        scale = 10 ** position
        chars = DIGIT_CHARS[position]
        # This might be: (9, IX) or (90, XC)
        mapping.append((9 * scale, chars[2] + chars[0]))
        # This might be: (5, V) or (50, D)
        mapping.append((5 * scale, chars[1]))
        # This might be: (4, IV) or (40, XD)
        mapping.append((4 * scale, chars[2] + chars[1]))
        mapping.append((1 * scale, chars[2]))

    out = ""
    for num, numerals in mapping:
        while number >= num:
            out += numerals
            number -= num
    return out

The code below is doing something similar to the dictionary approach at the top of this page, but more concisely:

def roman(number: int) -> str:
    result = ''
    divisor_map = {1000: 'M', 900: 'CM', 500: 'D', 400: 'CD', 100: 'C', 90: 'XC',
                   50: 'L', 40: 'XL', 10: 'X', 9: 'IX', 5: 'V', 4: 'IV', 1: 'I'}
    for divisor, symbol in divisor_map.items():
        major, number = divmod(number, divisor)
        result += symbol * major
    return result

These five solutions all share some common features:

Some sort of translation lookup.
Nested loops, a whileand a for, in either order.
At each step, find the largest number that can be subtracted from the decimal input and appended to the Roman representation.

When building a string gradually, it is often better to build an intermediate list, then do a join() at the end, as in the third example. This is because strings are immutable, so need to be copied at each step, and the old strings need to be garbage-collected.

However, Roman numerals are always so short that the difference is minimal in this case.

Incidentally, notice the use of type hints: def roman(number: int) -> str. This is optional in Python and (currently) ignored by the interpreter, but is useful for documentation purposes.

Increasingly, IDE’s such as VSCode and PyCharm understand the type hints, using them to flag problems and provide advice.

Table Lookup

def roman(number):
    assert (number > 0)

    # define lookup table (as a tuple of tuples, in this case)
    table = (
        ("I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"),
        ("X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC"),
        ("C", "CC", "CCC", "CD", "D", "DC", "DCC", "DCCC", "CM"),
        ("M", "MM", "MMM"))

    # convert the input integer to a list of single digits
    digits = [int(d) for d in str(number)]
    
    # we need the row in the lookup table for our most-significant decimal digit
    inverter = len(digits) - 1 

    # translate decimal digits list to Roman numerals list
    roman_digits = [table[inverter - i][d - 1] for (i, d) in enumerate(digits) if d != 0]

    # convert the list of Roman numerals to a single string
    return ''.join(roman_digits)

In this approach we loop over decimal digits, not their Roman equivalents.

The key point is to have a 2-dimensional lookup table, with each row corresponding to a separate digit: ones, tens, hundreds, thousands. Each digit can then be converted to its Roman equivalent with a single lookup.

Note that we need to compensate for Python’s zero-based indexing by (in effect) subtracting 1 from each row and column.

Optional modification

In the code above, we used the inverter variable to work bottom-to-top through the lookup table. This allows working left-to-right through the decimal digits.

Alternatively, we could reverse the digits list, go top-to-bottom through the lookup table, then reverse the roman_digits list before the final join().

def roman(number):
    assert (number > 0)

    # define lookup table (as a tuple of tuples, in this case)
    table = (
        ("I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"),
        ("X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC"),
        ("C", "CC", "CCC", "CD", "D", "DC", "DCC", "DCCC", "CM"),
        ("M", "MM", "MMM"))

    # convert the input integer to a list of single digits, in reverse order
    digits = [int(d) for d in str(number)][::-1]

    # translate decimal digits list to Roman numerals list
    roman_digits = [table[i][d - 1] for (i, d) in enumerate(digits) if d != 0]

    # reverse the list of Roman numerals and convert to a single string
    return ''.join(roman_digits[::-1])

This eliminates one line of code, at the cost of adding two list reverses.

The [::-1] indexing is idiomatic Python, but less experienced programmers may not find it very readable.

Credit

This approach was adapted from one created by @cmcaine on the Julia track.

Source: Exercism python/roman-numerals