String Calculator Kata continued

Last week we worked through the String Calculator kata. The goal of this kata was to use Test Driven Development (TDD) to develop a function that could calculate the sum of a string of numbers separated by commas, newlines or a user defined delimiter. We stopped the article after the first five requirements, but there are more advanced requirements. In this article we will walk through the advanced requirements step by step and I will show you how you can implement them using TDD.

I would recommend to read the previous article first. This article builds on the code we wrote in the that article. The code we will start off with is available at Github.

Advanced requirements

The advanced requirements are the following:

  1. Numbers larger than 1000 should be ignored: "5,1001,3" should result in 8
  2. Negative numbers will throw an exception. When there are multiple negative numbers, the exception message should show all of them
  3. Delimiters can be of any length. For example "//***\n1***2***3" should return 6
  4. Allow multiple custom delimiters like so: "//[*][%]\n1*2%3" results in 6

And that's it. The idea is that you could walk through the first part of the kata within 30 minutes. If it takes longer you can stop there and practice some more. If you are able to do it quicker, you can make the Kata harder by adding these four requirements. Let's take the code of the previous article and implement these requirements.

Ignore numbers larger than 1000

This requirement should be easy to implement. Let's start with a test:

/** @test */
public function numbers_larger_than_1000_are_ignored()
{
    $this->assertEquals(1001, $this->calculator->add('1000,1'));
    $this->assertEquals(8, $this->calculator->add('5,1001,3'));
}

I added to two asserts to the test to enforce the border at 1000. The number of 1000 should still be added. However, the number 1001 should be ignored.

Failed asserting that 1009 matches expected 8.
Expected :8
Actual   :1009

The test fails, which means we should add some production code to bring us back to green. We have two locations where we can add this code:

  1. In the add method of StringCalculator
  2. Or, in the getIntegers method of IntegersString

Both places are valid. The add method has the responsibility of adding the parsed numbers. It would make sense to do the filtering there as it might be specific to adding and not parsing of the numbers. However, you could also argue that the add method should purely add and the IntegersString class is responsible to provide a filtered list of integers. It doesn't really matter. For now I will filter in the add method. If at a later stage we add a multiply method and we want to ignore 1000 there as well, we can always refactor this check to a different location.

public function add(string $numbers) : int
{
    $integersString = new IntegersString($numbers);

    $filteredList = array_filter(
        $integersString->getIntegers(),
        function ($i) {
            return $i <= 1000;
        }
    );
    return array_sum($filteredList);
}

We use the array_filter PHP function to filter out all numbers larger than 1000.

Time for refactoring. We can do our refactoring in two steps. First, let's extract a method that filters out all numbers larger than 1000:

public function add(string $numbers) : int
{
    return array_sum(
        $this->removeLargeNumbers(
            (new IntegersString($numbers))->getIntegers()
        )
    );
}

private function removeLargeNumbers(array $numbers): array
{
    return array_filter(
        $numbers,
        function ($i) {
            return $i <= 1000;
        }
    );
}

Second, we can extract the number 1000 to a constant called UPPER_LIMIT to express more clearly what the number means:

const UPPER_LIMIT = 1000;

private function removeLargeNumbers(array $numbers): array
{
    return array_filter(
        $numbers,
        function ($i) {
            return $i <= self::UPPER_LIMIT;
        }
    );
}

That looks much better. On to our next test.

Negative numbers throw an exception

The next requirement is similar to the negative numbers one, but now we want to throw an exception. We can add a failing test for a single negative number first:

/** @test */
public function throws_exception_for_negative_numbers()
{
    $this->expectException(NegativeNumber::class);
    $this->expectExceptionMessage('Negative numbers not allowed: [-1]');

    $this->calculator->add('-1,2');
}

We expect a NegativeNumber exception and we expect it to give us a specific error message containing the negative number. Currently this test fails:

Failed asserting that exception of type "Webdevils\Katas\StringCalculator\NegativeNumber" is thrown.

First off, we have to create the NegativeNumber exception class:

class NegativeNumber extends \Exception
{

}

The easiest way to make this test pass is by using the array_filter function to filter all negative numbers and throw a the NegativeNumber exception when we find more than one.

public function add(string $numbers) : int
{
    $integers = (new IntegersString($numbers))->getIntegers();

    $negativeIntegers = array_filter(
        $integers,
        function($i) {
            return $i < 0;
        }
    );

    if(count($negativeIntegers) > 0) {
        throw new NegativeNumber(
            sprintf(
                'Negative numbers not allowed: [%s]',
                implode(',', $negativeIntegers)
            )
        );
    }

    return array_sum(
        $this->removeLargeNumbers(
            $integers
        )
    );
}

The test passes, but we have a new test that now fails: an_empty_string_returns_zero

Webdevils\Katas\StringCalculator\NegativeNumber : Negative numbers not allowed: []

The problem is that when we give an empty string to the add function, the getIntegers method gives us an array with an empty string. The empty string, when cast to an integer, will result in 0. However, when we check if an empty string is smaller than 0, it will return true and it will end up in our list of negative integers. The solution is to cast all numbers into integers before we feed them into our negative numbers check.

public function getIntegers() : array
{
    return array_map(
        function ($i) {
            return (int)$i;
        },
        preg_split($this->delimiter, $this->numbers)
    );
}

And our code passes. A good time to do some refactoring. Let's start with the getIntegers method in IntegersString. We can extract the array_map code into its own method:

public function getIntegers() : array
{
    return $this->castNumbersToInt(
        preg_split($this->delimiter, $this->numbers)
    );
}

private function castNumbersToInt(array $numbers): array
{
    return array_map(
        function ($i) {
            return (int)$i;
        },
        $numbers
    );
}

By extracting the code in its own method and giving it a name we can make clear what we try to achieve without adding a comment.

Next, we refactor the add method of StringCalculator. First we extract all code that checks for negative numbers:

/** @throws NegativeNumber */
public function add(string $numbers) : int
{
    $integers = (new IntegersString($numbers))->getIntegers();

    $this->throwExceptionForNegativeNumbers($integers);

    return array_sum(
        $this->removeLargeNumbers(
            $integers
        )
    );
}

/** @throws NegativeNumber */
private function throwExceptionForNegativeNumbers(array $integers): void
{
    $negativeIntegers = array_filter(
        $integers,
        function ($i) {
            return $i < 0;
        }
    );

    if (count($negativeIntegers) > 0) {
        throw new NegativeNumber(
            sprintf(
                'Negative numbers not allowed: [%s]',
                implode(',', $negativeIntegers)
            )
        );
    }
}

We can extract getting a list of negative integers in its own method as well:

/** @throws NegativeNumber */
private function throwExceptionForNegativeNumbers(array $integers): void
{
    $negativeIntegers = $this->getNegativeNumbers($integers);

    if (count($negativeIntegers) > 0) {
        throw new NegativeNumber(
            sprintf(
                'Negative numbers not allowed: [%s]',
                implode(',', $negativeIntegers)
            )
        );
    }
}

private function getNegativeNumbers(array $integers): array
{
    return array_filter(
        $integers,
        function ($i) {
            return $i < 0;
        }
    );
}

And we can extract the code for checking if there are any negative numbers in its own method:

/** @throws NegativeNumber */
private function throwExceptionForNegativeNumbers(array $integers): void
{
    $negativeIntegers = $this->getNegativeNumbers($integers);

    if ($this->hasNegativeNumbers($negativeIntegers)) {
        throw new NegativeNumber(
            sprintf(
                'Negative numbers not allowed: [%s]',
                implode(',', $negativeIntegers)
            )
        );
    }
}

private function hasNegativeNumbers(array $negativeIntegers): bool
{
    return count($negativeIntegers) > 0;
}

If we look at the code in StringCalculator, we see four methods that have the same parameters. And if we look at the add method there is also the array_sum method that if we would extract it to its own method would also lead to a method with the same parameter. This gives me the feeling that there is another class hiding in our StringCalculator class.

Let's first extract the method for the summing up all integers, and let's have a look how to extract another class after that:

/** @throws NegativeNumber */
public function add(string $numbers) : int
{
    $integers = (new IntegersString($numbers))->getIntegers();

    $this->throwExceptionForNegativeNumbers($integers);

    return $this->sum(
        $this->removeLargeNumbers(
            $integers
        )
    );
}

private function sum(array $integers): int
{
    return array_sum(
        $integers
    );
}

All four methods have an array of integers as a parameter. We can extract a class using this parameter as an attribute. The methods can move to this class and can get rid of their parameter. As this is about a list of integers, I think a good name for this class is IntegerList. The first step to perform this refactoring is by creating a new IntegerList class and copy all related methods to this class:

class IntegerList
{
    const UPPER_LIMIT = 1000;

    private array $integers;
    
    public function __construct(array $integers)
    {
        $this->integers = $integers;
    }

    public function sum(): int{
        return array_sum(
            $this->integers
        );
    }

    /** @throws NegativeNumber */
    public function throwExceptionForNegativeNumbers(): void{
        $negativeIntegers = $this->getNegativeNumbers($this->integers);

        if ($this->hasNegativeNumbers($negativeIntegers)) {
            throw new NegativeNumber(
                sprintf(
                    'Negative numbers not allowed: [%s]',
                    implode(',', $negativeIntegers)
                )
            );
        }
    }

    public function hasNegativeNumbers(array $negativeIntegers): bool{
        return count($negativeIntegers) > 0;
    }

    public function getNegativeNumbers(): array{
        return array_filter(
            $this->integers,
            function ($i) {
                return $i < 0;
            }
        );
    }

    public function removeLargeNumbers(): array{
        return array_filter(
            $this->integers,
            function ($i) {
                return $i <= self::UPPER_LIMIT;
            }
        );
    }
}

Next we update the StringCalculator to use the new IntegerList class:

/** @throws NegativeNumber */
public function add(string $numbers) : int
{
    $integers = (new IntegersString($numbers))->getIntegers();
    $integers = new IntegerList($integers);

    $integers->throwExceptionForNegativeNumbers();

    $integers = new IntegerList(
        $integers->removeLargeNumbers()
    );

    return $integers->sum();
}

The code still passes and it made the code for our StringCalculator a lot shorter, but the add method is still a bit awkward. First off, the getIntegers method of the IntegersString class returns an array of integers and the first thing we do is add it to the constructor of an IntegerList. Isn't it more logical to return the IntegerList directly from the IntegerString class?

public function getIntegers() : IntegerList
{
    return new IntegerList(
        $this->castNumbersToInt(
            preg_split($this->delimiter, $this->numbers)
        )
    );
}

That change made our StringCalculator less awkward:

/** @throws NegativeNumber */
public function add(string $numbers) : int
{
    $integers = (new IntegersString($numbers))->getIntegers();

    $integers->throwExceptionForNegativeNumbers();

    $integers = new IntegerList(
        $integers->removeLargeNumbers()
    );

    return $integers->sum();
}

We have another place in the add method that we create a new IntegerList: after removing the large numbers. It would make sense for the removeLargeNumbers method to directly return a new IntegerList instead of a PHP array of integers.

public function removeLargeNumbers(): IntegerList
{
    return new IntegerList(
        array_filter(
            $this->integers,
            function ($i) {
                return $i <= self::UPPER_LIMIT;
            }
        )
    );
}

We can get rid of another line of code in our StringCalculator:

/** @throws NegativeNumber */
public function add(string $numbers) : int
{
    $integers = (new IntegersString($numbers))->getIntegers();

    $integers->throwExceptionForNegativeNumbers();

    return $integers->removeLargeNumbers()
        ->sum();
}

Much better, but we can make it even better. Wouldn't it be cool if we could write our add method as following:

/** @throws NegativeNumber */
public function add(string $numbers) : int
{
    return (new IntegersString($numbers))->getIntegers()
        ->throwExceptionForNegativeNumbers()
        ->removeLargeNumbers()
        ->sum();
}

We could do that easily by changing the throwExceptionForNegativeNumbers method a little bit:

/** @throws NegativeNumber */
public function throwExceptionForNegativeNumbers(): IntegerList
{
    $negativeIntegers = $this->getNegativeNumbers($this->integers);

    if ($this->hasNegativeNumbers($negativeIntegers)) {
        throw new NegativeNumber(
            sprintf(
                'Negative numbers not allowed: [%s]',
                implode(',', $negativeIntegers)
            )
        );
    }

    return $this;
}

Nice, we are still green!

Next, I want to address the throwExceptionForNegativeNumbers method. It still feels a bit weird to me. I would actually like to write it as following:

/** @throws NegativeNumber */
public function throwExceptionForNegativeNumbers(): IntegerList
{
    $negativeIntegers = $this->getNegativeIntegers();
    if($negativeIntegers->hasIntegers()) {
        throw new NegativeNumber(
            sprintf(
                'Negative numbers not allowed: [%s]',
                implode(',', $negativeIntegers->integers)
            )
        );
    }

    return $this;
}

And we can do that by adding a hasIntegers method and make the getNegativeIntegers method return a new IntegerList

public function hasIntegers() : bool
{
    return count($this->integers);
}

public function getNegativeIntegers(): IntegerList
{
    return new IntegerList(
        array_filter(
            $this->integers,
            function ($i) {
                return $i < 0;
            }
        )
    );
}

That looks much nicer.

One thing though. The throwExceptionForNegativeNumbers method feels off in the IntegerList class. I feel that it fits much better as a helper method in the StringCalculator. So let's move it back:

/** @throws NegativeNumber */
public function add(string $numbers) : int
{
    return $this->throwExceptionForNegativeNumbers(
        (new IntegersString($numbers))->getIntegers()
    )
        ->removeLargeNumbers()
        ->sum();
}

/** @throws NegativeNumber */
public function throwExceptionForNegativeNumbers(IntegerList $integers): IntegerList
{
    $negativeIntegers = $integers->getNegativeIntegers();
    if($negativeIntegers->hasIntegers()) {
        throw new NegativeNumber(
            sprintf(
                'Negative numbers not allowed: [%s]',
                implode(',', $negativeIntegers->getArray())
            )
        );
    }

    return $integers;
}

We need to add a getArray method to IntegerList.

public function getArray() : array
{
    return $this->integers;
}

Great! We are still green! And that's also the reason why I took some more time to do refactoring. Was it necessary? Maybe? Was it fun? Definitely! But the purpose of refactoring was also to show that you could change the shape of the production code completely while staying green. We can refactor without any fear, because we have a test suite to proof that the code still works.

Let's continue with the next requirement.

Delimiters can be of arbitrary length

We should already support delimiters of arbitrary length, but let's proof that:

/** @test */
public function can_change_delimiter_to_an_arbitrary_length_custom_delimiter()
{
    $this->assertEquals(6, $this->calculator->add("//;;;\n1;;;2;;;3"));
}

This test passes, but I noticed while writing the article that if I add the following assert the test actually fails:

/** @test */
public function can_change_delimiter_to_an_arbitrary_length_custom_delimiter()
{
    $this->assertEquals(6, $this->calculator->add("//;;;\n1;;;2;;;3"));
    $this->assertEquals(6, $this->calculator->add("//***\n1***2***3"));
}

And it fails in an ugly way: a runtime error.

preg_split(): Compilation failed: quantifier does not follow a repeatable item at offset 0

The reason why this fails is because the asterisk (*) has a meaning in regex. We need to make sure that we escape special regex characters. Luckily PHP has a function for that in the standard library: preg_quote.

private function parseDelimiter(string $delimiter): string
{
    return '/' . preg_quote(substr($delimiter, 2), '/') . '/';
}

And that solves our problem. The second parameter is our escape character. We also need to escape that otherwise the rest of our regex would be ignored.

Multiple custom delimiters

On to our last requirement. Support for multiple custom delimiters. Let's put the example in a test:

/** @test */
public function can_have_multiple_custom_delimiters()
{
    $this->assertEquals(6, $this->calculator->add("//[*][%]\n1*2%3"));
}

Ok, how are we going to implement this? Our IntegersString class has its own method to parse the custom delimiter: parseDelimiter. Let's add our code to pass to that method for now:

private function parseDelimiter(string $delimiter): string
{
    $delimiter = substr($delimiter, 2);

    if(str_starts_with($delimiter, '[') && str_ends_with($delimiter, ']')) {
        $matches = [];
        preg_match_all('/\[(.+?)\]/', $delimiter, $matches);

        $delimiters = implode('|',
            array_map(function ($match) {
                    return preg_quote($match);
            }, $matches[1])
        );

        return '/' . $delimiters . '/';
    } else {
        return '/' . preg_quote($delimiter, '/') . '/';
    }
}

And we are done with the Kata. But of course we want to refactor the code first! Let's have a look at the parseDelimiter method. The first thing I don't like is that we have to strip off the // at the beginning of the delimiter string. That is something we can do in the setCustomDelimiterAndNumbers method in a much nicer way:

private function setCustomDelimiterAndNumbers(string $numbers): void
{
    $newline = strpos($numbers, "\n");

    $this->delimiter = $this->parseDelimiter(
        substr(
            string: $numbers,
            offset: 2,
            length: $newline - 2
        )
    );

    $this->numbers = substr(
        string: $numbers,
        offset: $newline
    );
}

Next, I would like to rename the parseDelimiter method to parseDelimiters. As we can now also parse multiple delimiters. The same for the parameter of the parseDelimiter method. I would like to change that parameter to delimiterString.

Then we can extract the first condition of the parseDelimiters method in it's own method:

private function parseDelimiters(string $delimiterString): string
{
    if($this->containsMultipleDelimiters($delimiterString)) {
        $matches = [];
        preg_match_all('/\[(.+?)\]/', $delimiterString, $matches);

        $delimiters = implode('|',
            array_map(function ($match) {
                    return preg_quote($match);
            }, $matches[1])
        );

        return '/' . $delimiters . '/';
    } else {
        return '/' . preg_quote($delimiterString, '/') . '/';
    }
}

private function containsMultipleDelimiters(string $delimiterString): bool
{
    return str_starts_with($delimiterString, '[')
        && str_ends_with($delimiterString, ']');
}

We can restructure the parseDelimiters method more to get rid of some duplicate code. Both in the if and the else we add the regex escape characters and we send the delimiter through preg_quote. We can restructure the code in the following way to get rid of this duplicate code:

private function parseDelimiters(string $delimiterString): string
{
    $delimiters = [$delimiterString];

    if($this->containsMultipleDelimiters($delimiterString)) {
        $matches = [];
        preg_match_all('/\[(.+?)\]/', $delimiterString, $matches);

        $delimiters = $matches[1];
    }

    return '/' . implode('|', array_map(
        function ($delimiter) {
            return preg_quote($delimiter);
        },
        $delimiters
    )) . '/';
}

Great! Much cleaner, but it can be better. We have multiple abstraction levels in this method. We start with a quite high abstraction level in our if statement, but immediately after we move to the lowest level with the preg_match_all and converting everything to a regex. We can fix that by extracting two more methods:

private function parseDelimiters(string $delimiterString): string
{
    $delimiters = [$delimiterString];

    if($this->containsMultipleDelimiters($delimiterString)) {
        $delimiters = $this->parseMultipleDelimiters($delimiterString);
    }

    return $this->convertDelimitersToRegex($delimiters);
}

private function convertDelimitersToRegex(array $delimiters): string
{
    return '/' .
        implode('|', array_map(
            function ($delimiter) {
                return preg_quote($delimiter);
            },
            $delimiters
        ))
        . '/';
}

private function parseMultipleDelimiters(string $delimiterString): array
{
    $matches = [];
    preg_match_all('/\[(.+?)\]/', $delimiterString, $matches);

    return $matches[1];
}

One final refactoring and I will leave you alone for today. Let's extract the escaping of the delimiters in the convertDelimitersToRegex method:

private function convertDelimitersToRegex(array $delimiters): string
{
    return '/' . implode(
        '|', 
        $this->regexEscapeDelimiters($delimiters)
    ) . '/';
}

private function regexEscapeDelimiters(array $delimiters): array
{
    return array_map(
        function ($delimiter) {
            return preg_quote($delimiter);
        },
        $delimiters
    );
}

And that it's it for today. Kata done!

Conclusion

As you can see TDD is about more than just writing your test first and write your code afterwards. It's also about refactoring. Because we are writing our tests first, we can be certain that our tests work and that our code is covered. Therefore we can fearlessly refactor our code. And we can do some interesting refactoring. Like extracting multiple methods, move those to separate classes etc.

Author

Mark Kazemier's avatar
Mark Kazemier

Hi, my name is Mark. I'm the founder of webdevils.nl and love developing websites and other web applications. Through Webdevils.nl I want to spread my enthousiasm about the web and PHP. In my professional live I'm a security expert specialised in security monitoring.

View all posts