String Calculator Kata

Practising Test Driven Development (TDD) is always a good idea. While we can use our work projects to practise, it can be more effective to also do some katas that teach specific skills we might not need every day in our normal work. An example of such a skill is string manipulation. It is a useful skill to have, especially in a TDD context, but it isn't something you need every day. This is where the String Calculator Kata comes in.

In the String Calculator Kata you build an application that can read a string of numbers and calculate the sum of these numbers. The goal of the kata is to practise test-first development, refactoring and string manipulation. The kata is a bit more advanced than the FizzBuzz kata and Greeter kata I discussed before. If you are less familiar with TDD or PHP, I would recommend reading those articles first.

The rules of the kata are as follows:

  1. Create a simple String calculator with a method signature: add(string numbers) : int.
  2. The method can take up to two numbers, separated by a comma, and will return the sum of these numbers. For example "1" will return 1, "1,2" will return 3.
  3. Allow the add method to handle an unknown number of numbers.
  4. Allow a newline as a delimiter. For example "1\n2" should return 3, but also "1\n2,3" should work and return 6.
  5. Allow the delimiter to be changed. To do so, the string begins with a separate line of the form //[delimiter]. For example "//;\n1;2" should return 3.

In this article we will stop the Kata right here. There are more steps that we can add to the Kata if we want to make it more advanced. We will do that in a follow-up article.

In the next chapters I will describe how to walk through the Kata using the Red, Green, Refactor technique, but before you read further I would recommend that you try the Kata yourself first. You learn the most by practising for yourself before reading a possible answer. Of course, after you have tried the kata, it can be useful to follow along. There are multiple solutions to the Kata, and trying different possibilities is the best way to learn.

An empty string returns 0

The simplest case is the empty string which should return 0 as there is nothing to add. We can test that in our first test case:

class StringCalculatorTest extends TestCase
{
    /** @test */
    public function an_empty_string_returns_zero()
    {
        $calculator = new StringCalculator();

        $this->assertEquals(0, $calculator->add(''));
    }
}

The StringCalculator class doesn't exist and we fail the test:

Error : Class "Webdevils\Katas\StringCalculator\StringCalculator" not found

We can solve that by creating the StringCalculator class:

class StringCalculator
{

}

We still fail the test. This time the error is different:

Error : Call to undefined method Webdevils\Katas\StringCalculator\StringCalculator::add()

We can solve this by adding the add method. I returned -1 in the add method to make sure that we see the test fail.

public function add(string $numbers) : int
{
    return -1;
}

And as expected, we receive the first PHPUnit error:

Failed asserting that -1 matches expected 0.
Expected :0
Actual   :-1

We can solve this by returning 0 in our add method.

public function add(string $numbers) : int
{
    return 0;
}

For now, we don't have anything to refactor, so let's continue.

A single number returns the single number

The next simplest failing test would be a single number. A single number should return itself. For example "3" should return 3 and "1" should return 1.

/** @test */
public function a_single_number_should_return_that_number()
{
    $calculator = new StringCalculator();

    $this->assertEquals(3, $calculator->add('3'));
}

Right now this test will fail:

Failed asserting that 0 matches expected 3.
Expected :3
Actual   :0

We can easily pass this test in PHP by casting the string to a number. An empty string is cast to 0, and a string that contains a number is cast to that number:

public function add(string $numbers) : int
{
    return (int)$numbers;
}

And we pass the test. We can refactor our test though. In both tests we create a new StringCalculator. While repeating that code twice isn't much of a problem, it can become a problem when we add more tests, especially if we decide to change the constructor of our StringCalculator. By repeating the construction of the class under test, we make refactoring harder in the future.

We can add a setUp method to create the StringCalculator object:

class StringCalculatorTest extends TestCase
{
    private StringCalculator $calculator;

    protected function setUp() : void
    {
        $this->calculator = new StringCalculator();
    }

    /** @test */
    public function an_empty_string_returns_zero()
    {
        $this->assertEquals(0, $this->calculator->add(''));
    }

    /** @test */
    public function a_single_number_should_return_that_number()
    {
        $this->assertEquals(3, $this->calculator->add('3'));
    }
}

We could get rid of some more repeating code by introducing a dataProvider, but I don't think that is the best refactoring this time. The names of the test methods provide us with valuable knowledge on what we are testing. When we introduce a dataProvider we lose that information.

Add two numbers separated by a comma

Now that we support single numbers, it is time to introduce adding two numbers separated by a comma.

/** @test */
public function two_numbers_separated_by_a_comma_should_return_the_sum()
{
    $this->assertEquals(3, $this->calculator->add('1,2'));
}

This test shows nicely how casting a string to a number works in PHP. PHP reads the number at the start of the string and ignores everything from the first character that cannot be part of that number. In this case the cast stops at the comma and returns 1.

Failed asserting that 1 matches expected 3.
Expected :3
Actual   :1
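
The casting behaviour behind this failure can be checked in isolation. The following is a small standalone sketch, separate from the kata code, of what the (int) cast does with a few strings:

```php
<?php
// Standalone illustration of the (int) cast used in add().
var_dump((int) '');      // int(0)
var_dump((int) '3');     // int(3)

// Only the leading numeric part of a string survives the cast.
var_dump((int) '1,2');   // int(1): parsing stops at the comma
var_dump((int) "2\n3");  // int(2): parsing stops at the newline
var_dump((int) 'abc1');  // int(0): the string does not start with a number
```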

We can solve this problem by parsing the string and splitting it up into the separate numbers. The PHP standard library has a function that can do that for us: explode. The explode function takes a string and splits it into an array using a separator (in this case a comma). Once we have an array of numbers, we can use the array_sum function to sum them:

public function add(string $numbers) : int
{
    $parsedNumbers = explode(',', $numbers);
    return array_sum($parsedNumbers);
}

We pass the test and, by accident, we also support the third requirement for this kata: summing an arbitrary number of values separated by commas.
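
To see why the third requirement comes for free, here is a small standalone sketch of the two functions at work. The input strings are just examples:

```php
<?php
// explode() splits on every comma, so any number of values works.
$parts = explode(',', '1,2,3');
var_dump($parts);            // the strings "1", "2" and "3"
var_dump(array_sum($parts)); // int(6)

// Without a comma, explode() returns the whole string as the only element.
$single = explode(',', '3');
var_dump($single);           // an array containing just "3"
```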

For now the code still looks quite nice. No need to do any refactoring. Let's continue with the next requirement.

Add numbers separated by newlines or commas

Our next requirement is to support newlines and commas. For example 1\n2 should return 3, but also 1\n2,3 should return 6. We can create a test to check for this:

/** @test */
public function numbers_separated_by_newlines_and_commas_will_be_summed()
{
    $this->assertEquals(6, $this->calculator->add("1,2\n3"));
}

This test fails because our current code splits the string into 1 and 2\n3. Converting 2\n3 to a number gives you 2, and 1 plus 2 is 3.

Failed asserting that 3 matches expected 6.
Expected :6
Actual   :3

To pass this test we need a way to split on two different delimiters: comma and newline. The explode function only supports a single delimiter and isn't a good fit for this. Luckily, if we read further in the documentation at php.net, it suggests the preg_split function as an alternative. preg_split accepts a regular expression as the delimiter. Regular expressions (or regex) allow us to express an OR, so we can tell it to split when it encounters a comma OR a newline: /,|\n/.

public function add(string $numbers) : int
{
    $parsedNumbers = preg_split("/,|\n/", $numbers);
    return array_sum($parsedNumbers);
}
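
As a quick standalone check of the pattern, the sketch below splits the string from our test. The character class in the second call is an alternative spelling I added for comparison; it is not used in the rest of the article:

```php
<?php
// The alternation ,|\n matches either a comma or a newline.
$parts = preg_split("/,|\n/", "1,2\n3");
var_dump($parts); // the strings "1", "2" and "3"

// A character class is an equivalent way to write the same pattern.
$same = preg_split('/[,\n]/', "1,2\n3");
var_dump($parts === $same); // bool(true)
```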

The regex allows us to pass the test, but also makes our code harder to read. To make the intention of the code more clear to the reader, we can extract the regex into a constant:

const DEFAULT_DELIMITERS = "/,|\n/";

public function add(string $numbers) : int
{
    $parsedNumbers = preg_split(self::DEFAULT_DELIMITERS, $numbers);
    return array_sum($parsedNumbers);
}

The new code is easier to understand, even for someone who isn't too familiar with regular expressions.

Let's continue with the last kata requirement for today.

Overwriting delimiters

The last requirement for today wants us to allow overwriting of the default delimiters. For example, if we want to separate our numbers with a semicolon instead of a comma or newline we could provide the following string: "//;\n1;2;3". The first line starting with // provides us with the delimiter and the second line contains the numbers. Let's create a test:

/** @test */
public function you_can_change_the_delimiter_to_a_semicolon()
{
    $this->assertEquals(6, $this->calculator->add("//;\n1;2;3"));
}

Of course this test fails:

Failed asserting that 1 matches expected 6.
Expected :6
Actual   :1

Let's first think about how we can pass the test. First off, we want all other tests to keep working, which means that we need a condition that checks for the leading forward slashes (//). PHP 8 added a function to the standard library that can help us with that: str_starts_with.

public function add(string $numbers) : int
{
    if(str_starts_with($numbers, '//')) {
        // We defined a new delimiter
    }
    
    $parsedNumbers = preg_split(self::DEFAULT_DELIMITERS, $numbers);
    return array_sum($parsedNumbers);
}
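
A small standalone sketch of the check on its own follows. The strncmp variant is an assumption on my part for readers who are still on PHP 7; the rest of the article sticks with str_starts_with:

```php
<?php
// str_starts_with() (PHP 8+) checks whether a string begins with a prefix.
var_dump(str_starts_with("//;\n1;2", '//')); // bool(true)
var_dump(str_starts_with('1,2', '//'));      // bool(false)

// On PHP 7 the same check can be written with strncmp().
var_dump(strncmp("//;\n1;2", '//', 2) === 0); // bool(true)
```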

The structure for the new delimiter is two forward slashes, followed by the delimiter, followed by a newline. If we know there is a delimiter, we can use the explode function to split on only the first newline by using the limit parameter. The first part will be the delimiter definition, the second part the numbers we want to split.

public function add(string $numbers) : int
{
    if(str_starts_with($numbers, '//')) {
        list($delimiter, $numbers) = explode("\n", $numbers, 2);
    }

    $parsedNumbers = preg_split(self::DEFAULT_DELIMITERS, $numbers);
    return array_sum($parsedNumbers);
}

The list construct allows us to unpack an array into individual variables.
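
The effect of the limit parameter is easiest to see on an input that contains more than one newline. The sketch below is a standalone illustration with example strings of my own choosing:

```php
<?php
// With a limit of 2, explode() splits on the first newline only, so a
// newline further on in the numbers part is left untouched.
list($delimiter, $numbers) = explode("\n", "//;\n1;2\n3", 2);
var_dump($delimiter);        // string(3) "//;"
var_dump($numbers === "1;2\n3"); // bool(true)

// The short array syntax is equivalent to list().
[$delimiter, $numbers] = explode("\n", "//;\n1;2", 2);
var_dump($numbers);          // string(3) "1;2"
```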

As a last step to bring us to passing, we need to split the numbers based on the provided delimiter. We can do that by parsing the $delimiter variable using substr.

public function add(string $numbers) : int
{
    $delimiter = self::DEFAULT_DELIMITERS;
    if(str_starts_with($numbers, '//')) {
        list($delimiter, $numbers) = explode("\n", $numbers, 2);

        $delimiter = '/' . substr($delimiter, 2) . '/';
    }

    $parsedNumbers = preg_split($delimiter, $numbers);
    return array_sum($parsedNumbers);
}

And we have our code working. It isn't pretty though, so let's clean it by refactoring.
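
One caveat before we refactor: a delimiter such as * or | has a special meaning in a regular expression, so concatenating it unescaped yields a pattern that is invalid or matches something else. The kata steps in this article only test a semicolon, so the code above passes. If you want to guard against other delimiters, a minimal sketch could use preg_quote. The function name delimiterToRegex is hypothetical and not part of the article's code:

```php
<?php
// Hypothetical helper: turn the "//[delimiter]" line into a safe regex.
function delimiterToRegex(string $delimiterLine): string
{
    $delimiter = substr($delimiterLine, 2);

    // preg_quote() escapes regex metacharacters; the second argument also
    // escapes the "/" that we use as the pattern delimiter.
    return '/' . preg_quote($delimiter, '/') . '/';
}

var_dump(preg_split(delimiterToRegex('//*'), '1*2*3')); // "1", "2", "3"
var_dump(preg_split(delimiterToRegex('//|'), '1|2|3')); // "1", "2", "3"
```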

Refactoring towards cleaner code

First step is to have a look at our method. It does two things:

  1. Parse the string into a list of integers
  2. Sum these integers

We can reflect this insight in our code as well by extracting a parseStringToNumbers method. We don't need to extract the second responsibility of our function, as it is already a single function call and I think that call clearly describes what it does.

public function add(string $numbers) : int
{
    $parsedNumbers = $this->parseStringToNumbers($numbers);
    return array_sum($parsedNumbers);
}

private function parseStringToNumbers(string $numbers): array
{
    $delimiter = self::DEFAULT_DELIMITERS;
    if (str_starts_with($numbers, '//')) {
        list($delimiter, $numbers) = explode("\n", $numbers, 2);

        $delimiter = '/' . substr($delimiter, 2) . '/';
    }

    return preg_split($delimiter, $numbers);
}

Next we can have a look at our parseStringToNumbers function. The function does multiple things as well:

  1. Check if there is a delimiter defined
  2. If there is a delimiter defined it extracts the delimiter and the numbers
  3. It splits the numbers based on the delimiter

Let's see if we can split these responsibilities by extracting more methods. First we can extract the condition in the if statement:

private function parseStringToNumbers(string $numbers): array
{
    $delimiter = self::DEFAULT_DELIMITERS;
    if ($this->containsDelimiter($numbers)) {
        list($delimiter, $numbers) = explode("\n", $numbers, 2);

        $delimiter = '/' . substr($delimiter, 2) . '/';
    }

    return preg_split($delimiter, $numbers);
}

private function containsDelimiter(string $numbers): bool
{
    return str_starts_with($numbers, '//');
}

The next step is a bit more tricky. We would like to extract a method that extracts both the delimiter and the numbers from the string. That would need a function with two return values, which in PHP means returning an array. Another option would be to store the values we want to return in properties of our class. I don't like either of these ideas, but let's try the properties and see where it leads. We can always return to our current code if we want:

private string $numbers;
private string $delimiter;

private function parseStringToNumbers(string $numbers): array
{
    $this->numbers = $numbers;
    $this->delimiter = self::DEFAULT_DELIMITERS;

    if ($this->containsDelimiter($this->numbers)) {
        list($delimiter, $this->numbers) = explode("\n", $this->numbers, 2);

        $this->delimiter = '/' . substr($delimiter, 2) . '/';
    }

    return preg_split($this->delimiter, $this->numbers);
}
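
For comparison, the other option, a function with two return values, could be sketched in PHP by returning an array and destructuring it. This is a hypothetical alternative (splitDelimiterAndNumbers is an invented name), not the route this article takes:

```php
<?php
// Hypothetical helper: returns the delimiter regex and the remaining
// numbers as a two-element array.
function splitDelimiterAndNumbers(string $input): array
{
    [$delimiterLine, $numbers] = explode("\n", $input, 2);

    return ['/' . substr($delimiterLine, 2) . '/', $numbers];
}

[$delimiter, $numbers] = splitDelimiterAndNumbers("//;\n1;2;3");
var_dump($delimiter); // string(3) "/;/"
var_dump($numbers);   // string(5) "1;2;3"
```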

Storing the values in properties allows us to extract a method for parsing the delimiter out of the string:

private function parseStringToNumbers(string $numbers): array
{
    $this->numbers = $numbers;
    $this->delimiter = self::DEFAULT_DELIMITERS;

    if ($this->containsDelimiter($this->numbers)) {
        $delimiter = $this->parseDelimiterAndNumbers();
        $this->obtainDelimiter($delimiter);
    }

    return preg_split($this->delimiter, $this->numbers);
}

private function parseDelimiterAndNumbers(): string
{
    list($delimiter, $this->numbers) = explode("\n", $this->numbers, 2);
    return $delimiter;
}

private function obtainDelimiter(string $delimiter): void
{
    $this->delimiter = '/' . substr($delimiter, 2) . '/';
}

We have a bunch of private methods now that all have to do with parsing a string into a list of numbers. This sounds to me like something that should be extracted to its own class: IntegersString. Let's try that and see if it makes our code nicer than it currently is:

class IntegersString
{
    const DEFAULT_DELIMITERS = "/,|\n/";

    private string $numbers;
    private string $delimiter;

    public function __construct(string $numbers)
    {
        if($this->containsDelimiter($numbers)) {
            $delimiter = $this->parseDelimiterAndNumbers($numbers);
            $this->obtainDelimiter($delimiter);
        } else {
            $this->numbers = $numbers;
            $this->delimiter = self::DEFAULT_DELIMITERS;
        }
    }

    private function containsDelimiter(string $numbers): bool
    {
        return str_starts_with($numbers, '//');
    }

    private function parseDelimiterAndNumbers(string $numbers): string
    {
        list($delimiter, $this->numbers) = explode("\n", $numbers, 2);
        return $delimiter;
    }

    private function obtainDelimiter(string $delimiter): void
    {
        $this->delimiter = '/' . substr($delimiter, 2) . '/';
    }

    public function getIntegers() : array
    {
        return preg_split($this->delimiter, $this->numbers);
    }
}

And the StringCalculator code:

class StringCalculator
{
    public function add(string $numbers) : int
    {
        $integersString = new IntegersString($numbers);
        return array_sum($integersString->getIntegers());
    }
}

I really like how our StringCalculator code ended up. It is much clearer about what it does and it is much more flexible. For example, if we want to introduce a multiply method, we don't have to repeat all the string parsing code. It's neatly tucked away in its own class.
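
To illustrate that flexibility, here is a minimal sketch of what such a multiply method might look like. The multiply method and the use of array_product are my own assumptions and not part of the kata, and the IntegersString class below is a condensed stand-in so that the example runs on its own:

```php
<?php
// Condensed stand-in for the IntegersString class from this article.
class IntegersString
{
    private string $numbers;
    private string $delimiter = "/,|\n/";

    public function __construct(string $numbers)
    {
        $this->numbers = $numbers;
        if (str_starts_with($numbers, '//')) {
            [$delimiter, $this->numbers] = explode("\n", $numbers, 2);
            $this->delimiter = '/' . substr($delimiter, 2) . '/';
        }
    }

    public function getIntegers(): array
    {
        return preg_split($this->delimiter, $this->numbers);
    }
}

class StringCalculator
{
    public function add(string $numbers): int
    {
        return array_sum((new IntegersString($numbers))->getIntegers());
    }

    // Hypothetical addition: reuses the parsing and only swaps the operation.
    public function multiply(string $numbers): int
    {
        return array_product((new IntegersString($numbers))->getIntegers());
    }
}

$calculator = new StringCalculator();
var_dump($calculator->multiply("2,3\n4"));   // int(24)
var_dump($calculator->multiply("//;\n2;5")); // int(10)
```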

I'm not entirely happy with the IntegersString class though. The extraction of the parseDelimiterAndNumbers and obtainDelimiter methods feels a bit awkward. Let's inline those methods and see if we can come up with a better refactoring there:

public function __construct(string $numbers)
{
    if($this->containsDelimiter($numbers)) {
        list($delimiter, $this->numbers) = explode("\n", $numbers, 2);
        $this->delimiter = '/' . substr($delimiter, 2) . '/';
    } else {
        $this->numbers = $numbers;
        $this->delimiter = self::DEFAULT_DELIMITERS;
    }
}

It looks less awkward, but readability could be improved. Let's try a different way to extract the code. We have two lines of code and they do two different things:

  • The first line splits the list of numbers into the delimiter part and the list of numbers
  • The second line of code parses the delimiter and converts it to a regex

We can start with extracting a method for the second functionality:

public function __construct(string $numbers)
{
    if($this->containsDelimiter($numbers)) {
        list($delimiter, $this->numbers) = explode("\n", $numbers, 2);
        $this->delimiter = $this->parseDelimiter($delimiter);
    } else {
        $this->numbers = $numbers;
        $this->delimiter = self::DEFAULT_DELIMITERS;
    }
}

private function parseDelimiter(string $delimiter): string
{
    return '/' . substr($delimiter, 2) . '/';
}

Next we can make it clearer that we are setting the two different attributes: $numbers and $delimiter:

public function __construct(string $numbers)
{
    if($this->containsDelimiter($numbers)) {
        list($delimiter, $numbers) = explode("\n", $numbers, 2);

        $this->numbers = $numbers;
        $this->delimiter = $this->parseDelimiter($delimiter);
    } else {
        $this->numbers = $numbers;
        $this->delimiter = self::DEFAULT_DELIMITERS;
    }
}

Now it becomes clear what the constructor does. It checks if $numbers contains a delimiter. If it does, it sets the custom numbers and delimiter. If it doesn't, it sets the default numbers and delimiter. We can show this in the code by extracting these two branches into their own methods:

public function __construct(string $numbers)
{
    if($this->containsDelimiter($numbers)) {
        $this->setCustomDelimiterAndNumbers($numbers);
    } else {
        $this->setDefaultDelimiterAndNumbers($numbers);
    }
}

private function setDefaultDelimiterAndNumbers(string $numbers): void
{
    $this->numbers = $numbers;
    $this->delimiter = self::DEFAULT_DELIMITERS;
}

private function setCustomDelimiterAndNumbers(string $numbers): void
{
    list($delimiter, $numbers) = explode("\n", $numbers, 2);

    $this->numbers = $numbers;
    $this->delimiter = $this->parseDelimiter($delimiter);
}
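
Putting the pieces from this section together, the complete IntegersString class would look roughly like this. It is assembled from the snippets in this article; the two lines at the bottom are only a usage example:

```php
<?php
class IntegersString
{
    const DEFAULT_DELIMITERS = "/,|\n/";

    private string $numbers;
    private string $delimiter;

    public function __construct(string $numbers)
    {
        if ($this->containsDelimiter($numbers)) {
            $this->setCustomDelimiterAndNumbers($numbers);
        } else {
            $this->setDefaultDelimiterAndNumbers($numbers);
        }
    }

    private function containsDelimiter(string $numbers): bool
    {
        return str_starts_with($numbers, '//');
    }

    private function setDefaultDelimiterAndNumbers(string $numbers): void
    {
        $this->numbers = $numbers;
        $this->delimiter = self::DEFAULT_DELIMITERS;
    }

    private function setCustomDelimiterAndNumbers(string $numbers): void
    {
        // Split the "//[delimiter]" line from the numbers that follow it.
        list($delimiter, $numbers) = explode("\n", $numbers, 2);

        $this->numbers = $numbers;
        $this->delimiter = $this->parseDelimiter($delimiter);
    }

    private function parseDelimiter(string $delimiter): string
    {
        // Drop the leading "//" and wrap the rest in regex delimiters.
        return '/' . substr($delimiter, 2) . '/';
    }

    public function getIntegers() : array
    {
        return preg_split($this->delimiter, $this->numbers);
    }
}

$integers = new IntegersString("//;\n1;2;3");
var_dump(array_sum($integers->getIntegers())); // int(6)
```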

We could probably do more refactoring, but for now I'm happy. We support the requirements of the kata and we have readable and easy to extend code.

Conclusion and next steps

The StringCalculator can be approached in different ways. Today we used the standard PHP library and its string manipulation functions to perform this kata. You could also perform this kata without using the standard library. It would lead to completely different code and a different approach to the Kata. Precisely for that reason, I would recommend trying that approach as well.
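
To give an idea of how different that approach looks, here is a rough sketch that walks the string character by character. It is only a starting point under stated assumptions: the function name addByScanning is invented, it still leans on strlen and string indexing, it supports only the default delimiters, and it expects the input to contain nothing but digits, commas and newlines:

```php
<?php
// Hypothetical alternative: sum the numbers without explode(), preg_split()
// or array_sum(). Assumes only digits, commas and newlines in the input.
function addByScanning(string $numbers): int
{
    $sum = 0;
    $current = 0;

    for ($i = 0; $i < strlen($numbers); $i++) {
        $char = $numbers[$i];
        if ($char === ',' || $char === "\n") {
            // A delimiter ends the number we were reading.
            $sum += $current;
            $current = 0;
        } else {
            // Shift the digits read so far and append the new one.
            $current = $current * 10 + (int) $char;
        }
    }

    // Add the last number, which is not followed by a delimiter.
    return $sum + $current;
}

var_dump(addByScanning(''));       // int(0)
var_dump(addByScanning("1,2\n3")); // int(6)
var_dump(addByScanning('10,25'));  // int(35)
```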

At the end of the kata we did quite a bit of refactoring. For me this is the most fun part of a kata and probably also the most important part. We have a working test suite, and the goal of refactoring is to change the structure of the code without changing its behaviour. If you followed along and ran the tests after each step, you saw that we always kept the code in working order. Even so, we were able to make quite significant changes to the code. We extracted a new class for parsing strings. We added many new methods and made the code much more readable. This is a skill that is good to practise and to use in the production code you write.

The StringCalculator kata isn't done with the steps we performed in the current article. There are several additional, more advanced requirements we could add to the kata. In a follow-up article I will walk through these extra requirements and show you how you could extend this code to support them.

Source code

All code described in this article is available on GitHub in the Webdevils kata project.

Author

Mark Kazemier

Hi, my name is Mark. I'm the founder of webdevils.nl and love developing websites and other web applications. Through Webdevils.nl I want to spread my enthusiasm about the web and PHP. In my professional life I'm a security expert specialised in security monitoring.
