Develop a blog - Part 5: Slugs

11 months ago

In the last article we finalised the implementation of our initial design and requirements. However, the initial requirements weren't complete. How is a reader going to identify a specific blog post? We need some kind of ID. Introducing slugs. A slug is the human-readable part of a URL, most often derived from the title of a BlogPost. A slug not only helps with SEO (Search Engine Optimization), it also makes the URL easier to read, so a user can tell where on a website (or blog) they are.

This article is part of a series. If you haven't read the other parts yet, read those first.

The source code for the full series and the changes I made during this article are available on GitHub.

Requirements for slugs

So how do slugs work in our blog? We don't have any requirements yet, so let's define a few.

  1. A slug must be based on the name of a category and the title of a blog post
  2. A slug can only contain lowercase letters, numbers and dashes (-)
  3. A slug must be unique
  4. When trying to create an existing slug we add a sequential number to the slug to make it unique

Here are a few example slugs. The first two are valid, the last two are invalid:

  • develop-a-blog
  • develop-a-blog-2
  • Develop_A_Blog
  • the-cost-in-$-of-a-house

We can use these examples in our first set of tests.

The Slug object

We have two possibilities to define a Slug in our code:

  1. Use a string that we provide to the BlogPost and the Category classes
  2. Create a Slug class that defines the behaviour of a Slug.

Personally I like the latter option most. It allows us to explicitly ask for a slug and make sure that it is always valid, without bothering the Category and BlogPost classes with the validation and definition of slugs.

The first four tests for a slug would look like this:

<?php

namespace Tests;

use PHPUnit\Framework\TestCase;
use Webdevils\Blog\Exceptions\InvalidSlug;
use Webdevils\Blog\Slug;

class SlugTest extends TestCase
{
    /** @test */
    public function can_create_a_new_slug()
    {
        $slug = new Slug('develop-a-blog');

        $this->assertEquals('develop-a-blog', $slug->getUrl());
    }

    /** @test */
    public function can_contain_a_number()
    {
        $slug = new Slug('develop-a-blog-2');

        $this->assertEquals('develop-a-blog-2', $slug->getUrl());
    }

    /** @test */
    public function cannot_contain_capitals_and_underscores()
    {
        $this->expectException(InvalidSlug::class);

        new Slug('Develop_A_Blog');
    }

    /** @test */
    public function cannot_contain_special_characters()
    {
        $this->expectException(InvalidSlug::class);

        new Slug('the-cost-in-$-of-a-house');
    }
}

Next we create the Slug class and make sure it passes the tests defined above:

<?php


namespace Webdevils\Blog;

use Webdevils\Blog\Exceptions\InvalidSlug;

class Slug
{
    private string $url;

    public function __construct(string $url)
    {
        if (!preg_match('/^[a-z0-9\-]*$/', $url)) {
            throw new InvalidSlug('Slug can only contain lowercase letters, numbers and dashes (-)');
        }

        $this->url = $url;
    }


    public function getUrl(): string
    {
        return $this->url;
    }
}

There are a few design choices we made in the Slug class:

  1. We use a regex to verify if the slug is valid: only lowercase letters from a to z, numbers from 0 to 9 and dashes
  2. When the slug isn't valid we throw an InvalidSlug exception.
  3. We don't allow the user to change a slug after it was created.
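
To see these choices in action, here is a small sketch that feeds the four example slugs from the requirements into the Slug class. The classes are condensed copies so the snippet runs on its own. The parent class of InvalidSlug is my assumption (the article never shows it), and isAcceptedAsSlug is a hypothetical helper that only exists for this illustration.

```php
<?php

declare(strict_types=1);

// Assumption: InvalidSlug extends InvalidArgumentException.
class InvalidSlug extends InvalidArgumentException
{
}

// Condensed copy of the Slug class from this article.
class Slug
{
    private string $url;

    public function __construct(string $url)
    {
        if (!preg_match('/^[a-z0-9\-]*$/', $url)) {
            throw new InvalidSlug('Slug can only contain lowercase letters, numbers and dashes (-)');
        }

        $this->url = $url;
    }

    public function getUrl(): string
    {
        return $this->url;
    }
}

// Hypothetical helper: reports whether Slug accepts the candidate string.
function isAcceptedAsSlug(string $candidate): bool
{
    try {
        new Slug($candidate);

        return true;
    } catch (InvalidSlug $exception) {
        return false;
    }
}

$candidates = ['develop-a-blog', 'develop-a-blog-2', 'Develop_A_Blog', 'the-cost-in-$-of-a-house'];

foreach ($candidates as $candidate) {
    echo $candidate, ' => ', isAcceptedAsSlug($candidate) ? 'valid' : 'invalid', PHP_EOL;
}
```

The first two candidates are reported as valid and the last two as invalid. One thing to be aware of: because the regex uses the `*` quantifier, an empty string is accepted as well. Using `+` instead would reject it, if that is what you want.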

Next step: generating slugs.

Generating slugs

Now that we can create slugs, the next step is to generate them from a category name or blog post title. There are three places where we can put the code to generate a slug:

  1. In the Slug constructor. We expect a string and we lowercase it, etc to generate a valid slug
  2. In the Category and BlogPost classes
  3. In a SlugGenerator class

Option 2 is my least favourite option. That would mean that we have to duplicate part of the code in the Category and BlogPost classes.

Option 1 is a valid option and would make sure we have the code for generating slugs in one place. However, it would also mean that the code for interacting with a slug and the code for generating it live in the same class. This would violate the single responsibility principle. For now that isn't a big issue, but it can cause problems at a later stage, for example when we want to make sure we generate a unique slug. We don't want our Slug object to also know about a persistence layer.

By following option 3 we can isolate the generation of a proper slug from the actual interaction with a slug object. Let's create our first tests for the SlugGenerator.

<?php

namespace Tests;

use PHPUnit\Framework\TestCase;
use Webdevils\Blog\Slug;
use Webdevils\Blog\SlugGenerator;

class SlugGeneratorTest extends TestCase
{
    /** @test */
    public function generates_a_lowercase_slug()
    {
        $generator = new SlugGenerator();

        $this->assertEquals(
            new Slug('blog'),
            $generator->generate('Blog')
        );
    }

    /** @test */
    public function replaces_spaces_with_dashes()
    {
        $generator = new SlugGenerator();

        $this->assertEquals(
            new Slug('develop-a-blog'),
            $generator->generate('Develop a blog')
        );
    }

    /** @test */
    public function removes_special_characters()
    {
        $generator = new SlugGenerator();

        $this->assertEquals(
            new Slug('develop-a-blog'),
            $generator->generate('Develop a blog!!!')
        );
    }

    /** @test */
    public function removes_trailing_spaces()
    {
        $generator = new SlugGenerator();

        $this->assertEquals(
            new Slug('develop-a-blog'),
            $generator->generate('   Develop a blog   ')
        );
    }
}

And the SlugGenerator implementation that would make these tests pass:

<?php


namespace Webdevils\Blog;

class SlugGenerator
{
    public function generate(string $string): Slug
    {
        return new Slug(
            str_replace(
                ' ',
                '-',
                trim(
                    preg_replace(
                        '/[^a-z\d\-\s]/',
                        '',
                        strtolower($string)
                    )
                )
            )
        );
    }
}

We lowercase the string, remove all special characters, trim the surrounding whitespace and finally replace each space with a dash. This leads to a slug we could use in our blog.
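
Because the calls are nested, it can be hard to see what each step contributes. The sketch below unrolls the same four steps and returns every intermediate value. slugSteps is a hypothetical helper for illustration only; it is not part of the blog's code.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: applies the same four steps as the generator,
 * but keeps every intermediate value so each step can be inspected.
 *
 * @return array<string, string>
 */
function slugSteps(string $title): array
{
    // Step 1: lowercase the input.
    $lowercased = strtolower($title);

    // Step 2: drop everything except a-z, digits, dashes and whitespace.
    $stripped = preg_replace('/[^a-z\d\-\s]/', '', $lowercased) ?? '';

    // Step 3: remove leading and trailing whitespace.
    $trimmed = trim($stripped);

    // Step 4: turn every remaining space into a dash.
    $dashed = str_replace(' ', '-', $trimmed);

    return [
        'lowercased' => $lowercased,
        'stripped' => $stripped,
        'trimmed' => $trimmed,
        'dashed' => $dashed,
    ];
}

foreach (slugSteps('   Develop a blog!!!   ') as $step => $value) {
    printf("%-10s '%s'%s", $step, $value, PHP_EOL);
}
```

The last entry for this input is 'develop-a-blog'. Note that separators inside the title are not collapsed: a title such as "PHP 8 - what's new?" ends up as 'php-8---whats-new'. That is still a valid slug according to our requirements, just not the prettiest one.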

Adding a sequence number when slug isn't unique

While we have the code to generate a slug, we missed one important requirement. The slug needs to be unique. We cannot have two categories or blog posts with the same slug. So let's think about how we can make sure a slug is unique.

We need an object that knows about all slugs that exist in our blog, for example a SlugRepository. For our SlugGenerator it doesn't matter how the SlugRepository stores and retrieves these slugs. We will come to that in a later article, when we implement the persistence layer. For now we can just define an interface with a method that determines whether a slug already exists. Let's create a test that shows what this should look like:

protected function setUp() : void
{
    parent::setUp();

    $this->repository = $this->createStub(SlugRepository::class);
    $this->generator = new SlugGenerator($this->repository);
}

/** @test */
public function adds_sequence_number_when_slug_already_exists()
{
    $this->repository->method('exists')
        ->willReturn(true, false);

    $this->assertEquals(
        new Slug('develop-a-blog-2'),
        $this->generator->generate('Develop a Blog')
    );
}

/** @test */
public function sequence_number_increases_when_previous_number_already_exists()
{
    $this->repository->method('exists')
        ->willReturn(true, true, false);

    $this->assertEquals(
        new Slug('develop-a-blog-3'),
        $this->generator->generate('Develop a blog')
    );
}

The tests verify that a sequence number is added, and that the sequence number is increased when the slug with a sequence number already exists too. This forces us to keep increasing the sequence number until we find a slug that isn't in use yet. The code for this would look like this:

<?php


namespace Webdevils\Blog;

class SlugGenerator
{
    private SlugRepository $repository;

    public function __construct(SlugRepository $repository)
    {
        $this->repository = $repository;
    }

    public function generate(string $string): Slug
    {
        $slugString = str_replace(
            ' ',
            '-',
            trim(
                preg_replace(
                    '/[^a-z\d\-\s]/',
                    '',
                    strtolower($string)
                )
            )
        );
        $slug = new Slug($slugString);

        $sequence = 2;
        while ($this->repository->exists($slug)) {
            $slug = new Slug($slugString . '-' . $sequence);

            $sequence++;
        }

        return $slug;
    }
}

The SlugGenerator uses a SlugRepository to check if a slug exists. If it exists it will add a sequence number. For now the implementation of the SlugRepository doesn't matter, as long as the interface is available. The persistence layer can implement the SlugRepository interface and decide how to check if a slug already exists.
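
The interface itself isn't shown in this article, so here is a minimal sketch of what it could look like. The exists(Slug $slug): bool signature is inferred from how the generator calls it. InMemorySlugRepository, its add method and the makeUnique function are hypothetical names I use only to try the loop out without a database, and the Slug class is a stand-in without validation.

```php
<?php

declare(strict_types=1);

// Stand-in for the article's Slug class (validation left out for brevity).
final class Slug
{
    public function __construct(private string $url)
    {
    }

    public function getUrl(): string
    {
        return $this->url;
    }
}

// Assumed shape of the interface, inferred from the exists($slug) call.
interface SlugRepository
{
    public function exists(Slug $slug): bool;
}

// Hypothetical in-memory implementation, handy until real persistence exists.
final class InMemorySlugRepository implements SlugRepository
{
    /** @var array<string, true> */
    private array $known = [];

    public function add(Slug $slug): void
    {
        $this->known[$slug->getUrl()] = true;
    }

    public function exists(Slug $slug): bool
    {
        return isset($this->known[$slug->getUrl()]);
    }
}

// Same loop as in SlugGenerator::generate(), applied to an already cleaned string.
function makeUnique(string $slugString, SlugRepository $repository): Slug
{
    $slug = new Slug($slugString);

    $sequence = 2;
    while ($repository->exists($slug)) {
        $slug = new Slug($slugString . '-' . $sequence);
        $sequence++;
    }

    return $slug;
}

$repository = new InMemorySlugRepository();

// Ask for the same slug three times and store each result.
foreach (range(1, 3) as $attempt) {
    $slug = makeUnique('develop-a-blog', $repository);
    $repository->add($slug);
    echo $slug->getUrl(), PHP_EOL;
}
```

This prints develop-a-blog, develop-a-blog-2 and develop-a-blog-3, which matches the behaviour the two tests describe.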

Add a slug to the Category and BlogPost classes

Now that we can create a slug we can continue with the Category and BlogPost domain objects. Both will use the Slug as their identifying field, which means that a user can use the Slug to find the Category or BlogPost.

The Category generates its slug when it is created, based on its name. We don't want the creator of the Category to know how a Slug is generated, so it makes most sense to generate the Slug in the Category constructor. Below is a test that shows how the slug is generated from a user's perspective:

protected function createCategory(
    ?SlugGenerator $generator = null,
    string $name = 'PHP',
    ?string $slug = null
) : Category {
    if ($slug === null) {
        // No slug given: fall back to the lowercase version of the name.
        $slug = strtolower($name);
    }

    if ($generator === null) {
        $generator = $this->createStub(SlugGenerator::class);
        $generator->method('generate')
            ->willReturn(new Slug($slug));
    }

    return new Category(
        $generator,
        $name
    );
}

/** @test */
public function a_category_generates_a_slug()
{
    $generator = $this->createMock(SlugGenerator::class);
    $generator->expects($this->once())
        ->method('generate')
        ->with($this->equalTo('PHP'))
        ->willReturn(new Slug('php'));

    $category = $this->createCategory(
        generator: $generator
    );

    $this->assertEquals(
        new Slug('php'),
        $category->getSlug()
    );
}

The createCategory method has some interesting logic. When creating a Category for testing we want a default SlugGenerator stub that returns a predefined slug. If no slug is provided to the createCategory method we will use the lowercase version of the $name. In some cases we want to override this though. We can do that by providing the $slug parameter separately.

In this test we want the SlugGenerator to be a mock instead of a stub, because we want to verify that the generate method is actually called.
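
To make the difference between a stub and a mock concrete, here is a sketch that spells out by hand what the mock checks. It doesn't use PHPUnit. GeneratesSlugs, StubGenerator and RecordingGenerator are hypothetical names I made up for this illustration, the Category is stripped down to the slug generation, and slugs are plain strings to keep it short. Strictly speaking, a hand-written double that records its calls is usually called a spy.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the public surface of the SlugGenerator.
interface GeneratesSlugs
{
    public function generate(string $string): string;
}

// Stub: returns a canned answer and records nothing.
final class StubGenerator implements GeneratesSlugs
{
    public function generate(string $string): string
    {
        return 'php';
    }
}

// Recording double: same canned answer, but it also remembers every call.
final class RecordingGenerator implements GeneratesSlugs
{
    /** @var list<string> */
    public array $calls = [];

    public function generate(string $string): string
    {
        $this->calls[] = $string;

        return 'php';
    }
}

// Simplified Category: only the slug generation is kept.
final class Category
{
    private string $slug;

    public function __construct(GeneratesSlugs $generator, string $name)
    {
        $this->slug = $generator->generate($name);
    }

    public function getSlug(): string
    {
        return $this->slug;
    }
}

$generator = new RecordingGenerator();
$category = new Category($generator, 'PHP');

// Hand-written counterpart of expects($this->once())->method('generate')->with('PHP').
var_dump($generator->calls === ['PHP']);

// Counterpart of the assertEquals on getSlug().
var_dump($category->getSlug() === 'php');
```

With the StubGenerator we could only check the returned slug. The recording double lets us also check that generate was called exactly once and with the category name, which is what the mock in the test does for us.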

Making sure the test passes is easy:

public function __construct(SlugGenerator $generator, string $name)
{
    if ($this->isTooShort($name, self::MIN_NAME_LENGTH)) {
        throw new InvalidCategory('Category name must be minimum '.self::MIN_NAME_LENGTH.' characters');
    }
    if ($this->isTooLong($name, self::MAX_NAME_LENGTH)) {
        throw new InvalidCategory('Category name must be maximum '.self::MAX_NAME_LENGTH.' characters');
    }

    $this->slug = $generator->generate($name);
    $this->name = $name;
}

Next we have the BlogPost. For BlogPost the Slug is generated based on its title. Again, the user shouldn't be bothered with the details on how the Slug is generated. Therefore the BlogPost constructor will create the necessary slug.

protected function createBlogPost(
    string $slug = 'my-first-blog-post',
    string $title = 'My first blog post',
    string $introduction = 'A short introduction to the BlogPost',
    string $content = 'The content of the full article',
    ?SlugGenerator $generator = null,
    ?Parser $parser = null
): BlogPost {
    if ($generator === null) {
        $generator = $this->createStub(SlugGenerator::class);
        $generator->method('generate')
            ->willReturn(new Slug($slug));
    }

    if ($parser === null) {
        $parser = $this->createStub(Parser::class);
        $parser->method('parse')
            ->will($this->returnArgument(0));
    }

    return new BlogPost(
        generator: $generator,
        parser: $parser,
        author: new Author('Mark'),
        category: $this->createCategory(),
        title: $title,
        introduction: $introduction,
        content: $content,
    );
}

protected function createCategory()
{
    $generator = $this->createStub(SlugGenerator::class);
    $generator->method('generate')
        ->willReturn(new Slug('php'));

    return new Category($generator, 'PHP');
}

/** @test */
public function a_blogpost_generates_a_slug()
{
    $generator = $this->createMock(SlugGenerator::class);
    $generator->expects($this->once())
        ->method('generate')
        ->with('My first blog post')
        ->willReturn(new Slug('my-first-blog-post'));

    $blogPost = $this->createBlogPost(generator: $generator);

    $this->assertEquals(
        new Slug('my-first-blog-post'),
        $blogPost->getSlug()
    );
}

As in the CategoryTest, I added a createCategory and a createBlogPost method to the test. The createCategory method always creates the same category with the same SlugGenerator. The createBlogPost method lets each test choose the SlugGenerator, or the slug value the default stub returns.

The test checks if BlogPost uses the SlugGenerator to generate the slug based on the title and if the correct Slug is returned.

The implementation in the BlogPost class to make the test pass:

public function __construct(
    SlugGenerator $generator,
    Parser $parser,
    Author $author,
    Category $category,
    string $title,
    string $introduction,
    string $content,
) {
    $this->validate($title, $introduction, $content);

    $this->slug = $generator->generate($title);
    $this->author = $author;
    $this->category = $category;
    $this->title = $title;
    $this->introduction = $introduction;
    $this->content = $content;

    $this->status = new Draft();
    $this->parser = $parser;
}

public function getSlug() : Slug
{
    return $this->slug;
}

And I think we are done. Let's have another look at the requirements we stated at the beginning of this article:

  1. A slug must be based on the name of a category and the title of a blog post
  2. A slug can only contain lowercase letters, numbers and dashes (-)
  3. A slug must be unique
  4. When trying to create an existing slug we add a sequential number to the slug to make it unique

The Category class generates a slug based on its name. The BlogPost class generates a slug based on its title. We implemented requirement 1.

The Slug object verifies that the slug only contains lowercase letters, numbers and dashes. If we provide an invalid slug we get an InvalidSlug exception. In addition, the SlugGenerator class makes sure any generated slug will adhere to these requirements. We implemented requirement 2.

The SlugGenerator checks with the SlugRepository if a slug is unique. If the slug already exists it adds a sequence number. We also implemented requirements 3 and 4.

This concludes the implementation of slugs in our Blog.

Source code and next article

We can create a BlogPost and interact with it. The next step, discussed in the next article, is updating an existing BlogPost. We will have some interesting business logic related to slugs, authors and the status of a BlogPost.

The source code of the Blog project and the changes made in this article are available on GitHub.

Author

Mark Kazemier

Hi, my name is Mark. I'm the founder of webdevils.nl and love developing websites and other web applications. Through Webdevils.nl I want to spread my enthusiasm about the web and PHP. In my professional life I'm a security expert specialised in security monitoring.
