Host your own Mastodon Instance with Docker (Part 2)

In my last post I wrote about setting up my own Mastodon instance with Docker. In this post I’m going to extend my setup by adding a few additional services. If you haven’t already, make sure to check out Part 1 of the setup first.

Overview

As mentioned in my last post, there were some drawbacks with the Mastodon instance I set up. First, errors were popping up in the browser's dev tools because I had not set up Mastodon's streaming API. Second, I had not set up the optional Elasticsearch service either, so the search functionality of my instance was limited. I'm going to change that and set up those two services now.

But before I start, I'm going to clean up and improve the existing setup.

Remove Duplication

The first thing I'm going to do is remove the duplication in the docker compose file. For that I'm creating a second docker compose file called docker-compose.base.yml and adding the following content, which defines a service called mastodon-base:

version: '3.8'

services:
  mastodon-base:
    image: 'tootsuite/mastodon'
    volumes:
      - 'mastodon-volume:/mastodon/public/system'
    environment:
      RAILS_ENV: 'production'
      LOCAL_DOMAIN: '${MASTODON_DOMAIN}'
      REDIS_HOST: 'mastodon-redis' # name of the redis container
      DB_HOST: 'mastodon-db' # name of the database container
      DB_NAME: '${MASTODON_POSTGRES_DATABASE}'
      DB_USER: '${MASTODON_POSTGRES_USERNAME}'
      DB_PASS: '${MASTODON_POSTGRES_PASSWORD}'
      SECRET_KEY_BASE: '${MASTODON_SECRET_KEY_BASE}'
      OTP_SECRET: '${MASTODON_OTP_SECRET}'
      SMTP_SERVER: '${SMTP_SERVER}'
      SMTP_PORT: '${SMTP_PORT}'
      SMTP_LOGIN: '${SMTP_LOGIN}'
      SMTP_PASSWORD: '${SMTP_PASSWORD}'
      SMTP_FROM_ADDRESS: '${SMTP_FROM_ADDRESS}'

As you can see, that service contains all the definitions that my mastodon-web and mastodon-sidekiq services have in common. I'm not going to start this service; instead, I'm going to adjust the existing docker-compose.yml so that my mastodon-web and mastodon-sidekiq services extend the mastodon-base service:

version: '3.8'

services:
  # ... existing services configuration omitted ... #
  mastodon-web:
    extends:
      file: 'docker-compose.base.yml'
      service: 'mastodon-base'
    command: 'bash -c "/provision.sh; rm -f /mastodon/tmp/pids/server.pid; bundle exec rails s -p 3000"'
    volumes:
      - './provision.sh:/provision.sh:ro'
    environment:
      MASTODON_ADMIN_USERNAME: '${MASTODON_ADMIN_USERNAME}'
      MASTODON_ADMIN_EMAIL: '${MASTODON_ADMIN_EMAIL}'
    depends_on:
      - mastodon-db
      - mastodon-redis

  mastodon-sidekiq:
    extends:
      file: 'docker-compose.base.yml'
      service: 'mastodon-base'
    command: 'bundle exec sidekiq'
    depends_on:
      - mastodon-db
      - mastodon-redis

volumes:
  # ... existing volumes configuration omitted ... #
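
To verify that the extends mechanism resolves everything as expected, docker compose can print the fully merged configuration. This is just an optional sanity check, assuming both files sit in the same directory:

docker compose config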

Add Health Checks

While upgrading to the latest Mastodon release I noticed that the Sidekiq service was constantly throwing errors because the database schema didn't match, even after the database migration had completed. I assume there was some caching going on, because the errors disappeared after restarting the container. Either way, I wanted to make sure this doesn't happen again.

To achieve this, I’m going to add health checks to my service definitions. The health checks allow me to specify that docker compose should wait for a service to be ready before it starts another service that depends on it.

So the first thing I do is add the health checks to the service definitions. I'm only listing the health check of the mastodon-web service here; you can find the rest at the end of the blog post:

version: '3.8'

services:
  # ... existing services configuration omitted ... #
  mastodon-web:
    # ... existing configuration omitted ... #
    healthcheck:
      test: ['CMD-SHELL', 'wget -q --spider --proxy=off localhost:3000/health || exit 1']
      interval: 5s
      timeout: 5s
      retries: 12

volumes:
  # ... existing volumes configuration omitted ... #

With that done, I can change the depends_on definition of the services to use the condition service_healthy:

version: '3.8'

services:
  # ... existing services configuration omitted ... #
  mastodon-sidekiq:
    # ... existing configuration omitted ... #
    depends_on:
      mastodon-web:
        condition: service_healthy

volumes:
  # ... existing volumes configuration omitted ... #

Now when I start all the containers, Docker will wait for the services to start and for their health checks to report healthy before it starts any of the dependent services. Overall, with these changes it takes longer until everything is up and running, but I won't get any errors anymore if the database migrations are not yet applied when a service starts.
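
The startup order and the health status can also be observed from the command line. A quick sketch; the exact container name in the last command depends on your compose project name and is only an example:

# start everything in the background
docker compose up -d

# the STATUS column shows "(health: starting)" or "(healthy)"
docker compose ps

# or query a single container directly
docker inspect --format '{{.State.Health.Status}}' mastodon-web-1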

Add the Streaming API

With the cleanup and improvements done, it is time to extend my Mastodon instance. The first thing I'm going to add is the streaming API. The streaming API is different from the normal REST API of Mastodon, as it allows the user interface to connect to the API via WebSockets and build up a bidirectional communication channel. By doing so, Mastodon can push data to the user interface as soon as new data is available, allowing it to display notifications or update the feed in real time.
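
As a side note, the streaming API can also be consumed directly over plain HTTP. The following curl call is a minimal sketch that subscribes to the public timeline; depending on the instance configuration an access token may be required, and the token shown here is just a placeholder:

# stream the public timeline; --no-buffer prints events as they arrive
curl --no-buffer \
  -H 'Authorization: Bearer YOUR_ACCESS_TOKEN' \
  https://social.raeffs.dev/api/v1/streaming/public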

The streaming API is also part of the Mastodon docker image, so I add a new service to my docker compose file with the name mastodon-streaming that also extends the mastodon-base service definition I created earlier:

version: '3.8'

services:
  # ... existing services configuration omitted ... #
  mastodon-streaming:
    extends:
      file: 'docker-compose.base.yml'
      service: 'mastodon-base'
    command: 'node ./streaming'
    healthcheck:
      test: ['CMD-SHELL', 'wget -q --spider --proxy=off localhost:4000/api/v1/streaming/health || exit 1']
    depends_on:
      mastodon-web:
        condition: service_healthy

volumes:
  # ... existing volumes configuration omitted ... #

The next step is to configure the reverse proxy so that requests to the path /api/v1/streaming are forwarded to the mastodon-streaming service and all the other requests to the mastodon-web service. For this, I create a file called Caddyfile with the following content:

{$MASTODON_DOMAIN} {
  handle /api/v1/streaming* {
    reverse_proxy mastodon-streaming:4000
  }

  handle {
    reverse_proxy mastodon-web:3000
  }
}

Now I need to adjust the service definition of the mastodon-proxy service in my docker compose configuration. I’m going to remove the existing command and instead add a volume mount for the Caddyfile, which is loaded by the Caddy container by default:

version: '3.8'

services:
  # ... existing services configuration omitted ... #
  mastodon-proxy:
    image: 'caddy:alpine'
    volumes:
      - 'caddy-data-volume:/data'
      - 'caddy-config-volume:/config'
      - './Caddyfile:/etc/caddy/Caddyfile:ro'
    environment:
      MASTODON_DOMAIN: '${MASTODON_DOMAIN}'
    ports:
      - '80:80'
      - '443:443'
    depends_on:
      mastodon-web:
        condition: service_healthy

volumes:
  # ... existing volumes configuration omitted ... #
  caddy-data-volume:
    external: true
  caddy-config-volume:
    external: true

I also added two new docker volumes and mounted them into the container. They are used by Caddy to store persistent data such as the issued certificates. I added them because I ran into a rate limit: Caddy was requesting new Let's Encrypt certificates every time I restarted the containers. So I need to make sure I create those two new volumes:

docker volume create caddy-data-volume
docker volume create caddy-config-volume

With that done I can run docker compose up and watch all containers start up in the correct order. To check that everything is working I open the health check URLs https://social.raeffs.dev/health and https://social.raeffs.dev/api/v1/streaming/health in a browser.
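
The same check also works from the terminal; both endpoints should return a plain 200 response. A small sketch using curl, where -f makes the command fail on HTTP errors:

curl -fsS https://social.raeffs.dev/health
curl -fsS https://social.raeffs.dev/api/v1/streaming/health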

Add Elasticsearch

With the streaming API up and running, it is time to add Elasticsearch to the setup. For this I once again create a new service called mastodon-es, based on what I found in the Mastodon docker compose example:

version: '3.8'

services:
  # ... existing services configuration omitted ... #
  mastodon-es:
    image: 'docker.elastic.co/elasticsearch/elasticsearch:7.17.7'
    volumes:
      - 'mastodon-es-volume:/usr/share/elasticsearch/data'
    environment:
      - 'ES_JAVA_OPTS=-Xms512m -Xmx512m'
      - 'xpack.license.self_generated.type=basic'
      - 'xpack.security.enabled=false'
      - 'xpack.watcher.enabled=false'
      - 'xpack.graph.enabled=false'
      - 'xpack.ml.enabled=false'
      - 'bootstrap.memory_lock=true'
      - 'cluster.name=mastodon-es'
      - 'discovery.type=single-node'
      - 'thread_pool.write.queue_size=1000'
      - 'ingest.geoip.downloader.enabled=false'
    healthcheck:
      test: ['CMD-SHELL', 'curl --silent --fail localhost:9200/_cluster/health || exit 1']
      interval: 5s
      timeout: 5s
      retries: 6

volumes:
  # ... existing volumes configuration omitted ... #
  mastodon-es-volume:
    external: true

Because this service uses a docker volume too, I need to create it first:

docker volume create mastodon-es-volume

I also need to add two additional environment variables to the mastodon-base service to enable Elasticsearch and tell Mastodon the hostname of the Elasticsearch service:

version: '3.8'

services:
  mastodon-base:
    # ... existing mastodon-base configuration omitted ... #
    environment:
      # ... existing environment variables omitted ... #
      ES_ENABLED: 'true'
      ES_HOST: 'mastodon-es' # name of the elasticsearch container

After that I can start everything up once more and see all the services running.
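
One thing worth mentioning: as far as I know, Mastodon does not index already existing content on its own. The tootctl command line tool that ships with the image can build and fill the Elasticsearch indexes; a sketch, assuming the service names from above:

# build the Elasticsearch indexes and index existing content
docker compose exec mastodon-web bin/tootctl search deploy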

Conclusion

To be honest, I'm not 100% sure whether the Elasticsearch integration works correctly. I feel like the search results on my instance are better than before, but beyond that I can only assume it works, as I don't see any errors in the console output. I should probably set up some monitoring to verify that everything works correctly anyway.

Other than that, my Mastodon instance runs very well, and the streaming API can now be used, so the user interface updates in real time.

If you are interested in the final setup, I created a GitHub repository with all the relevant files that you can check out.