Pitfalls with Traefik

March 10, 2024

I recently decided to start using Coolify to manage the various services I host for myself and others. I had always struggled to keep an overview of the services I had deployed on different servers in different ways, and hoped that this would help. I'm still in the process of migrating everything, but have been mostly happy with Coolify itself so far. One thing I have struggled with, however, is the switch from NGINX to Traefik Proxy, which Coolify uses for routing to the various services.

In theory, Traefik seems like a good tool for the sort of task you would usually accomplish by setting up a reverse proxy with NGINX, only better, since the configuration is not centralized but attached to the service you are running through Docker. In practice, however, I have found working with it quite difficult, in more ways than one.

My setup involves two main domains, the latter being the shortened version of the first. The idea is that I use jonas-langlotz.de as my canonical URL, while lltz.de can be used as a shorthand which redirects to the full name. This means that for all my services that are reachable through subdomains, I always want the shorthand to expand.

To accomplish this, I decided to add a dynamic configuration file which sets up a generic router for all traffic to my short domain and uses a RedirectRegex middleware to redirect it. This seemed simple enough on paper, but turned out to contain some pitfalls which cost me quite a few hours to fix.

I started out with this basic configuration:

http:
  routers:
    redirect-router:
      rule: "HostRegexp(`{subdomain:.*}lltz.de`)"
      middlewares:
        - redirect-to-jonas-langlotz
      service: noop@internal
      entryPoints:
        - http
        - https

  middlewares:
    redirect-to-jonas-langlotz:
      redirectRegex:
        regex: "^(http|https)://(.*?\\.)?lltz.de/(.*)"
        replacement: "$${1}://$${2}jonas-langlotz.de/$${3}"
        permanent: true

This didn't work. I was seeing some 404 errors and some 505 errors. It turned out there were multiple issues.

Differences in Escaping

One thing that tripped me up was the difference in how parts of these strings need to be escaped, depending on whether the configuration is written as Docker labels or, like in this case, as a dynamic configuration YAML file.

In Docker labels set through a docker-compose file, there are two things to consider. Dollar signs ($) need to be doubled to escape them from docker-compose variable substitution. Backslashes (\) also need to be escaped by doubling.

In the dynamic configuration file, these rules do not apply. There is no variable substitution, so $ stays single, and inside single-quoted YAML strings \ is taken literally as well. Note that a double-quoted YAML string still treats the backslash as an escape character, so single quotes are the safer choice for regular expressions. If you copy a configuration from an existing deployment to make it generic, make sure to remove the doubled signs.
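For comparison, here is a rough sketch of how the same middleware might look when written as docker-compose labels, with the doubling described above. The service name `whoami` and its image are only placeholders for illustration and are not part of my setup.

```yaml
# docker-compose.yml (sketch; service name and image are placeholders)
services:
  whoami:
    image: traefik/whoami
    labels:
      # "\\." : the backslash is doubled because this is a double-quoted YAML string
      - "traefik.http.middlewares.redirect-to-jonas-langlotz.redirectregex.regex=^(http|https)://(.*?\\.)?lltz.de/(.*)"
      # "$$"  : the dollar sign is doubled so docker-compose does not try to substitute a variable
      - "traefik.http.middlewares.redirect-to-jonas-langlotz.redirectregex.replacement=$${1}://$${2}jonas-langlotz.de/$${3}"
      - "traefik.http.middlewares.redirect-to-jonas-langlotz.redirectregex.permanent=true"
```

In a dynamic configuration file, the same two values would be written with single $ and single \ inside single quotes.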

Misleading 'entryPoints' Attribute

As part of the router definition, you need to specify which entry points the router is valid for, and the attribute is accordingly called 'entryPoints'. The plural is misleading. Listing both entry points on a single router did not give me a router that works for both HTTP and HTTPS: you will not get an error and the dashboard will show that both entry points are configured, but one of them will silently fail with an HTTP error code. Even after looking at some issues on their GitHub, the reasoning was not clear to me. As far as I can tell, the limitation comes from the TLS setting rather than from the list itself: a router with a tls section only handles HTTPS requests and a router without one only handles plain HTTP, so a single router cannot serve both.

To get the configuration to work, you need to provide two separate router configurations, dedicated to HTTP and HTTPS respectively.

Missing TLS configuration

One thing to look out for is the TLS section of the HTTPS router. Without setting the tls.certResolver field to a valid certificate resolver, like default or letsencrypt in my case, the connection over HTTPS will not work. You will get an HTTP 502 error code.
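The resolver name used in the router has to match a certificate resolver defined in Traefik's static configuration. The following is only a sketch of what such a definition can look like. The e-mail address and storage path are placeholders, and your Coolify installation most likely already ships its own version of this block.

```yaml
# Traefik static configuration (sketch; email and storage path are placeholders)
certificatesResolvers:
  letsencrypt:                # this is the name referenced by tls.certResolver
    acme:
      email: you@example.com
      storage: /data/acme.json
      httpChallenge:
        entryPoint: http      # entry point used to answer the ACME HTTP challenge
```

If the name in the router does not match any resolver defined here, Traefik has no way to obtain a certificate for that router.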

Missing certificates

Since a redirect from an HTTPS URL still requires a TLS connection to the redirecting domain, you should make sure that there is a valid certificate for every domain you redirect from. Each subdomain requires its own certificate, and since the redirect router matches with a wildcard, Traefik can't automatically generate them. To solve this, I configure the short domains as alternative domains on each service. To make sure that the service is not actually served on these domains, I set the priority field of the redirect routers to 100, so that they take precedence over the service's own router.
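As a sketch of what this could look like on the service side, assuming a hypothetical service reachable under the subdomain `app` and a router named `app` (in Coolify you would normally enter the domains in the UI rather than write these labels by hand):

```yaml
# Labels on the service (sketch; router name "app" and the subdomain are hypothetical)
labels:
  # Explicit Host() rules let Traefik derive the certificate domains,
  # which is not possible for a HostRegexp() rule.
  - "traefik.http.routers.app.rule=Host(`app.jonas-langlotz.de`) || Host(`app.lltz.de`)"
  - "traefik.http.routers.app.entrypoints=https"
  - "traefik.http.routers.app.tls.certresolver=letsencrypt"
```

Without an explicit priority, Traefik orders routers by the length of their rule. A rule like the one above is well under 100 characters, so a redirect router with priority 100 should be evaluated first for requests to the short domain.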

Final Config

After all these changes, the final config will look something like this:

http:
  routers:
    redirect-router-http:
      rule: 'HostRegexp(`{subdomain:.*}lltz.de`)'
      middlewares: redirect-to-jonas-langlotz
      service: noop@internal
      entryPoints: http
      priority: 100
    redirect-router-https:
      rule: 'HostRegexp(`{subdomain:.*}lltz.de`)'
      middlewares: redirect-to-jonas-langlotz
      service: noop@internal
      entryPoints: https
      tls:
        certResolver: letsencrypt
      priority: 100
  middlewares:
    redirect-to-jonas-langlotz:
      redirectRegex:
        regex: '^(http|https)://(.*?\.)?lltz.de/(.*)'
        replacement: '${1}://${2}jonas-langlotz.de/${3}'
        permanent: true

It's a lot more verbose, but at least it does the actual job of generically forwarding all traffic to the canonical domain.

Conclusion

Part of the problem was that even with logging set to debug and access logs turned on, there was no way to see what the middleware was actually doing when the behavior was not as expected. The only way to troubleshoot was blind trial and error.
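For reference, the logging setup mentioned above corresponds roughly to the following static configuration. This is a minimal sketch and your installation may set these options through command-line flags instead.

```yaml
# Traefik static configuration (sketch)
log:
  level: DEBUG    # most verbose general log level
accessLog: {}     # enables the access log with default settings
```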

The fact that the documentation for these configuration options was somewhat lacking didn't help either.

Now that I have gotten past the basic issues, however, I'm still optimistic that Traefik will be a good alternative to NGINX for my needs. I'm looking forward to migrating the rest of my services.

I hope that this post has helped you and wish you good luck with solving your Traefik configuration issues.

Jonas Langlotz

Hey, I'm Jonas 👋

I'm a computer science student from Berlin, Germany. If you want to read more of my notes, you can find them here.