Looks like it works.

Edit: I still see some performance issues. This needs more troubleshooting.

Update: Registrations re-opened. We encountered a bug where people could not log in; see https://github.com/LemmyNet/lemmy/issues/3422#issuecomment-1616112264 . As a workaround, we opened registrations.

Thanks

First of all, I would like to thank the Lemmy.world team and two admins of other servers, @stanford@discuss.as200950.com and @sunaurus@lemm.ee, for their help! We did some thorough troubleshooting to get this working!

The upgrade

The upgrade itself isn’t too hard. Create a backup, and then change the image names in the docker-compose.yml and restart.

But, like the first 2 tries, after a few minutes the site started getting slow until it stopped responding. Then the troubleshooting started.

The solutions

What I had noticed previously is that the lemmy container could reach around 1500% CPU usage; above that, the site got slow. That is weird, because the server has 64 threads, so 6400% should be the max. So we tried what @sunaurus@lemm.ee had suggested before: we created extra lemmy containers (and extra lemmy-ui containers) to spread the load, and used nginx to load balance between them.

Et voilà. That seems to work.
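
For anyone wanting to try the same thing, the load-balancing setup described above could look roughly like the nginx sketch below. This is not our actual config: the container names (lemmy-1, lemmy-ui-1, ...), the number of containers, and the server_name are made up for illustration, and the ports are just the Lemmy defaults (8536 for the backend, 1234 for the UI). It is also simplified: a real Lemmy nginx config additionally routes federation requests to the backend based on the Accept header and request method, which is left out here.

```nginx
# Sketch only: container names and counts are hypothetical.
# nginx distributes requests round-robin across the servers
# in an upstream block by default.
upstream lemmy_backend {
    server lemmy-1:8536;
    server lemmy-2:8536;
    server lemmy-3:8536;
}

upstream lemmy_ui {
    server lemmy-ui-1:1234;
    server lemmy-ui-2:1234;
}

server {
    listen 80;
    server_name example.org;  # placeholder domain

    # API and related backend paths go to the lemmy containers.
    location ~ ^/(api|pictrs|feeds|nodeinfo|\.well-known) {
        proxy_pass http://lemmy_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }

    # Everything else goes to the lemmy-ui containers.
    location / {
        proxy_pass http://lemmy_ui;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

Adding or removing a container is then just a matter of editing the server lines in the matching upstream block and reloading nginx.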

Also, as suggested by him, we start the lemmy containers with the scheduler disabled, and run one extra lemmy container with the scheduler enabled that is not used for anything else.

There will be room for improvement, and probably new bugs, but we’re very happy lemmy.world is now at 0.18.1-rc. This fixes a lot of bugs.

  • Kuroshi@lemmy.ramble.moe
    1 year ago

    I may not be a user on your instance, but either way, thanks for the upgrade. I was noticing a lot of issues with federation from lemmy.world, and it seems like this upgrade more-or-less fixed them.

    I’m just running a tiny, single-user instance, but I want you to know that I appreciate the work you’re putting in! I run large-scale infra as my day job, so I understand how challenging this sudden influx of users (and federated servers!) is.