Alt account of [email protected] here.

Our instance is currently down and I can’t get remote access to the servers. It appears that there might have been a hardware failure of the main firewall, which is the one thing I can’t work around remotely.

I am still trying a few things, but I am not very optimistic that I can get access.

The really unfortunate part is that just now I am on one of my rare work deployments abroad, so I also can’t access it physically during the next few weeks and my usual back up that could restart it is not available either.

As something like that never happened in 3 years operating the servers, I thought I can risk it, but murphy’s law seems inescapable 😓

I will try to keep you posted here on any updates, but probably there will not be much I can do for a while. Really bad timing 😥

Edit: we might use this “opportunity” to migrate the instance to Piefed, which has been an idea for quite some time now. I will keep you posted on that.

  • Kris@feddit.orgOP
    link
    fedilink
    English
    arrow-up
    10
    ·
    edit-2
    1 day ago

    We have a small write up about the hardware on our wiki, but it is also down right now.

    I think we will share a post-mortem write up of the actual improvements we will do to avoid this in the future.

    One thing I will definitly do is to add a KVM remote management console to one of our server boards and move the main firewall into a VM with hardware passthrough of the NICs (this was anyways planned for a 10gbit network upgrade for the second half of 2025). This way I should be able to reboot and even reinstall the main ingress point remotely, so that only the fiber gateway remains as a failure point that requires physical access.