It’s been a while.
My blogs and other sites all run on Jekyll (well, they are generated by Jekyll and served by nginx) and have been as void of new content as ever. But at least now they can’t be hacked as easily as WordPress.
Today, new shocker.
- ssh gives me connection refused (this must have happened today; I know because just yesterday I was still playing with my Python project)
- went to Hetzner for the recovery console because I assumed I had somehow accidentally firewalled myself out again (unlikely, if I had thought about it more… but hey, panic mode). There is no recovery console?
- still connection refused both on my new machine and my laptop
- Hetzner only offers a console that wants a password I don’t have, but at least that suggests I could connect.
- reboot doesn’t fix the problem
… I was about to restore the computer from the 4-hour-old backup, because that might have fixed things? I mean, losing a dozen semi-solicited mails is not terrible as data loss goes.
- a few minutes later everything magically recovers; logging in, I can see GitLab on its way out the door with a huge load of whatever…
Question:
Root and my personal user are both set up for password-less login with SSH. My user has a password; I am almost sure root has no password because of the way Hetzner does the setup. At least I do not have the root password.
Is this a bad idea? Should I set it up differently? Add a third user that is allowed to log in via a terribly long password, just in case?
This was a bit too much panic mode for a Saturday afternoon…
2 Likes
I have PermitRootLogin = prohibit-password (or the older and easily misunderstood “without-password”) as part of my standard sshd setup.
FWIW I haven’t seen any access anomalies on Hetzner today. However, it’s possible that your box was being hit by a password-guessing attack that was eating up all the available server connections. So if everyone legitimately logging in to the box is using a key rather than a password, it might be worth adding something like
LoginGraceTime 10
which would mean the server wasn’t rejecting new connections (like yours) while waiting for an invalid password from an attacker. (The default is 120 seconds.)
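Roughly, the relevant part of /etc/ssh/sshd_config might then look like this (the PasswordAuthentication and MaxStartups lines are extras I would consider, not something you must have):

# /etc/ssh/sshd_config (excerpt)
PermitRootLogin prohibit-password   # root can only log in with a key, never a password
PasswordAuthentication no           # key-only for every user (assumes everyone has keys set up)
LoginGraceTime 10                   # drop unauthenticated connections after 10s instead of 120s
MaxStartups 10:30:100               # OpenSSH default, shown for reference: start dropping pending unauthenticated connections once 10 pile up

Reload sshd afterwards, and keep an existing session open while you test, just in case.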
On Debian, grep Failed.password /var/log/auth.log will show failed passwords for ssh…
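And if you want a quick count, or a tally of which usernames are being tried, something like this should do (assuming the default Debian/Ubuntu log format; the -P regex needs GNU grep):

# total number of failed password attempts
grep -c 'Failed password' /var/log/auth.log
# which usernames the attackers are guessing, most popular first
grep 'Failed password' /var/log/auth.log | grep -oP 'for (invalid user )?\K\S+' | sort | uniq -c | sort -rn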
2 Likes
Thanks 
I’ll add that setting.
Just 400 attempts today, which doesn’t seem like much. But maybe I just had unlucky timing, trying to check whether I could log in from my new computer; a bunch of attempts came in right around the time I tried to log in.
Various users: git, operator, postgres, samba, root, ubuntu, tech, admin
Glad I am running everything in containers, so those users do not exist on the main machine.
Sometimes, looking at server logs, I despair…
3 Likes
I am back with more questions…
I have my internet server running entirely with Ansible / Docker.
For SSL I have a Let’s Encrypt automation that works with Traefik in some “magical” way, so I don’t have to worry and still get fresh certs every quarter.
Now I am setting up my second NUC as our home server. I have a stripped-down version of the same Ansible and Docker scripts, because, well, a lot of the stuff is similar-ish and I want to test federation for Nextcloud.
So now: how do I SSL-ize my home server, which sits behind a dynamic IP address and does not have its own domain?
Would you do self-signed? Other options?
You can still use Let’s Encrypt without a static IP. The easiest way is using an API from your DNS provider (a subdomain of your main domain is the simplest option).
Like if your main domain was supercool.example, you could use slightlyless.supercool.example for your home stuff.
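Roughly, the Traefik side of that could look like this in the static configuration (mail address, storage path and provider are placeholders for whatever you use; the provider’s API credentials go to the Traefik container as environment variables):

certificatesResolvers:
  letsencrypt:
    acme:
      email: you@supercool.example       # placeholder contact address
      storage: /letsencrypt/acme.json    # where Traefik keeps the issued certificates
      dnsChallenge:
        provider: cloudflare             # placeholder, use your own DNS provider here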
1 Like
Okay, but how do I tell my DNS provider (I have the API; I am already using it for the server with Traefik) that the subdomain points to my dynamic IP, or do I not need that? It just needs the ACME challenge… is that enough?
Edit: thanks for the clue. I found some articles, e.g. “Using Let's Encrypt for internal servers” on Philipp's Tech Blog.
That’s really neat because I have most of that solution already in place.
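If I understand it correctly, the DNS challenge only needs a temporary TXT record that Traefik creates and removes through the provider API, so the name never has to point at my dynamic home IP publicly. During a renewal the zone might contain something like this (the private address is just an example so local clients can reach the box by name):

slightlyless.supercool.example.                  A    192.168.1.10          ; optional, LAN address for local access
_acme-challenge.slightlyless.supercool.example.  TXT  "<validation token>"  ; created and removed automatically during the challenge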
1 Like
Yeah basically the viable routes for getting a cert for domain X are:
- an HTTP challenge, answered by the web server that X points to, or
- a DNS challenge, answered by putting a TXT record into X’s DNS zone.
Either of these has to be visible from outside.
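In Traefik terms that would be two different kinds of certificate resolvers, roughly like this (resolver names are made up); only the first one needs the host that serves X to be reachable from outside on port 80:

certificatesResolvers:
  via-http:
    acme:
      email: you@supercool.example
      storage: /letsencrypt/acme-http.json
      httpChallenge:
        entryPoint: web            # port 80 on the host serving X must be reachable from the internet
  via-dns:
    acme:
      email: you@supercool.example
      storage: /letsencrypt/acme-dns.json
      dnsChallenge:
        provider: cloudflare       # placeholder; only X's public DNS zone needs to be visible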
2 Likes
I am already doing that part on my internet server; I had not realized it was viable for my at-home server.
I am going to give it a try either tomorrow or sometime next week.
(I very much prefer Advent of Code to all other techie stuff right now. It’s puzzly but so relaxing, because I don’t need to do it.)
2 Likes
For now, btw, I have just copied the certificates from my server, but I believe I will get the DNS challenge to work for my internal server.
The home server is currently running a docker/traefik setup similar to the one on my internet server.
However, I am very unhappy with the “latest” shenanigans Ubuntu has introduced. It is becoming more and more Windows-like in its attempts to tell me how to run my server. Nope, I don’t want a bunch of “don’t delete this, this is managed by a weird service you have never heard of before” files in /etc. I can very well edit my resolv.conf on my own, thank you.
On the other hand I finally have a working pihole and I am excited to try out Traefik 3.0 soonish 
(After barely managing the upgrade from 2.9 to 2.10.)
2 Likes
Me, someone who just recently migrated from 1.9
1 Like
There was one major change between 2.9 and 2.10, namely how the DNS challenge finds out which domains to challenge for. And that really, really broke things, because the previous configuration didn’t throw any errors; it just stopped generating wildcard certificates.
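One way to make the challenge domains explicit instead of relying on the defaults is to declare them per router, roughly like this as compose labels (router name, resolver name and domains are placeholders):

labels:
  - "traefik.http.routers.whoami.rule=Host(`whoami.supercool.example`)"
  - "traefik.http.routers.whoami.tls=true"
  - "traefik.http.routers.whoami.tls.certresolver=letsencrypt"
  # request the wildcard certificate explicitly instead of letting Traefik infer it
  - "traefik.http.routers.whoami.tls.domains[0].main=supercool.example"
  - "traefik.http.routers.whoami.tls.domains[0].sans=*.supercool.example"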
Right now I am working on the home server, so I do updates more regularly. But this is not an everyday hobby. It is one I work on for a few weeks or months, and then, when stuff is up and running, I drop it. And then it runs smoothly for half a year until it doesn’t, or some shitty security thing crops up and forces me to upgrade Roundcube (not this time, because I already did that).
And at work I also use Docker, but only with a couple of containers, and the only upgrade we ever do casually is pulling our own release. When the WildFly inside those containers is upgraded, it is a major undertaking, and ES upgrades are similarly scary and usually not within my responsibility.
On my own servers I run a lot of containers with :latest, except traefik.
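In compose terms that looks roughly like this (images and the pinned version are just examples):

services:
  traefik:
    image: traefik:v2.10             # pinned on purpose, upgraded deliberately
  whoami:
    image: traefik/whoami:latest     # everything else just rides :latest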
2 Likes
It seems I never run out of issues.
Today I am home from a lengthy work thing, still tired, and I notice that my paperless instance has stopped running. So I log into the local server (a NUC) and discover that
a) it wants a firmware update and
b) the file system is now in read-only mode, and almost all commands, including df/dh/dmesg and any attempt to check anything on the drive, give I/O errors. Including rsync or scp.
The only data stored exclusively on the NUC is the paperless archive, which I had manually rsynced to our NAS. So I am missing a few weeks of documents on the NAS, which is not a big issue. The most important one is an invoice we still have as a paper hardcopy.
We are now without a pi-hole. I guess we’ll have to live with that.
We just shut down the NUC.
Next step: boot Linux from a USB stick and see if that allows me to save the data or analyze what’s wrong.
If that doesn’t help… probably extract the SSD and try to read the data from some other device. No idea how to go about that. I am not in a huge hurry; it just sucks that I’ll probably have to put the system back together. This will show whether my docker/ansible approach really works.
Any tips what I could/should do?
1 Like
Sounds like the right approach. If you can get the NUC running off a USB stick, you might dd the affected partition onto a different Linux box (e.g. over ssh), then mount it with -o loop to avoid any further data loss.
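Roughly like this, for example (the partition name and paths are guesses, so check lsblk first and adjust):

# on the NUC, booted from the USB stick
dd if=/dev/nvme0n1p2 bs=4M conv=noerror,sync status=progress | ssh user@otherbox 'cat > /srv/rescue/nuc-root.img'

# on the other Linux box, mount the image read-only via a loop device
mkdir -p /mnt/nuc-root
mount -o loop,ro /srv/rescue/nuc-root.img /mnt/nuc-root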
1 Like