On this server we have been moving high-write loads off the SSD system "disk" for reliability and longevity reasons. After following one strategy for a while, we switched to a new one, and everything seemed OK.

However, we discovered that we'd made an error: we forgot to remove a symlink in /. That link existed so we could avoid rebooting to provide a partition (which we have since done), and it pointed to the new disk space for the /var tree. The new strategy just uses a mount point for /var. We rebooted with the symlink still in place, and the result was odd: everything seemed to work OK, but if you ran realpath in /var it seemed to be on the wrong disk?!
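One way to tell these cases apart is to check whether /var is itself a symlink or a mount point before trusting what realpath reports. This is a minimal sketch, assuming a POSIX shell and the util-linux mountpoint command; describe_path is a hypothetical helper name, not part of any tool mentioned here.

```shell
# Hypothetical helper: report whether a path is a symlink, a mount point,
# or a plain directory. The symlink test comes first because the other
# checks would follow the link and describe its target instead.
describe_path() {
    p=$1
    if [ -L "$p" ]; then
        printf '%s: symlink -> %s\n' "$p" "$(readlink "$p")"
    elif mountpoint -q "$p" 2>/dev/null; then
        printf '%s: mount point\n' "$p"
    elif [ -d "$p" ]; then
        printf '%s: plain directory\n' "$p"
    else
        printf '%s: not a directory\n' "$p"
    fi
}

# Example: describe_path /var
```

Pass the path without a trailing slash; with one, the shell follows the symlink and the first test no longer sees the link itself.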

So, we removed the symlink and rebooted, and THAT'S when things went wrong.

The mail files on the intended mount of /var were ALREADY fully up to date, which was surprising. So, overwriting the current /var with the former location (where realpath said it was) seemed like a bad idea. However, Postfix won't start!

It says:

Failed to start postfix.service: Unit var.mount is masked.

I've tried unmasking and have also read and considered the input from this question and this one, too.

So far, we have a "system down" situation!

We still have both disk trees intact, of course!

Asked for Additional Data

I suppose var.mount is a systemd unit (a mount unit rather than a service); I never knew that until now. In any case:

  1. systemctl unmask var.mount completes silently. A subsequent attempt to start either var.mount or postfix.service repeats the original error message about it being masked.

  2. systemctl status var.mount says:

systemctl status var.mount
● var.mount
     Loaded: masked (Reason: Unit var.mount is masked.)
     Active: active (mounted) since Sat 2023-10-28 12:06:09 PDT; 5h 19min ago
      Where: /var
       What: /dev/sda1
        CPU: 2ms
     CGroup: /system.slice/var.mount
Notice: journal has been rotated since unit was started, output may be incomplete.
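To see where a mask comes from, it can help to look for the symlink to /dev/null that represents it. The sketch below assumes the usual locations (/etc/systemd/system for a persistent mask, /run/systemd/system for a runtime one); other directories under /run may also be involved, and find_mask is a hypothetical helper name.

```shell
# Hypothetical helper: print every path where UNIT is masked, meaning the
# unit name exists there as a symlink pointing at /dev/null.
# Usage: find_mask UNIT [DIR...]
find_mask() {
    unit=$1
    shift
    # Assumed default search locations when no directories are given.
    [ "$#" -gt 0 ] || set -- /etc/systemd/system /run/systemd/system
    for dir in "$@"; do
        link=$dir/$unit
        if [ -L "$link" ] && [ "$(readlink "$link")" = /dev/null ]; then
            printf '%s\n' "$link"
        fi
    done
}

# Example: find_mask var.mount
```

A hit under /run would be consistent with a runtime mask, which a plain unmask without --runtime may leave in place.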

Following my own query in a comment below I did a web search and found this article, and so I ran the following:

# systemctl list-unit-files | grep masked
-.mount                                                                   masked-runtime  disabled
boot.mount                                                                masked-runtime  disabled

...all the other mounts followed in exactly the same form as boot.mount, just with the other mount names, their slashes replaced by dashes.
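Rather than grepping for "masked", which also matches persistently masked units, the listing can be filtered on its state column. This is a small sketch that assumes the two leading columns shown above (unit name, then state); runtime_masked_units is a hypothetical name.

```shell
# Hypothetical filter: read `systemctl list-unit-files` output on stdin and
# print only the unit names whose state column is exactly "masked-runtime".
runtime_masked_units() {
    awk '$2 == "masked-runtime" { print $1 }'
}

# Example:
#   systemctl list-unit-files --type=mount --no-legend | runtime_masked_units
```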

No attempts at unmasking have worked so far, but I'm only using systemctl's unmask command...

...Still digging into it!

Advice is VERY welcome!

Richard T
  • After making changes to fstab, you can tell systemd to regenerate its dynamic configuration using systemctl daemon-reload - after which investigating whether they match your intent can be done separately from rebooting. Try adding to your question: your /etc/fstab, the output of systemctl status [unit] of relevant unit.service and unit.mount, lsblk, mount, .. – anx Oct 28 '23 at 23:46
  • @anx I tried the daemon-reload - didn't work. I haven't the SLIGHTEST idea why lsblk would be of interest, or mount for that matter. Mount has the same options as root, so I don't see the pertinence there, either. NOTABLY - and I indicated this above - the new mount point WAS receiving the live email from postfix. ...Its configuration is in etc, NOT var.... If you can tell me WHY lsblk and mount are useful, I would find that answer useful, too! And as for "relevant unit.service" I have NO idea what you mean, except perhaps you mean postfix.service? Sure. I'll add it now. – Richard T Oct 29 '23 at 00:19
  • @anx It just occurred to me I responded too soon in my last message: there's no point to systemctl daemon-reload when the system was just rebooted instead, TWICE. The first time had the link, the second time didn't. THAT is the difference - NOTHING to do with changes to fstab. However, sure, it SAYS "Unit var.mount is masked." And, unfortunately for me I NEVER HEARD OF "masking" until this error occurred. Perhaps that's what my question should have been about! ... I guess this is what I get for not reading the last 20 years or so worth of Fedora release notes! – Richard T Oct 29 '23 at 00:31
  • I find it odd for a directory simultaneously being a mount point, and systemd (not generally, just during this runtime) believing it should not set it up as a mount point. Usually /etc/fstab would be the place to configure the intended mount points in. In any case, your question is not exactly clear on what is former, what is current state, and what is intended and what is unintended state. Is /dev/sda1 a regular partition containing the data that you want, and is /var a regular directory right now? – anx Oct 29 '23 at 01:18
  • Though I do not believe you should do that before diagnosing what is going on, both masked and masked-runtime status can be overridden manually. Calling unmask without --runtime not clearing the latter state was (in older versions) by design. I do not think that is what you want, as you probably did not hand-write or manually mask those units. But should you for some reason still want to brute-force your way to modify systemd internals, the mask paragraph on the systemctl manual documents the --runtime case. – anx Oct 29 '23 at 01:34
  • Unfortunately there is a lack of documentation on the automatic actions systemd performs when things go wrong before bringing up dependent services, but you will still find some more details by searching your system logs for mentions of the unit names, with surrounding context. e.g. try journalctl -b 0 -o short-iso-precise | grep -C 10 -i -E 'var|mount|mask|postfix' – anx Oct 29 '23 at 01:39
  • @anx I find it odd, too. However, in poking around some, I wonder if MAYBE there's some systemd config that's found in /var? If so, THEN.... I have observed that SOME application's data was found on the mount point and SOME is found where the link pointed, creating a bifurcated world! ...I'm still trying to understand how THAT can happen, but clearly the email was going to the new, /var specific, mount point. ... I don't agree that my question is not exactly clear, what's not exactly clear is how the heck systemd actually works "under the sheets!" However, I think what's in the other var... – Richard T Oct 29 '23 at 01:40
  • @anx Just following up: An opportunity to reboot just hasn't happened BUT a power spike got through our (bad quality) UPS and rebooted it for us early Monday morning and while the system didn't come up all by itself, that was the fault of the BIOS being set improperly. Once I "hit the button," it came right up; there was NO fall-out from this. There was no "underlying problem," apparently. It was just a badly done reboot, as described above. That said, I VERY MUCH appreciate your warning me - there MIGHT have been something to it! ... I was planning for xMas or New Years to do a reboot! Ciao! – Richard T Dec 06 '23 at 15:15

1 Answer

-2

I did a web search for "masked-runtime" and that led me to... (drum-roll) the systemctl man page?! WOAH, who'da thunk the MAN page would have the answer?

That led me to learn about the systemctl option --runtime, and when I added it to an unmask command for the mount unit, I was then able to start postfix the usual way! YAY!

# systemctl unmask var.mount --runtime

From there, I just started postfix the usual way - with another systemctl command!
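Put together, the two steps look roughly like the sketch below. It assumes the unit names from this question and must run as root; unmask_and_start is a hypothetical wrapper name, and it only adds an is-active check at the end.

```shell
# Hypothetical wrapper: clear the runtime mask on a mount unit, start the
# service that depends on it, then report whether the service is active.
# It stops at the first step that fails.
unmask_and_start() {
    mount_unit=$1
    service=$2
    systemctl unmask --runtime "$mount_unit" || return 1
    systemctl start "$service" || return 1
    systemctl is-active "$service"
}

# Example (as root): unmask_and_start var.mount postfix.service
```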

Now, postfix isn't happy because dovecot isn't happy, but now at least THIS problem is behind me!

Richard T
  • --runtime will not live through another reboot, you haven't solved the issue, you've just masked it (pun intended). As @anx said, the question is unclear as to what the previous, current and desired state is. The content of /etc/fstab, and the output of ls -ld /var, journalctl -xu var.mount right after reboot would help people wanting to help you. – Ginnungagap Oct 29 '23 at 08:18
  • @Ginnungagap Thanks for the warning but as a server system, I can't just bring it down any ole time for a test. And, whoever down-voted this answer is a misguided soul: Finding this answer, even if a temporary one as you say, was very important and not easily done with a quick web search. ... When in a "production server down" situation, "the clock is ticking!" ...As for mount points? I believe the answer to this puzzle is more to be found in the reasons why the system decides to mask mount points than on the particulars of that mount point. Besides, the fstab is FAR too long and too private. – Richard T Oct 29 '23 at 14:28
  • Ah I didn't get that this was first attempted on a production system. As to why it's masking them, can you post the output of the commands I previously mentioned (skip the fstab if it's really that sensitive). Also, what does systemd-fstab-generator say if you run it manually? – Ginnungagap Oct 29 '23 at 23:21
  • @Ginnungagap Thanks... My system has 31 different systemd-(something) commands, but not that one! Other ideas? And why are people down-voting this? Because it's not a PERFECT or PERMANENT answer? I guess they've never run a production server, even though this IS the Server Fault forum! I figured most here ARE sys-admins who run production systems?! System managers of production systems understand these things, those who don't? Well... -shrug- I don't get it. – Richard T Oct 30 '23 at 15:57
  • systemd generators aren't in your PATH but in /usr/lib/systemd/system-generators. – Ginnungagap Oct 30 '23 at 18:26
  • @Ginnungagap Thanks much. ...I don't know what arguments to supply ... on a PRODUCTION system. I read the man-page and learned a good bit but don't have enough background information to know what won't be harmful... I obviously have a lot to learn about systemd! But then, I also have a lot else I have to spend time on. So, a hint or three would be appreciated! – Richard T Nov 01 '23 at 02:59
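On the question of which arguments to pass: as far as I know, systemd invokes generators with three output directories (normal, early, late), so one low-risk way to look at what the fstab generator would produce is to point all three at a throwaway directory. This is a sketch under that assumption, using the generator path given in the comments above; try_generator is a hypothetical name, and SYSTEMD_LOG_LEVEL=debug is only there to make the generator more talkative.

```shell
# Hypothetical helper: run a generator with a scratch directory as all three
# output directories, so nothing under /run/systemd is written to, then list
# whatever it generated.
try_generator() {
    gen=$1
    out=$(mktemp -d) || return 1
    # Arguments are the normal, early, and late output directories.
    SYSTEMD_LOG_LEVEL=debug "$gen" "$out" "$out" "$out"
    rc=$?
    find "$out" -mindepth 1 | sort
    return "$rc"
}

# Example:
#   try_generator /usr/lib/systemd/system-generators/systemd-fstab-generator
```

The generated .mount files can then be read with any pager and compared against /etc/fstab; the scratch directory can be deleted afterwards.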