AlmaLinux 9.1 Problem on MDADM raid1
Hi Jack, I'm sorry to bother you during the holidays.

I encountered a strange problem installing AlmaLinux 9.1 on a RAID1 (MDADM) with the following configuration:

- /boot/efi on md125
- swap on md126
- / on md127

The disks are two MLC-type SSDs.

After the installation, if I reboot the system I get:

"md: md125 stopped" (printed many times, like in a loop)

alternated with:

"systemd-shutdown[1]: Not all MD devices stopped, 1 left
Stopping MD Devices
Stopping MD /dev/md125 (9:125)"

and the system hangs in this loop until I cut the power.

I encountered this issue on my workstation with an Asus Prime Z490-A / i9-10850K. To exclude a bad SATA controller and bad cables, I tried another workstation that runs an Asus Prime Z370-A / i7-8700K; the problem shows up on the second workstation as well.

I tried to replicate this using the 9.0 ISO: the problem does not occur until I update to 9.1. I also tried 8.7; no problem there.

I also tried Rocky Linux 9.1 and got the same problem, but with different messages:

"block device autoconfig is deprecated and will removed"

alternated with:

"blkdev_get_no_open: 270 callbacks suppressed."

To stop the machine I need to cut the power.

I also tried Debian 11.5 without problems. So it seems the problem is 9.1-related. At the moment I can't test the same with RHEL 9.1, but the problem will probably occur on RHEL 9.1 too.

Is there a way to fix this, or should I wait for an update?

Thank you in advance.
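[A side note on capturing these shutdown messages: they only survive the power cut if the journal is persistent. A minimal sketch, assuming stock systemd-journald on EL9; anything systemd-shutdown prints after journald itself has stopped will still land only on the console:

mkdir -p /var/log/journal                      # with the default Storage=auto, this enables persistent logging
systemd-tmpfiles --create --prefix /var/log/journal
systemctl restart systemd-journald             # start writing to disk now

# after the next hung shutdown and power cut:
journalctl -b -1 -e                            # jump to the end of the previous boot's log
]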
On Tue, 3 Jan 2023 at 06:19, Alessandro Baggi wrote:
Hi Jack, I'm sorry to bother you during holidays.
I encountered a strange problem installing AlmaLinux 9.1 on a RAID1 (MDADM) with the current configuration:
- /boot/efi on md125
- swap on md126
- / on md127
My limited understanding is that RAID on EFI has been something of a hack, as the backing store that EFI uses is a slightly modified VFAT. What happens is that there is some code to 'clone' the data across, but it isn't really RAID1. My guess is that something in the 9.1 kernel broke that hack. Could you try the CentOS Stream 9 kernel (you can install that on your existing Alma or Rocky system) and see if the problem still occurs? If it does, then it is a bug that needs to be tracked upstream at bugzilla.redhat.com; if it doesn't, then it should be fixed in an upcoming kernel. You could then continue to use the CS9 kernel until a fixed kernel arrives in Alma/Rocky 9.
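[One way to do that, sketched; the repo path follows the usual CentOS Stream mirror layout and the GPG key URL is an assumption, so verify both against mirror.stream.centos.org and centos.org/keys before use:

cat > /etc/yum.repos.d/cs9-kernel-test.repo <<'EOF'
[cs9-baseos]
name=CentOS Stream 9 - BaseOS (kernel test only)
baseurl=https://mirror.stream.centos.org/9-stream/BaseOS/x86_64/os/
gpgcheck=1
gpgkey=https://www.centos.org/keys/RPM-GPG-KEY-CentOSOfficial
enabled=0
EOF

# pull in the Stream kernel alongside the Alma/Rocky one
dnf --enablerepo=cs9-baseos install kernel
# reboot, select the Stream kernel in GRUB, and retest the shutdown
]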
trimmed bottom.
-- Stephen J Smoogen. Let us be kind to one another, for most of us are fighting a hard battle. -- Ian MacClaren
This is software RAID1, not hardware RAID1?

Would hardware hit the same problem? The Terramaster looks like it is hardware RAID.

On 1/3/23 08:04, Stephen John Smoogen wrote:
trimmed bottom.
On Tue, 3 Jan 2023 at 08:12, Robert Moskowitz wrote:
This is software RAID1, not hardware RAID1?
Would hardware hit the same problem? the Terramaster looks like it is hardware RAID?
Hardware RAID for ESP partitions requires that the UEFI firmware understand how to talk to the hardware RAID device. If the UEFI doesn't have the right driver, it will try to talk to the hardware RAID as a raw device in a different way, and it goes to pot. [You would think that server hardware would be built to do this out of the box, but I have had a couple where hardware RAID was only supported for non-UEFI boot, or required the EFI partition to be on a separate drive.]
trimmed bottom.
-- Stephen J Smoogen. Let us be kind to one another, for most of us are fighting a hard battle. -- Ian MacClaren
On 1/3/23 09:15, Stephen John Smoogen wrote:
trimmed bottom.
Argh. I want to buy a small RAID platform for my mail server, which really needs updating. After what just happened on my NAS, I am totally sold on RAID, and for my small size, RAID1 is OK.

So I can't buy a decent small-business turnkey mail server; I need to build it. CentOS is gone, so that means AlmaLinux and probably iRedMail...

But if I get the wrong box, one that won't let AlmaLinux support RAID, I have shot a few hundred. Sigh.
trimmed bottom.
I looked some more at this Terramaster RAID box, and it is a closed system with its own OS. Not something I can install my own OS on. :(

So I ask: is there a "bare metal" RAID1 box out there I can install AlmaLinux on?

On 1/3/23 09:15, Stephen John Smoogen wrote:
trimmed bottom.
On 1/3/23 15:47, Robert Moskowitz wrote:
I looked some more at this Terramaster RAID box and it is a closed system with its own OS. Not something I can install my own OS on. :(
So I ask:
is there a "bare metal" RAID1 box out there I can install AlmaLinux on?
Accordance makes standalone internal RAID 1 subsystems (OS independent). I've used them in the very, very distant past.

https://www.accordancesystems.com/prod/products

IMHO, that may be what you really need. The ARAID system will appear as a single drive. It is "closed" firmware, like most anything of this type, but at the end of the day you probably don't care: it's a "singular drive" that just so happens to mirror to two drives.
trimmed bottom.
Hi, I have the problem on MDADM software RAID.

On 03/01/23 14:12, Robert Moskowitz wrote:
This is software RAID1, not hardware RAID1?
Would hardware hit the same problem? the Terramaster looks like it is hardware RAID?
trimmed bottom.
I just discovered that the HP ProLiant Gen8 does not have UEFI, whereas the Gen10 does.

So should I skip the older Gen8 boxen and go with the Gen10?

Or is it better, for RAID, to avoid UEFI and get the older, cheaper Gen8?

Thanks.

On 1/3/23 08:04, Stephen John Smoogen wrote:
trimmed bottom.
On Thu, 5 Jan 2023 at 08:21, Robert Moskowitz wrote:
I just discovered that the HP Proliant gen8 does not have UEFI whereas the gen10 does.
So should I skip the older gen8 boxen and go with the gen10?
Or is it better, for RAID to avoid UEFI and get the older, cheaper gen8?
Hardware which doesn't support UEFI is probably going to have issues with the EL8 or EL9 kernels in other ways too (aka an older megaraid or similar controller that is no longer supported), etc. Going by the web pages on HP ( https://techlibrary.hpe.com/us/en/enterprise/servers/supportmatrix/redhat_li... ), the Gen8 only supports RHEL 6 and RHEL 7. I am going to bet everything from the network card to the hard drive controller is EOL in EL8 and above on a Gen8.
trimmed bottom.
-- Stephen J Smoogen. Let us be kind to one another, for most of us are fighting a hard battle. -- Ian MacClaren
Thanks. This saves me time. But it will cost more. :)

On to further digging.

On 1/5/23 08:31, Stephen John Smoogen wrote:
trimmed bottom.
On 03.01.2023 13:16, Alessandro Baggi wrote:
Hi Jack, I'm sorry to bother you during holidays.
I encountered a strange problem installing AlmaLinux 9.1 on a RAID1 (MDADM) with the current configuration:
- /boot/efi on md125

Note that any EFI modification by the BIOS firmware (it can happen) or by EFI utilities (like a firmware update, or running memtest and saving the report) will corrupt and degrade your RAID. Moreover, note that chainloading in GRUB does not work with virtual devices (like md RAID).
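[A quick related check, sketched with the md125 name used in this thread: for the firmware to read an ESP that lives on md RAID at all, the array needs metadata 1.0, which puts the superblock at the end of the device so each member still looks like a plain VFAT partition:

findmnt /boot/efi                            # confirm which md device backs the ESP
mdadm --detail /dev/md125 | grep -i version
# "Version : 1.0" -> members are firmware-readable; 1.2 would not be bootable
]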
By far the easiest and safest way is to have individual ESP partitions; let's say they are mounted to /boot/efi and /boot/efi2. Then just make a systemd unit of the form:

root@hal: ~ # cat /etc/systemd/system/esp_sync.service
[Unit]
Description=Sync ESP1 to ESP2
DefaultDependencies=no
ConditionPathIsDirectory=/boot/efi/EFI/
ConditionPathIsDirectory=/boot/efi2/EFI/
After=final.target

[Service]
Type=oneshot
ExecStart=/usr/bin/cp -af /boot/efi/EFI /boot/efi2/

[Install]
WantedBy=multi-user.target

(you could also use rsync if it's guaranteed to be present)

and attach to this service a timer with:

[Timer]
OnStartupSec=40

and/or a path unit with a specification like:

[Path]
Unit=esp_sync.service
PathModified=/boot/efi/EFI/almalinux

Also make sure that /boot/efi/EFI/almalinux/grub.cfg is a stub with content like:

[root@fst09 ~]# cat /boot/efi/EFI/almalinux/grub.cfg
search --no-floppy --fs-uuid --set=dev f9c0f1b7-7f36-4b80-9563-6b2702b14c19
set prefix=($dev)/boot/grub2
export $prefix
configfile $prefix/grub.cfg

(you get the UUID from the blkid output for the md device)

This way, the content of the ESP is minimal and rarely changes, AND you have an ESP fallback in the form of the second ESP. You can put ",nofail,errors=continue" in the mount options for the ESP within the system, as the ESP is not really needed for the system to run.

HTH,
Adrian
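[For completeness, a sketch of the companion unit file for the [Timer] fragment above, assuming the esp_sync.service name used here; a timer activates the service of the same name by default:

root@hal: ~ # cat /etc/systemd/system/esp_sync.timer
[Unit]
Description=Sync ESP1 to ESP2 shortly after startup

[Timer]
OnStartupSec=40

[Install]
WantedBy=timers.target

root@hal: ~ # systemctl daemon-reload
root@hal: ~ # systemctl enable --now esp_sync.timer
]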
trimmed bottom.
--
----------------------------------------------
Adrian Sevcenco, Ph.D. |
Institute of Space Science - ISS, Romania |
adrian.sevcenco at {cern.ch,spacescience.ro} |
----------------------------------------------
I'm seeing the same on a newly installed 9.1 installation. However, my setup is different:

* /dev/md127 on / type xfs
* /dev/sda2 on /boot type xfs
* /dev/sda1 on /boot/efi type vfat
* /dev/md125 on /home type xfs
* /dev/md126 on /var type xfs

Most curiously, the message I receive is *also* about md125 not being able to be stopped. So I don't think this is related to the UEFI partition at all, as I don't have those as RAID.

The system is comprised of two 4TB Seagate IronWolf hard disks which are partitioned as seen above.

Is there more information that I can and should provide?
On 1/6/23 1:01 AM, Robert 'Bobby' Zenz wrote:
trimmed bottom.
Bobby,

The *exact* error thrown would be incredibly helpful. What is the output of these commands?

cat /proc/mdstat
mdadm --detail /dev/md125

This link: https://www.ducea.com/2009/03/08/mdadm-cheat-sheet/ will give you more information about what to expect to see and why.
The error message appears during shutdown, as in the original mail:
md: md125 stopped
Or close enough; I currently can't stop the system. The message is spammed as fast as possible, as it seems.

Output of the commands is as follows:

# cat /proc/mdstat
Personalities : [raid1]
md125 : active raid1 sda3[0] sdb1[1]
      3756352512 blocks super 1.2 [2/2] [UU]
      bitmap: 2/28 pages [8KB], 65536KB chunk

md126 : active raid1 sda5[0] sdb3[1]
      58592256 blocks super 1.2 [2/2] [UU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md127 : active raid1 sda4[0] sdb2[1]
      73399296 blocks super 1.2 [2/2] [UU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>

# mdadm --detail /dev/md125
/dev/md125:
           Version : 1.2
     Creation Time : Sat Nov 19 14:58:50 2022
        Raid Level : raid1
        Array Size : 3756352512 (3.50 TiB 3.85 TB)
     Used Dev Size : 3756352512 (3.50 TiB 3.85 TB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent
     Intent Bitmap : Internal
       Update Time : Fri Jan  6 11:21:50 2023
             State : clean
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0
Consistency Policy : bitmap
              Name : arkham:home (local to host arkham)
              UUID : ad23388f:4e227a6c:2b3d141a:7a5f2338
            Events : 9621

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       17        1      active sync   /dev/sdb1

As I've said, I currently can't stop the system to see if it still happens. But I've seen it at least twice after the initial setup (and after the RAID had finished its initial sync).
Hi Bobby, I also tried RHEL 9.1 and got the same problem.

On 06/01/23 12:49, Robert 'Bobby' Zenz wrote:
trimmed bottom.
Hi, I also tried CentOS Stream 9 and got the same problem.

On 09/01/23 15:59, Alessandro Baggi wrote:
trimmed bottom.
Can you test-install elrepo's kernel-ml? It is currently at version 6.2.0.
https://elrepo.org/linux/kernel/el9/x86_64/RPMS/
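[The usual sequence for that, sketched; the release RPM name follows elrepo's standard layout, so check the URL above for the current package:

rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
dnf install https://www.elrepo.org/elrepo-release-9.el9.elrepo.noarch.rpm
dnf --enablerepo=elrepo-kernel install kernel-ml
# reboot into the kernel-ml entry and retest the shutdown behaviour
]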
Akemi
On Wed, Jan 11, 2023 at 6:28 AM Alessandro Baggi wrote:
trimmed bottom.
participants (9)
- Adrian Sevcenco
- Akemi Yagi
- Alessandro Baggi
- Bruce Ferrell
- bw9677249@gmail.com
- Christopher Cox
- Robert 'Bobby' Zenz
- Robert Moskowitz
- Stephen John Smoogen