Hello, I am Jonathan Wright, Infrastructure Team Lead for AlmaLinux. I
manage most of the plumbing that keeps things humming along smoothly, and
I’ve been working on improvements to parts of it to make things more
user-friendly for our community.
AlmaLinux values transparency <https://wiki.almalinux.org/Transparency.html>
and communal decision making; it’s one of the reasons I decided to become
a contributor. As part of the work I’m doing, I’d like to request feedback
from the community on a proposal to enable `dnf countme`, similar to the
way the Fedora project does.
countme is a core feature of DNF implemented upstream in Fedora 32 (dnf
4.2.9). The DNF documentation describes it as follows:
Determines whether a special flag should be added to a single, randomly
chosen metalink/mirrorlist query each week. This allows the repository
owner to estimate the number of systems consuming it, by counting such
queries over a week's time, which is much more accurate than just counting
unique IP addresses (which is subject to both overcounting and
undercounting due to short DHCP leases and NAT, respectively).
The flag is a simple "countme=N" parameter appended to the metalink and
mirrorlist URL, where N is an integer representing the "longevity" bucket
this system belongs to. The following 4 buckets are defined, based on how
many full weeks have passed since the beginning of the week when this
system was installed: 1 = first week, 2 = first month (2-4 weeks), 3 = six
months (5-24 weeks) and 4 = more than six months (> 24 weeks). This
information is meant to help distinguish short-lived installs from
long-term ones, and to gather other statistics about system lifecycle.
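To make the bucket definitions above concrete, here is a minimal Python
sketch. It is not DNF's actual implementation; the function names, the
1-based week numbering and the example mirrorlist URL are assumptions made
for illustration only.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def countme_bucket(week_number):
    """Map a 1-based week number since install to a longevity bucket (1-4).

    Assumption: week_number is 1 during the week the system was installed,
    2 during the following week, and so on.
    """
    if week_number < 1:
        raise ValueError("week_number must be >= 1")
    if week_number == 1:
        return 1  # first week
    if week_number <= 4:
        return 2  # first month (weeks 2-4)
    if week_number <= 24:
        return 3  # six months (weeks 5-24)
    return 4      # more than six months (> 24 weeks)


def with_countme(url, bucket):
    """Append countme=<bucket> to a mirrorlist URL, keeping existing parameters."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("countme", str(bucket)))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical mirrorlist URL, used only to show the shape of the request.
print(with_countme("https://mirrors.example.org/mirrorlist/8/baseos",
                   countme_bucket(30)))
```

Note that the only thing added to the request is the bucket number; nothing
in it identifies the individual system.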
countme was designed with privacy in mind and does not add any identifying
or unique information to requests so there is no tracking involved. Just a
simple “hello” to the repository.
Currently, AlmaLinux does not track any usage statistics for our
distribution. We could technically aggregate basic metrics from HTTP logs
on our mirrorlist servers, but the data would be unreliable because
counting unique IPs is undermined by things like NAT and dynamic
addressing. So I’d like to propose that we set “countme=1” in our
repository configs, just as Fedora and EPEL have done. I’d also like to
propose that the aggregated data be made publicly available for the
community to see, similar to https://data-analysis.fedoraproject.org/.
I’ve set up a form for feedback at https://forms.gle/BShXoxJmsjNbMXCk6 in
case you’d like to give any input on this proposal. We will keep this form
open for about a week.
FAQ:
Q: When are “countme” requests sent?
A: Once a week, at random, during normal dnf activity. The flag is only
added to a mirrorlist request that dnf would make anyway (for example
during makecache, install or update); it will NOT cause dnf to go out of
its way and make special requests.
Q: What extra data will be sent that is not currently collected?
A: “countme=X” will be added to one randomly chosen mirrorlist request
from DNF each week, where X is a number from 1 to 4 representing the
longevity bucket your system falls into (roughly how long it has been
installed), not an exact count of weeks. See the DNF documentation quoted
above for the bucket definitions.
Q: Will aggregated data be made publicly available?
A: Yes.
Q: What data do you use?
A: The only data we look at is in the HTTP request itself. Our log lines
are in the standard Combined Log Format. For example (a single log line):
172.30.61.81 - - [15/Dec/2021:17:02:12 +0000] "GET /mirrorlist/8/baseos?countme=4 HTTP/1.1" 200 629 "-" "libdnf (AlmaLinux 8.3; generic; Linux.x86_64)"
We only look at log lines where the request is "GET", the query string
includes "countme=N", the result is 200 or 302, and the User-Agent string
matches the libdnf User-Agent header.
The only data we use are the timestamp, the query parameters (repo, arch,
countme), and the libdnf User-Agent data.
In the future we will also aggregate data by country using GeoIP. Our
processing and aggregation does not care about IPs themselves or their
uniqueness. When we implement the aggregation of geographic data it will
use MaxMind’s GeoIP database locally to turn the IP into a region which
will be used for tallying generalized metrics for that region.
Raw access logs are archived so that, if we find major issues in any of
our processing, we can re-parse the data in the future and correct the
published statistics.
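As an illustration of the filtering rules in this answer, here is a rough
Python sketch. It is not our production pipeline; the regular expression,
the function names and the choice of ISO calendar weeks for grouping are
all assumptions made for the example.

```python
import re
from collections import Counter
from datetime import datetime
from urllib.parse import urlsplit, parse_qs

# Combined Log Format:
#   host ident user [time] "request" status bytes "referer" "user-agent"
# The client address is matched but deliberately not captured.
LOG_RE = re.compile(
    r'^\S+ \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)


def parse_countme(line):
    """Return the fields we use from one log line, or None if it is filtered out."""
    m = LOG_RE.match(line.strip())
    if m is None:
        return None
    # Keep only GET requests answered with 200 or 302.
    if m["method"] != "GET" or m["status"] not in ("200", "302"):
        return None
    # Keep only requests whose User-Agent looks like libdnf.
    if not m["agent"].startswith("libdnf"):
        return None
    url = urlsplit(m["target"])
    values = parse_qs(url.query).get("countme", [])
    # Keep only requests carrying exactly one valid countme bucket.
    if len(values) != 1 or values[0] not in ("1", "2", "3", "4"):
        return None
    return {
        "time": datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z"),
        "path": url.path,
        "countme": int(values[0]),
        "agent": m["agent"],
    }


def weekly_tally(lines):
    """Count matching requests per (ISO year, ISO week, countme bucket)."""
    tally = Counter()
    for line in lines:
        record = parse_countme(line)
        if record is not None:
            year, week, _ = record["time"].isocalendar()
            tally[(year, week, record["countme"])] += 1
    return tally
```

The record returned by parse_countme contains no IP address, which mirrors
the point above that the aggregation does not depend on IPs or their
uniqueness.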
Q: Can I opt out?
A: Yes, but we’d prefer you didn’t, since the data is very helpful. The
only extra data you’ll be submitting is “countme=X” in one request per week.
If you’d like to opt out you can comment out the “countme=1” line in the
repository config files in /etc/yum.repos.d/
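For anyone who would rather script the opt-out than edit files by hand,
here is a small Python sketch of the same change. It only transforms text
passed to it; the function name and the sample repository section are
hypothetical, and reading or writing the real files is left to you.

```python
import re

# Matches a line that consists only of "countme=1" (optional whitespace allowed).
COUNTME_LINE = re.compile(r"^(\s*)(countme\s*=\s*1\s*)$")


def comment_out_countme(repo_text):
    """Return repo_text with every active countme=1 line commented out."""
    lines = [COUNTME_LINE.sub(r"\1#\2", line) for line in repo_text.splitlines()]
    return "\n".join(lines) + "\n"


# Hypothetical repository section, for illustration only.
sample = "[baseos]\nname=AlmaLinux $releasever - BaseOS\nenabled=1\ncountme=1\n"
print(comment_out_countme(sample))
```

Lines that are already commented out are left alone, so running it twice
gives the same result.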
Discussion for this should be directed to the AlmaLinux Infrastructure
mailing list. You can join the list at
https://lists.almalinux.org/mailman3/lists/infra.lists.almalinux.org/
--
Jonathan Wright
AlmaLinux Foundation
Mattermost: chat <https://chat.almalinux.org/almalinux/messages/@jonathan>
Hello HPC fans. A few weeks ago we updated our images on Microsoft Azure to
8.5. Today we are proud to announce the general availability of AlmaLinux
OS for Azure HPC
<https://azuremarketplace.microsoft.com/en-us/marketplace/apps/almalinux.alm…>
based on AlmaLinux OS 8.5. This was a tremendous team effort, starting from
community requests and developed in collaboration between the AlmaLinux
community and the team at our amazing sponsor, Microsoft Azure.
We had a couple of goals when we set out on this project. The first is to
further enable scientific computing and research. That is something we
feel very strongly about, and many across the vast expanse of scientific
research have begun leveraging the cloud because it removes resource
constraints. It is much easier to do your work, whether it be simulations
or something else, if you can add nodes instantly with a quick command in
your terminal. Second, HPC is often out of reach for many due to the large
investment that is typically required. Combining the benefits of cloud
economies of scale with HPC technology opens the door for many to
experience what HPC is and what it can really do. We hope that it will
inspire those already there to explore what more they can accomplish and
inspire the next generation of scientists and researchers to begin
experimenting.
You can now take advantage of Azure’s leading-edge H-series CPU and
N-series GPU based instances, along with RDMA support and low-latency
networking, to power all of your cloud-based HPC workloads: large-scale
computation, circuit design, fluid dynamic analysis, natural resource
exploration and many others.
The image includes a suite of the most popular HPC tools and libraries
pre-installed, including the NVIDIA/Mellanox OFED drivers, InfiniBand-based
MPI libraries such as HPC-X and OpenMPI, the communication runtimes
Libfabric and OpenUCX, the AMD BLIS, FFTW and FLAME libraries, the Intel
oneAPI Math Kernel Library, and a host of other domain-specific libraries
and utilities.
Join us <https://chat.almalinux.org>. Our team is always on the lookout for
more contributors. Report any bugs you may see on the Bug Tracker
<https://bugs.almalinux.org>. Join the AlmaLinux Community Chat
<https://chat.almalinux.org> or our HPC Forum
<https://almalinux.discourse.group/c/sigs/hpc/31> if you need any help,
want to post a question, or just want to hang out. Reach us on Reddit
<https://reddit.com/r/almalinux> and on Twitter
<https://twitter.com/almalinux>.
–Happy Hacking
About two months ago we eagerly announced that we were opening
memberships for the AlmaLinux OS Foundation
<https://almalinux.org/blog/what-almalinux-foundation-membership-means-for-y…>.
I outlined why I thought it was important that we’d structured the
governance of the project this way: to give community members true
ownership and a direct voice in the direction of AlmaLinux OS. The
reception was tremendous. We’ve had hundreds of applicants and have so far
approved, I believe, over 100 community members and dozens of mirror
members. It’s very encouraging and refreshing to see the community rise to
the occasion. I salute each and every person who has been active in our
wonderful community, has helped to make it such a welcoming, friendly and
diverse place, and has taken the step to make their voice heard.
We also have a membership tier for corporations who would like to
participate in governance. We know that thousands (and probably tens of
thousands) of groups, organizations and companies relied on the old CentOS
Linux for a wide array of reasons, and we assumed that they would like more
direct participation in whatever solution they chose to move forward with.
We’ve spoken to several already and you’ll be hearing more about them
shortly, but today we are here to celebrate Codenotary
<https://codenotary.com/>, our first platinum member!
You can read more about their take on the Codenotary Blog
<https://codenotary.com/blog/codenotary-joins-almalinux-foundation>.
Codenotary is a pioneer in DevSecOps and immutability
(https://github.com/codenotary/immudb) and offers developer-friendly tools
(check out the Community Attestation Service, https://cas.codenotary.com/,
amongst others) that make truly verifiable software supply chain security a
reality today. They have a far-reaching vision and care deeply about
ensuring that the software you run, build and use is software that you (and
your users) can trust. Behind Codenotary also stands Moshe Bar, who is an
open source visionary and brought Xen and KVM to the world. He’s been a
lifelong staunch supporter of the Open Source community and his vote of
confidence means the world to us. Codenotary’s support will help us to
continue delivering for the community day in and day out, and we’ll be
working together to continue to grow the CentOS ecosystem and make sure
AlmaLinux is the secure base you can build your future on.
This moment is also a further step towards independence for the AlmaLinux
OS Foundation. As many people know, CloudLinux <https://cloudlinux.org/>
were the ones that kicked off the AlmaLinux OS project and foundation. The
intention always was to get it up on its feet so that it could be a truly
independent body governed by and serving the community. Igor Seletskiy’s
bold move of stepping down from the board
<https://blog.cloudlinux.com/why-i-have-decided-to-step-down-from-the-almali…>
was the first step towards ensuring that no single entity has an
overwhelming amount of control over the project.
For the next step towards independence, I will be transitioning out of
CloudLinux and into a role at Codenotary, and they will be sponsoring my
work. I will retain my position on the board too, so don’t worry: you can’t
shake me that easily. This leaves CloudLinux with just one board seat until
elections are held. As for me, nothing changes. I’ll continue to make sure
that we deliver everything we promise, and more, for the community. That’s
why we are here. It’s our reason for doing it every day. It is an honor to
serve the community.
So it’s not New Year’s yet, but don’t mind if we break out the champagne a
little early. On behalf of the whole AlmaLinux OS Foundation community and
the board (benny, Simon, Jesse, Eugene and myself), we would like to thank
and toast Moshe Bar and the whole Codenotary team. We are immeasurably
thankful for your support.