Hosting Package Mirrors for Fun:

Alternative Title: Pushing the Definition of Unlimited Bandwidth: A Guide to Making Your ISP Hate You.


Introduction


With Linux backing the world's servers and desktops, and featuring a large package ecosystem, distributing terabytes of packages with low latency and high bandwidth is a challenging task. This is accomplished using a distributed network of “mirrors”.

What are the benefits of running one?

There aren’t a ton of benefits besides helping the open source community.

How to host one:


Requirements:

If you are going to host a mirror, there are a few requirements you must meet.

The storage space required will vary vastly depending on what you choose to mirror. At the time of writing, Arch Linux takes ~65 GB and Ubuntu takes ~1.5 TB. Your storage should exceed the size of the repositories you want to mirror to allow room for growth, and should ideally sit on ZFS or RAID 5 or better so that a disk failure does not cause downtime.
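Before starting a first sync it can be worth checking that the target filesystem actually has the headroom. The following is a minimal sketch, not part of my own setup: has_room is a hypothetical helper name, and the 100 GiB figure in the usage line is only an illustrative number chosen to leave growth room above Arch's ~65 GB.

```shell
# has_room DIR NEEDED_GIB: succeed if the filesystem holding DIR
# has at least NEEDED_GIB GiB of free space.
has_room() {
    dir=$1
    needed_gib=$2
    # df -P prints one POSIX-format line per filesystem;
    # with -k, column 4 is the available space in KiB.
    avail_kib=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    needed_kib=$((needed_gib * 1024 * 1024))
    [ "$avail_kib" -ge "$needed_kib" ]
}
```

Usage might look like: has_room /srv/mirrors 100 || echo "not enough space for this mirror"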

You will use a lot of bandwidth hosting mirrors. If you are on a capped bandwidth plan, this is not for you. At the time of writing, my reverse proxy is seeing ~800 GiB/day (stats can be seen here). You will also use a non-insignificant amount of bandwidth downloading the mirrors themselves.
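To put a daily total like that into the units ISPs advertise, it helps to convert it to an average line rate. The sketch below does the arithmetic; avg_mbps is a hypothetical helper name, and the result is a 24-hour average, so real peaks will be higher.

```shell
# avg_mbps GIB_PER_DAY: print the average rate in Mbit/s that
# corresponds to a daily transfer volume given in GiB.
avg_mbps() {
    # GiB -> bytes (x 1024^3) -> bits (x 8), spread over 86400 seconds,
    # then scaled to megabits (/ 10^6). LC_ALL=C keeps the decimal point a dot.
    LC_ALL=C awk -v gib="$1" \
        'BEGIN { printf "%.1f\n", gib * 1024 * 1024 * 1024 * 8 / 86400 / 1000000 }'
}

avg_mbps 800   # prints 79.5
```

In other words, ~800 GiB/day works out to roughly 80 Mbit/s sustained around the clock.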

Dedication is the last, and most important, part. If you are not in this for the long haul, do not do this. People configure their systems to use the closest mirrors, and while some select mirrors automatically, anyone who has selected yours exclusively can run into serious issues if it goes down someday.

Choosing what to mirror:

Choosing what to mirror is the first thing to decide. Some of the things I personally mirror are the following:

Mirroring the Repo:

Once you have decided on what to mirror, you need to create a copy of the files. Locate the Tier 1 rsync mirror for the repo and create a copy of it. The following script is an example of how I mirror Alpine:

#!/usr/bin/env sh

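# make sure we never run 2 rsync at the same time:
# on first entry $flock is unset, so re-exec this script under flock(1);
# -n makes a second instance exit immediately instead of waiting
lockfile="/tmp/alpine-mirror.lock"
if [ -z "$flock" ] ; then
  exec env flock=1 flock -n "$lockfile" "$0" "$@"
fi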

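src=rsync://mirror.dst.ca/alpine
dest=/srv/mirrors/alpine
# optional extra rsync arguments (e.g. --exclude patterns), taken from the
# environment if set; empty by default
exclude="${exclude:-}"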

mkdir -p "$dest"
/usr/bin/rsync \
        --archive \
        --update \
        --hard-links \
        --delete \
        --delete-after \
        --delay-updates \
        --timeout=600 \
        $exclude \
        "$src" "$dest"

This can take a while, depending on how large the repo is and on your connection speed.
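The lock at the top of the script matters most once the sync runs on a schedule, since a slow run could otherwise overlap with the next one. One way to schedule it is a cron entry. The sketch below only prints such an entry; cron_line is a hypothetical helper, and the script path /usr/local/bin/alpine-mirror.sh, the hourly interval, and the file name /etc/cron.d/alpine-mirror are all assumptions. Check the sync frequency your upstream asks for before picking an interval.

```shell
# cron_line MINUTE SCRIPT: print an /etc/cron.d-style entry that runs
# SCRIPT as root once an hour, at MINUTE past the hour.
cron_line() {
    minute=$1
    script=$2
    # Field order in /etc/cron.d: minute hour day-of-month month day-of-week user command
    printf '%s * * * * root %s\n' "$minute" "$script"
}
```

Usage might look like: cron_line 17 /usr/local/bin/alpine-mirror.sh > /etc/cron.d/alpine-mirror. Picking a minute other than 0 avoids hitting the upstream at the same moment as everyone else.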

Sharing the files

The main methods for making your files accessible to others are rsync and HTTP, with FTP as a third option. Commonly HTTP and rsync are required, with FTP being optional.

NGINX:

NGINX is a high-performance web server that is very well suited to this application. You can install it with apt install -y nginx, which will create a web server on port 80 (HTTPS on port 443 needs a certificate; see the note on SSL below). The next step is editing the configuration files located in /etc/nginx/sites-available. The following is an example:

server {
        listen 80; # Bind to port 80 ipv4
        listen [::]:80; # bind to port 80 ipv6
        root /srv/mirrors/; # set the root directory
        index index.html; # Sets the default page to return

        location / {
                try_files $uri $uri/ =404; # Will try the files in the URI name, if not found return 404

                autoindex on; # Show listing for folders
        }
}
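On Debian-style layouts, a file in sites-available only takes effect once it is linked into sites-enabled. The following is a minimal sketch of that step, assuming the config above was saved as /etc/nginx/sites-available/mirrors (a hypothetical file name); enable_site is likewise a hypothetical helper.

```shell
# enable_site NAME [CONF_ROOT]: symlink sites-available/NAME into
# sites-enabled. CONF_ROOT defaults to /etc/nginx.
enable_site() {
    name=$1
    conf_root=${2:-/etc/nginx}
    if [ ! -f "$conf_root/sites-available/$name" ]; then
        echo "no such site: $name" >&2
        return 1
    fi
    # -f replaces a stale link left over from an earlier attempt
    ln -sf "$conf_root/sites-available/$name" "$conf_root/sites-enabled/$name"
}
```

Usage might look like: enable_site mirrors && nginx -t && systemctl reload nginx. Running nginx -t first checks the configuration before the reload. If the stock default site still answers requests on port 80, removing /etc/nginx/sites-enabled/default is a common way around that.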

RSync:

rsync is another popular method of sharing files and is preinstalled on most *NIX systems. We need to create a config file at /etc/rsyncd.conf. The following is an example config:

[alpine]
        # Description shown in the module listing
        comment = Alpine Linux Mirror
        # Directory to serve
        path = /srv/mirrors/alpine/
        # Allow anyone to connect
        hosts allow = *
        #hosts deny = *
        # Show this module when clients list modules
        list = true
        uid = www-data
        gid = www-data
        # Don't allow anyone to write
        read only = yes
        use chroot = yes
        # Skip compression for all files, for better performance
        dont compress = *

We can start the rsync daemon by running rsync --daemon

(Optional) Note on SSL

If you wish to add SSL/HTTPS to your site, check out certbot to do this for free.

Port Forwarding:

You will need to forward ports 80 and 443 (HTTP/HTTPS) and 873 (rsync). This process is highly specific to your router model, so I cannot possibly cover every option. However, if you search for “(router name) port forwarding”, a guide should come up.

(Optional) Monitoring Bandwidth:

Monitoring how much bandwidth your ISP-killer package mirror uses can be a great way to see how much you're actually doing.

One option for doing this is vnStat, which produces graphs like the following:

(Image: bandwidth statistics from vnStat)

Setting up Interface monitoring

First off, you need to see which interface you are using. This can be found with ip a; in many cases it will be eth0.

Install vnStat: apt install -y vnstat

Configure your interface to be monitored: vnstat -u -i (interface)
(Newer vnStat 2.x releases dropped the -u option; there, vnstat --add -i (interface) does the same job.)

Start the service: systemctl enable vnstat && systemctl start vnstat

You can view the bandwidth consumption from the terminal by running vnstat

Adding a webUI / Images

Install the needed packages: apt install -y vnstati fcgiwrap

Add the following configuration to nginx:

        location ~ \.cgi$ {
                fastcgi_pass  unix:/var/run/fcgiwrap.socket;
                # Fastcgi parameters, include the standard ones
                include /etc/nginx/fastcgi_params;
                # Adjust non standard parameters (SCRIPT_FILENAME)
                fastcgi_param SCRIPT_FILENAME  $document_root$fastcgi_script_name;
        }

Add the CGI script to view traffic:

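# Fetch the raw script (not the GitHub HTML page); -P sets the download directory
wget https://raw.githubusercontent.com/vergoh/vnstat/master/examples/vnstat.cgi -P /srv/mirrors # Edit to the root directory of your server
# fcgiwrap can only run the script if it is executable
chmod +x /srv/mirrors/vnstat.cgi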

Update the interface in the .cgi script if yours is not eth0.

Getting your mirror added as official:

The steps for having your mirror become an official source for the distribution/package of your choice vary from distro to distro. However, most have a wiki page on it, and they generally follow the same steps of sending an email or opening a ticket. Here are the wikis for two mirrors as an example.

Conclusion

Welcome to being a part of the team of mirrors keeping the open source ecosystem afloat. If you have any questions or are unsure about this process, please feel free to message me using the contact info found in the /contact-me tab in the header.