use mirrorbrain for download management instead of broken mirror-choosing by timezone
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
RPM | New | Undecided | Unassigned | ||
Mandriva | Confirmed | Wishlist |
In bugs.mageia.org/ #3166, Lohmaier+mageia (lohmaier+mageia) wrote : | #1 |
In bugs.mageia.org/ #3166, Rdalverny (rdalverny) wrote : | #2 |
Can this be deployed alongside the current setup?
In bugs.mageia.org/ #3166, Lohmaier+mageia (lohmaier+mageia) wrote : | #3 |
yes - mirrorbrain is an apache module; the individual mirrors would still get the stuff via rsync, and mirrorbrain is not involved in that part. All mirrors can be used individually as they are now. So if you have a server that can run apache and has the mirrored files, you have all that is needed to set it up.
The only thing that needs to change would be that the "mirrorlist" method would only return one single URL, namely the mirrorbrain one. (or if you want to play ultra, ultra safe, just provide the mirrorbrain url as additional mirror and advertise it on the webpage/forum/blogs instead of "forcing" it on users as an initial step, and do the switch of the mirrorlist method when you gained confidence in it)
LibreOffice as well as OpenOffice.org before for example also use mirrorbrain.
You got one single URL per download:
but depending on where the access is done from, the user will be redirected to the actual mirror. You can append a ".mirrorlist" to see what choice it did make, and what alternative mirrors you could use (as well as md5, sha1sum and some additional stuff)
(SuSE also uses it for its mirroring btw)
For users who already have a manual mirror configured, nothing will change, this is still possible.
Mirrorbrain won't interfere with the distribution of files to the mirrors and won't remove the option to manually specify a mirror. You can do a smooth transition as described above (set it up and advertise mirrorbrain as experimental method, i.e. invite users to change their media sources to the mirrorbrain URL, and after some time change the mirrorlist definition to only include the mirrorbrain URL), but it really doesn't demand many resources, so you don't need to be afraid of it generating too much load.
In bugs.mageia.org/ #3166, Marja11 (marja11) wrote : | #4 |
Sounds good :)
@ remmy
Just curious... WDYT?
In bugs.mageia.org/ #3166, Remco Rijnders (remco-p) wrote : | #5 |
It sounds good to me. One thing I notice though is this in the FAQ on mirrorbrain:
Is only HTTP supported?
No — FTP mirrors are fully supported, in addition to HTTP. Furthermore, BitTorrent can be integrated via Metalinks.
To scan mirrors for their content, rsync is used. It is the most efficient method for that purpose. However, if rsync isn't available on a mirror, FTP and HTTP can be used as fallback.
However, looking at http://
In bugs.mageia.org/ #3166, Lohmaier+mageia (lohmaier+mageia) wrote : | #6 |
in terms of server load or server resources it doesn't really matter - but if it cannot use rsync to scan a mirror if it carries the files, mirrorbrain has to parse the returned html & has to manually iterate over all directories. So if you got hundreds of directories, hundreds of directory listings have to be parsed.
And while mirrorbrain knows about the "big" webserver implementations (apache, etc), there might be modified directory listings in use by some of your mirrors that could make mirrorbrain's html-parsing fail (and thus you might have to tweak mirrorbrain a little to accommodate for those mirrors)
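A minimal sketch of the HTML fallback described above, to show why it is more fragile than an rsync scan: every directory listing has to be fetched and its links picked apart. This is not mirrorbrain's own scanner; the `ListingParser` class, the `parse_listing` helper and the sample page are made up for illustration, and a mirror with a customised listing would need different filtering.

```python
from html.parser import HTMLParser


class ListingParser(HTMLParser):
    """Collect entry names from an Apache-style directory listing (illustrative)."""

    def __init__(self):
        super().__init__()
        self.dirs = []
        self.files = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip column-sort links ("?C=N;O=D"), absolute links and the parent directory.
        if not href or href.startswith(("?", "/", "../")) or "://" in href:
            return
        if href.endswith("/"):
            self.dirs.append(href)   # must be fetched and parsed in turn
        else:
            self.files.append(href)


def parse_listing(html):
    """Return (subdirectories, files) found in one listing page."""
    parser = ListingParser()
    parser.feed(html)
    return parser.dirs, parser.files


# Hypothetical listing page; the paths are not taken from a real mirror.
SAMPLE = """
<html><body><h1>Index of /distrib/example</h1>
<a href="?C=N;O=D">Name</a>
<a href="/distrib/">Parent Directory</a>
<a href="x86_64/">x86_64/</a>
<a href="VERSION">VERSION</a>
</body></html>
"""

dirs, files = parse_listing(SAMPLE)
print(dirs, files)
```

Each entry in `dirs` means one more HTTP request and one more page to parse, which is where the "hundreds of directory listings" come from.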
In bugs.mageia.org/ #3166, Rdalverny (rdalverny) wrote : | #7 |
That's about 50% of current mirrors that provide rsync. We can deploy this first for mirrors providing rsync and see later what we do.
In bugs.mageia.org/ #3166, Andre999mga (andre999mga) wrote : | #8 |
(In reply to comment #6)
> That's about 50% of current mirrors that provide rsync. We can deploy this
> first for mirrors providing rsync and see later what we do.
Note that about 2/3 of the faster (>1 Gb/s) mirrors have rsync.
Having a setup that first downloads from a default mirror, and then uses rsync from other mirrors for any missing packages, would be efficient.
Note that downloading a single large package (such as Openoffice or Libreoffice) is not the same problem as downloading a multitude of relatively small packages, which would be typical of most Mageia downloads. Excepting ISO's, of course.
The geolocation of mirrorbrain could be useful for deciding the initial mirror. Just don't use the directory download if rsync is not available on this initial mirror.
In bugs.mageia.org/ #3166, Lohmaier+mageia (lohmaier+mageia) wrote : | #9 |
to be clear on this: the scanning will have no impact on the regular user. the user will never be provided an rsync URL by mirrorbrain - mirrorbrain will hand out the http-URL (or in case of a ftp-only mirror the ftp-URL).
scanning is used to check "Ah, this mirror has file path/to/X", so when a user requests mirrorbrain.
What André suggests is a different downloading system, in my opinion once again "too smart"/not worth the effort.
rpm packages in general are small, so using download techniques that request the very same file from different mirrors is pointless/waste of resources. Better to run downloads of multiple different files in parallel.
In the same vein, using rsync to get a set of files from a single mirror is also not really that much of an improvement (always assuming the use case of installing updates, which is what mirrors are mostly used for - the number of packages to update is usually not that big, except maybe after initial installation) - if your line is fat, I'd rather use parallel downloads from multiple mirrors.
But having written all that: This is out-of-scope of this issue. Changing the download-
In bugs.mageia.org/ #3166, Jeff Johnson (n3npq) wrote : | #10 |
tracked at https:/
tags: | added: mageia repo |
In bugs.mageia.org/ #3166, Rdalverny (rdalverny) wrote : | #11 |
(In reply to comment #9)
> tracked at https:/
Off-topic, but... what for?
Changed in mandriva: | |
importance: | Unknown → Medium |
status: | Unknown → Confirmed |
In bugs.mageia.org/ #3166, Jeff Johnson (n3npq) wrote : | #12 |
for ROADMAP planning @rpm5.org. bugs == bugs wrto *.rpm packaging
In bugs.mageia.org/ #3166, Remco Rijnders (remco-p) wrote : | #13 |
@Romain
Can we assign this to you or the webteam?
In bugs.mageia.org/ #3166, Rdalverny (rdalverny) wrote : | #14 |
I won't be able to work on this for some time. You can assign it to webteam, it's left for someone to pick it up.
In bugs.mageia.org/ #3166, Marja11 (marja11) wrote : | #15 |
(In reply to comment #13)
> I won't be able to work on this for some time. You can assign it to webteam,
> it's left for someone to pick it up.
Thx, assigning
In bugs.mageia.org/ #3166, Marja11 (marja11) wrote : | #16 |
Hi,
This bug was filed against cauldron, but we do not have cauldron at the moment.
Please report whether this bug is still valid for Mageia 2.
Thanks :)
Cheers,
marja
In bugs.mageia.org/ #3166, Rdalverny (rdalverny) wrote : | #17 |
This is not relevant to Cauldron only, changing product.
Changed in mandriva: | |
importance: | Medium → Wishlist |
In bugs.mageia.org/ #3166, Marja11 (marja11) wrote : | #18 |
(In reply to comment #13)
> I won't be able to work on this for some time. You can assign it to webteam,
> it's left for someone to pick it up.
(In reply to comment #16)
> This is not relevant to Cauldron only, changing product.
@ Romain,
When the product was changed (btw, thanks for doing that), the assignee was automatically changed along with it. The assignee is now sysadmin-bugs. Do you want it to be assigned back to webteam, or to stay assigned to sysadminteam?
In bugs.mageia.org/ #3166, Thierry-vignaud (thierry-vignaud) wrote : | #19 |
Anyway we need some support on the web site prior to being able to test patches...
In bugs.mageia.org/ #3166, Rdalverny (rdalverny) wrote : | #20 |
What support do you need from the web site? (or do you mean, support from an installed mirrorbrain instance?)
In bugs.mageia.org/ #3166, Thierry-vignaud (thierry-vignaud) wrote : | #21 |
Yes.
@Christian Lohmaier: For geolocation vs timezone picking of mirrors, I think it would be best to just do geolocation rather than asking one to manually pick one's location.
In bugs.mageia.org/ #3166, Lohmaier+mageia (lohmaier+mageia) wrote : | #22 |
@Thierry: Picking timezone as in the bug mentioned in the initial description has nothing to do with mirrorbrain.
Mirrorbrain does geolocation by IP, so depending on timezone is not necessary at all.
I did create the patch for https:/
Currently, the $MIRRORLIST method uses the system's timezone-city as reference as to what mirror to use, and this is stupid, at least in central Europe (where there are many mirrors that are much closer than your country's capital city)
To make it clear: Mirrorbrain doesn't depend on any user-configured stuff. It decides based on the IP that is used what mirror to choose. It does geolocation by IP.
Ordered from worst to best:
* location reference point is taken from the timezone (urpmi as it is now)
* user has the option to manually specify his actual location (urpmi with patch, patch is available)
* user doesn't have to bother, but closest mirror is assigned by having the download-server examine the IP that is used to connect (mirrorbrain on server, no patch to urpmi necessary, but no mirrorbrain installed yet)
If you meant that urpmi should query the geolocation on the user's machine and use that instead, this would be an alternative method, but of course without the other benefit that mirrorbrain would bring. And this way would also require Mageia to setup an appropriate service that would return the location on request (or you would have to require a geolocation-
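To illustrate the difference between a timezone city and actual location, here is a rough sketch of the "closest mirror" step once a client IP has been resolved to coordinates. The geo-IP lookup itself is assumed to have happened already, the mirror names and coordinates are invented, and this is not how mirrorbrain implements it internally.

```python
import math

# Hypothetical mirrors with (latitude, longitude); not real Mageia mirrors.
MIRRORS = {
    "mirror-paris.example": (48.86, 2.35),
    "mirror-berlin.example": (52.52, 13.40),
    "mirror-tokyo.example": (35.68, 139.69),
}


def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def closest_mirror(client_location, mirrors=MIRRORS):
    """Return the name of the mirror nearest to the client's coordinates."""
    return min(mirrors, key=lambda name: distance_km(client_location, mirrors[name]))


# A client whose IP was (by assumption) resolved to Munich.
print(closest_mirror((48.14, 11.58)))
```

The point of the sketch is only that the reference point comes from the connection itself, so nothing has to be configured on the user's machine.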
In bugs.mageia.org/ #3166, Thierry-vignaud (thierry-vignaud) wrote : | #23 |
That's exactly what I wrote.
Using mirrorbrain is totally orthogonal and has nothing to do with the patches you posted for now
In bugs.mageia.org/ #3166, Pascal Terjan (pterjan42) wrote : | #24 |
*** Bug 11454 has been marked as a duplicate of this bug. ***
Per Øyvind Karlsen (proyvind) wrote : | #25 |
@Lohmaier, urpmi is doing user side geolocalization based on user selected timezone, removing the need for carrying additional geoip databases on user side.
What one actually doesn't get by not using mirrorbrain is a neat mirroring utility that can be used together with urpmi while at it, rather than the current mirror api that urpmi is using.
FWIW mirrorbrain support has just recently been introduced to original upstream branch of urpmi.. :)
In bugs.mageia.org/ #3166, Luigiwalser (luigiwalser) wrote : | #26 |
Note that Per Øyvind Karlsen has implemented mirrorbrain support in urpmi in his branch, which may be of some use.
In bugs.mageia.org/ #3166, Marja11 (marja11) wrote : | #27 |
Adjusting summary, there's more than one reason to wish for it
In bugs.mageia.org/ #3166, Neal Gompa (ngompa13) wrote : | #28 |
We also probably want to have an automatic redirector like Fedora's download.
The above link will redirect to any one of Fedora's 400+ mirrors that offer ISOs and that provides a high quality connection to me (low latency and high throughput). In my case, it automatically redirected me to: http://
This redirector applies to anything replicated out to mirrors, and can be used to offer an automatic mirror director in a way that is transparent to the tool. This can be used as the repo URL in urpmi, for instance.
For example, https:/
In bugs.mageia.org/ #3166, Filip-komar (filip-komar) wrote : | #29 |
(In reply to Neal Gompa from comment #26)
> We also probably want to have an automatic redirector like Fedora's
> download.
> mirror that can provide a given directory/file.
We already use a simple download redirector[0] for almost all our ISO files[1] and for pdf and epub doc[2] files.
But when the mirrorbrain infrastructure is in place and well tested I'll do my best to implement it on the mentioned web pages.
Current redirector is working well now but it relies on generated lists[3][4]. Refresh of those lists[5] is triggered manually.
[0] http://
[1] http://
[2] https:/
[3] http://
[4] http://
[5] http://
In bugs.mageia.org/ #3166, Lohmaier+mageia (lohmaier+mageia) wrote : | #30 |
(In reply to Neal Gompa from comment #26)
> We also probably want to have an automatic redirector like Fedora's
> download.
> mirror that can provide a given directory/file.
mirrorbrain does that already, by geolocation and by optionally giving custom weights to mirrors/limiting mirrors to only serve a specific region or network-block/ASN.
> This redirector applies to anything replicated out to mirrors,
No, it applies to anything, whether mirrored or not: just add something to the URL and you will still be redirected. This is different from mirrorbrain. Mirrorbrain only redirects to mirrors known to have the file, and doesn't redirect on directory listings.
> For example, https:/
> could be in /etc/urpmi/
Same when mirrorbrain is used.
if urpmi knows to follow redirects, then there's no difference.
I'd prefer being able to browse the representative/
In this regard mirrorbrain is "superior", as you have one definite filelisting that represents the state of the repository, no matter whether all mirrors did sync already or not. If a mirror doesn't carry the requested file, you won't be redirected to it. This solves the "repository-info updated, but rpms not synced yet problems", and any other random sync failures that mirrors encounter.
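A small sketch of that idea, assuming the scan results are available as a simple path-to-mirrors mapping. The index contents, the host names and the fallback to an origin server are assumptions for illustration; mirrorbrain keeps this information in its own database.

```python
# Hypothetical result of the last mirror scan: path -> mirrors known to carry it.
SCAN_INDEX = {
    "media/core/release/foo-1.0.rpm": {"mirror-a.example", "mirror-b.example"},
    "media/core/updates/foo-1.1.rpm": {"mirror-a.example"},  # mirror-b not synced yet
}

ORIGIN = "origin.example"  # assumed fallback: the redirector's own copy of the tree


def candidates(path, preferred, index=SCAN_INDEX):
    """Return the mirrors from `preferred` (kept in order) known to have `path`."""
    have = index.get(path, set())
    return [m for m in preferred if m in have]


def redirect_target(path, preferred, index=SCAN_INDEX):
    """Pick the first suitable mirror, or fall back to the origin server."""
    found = candidates(path, preferred, index)
    host = found[0] if found else ORIGIN
    return f"http://{host}/{path}"


# mirror-b would be preferred for this client, but it does not carry the update yet.
print(redirect_target("media/core/updates/foo-1.1.rpm",
                      ["mirror-b.example", "mirror-a.example"]))
```

A mirror that is half-way through a sync is simply skipped for the files it lacks, while it keeps serving the files it already has.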
see e.g. http://
If you want to manually pick a mirror, you use the "Details" link in the listing, or just append ".mirrorlist" to the file's URL:
http://
→ that will not only show various hashes (that you could also request by e.g. appending .sha256 for the sha256sum: http://
It shows (a selection of) suitable mirrors (as determined by your request's geoIP data) you can pick from.
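For scripting, the suffix convention can be wrapped in a couple of helpers. This is only a sketch: the suffix list is limited to the ones mentioned in this thread, the example URL is a placeholder, and the assumption that the `.sha256` response starts with the hex digest (as `sha256sum` output does) should be checked against a real server.

```python
import hashlib

# Suffixes mentioned in this thread; what a given server offers is an assumption.
SUFFIXES = ("mirrorlist", "metalink", "sha256")


def metadata_urls(file_url):
    """Map each suffix to the URL obtained by appending it to the file's URL."""
    return {suffix: f"{file_url}.{suffix}" for suffix in SUFFIXES}


def matches_sha256(data, published):
    """Compare data with a published hex digest (first whitespace-separated token)."""
    tokens = published.split()
    return bool(tokens) and hashlib.sha256(data).hexdigest() == tokens[0].lower()


# Placeholder URL, not a real download location.
for kind, url in metadata_urls("http://dl.example/iso/image.iso").items():
    print(kind, url)
```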
In bugs.mageia.org/ #3166, Lohmaier+mageia (lohmaier+mageia) wrote : | #31 |
oh, and I forgot: the redirect response not only contains the url to the mirror, but also contains metadata - just try a
not only get you the redirect to a matching mirror, you also get a list of alternative mirrors, in case the first one has a power outage or is down for maintenance or is not reachable for some random problem.
And also the hashes are included (base64 encoded)
and you can go fancy by selecting what type you want by using accept headers like e.g.
curl -H "Accept: application/
(will not redirect, but instead get you metalink xml - same as you'd get if you had requested file.metalink)
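A sketch of how a client could use that metadata, assuming the redirect carries `Link` and `Digest` headers in the style of RFC 6249 (Metalink/HTTP). The header values in the example are invented, and the parsing is deliberately simple (it does not handle commas inside quoted parameter values).

```python
import base64
import re

# One <url> followed by any number of ";key=value" parameters.
LINK_RE = re.compile(r'<([^>]+)>\s*((?:;\s*[^,;]+)*)')


def parse_links(header_value):
    """Split a Link header value into (url, params) pairs."""
    links = []
    for url, raw_params in LINK_RE.findall(header_value):
        params = {}
        for part in raw_params.split(";"):
            if "=" in part:
                key, value = part.split("=", 1)
                params[key.strip()] = value.strip().strip('"')
        links.append((url, params))
    return links


def mirrors_by_priority(header_value):
    """Return rel=duplicate URLs, lowest pri value (most preferred) first."""
    dups = [(int(p.get("pri", 999999)), url)
            for url, p in parse_links(header_value)
            if p.get("rel") == "duplicate"]
    return [url for _, url in sorted(dups)]


def digest_to_hex(digest_value):
    """Turn 'SHA-256=<base64>' into ('sha-256', '<hex digest>')."""
    algo, b64 = digest_value.split("=", 1)
    return algo.strip().lower(), base64.b64decode(b64).hex()


# Invented header value for illustration.
EXAMPLE_LINK = ('<http://a.example/f.iso>; rel=duplicate; pri=2; geo=de, '
                '<http://b.example/f.iso>; rel=duplicate; pri=1; geo=fr')
print(mirrors_by_priority(EXAMPLE_LINK))
```

With such a list in hand, a client can try the next URL when the first mirror is unreachable, and check the result against the decoded hash.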
In bugs.mageia.org/ #3166, Thomas Backlund (tmb) wrote : | #32 |
Just a small update...
I now have a mirrorbrain running on dl.mageia.org
(but I haven't pushed it on public dns yet as I want to review my changes after some sleep to verify I haven't missed anything...)
for now I only use pseudo tree on dl.mageia.org until I get extra disks installed
In bugs.mageia.org/ #3166, Olav Vitters (ovitters) wrote : | #33 |
I've just obsoleted MirrorBrain in Cauldron as it hasn't been updated (or developed) since 2015. The alternative seems to be MirrorManager2, by Fedora infrastructure team. Either this bug needs to be updated or WONTFIX and a new bug.
In bugs.mageia.org/ #3166, Fri-5 (fri-5) wrote : | #34 |
This issue needs a kick.
I think a new bug is in order, written by someone knowing more than me ;)
Related:
https:/
https:/
https:/
In bugs.mageia.org/ #3166, Fri-5 (fri-5) wrote : | #35 |
Kind of duplicate:
Bug 17400 - Deploy and configure MirrorBrain to manage mirrors and generate metalinks for DNF
As mageia did not include /qa.mandriva.com/show_bug.cgi?id=56879
https:/
urpmi still relies on the timezone to choose a mirror, and as there still is no working fallback in urpmi either, when that one mirror is not up-to-date or has some other failure, updating/installing a package will fail.
That's why I propose to drop the client-side mirror choosing altogether and use mirrorbrain instead.
Mirrorbrain itself doesn't need much resources, but has the benefit of
* checking mirrors whether they are up-to-date
* handing out the geographically closest mirror/mirror within the same ASN if possible, with the option to prefer powerful mirrors over weaker ones by giving the mirrors appropriate scores
* creation of mirrorlists
* managing of mirrors in one place
* can create torrents as well (as one part of bug#890)
* and as all links are passed through one instance, it is easier to create downloadstats (bug #2330)
With mirrorbrain, the problem with the installer also is gone (you can only give one specific URL when adding media in the installer, the installer is not aware of mirrorlist method)
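The selection criteria in the list above (up-to-date check, same ASN, proximity, scores) can be pictured with a short ranking sketch. It is a simplification under stated assumptions, not mirrorbrain's actual algorithm: the `Mirror` fields, the private-use ASN numbers and the strict ordering (same ASN, then same country, then score) are all chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mirror:
    name: str
    asn: int
    country: str
    score: int        # operator-assigned weight; higher means more capable
    up_to_date: bool  # result of the last mirror check


def rank(mirrors, client_asn, client_country):
    """Order usable mirrors: same ASN first, then same country, then by score."""
    usable = [m for m in mirrors if m.up_to_date]
    return sorted(
        usable,
        key=lambda m: (m.asn != client_asn, m.country != client_country, -m.score),
    )


# Hypothetical mirrors; the ASNs are from the private-use range.
MIRRORS = [
    Mirror("far-fast", 64512, "US", 200, True),
    Mirror("same-country", 64513, "DE", 50, True),
    Mirror("same-asn", 64514, "DE", 10, True),
    Mirror("stale", 64514, "DE", 500, False),
]

for m in rank(MIRRORS, client_asn=64514, client_country="DE"):
    print(m.name)
```

The stale mirror never appears in the result, however high its score, which is the behaviour the timezone-based choice in urpmi cannot provide.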