Potentially Broken Links in Comments #2338
Wow, that's a lot. Nice catch.
That's a lot of debt to cover. I have to do a quick history lesson on the 2012 round of gTLD additions before my next suggestions, because I am going to suggest replacing the IANA DB link for the ICANN section entries, but that needs some elegance and curation related to gTLDs.

The gTLD IANA DB stuff is tricky. During the TLD delegation process there is an indefinite period between the time a new TLD contracts to operate and the time it is actually delegated to the root zone and lights up. That gap is a combination of operational readiness, an intentional pacing rate, and, often in the case of Spec 13 .brand TLDs, corporate bravery, among other factors. The short version is that the .JSON file we pull from ICANN differs from what's actually root-listed.

Why are we doing it that way? The reasoning comes from the high-flow delegation phases, which occur after an open round when parties can apply for new TLDs. When the 2012 round of new TLDs launched, there were up to 20 a week for a few years. Frequently a TLD would be added to the root zone, but because (for example) Safari would update its internal list only at the pace of macOS or iOS upgrades, every 3-6 months, a domain typed in the location bar would be treated as a search term and sent to a search engine instead of being recognized as a domain name. There was a need to get TLDs that were certain to be added by ICANN (who are the authority) into the PSL some time in advance of delegation, to offset propagation delays beyond the PSL maintainers' control. So we used the contract signature as the signal that a TLD was 'on the way', which typically paced reasonably well, with enough advance timing to offset the propagation delays.
The thundering herd of additions slowed in 2017-2018, so if you're just joining the PSL party, you missed that whole flood of entries coming in. Our automation process held, though, and thank you to @cpu for all the hard work in getting that json->PSL stuff automated, because there were >1000 TLDs and it was a LOT.

Does this divert from following our ICP-3 mantra? Yes, but we also describe that we follow ICP-3 AND accept I*-vetted entries (IETF: .onion) as well as the ICANN-contracted-but-not-yet-in-the-root scenario.

So, to land the plane on the whole background and provide a contextually relevant example: given that we're going to have an open round shortly for more TLD applications, it is important to have all that context about the ICANN gTLD rounds, and about the json being out of sync with the IANA DB, for when the next thundering herd comes. We need to leave some room for the gTLD IANA DB URLs to be broken if sourced from the json.

What about this as a suggestion on URL cleanup? I propose the following to reduce this list:
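To make the sync gap concrete, here is a minimal sketch of comparing the ICANN-contracted gTLD set against what is actually root-listed. This is not the project's automation; it is a toy illustration with inline sample data, and the data-source URLs in the comments (ICANN's gTLDs JSON and IANA's root zone TLD list) are assumptions about where the real feeds live.

```python
# Hypothetical sketch only. In practice the two sets would come from:
#   - ICANN gTLD JSON (assumed: https://www.icann.org/resources/registries/gtlds/v2/gtlds.json)
#   - IANA root zone list (assumed: https://data.iana.org/TLD/tlds-alpha-by-domain.txt)
# Toy inline data is used here so the example is self-contained.

def not_yet_delegated(contracted: set[str], root_listed: set[str]) -> set[str]:
    """TLDs that have signed an ICANN contract but are not yet in the root zone."""
    return {t.lower() for t in contracted} - {t.lower() for t in root_listed}

# Toy data: 'newbrand' has a contract but has not lit up in the root yet.
contracted = {"example", "newbrand"}
root_listed = {"EXAMPLE", "COM", "ORG"}

print(sorted(not_yet_delegated(contracted, root_listed)))  # ['newbrand']
```

Any entry in that difference is exactly the kind of "contracted but not yet delegated" TLD whose IANA DB URL would 404 if we linked it from the json today.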
I suspect this will burn down a significant amount of the list.
The following URLs, which are documented in PSL comments, appear to be broken and return either a 404 or a 5xx HTTP status code (scanner code). These broken links could indicate URL changes, website restructuring, changes in administrative bodies, or potentially outdated entries that might be identified during link maintenance. While this could be an opportunity for volunteers to contribute, it is probably a low-priority task since it only affects documentation. I will begin looking into some of these next week and work on cleaning them up.
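A scan like the one described can be sketched as below. This is not the actual scanner code linked above, just a minimal stand-in using only the standard library; `urls_from_psl_comments` is a hypothetical placeholder for the URL list extracted from PSL comments.

```python
import urllib.error
import urllib.request


def check_url(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url, using a HEAD request."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx responses raise; the code is still what we want to record.
        return err.code


def is_broken(status: int) -> bool:
    """Classify a response the way this report does: 404 or any 5xx."""
    return status == 404 or 500 <= status <= 599


# Usage (network required; urls_from_psl_comments is hypothetical):
# for url in urls_from_psl_comments:
#     if is_broken(check_url(url)):
#         print("broken:", url)
```

Note that a real scanner would also want to handle timeouts, redirects, and servers that reject HEAD requests, which this sketch deliberately omits.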
404 errors:
5xx errors: