Here’s the data used to plot the above graph. The total number of bookmarked links varies from year to year, with a high between 2010 and 2011.
Note: I didn’t use Pinboard between 2014 and 2017 😶
Brian’s script works like this:
The code looks through your bookmarks and attempts to fetch each URL. If the HTTP code is less than 400 we mark it as a success. Without manually checking every URL, there might be some false positives: people selling existing domains, hosting provider redirects, etc. If the status code was 400 or higher, we marked it as a failure. After some manual investigation, we realized that some domains were not allowing bots to crawl them. Our code was using cURL, which appears as a bot, so we faked a browser’s user-agent string and decreased our failure rate by ~4%.
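The check described above can be sketched roughly as follows. This is not Brian’s actual code (his used cURL); it is a minimal Python illustration of the same rule, and the names `BROWSER_UA`, `is_success`, `fetch_status` and `check_bookmark` are made up for this example. The user-agent value is just one plausible browser string.

```python
import urllib.error
import urllib.request

# Assumption: any browser-like User-Agent would do; this value is illustrative.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"


def is_success(status):
    """A bookmark counts as valid when the HTTP status code is below 400."""
    return status < 400


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        # Send a browser-like User-Agent so sites that block bots still answer.
        request = urllib.request.Request(url, headers={"User-Agent": BROWSER_UA})
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status  # redirects have already been followed
    except urllib.error.HTTPError as error:
        return error.code  # 4xx and 5xx responses land here
    except (OSError, ValueError):
        return None  # DNS failure, timeout, refused connection, malformed URL


def check_bookmark(url):
    """True if the URL answered with a status below 400, False otherwise."""
    status = fetch_status(url)
    return status is not None and is_success(status)
```

Treating a missing response as a failure is a choice made for this sketch. As noted above, a success here can still be a false positive, since a parked domain or a hosting provider’s redirect page also returns a status below 400.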
Pinboard aka del.icio.us
I started to use del.icio.us back in 2006, when I discovered the service at “The Future of Web Apps London” and somehow forgot about it between 2014 and 2018.
I converted my account last year when Pinboard’s creator Maciej reached out to ask whether we, the original one-time payment users, would consider converting to a subscription model, to help him keep maintaining and developing the service and make a living out of it.
I was surprised that the number of invalid links wasn’t higher, considering that a vast majority of the links on my blog are now invalid. There is probably a significant number of false positives among the 88.87% of valid bookmarks. Randomly clicking through old links turned up a fair number of them.
I still need to finish my link checking script that replaces invalid links with a link to the Internet Archive Wayback Machine project.
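The replacement step could look something like the sketch below. It is not the unfinished script itself, only an illustration under a few assumptions: the function names are hypothetical, the list of dead URLs is already known, and the Wayback Machine is addressed through its `https://web.archive.org/web/<timestamp>/<url>` form, where a partial timestamp such as a year is assumed to resolve to a nearby snapshot.

```python
import re


def wayback_url(url, timestamp):
    """Build a Wayback Machine address for url near the given timestamp."""
    return f"https://web.archive.org/web/{timestamp}/{url}"


def replace_dead_links(text, dead_urls, timestamp):
    """Swap every dead URL found in text for its Wayback Machine address."""
    if not dead_urls:
        return text
    # Longest URLs first, so a dead URL that is a prefix of another dead URL
    # does not shadow it. A single regex pass also avoids rewriting a URL
    # that has already been replaced.
    ordered = sorted(dead_urls, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(url) for url in ordered))
    return pattern.sub(lambda match: wayback_url(match.group(0), timestamp), text)
```

For example, `replace_dead_links(post_html, {"http://dead.example/post"}, "2010")` would rewrite only that address and leave other links alone. One limitation of this plain-text matching: a dead URL that happens to be a prefix of a live one would also be matched inside it, so a real script would probably want to parse the links properly first.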