Innerworld@lemmy.world to News@lemmy.world · English · 6 days ago

Wikipedia Blacklists Archive.today, Starts Removing 695,000 Archive Links

arstechnica.com

If DDoSing a blog wasn't bad enough, archive site also tampered with web snapshots.
  • AmbitiousProcess (they/them)@piefed.social · English · ↑29 · edited · 6 days ago

    Here’s the relevant archive.today guidance page on Wikipedia for anyone curious:
    https://en.wikipedia.org/wiki/Wikipedia:Archive.today_guidance

    If you have a Wikipedia account, you can help replace these links!
    Go to the "How you can help" section, click the search links for any of the listed domains, and manually re-archive the links with Archive.org, Ghostarchive, or Megalodon.

  • Doug Holland@lemmy.world · English · ↑23 · 6 days ago

    Crap. Obviously, I’m gonna have to stop using archive.today, but it’s the only way around paywalls at numerous sites.

    Removepaywalls.com (plural) inserts ads, often for shady operations.

    Removepaywall.com (singular) usually works, but it’s tricky sharing the links (i.e., “choose option 2” or “choose option 4”).

    Byebyepaywall.com has old, dead options.

    Wayback Machine bombs out a lot.

    And ghostarchive.org is successful so rarely it’s really a last resort.

    Anyone know of any others?

    • Trudge@piefed.social · English · ↑19 · 6 days ago

      Possibly irrelevant, but some browsers have a “reading mode” which, in conjunction with the ol’ Hitting F11 and Then Esc Trick, will produce the whole article before a paywall can finish loading.

      • zqps@sh.itjust.works · ↑4 · 5 days ago

        F11 plus Esc stops script execution or something like that?

        • Trudge@piefed.social · English · ↑3 · 5 days ago

          reloads the page & aborts loading the page

      • Doug Holland@lemmy.world · English · ↑5 · 6 days ago

        Worth looking into, thanks.

    • CharlesDarwin@lemmy.world · English · ↑5 · 5 days ago

      The thing that has always annoyed me about archive.is is that using Firefox + VPN seems to result in endless CAPTCHAs. But it works in Chrome, go figure. I’m very suspicious of sites that somehow only work properly under Chrome.

    • a Kendrick fan@lemmy.ml · ↑2 · 6 days ago

      Ghostarchive is an archive.today revamp, I see no reason to not keep using either though…

  • SourDrink@lemmy.world · ↑24 · 6 days ago

    I half thought this was archive.org they were blacklisting. Two completely different sites.

  • 🌞 Alexander Daychilde 🌞@lemmy.world · English · ↑18 · 6 days ago

    Dammit. Everyone’s been using that site to get around paywalls because it works well. Now I have to go find another one that works as well. :|

    • betterdeadthanreddit@lemmy.world · ↑21 · 6 days ago

      There are others that don’t DDoS blogs.

      • 🌞 Alexander Daychilde 🌞@lemmy.world · English · ↑3 · 6 days ago

        Well, yes, I’ll be off looking for them next time I need to use an archival site. I’m bummed to learn this crap about archive.ph.

  • paraphrand@lemmy.world · English · ↑9 · 6 days ago

    Is there a reason self-hosted paywall bypass tools don’t exist? Is it because these services pay for access?

    • hemko@lemmy.dbzer0.com · English · ↑2 · 5 days ago

      I think a subscribed user of the news site has to upload the “unlocked” article to the archive website.

  • UnderpantsWeevil@lemmy.world · English · ↑8 · 6 days ago

    Arguably the biggest problem with Wikipedia as it aged is the accumulation of dead links.

    Brilliant move.

    • SolacefromSilence@fedia.io · ↑4 · 6 days ago

      I used to find dead links annoying until I realized that many dead links are also saved in the wayback machine. This comment isn’t only about Wikipedia.


