• 1 Post
  • 31 Comments
Joined 5 months ago
Cake day: March 31st, 2025




  • Yeah, you’re absolutely right, and I agree. So do we have to resign ourselves to an eternal back-and-forth of developing random new challenges every time the scrapers adapt to them? Like antibiotics and resistant bacteria? Maybe that’s just the way it is, and honestly that’s what I suspect. But Anubis feels so clever and so close to something that would work. The concept of making it about a cost that adds up, so that it intrinsically only affects massive operations significantly, is really smart - it’s not about coming up with a challenge a computer can’t complete, just a challenge that isn’t economically worth completing. But it’s disappointing to see that, at least with the current wait times, it doesn’t seem to cost enough to dissuade scrapers. And worse, the cost is so low that making it significant to the scrapers would seem to require really insufferable wait times for users.
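
    To put numbers on the “cost that adds up” idea - these are purely my own assumed figures, not the article’s - a quick back-of-the-envelope sketch:

    ```typescript
    // Toy cost model for a proof-of-work wall like Anubis.
    // Every constant below is an illustrative assumption.

    const solveSeconds = 2;         // assumed CPU time to solve one challenge
    const cpuDollarsPerHour = 0.05; // assumed cloud CPU price

    // A human pays once per visit: a ~2-second wait.
    // A scraper pays per challenged request (fewer if it reuses cookies).
    function scraperCostDollars(challengesSolved: number): number {
      return (challengesSolved * solveSeconds / 3600) * cpuDollarsPerHour;
    }

    console.log(scraperCostDollars(1_000_000).toFixed(2)); // "27.78"
    ```

    Under those assumptions, a million solved challenges costs under thirty dollars of compute, while pushing that figure higher means multiplying the wait time that every human visitor eats.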


  • By negligence, I meant that the cost is negligible to the companies running scrapers, not that the solution itself is negligent. I should have said “negligibility” of Anubis, sorry - that was poor clarity on my part.

    But I do think the cost is indeed negligible, as the article shows. Whether the author is biased doesn’t really matter; their analysis of the costs seems reasonable, and I would need a counter-argument against that analysis to think they were wrong. Bias alone isn’t enough to discount the quantification they attempted to bring to the debate.

    Also, I don’t think there’s any hypocrisy in me saying I’ve only thought about other solutions here and there - I’m not maintaining an anti-scraping library. And there have already been indications that scrapers are just accepting the cost of Anubis on Codeberg, right? So I’m not trying to say I’m some sort of tech genius who has the right idea here, but from what Codeberg was saying, and from the numbers in this article, it sure looks like Anubis isn’t the right idea. I am indeed only having fun with my suggestions, not making whole libraries out of them and pronouncing them to be solutions. I personally haven’t seen evidence that Anubis is so clearly working. As the author points out, it seems to be working right now mostly because of how new it is; if scrapers want to go through it, they easily can - which puts us in the same bacteria/antibiotic war of attrition. And of course that is the case with many things in computing as well. So I guess my open wondering is just about whether there’s any way to develop a countermeasure that scrapers won’t find “worth it” to force through.

    Edit for tone clarity: I don’t want to be antagonistic, rude, or hurtful in any way - just trying to have a discussion and understand this situation. Perhaps I was arrogant; if so, I apologize - it wasn’t my intent, fwiw. Also, thanks for helping me understand why I was getting downvoted. I intended my post to be constructive spitballing about what I see as the eventual, inevitable weakness in Anubis. I think it’s a great project, it’s great that people are getting use out of it even temporarily, and of course the devs deserve lots of respect for making the thing. But as much as I wish I could like it and believe it will solve the problem, I still don’t think it will.




  • Yeah, well-written stuff. I think Anubis will come and go. This beautifully demonstrates and, best of all, quantifies the ~~negligence~~ negligible cost to scrapers of Anubis.

    It’s very interesting to try to think of what would work, even conceptually. Some sort of purely client-side captcha type of thing perhaps. I keep thinking about it in half-assed ways for minutes at a time.

    Maybe something that scrambles the characters of the site according to some random “offset” of some sort - e.g. randomly selecting a modulus size and an offset to cycle them, or even just a good ol’ cipher - where the “captcha” consists of a slider that adjusts the offset. You as the viewer know it’s solved when the text becomes something sensible, so there’s no need for the client code to store a readable key that could be used to auto-undo the scrambling. You could maybe even have some slider values deliberately produce legible English red herrings, in case the scrapers got smart enough to check for legibility (not sure how to hide which slider positions are the red-herring ones though) - which might be enough to trick a scraper into picking up junk text sometimes.
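
    For what it’s worth, here’s a minimal sketch of what I mean, assuming a plain Caesar-style shift (all the names here are made up, not any real library’s API):

    ```typescript
    // Slider-captcha sketch: the server ships text shifted by a secret
    // offset; the client slider re-shifts it, and only a human can tell
    // when the output has become real words.

    const ALPHABET = "abcdefghijklmnopqrstuvwxyz";

    // Cycle each letter by `offset` positions; leave everything else alone.
    function shift(text: string, offset: number): string {
      const n = ALPHABET.length;
      return [...text]
        .map((ch) => {
          const i = ALPHABET.indexOf(ch.toLowerCase());
          if (i === -1) return ch;
          return ALPHABET[(((i + offset) % n) + n) % n]; // handles negative offsets
        })
        .join("");
    }

    // Server side: scramble the page text with a random nonzero offset.
    const secretOffset = 1 + Math.floor(Math.random() * (ALPHABET.length - 1));
    const scrambled = shift("the hidden article text", secretOffset);

    // Client side: the slider just tries an offset. No readable key is
    // stored, so there's nothing for a scraper to read back and auto-undo.
    function onSliderChange(trialOffset: number): string {
      return shift(scrambled, -trialOffset);
    }
    ```

    The obvious hole is that a scraper can brute-force all 26 positions and score each one for English-ness, which is exactly why the red-herring positions would matter - and why hiding which positions are the red herrings is the hard part.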


  • I know it’s popular to call conservatives dumb, and while I don’t like to beat dead horses, I really think the explanation for this is that they’re dumb - the illiterate form of dumb, to be precise. The caps are a way of adding emphasis, which is something that can also be done through phrasing, word choice, sentence structure, and so on. But those techniques would require a beyond-4th-grade level of writing and reading ability, so they do not succeed in the conservative communication marketplace.

    I will add a disclaimer: my beloved Nietzsche also puts words in all caps sometimes, but since that sits alongside his impressive eloquence, it is clearly not a sign of stupidity in that context. Likewise, it is not a sign of stupidity in many other contexts. That is to say:

    Using that style of communication does not always make it a safer bet that someone is stupid, but being stupid does make it a safer bet that they use that style of communication.






  • I wasn’t being totally serious, but also, I do think that while accessibility concerns come from a good place, there are practical limitations that must be accepted when building fringe and counter-cultural things. Like, my hidden rebel base can’t have a wheelchair-accessible ramp at the entrance, because then my base isn’t hidden anymore. It sucks that some solutions can’t work for everyone, but if we throw them out because they won’t work for 5% of people, we end up with nothing. I’d rather have a solution that works for 95% of people than no solution at all. I’m not saying that people who use screen readers are second-class citizens - if crawlers were vision-based, I might suggest matching text to background colors so that only screen readers could understand the site, because something that works for 5% of people is also better than no solution at all. We need to tolerate imperfect first attempts and understand that more sophisticated infrastructure comes later.

    But yes, my image map idea is pretty much a joke nonetheless.




  • I’m trying to take a progress-over-perfection approach to these things. My number one priority was to get off of Chrome, and Firefox is pretty rough on mobile. I tried a few things, and Brave was the one with the best experience, especially because of the ad blocking without needing to mess around with a bunch of plugins. I figure I can go deeper into that iceberg over time. What do you use?


  • I don’t know much about hacking, but it’s surprising to me that there’s not a way to get around this. What stops people from developing a forced workaround? What would need to be done to develop one?

    Edit: Answering my own question, sort of - it seems that a locked bootloader verifies each boot image against cryptographic keys burned into the device, so a “forced workaround” means either forging a signature (not realistically brute-forceable) or finding a flaw in the verification itself. What a mess. It’s so annoying that there aren’t more “touchscreen handheld computers” where you can just install whatever you want on them, the same as building your own PC. I hate how everything like that is being chipped away over time.
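
    To sketch the shape of that as I understand it (this is conceptual TypeScript, not real firmware code - the names and data are made up):

    ```typescript
    // Why a locked bootloader resists "forced workarounds": each boot
    // stage only runs the next image if its signature verifies against a
    // public key fused into the device. The OEM keypair is generated
    // inline here just to make the demo self-contained and runnable.

    import { generateKeyPairSync, sign, verify } from "node:crypto";

    // Stand-ins: the private key stays at the factory; the public key is
    // burned into the device's fuses at manufacture.
    const { publicKey, privateKey } = generateKeyPairSync("ed25519");

    // OEM signs a boot image at build time.
    const bootImage = Buffer.from("kernel + ramdisk bytes");
    const signature = sign(null, bootImage, privateKey);

    // Device at boot: run the image only if the signature checks out.
    // There is no key on the device to brute-force - only a verifier.
    function verifiedBoot(image: Buffer, sig: Buffer): boolean {
      return verify(null, image, publicKey, sig);
    }

    console.log(verifiedBoot(bootImage, signature));               // true
    console.log(verifiedBoot(Buffer.from("tampered"), signature)); // false
    ```

    So the practical attacks end up being bugs in the verification code or hardware tricks like the fault injection below, rather than anything like guessing a key.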

    Stuff like this seems promising though in a very far-out, push-comes-to-shove kind of way: https://www.synacktiv.com/en/publications/how-to-voltage-fault-injection#protect