• 1 Post
  • 34 Comments
Joined 1 year ago
Cake day: September 27th, 2023

  • Never liked him, but I acknowledge that he had some effective economic policies during his time as mayor. He was at least competent and sane. He went completely off the rails a long time ago, though.

    He’s often credited with cleaning up Times Square, which was known for prostitution back in the 80s. But honestly, he reaped what his predecessors sowed to a large degree.

    He used 9/11 like his personal sword and shield. He was lucky to be in a prominent position related to the biggest and least controversial issue in America. I don’t imagine he ever would have been on the national stage otherwise. He was pretty much at the natural end of his career before then.

    NYC has a history of conservative mayors, which seems a bit odd since we’re so solidly liberal in federal elections. It sure doesn’t help when we get a Democrat as infantile and corrupt as our current mayor, Eric Adams. See: https://en.wikipedia.org/wiki/Federal_prosecution_of_Eric_Adams


  • I posted some of my experience with Kagi’s LLM features a few months ago here: https://literature.cafe/comment/6674957 . TL;DR: the summarizer and document discussion are fantastic, because they do not hallucinate. The search integration is as good as anyone else’s, but still nothing to write home about.

    The Kagi assistant isn’t new, by the way; I’ve been using it for almost a year now. It’s now out of beta and has an improved UI, but the core functionality seems mostly the same.

    As far as actual search goes, I don’t find it especially useful. It’s better than Bing Chat or whatever they call it now because it hallucinates less, but the core concept still needs work. It basically takes a few search results and feeds them into the LLM for a summary. That’s not useless, but it’s certainly not a game-changer. I typically want to check its references anyway, so it doesn’t really save me time in practice.

    Kagi’s search is primarily not LLM-based and I still find the results and features to be worth the price, after being increasingly frustrated with Google’s decay in recent years. I subscribed to the “Ultimate” Kagi plan specifically because I wanted access to all the premium language models, since subscribing to either ChatGPT or Claude would cost about the same as Kagi, while Kagi gives me access to both (plus Mistral and Gemini). So if you’re interested in playing around with the latest premium models, I still think Kagi’s Ultimate plan is a good deal.

    That said, I’ve been disappointed with the development of LLMs this year across the board, and I’m not convinced any of them are worth the money at this point. This isn’t so much a problem with Kagi as it is with all the LLM vendors. The models have gotten significantly worse for my use cases compared to last year, and I don’t quite understand why; I guess they are optimizing for benchmarks that simply don’t align with my needs. I had great success getting zsh or Python one-liners last year, for example, whereas now it always seems to give me wrong or incomplete answers.

    My biggest piece of advice when dealing with any LLM-based tools, including Kagi’s, is: don’t use it for anything you’re not able to validate and correct on your own. It’s just a time-saver, not a substitute for your own skills and knowledge.







  • Ah, somehow I didn’t see 18 there and only looked at 17. Thanks!

    I tried pulling just the one package from the sid repo, but that created a cascade of dependencies, including all of llvm. I was able to get those files installed but not able to get clinfo to succeed. I also tried installing llvm-19 from the repo at https://apt.llvm.org/, with similar results. clinfo didn’t throw the fatal errors anymore, but it didn’t work, either. It still reported Number of devices 0 and OpenCL-based tools crashed anyway. Not with the same error, but with something generic about not finding a device or possibly having corrupt drivers.

    Should I bite the bullet and do a full upgrade to sid, or is there some way to do this more precisely that won’t muck up Bookworm?





  • Thanks! I didn’t see that. Relevant bit for convenience:

    we call model providers on your behalf so your personal information (for example, IP address) is not exposed to them. In addition, we have agreements in place with all model providers that further limit how they can use data from these anonymous requests that includes not using Prompts and Outputs to develop or improve their models as well as deleting all information received within 30 days.

    Pretty standard stuff for such services in my experience.


  • I’m not entirely clear on which (anti-)features are only in the browser vs in the web site as well. It sounds like they are steering people toward their commercial partners like Binance across the board.

    Personally I find the cryptocurrency stuff off-putting in general. Not trying to push my opinion on you though. If you don’t object to any of that stuff, then as far as I know Brave is fine for you.



  • If you click the Chat button on a DDG search page, it says:

    DuckDuckGo AI Chat is a private AI-powered chat service that currently supports OpenAI’s GPT-3.5 and Anthropic’s Claude chat models.

    So at minimum they are sharing data with one additional third party, either OpenAI or Anthropic depending on which model you choose.

    OpenAI and Anthropic have similar terms and conditions for enterprise customers. They are not completely transparent and any given enterprise could have their own custom license terms, but my understanding is that they generally will not store queries or use them for training purposes. You’d better seek clarification from DDG. I was not able to find information on this in DDG’s privacy policy.

    Obviously, this is not legal advice, and I do not speak for any of these companies. This is just my understanding based on the last time I looked over the OpenAI and Anthropic privacy policies, which was a few months ago.


  • Oh yes, definitely. I think this is why Mozilla has not made this the default behavior in Firefox; there will always be the risk of false-positives breaking copied links, so it’s important that people know that there’s some kind of mutation happening.

    ClearURLs uses a JSON file with site-specific regex patterns and rules. In theory I could customize this for myself, or better yet submit a pull request on their GitHub. If I have time I’ll look into it.
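
    For anyone curious what rule-driven cleaning looks like in principle, here is a minimal Python sketch. To be clear, the rule table below is a made-up format for illustration, not the actual ClearURLs schema, and the parameter lists are my own guesses rather than anything taken from their rules file:

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical rule table (NOT the real ClearURLs schema): each entry pairs a
# regex that selects the site with regexes for query parameter names to drop.
RULES = {
    "amazon": {
        "url_pattern": r"^https?://(?:[a-z0-9-]+\.)*amazon\.[a-z.]+/",
        "params": [r"qid", r"sr", r"pf_rd_[a-z]+"],
    },
    "generic": {
        "url_pattern": r"^https?://",
        "params": [r"utm_[a-z_]+", r"fbclid", r"gclid"],
    },
}


def strip_tracking(url: str, rules: dict = RULES) -> str:
    """Drop query parameters whose names match any rule that applies to url."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    for rule in rules.values():
        if not re.match(rule["url_pattern"], url, re.IGNORECASE):
            continue  # this rule is for a different site
        blocked = [re.compile(p, re.IGNORECASE) for p in rule["params"]]
        query = [
            (name, value)
            for name, value in query
            if not any(b.fullmatch(name) for b in blocked)
        ]
    # Note: urlencode re-encodes the surviving parameters, so their escaping
    # may differ slightly from the original URL.
    return urlunsplit(parts._replace(query=urlencode(query)))
```

    The false-positive risk is visible right in the table: any parameter name added to a rule gets removed on every matching page, whether or not that page actually needs it.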


  • Personally, I have found this feature to be too limited. I still use the ClearURLs extension, which is more effective in my experience.

    However, neither one is a silver bullet. Here’s an example I just took from Amazon (I blocked out some values with X’s):

    Original URL:
    https://www.amazon.com/Hydro-Flask-Around-Tumbler-Trillium/dp/B0C353845H/ref=XXXX?qid=XXXXXXXXXX&refinements=p_XXXXXXXXXXXXX&rps=1&s=sporting-goods&sr=XXX

    Using Firefox’s “copy link without site tracking” feature:
    https://www.amazon.com/Hydro-Flask-Around-Tumbler-Trillium/dp/B0C353845H/ref=XXXX?qid=XXXXXXXXXX&refinements=p_XXXXXXXXXXXXX&rps=1&s=sporting-goods

    Using ClearURLs:
    https://www.amazon.com/Hydro-Flask-Around-Tumbler-Trillium/dp/B0C353845H?refinements=p_XXXXXXXXXXXXX&rps=1

    The ideal, canonical URL, which no tools I’m familiar with will reliably generate:
    https://www.amazon.com/dp/B0C353845H

    Longer but still fully de-personalized URL:
    https://www.amazon.com/Hydro-Flask-Around-Tumbler-Trillium/dp/B0C353845H

    If anybody knows a better solution that works with a wide variety of sites, please share!
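
    In the meantime, the Amazon case on its own is easy enough to script. This is only a rough Python sketch, and it rests on an assumption I inferred from the URLs above rather than from any Amazon documentation: that the product ID is the 10-character segment right after /dp/. The function name is just something I made up:

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Assumption: the product ID is the 10-character uppercase alphanumeric path
# segment that follows "/dp/", as in the example URLs above.
PRODUCT_ID_IN_PATH = re.compile(r"/dp/([A-Z0-9]{10})(?:/|$)")
AMAZON_HOST = re.compile(r"(?:^|\.)amazon\.[a-z.]+$")


def canonical_amazon_url(url: str) -> Optional[str]:
    """Return the short /dp/<id> form, or None if the URL doesn't fit."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not AMAZON_HOST.search(host):
        return None  # not an Amazon host, leave it alone
    match = PRODUCT_ID_IN_PATH.search(parts.path)
    if match is None:
        return None  # no recognizable product ID in the path
    return f"https://{host}/dp/{match.group(1)}"
```

    Of course this is exactly the kind of site-specific hack that doesn't generalize, which is why I'm still hoping for a tool that handles many sites.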


  • Yeah, I wouldn’t be too confident in Facebook’s implementation, and I certainly don’t believe that their interests are aligned with their users’.

    That said, it seems like we’re reaching a turning point for big tech, where having access to private user data becomes more of a liability than an asset. Having access to the data means that they will be required by law to provide that data to governments in various circumstances. They might have other legal obligations in how they handle, store, and process that data. All of this comes with costs in terms of person-hours and infrastructure. Google specifically cited this as a reason they are moving Android location history on-device; they don’t want to deal with law enforcement constantly asking them to spy on people. It’s not because they give a shit about user privacy; it’s because they’re tired of providing law enforcement with free labor.

    I suspect it also helps them comply with some of the recent privacy protection laws in the EU, though I’m not 100% sure on that. Again, this is a liability issue for them, not a user-privacy issue.

    Also, how much valuable information were they getting from private messages in the first place? Considering how much people willingly put out in the open, and how much can be inferred simply from the metadata they still have access to (e.g. the social graph), it seems likely that the actual message data was largely redundant or superfluous. Facebook is certainly in a position to measure this objectively.

    The social graph is powerful, and if you really care about privacy, you need to worry about it. If you’re a journalist, whistleblower, or political dissident, you absolutely do not want Facebook (and by extension governments) to know who you talk to or when. It doesn’t matter if they don’t know what you’re saying; the association alone is enough to blow your cover.

    The metadata problem is common to a lot of platforms. Even Signal cannot use E2EE for metadata; they need to know who you’re communicating with in order to deliver your messages to them. Signal doesn’t retain that metadata, but ultimately you need to take their word on that.



  • This is correct, albeit not universal.

    KDE has a predefined schedule for “release candidates”, which includes RC2 later this month. So “RC1” is clearly not going to be the final version. See: https://community.kde.org/Schedules/February_2024_MegaRelease

    This is at least somewhat common. In fact, it’s the same way the Linux kernel development cycle works. They have 7 release candidates, released on a weekly basis between the beta period and final release. See: https://www.kernel.org/category/releases.html

    In the world of proprietary corporate software, I more often see release candidates presented as potentially final; i.e. literal candidates for release. The idea of scheduling multiple RCs in advance doesn’t make sense in that context, since each one is intended to be the last (with fingers crossed).

    It’s kind of splitting hairs, honestly, and I suspect this distinction has more to do with the transparency of open-source projects than anything else. Apple, for example, may indeed have a schedule for multiple macOS RCs right from the start and simply choose not to share that information. They present every “release candidate” as being potentially the final version (and indeed, the final version will be the same build as the final RC), but in practice there’s always more than one. Also, Apple is hardly an ideal example to follow, since they’ve apparently never even heard of semantic version numbering. Major compatibility-breaking changes are often introduced in minor point releases. It’s infuriating. But I digress.


  • As a Kagi subscriber, I’ve been very happy with their transparency in general. The feedback site is open to the public and Vlad and other staff members regularly engage in conversation about possible future features, limitations, and even business decisions in the Discord. It’s been refreshing.

    …which makes the response to this issue all the more frustrating and disappointing.

    I think Vlad’s comments in the original feedback thread were fair enough, but then later, in the Discord, I saw a lot of “let’s move this to a private chat”. They even changed their General channel to “slow mode” to prevent live conversations as this topic became hot. Now I see they were also deleting threads?! Ugh. That’s not transparent at all. Not what I expected based on my previous experience with Kagi.