Pebbling Club 🐧🪨

  • How I code with LLMs these days
    Notes
    To me, all signs point towards software engineering changing radically as a profession to be much more oriented around the what and why of software, and much less around the how. This will cause disruption at a massive scale in the long run. But in the short run, it's just a lot of fun to play with these tools and see what they can do.
  • How to Increase the VRAM of Your Mac with Apple Silicon for LLMs? | Hardware Corner
    Notes
    It is surprisingly straightforward to increase the VRAM of your Mac (Apple Silicon M1/M2/M3 chips) computer and use it to load large language models. Here’s the rundown of my experiments. ... I found a way to bypass this limitation. To allocate more of your Mac’s system RAM to VRAM – in this case, up to 28 GB – the following command can be used in the terminal window: sudo sysctl iogpu.wired_limit_mb=27536
  • ewintr.nl - building a personal, private ai computer on a budget
    Notes
    my attempt to build such a capable AI computer without spending too much. I ended up with a workstation with 48GB of VRAM that cost me around 1700 euros
  • Gwern Branwen - How an Anonymous Researcher Predicted AI's Trajectory
    Notes
    You wrote an interesting comment about getting your work into the LLM training corpus: "there has never been a more vital hinge-y time to write." Do you mean that in the sense that you will be this drop in the bucket that’s steering the Shoggoth one way or the other? Or do you mean it in the sense of making sure your values and persona persist somewhere in latent space?
  • How I use LLMs as a staff engineer | sean goedecke
    Notes
    Personally, I feel like I get a lot of value from AI. I think many of the people who don’t feel this way are “holding it wrong”: i.e. they’re not using language models in the most helpful ways. In this post, I’m going to list a bunch of ways I regularly use AI in my day-to-day as a staff engineer.
  • AI Slop, Suspicion, and Writing Back | Ben Congdon
    Notes
    Undoubtedly, the sloppification of the internet will likely get worse over the next few years. And as such, the returns to curating quality sources of content will only increase. My advice? Use an RSS feed reader, read Twitter lists instead of feeds, and find spaces where real discussion still happens (e.g. LessWrong and Lobsters still both seem slop-free).
  • DeepSeek FAQ – Stratechery by Ben Thompson
    Notes
    At the same time, there should be some humility about the fact that earlier iterations of the chip ban seem to have directly led to DeepSeek’s innovations. Those innovations, moreover, would extend to not just smuggled Nvidia chips or nerfed ones like the H800, but to Huawei’s Ascend chips as well. Indeed, you can very much make the case that the primary outcome of the chip ban is today’s crash in Nvidia’s stock price.
  • Knowing less about AI makes people more open to having it in their lives – new research
    Notes
    People with less knowledge about AI are actually more open to using the technology. We call this difference in adoption propensity the “lower literacy-higher receptivity” link.
  • Ignore the Grifters - AI Isn't Going to Kill the Software Industry — Dustin Ewers
    Notes
    I feel like half of my social media feed is composed of AI grifters saying software developers are not going to make it. Combine that sentiment with some economic headwinds and it's easy to feel like we're all screwed. I think that's bullshit. The best days of our industry lie ahead.
  • Don't use cosine similarity carelessly
    Notes
    Just as Midas discovered that turning everything to gold wasn't always helpful, we'll see that blindly applying cosine similarity to vectors can lead us astray. While embeddings do capture similarities, they often reflect the wrong kind - matching questions to questions rather than questions to answers, or getting distracted by superficial patterns like writing style and typos rather than meaning. This post shows you how to be more intentional about similarity and get better results.
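    A minimal sketch of the measure the note above warns about, in plain Python. The toy vectors and their names are hypothetical, not taken from the linked post; they only illustrate that cosine similarity compares direction and ignores magnitude, so two "questions" can score higher with each other than a question does with its answer.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional "embeddings" chosen by hand for illustration.
question = [1.0, 0.0, 1.0]
other_question = [2.0, 0.0, 2.0]  # same direction, larger magnitude
answer = [0.0, 1.0, 1.0]          # shares only one dimension with the question

question_to_question = cosine_similarity(question, other_question)
question_to_answer = cosine_similarity(question, answer)
```

    Here the two question vectors point in the same direction, so they score as near-identical, while the question and answer vectors score much lower. Whether that ranking is useful depends on what the embedding model encoded, which is the post's point.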
  • crawshaw - 2025-01-06
    Notes
    This document is a summary of my personal experiences using generative models while programming over the past year. It has not been a passive process. I have intentionally sought ways to use LLMs while programming to learn about them. The result has been that I now regularly use LLMs while working and I consider their benefits net-positive on my productivity. (My attempts to go back to programming without them are unpleasant.)
  • All-Hands-AI/OpenHands: 🙌 OpenHands: Code Less, Make More
    Notes
    Welcome to OpenHands (formerly OpenDevin), a platform for software development agents powered by AI. OpenHands agents can do anything a human developer can: modify code, run commands, browse the web, call APIs, and yes—even copy code snippets from StackOverflow.
  • Home - PaddleOCR Documentation
    Notes
    PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and apply them into practice.
  • mittagessen/kraken: OCR engine for all the languages
    Notes
    kraken is a turn-key OCR system optimized for historical and non-Latin script material.
  • Automatic Text Recognition / PyLaia · GitLab
    Notes
    PyLaia is a device agnostic, PyTorch based, deep learning toolkit for handwritten document analysis.
  • Implementing Filtered Semantic Search Using Pgvector and JavaScript
    Notes
    Conventional search methods rely on keyword matching, where the system locates exact words or phrases from the query within documents. This technique can be enhanced to better capture the context and intent behind the user's query, leading to more relevant and precise search results. Semantic search focuses on understanding the meaning and intent behind the query. Combining semantic search with filters—or additional parameters to narrow the results based on specific attributes—further improves accuracy. In this article, we explore semantic search with filters and demonstrate how you can implement it using pgvector and JavaScript.
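    A rough sketch of the filter-then-rank idea described above, in plain Python rather than the article's pgvector and JavaScript. The document records, the `category` field, and the function names are all hypothetical; a real pgvector setup would push both the filter and the distance ordering into SQL.

```python
import math


def cosine_distance(a, b):
    """1 minus cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))


def filtered_search(query_vec, docs, category, limit=2):
    """Keep only docs matching the attribute filter, then rank by distance."""
    candidates = [d for d in docs if d["category"] == category]
    candidates.sort(key=lambda d: cosine_distance(query_vec, d["embedding"]))
    return [d["id"] for d in candidates[:limit]]


# Hypothetical corpus with hand-written 2-dimensional embeddings.
docs = [
    {"id": 1, "category": "news", "embedding": [1.0, 0.0]},
    {"id": 2, "category": "blog", "embedding": [0.9, 0.1]},
    {"id": 3, "category": "blog", "embedding": [0.0, 1.0]},
    {"id": 4, "category": "blog", "embedding": [0.5, 0.5]},
]

results = filtered_search([1.0, 0.0], docs, category="blog")
```

    Document 1 is the closest vector overall, but the filter excludes it, so the search returns the nearest documents within the "blog" category only.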
  • The 70% problem: Hard truths about AI-assisted coding
    Notes
    AI isn't making our software dramatically better because software quality was (perhaps) never primarily limited by coding speed. The hard parts of software development – understanding requirements, designing maintainable systems, handling edge cases, ensuring security and performance – still require human judgment. What AI does do is let us iterate and experiment faster, potentially leading to better solutions through more rapid exploration. But only if we maintain our engineering discipline and use AI as a tool, not a replacement for good software practices. Remember: The goal isn't to write more code faster. It's to build better software. Used wisely, AI can help us do that. But it's still up to us to know what "better" means and how to achieve it.
  • heaversm/llamafile-code-completion: Use llamafile to generate inline code completions in react / next.js apps.
    Notes
    Use llamafile to generate inline code completions in react / next.js apps.
  • Godot Isn't Making it
    Notes
    New, more powerful chips require entirely new methods to rack-mount, operate and cool them, and all of these parts must operate in sync, as overheating GPUs will die. While these units are big, some of their internal components are microscopic in size, and unless properly cooled, their circuits will start to crumble when roasted by a guy typing "Garfield with Gun" into ChatGPT.
  • The Illustrated Word2vec – Jay Alammar – Visualizing machine learning one concept at a time.
    Notes
    I hope that you now have a sense for word embeddings and the word2vec algorithm. I also hope that now when you read a paper mentioning “skip gram with negative sampling” (SGNS) (like the recommendation system papers at the top), that you have a better sense for these concepts.
  • Bluesky, AI, and the battle for consent on the open web
    Notes
    So the problem Bluesky is dealing with is not so much a problem with Bluesky itself or its architecture, but one that’s inherent to the web itself and the nature of building these training datasets based on publicly-available data. Van Strien’s original act clearly showed the difference in culture between AI and open social web communities: on the former it’s commonplace to grab data if it can be read publicly (or even sometimes if it’s not), regardless of licensing or author consent, while on open social networks consent and authors’ rights are central community norms.
  • Between the Booms: AI in Winter – Communications of the ACM
    Notes
    After people stopped caring, artificial intelligence got more interesting.
  • On not using copilot - macwright.com
    Notes
    So, in summary: maybe people shy away from copilots because they’re tired of complexity, they’re tired of accelerating productivity without improving hours, they’re afraid of forgetting rote skills and basic knowledge, and they want to feel like writers, not managers. Maybe some or none of these things are true - they’re emotional responses and gut feelings based on predictions - but they matter nonetheless.
  • Adjacent Possible | Steven Johnson | Substack
    Notes
    A newsletter from author Steven Johnson exploring where good ideas come from—and how to keep them from turning against us. Click to read Adjacent Possible, by Steven Johnson, a Substack publication with tens of thousands of subscribers.
  • You Exist In The Long Context
    Notes
    The current state-of-the-art Gemini model can fit roughly 1.5 million words in its context. That’s enough for me to upload the full text of all fourteen of my books, plus every article, blog post, or interview I’ve ever published—and the entirety of my collection of research notes that I’ve compiled over the years. The Gemini team has announced plans for a model that could hold more than 7 million words in its short-term memory. That’s enough to fit everything I’ve ever written, plus the hundred books and articles that most profoundly shaped my thinking over the years. An advanced model capable of holding in focus all that information would have a profound familiarity with all the words and ideas that have shaped my personal mindset. Certainly its ability to provide accurate and properly-cited answers to questions about my worldview (or my intellectual worldview, at least) would exceed that of any other human. In some ways it would exceed my own knowledge, thanks to its ability to instantly recall facts from books I read twenty years ago, or make new associations between ideas that I have long since forgotten. It would lack any information about my personal or emotional history—though I suppose if I had maintained a private journal over the past decades it would be able to approximate that part of my mindset as well. But as reconstruction of my intellectual grounding, it would be unrivaled. If that is not considered material progress in AI, there is something wrong with our metrics.
  • An introduction to fine-tuning LLMs at home with Axolotl • The Register
    Notes
    In this guide we'll discuss: Where and when fine-tuning can be useful. Alternative approaches to extending the capabilities and behavior of pre-trained models. The importance of data preparation. How to fine-tune Mistral 7B using your own custom dataset with Axolotl. The many hyperparameters and their effect on training. Additional resources to help you fine-tune your models faster and more efficiently.
  • Sponsoring the Web Applets project, an open approach to AI-empowered web apps - Mozilla Innovations
    Notes
    Web Applets are small, secure pieces of web code (bundles of HTML, JavaScript, and CSS) that can run anywhere, allowing a model to take actions within software much like a human would and then generate interfaces appropriate for the user’s intent. For example, a developer could write an applet that enables a model to respond to a query about local coffee shops by conducting internet searches and then displaying the results on an in-line map. And because the model can read the internal state of each applet, it can then conduct follow-up actions to complete a user’s request (for example, updating the map to display only coffee shops that will be open tomorrow afternoon). Anyone can build Web Applets and host them on the Web, and any client can potentially support them.
  • Dead Labor, Dead Speech - by Nicholas Carr
    Notes
    If, as Marx argued, capital is dead labor, then the products of large language models might best be understood as dead speech. Just as factory workers produce, with their “living labor,” machines and other forms of physical capital that are then used, as “dead labor,” to produce more physical commodities, so human expressions of thought and creativity—“living speech” in the forms of writing, art, photography, and music—become raw materials used to produce “dead speech” in those same forms. LLMs, to continue with Marx’s horror-story metaphor, feed “vampire-like” on human culture. Without our words and pictures and songs, they would cease to function. They would become as silent as a corpse in a casket.
  • Everything I've learned so far about running local LLMs
    Notes
    Over the past month I’ve been exploring the rapidly evolving world of Large Language Models (LLM). It’s now accessible enough to run a LLM on a Raspberry Pi smarter than the original ChatGPT (November 2022). A modest desktop or laptop supports even smarter AI. It’s also private, offline, unlimited, and registration-free. The technology is improving at breakneck speed, and information is outdated in a matter of months. This article snapshots my practical, hands-on knowledge and experiences — information I wish I had when starting. Keep in mind that I’m a LLM layman, I have no novel insights to share, and it’s likely I’ve misunderstood certain aspects. In a year this article will mostly be a historical footnote, which is simultaneously exciting and scary.
  • Perceptually lossless (talking head) video compression at 22kbit/s | Martin Lumiste
    Notes
    I’ve been having quite a bit of fun with the fairly recent LivePortrait model, generating deepfakes of my friends for some cheap laughs.
  • [2410.16454] Does your LLM truly unlearn? An embarrassingly simple approach to recover unlearned knowledge
    Notes
    Large language models (LLMs) have shown remarkable proficiency in generating text, benefiting from extensive training on vast textual corpora. However, LLMs may also acquire unwanted behaviors from the diverse and sensitive nature of their training data, which can include copyrighted and private content. Machine unlearning has been introduced as a viable solution to remove the influence of such problematic content without the need for costly and time-consuming retraining. This process aims to erase specific knowledge from LLMs while preserving as much model utility as possible. Despite the effectiveness of current unlearning methods, little attention has been given to whether existing unlearning methods for LLMs truly achieve forgetting or merely hide the knowledge, which current unlearning benchmarks fail to detect. This paper reveals that applying quantization to models that have undergone unlearning can restore the "forgotten" information. To thoroughly evaluate this phenomenon, we conduct comprehensive experiments using various quantization techniques across multiple precision levels. We find that for unlearning methods with utility constraints, the unlearned model retains an average of 21% of the intended forgotten knowledge in full precision, which significantly increases to 83% after 4-bit quantization. Based on our empirical findings, we provide a theoretical explanation for the observed phenomenon and propose a quantization-robust unlearning strategy to mitigate this intricate issue...
  • Whatever AI Looks Like, It's Not | Defector
    Notes
    It is nightmarish to me to read reports of how reliant on ChatGPT students have become, even outsourcing to the machines the ideally very personal assignment "briefly introduce yourself and say what you're hoping to get out of this class." It is depressing to me to read defenses of those students, particularly this one that compares an AI-written essay to using a washing machine in that it reduces the time required for the labor. This makes sense only if the purpose of a student writing an essay is "to have written an essay," which it is not. The teacher did not assign it as busywork. The purpose of an essay is to learn and practice communication skills, critical thinking, organization of one's own thoughts. These are useful skills to develop, even (especially!) if you do not go into a writing career.
  • Oasis
    Notes
    Oasis takes in user keyboard input and generates real-time gameplay, including physics, game rules, and graphics. You can move around, jump, pick up items, break blocks, and more. There is no game engine; just a foundation model.
  • Vector Databases Are the Wrong Abstraction
    Notes
    A more effective abstraction is conceptualizing vector embeddings not as independent tables or data types but as a specialized index on the embedded data. This is not to say that vector embeddings are literally indexes in the traditional sense, like those in PostgreSQL or MySQL, which retrieve entire data rows from indexed tables. Instead, vector embeddings function as an indexing mechanism that retrieves the most relevant parts of the data based on its embeddings.
  • You can now run prompts against images, audio and video in your terminal using LLM
    Notes
    I released LLM 0.17 last night, the latest version of my combined CLI tool and Python library for interacting with hundreds of different Large Language Models such as GPT-4o, Llama, Claude and Gemini.
  • The A.I. Bubble is Bursting with Ed Zitron - YouTube
    Notes
    Big tech is betting tens of billions of dollars on AI being the next big thing, but what if it isn't?
  • Embeddings are underrated
    Notes
    Embeddings aren't exactly new, but they have become much more widely accessible in the last couple years. What embeddings offer to technical writers is the ability to discover connections between texts at previously impossible scales.
  • Xan 9 from Outer Space: "Generative AI is like if capitalism reinvented the fae. It’ll trick you into accepting its agreement and then it’ll steal your face and start speaking with your voice." — Bluesky
    Notes
    Generative AI is like if capitalism reinvented the fae. It’ll trick you into accepting its agreement and then it’ll steal your face and start speaking with your voice.
  • Aman's AI Journal • Primers • Ilya Sutskever's Top 30
    Notes
    Ilya Sutskever shared a list of 30 papers with John Carmack and said, “If you really learn all of these, you’ll know 90% of what matters today”
  • AI Winter Is Coming
    Notes
    This is how we’re headed for another AI winter, just as we saw with the fall of data science, crypto, and the modern data stack. And that’s actually a good thing. The promoters will hop onto the next trendy buzzword, while the real producers will keep moving forward, building a more capable future for AI.
  • Introducing sqlite-lembed: A SQLite extension for generating text embeddings locally | Alex Garcia's Blog
    Notes
    sqlite-lembed is a SQLite extension for generating text embeddings, meant to work alongside sqlite-vec. With a single embeddings model file provided in the .gguf format, you can generate embeddings using regular SQL functions, and store them directly inside your SQLite database. No extra server, process, or configuration needed!
  • KNN queries | sqlite-vec
    Notes
    The most common use-case for vectors in databases is for K-nearest-neighbors (KNN) queries. You'll have a table of vectors, and you'll want to find the K closest
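    The K-nearest-neighbors idea in the note above can be sketched as a brute-force scan in plain Python. This is not sqlite-vec's API; the vector table, its ids, and the `knn` helper are hypothetical stand-ins, and Euclidean distance is an assumed choice of metric.

```python
import heapq
import math


def knn(query, vectors, k):
    """Return the k (distance, id) pairs closest to query, nearest first."""
    scored = ((math.dist(query, vec), vec_id) for vec_id, vec in vectors.items())
    return heapq.nsmallest(k, scored)


# Hypothetical "table" of vectors keyed by row id.
vectors = {
    "a": [0.0, 0.0],
    "b": [1.0, 1.0],
    "c": [5.0, 5.0],
    "d": [0.5, 0.0],
}

nearest = knn([0.0, 0.0], vectors, k=2)
nearest_ids = [vec_id for _, vec_id in nearest]
```

    A brute-force scan like this compares the query against every stored vector, which is fine for small prototypes and is the baseline that vector indexes try to beat.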
  • Hybrid full-text search and vector search with SQLite | Alex Garcia's Blog
    Notes
    You can use SQLite's builtin full-text search (FTS5) extension and semantic search with sqlite-vec to create "hybrid search" in your applications. You can combine results using different methods like keyword-first, re-ranking by "semantics", and reciprocal rank fusion. Best of all, since it's all in SQLite, experiments and prototypes are cheap and easy, no 3rd party services required!
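    Of the combination methods named above, reciprocal rank fusion is the easiest to show in isolation. The sketch below is plain Python, not the blog post's SQL; the document ids are hypothetical, and the constant k=60 is an assumed default commonly used for this technique rather than a value taken from the post.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked id lists; each appearance adds 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword (full-text) query and a vector query.
keyword_hits = ["doc1", "doc2", "doc3"]
semantic_hits = ["doc3", "doc1", "doc4"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

    Documents that appear in both lists rise to the top because their reciprocal-rank contributions add up, and the method needs only ranks, so the two searches do not have to produce comparable scores.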
  • Using Llamafiles for Embeddings in Local RAG Applications
  • BART Model for Text Summarization
    Notes
    This tutorial covers the origins and uses of the BART model for text summarization tasks, and concludes with a brief demo for using BART with Paperspace Notebooks.
  • Overview — Ray 2.34.0
  • Open-Source LLMs - Schneier on Security
    Notes
    We have entered an era of LLM democratization. By showing that smaller models can be highly effective, enabling easy experimentation, diversifying control, and providing incentives that are not profit motivated, open-source initiatives are moving us into a more dynamic and inclusive AI landscape. This doesn’t mean that some of these models won’t be biased, or wrong, or used to generate disinformation or abuse. But it does mean that controlling this technology is going to take an entirely different approach than regulating the large players.
  • ChatGPT is not ‘artificial intelligence.’ It’s theft. | America Magazine
    Notes
    But in calling these programs “artificial intelligence” we grant them a claim to authorship that is simply untrue. Each of those tokens used by programs like ChatGPT—the “language” in their “large language model”—represents a tiny, tiny piece of material that someone else created. And those authors are not credited for it, paid for it or asked permission for its use. In a sense, these machine-learning bots are actually the most advanced form of a chop shop: They steal material from creators (that is, they use it without permission), cut that material into parts so small that no one can trace them and then repurpose them to form new products.
  • ChatGPT Is a Blurry JPEG of the Web | The New Yorker
    Notes
    It’s possible that, in the future, we will build an A.I. that is capable of writing good prose based on nothing but its own experience of the world. The day we achieve that will be momentous indeed—but that day lies far beyond our prediction horizon. In the meantime, it’s reasonable to ask, What use is there in having something that rephrases the Web?
  • Transcribing all our conversations 24/7 will be weird and also useful maybe (Interconnected)
    Notes
    Sooner or later, every single conversation I have will be recorded and transcribed and I’ll be able to look back at it later – details from a phone call with the bank, in the hardware store asking a question, someone mentions a book at the pub, an idea in a workshop. Ignoring the societal consequences for a sec lol ahem… how should the app to manage all that chatter work?