VectorVFS is a lightweight Python package that transforms your Linux filesystem into a vector database by leveraging the native VFS (Virtual File System) extended attributes. Rather than maintaining a separate index or external database, VectorVFS stores vector embeddings directly alongside each file—turning your existing directory structure into an efficient and semantically searchable embedding store.
Swear in your search request. I know it sounds ridiculous, but the most effective way I've found of it not doing the AI summary is just to add "fucking" to my search.
Conventional search methods rely on keyword matching, where the system locates exact words or phrases from the query within documents. This technique can be enhanced to better capture the context and intent behind the user's query, leading to more relevant and precise search results. Semantic search focuses on understanding the meaning and intent behind the query. Combining semantic search with filters—or additional parameters to narrow the results based on specific attributes—further improves accuracy.
In this article, we explore semantic search with filters and demonstrate how you can implement it using pgvector and JavaScript.
I wrote code (in PHP) to import my old bookmarks. And you didn’t misread that, it’s 28,000 links over the last 20 years. They come from my original links blog, and from Delicious, and from Instapaper, and from Twitter back when I backfilled those shared links into Pinboard, which I then backfilled into my link blog database. It’s a lot of things, but really, it’s straightforward from a programming point of view. It’s a simple POST to the REST API which Linkding provides by default. I have shared most of the code I wrote as a Gist in GitHub. It includes code in PHP to convert exported CSV from Instapaper to a JSON format that my poster code can use. And over the last 2 days and over many hours, every few seconds a link would be posted. And as of now, it’s finished.
sqlite-lembed is a SQLite extension for generating text embeddings, meant to work alongside sqlite-vec. With a single embeddings model file provided in the .gguf format, you can generate embeddings using regular SQL functions, and store them directly inside your SQLite database. No extra server, process, or configuration needed!
The most common use-case for vectors in databases is for K-nearest-neighbors (KNN) queries. You'll have a table of vectors, and you'll want to find the K closest
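As a rough illustration of what a KNN query computes, here is a minimal brute-force sketch in plain Python. It does not use sqlite-vec or any database; the `knn` function, the sample `vectors` dictionary, and the choice of Euclidean distance are all assumptions made for this example.

```python
import math


def knn(query, vectors, k):
    """Return the k (distance, id) pairs closest to `query`.

    Brute force: measures the Euclidean distance to every stored vector,
    then keeps the k smallest. Hypothetical helper, not a library API.
    """
    scored = [(math.dist(query, vec), vec_id) for vec_id, vec in vectors.items()]
    return sorted(scored)[:k]


# Hypothetical "table of vectors", keyed by row id.
vectors = {"a": [0.0, 0.0], "b": [1.0, 1.0], "c": [5.0, 5.0]}
nearest = knn([0.9, 0.9], vectors, k=2)
nearest_ids = [vec_id for _, vec_id in nearest]
```

A vector extension would typically run the same kind of comparison inside the database rather than in application code.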
You can use SQLite's builtin full-text search (FTS5) extension and semantic search with sqlite-vec to create "hybrid search" in your applications. You can combine results using different methods like keyword-first, re-ranking by "semantics", and reciprocal rank fusion. Best of all, since it's all in SQLite, experiments and prototypes are cheap and easy, no 3rd party services required!
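Of the combination methods named above, reciprocal rank fusion is the easiest to sketch outside the database. The following is a minimal Python version, assuming each search backend has already returned an ordered list of document ids. The function name, the sample ids, and the constant `k=60` (a commonly used default) are assumptions for illustration, not taken from the linked post.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list.

    Each id scores 1 / (k + rank) for every list it appears in
    (rank starts at 1); ids are returned by descending total score,
    with ties broken alphabetically for a stable result.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], str(doc_id)))


# Hypothetical results: one list from keyword search, one from vector search.
fts_hits = ["doc1", "doc2", "doc3"]
vec_hits = ["doc3", "doc1", "doc4"]
fused = reciprocal_rank_fusion([fts_hits, vec_hits])
```

Documents that rank well in both lists rise to the top, while documents found by only one method still appear further down.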
Welcome to APIs.io - an experimental API Search service to help discover APIs on the web. The service uses the APIs.json proposed discovery format. To find APIs type your request in the search box.
Yahoo Inc. has been quietly trying to find a way out of its struggling Web-search partnership with Microsoft Corp., a person familiar with the situation said, but has so far failed in that effort.
Two weeks ago, Google disabled the + operator for searches, requiring quotation marks to force inclusion of a word.
Today, Google Plus rolled out a new feature - Pages for companies and brands, so you can "build relationships with all the things you care about". Included is Direct Connect - go straight to Pepsi's Google+ page by searching for +Pepsi.
A9.com invented the OpenSearch technology for search aggregation in 2004.
OpenSearch is a set of simple formats for the sharing of search results. Any website that has a search feature can make their results available in OpenSearch format. Other tools can then read those search results.
"@David, it's real. We've determined by looking at our traffic stats that people are doing Google searches for "facebook login" and coming upon RWW. They see the FB Connect button and assume that RWW is the "new Facebook."
Sigh.
The Internet Is Hard."
"The book aims to provide a modern approach to information retrieval from a computer science perspective. It is based on a course we have been teaching in various forms at Stanford University and at the University of Stuttgart."
"Providing search services that span a number of disparate websites is a challenging problem that in the past has been left to the big-boys such as Google. However Amazon's OpenSearch RSS format is changing this reality and providing a means for effective multiple website search to be deployed at low cost by small development teams."
If you can get past the rambling paragraphs of awkward fun-poking at tags interspersed with library science / web 2.0 / cultural references—as well as a discovery of what, you know, Flickr is all about—there's a well-embellished and obsessively-assembled statistical analysis of tags vs title vs notes in finding photos featuring tourist heel-spinning on the testicles of a bull mosaic in Milan. My impression is that she's missed the point of tags, but I'm having trouble reducing the impression to a critique.
"By a happy coincidence, the two primary difficulties inherent in searching emails — its sheer volume and its "noisy" nature — are susceptible of recent developments in machine learning technologies that make this task manageable. This is what IT Disc
"Here is to hoping Yahoo! figures out how to better leverage Delicious without killing it in the process. They have done an admirable job in not killing Flickr, and lets see if they can do the same for Delicious. In the meantime, kudos to Joshua and the D
"You go to your favorite Web 2.0 search engine and set up a query like http://web20.example.com/search=john+doe&ouptut=atom and search for "john doe," but rather than getting back results as the usual HTML web page, you get it back in Atom format."
"Used to indicate the re-distribution restrictions for a feed. The 'relationship' attribute is used to indicate whether a feed will 'allow' or 'deny' access."
"With MonitorThis you can subscribe to 22 different search engine feeds at the same time. Enter a search term and click the 'make monitor.opml' button to get a list of rss feeds in OPML format."
Uhh, what? Who's the genius who thought this would work? "Every time you search the web you stand a chance of winning a prize from Kevin Federline."
"So now command line interfaces are back again, hiding under the name of search. ... And they will get better and better with time: mark my words, that is my prediction for the future of interfaces."
"All joking aside, SPA is not helpful, it's not cool, and it's not winning you readers -- It's bling, a silly little shiney thing designed specifically to increase awareness of Snap.com"
"if I have a bunch of publicly accessible objects in my S3 buckets, then won't I have to pay bandwidth costs when Google/yahoo/a9 and all the other search engines go and spider my buckets?"
"That's just an OPML document which displays all the categories and feeds we pre-populate with your keyword on Gada.be. Go ahead, change the word "keyword" to something else - watch what happens."