Pebbling Club 🐧🪨

  • Home - PaddleOCR Documentation
    Notes
    PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and apply them into practice.
  • mittagessen/kraken: OCR engine for all the languages
    Notes
    kraken is a turn-key OCR system optimized for historical and non-Latin script material.
  • Automatic Text Recognition / PyLaia · GitLab
    Notes
    PyLaia is a device agnostic, PyTorch based, deep learning toolkit for handwritten document analysis.
  • OCRAD.js: Pure Javascript OCR via Emscripten – Blog
    Notes
    The idea of the extension was kind of simple and also kind of magical: a browser extension that allowed users to highlight, copy, and paste text from any image as if it were plain text. Of course the implementation is a bit difficult and actually relies on the advent of a number of newfangled technologies.
  • How we built a DIY book scanner with speeds of 150 pages per minute | Ars Technica
  • unpaper 0.3
    Notes
    unpaper is a post-processing tool for scanned sheets of paper, especially for book pages that have been scanned from previously created photocopies. The main purpose is to make scanned book pages better readable on screen after conversion to PDF.
  • John Resig - OCR and Neural Nets in JavaScript
    Notes
    "A pretty amazing piece of JavaScript dropped yesterday and it's going to take a little bit to digest it all. It's a GreaseMonkey script, written by 'Shaun Friedle', that automatically solves captchas provided by the site Megaupload. There's a demo online if you wish to give it a spin."
  • The Digital Book: Paper's Last Hurrah
    Notes
    "While Sony Readers and Amazon Kindles take to the scene, one paper lover, in celebration of the Blood on Paper exhibition (something we've never heard of but have a pretty good idea what it's about), released this USB copy of The New Machiavelli. Photographed page by page, those who think its contents might resemble Google Book Search would be dreadfully wrong. We thoroughly enjoy that his hands are in each shot. That's what the fancy Kindle has been missing all this time!"
  • tesseract-ocr - Google Code
    Notes
    "one of the most accurate open source OCR engines available"