OCR
OCR (optical character recognition) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text. Here it is useful for extracting text from visual novels when a texthooker is not available.
OwOCR
OwOCR is a command-line client for several Japanese OCR providers.
Installation
Install through pip or pipx according to your system:
pip install owocr
Or
pipx install owocr
NOTE
Requires Python 3.11, 3.12 or 3.13.
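To check which Python version your system provides before installing, something like the following sketch may help. It assumes the interpreter is available as python3 on your PATH; the function name is_supported_python is only illustrative.

```shell
#!/bin/bash
# Succeeds if the given "major.minor" version is one OwOCR supports (3.11 to 3.13).
# is_supported_python is an illustrative helper, not part of OwOCR.
is_supported_python() {
    case "$1" in
        3.11|3.12|3.13) return 0 ;;
        *) return 1 ;;
    esac
}

# Ask the interpreter on PATH for its version (assumption: it is called python3).
PY_VERSION="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])' 2>/dev/null)"

if is_supported_python "$PY_VERSION"; then
    echo "Python $PY_VERSION is supported"
else
    echo "Python ${PY_VERSION:-(not found)} is not supported; install 3.11, 3.12 or 3.13"
fi
```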
Providers
You need at least one OCR provider to recognize the text. As of right now the best one is Google Lens, but you can install several and test them yourself. Some of them run offline, in case there is no internet access.
Local:
- Manga OCR: install with pip install owocr[mangaocr] ("m" key)
- EasyOCR: install with pip install owocr[easyocr] ("e" key)
- RapidOCR: install with pip install owocr[rapidocr] ("r" key)
Cloud:
- Google Lens: Google Vision in disguise (no need for API keys!), install with pip install owocr[lens] ("l" key)
- Bing: Azure in disguise (no need for API keys!) ("b" key)
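Several providers can be installed in one go by listing their extras together, separated by commas. The sketch below only builds and prints the command rather than running it; build_requirement is a hypothetical helper, and the extras names are the ones listed above. Quoting the requirement keeps shells such as zsh from interpreting the square brackets.

```shell
#!/bin/bash
# Join provider extras into one pip requirement string, e.g. owocr[mangaocr,lens].
# build_requirement is an illustrative helper, not part of OwOCR.
build_requirement() {
    local IFS=,
    echo "owocr[$*]"
}

REQUIREMENT="$(build_requirement mangaocr easyocr rapidocr lens)"

# Print the command so it can be reviewed before running it.
echo "pip install \"$REQUIREMENT\""
```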
NOTE
To install providers with pipx, inject the packages into the owocr environment: pipx inject owocr [name]
Example: pipx inject owocr manga-ocr easyocr rapidocr-onnxruntime betterproto pyjson5 google-cloud-vision
Usage
To run it, open a terminal and run owocr. Once it is running, it automatically reads images from your clipboard; any recognized text is shown in the window and copied back to the clipboard. While it is running, you can press a provider's key to switch to that provider.
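To try the clipboard workflow with an existing image file, you can copy the image to the clipboard from another terminal. This is a sketch that assumes wl-copy (from wl-clipboard) on Wayland or xclip on X11 is installed; neither is part of OwOCR, and the function names are illustrative.

```shell
#!/bin/bash
# Pick a clipboard tool based on the session: wl-copy on Wayland, xclip on X11.
# Assumption: WAYLAND_DISPLAY is set only inside a Wayland session.
clipboard_tool() {
    if [ -n "${WAYLAND_DISPLAY:-}" ]; then
        echo "wl-copy"
    else
        echo "xclip"
    fi
}

# Copy a PNG file to the clipboard so a running OwOCR instance can pick it up.
copy_image() {
    local image="$1"
    case "$(clipboard_tool)" in
        wl-copy) wl-copy --type image/png < "$image" ;;
        xclip)   xclip -selection clipboard -t image/png -i "$image" ;;
    esac
}

echo "Clipboard tool for this session: $(clipboard_tool)"
```

With OwOCR running in one terminal, calling copy_image with the path of a PNG file in another should trigger a recognition.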
Setup
To streamline usage, a couple of scripts can be useful:
- Start OwOCR (owocr_start.sh):
#!/bin/bash
# Configuration
OWOCR_DIR="/tmp/owocr"
# Create directory if it doesn't exist
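if [ ! -d "$OWOCR_DIR" ]; then
    mkdir -p "$OWOCR_DIR"
    echo "Created directory: $OWOCR_DIR"
fi
# Run OwOCR: read images from the folder (-r), write text to the clipboard (-w),
# use Google Lens (-e), delete processed images (-d) and send notifications (-n)
owocr -r "$OWOCR_DIR" -w clipboard -e glens -d -n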
- Take screenshots with Spectacle (screenshot.sh):
#!/bin/bash
# Configuration
OWOCR_DIR="/tmp/owocr"
# Create directory if it doesn't exist
if [ ! -d "$OWOCR_DIR" ]; then
    mkdir -p "$OWOCR_DIR"
    echo "Created directory: $OWOCR_DIR"
fi
# Generate filename with current date and time
FILENAME="$(date +%Y%m%d_%H%M%S).png"
OUTPUT_FILE="$OWOCR_DIR/$FILENAME"
# Take screenshot
spectacle --region --background --nonotify --output "$OUTPUT_FILE"
Instead of filling the clipboard with images, which can be annoying and buggy, these scripts work as follows:
- A temporary folder is created where your screenshots are stored.
- OwOCR starts with Google Lens and watches the folder for changes.
- When a new picture is found, OwOCR reads the text from it.
- It copies the text to the clipboard, deletes the image and shows a notification.
To use it:
- Run owocr_start.sh in a terminal.
- Make a keyboard shortcut for screenshot.sh.
- Use the shortcut to take screenshots of the regions you want to OCR.
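Before a shortcut can launch the scripts, they usually need to be marked as executable. A minimal sketch, assuming both scripts were saved in ~/.local/bin (the location and the make_executable helper are assumptions; adjust the path to wherever you keep them):

```shell
#!/bin/bash
# Assumed location of the two scripts; override SCRIPT_DIR if yours differs.
SCRIPT_DIR="${SCRIPT_DIR:-$HOME/.local/bin}"

# Mark each given script as executable if it exists, and report anything missing.
make_executable() {
    local script
    for script in "$@"; do
        if [ -f "$script" ]; then
            chmod +x "$script"
            echo "Executable: $script"
        else
            echo "Not found: $script" >&2
        fi
    done
}

make_executable "$SCRIPT_DIR/owocr_start.sh" "$SCRIPT_DIR/screenshot.sh"
```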
NOTE
These are prototype scripts; feel free to modify them according to your desktop environment and needs.
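As one possible adaptation for desktops without Spectacle, the screenshot step could fall back to other region-capture tools. The sketch below assumes grim with slurp (wlroots-based Wayland compositors) or maim (X11) as alternatives; the candidate list and both function names are illustrative, not something OwOCR requires.

```shell
#!/bin/bash
# Report the first region-screenshot tool found on PATH, or "none".
# The candidate list is an assumption; extend it for your desktop.
pick_screenshot_tool() {
    local tool
    for tool in spectacle grim maim; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    echo "none"
    return 1
}

# Capture a user-selected region into the given file with the detected tool.
take_region_screenshot() {
    local output="$1"
    case "$(pick_screenshot_tool)" in
        spectacle) spectacle --region --background --nonotify --output "$output" ;;
        grim)      grim -g "$(slurp)" "$output" ;;  # Wayland (wlroots); needs slurp
        maim)      maim -s "$output" ;;             # X11
        *)         echo "No supported screenshot tool found" >&2; return 1 ;;
    esac
}

echo "Detected screenshot tool: $(pick_screenshot_tool)"
```

In screenshot.sh, the final spectacle line could then be replaced by a call such as take_region_screenshot "$OUTPUT_FILE".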
GameSentenceMiner
GameSentenceMiner (GSM) is a full GUI application that streamlines OCR and integrates with Anki.
GSM uses OwOCR together with OBS to detect text changes on the screen and OCR the text automatically, without the need to take screenshots manually.
Setup
- Download the latest AppImage from the GitHub releases page.
- Install OBS through your package manager, Flatpak, etc.
- Make sure the OBS WebSocket server is enabled on port 7274:
Tools -> WebSocket Server Settings
NOTE
The WebSocket plugin is included by default in recent versions of OBS.
If you don't see the option, you may need the optional dependency qrcodegencpp-cmake to enable WebSocket support.
- Video tutorial: https://www.youtube.com/watch?v=Y0BnL4TUzn8
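To confirm that something is actually listening on the WebSocket port after enabling it, a quick check along these lines may help. It is a sketch that assumes OBS runs on the same machine (127.0.0.1) and that the timeout command from coreutils is available; port_open is an illustrative helper, and the check only shows that the port accepts connections, not that OBS is the program behind it.

```shell
#!/bin/bash
# Assumed host; the port is the one mentioned in the setup steps.
OBS_HOST="127.0.0.1"
OBS_PORT="7274"

# Return success if a TCP connection to host:port can be opened within 2 seconds.
# Uses bash's /dev/tcp pseudo-device, so it needs bash rather than plain sh.
port_open() {
    local host="$1" port="$2"
    timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

if port_open "$OBS_HOST" "$OBS_PORT"; then
    echo "Something is listening on $OBS_HOST:$OBS_PORT"
else
    echo "Nothing is listening on $OBS_HOST:$OBS_PORT; check Tools -> WebSocket Server Settings in OBS"
fi
```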
Anki
TO-DO