Eppascan – Epson scanner connection for Paperless-NGX

A service for a headless Paperless NGX server that waits for the scan button to be pressed on any Epson scanner on the network to automatically start a scan and transfer the documents to the Paperless Consume directory.

Our chaos

We were constantly looking for documents on all sorts of things. The folders and pointless piles in drawers and shelves got thicker and thicker, the searches took longer and longer. It was a real disaster. With Paperless-NGX we could get to grips with our flood of paper:

  • Recognise text in documents (OCR)
  • Find quickly using full text search
  • Define rules for automatic filing
  • Automatically assign tags and correspondents
  • Manage documents centrally in the web browser
  • Process different file formats
  • Enable long-term archiving (PDF/A)
  • Manage correspondents (contacts)

And a document scanner was needed. It had to be network-compatible and fast. We need to scan the documents from previous years, and our Canon multifunctional device is far too slow for that. It can scan to a network, but it’s just far too complicated to use. No, this thing is a pain in the arse for our project.

There are document scanners that can scan directly to a network drive. They all have their price and when we wanted to buy, none of them were on offer. My favourite would have been Brother, they simply offer the best drivers, even after many, many years. But they’re not exactly cheap either.

There are countless document scanners on the market. Prices are around 500 to 900 EUR. Sure, with a fat budget that’s no problem, but we don’t have one.

The cheapest good devices are from Epson. The Epson WorkForce ES-580W for around 380 EUR offers everything we need. But it’s still too expensive. My pain threshold was 240 EUR, which is still a lot of money from my point of view!

However, there was a little brother that on paper could do almost everything just as well as the 580W, but didn’t have a display! I don’t need it, the WebGUI of the scanner is enough for me! At least I thought so.

We got our Epson WorkForce ES-500W II directly from Epson on Ebay with some discount vouchers for 269 EUR. Given the good price, the promised network capability and the excellent reviews on YouTube, we went for it.

ES-500W II:

  • High scanning speed (35 pages/min)
  • Double-sided scanning in one pass (= 70 pages/min)
  • 50 sheet ADF
  • Robust mechanics (up to 4,000 scans/day)
  • Recognises double feed (ultrasonic sensor)

Oh, Epson!

Epson does not offer support for headless servers such as a Paperless-NGX host. Scanning only works with a PC running a graphical operating system such as macOS, Linux or Windows on the same network. Epson Scan 2 must be running on it and the user must click on a scan button. The scanner then sends the scans to the desktop PC, which moves them to the network drive. This is what Epson calls "network capability". 🙁 I call it "cumbersome" and a "vague advertising promise".

I contacted the German support. No, they don’t have a solution to my problem. They don’t have any documentation of the network protocols either. So I had to do it myself and create a script.

Free knowledge

I received the scanner on 2 April 2025, and it wasn’t until 13 May that Eppascan ran without errors for the first time. That definitely took too much time, out of all proportion to the money saved on the scanner. The script was actually much more powerful at some points during development. Unfortunately, I got bogged down in things like modularity or detecting the scanners on the network. I had umpteen dedicated scripts, and in the end many things just didn’t work reliably, so I started all over again (again and again!). At one point it got so bad that I was dreaming of the script and of troubleshooting.

Finding complicated solutions is easy. I learnt back in my Commodore 64 days that in the end only simple solutions lead to the goal. The code has to be understandable, clear and maintainable.

I am an absolute beginner in this field. Without the various AIs and their good explanations of the various commands I would never have created this script. I don’t like reading instructions and memorising commands. I don’t have the time for that either. And I’m also too old to read manpages. 🙂 I only knew what I wanted to achieve, which tools (e.g. SANE) I wanted to use and which solutions (button recognition using packets sent by the scanner) I wanted. Especially with the latter points, AIs are terribly unimaginative.

But as with my 3D printed models, I am convinced that knowledge must be free. The whole website is based on it! Nobody should have to start from scratch for their projects! I also gathered my information from free sources on the net:

  • My research turned into instructions and the "My thoughts on" articles
  • My own models can be found for free on Thingiverse and Printables
  • The house blog serves as a source of information for anyone who wants to build or renovate a house
  • My Amazon reviews became test reports

I am now a software developer!

Well, anyway, I now have my own GitHub repo. 🙂 There you can find Eppascan. The script cost me a lot of time and nerves, but it works really well now, at least on my Paperless-NGX LXC under Proxmox and with my Epson ES-500W II.

But what is Eppascan now?

Eppascan

Epson scanner connection for Paperless-NGX on a headless server (Proxmox)

Description

Eppascan is an automated solution for the seamless integration of Epson network scanners with Paperless-NGX. The system consists of two components:

  1. eppascan.sh
    A daemon script that continuously listens for Epson scanner broadcasts on the local network on a Linux server. As soon as a scan is triggered by pressing the button on the scanner, the script recognises the scanner, optionally waits for initialisation, starts an automatic duplex scan of all pages in the feeder (ADF) and saves the scans directly in the Paperless consume directory (/opt/paperless/consume). Status and error messages are written to a log file with a timestamp.
  2. eppascan_install.sh
    An installation script that installs all necessary packages and dependencies, copies the main script to the correct location, sets up the system service (systemd) and sets the necessary permissions. The scan service is then started automatically at system startup and is immediately ready for use.

With Eppascan, scanning processes are integrated directly and efficiently into the paperless workflow – without any manual intermediate steps or additional software.

Instructions

Please note that it takes about 15 seconds after pressing the scan button on the Epson scanner for the actual scanning process to begin. Please be patient!

The procedure is as follows:

  • Insert the paper into the feeder (ADF).
  • The scanner wakes up, the WLAN symbols light up.
  • Press the scan button.
  • The orange exclamation mark on the scanner lights up briefly and goes out again after a few seconds.
  • Only then does the scan start and all pages in the ADF are automatically fed and processed.

Cause of the waiting time:

After pressing the scan button, the scanner first searches for the original Epson Scan 2 software in the network. As this is not available in this solution, the scanner waits a moment and then sends another broadcast. Unfortunately, it does not respond to requests or commands during this time. This delay is therefore due to technical reasons and unfortunately cannot be changed.

Note on configuration:

The default waiting time in the script is 15 seconds. If the scanning process does not start reliably (e.g. because the scanner is not yet ready after waking up), increase this value in the script (SCAN_INITIAL_DELAY).
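
Changing the delay comes down to editing one line in the installed script. A minimal sketch, assuming the variable sits on its own line as SCAN_INITIAL_DELAY=15 (as in the script listing further down); the helper name set_scan_delay is made up for this example and is not part of Eppascan:

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite the SCAN_INITIAL_DELAY line in eppascan.sh.
# Usage: set_scan_delay <script-file> <seconds>
set_scan_delay() {
    local file="$1" seconds="$2"
    # Accept whole numbers only so the script stays valid Bash.
    [[ "$seconds" =~ ^[0-9]+$ ]] || { echo "seconds must be a whole number" >&2; return 1; }
    # Replace the complete assignment line in place (GNU sed syntax).
    sed -i "s/^SCAN_INITIAL_DELAY=.*/SCAN_INITIAL_DELAY=${seconds}/" "$file"
}
```

For example, set_scan_delay /usr/local/bin/eppascan.sh 20 followed by systemctl restart eppascan.service would raise the delay to 20 seconds.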

Documentation

Note: This script was specially developed and tested for the Paperless-NGX installation via the ProxmoxVE UserHelperScripts. In this environment, the script is executed as root, as tcpdump only works via this account.

1. Introduction

Eppascan is an open source solution for the automatic integration of Epson network scanners (e.g. ES-500WII) with Paperless-NGX. The system consists of a daemon script for scanning and an installation script for simple setup as a system service. After installation, scanning processes are automatically recognised on the scanner and the scanned documents are stored directly in the Paperless-NGX consume directory.

2. How it works

  1. Listening for scanner broadcasts: The main script (eppascan.sh) monitors the local network for the IGMP multicast packets that Epson scanners send out when the scan button is pressed.
  2. Automatic scanning: After recognising a scanner signal, a duplex scan of all pages in the automatic document feeder (ADF) is started. The scans are saved as image files (JPEG) with a timestamp in the destination folder.
  3. Direct integration into Paperless-NGX: The scan files end up in the /opt/paperless/consume directory, from where Paperless-NGX processes them automatically (conversion to PDF, OCR, sorting, etc.).
  4. Automatic service start: The installation script (eppascan_install.sh) sets up the main script as a system service (systemd) so that Eppascan runs automatically in the background after each restart.
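
Step 1 can be tried out without a scanner. The following sketch uses the same grep expressions as the main script, but feeds them a made-up capture text instead of live tcpdump output. The sample packet and the function name extract_scanner_ip are assumptions for illustration only:

```shell
#!/usr/bin/env bash
# Sketch of step 1: pick the sender IP out of one captured IGMP packet.
MULTICAST_ADDR="239.255.255.253"

# Reads tcpdump text on stdin. Prints the first IPv4 address if the Epson
# multicast address appears in the capture, otherwise prints nothing.
extract_scanner_ip() {
    local capture
    capture=$(cat)
    if grep -qF "$MULTICAST_ADDR" <<<"$capture"; then
        grep -oE '[0-9]{1,3}(\.[0-9]{1,3}){3}' <<<"$capture" | head -n 1
    fi
}

# Illustrative capture text (assumed format, not a recorded packet).
sample='12:00:00.000000 IP (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto IGMP (2), length 32, options (RA))
    192.168.1.42 > 239.255.255.253: igmp v2 report 239.255.255.253'

extract_scanner_ip <<<"$sample"
```

The sender address comes first in the packet line, which is why taking the first match is enough here.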

3. Requirements

  • An Epson network scanner (tested with the ES-500W II; other models with rudimentary network function but without scan-to-folder should also work). The scanner must be connected to the network.
  • A Linux server (Debian/Ubuntu recommended, Proxmox LXC tested)
  • Paperless-NGX installed and configured (/opt/paperless/consume exists and is writable)
  • Network access between scanner and server (same subnet, multicast allowed)
  • Root rights (as usual with Proxmox UserHelperScripts)
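
Some of these requirements can be checked before installing. A small sketch with two hypothetical helper functions (check_consume_dir and missing_tools are not part of Eppascan):

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers for the requirements listed above.

# Succeeds if the directory exists and the current user may write to it.
check_consume_dir() {
    local dir="$1"
    [[ -d "$dir" && -w "$dir" ]]
}

# Prints the names of the given commands that are not found on the PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

A call such as check_consume_dir /opt/paperless/consume || echo "consume directory missing or not writable" covers the Paperless side, and missing_tools tcpdump scanimage lists whatever the installer would still have to install.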

4. Installation

  1. Clone repository
    git clone https://github.com/michael-hessi/Eppascan.git /tmp/Eppascan
  2. Change to the directory and make the installation script executable
    cd /tmp/Eppascan 
    chmod +x eppascan_install.sh
  3. Execute the installation script
    ./eppascan_install.sh

The installation script does the following automatically:

  • Installation of required packages (tcpdump, sane-utils)
  • Copy the main script to /usr/local/bin/
  • Creating and activating the systemd service
  • Set the appropriate rights

Done!
After installation, the service runs automatically in the background and starts every time the system is restarted.
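
For orientation, a unit file for a daemon script like this could look roughly as follows. This is a generic sketch and not the file the installer writes: the directives chosen and the helper name write_unit_sketch are assumptions, only the script path is taken from this article.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: write an example systemd unit to the given path.
# The real unit is generated by eppascan_install.sh and may differ.
write_unit_sketch() {
    local unit_path="$1"
    cat > "$unit_path" <<'EOF'
[Unit]
Description=Eppascan - Epson scanner connection for Paperless-NGX
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
ExecStart=/usr/local/bin/eppascan.sh
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
}
```

On a real system the unit lives at /etc/systemd/system/eppascan.service. Whether the daemon is running can be checked with systemctl status eppascan.service.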

5. Configuration

The most important settings can be found directly in the script /usr/local/bin/eppascan.sh and can be customised before installation if required:

Variable            Description                                        Example value
NETWORK_INTERFACE   Network interface of the server                    eth0
MULTICAST_ADDR      Multicast address for Epson scanners               239.255.255.253
SCAN_OUTPUT_DIR     Destination folder for scans (Paperless consume)   /opt/paperless/consume
SCAN_RESOLUTION     Scan resolution in dpi                             300
SCAN_MODE           Scan mode (colour, greyscale, line drawing)        Gray
SCAN_FORMAT         Output format                                      jpeg

Tip: For optimum OCR results in Paperless-NGX, we recommend a scan resolution of at least 300 dpi and the use of greyscale mode.

6. Customisation for other scanners or networks

  • Multicast address: Epson uses 239.255.255.253. Other manufacturers may use other addresses (e.g. Brother: 239.255.255.250). The address can be determined with tcpdump -i eth0 igmp.
  • Backend: The script uses the epsonds backend from SANE by default. For other scanners, select epson2 or a suitable backend if necessary.
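
The device name passed to scanimage is made up of the backend name, the keyword net and the scanner's IP address. A tiny sketch of that naming scheme; the helper name sane_net_device is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the SANE device string for a network scanner.
# "epsonds" is the backend Eppascan uses; "epson2" is the alternative named above.
sane_net_device() {
    local backend="$1" ip="$2"
    printf '%s:net:%s\n' "$backend" "$ip"
}

sane_net_device epsonds 192.168.1.42
sane_net_device epson2 192.168.1.42
```

Whether a given backend actually talks to a particular scanner model still has to be tested on the device itself, for example with scanimage -L.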

7. Troubleshooting

Check Eppascan log

If there are problems with scanner recognition or the scanning process, it is advisable to check the Eppascan log file. This contains detailed status and error messages that help with the diagnosis.

cat /var/log/eppascan_scanimage_errors.log 

Or for the last lines:

tail -n 50 /var/log/eppascan_scanimage_errors.log

The log file shows, among other things

  • Recognition of the scanner and its IP address
  • Start and completion of scanning processes
  • Error messages in the event of network or scanner problems
  • Information on possible causes of failed scans

Tip:
If the scan does not start or is cancelled, first take a look at the log file. The messages there usually provide a clear indication of the cause.
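
Since the script prefixes everything scanimage reports with [scanimage], those lines can be filtered out of the log on their own. A minimal sketch; the function name scanimage_messages is an assumption, the log path is the one used by Eppascan:

```shell
#!/usr/bin/env bash
# Hypothetical helper: show only the lines that scanimage wrote to the Eppascan log.
# Usage: scanimage_messages [logfile] [max-lines]
scanimage_messages() {
    local logfile="${1:-/var/log/eppascan_scanimage_errors.log}"
    local count="${2:-20}"
    # -F treats "[scanimage]" as a literal string, not as a bracket expression.
    grep -F '[scanimage]' "$logfile" | tail -n "$count"
}
```

Called without arguments, it prints the last 20 scanimage messages from the default log file.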

  • Scanner is not recognised:
    – Check the multicast address with tcpdump -i eth0 igmp.
    – Check firewall/network rules: UDP 3289 (ENPC), UDP 3702 (WS-Discovery), TCP 1865 (network scan), TCP 445 (SMB)
    – Scanner and server must be in the same subnet.
  • Scan cancels after one page:
    – Load ADF correctly (exclude paper jam).
    – Test SANE backend:

    scanimage -L
  • Files do not appear in Paperless-NGX:
    – Check permissions for /opt/paperless/consume:
    chown -R paperless:paperless /opt/paperless/consume
    – Check paperless consumer log:
    journalctl -u paperless-consumer -f

8. Uninstall

  • Call up the installer:

    eppascan_install.sh

    Select Uninstall.

9. References

How does the scan command work with variables in Bash?

As a complete beginner, this was the biggest hurdle for me! Variables are used in the Bash script to flexibly control the scanning process. This allows you to dynamically adjust settings such as resolution, mode or file names without having to change the command every time.

timeout "${SCAN_TIMEOUT}s" scanimage \
    --device-name="epsonds:net:$DETECTED_IP" \
    --source "ADF Duplex" \
    --resolution "$SCAN_RESOLUTION" \
    --mode "$SCAN_MODE" \
    --format "$SCAN_FORMAT" \
    --batch="$SCAN_FULLPATH_PATTERN" \
    --batch-count -1 \
    1>/dev/null \
    2> >(while read -r line; do log_eppascan "[scanimage] $line"; done) || true

  • timeout "${SCAN_TIMEOUT}s": Lets the scan run only for the specified time (in seconds). Bash replaces the variable $SCAN_TIMEOUT with its value, e.g. timeout "60s". If the scan hangs, timeout terminates scanimage so that the daemon does not stop responding and the main loop can continue.
  • --device-name="epsonds:net:$DETECTED_IP": The variable $DETECTED_IP contains the IP address of the scanner, which was determined by tcpdump. This allows the script to react to any Epson scanner in the network.
  • --resolution "$SCAN_RESOLUTION", --mode "$SCAN_MODE", --format "$SCAN_FORMAT", --batch="$SCAN_FULLPATH_PATTERN": All these options get their value from variables. Bash replaces variables inside double quotes with their value, and the quotes keep the value together as one argument even if it contains spaces or special characters. The same goes for the literal "ADF Duplex", which would otherwise cause scanimage exit code 9.
  • --batch-count -1: Ensures that all sheets in the automatic document feeder (ADF) are scanned. However, this depends on the scanner firmware and may not always work correctly.
  • 1>/dev/null: The standard output (everything that scanimage normally displays) is discarded. In the end this is a service and should not give any CLI feedback.
  • 2> >(while read -r line; do log_eppascan "[scanimage] $line"; done): Error output (stderr) is forwarded line by line to a separate function (log_eppascan). This is a so-called process substitution. It means that the error messages from scanimage also appear in the Eppascan log.
  • || true: Even if the scan command fails, the script continues to run and does not abort. Important to keep the service online.

Important: In Bash, variables written as $VARNAME are replaced by their value when the command is executed. If they are enclosed in double quotes, values with spaces are also passed correctly as a single argument.
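
The effect of the quotes is easy to see with a small experiment. The function count_args and the variable source_name are invented for this example:

```shell
#!/usr/bin/env bash
# Prints how many separate arguments it received.
count_args() {
    echo "$#"
}

source_name="ADF Duplex"

count_args $source_name      # unquoted: split at the space into two arguments
count_args "$source_name"    # quoted: passed on as one argument
```

With Bash's default settings, the first call reports 2 and the second reports 1. Eppascan itself sets IFS to newline and tab, so unquoted values would not be split at spaces there, but quoting remains the reliable way to pass a value through unchanged.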

Example:
Assuming the variables are set like this:

SCAN_TIMEOUT=60
DETECTED_IP=192.168.1.42
SCAN_RESOLUTION=300
SCAN_MODE=Gray
SCAN_FORMAT=jpeg
SCAN_FULLPATH_PATTERN="/opt/paperless/consume/scan_%d.jpg"

Bash will then execute the command like this:

timeout "60s" scanimage \
    --device-name="epsonds:net:192.168.1.42" \
    --source "ADF Duplex" \
    --resolution "300" \
    --mode "Gray" \
    --format "jpeg" \
    --batch="/opt/paperless/consume/scan_%d.jpg" \
    --batch-count -1 \
    1>/dev/null \
    2> >(while read -r line; do log_eppascan "[scanimage] $line"; done) || true

To summarise:
Bash automatically replaces variables with their values when executing the command. It was important to me that the scanning process can be flexibly and easily customised without changing the script itself.
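
One detail of the --batch option deserves its own illustration: scanimage expects a printf-style pattern in which %d stands for the page number, so each page gets its own file. The sketch below only imitates that numbering with Bash's printf and does not call scanimage; the function name batch_filenames and the example file name are assumptions:

```shell
#!/usr/bin/env bash
# Illustration only: mimic how a printf-style %d placeholder in a batch
# pattern turns into one file name per page.
batch_filenames() {
    local pattern="$1" pages="$2" page
    for ((page = 1; page <= pages; page++)); do
        # The pattern is deliberately used as the printf format string.
        printf "${pattern}\n" "$page"
    done
}

batch_filenames "/opt/paperless/consume/scan_20250513_205055_%d.jpeg" 3
```

Without the %d placeholder, every page of a batch would be written to the same file name.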

The scripts

Maximum transparency here too.

eppascan.sh

#!/usr/bin/env bash

# Eppascan - Epson Scanner Integration for Paperless-NGX, v0.1
#
# A service for a headless Paperless-NGX server that waits for the scan
# button on any networked Epson scanner to be pressed, to then
# automatically start a scan process and transfer the documents to the
# Paperless consume directory.
#
# Listens for Epson scanner broadcast and automatically scans all pages
# to /opt/paperless/consume
#
# Copyright (C) 2025 Michael Hessburg, www.hessburg.de
# Licence: GNU GPLv3 or later - see <http://www.gnu.org/licenses/>.

set -euo pipefail
IFS=$'\n\t'

# --- Configuration ---

NETWORK_INTERFACE="eth0"
MULTICAST_ADDR="239.255.255.253"
SCAN_INITIAL_DELAY=15
SCAN_TIMEOUT=300
SCAN_RESOLUTION="300"
SCAN_MODE="Gray"
SCAN_FORMAT="jpeg"
SCAN_OUTPUT_DIR="/opt/paperless/consume"
LOGFILE="/var/log/eppascan_scanimage_errors.log"

# --- Logging ---
log_eppascan() {
    echo "[$(date '+%Y-%m-%dD%H:%M:%S%z')] [EPPASCAN] $*" >> "$LOGFILE"
}

# --- Temp file & cleanup ---
TMP_TCPDUMP_OUTPUT=$(mktemp)
trap 'rm -f "$TMP_TCPDUMP_OUTPUT"; log_eppascan "Script terminated."; exit 0' SIGINT SIGTERM EXIT

log_eppascan "Script started. Listening on $NETWORK_INTERFACE for Epson scanner broadcasts."

# --- Main loop ---
while true; do
    : > "$TMP_TCPDUMP_OUTPUT" # Clear the file using ':' (no-op)

    DETECTED_IP=""

    # Listen for IGMP packet (Epson broadcast)
    if ! tcpdump -i "$NETWORK_INTERFACE" -v igmp -l -c 1 > "$TMP_TCPDUMP_OUTPUT" 2>/dev/null; then
        log_eppascan "Warning: tcpdump failed on interface $NETWORK_INTERFACE."
        sleep 5
        continue
    fi

    # Check for multicast address and extract scanner IP
    if grep -q "$MULTICAST_ADDR" "$TMP_TCPDUMP_OUTPUT"; then
        DETECTED_IP=$(grep -oE '[0-9]{1,3}(\.[0-9]{1,3}){3}' "$TMP_TCPDUMP_OUTPUT" | head -n 1)
        log_eppascan "Detected scanner at $DETECTED_IP."
    else
        sleep 2
        continue
    fi

    if [[ -n "$DETECTED_IP" ]]; then
        log_eppascan "Waiting $SCAN_INITIAL_DELAY seconds for scanner initialisation."
        sleep "$SCAN_INITIAL_DELAY"

        SCAN_FILENAME_BASE="scan_$(date '+%Y%m%d_%H%M%S')"
        SCAN_FULLPATH_PATTERN="$SCAN_OUTPUT_DIR/${SCAN_FILENAME_BASE}_%d.$SCAN_FORMAT"

        log_eppascan "Starting scan to $SCAN_FULLPATH_PATTERN."

        timeout "${SCAN_TIMEOUT}s" scanimage \
            --device-name="epsonds:net:$DETECTED_IP" \
            --source "ADF Duplex" \
            --resolution "$SCAN_RESOLUTION" \
            --mode "$SCAN_MODE" \
            --format "$SCAN_FORMAT" \
            --batch="$SCAN_FULLPATH_PATTERN" \
            --batch-count -1 \
            1>/dev/null \
            2> >(while read -r line; do log_eppascan "[scanimage] $line"; done) || true

        log_eppascan "Scan process finished (errors, if any, are logged above)."
        sleep 5
    else
        log_eppascan "No scanner IP detected after IGMP packet. Retrying..."
        sleep 2
    fi

    sleep 2
done

eppascan_install.sh

#!/usr/bin/env bash

# Eppascan Installer & Maintenance Script
# Installs, repairs, or uninstalls the Eppascan daemon for Paperless (root only, suitable for LXC)
#
# Copyright (C) 2025 Michael Hessburg, www.hessburg.de
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public Licence as published by
# the Free Software Foundation, either version 3 of the Licence, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public Licence for more details.
#
# You should have received a copy of the GNU General Public Licence
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -euo pipefail

# --- Configuration ---
EPPASCAN_SCRIPT="eppascan.sh" # Name of the main script in current directory
EPPASCAN_TARGET="/usr/local/bin/eppascan.sh" # Target path for the script
SERVICE_FILE="/etc/systemd/system/eppascan.service" # Path for the systemd unit file
LOGFILE="/var/log/eppascan_scanimage_errors.log" # Log file for Eppascan and scanimage errors

# --- Helper functions ---

# Print info message to terminal
info() {
    echo -e "\033[1;32m[INFO]\033[0m $*"
}

# Print warning message to terminal
warn() {
    echo -e "\033[1;33m[WARN]\033[0m $*"
}

# Print error message to terminal and exit
error() {
    echo -e "\033[1;31m[ERROR]\033[0m $*" >&2
    exit 1
}

# Check if script is run as root
require_root() {
    if [[ $EUID -ne 0 ]]; then
        error "This script must be run as root."
    fi
}

# Check if Eppascan is already installed
check_installed() {
    [[ -f "$EPPASCAN_TARGET" ]] && [[ -f "$SERVICE_FILE" ]]
}

# Install required packages if missing
install_dependencies() {
    info "Checking and installing required packages..."
    PKGS=(tcpdump sane-utils)
    for pkg in "${PKGS[@]}"; do
        if ! dpkg -s "$pkg" &>/dev/null; then
            info "Installing $pkg..."
            apt-get update
            apt-get install -y "$pkg"
        else
            info "$pkg is already installed."
        fi
    done
    # Verify scanimage is available
    if ! command -v scanimage &>/dev/null; then
        error "scanimage not found! Please check your SANE installation."
    fi
}

# Copy Eppascan script to target location
install_script() {
    info "Copying $EPPASCAN_SCRIPT to $EPPASCAN_TARGET..."
    cp "$EPPASCAN_SCRIPT" "$EPPASCAN_TARGET"
    chmod 755 "$EPPASCAN_TARGET"
}

# Ensure logfile exists and is writable
install_logfile() {
    info "Ensuring log file exists and is writable..."
    touch "$LOGFILE"
    chmod 644 "$LOGFILE"
}

# Create systemd service file and start service
install_service() {
    info "Creating systemd service file..."
    cat > "$SERVICE_FILE" <

Further development

There was already a simple web front end for changing the scan settings/variables. This will come back next, but at the moment I want to use my script to scan all documents.

Roadmap

  • Document processing
    Document length detection, discard empty documents.
  • Faster scanning start:
    Find out what response the scanner is expecting from the server so that the orange exclamation mark does not appear while the scanner is not ready, the scanning process starts faster and thus the waiting time is minimised.
  • Web front end:
    • All settings should be stored centrally in an external configuration file and can be edited via the web front end.
    • Predefined scan profiles (such as "Standard", "Duplex", "Colour") as well as the option to create and delete your own profiles.
    • Selection from several target folders (inbox or watch folder) for common DMS and the entry of your own path.
    • Display of a temporary weblog with live information on the current scan (status, progress), which is automatically deleted once the scan is complete.
    • Scan button for indirect control.
  • Notifications:
    I am planning email and MQTT notifications when a button press is detected and the scan starts, or when errors occur.
  • Scanner buttons:
    Launch different scan profiles directly from the scanner’s hardware buttons.
  • Cross-manufacturer support:
    The aim is to support scanners from other manufacturers (e.g. Canon, Brother, HP) in addition to Epson.
  • What I am not planning:
    • No user or rights management in the web frontend (exclusively for private use).
    • No development of iOS or Android apps.

Feedback and new ideas are always welcome!

Tests

I have carried out a first massive test. The scanner is really good and has processed approx. 150 sheets in several batches without any problems. This included sheets of different sizes, thicknesses and even sheets that were really scuffed. So the hardware is top notch!
I don’t see how it could scan any faster. I think 35 pages per minute is an understatement!

Eppascan also worked smoothly, even when the Proxmox server was working to capacity with Paperless-NGX and Paperless-AI with a local LLM. The i3-6100T certainly needs four or five minutes for a DIN A4 page, but that only matters with masses of new documents. I’m satisfied and don’t need to upgrade straight away. The RAM, currently only 16 GB but soon to be 32 GB, has also been easily enough so far!

There were real problems with Paperless-AI. It had trouble connecting to Paperless-NGX. The errors "JSON.parse: unexpected character" and "Failed to get own user ID" in the Paperless-AI logs indicated a faulty Paperless-NGX API connection setting. The web front end was not saving the data correctly in /opt/paperless-ai/data/.env. I had to change the URL manually to "http://192.168.141.40:8000/opt/" and correct the username from "Admin" to "admin". After that it worked immediately.

But I have already started an article on this.

About the author

Hessi

Michael "Hessi" Hessburg is an experienced technology enthusiast and former computer scientist. His website, which he has run for more than 25 years, covers a wide range of topics, including home and garden, house renovation, IT, 3D printing, retrocomputing and car repair. He also writes about socio-political topics such as data protection and surveillance. Hessi has been a freelance author for 20 years and offers well-founded insights and practical tips in his blog. His articles are carefully researched and easy to understand, to support readers with their own projects.
