OCR-Docker

Extract text from images and PDF files

OCR-Docker is an easy-to-use service, built with Python and Flask, that extracts text from images and PDF files in multiple languages.

Features

Extract text from images (PNG, JPG, TIFF).
Extract text from PDF files (single or multiple pages).

Components and Frameworks used in OCR-Docker

tesseract-ocr - open source OCR engine
tessdata - trained data models for tesseract-ocr
ghostscript
imagemagick
pytesseract
Pillow
Image
Flask
Loguru
PyYAML

The OCR (Optical Character Recognition) feature is free thanks to tesseract-ocr, an open source OCR project.
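
To illustrate how the listed components fit together, here is a minimal sketch of image OCR with pytesseract and Pillow. It is not the project's actual code: the helper names (is_supported_image, extract_text), the accepted file extensions, and the default language code "eng" are assumptions made for illustration, and it needs the tesseract-ocr binary plus the matching tessdata language model installed.

```python
from pathlib import Path

# Image formats named in the feature list; the exact extension spellings
# accepted here are an assumption for this sketch.
SUPPORTED_IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def is_supported_image(path: str) -> bool:
    """Return True if the file extension is one of the supported image types."""
    return Path(path).suffix.lower() in SUPPORTED_IMAGE_EXTENSIONS


def extract_text(path: str, lang: str = "eng") -> str:
    """Run Tesseract on a single image and return the recognized text.

    `lang` is a Tesseract language code; the corresponding tessdata
    model must be installed for it to work.
    """
    if not is_supported_image(path):
        raise ValueError(f"Unsupported image type: {path}")

    # Imported here so the extension check above stays usable on machines
    # where pytesseract and Pillow are not installed.
    from PIL import Image
    import pytesseract

    with Image.open(path) as image:
        return pytesseract.image_to_string(image, lang=lang)
```

Calling extract_text("scan.png") would return the recognized text as a string, where scan.png is a hypothetical file name. PDF input is not covered here; the presence of ghostscript and imagemagick in the component list suggests pages are converted to images first, but that is an inference rather than something this README states.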

Installation

docker-compose from Docker Hub

version: "3.7"
services:
  ocr:
    image: techblog/ocr-docker:latest
    ports:
      - "8080:8080"
    container_name: tts-stt
    labels:
      - "com.ouroboros.enable=true"
    networks:
      - default
    restart: unless-stopped

Now run docker-compose up -d to pull and start the container. Then open your browser and navigate to the container's IP address on port 8080; the OCR-Docker web page should load.
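
If the page does not load, a quick way to tell whether the container is listening at all is to test the published port. The sketch below uses only the Python standard library; the function name is_port_open and the host "localhost" are assumptions, so substitute the address of your Docker host.

```python
import socket


def is_port_open(host: str, port: int = 8080, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    # "localhost" is an assumption; use your Docker host's address instead.
    print("reachable" if is_port_open("localhost") else "not reachable")
```

This only confirms that something accepts TCP connections on port 8080. It does not verify that the OCR service itself is healthy.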
