Detect text in an image using the Tess4J library on Linux

Tesseract is a widely used open-source OCR engine for detecting text in images, and Tess4J is a Java wrapper around it. Let's start with a Java Spring Boot project.



- Step 1: Download the template Spring Boot project from: https://github.com/habogay/spring-boot-gcp

- Step 2: Install Tesseract OCR:

sudo apt-get install tesseract-ocr

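- Step 3: Set this environment variable in your IDE's run configuration (I use Eclipse): TESSDATA_PREFIX=/usr/share/tesseract-ocr/tessdata/ (adjust the path if your distribution installs the trained data somewhere else)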
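
If OCR later fails to load the language data, it can help to confirm that the variable from Step 3 is visible to the JVM and that the trained data file is actually there. The following is a minimal sketch using only the JDK; `TessdataCheck` and `hasLanguage` are hypothetical names, not part of Tess4J. Depending on the Tesseract version, the variable may need to point at the tessdata folder itself or at its parent, so the sketch (as an assumption) checks both layouts.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class TessdataCheck {

    /**
     * Returns true if <lang>.traineddata exists directly under prefix
     * or under prefix/tessdata. Hypothetical helper, not a Tess4J API.
     */
    static boolean hasLanguage(String prefix, String lang) {
        if (prefix == null || prefix.isEmpty()) {
            return false; // variable not set for this process
        }
        String fileName = lang + ".traineddata";
        Path dir = Paths.get(prefix);
        return Files.isRegularFile(dir.resolve(fileName))
            || Files.isRegularFile(dir.resolve("tessdata").resolve(fileName));
    }

    public static void main(String[] args) {
        // Reads the same variable that Step 3 sets in the run configuration
        String prefix = System.getenv("TESSDATA_PREFIX");
        System.out.println("TESSDATA_PREFIX=" + prefix);
        System.out.println("eng.traineddata found: " + hasLanguage(prefix, "eng"));
    }
}
```

Run it with the same run configuration as the Spring Boot application, so that it sees the same environment.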

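- Step 4: Use Tesseract inside a controller method (the method needs a Spring Model parameter named model):

// Imports at the top of the controller class
import java.io.File;
import net.sourceforge.tess4j.ITesseract;
import net.sourceforge.tess4j.Tesseract;

// Inside the controller method
String rs = "";
ITesseract tess = new Tesseract();
try {
    // Specify the trained data folder (optional when TESSDATA_PREFIX is set)
    // tess.setDatapath("./tessdata");
    // Specify the language to detect
    tess.setLanguage("eng");
    File img = new File("/home/habogay/Desktop/lh.png");
    // Run OCR and pass the recognized text to the view
    rs = tess.doOCR(img);
    model.addAttribute("rs", rs);
    System.out.println(rs);
} catch (Exception e) {
    System.out.println(e.getMessage());
}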
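
Because the image path in Step 4 is hard-coded, a quick check of the file before calling doOCR can make failures easier to understand. Here is a minimal sketch using only the JDK; `ImageCheck` and `isReadableImage` are hypothetical names, not part of Tess4J. It assumes the image is in a format the JDK's ImageIO can decode (PNG, JPEG, and so on), so a false result is only a hint: Tess4J may still accept formats that ImageIO rejects.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

final class ImageCheck {

    /** Returns true if the file exists and ImageIO can decode it as an image. */
    static boolean isReadableImage(File file) {
        if (file == null || !file.isFile()) {
            return false; // wrong path, or a directory
        }
        try {
            // ImageIO.read returns null when no registered reader understands the file
            BufferedImage image = ImageIO.read(file);
            return image != null;
        } catch (IOException e) {
            return false; // the file could not be read or decoded
        }
    }

    public static void main(String[] args) {
        // Falls back to the path used in Step 4 when no argument is given
        File img = new File(args.length > 0 ? args[0] : "/home/habogay/Desktop/lh.png");
        System.out.println(img + " readable: " + isReadableImage(img));
    }
}
```

In the controller, a call such as ImageCheck.isReadableImage(img) could guard the doOCR call and put a clearer message into the model when the file is missing.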


Source code: https://github.com/habogay/fb-controller

Done :d