Class ReferenceImagesKt

    • Method Detail

      • splitAndSaveSubImages

         final static List<Path> splitAndSaveSubImages(TextDetector $self, BufferedImage sampleImage, Path outputDir, Function2<Integer, BufferedImage, String> subImageFilenameWithoutExt)

        Splits the given sampleImage into sub-images of text elements, and saves them as files into the given outputDir. Each file is named based on the result of subImageFilenameWithoutExt, which is called for each sub-image.

        Sub-images are often individual characters, but kerning sometimes groups several characters together. For instance, a lowercase letter following an uppercase T or V can end up in the same sub-image (Te, To, Va...).

      • splitAndSaveSubImages

         final static List<Path> splitAndSaveSubImages(TextDetector $self, BufferedImage sampleImage, UniqueImageStore imageStore)

        Splits the given sampleImage into sub-images of text elements, and saves them as files into the given imageStore. The image store reuses images and doesn't write duplicates.
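        To illustrate what "reuses images and doesn't write duplicates" can mean in practice, here is a minimal sketch of a content-addressed store. It is not the implementation of UniqueImageStore, whose internals are not described here. The class HashedImageStore, its methods, and the choice of SHA-256 over the pixel data are all assumptions made for this example.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import javax.imageio.ImageIO;

// Hypothetical store: names each file after a hash of the image content,
// so saving the same pixels twice yields the same path and a single file.
public class HashedImageStore {
    private final Path dir;

    HashedImageStore(Path dir) {
        this.dir = dir;
    }

    // Hashes the dimensions and the ARGB value of every pixel (assumed scheme).
    static String contentHash(BufferedImage img) throws NoSuchAlgorithmException {
        int w = img.getWidth();
        int h = img.getHeight();
        int[] argb = img.getRGB(0, 0, w, h, null, 0, w);
        ByteBuffer buf = ByteBuffer.allocate(8 + 4 * argb.length);
        buf.putInt(w).putInt(h);
        for (int px : argb) {
            buf.putInt(px);
        }
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(buf.array());
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Writes the image as PNG unless a file with the same content hash exists.
    Path save(BufferedImage img) throws IOException, NoSuchAlgorithmException {
        Files.createDirectories(dir);
        Path target = dir.resolve(contentHash(img) + ".png");
        if (!Files.exists(target)) {
            ImageIO.write(img, "png", target.toFile());
        }
        return target;
    }
}
```

        With a store like this, sub-images that appear many times in a sample (for instance a repeated letter) end up as one file.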

        Sub-images are often individual characters, but kerning sometimes groups several characters together. For instance, a lowercase letter following an uppercase T or V can end up in the same sub-image (Te, To, Va...).

      • splitAndSaveSubImages

         final static Unit splitAndSaveSubImages(TextDetector $self, Path sampleImagesDir, Path outputDir, String sampleImagesGlob)

        Reads all images from sampleImagesDir matching sampleImagesGlob, and splits them into sub-images of text elements. The resulting sub-images are saved in the given outputDir, with no exact duplicates.
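        As an illustration of glob-based selection, the sketch below lists the files of a directory that match a glob, using the standard Files.newDirectoryStream. Whether this method resolves sampleImagesGlob the same way (for example, whether it recurses into sub-directories) is not stated here, so treat the behavior shown as an assumption. GlobListing and listMatching are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class GlobListing {
    // Returns the regular files directly inside dir whose file name matches
    // the glob (e.g. "*.png"), sorted so the order is stable across runs.
    static List<Path> listMatching(Path dir, String glob) throws IOException {
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    matches.add(p);
                }
            }
        }
        Collections.sort(matches);
        return matches;
    }
}
```

        A glob such as "*.{png,jpg}" selects both extensions with this standard syntax.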

        Sub-images are often individual characters, but kerning sometimes groups several characters together. For instance, a lowercase letter following an uppercase T or V can end up in the same sub-image (Te, To, Va...).

        Those sub-images should then be manually renamed according to their text content, so they can be used as reference images by the OCR. Load them using ReferenceImages.readFrom.

      • splitAndSaveCharacterImages

         final static List<Path> splitAndSaveCharacterImages(TextDetector $self, BufferedImage sampleImage, String sampleText, Path outputDir)

        Splits the given sampleImage into sub-images of text elements, and saves them as files into the given outputDir. Each file is named after the corresponding character (more specifically, the Unicode code point) in sampleText. Characters that are not valid in file names are escaped.
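        The sketch below shows one possible way to derive a file name per code point, with escaping for characters that are unsafe in file names. The actual escaping scheme used by this method is not documented here. The U+XXXX format, the rule of skipping whitespace, and the names CodePointNames, fileNamesFor and escape are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class CodePointNames {
    // One file name (without extension) per non-whitespace code point.
    // Skipping whitespace is an assumption: it is expected to yield no sub-image.
    static List<String> fileNamesFor(String sampleText) {
        List<String> names = new ArrayList<>();
        sampleText.codePoints()
                .filter(cp -> !Character.isWhitespace(cp))
                .forEach(cp -> names.add(escape(cp)));
        return names;
    }

    // Letters and digits are kept as-is; anything else is written as U+XXXX
    // so that characters such as '/', '?' or ':' never reach the file system.
    static String escape(int codePoint) {
        if (Character.isLetterOrDigit(codePoint)) {
            return new String(Character.toChars(codePoint));
        }
        return String.format("U+%04X", codePoint);
    }
}
```

        Iterating over code points rather than char values keeps characters outside the Basic Multilingual Plane (stored as surrogate pairs) in one piece.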

        Sub-images are often individual characters, but kerning sometimes groups several characters together. For instance, a lowercase letter following an uppercase T or V can end up in the same sub-image (Te, To, Va...).

        If the kerning of your font causes this kind of grouping, this method will not properly map images to characters. In that case, please prefer splitAndSaveSubImages.