public class XEasyPdfDocumentExtractor extends Object implements Serializable
Copyright (c) 2020-2023 xsx All Rights Reserved. x-easypdf is licensed under Mulan PSL v2. You can use this software according to the terms and conditions of the Mulan PSL v2. You may obtain a copy of Mulan PSL v2 at: http://license.coscl.org.cn/MulanPSL2 THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT, MERCHANTABILITY OR FIT FOR A PARTICULAR PURPOSE. See the Mulan PSL v2 for more details.
| Modifier and Type | Method and Description |
|---|---|
| `XEasyPdfDocumentExtractor` | `addRegion(String regionName, Rectangle rectangle)`<br>Adds a region. |
| `XEasyPdfDocumentExtractor` | `clearRegion()`<br>Clears the regions. |
| `XEasyPdfDocumentExtractor` | `extractForm(Map<String,String> formMap)`<br>Extracts the form. |
| `XEasyPdfDocumentExtractor` | `extractImage(List<BufferedImage> imageList)`<br>Extracts images (all). |
| `XEasyPdfDocumentExtractor` | `extractImage(List<BufferedImage> imageList, int... pageIndex)`<br>Extracts images. |
| `XEasyPdfDocumentExtractor` | `extractText(List<String> textList, int... pageIndex)`<br>Extracts text. |
| `XEasyPdfDocumentExtractor` | `extractText(List<String> textList, String regex, int... pageIndex)`<br>Extracts text. |
| `XEasyPdfDocumentExtractor` | `extractTextByRegions(List<Map<String,String>> dataList, int... pageIndex)`<br>Extracts text by region. |
| `XEasyPdfDocumentExtractor` | `extractTextByRegionsForSimpleTable(List<List<String>> textList, Rectangle rectangle, int pageIndex)`<br>Extracts table text within a region (single row, single column). |
| `XEasyPdfDocumentExtractor` | `extractTextForSimpleTable(List<List<String>> textList, int pageIndex)`<br>Extracts table text (single row, single column). |
| `XEasyPdfDocument` | `finish()`<br>Finishes the operation. |
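The summary suggests a common calling pattern: each extract method fills a collection supplied by the caller and returns the extractor, so calls can be chained until `finish()` hands back the document. The sketch below imitates that pattern with a hypothetical stand-in class (`StubExtractor`, backed by an in-memory list of page strings). It is not the x-easypdf implementation, and treating an empty `pageIndex` as "every page" is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in that mimics the chaining style of the extractor.
// It works on in-memory strings, not on a real PDF.
final class StubExtractor {
    private final List<String> pages;

    StubExtractor(List<String> pages) {
        this.pages = pages;
    }

    // Appends the text of the requested pages to the caller's list.
    // Assumption: passing no indices means "every page".
    StubExtractor extractText(List<String> textList, int... pageIndex) {
        if (pageIndex == null || pageIndex.length == 0) {
            textList.addAll(pages);
        } else {
            for (int index : pageIndex) {
                textList.add(pages.get(index));
            }
        }
        return this;
    }

    // Ends the chain. The real finish() returns an XEasyPdfDocument;
    // this stand-in simply returns its backing pages.
    List<String> finish() {
        return pages;
    }

    public static void main(String[] args) {
        List<String> firstPage = new ArrayList<>();
        List<String> allPages = new ArrayList<>();
        new StubExtractor(Arrays.asList("page one", "page two"))
                .extractText(firstPage, 0) // only the first page
                .extractText(allPages)     // no index given
                .finish();
        System.out.println(firstPage);
        System.out.println(allPages);
    }
}
```

Because the results land in caller-owned collections, one chain can fill several lists without intermediate variables for the extractor itself.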
`public XEasyPdfDocumentExtractor addRegion(String regionName, Rectangle rectangle)`
Adds a region.
regionName - region name
rectangle - region rectangle

`public XEasyPdfDocumentExtractor clearRegion()`
Clears the regions.
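
As a rough illustration of how named regions behave, the sketch below keeps a name-to-rectangle registry and looks up which region contains a point. It assumes `Rectangle` is `java.awt.Rectangle` (this page does not show the import). `RegionRegistry` and `regionAt` are hypothetical names, and the real extractor matches text positions inside the PDF rather than single points.

```java
import java.awt.Rectangle;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical registry that mirrors addRegion/clearRegion on plain rectangles.
final class RegionRegistry {
    // LinkedHashMap keeps regions in the order they were added.
    private final Map<String, Rectangle> regions = new LinkedHashMap<>();

    // Registers (or replaces) a region under the given name.
    RegionRegistry addRegion(String regionName, Rectangle rectangle) {
        regions.put(regionName, rectangle);
        return this;
    }

    // Removes every registered region.
    RegionRegistry clearRegion() {
        regions.clear();
        return this;
    }

    // Returns the name of the first region containing the point, or null if none does.
    String regionAt(int x, int y) {
        for (Map.Entry<String, Rectangle> entry : regions.entrySet()) {
            if (entry.getValue().contains(x, y)) {
                return entry.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        RegionRegistry registry = new RegionRegistry()
                .addRegion("title", new Rectangle(0, 0, 600, 100))   // x, y, width, height
                .addRegion("body", new Rectangle(0, 100, 600, 700));
        System.out.println(registry.regionAt(50, 40));
        System.out.println(registry.regionAt(50, 400));
        System.out.println(registry.clearRegion().regionAt(50, 40));
    }
}
```

Giving each region a distinct name matters because the region-based extraction reports its results keyed by region name.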
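
`public XEasyPdfDocumentExtractor extractText(List<String> textList, int... pageIndex)`
Extracts text.
textList - list that receives the extracted text
pageIndex - page index

`public XEasyPdfDocumentExtractor extractText(List<String> textList, String regex, int... pageIndex)`
Extracts text.
textList - list that receives the extracted text
regex - regular expression
pageIndex - page index

`public XEasyPdfDocumentExtractor extractTextByRegions(List<Map<String,String>> dataList, int... pageIndex)`
Extracts text by region.
dataList - list of maps that receives the extracted text (key = region name, value = extracted text)
pageIndex - page index

`public XEasyPdfDocumentExtractor extractTextForSimpleTable(List<List<String>> textList, int pageIndex)`
Extracts table text (single row, single column).
textList - list that receives the extracted text (outer list = rows, inner list = columns)
pageIndex - page index

`public XEasyPdfDocumentExtractor extractTextByRegionsForSimpleTable(List<List<String>> textList, Rectangle rectangle, int pageIndex)`
Extracts table text within a region (single row, single column).
textList - list that receives the extracted text (outer list = rows, inner list = columns)
rectangle - region rectangle
pageIndex - page index

`public XEasyPdfDocumentExtractor extractImage(List<BufferedImage> imageList)`
Extracts images (all).
imageList - list that receives the extracted images

`public XEasyPdfDocumentExtractor extractImage(List<BufferedImage> imageList, int... pageIndex)`
Extracts images.
imageList - list that receives the extracted images
pageIndex - page index

`public XEasyPdfDocumentExtractor extractForm(Map<String,String> formMap)`
Extracts the form.
formMap - map that receives the form data

`public XEasyPdfDocument finish()`
Finishes the operation.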
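
The parameter notes describe two nested result shapes: table results are a list of rows whose inner lists are columns, and region results are maps from region name to extracted text. The sketch below only reads such structures. The sample data is made up for illustration, `ResultShapes`, `toCsvLines`, and `textOf` are hypothetical helper names, and treating each map in the region list as one page's result is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helpers for reading the collections filled by the extract methods.
final class ResultShapes {
    // Joins each row (outer list) into one comma-separated line of its columns (inner list).
    // Naive join: cells that contain commas are not quoted or escaped.
    static List<String> toCsvLines(List<List<String>> table) {
        List<String> lines = new ArrayList<>();
        for (List<String> row : table) {
            lines.add(String.join(",", row));
        }
        return lines;
    }

    // Collects the text stored under one region name from every map in the list.
    static List<String> textOf(List<Map<String, String>> dataList, String regionName) {
        List<String> texts = new ArrayList<>();
        for (Map<String, String> data : dataList) {
            if (data.containsKey(regionName)) {
                texts.add(data.get(regionName));
            }
        }
        return texts;
    }

    public static void main(String[] args) {
        // Made-up table: two rows with two columns each.
        List<List<String>> table = Arrays.asList(
                Arrays.asList("name", "qty"),
                Arrays.asList("pen", "3"));
        System.out.println(toCsvLines(table));

        // Made-up region result: one map with a single named region.
        List<Map<String, String>> dataList = new ArrayList<>();
        dataList.add(Collections.singletonMap("title", "Invoice"));
        System.out.println(textOf(dataList, "title"));
    }
}
```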