Problem parsing an iWorks document with Apache Tika

Posted on debugcn, in Dev

Sachin

I tried to parse an iWorks document with Apache Tika. Instead of the parsed content, the content handler gives me a different output. The code snippet I used and the output I got are below.

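    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import org.apache.tika.exception.TikaException;
    import org.apache.tika.metadata.Metadata;
    import org.apache.tika.parser.AutoDetectParser;
    import org.apache.tika.parser.ParseContext;
    import org.apache.tika.parser.Parser;
    import org.apache.tika.sax.BodyContentHandler;
    import org.xml.sax.SAXException;

    // Called with new File("/home/user/tika/samples/budget.numbers")
    private void parseFile(File file) {
        // try-with-resources closes the stream even if parsing fails
        try (InputStream inputStream = new FileInputStream(file)) {
            ParseContext context = new ParseContext();
            // -1 disables the write limit on the extracted text
            BodyContentHandler bodyHandler = new BodyContentHandler(-1);
            Parser parser = new AutoDetectParser();
            parser.parse(inputStream, bodyHandler, new Metadata(), context);
            System.out.println("Contents of the file :" + bodyHandler.toString());
        } catch (IOException | SAXException | TikaException e) {
            e.printStackTrace();
        }
    }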

Output:

Contents of the file :
Index/Document.iwa
Index/ViewState.iwa
Index/CalculationEngine.iwa
Index/Tables/HeaderStorageBucket-2.iwa
Index/Tables/Tile.iwa
Index/Metadata.iwa
Metadata/Properties.plist

I can detect the file type correctly using the Detector API. However, I am not getting any useful content out of the document. Please help!
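One way to narrow this down is to inspect the container directly. The entry names in the output look like a plain ZIP listing of the file, and `.iwa` entries are characteristic of the newer binary iWork layout rather than the older XML-based one. The sketch below uses only the JDK and is not part of Tika. The class and method names are hypothetical, and the assumption that the older layout carries `index.xml` or `index.apxl` at the archive root should be checked against the Tika version in use.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not a Tika class.
class IWorkEntryCheck {

    /** Lists the entry names of a ZIP stream in archive order. */
    static List<String> listEntries(InputStream in) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    /** Assumption: ".iwa" entries indicate the newer binary iWork layout. */
    static boolean hasIwaEntries(List<String> names) {
        return names.stream().anyMatch(n -> n.endsWith(".iwa"));
    }

    /** Assumption: older iWork packages have index.xml or index.apxl at the root. */
    static boolean hasXmlIndex(List<String> names) {
        return names.contains("index.xml") || names.contains("index.apxl");
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java IWorkEntryCheck <file>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            List<String> names = listEntries(in);
            names.forEach(System.out::println);
            System.out.println("iwa entries: " + hasIwaEntries(names));
            System.out.println("xml index:   " + hasXmlIndex(names));
        }
    }
}
```

If this reports `.iwa` entries and no XML index, the text printed above is probably just the package listing, which would suggest the installed Tika version has no text extractor for that layout. That is a guess from the output, not something confirmed in this thread.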

Tim Allison

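Tika should be able to parse Numbers documents. If you are able to share the document, please post it on Jira. Looking at the parser, the namespace handling could be made a bit more robust, and that may be the problem, but there is no way to tell without the document.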
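To illustrate what more robust namespace handling can mean in a SAX-based parser, here is a small JDK-only sketch. It is not Tika's iWork parser code: the namespace URI, the class name, and the element names are made up for the example. The idea is to enable namespace awareness and match elements on the namespace URI plus local name, so that a document using a different prefix (or a default namespace) is still recognized.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration, not taken from Tika.
class NamespaceAwareCount {

    // Made-up namespace URI, used only for this example.
    static final String NS = "http://example.com/ns/table";

    /** Counts <cell> elements in NS, whatever prefix the document chose. */
    static int countCells(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // JAXP factories are not namespace-aware by default
        int[] count = {0};
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // Match on (uri, localName); comparing qName to "t:cell"
                // would miss documents that use another prefix.
                if (NS.equals(uri) && "cell".equals(localName)) {
                    count[0]++;
                }
            }
        });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        String prefixed = "<t:doc xmlns:t='" + NS + "'><t:cell/><t:cell/></t:doc>";
        String defaulted = "<doc xmlns='" + NS + "'><cell/><cell/></doc>";
        System.out.println(countCells(prefixed));
        System.out.println(countCells(defaulted));
    }
}
```

Both sample documents use the same namespace with different prefix choices, so a handler written this way counts the same cells in each, while one keyed on the prefixed name would only handle the first.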


Edited on 2021-06-07
