How to read a non-standard RDF format in Sesame


Java Developer

When loading an N3 RDF file in the format below into a Sesame repository, is it possible to tell Sesame to treat '|' as the separator?

The triple below is separated by '|':

http://article.com/1-3|http://relationship.com/wasGeneratedBy|http://edit.com/comment1-2

Jeen Broekstra

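As pointed out in the comments: the format you have is simply not N3 syntax, so you cannot use Sesame's N3 parser to load it. It is not any other standardized format either, so there is no parser available that can handle it.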

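However, it would be fairly simple to process the file "manually" and add its contents to Sesame yourself. Something like this would probably do the trick: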

   try (RepositoryConnection conn = rep.getConnection()) {
      ValueFactory vf = conn.getValueFactory(); 

      File file = new File("/path/to/weirdlyformattedrdffile.txt");

      // open the file for reading
      try (BufferedReader br = new BufferedReader(new FileReader(file))) {
         String line;

         // start a transaction
         conn.begin();

         // read the file line-by-line
         while ((line = br.readLine()) != null) {
            // each line is an RDF triple with a subject, predicate, and object;
            // String.split takes a regex, so the '|' has to be escaped
            String[] triple = line.split("\\|");

            IRI subject = vf.createIRI(triple[0]);
            IRI predicate = vf.createIRI(triple[1]);
            IRI object = vf.createIRI(triple[2]);

            // add the triple to the database
            conn.add(subject, predicate, object);
         }                   
         // commit the txn when all lines are read
         conn.commit();
      }
   }
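
One detail to watch in the loop above: `String.split` treats its argument as a regular expression, and an unescaped `|` is the regex alternation operator, so it would split the line between every character instead of at the separator. The sketch below shows one way to split safely and to reject malformed lines before they reach the repository. `PipeSplitter` and `splitTriple` are hypothetical names made up for this illustration; they are not part of Sesame.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of Sesame): splits one line of the
// pipe-separated format into its three terms.
final class PipeSplitter {

    // Pattern.quote turns '|' into a literal instead of regex alternation
    private static final Pattern PIPE = Pattern.compile(Pattern.quote("|"));

    static String[] splitTriple(String line) {
        // limit -1 keeps trailing empty strings, so "a|b|" yields three
        // parts and is caught by the emptiness check below
        String[] parts = PIPE.split(line, -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                "expected 3 terms but found " + parts.length + ": " + line);
        }
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
            if (parts[i].isEmpty()) {
                throw new IllegalArgumentException("empty term in line: " + line);
            }
        }
        return parts;
    }
}
```

Inside the loop, `String[] triple = PipeSplitter.splitTriple(line);` could then stand in for the direct `split` call. Whether a bad line should abort the whole transaction or merely be skipped is a design choice this sketch leaves to the caller.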

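Of course, if your file also contains things other than IRIs (literals or blank nodes, for example), you will have to include some logic to tell them apart, rather than blindly creating an IRI for everything. But that is the price you pay for not using a standardized syntax format.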
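
As a rough sketch of what that distinguishing logic might look like: since the format is non-standard, the conventions here are pure assumptions (a `_:` prefix marks a blank node, anything that parses as an absolute URI is an IRI, and everything else is a plain literal). `TermClassifier` is a hypothetical helper, not a Sesame class, and your file may well need different rules.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: guesses what kind of RDF term a raw string is.
// The conventions are assumptions for illustration, not a standard.
final class TermClassifier {

    enum Kind { IRI, BLANK_NODE, LITERAL }

    static Kind classify(String term) {
        String t = term.trim();
        // assumed convention: blank nodes are written as _:label
        if (t.startsWith("_:")) {
            return Kind.BLANK_NODE;
        }
        try {
            // anything with a scheme (http:, urn:, ...) is treated as an IRI
            if (new URI(t).isAbsolute()) {
                return Kind.IRI;
            }
        } catch (URISyntaxException e) {
            // not parseable as a URI: fall through and treat it as a literal
        }
        return Kind.LITERAL;
    }
}
```

In the loading loop, the result would pick the `ValueFactory` method to call: `createIRI(term)` for an IRI, `createBNode(label)` for a blank node (with the `_:` prefix stripped), and `createLiteral(term)` for a literal. Keep in mind that only the object position may hold a literal, and the predicate must always be an IRI.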
