How do I print out just certain elements from a text file that has xml tags to a new text file?

Junipero

I need help with something that sounds easy but has given me some trouble.

I have a text file (records.txt) with a root element 'PatientRecord' and sub-tags inside it ('first name', 'age', 'blood type', 'address', etc.) that repeat over and over with different values, since there is one record per person. I only want to print the values between the tags to a new text file, one line per person, and only for the elements I choose. For example, of the tags mentioned above I want only the name and age, not the rest of that patient's info. How do I print just those values, separated by commas, and then move on to the next patient? Here is the code I have so far:

package patient.records;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.OutputStreamWriter;
import java.io.Writer;

public class ProcessRecords {

    private static final String FILE = "C:\\Users\\Desktop\\records.txt";
    private static final String RECORD_START_TAG = "<PatientRecord>";
    private static final String RECORD_END_TAG = "</PatientRecord>";
    private static final String newFileName = "C:\\Users\\Desktop\\DataFolder\\";

    public static void main(String[] args) throws Exception {
        String scan;
        FileReader file = new FileReader(FILE);
        BufferedReader br = new BufferedReader(file);
        Writer writer = null;

        while ((scan = br.readLine()) != null) {
            if (scan.contains(RECORD_START_TAG)) {

                //This is the logic I am missing that will only grab the element values
                //between the tags inside of the file

                writer = new BufferedWriter(new OutputStreamWriter(
                        new FileOutputStream(newFileName + "Record Data" + ".txt"), "utf-8"));
            } else if (scan.contains(RECORD_END_TAG)) {
                writer.close();
                writer = null;
            } else {
                // only write if writer is not null
                if (writer != null) {
                    writer.write(scan);
                }
            }
        }
        br.close();
    }
}   //This is the end of my code

The text file (records.txt) I am reading in looks like this:

<PatientRecord> <---first patient record--->
<---XML Schema goes here--->
            <Info>
                <age>66</age>
                <first_name>john</first_name>
                <last_name>smith</last_name>
                <mailing_address>200 main street</mailing_address>
                <blood_type>AB</blood_type>
                <phone_number>000-000-0000</phone_number>
</PatientRecord>
<PatientRecord> <---second patient record--->
<---XML Schema goes here--->
            <Info>
                <age>27</age>
                <first_name>micheal</first_name>
                <last_name>thompson</last_name>
                <mailing_address>123 baker street</mailing_address>
                <blood_type>O</blood_type>
                <phone_number>111-222-3333</phone_number>
</PatientRecord>

So in theory, if I ONLY wanted to print the values of the first name, mailing address, and blood type tags for all patients, the output should look like this:

john, 200 main street, AB
//this line is blank
micheal, 123 baker street, O

Thanks for any and all help. If you feel like my code should be modified then I'm all for it. Thank you.

MadProgrammer

My first gut feeling is to wrap the entire text content in an outer tag and process the text as XML, something like...

<Patients>
    <PatientRecord> <---first patient record--->
        <Info>
            <age>66</age>
            <first_name>john</first_name>
            <last_name>smith</last_name>
            <mailing_address>200 main street</mailing_address>
            <blood_type>AB</blood_type>
            <phone_number>000-000-0000</phone_number>
    </PatientRecord>
    ...
</Patients>

But there are two problems with this...

First, <---first patient record---> isn't a valid XML comment or text, and second, there is no closing </Info> tag...[sigh]

So my next thought was to read in each <PatientRecord> individually, as text, and then process that as XML...

Here come the problems... We need to remove anything surrounded by <--- ... --->, including the little arrows. That makes a lot of assumptions about the input, but hopefully we can ignore that...
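As a rough sketch of that clean-up step on its own, a lazy regular expression can drop every marker on a line, including lines that carry more than one. This assumes a marker never spans multiple lines; MarkerStripper and stripMarkers are hypothetical names used only for illustration.

```java
import java.util.regex.Pattern;

class MarkerStripper {

    // Matches "<---", then as little text as possible, then "--->".
    private static final Pattern MARKER = Pattern.compile("<---.*?--->");

    // Removes every <--- ... ---> marker found on a single line.
    // Assumption: a marker never continues onto the next line.
    static String stripMarkers(String line) {
        return MARKER.matcher(line).replaceAll("");
    }

    public static void main(String[] args) {
        // The record tag survives, the marker after it is removed.
        System.out.println(stripMarkers("<PatientRecord> <---first patient record--->"));
        // A line that is only a marker becomes empty and can be skipped.
        System.out.println(stripMarkers("<---XML Schema goes here--->").isEmpty());
    }
}
```

Lines that come back empty after stripping can simply be ignored before the record text is assembled.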

The next problem is, we need to insert a closing </Info> tag...

After that, it's all really easy...

import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

public class Test {

    private static final String RECORD_START_TAG = "<PatientRecord>";
    private static final String RECORD_END_TAG = "</PatientRecord>";

    public static void main(String[] args) {
        File records = new File("Records.txt");
        try (BufferedReader br = new BufferedReader(new FileReader(records))) {
            StringBuilder record = null;
            String text = null;
            while ((text = br.readLine()) != null) {

                if (text.contains("<---") && text.contains("--->")) {
                    String start = text.substring(0, text.indexOf("<---"));
                    String end = text.substring(text.indexOf("--->") + 4);
                    text = start + end;
                }

                if (text.trim().length() > 0) {
                    if (text.startsWith(RECORD_START_TAG)) {

                        record = new StringBuilder(128);
                        record.append(text);

                    } else if (text.startsWith(RECORD_END_TAG)) {

                        record.append("</Info>");
                        record.append(text);

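                        // Encode explicitly as UTF-8: without an XML declaration the parser
                        // assumes UTF-8, while a bare getBytes() uses the platform default.
                        try (ByteArrayInputStream bais = new ByteArrayInputStream(record.toString().getBytes(java.nio.charset.StandardCharsets.UTF_8))) {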

                            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(bais);
                            XPath xPath = XPathFactory.newInstance().newXPath();
                            XPathExpression exp = xPath.compile("PatientRecord/Info/first_name");
                            Node firstName = (Node) exp.evaluate(doc, XPathConstants.NODE);

                            exp = xPath.compile("PatientRecord/Info/mailing_address");
                            Node address = (Node) exp.evaluate(doc, XPathConstants.NODE);

                            exp = xPath.compile("PatientRecord/Info/blood_type");
                            Node bloodType = (Node) exp.evaluate(doc, XPathConstants.NODE);

                            System.out.println(
                                    firstName.getTextContent() + ", "
                                    + address.getTextContent() + ", "
                                    + bloodType.getTextContent());

                        } catch (ParserConfigurationException | XPathExpressionException | SAXException ex) {
                            ex.printStackTrace();
                        }

                    } else {

                        record.append(text);

                    }

                }

            }
        } catch (IOException exp) {
            exp.printStackTrace();
        }
    }

}

Which prints out...

john, 200 main street, AB
micheal, 123 baker street, O
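The code above prints to the console, whereas the question asks for the values to end up in a new text file. Here is a minimal sketch of that last step, assuming each record has already been repaired into well-formed XML (markers removed, </Info> added). RecordWriter, toLine, and the output name RecordData.txt are hypothetical, and the record is re-parsed for every field purely to keep the example short.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class RecordWriter {

    // Reads the requested child elements of <Info> from one repaired record
    // and joins their text with ", ".
    static String toLine(String recordXml, String... fields) {
        XPath xPath = XPathFactory.newInstance().newXPath();
        List<String> values = new ArrayList<>();
        try {
            for (String field : fields) {
                // A fresh InputSource per field, because the reader is consumed by each evaluation.
                values.add(xPath.evaluate("/PatientRecord/Info/" + field,
                        new InputSource(new StringReader(recordXml))));
            }
        } catch (XPathExpressionException ex) {
            throw new IllegalArgumentException("Could not read record", ex);
        }
        return String.join(", ", values);
    }

    public static void main(String[] args) throws IOException {
        // Two records in the shape they have after the clean-up shown earlier.
        String[] records = {
            "<PatientRecord><Info><first_name>john</first_name>"
                + "<mailing_address>200 main street</mailing_address>"
                + "<blood_type>AB</blood_type></Info></PatientRecord>",
            "<PatientRecord><Info><first_name>micheal</first_name>"
                + "<mailing_address>123 baker street</mailing_address>"
                + "<blood_type>O</blood_type></Info></PatientRecord>"
        };

        List<String> lines = new ArrayList<>();
        for (String record : records) {
            lines.add(toLine(record, "first_name", "mailing_address", "blood_type"));
        }

        // Hypothetical output file; one comma-separated line per patient.
        Path out = Paths.get("RecordData.txt");
        Files.write(out, lines, StandardCharsets.UTF_8);
    }
}
```

If a blank line between patients is wanted, as in the question's expected output, an empty string can be added to the list after each record.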

The long and short of it is: go back to the person who gave you this file, slap them, then tell them to put it into a valid XML format...
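For comparison, here is a sketch of how little work is left once the file really is well-formed. The single <Patients> root, the proper <!-- --> comments, and the closed </Info> tags in the XML constant are assumptions about what a fixed file might look like; ValidRecords and extract are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class ValidRecords {

    // Assumed shape of a repaired file: one root, real comments, closed <Info>.
    static final String XML =
            "<Patients>"
            + "<PatientRecord><!-- first patient record -->"
            + "<Info><age>66</age><first_name>john</first_name>"
            + "<mailing_address>200 main street</mailing_address>"
            + "<blood_type>AB</blood_type></Info></PatientRecord>"
            + "<PatientRecord><!-- second patient record -->"
            + "<Info><age>27</age><first_name>micheal</first_name>"
            + "<mailing_address>123 baker street</mailing_address>"
            + "<blood_type>O</blood_type></Info></PatientRecord>"
            + "</Patients>";

    // Parses the whole document once and builds one line per <Info> element.
    static List<String> extract(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList infos = (NodeList) xPath.evaluate(
                    "/Patients/PatientRecord/Info", doc, XPathConstants.NODESET);

            List<String> lines = new ArrayList<>();
            for (int i = 0; i < infos.getLength(); i++) {
                Node info = infos.item(i);
                // Relative paths are evaluated against the current <Info> node.
                lines.add(xPath.evaluate("first_name", info) + ", "
                        + xPath.evaluate("mailing_address", info) + ", "
                        + xPath.evaluate("blood_type", info));
            }
            return lines;
        } catch (ParserConfigurationException | SAXException
                | IOException | XPathExpressionException ex) {
            throw new IllegalArgumentException("Could not process XML", ex);
        }
    }

    public static void main(String[] args) {
        for (String line : extract(XML)) {
            System.out.println(line);
        }
    }
}
```

With a valid file there is no line-by-line repair at all: the document is parsed once and a single XPath query finds every record.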
