Matching across Multiple documents with ElasticSearch

lwdjustin

I am relatively new to ElasticSearch. I am using it as a search platform for PDF documents. I break the PDFs into text pages and enter each one as an ElasticSearch record with its corresponding page ID, parent info, etc.

What I'm finding difficult is matching a query not just against a single document in ES, but against any documents that share the same parent ID. For example, if two terms are searched and they appear on pages 1 and 7 of the actual PDF (2 separate entries in ES), I want that PDF to count as a match.

Essentially, my goal is to search across all the pages of a single PDF, with matching allowed on any of its document-pages, and to return a list of matching PDF documents as the search result instead of matching "pages".

imotov

It's somewhat tricky. First of all, you will have to split your query into terms yourself. Given a list of terms (let's say foo, bar and baz), you can create a bool query against the type representing PDFs (the parent type) that would look like this:

{
    "bool" : {
        "must" : [{
            "has_child" : {
                "type": "page",
                "query": {
                    "match": {
                        "page_body": "foo"
                    }
                }
            }
        }, {
            "has_child" : {
                "type": "page",
                "query": {
                    "match": {
                        "page_body": "bar"
                    }
                }
            }
        }, {
            "has_child" : {
                "type": "page",
                "query": {
                    "match": {
                        "page_body": "baz"
                    }
                }
            }
        }]
    }
}

This query will find all PDFs that have, for each term, at least one page containing that term. The terms do not need to appear on the same page.
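Since the terms have to be split out by hand, the query body can be generated rather than written per search. The following is only a minimal sketch in Python: the helper name `build_pdf_query` is hypothetical, the `page` type and `page_body` field are taken from the query above, and the plain whitespace split is an assumption that may not agree with how the field is actually analyzed.

```python
import json


def build_pdf_query(terms, child_type="page", field="page_body"):
    """Build a bool query with one has_child clause per search term.

    Hypothetical helper: the defaults mirror the type and field names
    used in the hand-written query above.
    """
    return {
        "bool": {
            "must": [
                {
                    "has_child": {
                        "type": child_type,
                        "query": {"match": {field: term}},
                    }
                }
                for term in terms
            ]
        }
    }


# Assumption: a naive whitespace split stands in for real term extraction.
terms = "foo bar baz".split()
query = build_pdf_query(terms)

# Print the JSON body that would go under the "query" key of a search request.
print(json.dumps(query, indent=4))
```

Each term gets its own `has_child` clause, so every term may be satisfied by a different page of the same PDF.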
