Simple terms and phrases

A search can use simple terms, i.e. single words, and phrases, i.e. several words enclosed in quotation marks, e.g. "Nicolaus Copernicus University". When quotation marks are used, only documents that contain the entire phrase are returned.

Search terms can be combined using logical operators. You can also use so-called masking characters, which stand for any letter or digit or for a sequence of them, search for similar terms, search for terms that lie within a given distance of each other, or set the priority of search terms.

Logical operators

  • AND - also written as && - means that the terms joined by the operator must both occur in the document. For example, the query Copernicus && Chopin selects only documents in which both names occur. AND is the search engine's default behavior when you enter more than one word, so typing Copernicus Chopin gives the same result.
  • OR - also written as || - requires at least one of the terms to occur in the document, e.g. Kopernik || Copernicus selects documents in which the astronomer's name occurs in either of the given forms.
  • NOT - also written as ! - excludes documents in which the term occurs. For example, "Nicolaus Copernicus" NOT university finds documents that contain the phrase "Nicolaus Copernicus" but do not contain the word university. This operator cannot be used alone, e.g. the query NOT "Nicolaus Copernicus" will not return correct results.
  • + (required-term operator) - finds documents that contain the term immediately following the "+", but not necessarily the other terms, e.g. +library university selects documents that must contain the word library and may, but need not, contain the word university.
  • - (prohibited-term operator) - works just like the NOT operator. "Nicolaus Copernicus" -"Nicolaus Copernicus University" finds documents containing the name "Nicolaus Copernicus" but not the name "Nicolaus Copernicus University".
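The effect of AND, OR and NOT can be pictured as operations on sets of documents. The sketch below is only an illustration of that idea, not how the search engine is implemented; the four toy documents and the helper containing() are invented for the example.

```python
# Toy collection: document id -> set of lowercase words it contains.
# The documents are invented for illustration only.
docs = {
    1: {"copernicus", "chopin"},
    2: {"copernicus", "university"},
    3: {"kopernik"},
    4: {"chopin", "library"},
}

def containing(word):
    """Return the ids of the documents that contain the given word."""
    return {doc_id for doc_id, words in docs.items() if word in words}

# Copernicus AND Chopin: both terms must occur (set intersection).
both = containing("copernicus") & containing("chopin")

# Kopernik OR Copernicus: at least one term must occur (set union).
either = containing("kopernik") | containing("copernicus")

# Copernicus NOT university: first term present, second absent (set difference).
without = containing("copernicus") - containing("university")

print(sorted(both))     # [1]
print(sorted(either))   # [1, 2, 3]
print(sorted(without))  # [1]
```

Seen this way, NOT is a subtraction, which suggests why it cannot stand alone: there has to be a set of documents to subtract from.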

 

Masking characters

 

  • ? - replaces any single character, e.g. Kowalsk? matches both Kowalski and Kowalska.
  • * - replaces any sequence of characters, e.g. bu*a finds words such as buda, buddha etc. A masking character must not be placed at the beginning of the search expression.
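The behavior of the two masking characters can be imitated with regular expressions. This is a minimal sketch for illustration, assuming that * may also stand for an empty sequence and that matching ignores case; mask_to_regex() and matches() are hypothetical helper names, not part of the search engine.

```python
import re

def mask_to_regex(mask):
    """Translate a mask containing ? and * into a compiled regular expression."""
    if mask.startswith(("*", "?")):
        raise ValueError("a masking character may not start the expression")
    parts = []
    for ch in mask:
        if ch == "?":
            parts.append(".")           # exactly one character
        elif ch == "*":
            parts.append(".*")          # any sequence (assumed: possibly empty)
        else:
            parts.append(re.escape(ch)) # literal character
    return re.compile("".join(parts), re.IGNORECASE)

def matches(mask, word):
    """True if the whole word fits the mask."""
    return mask_to_regex(mask).fullmatch(word) is not None

print(matches("Kowalsk?", "Kowalski"))  # True
print(matches("Kowalsk?", "Kowalska"))  # True
print(matches("bu*a", "buda"))          # True
print(matches("bu*a", "buddha"))        # True
print(matches("bu*a", "bus"))           # False
```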

 

Fuzzy search

 

Fuzzy search finds terms that are spelled similarly to a simple term, e.g. Copernicus, Copernikus, Kopernikus. To find documents containing such terms, add a tilde directly after the term: copernicus~.

The required degree of similarity can be set with a coefficient ranging from 0 (no similarity) to 1 (identical terms). By default, the similarity coefficient is 0.5. To change it, write the chosen coefficient immediately after the tilde, e.g. kopernik~0.4.
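One common way to express the similarity of two words as a number between 0 and 1 is to count the single-character edits needed to turn one into the other and divide by the word length. The sketch below follows that idea; the exact formula used by the search engine may differ, and the normalization by the longer word is an assumption made here for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]

def similarity(a, b):
    """Score from 0 (nothing in common) to 1 (identical); an assumed formula."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a.lower(), b.lower()) / longest

def fuzzy_match(term, candidate, threshold=0.5):
    """True if the candidate is at least as similar as the threshold demands."""
    return similarity(term, candidate) >= threshold

print(edit_distance("copernicus", "copernikus"))      # 1
print(fuzzy_match("copernicus", "Kopernikus"))        # True
print(fuzzy_match("copernicus", "copernikus", 0.95))  # False
```

A higher coefficient therefore accepts fewer spelling variants, and a lower one accepts more.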

Proximity search

 

It is also possible to specify how far apart the search terms may be (proximity search). For example, if we remember that the words Choral-buch and Westpreussen appeared a short distance from each other in the document, we can use the following query: "Choral-buch Westpreussen"~6.
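The idea behind such a query can be sketched as a check on word positions. This is a simplified illustration: it assumes that distance means the difference between the positions of the two words in the text, which may not be exactly how the search engine counts it. The function within_distance() and the sample text are invented for the example.

```python
def within_distance(text, first, second, max_distance):
    """True if some occurrence of each word lies within max_distance positions."""
    words = text.lower().split()
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= max_distance
               for i in first_positions
                for j in second_positions)

# Made-up sample text: three filler words separate the two search terms.
sample = "Choral-buch w1 w2 w3 Westpreussen"

print(within_distance(sample, "Choral-buch", "Westpreussen", 6))  # True
print(within_distance(sample, "Choral-buch", "Westpreussen", 2))  # False
```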

Setting the priority of a term

 

You can set the priority of a search term by appending the ^ sign and a number (greater than 1). For example, the query stempowski^4 grydzewski returns documents containing both names, but those in which the higher-priority name appears come first in the list. The default priority of a term is 1.
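How a priority can influence the order of results is shown by the toy scoring sketch below. It assumes a deliberately naive score (priority multiplied by the number of occurrences); real ranking in the search engine is more involved. The functions parse_priorities() and score() and both sample documents are invented for illustration.

```python
def parse_priorities(query):
    """Split 'term^number' items; a term without ^ gets the default priority 1."""
    priorities = {}
    for item in query.split():
        term, _, number = item.partition("^")
        priorities[term.lower()] = float(number) if number else 1.0
    return priorities

def score(text, priorities):
    """Naive score: each occurrence of a term counts as much as its priority."""
    words = text.lower().split()
    return sum(p * words.count(term) for term, p in priorities.items())

priorities = parse_priorities("stempowski^4 grydzewski")

# Two made-up documents that both contain both names.
doc_a = "grydzewski grydzewski stempowski"
doc_b = "stempowski stempowski grydzewski"

print(priorities)                # {'stempowski': 4.0, 'grydzewski': 1.0}
print(score(doc_a, priorities))  # 6.0
print(score(doc_b, priorities))  # 9.0
```

Under these assumptions doc_b, where the higher-priority name is more prominent, would be listed first.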

Grouping queries

Expressions in compound queries can be grouped using parentheses. Grouping lets you build elaborate queries with an intended, unambiguous meaning, just as in arithmetic.

Expressions inside the parentheses are evaluated first, and then the larger whole. The query "De revolutionibus orbium coelestium" AND (Copernicus OR Kopernik) finds documents containing the title of Copernicus's work and his name in at least one of the two forms.
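Why the parentheses matter can be seen by treating each term as a set of documents, as in the sketch below. It is an illustration only; the three toy documents and the helper containing() are invented, and the whole title is stored as a single entry for simplicity.

```python
# Toy collection: document id -> set of phrases or words it contains.
docs = {
    1: {"de revolutionibus orbium coelestium", "copernicus"},
    2: {"de revolutionibus orbium coelestium", "kopernik"},
    3: {"kopernik"},
}

def containing(term):
    """Return the ids of the documents that contain the given term."""
    return {doc_id for doc_id, terms in docs.items() if term in terms}

title = containing("de revolutionibus orbium coelestium")
copernicus = containing("copernicus")
kopernik = containing("kopernik")

# title AND (Copernicus OR Kopernik): the title is always required.
grouped = title & (copernicus | kopernik)

# (title AND Copernicus) OR Kopernik: a different grouping, a different result.
regrouped = (title & copernicus) | kopernik

print(sorted(grouped))    # [1, 2]
print(sorted(regrouped))  # [1, 2, 3]
```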

Special characters

The characters used to build compound queries (+ - && || ! ( ) { } [ ] ^ " ~ * ? : \) are treated differently from other characters: they are part of the query syntax rather than of the search expression. To search for them literally, precede each one with the escape character, a backslash (\). For example, to search for (2 + 2) * 2, type \(2 \+ 2\) \* 2.
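If queries are assembled by a program, the escaping can be automated. The following is a minimal sketch; escape_query() is a hypothetical helper written for this example, and for simplicity it escapes every & and | character, even a single one that would not form the && or || operator.

```python
# Characters that belong to the query syntax (from the list above).
# & and | are included individually as a simplification.
SPECIAL = set('+-&|!(){}[]^"~*?:\\')

def escape_query(text):
    """Put a backslash in front of every character that has a syntactic role."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in text)

print(escape_query("(2 + 2) * 2"))  # \(2 \+ 2\) \* 2
```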

Source description

For a full description of how to formulate queries, see the Jakarta Lucene Query Parser Syntax.

This text was originally published on the pages of the Kujawsko-Pomorska Digital Library.

This work is available under the Creative Commons Attribution-Share Alike 2.5 United States license.
