Help


The standard search function searches the entire core corpus by default. Here you can choose to include or exclude Europarl or TED from your query.

In the advanced search you can refine your search using different filters from the drop-down menus. In addition, you can activate or deactivate the options above the search fields.

Lemmatization is applied to your search by default. Enclose your search term in quotation marks (" ") to search for the exact word form or the exact phrase (see tables 1 and 2).

The search supports single-character (?) and multiple-character (*) wildcards as well as fuzzy searches (~) (see tables 1 and 2).
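The two character wildcards can be read as simple patterns over whole word forms. As a rough illustration only (this is not how Solr or PaGeS implements them), the following Python sketch translates a wildcard term into a regular expression and checks it against some word forms from table 1. The function names are hypothetical, and case-insensitive matching is an assumption based on the er*gen / Erfahrungen example.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a term with ? and * wildcards into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")      # exactly one character
        elif ch == "*":
            parts.append(".*")     # any run of characters
        else:
            parts.append(re.escape(ch))
    # Assumption: matching ignores case (er*gen finds Erfahrungen in table 1).
    return re.compile("".join(parts), re.IGNORECASE)


def matches(pattern: str, word: str) -> bool:
    """Return True if the whole word form fits the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None


words = ["erschlagen", "ertragen", "erzogen", "erlegen", "Erfahrungen"]
for pattern in ("er*gen", "er??gen", "er???gen"):
    hits = [w for w in words if matches(pattern, w)]
    print(pattern, "->", hits)
```

The sketch shows why er??gen finds erzogen but not ertragen: each ? stands for exactly one character, whereas * places no limit on length.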

For more complex searches, you can use the operators AND, OR, NOT and the distance operator ~, or a combination of them. These searches must be preceded by the expression [SS] (Solr Search) (see tables 1 and 2).
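If you assemble such queries programmatically before pasting them into the search field, a few small helpers can keep the [SS] prefix and the quotation marks consistent. The following Python sketch is only an illustration: the function names are hypothetical and are not part of PaGeS or Solr. It simply builds query strings of the kind shown in tables 1 and 2.

```python
def exact(term: str) -> str:
    """Wrap a word or phrase in quotation marks (exact form, no lemmatization)."""
    return f'"{term}"'


def solr_search(query: str) -> str:
    """Prefix a query with [SS], as required for operator searches."""
    return f"[SS] {query}"


def excluding(wanted: str, unwanted: str) -> str:
    """Build a query for `wanted` that excludes `unwanted` via NOT."""
    return solr_search(f"{wanted} NOT {unwanted}")


def within(words: list[str], max_distance: int) -> str:
    """Attach the distance operator ~N to the last word of a multiword query."""
    return solr_search(" ".join(words) + f"~{max_distance}")


print(excluding(exact("Glück"), "gut"))     # [SS] "Glück" NOT gut
print(excluding("er*gen", "*ungen"))        # [SS] er*gen NOT *ungen
print(within(["quedar", "remedio"], 2))     # [SS] quedar remedio~2
```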

You can search for one word or for a word sequence (= multiword search) (see table 2).

Results of multiword searches include text fragments in which the queried words appear at a distance of 0, 1 or 2 words between them. For longer distances between the searched words, attach the distance operator ~3, ~4, etc. to the last word of the query and begin the query with the [SS] operator.
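The notion of distance used here can be made concrete with a small check over a token sequence. The Python sketch below is an assumption-laden illustration, not the actual Solr matching logic: it takes a hand-written list of tokens (standing in for lemmatized text) and tests whether the second query word follows the first with at most a given number of words in between. The function name and the sample token lists are hypothetical.

```python
def within_distance(tokens: list[str], first: str, second: str, max_gap: int) -> bool:
    """Return True if `second` follows `first` with at most `max_gap` tokens between them."""
    for i, token in enumerate(tokens):
        if token != first:
            continue
        # The next max_gap + 1 positions cover gaps of 0, 1, ..., max_gap words.
        window = tokens[i + 1 : i + 2 + max_gap]
        if second in window:
            return True
    return False


# Hand-written token sequences for illustration only.
close = ["quedar", "más", "remedio"]             # one word in between
far = ["quedar", "expuesto", "sin", "remedio"]   # two words in between

print(within_distance(close, "quedar", "remedio", 1))
print(within_distance(far, "quedar", "remedio", 1))
print(within_distance(far, "quedar", "remedio", 2))
```

Because the check only looks to the right of the first word, it also reflects the fact that word order in the query matters.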

In multiword searches you can also use the multiple-character wildcard (*) to query any token (typically a word). You can also use the regular expression [a-z]+ to search for any word. These searches must be preceded by the expression [SS] (Solr Search) (see table 2).

The advanced search function supports bilingual searches. To search for an expression with a specific correspondence in the other language, enter each term (word or word sequence) in the corresponding language field (see table 3).


Table 1: Simple Word

Search                  Results
brachte                 bringen, brachte, hat gebracht, bringt, brachten…
"brachte"               brachte
relataría               relataría, relatar, relataba, ha sido relatado…
"relataría"             relataría
er*gen                  erschlagen, ertragen, erzogen, Erfahrungen, …
er??gen                 erzogen, erlegen, ergen…
er???gen                ertragen
*weise                  verständlicherweise, seltsamerweise, … (BUT NOT Weise)
Rithmus~                Rythmus, Rithmus, Rittmus
[SS] "Glück" NOT gut    Glück (exact word) BUT NOT gut (lemmatized)
[SS] er*gen NOT *ungen  erklingen, erschlagen, ergeben, … BUT NOT Erfahrungen, Erkundungen, …
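
The fuzzy search in table 1 (Rithmus~) returns spelling variants that differ from the query by only a few characters. Fuzzy matching of this kind is commonly based on edit distance. The Python sketch below computes the plain Levenshtein distance between the query and the listed results. It is a simplification for illustration and makes no claim about the exact algorithm or threshold used by PaGeS; the function name is hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char_a
                    current[j - 1] + 1,      # insert char_b
                    previous[j - 1] + cost,  # substitute or keep
                )
            )
        previous = current
    return previous[-1]


query = "Rithmus"
for candidate in ("Rythmus", "Rithmus", "Rittmus"):
    print(candidate, edit_distance(query, candidate))
```

Each of the three results in the table is at most one edit away from the query term.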

Table 2: Multiword search

Word order in the search is relevant.

Search                  Results
gutes Buch              gutes und altes Buch, gute Bücher, guten Büchern, …
"gutes Buch"            gutes Buch
gut "Buch"              gute Buch, gutes Buch, großes Buch and gut Glück (gut lemmatized and Buch exact word)
hat Angst               Lemmas haben and Angst at a distance of 0 or 1 words: habe Angst, hat Angst, hatte solche Angst, habe keine Angst, hatte ich Angst, hatte sowieso Angst
llevar a cuestas        Lemmas llevar, preposition a and lemma cuestas at a distance of 0 or 1 words between each (form) of them: llevar a cuestas, lleva nada a cuestas, lleva siempre a cuestas …
"llevamos a cuestas"    llevamos a cuestas
[SS] quedar remedio~2   Lemmas quedar and remedio at a maximum distance of 2 words: quedó más remedio, quedaba otro remedio, queda expuesto sin remedio
[SS] hinter * her       hinter ihm her, hinter Otello her, hinter Informationen her
[SS] an [a-z]+ vorbei   am Fenster vorbei, an ihm vorbei

Table 3: Bilingual search

 Search  Results
DEUTSCH: Mut
ESPAÑOL: valor
Bitexts where Mut corresponds to valor.
DEUTSCH: Mut
ESPAÑOL: [SS] NOT valor
Bitexts where Mut does not correspond to valor.
DEUTSCH: [SS] bleiben übrig~3
ESPAÑOL: "queda más remedio"
Bitexts with the lemmas bleiben and übrig at a maximum distance of 3 words in German and the exact phrase queda más remedio in Spanish.
DEUTSCH: [SS] NOT Unterschied
ESPAÑOL: [SS] marcar diferencia~2
Bitexts with the lemmas marcar and diferencia at a maximum distance of 2 words (marca las diferencias, marca grandes diferencias, marcaba una importante/honda diferencia), but where diferencia does not correspond to Unterschied.


If the search term is preceded by [SS] (Solr Search), you have full access to the query syntax of the underlying search engine, Solr (version 7.5.0). For more information, see the Solr query syntax documentation.


Display of the search results

Matches are displayed in sets of 100 bitexts.

The language in which the search term is entered is treated as the original language. In bilingual searches, German is treated as the original language for sorting and retrieval.

Matches in the original texts are displayed first in the left column. Matches in the translated texts are displayed afterwards in the right column. In cases where both versions are translations from a third language or the original language is unknown, German is conventionally presented in the left column and Spanish in the right column.

As for the sorting of the results, the matches in the original texts are shown first, followed by the matches in the translations and, finally, the matches in translations from third languages.
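
The sorting order described above amounts to ranking each match by the status of the text it comes from. The Python sketch below illustrates this with invented data: the field names ("id", "source") and the status labels are hypothetical and do not reflect the actual PaGeS data model.

```python
# Rank of each text status in the result list (lower comes first).
ORDER = {"original": 0, "translation": 1, "third_language": 2}


def sort_matches(matches: list[dict]) -> list[dict]:
    """Sort matches: originals first, then translations, then third-language translations."""
    # sorted() is stable, so matches with the same status keep their relative order.
    return sorted(matches, key=lambda match: ORDER[match["source"]])


matches = [
    {"id": "a", "source": "third_language"},
    {"id": "b", "source": "translation"},
    {"id": "c", "source": "original"},
    {"id": "d", "source": "translation"},
]

print([match["id"] for match in sort_matches(matches)])  # ['c', 'b', 'd', 'a']
```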

Reference information is displayed in square brackets after every match. It comprises the work ID and the corresponding section of the work (part and/or chapter), for example [0000, 1, 1]. Clicking on the reference takes you to a page where you can expand the context and, where available, access further bibliographic details.
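
If you copy matches into your own notes or scripts, the bracketed reference can be split into its components. The Python sketch below is a minimal illustration under two assumptions that are not confirmed by this page: that all fields are numeric, as in the example [0000, 1, 1], and that the reference is the last element of the copied line. The function name is hypothetical.

```python
import re

# Assumption: a reference is a bracketed, comma-separated list of digit strings
# at the end of the line, e.g. [0000, 1, 1].
REFERENCE = re.compile(r"\[(\d+(?:\s*,\s*\d+)*)\]\s*$")


def parse_reference(line: str):
    """Return (work_id, section_fields) for a match line, or None if no reference is found."""
    found = REFERENCE.search(line)
    if found is None:
        return None
    fields = [field.strip() for field in found.group(1).split(",")]
    # The first field is the work ID; the rest identify part and/or chapter.
    return fields[0], fields[1:]


print(parse_reference("hatte solche Angst [0000, 1, 1]"))  # ('0000', ['1', '1'])
```

Requiring digits keeps markers such as [n_t_s] or […] from being mistaken for a reference.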

You may find the following abbreviations within the texts:

[n_t_s] in the translated text means that a string of the original has not been translated; the string in question is also shown in square brackets.

[a_s_t] in the original text means that a sequence, shown in square brackets, has been added in the translation.

[…] indicates that a fragment of text has been omitted from the work (in original and in translation).

                                                    
PaGeS Vers. 2.1
Last updated: 04.12.2023
ISLRN 300-741-224-666-2
ISSN 2605-5228 ©PaCorES
Creative Commons License
University of Santiago de Compostela
This project is funded by the State Research Agency (AEI) of the Spanish Ministry of Science, Innovation and Universities (PID2021-125313OB-I00).