Once the graphic-phonetic varieties were collected, linguistic tagging criteria were established, taking into account that morpho-syntactic analysis aims to support automatic searches, within the tagging process, for relevant elements while paying attention to their number and function.

The methodology followed to extract and process the morphological data was the following:
  1. Definition of a series of inflection and conjugation models for noun and verb phrases, taking into consideration the dialectal and sub-dialectal differences.
  2. Manual lemmatisation of the terms, followed by their association with the aforementioned models. This yields all the inflected and conjugated forms of each element, properly classified together with the rest of the information added.
  3. With regard to verbal forms, the reconstruction has been made from the collection of textual testimonies and from the texts, since this last method, although completing all paradigms, would force us to work on an analogical basis and, therefore, the forms would prove to be less reliable.
  4. The part-of-speech tagging has been carried out in two stages. First, basic terms have been listed and, second, these terms have been classified into different categories.
For the POS tagging, a set of descriptive tags has been developed in accordance with XML (Extensible Markup Language) syntax.
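
Steps 1, 2 and 4 can be illustrated with a minimal sketch. It is not the project's implementation: the model name, the case labels, the suffix table and the XML element and attribute names (`entry`, `form`, `lemma`, `pos`, `type`) are all assumptions made for illustration, and the suffixes are a simplified subset of a noun paradigm that ignores dialectal variation.

```python
import xml.etree.ElementTree as ET

# Step 1 (assumed shape): an inflection model maps slot labels to suffixes.
# The model name and its contents are hypothetical and heavily simplified.
INFLECTION_MODELS = {
    "noun_model_1": {
        "absolutive_sg": "a",
        "absolutive_pl": "ak",
        "dative_sg": "ari",
        "genitive_sg": "aren",
    },
}

# Step 2: manual lemmatisation, each lemma is linked by hand to one model.
LEMMA_TO_MODEL = {
    "etxe": "noun_model_1",
    "gizon": "noun_model_1",
}


def inflect(lemma):
    """Return {slot label: inflected form} for every slot of the lemma's model."""
    model = INFLECTION_MODELS[LEMMA_TO_MODEL[lemma]]
    return {label: lemma + suffix for label, suffix in model.items()}


def to_xml(lemma, pos):
    """Serialise the generated forms as one XML entry (tag names are assumptions)."""
    entry = ET.Element("entry", {"lemma": lemma, "pos": pos})
    for label, form in inflect(lemma).items():
        form_el = ET.SubElement(entry, "form", {"type": label})
        form_el.text = form
    return ET.tostring(entry, encoding="unicode")


if __name__ == "__main__":
    print(to_xml("etxe", "noun"))
```

Keeping the models separate from the lemma list means that a dialectal or sub-dialectal variant could be handled by adding a further model and re-linking the affected lemmas, without touching the generation code.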

The methodology at the syntactic level differs in the way it is being developed. It is not yet available, since it is still in the implementation and testing stage.

Nevertheless, given the great difficulty of the project and the dialectal richness involved, we have been forced to progress cautiously and to restrict the analysis to a reduced area. We have been working on the morphological aspects of every dialect but, since syntax is an aspect of special difficulty, all our efforts in that area have been devoted to a single dialect. Thus, we are currently working on the Eastern Biscayan variety and will then continue with the rest of the dialects.

Furthermore, the morphological aspect is still incomplete. Some errors are awaiting correction and new formulas are still to be explored. Nonetheless, we strongly believe the results indicate that our project is moving in the right direction, and we would like to share these results with the academic world.
Bizkaiko Foru Aldundia - Diputación Foral de Bizkaia UNIVERSIDAD DE DEUSTO · DEUSTUKO UNIBERTSITATEA