Learnability of Semi-rigid Grammars

Jacek Marciniec (Adam Mickiewicz University, Poznań)

As we showed in [7, 6], the image of a unifiable and infinite set of types, given by its most general unifier, can be always established on the basis of its finite subset. There is a very strict relationship between this property and that of learnability (in the sense of Gold's identification in the limit [3]) of rigid categorial languages (cf. [4, 5]). The same property of compactness holds for optimal unification - the notion introduced in [2] as a natural generalization of its standard counterpart. However, an infinite set of types may generally have an infinite number of optimal unifiers. Therefore, a learning function for some class of non-rigid grammars cannot be grounded barely on Buszkowski and Penn's algorithm, following the way the learning function for rigid categorial grammars is based on Buszkowski's discovery procedure [1]. Some additional mechanism of selection has to be also involved. Usually, the grammar classes and the corresponding learning algorithms are described in terms of the conditions of quantitative nature, concerning the number of types assigned by the grammars. Our approach is different. We employ the presented in [8], modified kind of optimal unification, accompanied with the appropriate algorithm. For any, even infinite set of types that is bounded in length, there exist only a finite number of optimal unifiers of the new kind. The reduction of the number of solutions is achieved by enforcing the order in which types are unified. Our algorithm, applied to an infinite sequence of functor-argument structures (admitting finite type assignment - an enumeration of some functorial language for instance), always stabilizes after some step, so a learning function based on the algorithm always converges to some grammar. We characterize a class of semi-rigid categorial grammars that is learnable by the function, and which lies in-between (learnable) class of rigid grammars and (introduced in [5], not learnable) class of optimal grammars. The class definition does not rely on any relation between grammars, therefore, for any grammar its membership to the class can be determined by solely examining its primary type assignment. The learning function we provide is responsive, set-driven and consistent, but not prudent.

References

[1]    Wojciech Buszkowski. Discovery procedures for categorial grammars. In Evan Klein and Johann van Benthem, editors, Categories, Polymorphism and Unification. Universiteit van Amsterdam, Amsterdam, 1987.

[2]    Wojciech Buszkowski and Gerald Penn. Categorial grammars determined from linguistic data by unification. Studia Logica, XLIX(4):431-454, 1990.

[3]    E. Mark Gold. Language identification in the limit. Information and Control 10:447-474, 1967

[4]    Makoto Kanazawa. Identification in the limit of categorial grammars. Journal of Logic, Language and Information, 5(2):115-155, 1996.

[5]    Makoto Kanazawa. Learnable Classes of Categorial Grammars. Studies in Logic, Language and Information. CSLI Publications & FoLLI, Stanford, California, 1998.

[6]    Jacek Marciniec. Connected sets of types and categorial consequence. In Christian Retor´e, editor, Logical Aspects of Computational Linguistics, volume 1328 of Lecture Notes in Artificial Inteligence, pages 292-309. Springer, Berlin, 1997.

[7]     Jacek Marciniec. Infinite set unification with application to categorial grammar. Studia Logica, LVIII(3):339-355, 1997.

[8]    Jacek Marciniec. Optimal unification of infinite sets of types. Fundamenta Informaticae, 62(3,4):395-407, 2004.