This is an experimental feature that is currently in development and is not ready for general use. It will change in unpredictable backwards-incompati

[experimental] Natural Language Processing functions

submited by
Style Pass
2023-03-21 23:00:07

This is an experimental feature that is currently in development and is not ready for general use. It will change in unpredictable backwards-incompatible ways in future releases. Set allow_experimental_nlp_functions = 1 to enable it.

With the plain extension type we need to provide a path to a simple text file, where each line corresponds to a certain synonym set. Words in this line must be separated with space or tab characters.

With the wordnet extension type we need to provide a path to a directory with WordNet thesaurus in it. Thesaurus must contain a WordNet sense index.

Detects the language of the UTF8-encoded input string. The function uses the CLD2 library for detection, and it returns the 2-letter ISO language code.

Similar to the detectLanguage function, but detectLanguageMixed returns a Map of 2-letter language codes that are mapped to the percentage of the certain language in the text.

Similar to the detectLanguage function, except the detectLanguageUnknown function works with non-UTF8-encoded strings. Prefer this version when your character set is UTF-16 or UTF-32.

Leave a Comment