TOKENIZE(<X>, **kwargs)
: This operation tokenizes long texts into token windows of a
given length with a given stride. This is an element-wise column operation.
X should be a VARCHAR column. Each row is treated and tokenized as an independent long text.
Alternatively, one can specify the exact number of splits into which each long text is tokenized.
This operation returns an array-valued column that can be exploded with an unnesting operation.
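
The windowing and splitting behaviour described above can be illustrated with a small plain-Python sketch. This is not the implementation of TOKENIZE, and it does not use its actual keyword arguments, which are not listed here. The helper names (`window_tokens`, `split_tokens`, `tokenize_column`, `explode`), the parameter names (`length`, `stride`, `num_splits`), whitespace tokenization, and the handling of the final short window are all assumptions made for illustration.

```python
from typing import List, Tuple


def window_tokens(tokens: List[str], length: int, stride: int) -> List[List[str]]:
    """Slide a window of `length` tokens over `tokens`, advancing by `stride`."""
    if length <= 0 or stride <= 0:
        raise ValueError("length and stride must be positive")
    if not tokens:
        return []
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + length])
        # Stop once the current window reaches the end of the text.
        if start + length >= len(tokens):
            break
        start += stride
    return windows


def split_tokens(tokens: List[str], num_splits: int) -> List[List[str]]:
    """Divide `tokens` into `num_splits` contiguous, nearly equal chunks."""
    if num_splits <= 0:
        raise ValueError("num_splits must be positive")
    base, extra = divmod(len(tokens), num_splits)
    chunks = []
    start = 0
    for i in range(num_splits):
        # The first `extra` chunks take one additional token each.
        size = base + (1 if i < extra else 0)
        chunks.append(tokens[start:start + size])
        start += size
    return chunks


def tokenize_column(rows: List[str], length: int, stride: int) -> List[List[List[str]]]:
    """Element-wise: each row is tokenized independently (whitespace split assumed)."""
    return [window_tokens(text.split(), length, stride) for text in rows]


def explode(column: List[List[List[str]]]) -> List[Tuple[int, List[str]]]:
    """Unnest an array-valued column into (row_index, window) pairs."""
    return [(i, window) for i, windows in enumerate(column) for window in windows]
```

With these assumed helpers, `tokenize_column(["a b c d e", "x y"], length=3, stride=2)` yields one array of windows per row, and `explode` flattens that result into one window per output row, mirroring the unnesting step.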