python2-uniseg 0.7.1 Python library to determine Unicode text segmentations

Uniseg is a Python package used to determine Unicode text segmentations. Supported segmentations include:

  1. Code point (any value in the Unicode codespace)

  2. Grapheme cluster (a user-perceived character made of one or more Unicode code points, e.g. "G" followed by a combining acute accent)

  3. Word break

  4. Sentence break

  5. Line break
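
To illustrate the difference between the first two segmentations, here is a minimal sketch that uses only the Python standard library. It does not use uniseg, and it is not how uniseg works internally. The helper names `code_points` and `simple_grapheme_clusters` are hypothetical and chosen for this example. The grouping rule is a deliberate simplification that only attaches combining marks to the preceding character; the full grapheme cluster rules (Hangul syllables, emoji sequences, CR LF and so on) are not handled.

```python
import unicodedata


def code_points(text):
    """Return the individual code points of text as a list of 1-character strings."""
    return list(text)


def simple_grapheme_clusters(text):
    """Group each character with the combining marks that follow it.

    Simplified assumption: a code point whose general category starts
    with "M" (Mn, Mc, Me) extends the previous cluster. Everything else
    starts a new cluster. This is only an approximation of the real rules.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch  # attach the mark to the preceding cluster
        else:
            clusters.append(ch)  # start a new cluster
    return clusters


# "G" + COMBINING ACUTE ACCENT (U+0301), then "o"
text = u"G\u0301o"

points = code_points(text)                 # three code points
clusters = simple_grapheme_clusters(text)  # two user-perceived characters
```

The same string yields three code points but only two grapheme clusters, because the accent belongs to the "G" that precedes it. For real text, a library that follows the complete segmentation rules, such as uniseg, is the safer choice.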