
rippeltippel, 4y ago (0 points)

> This engine first determines the alphabet of the input text and searches for characters which are unique in one or more languages. If exactly one language can be reliably chosen this way, the statistical model is not necessary anymore.

Can this be a problem? If a text in Language_A includes names or words from Language_B, relying only on special characters would wrongly classify the entire text as Language_B.
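To make the concern concrete, here is a minimal sketch of a "unique character" shortcut as I read the quoted description. It is not the engine's actual code: the `UNIQUE_CHARS` table and the `detect_by_unique_chars` function are hypothetical, and the table only treats each character as unique within this toy three-language set.

```python
# Toy table, NOT the engine's real data: each character is assumed to be
# unique to one language within this small candidate set only.
UNIQUE_CHARS = {
    "ß": "German",
    "ñ": "Spanish",
    "ő": "Hungarian",
}


def detect_by_unique_chars(text):
    """Return a language if exactly one is implied by unique characters.

    Returns None when no unique character is found, or when the unique
    characters point to more than one language. In both cases a real
    engine would presumably fall back to its statistical model.
    """
    hits = {UNIQUE_CHARS[ch] for ch in text.lower() if ch in UNIQUE_CHARS}
    return hits.pop() if len(hits) == 1 else None


# An English sentence containing a single Spanish surname.
english_with_name = "Yesterday I had lunch with Mr. Muñoz in Boston."
# The same kind of sentence without any special character.
plain_english = "Yesterday I had lunch in Boston."
# Unique characters from two different languages in one text.
mixed_names = "Herr Groß met Mr. Muñoz."

print(detect_by_unique_chars(english_with_name))  # Spanish
print(detect_by_unique_chars(plain_english))      # None
print(detect_by_unique_chars(mixed_names))        # None
```

In this sketch a single "ñ" is enough to label the whole English sentence as Spanish, which is exactly the misclassification I am asking about. Whether the real engine guards against this (the quote does say "reliably chosen") is not something I can tell from the description alone.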

