ArabicStemmers_LightStemmers: Arabic Stemmer / Light Stemmer

Author: Motaz K. Saad <motaz.saad{[at]}gmail.com>
Maintainer: Motaz K. Saad <motaz.saad{[at]}gmail.com>

Provides stemming and light stemming for Arabic words.
Stemming reduces words to their stems. Light stemming, in contrast, removes common affixes from words without reducing them to their stems. The motivation for light stemming is that many word variants generated from the same root do not share similar meanings, so root extraction algorithms can change the meanings of words. Light stemming aims to improve feature/keyword reduction while retaining the words' meanings: it removes a defined set of prefixes and suffixes from the word instead of extracting the original root.
For more information, please refer to: Motaz K. Saad and Wesam Ashour, "Arabic Morphological Tools for Text Mining", 6th ArchEng International Symposiums, EEECS’10 the 6th International Symposium on Electrical and Electronics Engineering and Computer Science, pp. 112-117, European University of Lefke, Cyprus, 2010.
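The light stemming idea described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the package's implementation: the class name LightStemSketch, the affix lists, and the minimum stem length are all hypothetical choices made for this example, and the input is assumed to be a single undiacritized word. String literals are written as Unicode escapes so the sketch compiles regardless of the source file encoding.

```java
import java.util.List;

// Illustrative light stemmer: strips at most one prefix and one suffix.
// The affix lists and the length threshold are assumptions for this sketch;
// they are NOT taken from the package.
final class LightStemSketch {

    // Longer prefixes come first so "wa-al-" is tried before "al-" or "wa-".
    private static final List<String> PREFIXES = List.of(
            "\u0648\u0627\u0644", // waw + alef + lam  ("wa-al-")
            "\u0628\u0627\u0644", // beh + alef + lam  ("bi-al-")
            "\u0643\u0627\u0644", // kaf + alef + lam  ("ka-al-")
            "\u0641\u0627\u0644", // feh + alef + lam  ("fa-al-")
            "\u0627\u0644",       // alef + lam        ("al-")
            "\u0644\u0644",       // lam + lam         ("li-l-")
            "\u0648");            // waw               ("wa-")

    // Longer suffixes come first for the same reason.
    private static final List<String> SUFFIXES = List.of(
            "\u0647\u0627", "\u0627\u0646", "\u0627\u062A", "\u0648\u0646",
            "\u064A\u0646", "\u064A\u0647", "\u064A\u0629",
            "\u0647", "\u0629", "\u064A");

    // Hypothetical guard: never shrink a word below this many letters.
    private static final int MIN_STEM_LENGTH = 3;

    static String lightStem(String word) {
        String result = word;
        // Remove the first matching prefix, if enough letters remain.
        for (String prefix : PREFIXES) {
            if (result.startsWith(prefix)
                    && result.length() - prefix.length() >= MIN_STEM_LENGTH) {
                result = result.substring(prefix.length());
                break;
            }
        }
        // Remove the first matching suffix, if enough letters remain.
        for (String suffix : SUFFIXES) {
            if (result.endsWith(suffix)
                    && result.length() - suffix.length() >= MIN_STEM_LENGTH) {
                result = result.substring(0, result.length() - suffix.length());
                break;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "al-kitab" (alef lam kaf teh alef beh)
        System.out.println(lightStem("\u0627\u0644\u0643\u062A\u0627\u0628"));
        // "wa-al-mu'allimun" (waw alef lam meem ain lam meem waw noon)
        System.out.println(lightStem("\u0648\u0627\u0644\u0645\u0639\u0644\u0645\u0648\u0646"));
    }
}
```

With these assumed lists, lightStem applied to الكتاب returns كتاب and applied to والمعلمون returns معلم. In both cases the result is still a recognizable word rather than a bare root, which is the property the description above attributes to light stemming.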
This package requires Weka to be run with the (-Dfile.encoding=utf-8) option.
Alternatively, change the line (fileEncoding=Cp1252) to (fileEncoding=utf-8) in the weka.ini file.
Input files should also be in UTF-8 format.
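To check whether a JVM was actually started with a UTF-8 default, a small stand-alone program such as the following may help. It is a minimal sketch and not part of the package; the class name EncodingCheck and its methods are hypothetical. It also shows decoding bytes with an explicit charset, which does not depend on the JVM default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for inspecting the JVM's text encoding settings.
final class EncodingCheck {

    // True when the JVM default charset is UTF-8.
    static boolean isUtf8Default() {
        return StandardCharsets.UTF_8.equals(Charset.defaultCharset());
    }

    // Decodes bytes as UTF-8 explicitly, independent of the JVM default.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());
        if (!isUtf8Default()) {
            System.err.println("Default charset is not UTF-8; Arabic text may be garbled.");
        }
    }
}
```

Running it once with and once without -Dfile.encoding=utf-8 on the java command line shows whether the option changes the default on a given system.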
To use Open Source Arabic Corpora (OSAC), please refer to Motaz K. Saad and Wesam Ashour, "OSAC: Open Source Arabic Corpora", 6th ArchEng International Symposiums, EEECS’10 the 6th International Symposium on Electrical and Electronics Engineering and Computer Science, pp. 118-123, European University of Lefke, Cyprus, 2010.
OSAC can be downloaded from: http://sourceforge.net/projects/ar-text-mining/files/Arabic-Corpora/
For other Arabic corpora, please refer to http://aracorpus.e3rab.com

All available versions:
Latest
1.0.0