Feature Selection Method Based on Improved Document Frequency
Abstract: Feature selection is
an important step in text classification, and the choice of evaluation function
directly affects its quality. Document frequency (DF) is one of the commonly
used feature selection methods. Its shortcomings are that the construction of
its evaluation function lacks a theoretical basis and that it tends to select
high-frequency words. To address this problem, we propose an improved algorithm,
named DFM, that combines DF with the class distribution of features, and we
implement it in software. DFM was compared experimentally with several commonly
used feature selection methods, using a support vector machine as the text
classifier. The results show that DFM performs stably in feature selection and
yields better classification results than the other methods.
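
To make the shortcoming of plain DF concrete, the following is a minimal sketch of baseline DF-based selection only. It is not the DFM algorithm, whose evaluation function is not given in this abstract. The function names and the toy corpus are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        # set(): a term is counted at most once per document
        df.update(set(tokens))
    return df


def select_by_df(docs, k):
    """Return the k terms with the highest document frequency."""
    df = document_frequency(docs)
    # Sort by descending DF, then alphabetically so ties are deterministic.
    ranked = sorted(df.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical toy corpus (assumption): two sports and two finance documents.
docs = [
    ["the", "match", "ended", "in", "a", "goal"],   # sports
    ["the", "team", "scored", "a", "goal"],         # sports
    ["the", "market", "fell", "on", "a", "rumor"],  # finance
    ["the", "bank", "raised", "a", "rate"],         # finance
]

top_terms = select_by_df(docs, 3)
print(top_terms)
```

In this toy corpus the terms "the" and "a" occur in every document of both classes, so they rank first under plain DF even though they carry no information about the class. This is the high-frequency bias that motivates taking class distribution into account.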
Author: Wei Zheng
Journal Code: jptkomputergg140124