Rhetorical Sentence Classification for Automatic Title Generation in Scientific Article
Abstract: In this paper, we
present the construction of a rhetorical corpus and experiments on a rhetorical
sentence classification model that can be incorporated into the automatic title
generation task for scientific articles. Rhetorical classification is treated as
a sequence labeling task. A rhetorical sentence classification model is useful in
tasks that consider a document’s discourse structure. We performed experiments on
datasets from two domains: computer science (CS dataset) and chemistry (GaN
dataset). We evaluated the models using 10-fold cross-validation (0.70-0.79
weighted average F-measure) as well as on-the-run (0.30-0.36 error rate at
best). We argue that our models perform best when the imbalanced data are handled
with a SMOTE filter.
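
The weighted average F-measure reported above can be illustrated with a minimal sketch. It assumes the usual definition: the F1 score of each label, weighted by that label's share of the gold sentences. The function name `weighted_f_measure` and the label names `AIM` and `OWN` are hypothetical and used only for illustration; they are not taken from the paper.

```python
from collections import Counter


def weighted_f_measure(gold, predicted):
    """Per-label F1, averaged with weights equal to each label's gold frequency."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one labeled sentence is required")

    support = Counter(gold)  # number of gold sentences per label
    total = len(gold)
    score = 0.0
    for label, count in support.items():
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = count - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        score += (count / total) * f1  # weight by gold frequency
    return score


# Hypothetical rhetorical labels for four sentences (illustrative only).
gold_labels = ["AIM", "OWN", "OWN", "OWN"]
predicted_labels = ["AIM", "OWN", "OWN", "AIM"]
example_score = weighted_f_measure(gold_labels, predicted_labels)
```

Because the weights follow the gold label distribution, frequent rhetorical categories dominate this score, which is one reason class imbalance matters when reading it.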
Keywords: rhetorical corpus construction, rhetorical classification, automatic title generation, scientific article
Author: Jan Wira Gotama Putra, Masayu Leylia Khodra
Journal Code: jptkomputergg170126