Streamed Sampling on Dynamic data as Support for Classification Model
Abstract: Data mining on dynamically changing data faces several problems,
such as unknown data size and a changing class distribution. Random sampling is
commonly applied to extract a general synopsis from a very large database. In
this research, Vitter’s reservoir algorithm is used to retrieve k records from
the database and place them in the sample. The sample is used as input for the
classification task in data mining. The sample is a backing sample, stored as a
table containing the values of id, priority and timestamp. The priority
indicates the probability of how long a record is retained in the sample.
Kullback-Leibler divergence is applied to measure the similarity between the
database and sample distributions. The results show that samples can be taken
randomly and continuously as transactions occur. A Kullback-Leibler divergence
in the interval from 0 to 0.0001 is a very good measure for maintaining a
similar class distribution between database and sample. The sample is always
kept up to date with new transactions while retaining a similar class
distribution. A classifier built from a balanced class distribution showed
better performance than one built from an imbalanced distribution.
Keywords: random sample, relative entropy, skewness, Kullback-Leibler
divergence, dynamic classification
Authors: Astried Silvanie, Taufik Djatna, Heru Sukoco
Journal Code: jptkomputergg130119
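
The two techniques named in the abstract, reservoir sampling and Kullback-Leibler divergence between class distributions, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses the basic reservoir scheme (Algorithm R) rather than the backing sample with priority and timestamp columns, and the function names, the synthetic records, the smoothing constant `eps`, and the direction of the divergence (database relative to sample) are all assumptions made for illustration. Only the 0.0001 threshold is taken from the abstract.

```python
import math
import random
from collections import Counter


def reservoir_sample(stream, k, rng=random):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    sample = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # The n-th item replaces a random slot with probability k / n.
            j = rng.randrange(n)
            if j < k:
                sample[j] = item
    return sample


def class_distribution(labels, classes):
    """Relative frequency of each class, in the order given by `classes`."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [counts[c] / total for c in classes]


def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) with natural log; eps (an assumption) guards against q_i = 0."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical stream of (id, class label) records with a skewed class mix.
    records = [(i, "pos" if rng.random() < 0.3 else "neg") for i in range(10000)]
    sample = reservoir_sample(records, 500, rng)

    classes = ["neg", "pos"]
    p = class_distribution((label for _, label in records), classes)
    q = class_distribution((label for _, label in sample), classes)

    kl_threshold = 1e-4  # upper end of the interval quoted in the abstract
    d = kl_divergence(p, q)
    print("KL divergence:", d, "within threshold:", d <= kl_threshold)
```

A sample whose divergence exceeds the threshold could simply be redrawn or updated further; how the original work reacts in that case is not stated in the abstract.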

Related articles:
Jp Teknik Komputer gg 2013
- Gamelan Music Onset Detection based on Spectral Features
- A Mobile Ecotourism Recommendations System Using Cars-Context Aware Approaches
- Improved Harmony Search Algorithm with Chaos for Absolute Value Equation
- Future Smart Cooking Machine System Design
- A Review of Communication Protocols for Intelligent Remote Terminal Unit Development
- Power Balance AODV Algorithm of WSN in Agriculture Monitoring
- Circularly Polarized Proximity-Fed Microstrip Array Antenna for Micro Satellite
- Ovarian Cancer Identification using One-Pass Clustering and k-Nearest Neighbors
- Localizing Region-Based Level-set Contouring for Common Carotid Artery in Ultrasonography
- Separability Filter for Localizing Abnormal Pupil: Identification of Input Image
- Feature Extraction of Composite Damage on Acoustic Emission Signals
- A Camera Self-Calibration Method Based on Plane Lattice and Orthogonality
- Application of Wavelet Analysis in Detecting Runway Foreign Object Debris
- Palmprint Verification Using Time Series Method
- Time Series Based for Online Signature Verification
- Two-phase Flow Visualization Employing Gauss-Newton Method in Microchannel
- Study on Thermal Conductivity Methane Sensor Constant Temperature Detection Method
- Fuzzy Adaptive PID Control of a New Hydraulic Erecting Mechanism
- Optimization of Membership Functions for the Fuzzy Controllers of the Water Tank and Inverted Pendulum with Differents PSO Variants
- Study of an Improved Fuzzy Direct Torque Control of Induction Motor
- Research of NiMH Battery Modeling and Simulation Based on Linear Regression Analysis Method
- Design and Modeling of an Integrated Micro-Transformer in a Flyback Converter
- Renewable Distributed Generation Models in Three-Phase Load Flow Analysis for Smart Grid
- A Design Study of Dual-Stator Permanent Magnet Brushless DC Motor