İleri Mühendislik Çalışmaları ve Teknolojileri Dergisi
Yazarlar: Zeynep ORMAN, Mert KAVİ
Konular:Mühendislik
Anahtar Kelimeler:Distributed systems,Big data,Stream processing,Scalability,Queuing theory
Özet: Establishing large-scale distributed stream processing systems and ensuring their operations is a very complex and costly process. These systems should be capable of adapting the varying rates of data stream and they must be scaled, if required. It is usually inevitable to use an effective automatic scaling system which can be integrated into such systems. In recent literature, there are numerous studies on this issue. Many of these studies have focused on how these systems will operate under normal conditions. There are limited studies on scalability where scaling is usually implemented with a set of resources. In this study, based on these shortcomings, a system design which can adapt to changing working loads and work on Apache Flink, is proposed. Apache Flink is used for both system development and calculating the scaling metrics. Scaling is performed by evaluating the expected latency calculated with Queuing Theory and some critical metrics. It is aimed to improve system performances and reduce quality losses with this model, which can be integrated into big data processing systems. Pre-scaling and post-scaling cases are also demonstrated by simulations to show the effectiveness of the proposed system.