Real-Time Big Data Analytics for Data Stream Challenges: An Overview
The conventional approach to evaluating massive data sets is unsuitable for real-time analysis; analysing big data in a data stream therefore remains a critical issue for numerous applications. In real-time big data analytics, data must be processed as they arrive so that systems can react quickly and support good decision making, which calls for a novel architecture that provides real-time processing at high speed and low latency. Processing and analysing a data stream in real time is critical for a variety of applications; however, handling large volumes of data from diverse sources, such as sensor networks, web traffic, social media, and video streams, is a considerable challenge. The main goal of this paper is to give an overview of current architectures for real-time big data analytics and of the available real-time data stream processing methods, including the Lambda, Kappa, and Delta system architectures for large data stream processing.