ACTIVE ANTI-ENTROPY MECHANISM BASED ON SPECTRAL BLOOM FILTER AND PH-2 HASH ALGORITHM FOR RECONCILIATION OF REPLICAS OF NOSQL DISTRIBUTED DOCUMENT-ORIENTED DATABASES
DOI:
https://doi.org/10.32689/maup.it.2023.3.8
Keywords:
NoSQL, distributed system, Active Anti-Entropy, Spectral Bloom Filter, consistency, PH2 hash algorithm
Abstract
Information systems are used in many areas of human activity and often serve users across countries and continents. Such systems may need horizontal scaling to function properly; neglecting it can degrade performance and availability, which in turn can cost the system its reputation and users. Horizontal scaling increases the number of database replicas, and because write operations are applied to different nodes, the replicas diverge (entropy grows) and must be reconciled. Several techniques aim to reduce this entropy, among them Active Anti-Entropy, which detects inconsistencies between replicas and triggers reconciliation. It is used in databases such as Riak and relies on the Merkle tree data structure, which is built on hashing algorithms. The speed of inconsistency detection depends on the chosen hashing algorithm and on the number of documents in the collection. As the number of documents, or even their size, grows, the hash distribution can become less uniform and collisions more frequent. A collision prolongs the period of data inconsistency because the system fails to detect the inconsistency in time. Beyond collisions, the network transfer delay between nodes must be taken into account, and the check is not a one-time operation: hashes must be continually recomputed on the replicas and exchanged for verification. Minimizing the time of these operations speeds up reconciliation. Critically important data must be reconciled with minimal delay, since an untimely or incorrect decision can lead to material or even human losses. This calls for a solution that minimizes the reconciliation delay for such data.
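The Merkle-tree comparison that the abstract describes can be illustrated with a minimal Python sketch. It is not the paper's method: SHA-256 stands in for the PH-2 algorithm, which is not reproduced here, and the function names, the replica layout (a dictionary from document id to body), and the sample documents are all hypothetical. Two replicas are treated as consistent when their root hashes match; otherwise the sketch falls back to comparing leaf hashes directly, whereas a real implementation would descend the tree level by level.

```python
import hashlib


def leaf_hash(doc_id: str, body: str) -> bytes:
    # SHA-256 is an assumption for illustration, not the PH-2 algorithm.
    return hashlib.sha256(f"{doc_id}\x00{body}".encode("utf-8")).digest()


def merkle_root(docs: dict) -> bytes:
    """Fold the leaf hashes of a replica (sorted by document id) into one root."""
    level = [leaf_hash(k, docs[k]) for k in sorted(docs)]
    if not level:
        return hashlib.sha256(b"").digest()
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def divergent_ids(a: dict, b: dict) -> list:
    """Ids missing on one replica or whose leaf hashes differ (simplified scan)."""
    return sorted(
        k
        for k in a.keys() | b.keys()
        if k not in a or k not in b or leaf_hash(k, a[k]) != leaf_hash(k, b[k])
    )


# Hypothetical replicas: one document was updated on replica_b only.
replica_a = {"doc1": "alpha", "doc2": "beta", "doc3": "gamma"}
replica_b = {"doc1": "alpha", "doc2": "BETA", "doc3": "gamma"}

in_sync = merkle_root(replica_a) == merkle_root(replica_b)
to_reconcile = [] if in_sync else divergent_ids(replica_a, replica_b)
```

Only the root hash has to cross the network in the common case where replicas agree, which is why the choice of hash function and its collision behaviour directly affect how quickly a divergence is noticed.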