As the flagship product of HashData, HashData Warehouse integrates the high performance and rich analytics of MPP databases, the scalability and flexibility of big data platforms, and the elasticity and agility of cloud computing. Built on an architecture that separates metadata, computation, and storage, HashData Warehouse offers a combination of high concurrency, elasticity, ease of use, high availability, high performance, and scalability that traditional solutions cannot match.
Typical HashData use cases involve large data volumes, complex queries, highly concurrent access, and strict system availability requirements. Beyond the public cloud data warehouse service, HashData also supports private and hybrid cloud deployments to fit local IT and business environments.
Pay-as-you-go pricing and the low unit cost of object storage considerably reduce the cost of running a cloud data warehouse on object storage. In addition, the compression algorithm unique to HashData Warehouse improves storage space efficiency by 2 to 3 times.
Metadata is persisted using a globally transactional, distributed Key-Value (KV) database. We've designed a stateless service node layer on top of this KV database to process access requests to system metadata coming from the computing layer, all while maintaining high data availability.
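The idea of stateless service nodes on top of a shared KV store can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `InMemoryKV` is a single-process stand-in for the distributed KV database, and `MetadataService`, `register_table`, and `lookup_table` are hypothetical names, not HashData APIs.

```python
import threading


class InMemoryKV:
    """Hypothetical single-process stand-in for the distributed KV database."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def compare_and_set(self, key, expected, new):
        """Write `new` only if the current value equals `expected`."""
        with self._lock:
            if self._data.get(key) != expected:
                return False
            self._data[key] = new
            return True


class MetadataService:
    """Hypothetical stateless node: keeps a handle to the store, no state of its own."""

    def __init__(self, store):
        self._store = store

    def register_table(self, name, schema):
        # Succeeds only if no table with this name exists yet.
        return self._store.compare_and_set(f"table/{name}", None, schema)

    def lookup_table(self, name):
        return self._store.get(f"table/{name}")


store = InMemoryKV()
node_a = MetadataService(store)
node_b = MetadataService(store)

created = node_a.register_table("orders", {"id": "bigint", "amount": "numeric"})
duplicate = node_b.register_table("orders", {"id": "int"})
schema_seen_by_b = node_b.lookup_table("orders")
```

Because neither service node keeps state of its own, either one can answer any request, and a failed node can be replaced without losing metadata. In a real deployment the atomic check-and-write would be provided by the KV database's transactions rather than by a local lock.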
Using a UDP-based high-speed data transmission protocol, HashData Warehouse exchanges data between computing nodes while queries run in a streamlined, parallel fashion on each node, which significantly improves query efficiency. Computing nodes are pure compute resources that can be created, deleted, and scaled vertically as needed, and each comes with local SSD storage used for caching.
HashData Warehouse horizontally scales the concurrent computing capacity of the cluster by adding physical clusters, while maintaining a shared, unified metadata and data storage system, and ensuring strong data consistency across all clusters.
HashData Warehouse can start an independent cluster for each workload as needed. This meets the different computing node configuration requirements of different workloads and also eliminates the performance problems caused by resource contention between them.
Based on shared data storage, HashData Warehouse employs a consistent hashing distribution scheme to avoid data migration when new nodes are added.
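The general technique can be sketched as follows. This is a generic consistent hash ring with virtual nodes, not HashData's actual implementation; the class `HashRing`, the `vnodes` parameter, and the node and block names are assumptions chosen for illustration.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Generic consistent hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each node owns several positions so load spreads more evenly.
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            if point in self._owner:
                continue  # extremely unlikely collision; skip the duplicate
            self._owner[point] = node
            bisect.insort(self._points, point)

    def node_for(self, key: str) -> str:
        # The first ring position clockwise from the key's hash owns the key.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["node-1", "node-2", "node-3"])
keys = [f"block-{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-4")
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

After `node-4` joins, only the keys in `moved` change owner, and all of them go to the new node; every other assignment stays put. With data held in shared storage, such a reassignment changes which node serves a block rather than requiring the block itself to be copied between nodes.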
Copyright© 2024 HashData Technology (Hong Kong) Limited - All rights reserved.