Big Data Clusters were introduced in SQL Server 2019.
SQL Server 2019 includes Hadoop Distributed File System (HDFS) and Apache Spark, which help scale both computation and storage. Together, all three of them are called a “Big Data Cluster”.
SQL Server database engine + Spark + HDFS = Big Data Cluster
Where SQL is not enough…
It’s not practical to run analytics on petabyte-scale data on a single instance of SQL Server. Newer industry requirements such as large-scale data processing or machine learning need scale-out computing, which a single instance cannot provide, and the same goes for storing and analyzing unstructured data. So, Microsoft came out with a new solution to extend the features of SQL Server.
SQL Server 2019 deploys multiple instances of SQL Server together with Spark and HDFS to create a Big Data Cluster, and in doing so it adds support for big and unstructured data.
Azure Cloud Push
Overall, with this update Microsoft is pushing its Azure cloud hard, and why not? It’s a world-class platform.
You can deploy Big Data Clusters on any cloud that offers a managed Kubernetes service, for example Azure Kubernetes Service (AKS).
You can also deploy Big Data Clusters on Kubernetes clusters that run on-premises, for example on Azure Stack.
Big Data Clusters offer many built-in management services, such as backup, analytics, and an admin portal for a quick overview.
A few features of Big Data Clusters
- Deploy Big Data Clusters of SQL Server, Spark, and HDFS on Kubernetes.
- Read, write, and process big data using T-SQL or using Spark.
- Combine and analyze high-value relational data with high-volume big data.
- Query data stored in multiple external sources, such as SQL Server, Oracle, Teradata, and MongoDB, through the cluster.
- Make use of AI and machine learning on the data.
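To make the external-source feature a bit more concrete, here is a minimal sketch in Python that builds the kind of T-SQL you would run to expose a remote Oracle table inside the cluster. It assumes the standard `CREATE EXTERNAL DATA SOURCE` and `CREATE EXTERNAL TABLE` statements; every name in it (the `ExternalSource` class, the helper functions, the host, the credential, and the table names) is hypothetical and only there for illustration. It generates the statements as text and does not connect to any server.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ExternalSource:
    """A remote database to expose through the cluster (all values hypothetical)."""
    name: str        # name of the external data source object
    location: str    # connector URI, e.g. 'oracle://host:1521'
    credential: str  # name of an existing database scoped credential


def create_data_source_sql(src: ExternalSource) -> str:
    """Build a CREATE EXTERNAL DATA SOURCE statement for the given source."""
    return (
        f"CREATE EXTERNAL DATA SOURCE {src.name}\n"
        f"WITH (LOCATION = '{src.location}', CREDENTIAL = {src.credential});"
    )


def create_external_table_sql(
    table: str,
    columns: List[Tuple[str, str]],
    src: ExternalSource,
    remote_object: str,
) -> str:
    """Build a CREATE EXTERNAL TABLE statement that maps to a remote object."""
    # Sketch only: identifiers are inserted as-is, with no quoting or validation.
    cols = ",\n    ".join(f"{col} {sql_type}" for col, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} (\n    {cols}\n)\n"
        f"WITH (LOCATION = '{remote_object}', DATA_SOURCE = {src.name});"
    )


if __name__ == "__main__":
    # Hypothetical Oracle server and credential name.
    oracle = ExternalSource(
        name="OracleSales",
        location="oracle://oracle.example.internal:1521",
        credential="OracleCredential",
    )
    print(create_data_source_sql(oracle))
    print(
        create_external_table_sql(
            "dbo.RemoteCustomers",
            [("customer_id", "INT"), ("name", "NVARCHAR(100)")],
            oracle,
            "XE.SALES.CUSTOMERS",
        )
    )
    # Once the external table exists, it can be joined like a local table.
    print(
        "SELECT o.order_id, c.name\n"
        "FROM dbo.Orders AS o\n"
        "JOIN dbo.RemoteCustomers AS c ON c.customer_id = o.customer_id;"
    )
```

The point of the sketch is the last query: after the external table is defined, the remote data is queried with ordinary T-SQL, right next to the local relational tables.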
Conclusion
Big Data Clusters are a great way to use good old SQL Server for relational big data and to scale it without worrying about processing speed. But it’s not limited to just that. With Big Data Clusters, you can build a complete AI platform for creating intelligent apps for your organization.