Yogesh Chauhan's Blog

What are Big Data Clusters in SQL Server?

in SQL/MySQL on November 10, 2021

Big Data Clusters were introduced in SQL Server 2019.

SQL Server 2019 bundles the Hadoop Distributed File System (HDFS) and Apache Spark alongside the database engine, which lets both computation and storage scale out. Together, the three of them are called a “big data cluster”.

SQL Server database engine + Spark + HDFS = Big Data Cluster

Where SQL is not enough…

A single instance of SQL Server cannot run analytics over data at the petabyte scale. Newer industry requirements, such as data processing or machine learning that need scale-out computing, or storing and analyzing unstructured data, are also out of reach for a single instance. So, Microsoft came out with a new solution to extend the features of SQL Server.

SQL Server 2019 deploys multiple instances of SQL Server together with Spark and HDFS to create a Big Data Cluster, and in doing so it adds support for big and unstructured data.

Azure Cloud Push

Overall, with this update Microsoft is pushing its Azure Cloud hard, and why not? It’s a world-class platform.

You can deploy Big Data Clusters on any cloud that offers a managed Kubernetes service, e.g. Azure Kubernetes Service (AKS).

You can also deploy Big Data Clusters on on-premises Kubernetes clusters, e.g. a cluster built with kubeadm, or on Azure Stack.

Big Data Clusters offer many built-in management services, such as backup, analytics, and an admin portal for a quick view.

A few features of Big Data Clusters

  • Deploy Big Data Clusters of SQL Server, Spark, and HDFS on Kubernetes.
  • Read, write, and process big data using T-SQL or using Spark.
  • Combine and analyze high-value relational data with high-volume big data.
  • Query the stored data from multiple external sources such as SQL Server, Oracle, Teradata, MongoDB etc. through the cluster.
  • Make use of AI and machine learning on the data.
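To make the "read big data using T-SQL" feature above a little more concrete, here is a minimal sketch of querying CSV files in the cluster's HDFS storage through an external table. It assumes a running Big Data Cluster and follows the pattern in Microsoft's HDFS tutorial; the table name, column names, file format name, and the /clickstream_data path are hypothetical placeholders, so adjust them to your own data.

```sql
-- Assumption: run this against the SQL Server master instance of a Big Data Cluster.

-- Point SQL Server at the HDFS storage pool inside the cluster.
IF NOT EXISTS (SELECT * FROM sys.external_data_sources WHERE name = 'SqlStoragePool')
    CREATE EXTERNAL DATA SOURCE SqlStoragePool
    WITH (LOCATION = 'sqlhdfs://controller-svc/default');

-- Describe how the files are laid out (comma-separated, header on the first row).
-- "csv_file" is a hypothetical name.
CREATE EXTERNAL FILE FORMAT csv_file
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = ',',
        STRING_DELIMITER = '"',
        FIRST_ROW = 2,
        USE_TYPE_DEFAULT = TRUE
    )
);

-- Expose the HDFS directory as a table. Table, columns, and path are hypothetical.
CREATE EXTERNAL TABLE web_clickstreams_hdfs (
    click_date_sk BIGINT,
    item_sk       BIGINT,
    user_sk       BIGINT
)
WITH (
    DATA_SOURCE = SqlStoragePool,
    LOCATION    = '/clickstream_data',
    FILE_FORMAT = csv_file
);

-- From here on, it is queried like any other table.
SELECT TOP 10 item_sk, COUNT(*) AS clicks
FROM web_clickstreams_hdfs
GROUP BY item_sk
ORDER BY clicks DESC;
```

Once the external table exists, it can also be joined with regular relational tables in the same query, which is how high-value relational data gets combined with high-volume big data.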


Big Data Clusters are a great way to use good old SQL Server for relational big data and scale it out without worrying about processing speed. They are not limited to just that. With Big Data Clusters, you can build a complete AI platform to create intelligent apps for your organization.
