Yogesh Chauhan's Blog

What are Big Data Clusters in SQL Server?

in SQL/MySQL on November 10, 2021

Big Data Clusters were introduced in SQL Server 2019.

SQL Server 2019 includes Hadoop Distributed File System (HDFS) and Apache Spark, which help scale both computation and storage. Together, all three of them are called a “big data cluster”.

SQL Server database engine + Spark + HDFS = Big Data Cluster

Where SQL is not enough…

It’s not possible to perform analytics on large-scale data, such as petabytes, on a single instance of SQL Server. Newer industry requirements like data processing or machine learning, which need scale-out computing, are not feasible on a single instance either, and neither is storing and analyzing unstructured data. So, Microsoft came out with a new solution to extend the features of SQL Server.

SQL Server 2019 deploys multiple instances of SQL Server with Spark and HDFS to create a Big Data Cluster, and in doing so it adds support for big and unstructured data.

Azure Cloud Push

Overall, with this update Microsoft is pushing its Azure Cloud hard, and why not? It’s a world-class platform.

You can deploy Big Data Clusters on any cloud that offers a managed Kubernetes service, e.g. Azure Kubernetes Service (AKS).

You can also deploy Big Data Clusters on on-premises Kubernetes clusters, e.g. on Azure Stack.

Big Data Clusters offer many built-in management services such as backup, analytics, and an admin portal for a quick view.

A few features of Big Data Clusters

  • Deploy Big Data Clusters of SQL Server, Spark, and HDFS on Kubernetes.
  • Read, write, and process big data using T-SQL or using Spark.
  • Combine and analyze high-value relational data with high-volume big data.
  • Query data stored in multiple external sources such as SQL Server, Oracle, Teradata, MongoDB, etc. through the cluster.
  • Make use of AI and machine learning on the data.
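
To make the "read big data using T-SQL" point a bit more concrete, here is a minimal sketch in Python that only builds the T-SQL text for exposing a folder of CSV files in the cluster's HDFS as an external table. The data source definition follows the pattern used in Microsoft's tutorials. The helper external_table_sql, the table and column names, the HDFS path, and the csv_file format name are all hypothetical and only there for illustration. Nothing here connects to a server.

```python
# Sketch only: builds T-SQL strings, it does not connect to any server.
# Table, column, path, and file format names below are hypothetical.

# Data source pointing at the cluster's HDFS storage pool
# (pattern taken from Microsoft's Big Data Clusters tutorials).
CREATE_DATA_SOURCE = (
    "CREATE EXTERNAL DATA SOURCE SqlStoragePool\n"
    "WITH (LOCATION = 'sqlhdfs://controller-svc/default');"
)


def external_table_sql(table, columns, hdfs_dir, file_format):
    """Return T-SQL that exposes a directory of files in HDFS as an external table.

    columns is a list of (name, sql_type) pairs. file_format is the name of an
    external file format assumed to exist already (CREATE EXTERNAL FILE FORMAT).
    All names are assumed to come from trusted code, not from user input.
    """
    column_defs = ",\n    ".join(f"[{name}] {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE [{table}] (\n"
        f"    {column_defs}\n"
        f")\n"
        f"WITH (\n"
        f"    DATA_SOURCE = SqlStoragePool,\n"
        f"    LOCATION = '{hdfs_dir}',\n"
        f"    FILE_FORMAT = {file_format}\n"
        f");"
    )


clicks_ddl = external_table_sql(
    table="web_clickstreams_hdfs",
    columns=[("wcs_user_sk", "BIGINT"), ("wcs_item_sk", "BIGINT")],
    hdfs_dir="/clickstream_data",
    file_format="csv_file",
)

print(CREATE_DATA_SOURCE)
print(clicks_ddl)
```

Once such an external table exists, it can be joined with ordinary relational tables in a regular SELECT, which is what combining high-value relational data with high-volume big data looks like in practice.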


Big Data Clusters are a great way to use good old SQL Server for big relational data and scale it without worrying about processing speed. It’s not limited to just that. With Big Data Clusters, you can create a complete AI platform to build intelligent apps for your organization.
