Enhanced PolyBase SQL 2019 Installation and basic overview

Enhanced PolyBase SQL 2019 Installation and basic overview

Enhanced PolyBase SQL 2019 – Installation and basic overview

SQLShack

SQL Server training Español

Enhanced PolyBase SQL 2019 – Installation and basic overview

October 24, 2018 by Rajendra Gupta SQL Server 2019 is recently launched in the ignite 2018 event by Microsoft. We can get an overview of SQL 2019 preview version and learn how to install it on Windows environment by following up the article SQL Server 2019 overview and installation. We will explore SQL 2019 Enhanced PolyBase feature in a series of article. In this first part of the article, we will explore below topics Overview of ETL and PolyBase Install PolyBase into SQL 2019 Overview and Installation of Azure Data Studio SQL Server 2019 preview extension in Azure Data Studio

Overview of ETL and PolyBase

In today’s industry requirement, we have data in various databases such as Oracle, MongoDB, Teradata, PostgreSQL, etc. The application requires accessing data from these various data sources and combining data into a single source. It is a challenging task for the database developers and data scientists. We normally use ETL (Extract-Transform-Load) to move the data around the different sources. Below are the steps involved in ETL processes Read data from the data source of your choice and extract the specific data Transform process works on this data based on the logic, rules, and convert data Load process writes the data to the destination database ETL provides great values to apply business logic to the data transform data from various sources and move the data into a single destination or multiple formats. ETL process is having some challenges as below: We need to move data from the source that will require extra resources in terms of disk space Data security is also another aspect. Copy of the data should be should be secured from unauthorized access An ETL process is slow to process and requires efforts to maintain due to its complex logic In SQL Server 2016, we came across new feature ‘PolyBase’ that allows querying relational and non-relational databases. This data virtualization allows integrating data from the multiple sources without moving the data. This actually creates a virtual data layer called as data lake or data hub. We can access all data from the single sources that allows controlling security as well from a single point. We can query Hadoop and Azure Blob Storage using PolyBase in SQL Server 2016. In the article, SQL Server 2016 – PolyBase tutorial, we explored query a CSV file stored in Azure Blob storage from SQL Server 2016 using PolyBase. SQL 2019 provides enhancement to PolyBase to access data from various data sources such as Oracle, Teradata, MongoDB, and PostgreSQL. We can also access data from any data sources with an ODBC driver. We can create external tables that link to these data sources (SQL Server, Oracle, Teradata, MongoDB, or any data source with an ODBC). Users can access these data from external tables similar to a relational database table. These external tables are linked to the data sources and when we execute any query, data from an external table is retrieved and shown to the user. On the image below, we can see PolyBase in SQL Server 2019:

Install PolyBase into SQL Server 2019

Let us first install PolyBase into SQL 2019. In an earlier article, SQL Server 2019 installation on Windows, we installed SQL 2019 preview version. Therefore, I will not cover complete installation here. Put a checkbox against ‘PolyBase Query Service for external data’ in the feature selection page. You need to install Oracle JRE 7 update 51 or higher to install Polybase. If it is not installed, you will get below error message while checking the rules for installation. To fix this error, go to ‘Java SE Runtime Environment 8 Downloads‘ and download Java SE Runtime Environment 8u191E. Double click on the setup file to install it. In the next page, we need to do the PolyBase Configuration. If we are installing PolyBase on a standalone instance, select the option ‘ Use this SQL Server as standalone PolyBase enabled instance’ We can also set up PolyBase a scale-out configuration in which we define the head node and compute nodes. This allows getting performance improvement for the large data sets. You can get more information about this option from PolyBase scale-out groups as shown in below image obtained from this page. In this article, we will use PolyBase on standalone SQL Server instance. Therefore, select the first option ‘Use this server as a standalone PolyBase enabled instance’ and click Next. In the next page, we can specify the service accounts for below two PolyBase services. Service account should be the same for both the services. SQL Server PolyBase engine SQL Server PolyBase data movement Review the configuration and click on Install. Below is the confirmation page after ‘PolyBase Query Service for External data’ service installation is successful. Check the services in the configuration manager. It should be in running state.

Overview and Installation of Azure Data Studio

In the previous articles, Azure Data Studio, we learned that Azure Data Studio is a new GUI based tool that works on Windows, Mac OS and Linux operating systems. It connects to SQL Server, Azure database, and SQL Data Warehouse. Azure Data studio is now a new name for SQL Operation Studio. Azure Data studio provides support for the SQL Server 2019 new features in the October release such as support to big data clusters, enhanced PolyBase, Azure notebook, Azure resource explorer. We can install Azure Data Studio on Windows, Linux, and MacOS. In this article, we will install on the windows environment. Follow the below steps: Download the latest October release of Azure Data Studio from the link Once the setup download is complete, double-click to launch the setup wizard. . Accept the license agreement and click on Next. Specify the destination directory. The default location is ‘C:\Program Files\Azure Data Studio’. We need to have at least 365.2 MB of free disk space in the disk. Setup creates the start menu folder. We can select the folder in the start menu. If we do not want to create the startup menu folder, put a checkbox on ‘Don’t create a Start Menu folder’. We can also select to create a desktop icon. This also adds a PATH in the environment variable. We can also register Azure Data studio to use an editor for the supported file types. To do so, put a check here as shown below. Configuration is now completed, Click on Install to complete the installation process of Azure Data Studio. We get the below screen once the setup is complete for the Azure Data Studio. We can launch the Azure Data Studio from here itself or from the Start menu. Default screen for the Azure Data Studio is as shown below. SQL Server 2019 is in preview state so here we get the option whether we want to enable preview features. Click on yes to enable the preview features. Enter the connection details like instance name, authentication type, server group (we can select existing server group or create a new group). Azure Data Studio also allows specifying the friendly name for the connection in the recent release. As shown below, we are connected to SQL 2019 preview instance with the friendly name in Azure Data Studio. Now in order to use SQL 2019 preview version all features, we need to install ‘SQL Server 2019 (Preview)’ extension from the Marketplace. Click on the ‘SQL Server 2019 (Preview)’ extension in the Marketplace and we can get an overview of the preview extension. You can go through it to get more information about the extension. Click on Install opens up a webpage where we can download the SQL Server 2019 extension (preview) .vsix file. Now go to file -> “Install Extensions from VSIX Package” and provide the path of the downloaded .vsix file. Click Yes to install the extension. This will take some time to install this SQL Server 2019 preview extension. We get the below message after the extension is successfully installed. Click on Reload Now to install its dependencies and take this extension into effect.

Conclusion

In this article, we took an overview of SQL 2019 PolyBase enhancements, Azure Data Studio installation and its extension to support SQL Server 2019 preview features. In the next article, we will create sample database objects in the Oracle and create external tables to access these objects from the SQL 2019 PolyBase external tables.

Table of contents

Enhanced PolyBase SQL 2019 – Installation and basic overview Enhanced PolyBase SQL 2019 – External tables for Oracle DB Enhanced PolyBase SQL 2019 – External tables using t-SQL Enhanced PolyBase SQL 2019 – External tables SQL Server, Catalog view and PushDown Enhanced PolyBase SQL 2019 – MongoDB and external table Author Recent Posts Rajendra GuptaHi! I am Rajendra Gupta, Database Specialist and Architect, helping organizations implement Microsoft SQL Server, Azure, Couchbase, AWS solutions fast and efficiently, fix related issues, and Performance Tuning with over 14 years of experience.

I am the author of the book "DP-300 Administering Relational Database on Microsoft Azure". I published more than 650 technical articles on MSSQLTips, SQLShack, Quest, CodingSight, and SeveralNines.

I am the creator of one of the biggest free online collections of articles on a single topic, with his 50-part series on SQL Server Always On Availability Groups.

Based on my contribution to the SQL Server community, I have been recognized as the prestigious Best Author of the Year continuously in 2019, 2020, and 2021 (2nd Rank) at SQLShack and the MSSQLTIPS champions award in 2020.

Personal Blog: https://www.dbblogger.com
I am always interested in new challenges so if you need consulting help, reach me at [email protected]

View all posts by Rajendra Gupta Latest posts by Rajendra Gupta (see all) Copy data from AWS RDS SQL Server to Azure SQL Database - October 21, 2022 Rename on-premises SQL Server database and Azure SQL database - October 18, 2022 SQL Commands to check current Date and Time (Timestamp) in SQL Server - October 7, 2022

Related posts

Enhanced PolyBase SQL 2019 – External tables using t-SQL Enhanced PolyBase SQL 2019 – External tables for Oracle DB Enhanced PolyBase SQL 2019 – External tables SQL Server, Catalog view and PushDown Enhanced PolyBase SQL 2019 – MongoDB and external table Overview of SQL Server 2019 General Availability and installation 13,522 Views

Follow us

Popular

SQL Convert Date functions and formats SQL Variables: Basics and usage SQL PARTITION BY Clause overview Different ways to SQL delete duplicate rows from a SQL Table How to UPDATE from a SELECT statement in SQL Server SQL Server functions for converting a String to a Date SELECT INTO TEMP TABLE statement in SQL Server SQL WHILE loop with simple examples How to backup and restore MySQL databases using the mysqldump command CASE statement in SQL Overview of SQL RANK functions Understanding the SQL MERGE statement INSERT INTO SELECT statement overview and examples SQL multiple joins for beginners with examples Understanding the SQL Decimal data type DELETE CASCADE and UPDATE CASCADE in SQL Server foreign key SQL Not Equal Operator introduction and examples SQL CROSS JOIN with examples The Table Variable in SQL Server SQL Server table hints – WITH (NOLOCK) best practices

Trending

SQL Server Transaction Log Backup, Truncate and Shrink Operations Six different methods to copy tables between databases in SQL Server How to implement error handling in SQL Server Working with the SQL Server command line (sqlcmd) Methods to avoid the SQL divide by zero error Query optimization techniques in SQL Server: tips and tricks How to create and configure a linked server in SQL Server Management Studio SQL replace: How to replace ASCII special characters in SQL Server How to identify slow running queries in SQL Server SQL varchar data type deep dive How to implement array-like functionality in SQL Server All about locking in SQL Server SQL Server stored procedures for beginners Database table partitioning in SQL Server How to drop temp tables in SQL Server How to determine free space and file size for SQL Server databases Using PowerShell to split a string into an array KILL SPID command in SQL Server How to install SQL Server Express edition SQL Union overview, usage and examples

Solutions

Read a SQL Server transaction logSQL Server database auditing techniquesHow to recover SQL Server data from accidental UPDATE and DELETE operationsHow to quickly search for SQL database data and objectsSynchronize SQL Server databases in different remote sourcesRecover SQL data from a dropped table without backupsHow to restore specific table(s) from a SQL Server database backupRecover deleted SQL data from transaction logsHow to recover SQL Server data from accidental updates without backupsAutomatically compare and synchronize SQL Server dataOpen LDF file and view LDF file contentQuickly convert SQL code to language-specific client codeHow to recover a single table from a SQL Server database backupRecover data lost due to a TRUNCATE operation without backupsHow to recover SQL Server data from accidental DELETE, TRUNCATE and DROP operationsReverting your SQL Server database back to a specific point in timeHow to create SSIS package documentationMigrate a SQL Server database to a newer version of SQL ServerHow to restore a SQL Server database backup to an older version of SQL Server

Categories and tips

►Auditing and compliance (50) Auditing (40) Data classification (1) Data masking (9) Azure (295) Azure Data Studio (46) Backup and restore (108) ►Business Intelligence (482) Analysis Services (SSAS) (47) Biml (10) Data Mining (14) Data Quality Services (4) Data Tools (SSDT) (13) Data Warehouse (16) Excel (20) General (39) Integration Services (SSIS) (125) Master Data Services (6) OLAP cube (15) PowerBI (95) Reporting Services (SSRS) (67) Data science (21) ►Database design (233) Clustering (16) Common Table Expressions (CTE) (11) Concurrency (1) Constraints (8) Data types (11) FILESTREAM (22) General database design (104) Partitioning (13) Relationships and dependencies (12) Temporal tables (12) Views (16) ▼Database development (418) Comparison (4) Continuous delivery (CD) (5) Continuous integration (CI) (11) Development (146) Functions (106) Hyper-V (1) Search (10) Source Control (15) SQL unit testing (23) Stored procedures (34) String Concatenation (2) Synonyms (1) Team Explorer (2) Testing (35) Visual Studio (14) DBAtools (35) DevOps (23) DevSecOps (2) Documentation (22) ETL (76) ►Features (213) Adaptive query processing (11) Bulk insert (16) Database mail (10) DBCC (7) Experimentation Assistant (DEA) (3) High Availability (36) Query store (10) Replication (40) Transaction log (59) Transparent Data Encryption (TDE) (21) Importing, exporting (51) Installation, setup and configuration (121) Jobs (42) ►Languages and coding (686) Cursors (9) DDL (9) DML (6) JSON (17) PowerShell (77) Python (37) R (16) SQL commands (196) SQLCMD (7) String functions (21) T-SQL (275) XML (15) Lists (12) Machine learning (37) Maintenance (99) Migration (50) Miscellaneous (1) ►Performance tuning (869) Alerting (8) Always On Availability Groups (82) Buffer Pool Extension (BPE) (9) Columnstore index (9) Deadlocks (16) Execution plans (125) In-Memory OLTP (22) Indexes (79) Latches (5) Locking (10) Monitoring (100) Performance (196) Performance counters (28) Performance Testing (9) Query analysis (121) Reports (20) SSAS monitoring (3) SSIS monitoring (10) SSRS monitoring (4) Wait types (11) ►Professional development (68) Professional development (27) Project management (9) SQL interview questions (32) Recovery (33) Security (84) Server management (24) SQL Azure (271) SQL Server Management Studio (SSMS) (90) SQL Server on Linux (21) ▼SQL Server versions (177) SQL Server 2012 (6) SQL Server 2016 (63) SQL Server 2017 (49) SQL Server 2019 (57) SQL Server 2022 (2) ▼Technologies (334) AWS (45) AWS RDS (56) Azure Cosmos DB (28) Containers (12) Docker (9) Graph database (13) Kerberos (2) Kubernetes (1) Linux (44) LocalDB (2) MySQL (49) Oracle (10) PolyBase (10) PostgreSQL (36) SharePoint (4) Ubuntu (13) Uncategorized (4) Utilities (21) Helpers and best practices BI performance counters SQL code smells rules SQL Server wait types © 2022 Quest Software Inc. ALL RIGHTS RESERVED. GDPR Terms of Use Privacy
Share:
0 comments

Comments (0)

Leave a Comment

Minimum 10 characters required

* All fields are required. Comments are moderated before appearing.

No comments yet. Be the first to comment!

Enhanced PolyBase SQL 2019 Installation and basic overview | Trend Now | Trend Now