Wow, my brain hurts! However, I am excited about all the networking and learning opportunities I received a couple of weeks ago at the PASS Summit in Seattle. For those of you who don’t know, the PASS Summit is a leading three-day event attended by over 4,000 individuals who want to enhance their careers with in-depth topics on Data Architecture, Management, and Analytics. It is an amazing event every year, and this was no exception.
The big news was the release of SQL Server 2019, which was announced during Ignite. An exciting new feature of SQL Server 2019 is Accelerated Database Recovery. It is included in Standard edition and allows for faster database recovery after a restart and near-instant rollback by versioning row changes in what is called the Persisted Version Store. One of the main points during the keynote was SQL from Edge to Cloud. SQL Server can now be installed on any ARM device, like a Raspberry Pi, with a footprint of less than 300 MB. Azure Arc allows Azure services to be deployed on-premises and to multi-cloud providers, and then managed from the Azure portal.
Here is a brief overview of some of the sessions I attended:
SQL Server 2019 Big Data Clusters
What are they? They allow you to deploy scalable clusters of SQL Server, Spark, and HDFS containers running on Kubernetes. This provides data virtualization to combine data from many sources without moving or replicating it. It is not like linked servers. The cluster contains a managed SQL Server, Spark, and a data lake built on a scalable HDFS storage pool. Once data is stored, you can analyze it with an integrated AI and machine learning platform that includes Spark and the built-in AI tools in SQL Server. It is included in Standard edition and is an exciting technology I anticipate using.
Azure Storage Options for Analytics
This session discussed Data Lakes and Data Warehouses as some of the choices to store data for analytics. Data Lakes in Azure allow you to store almost everything at a lower storage cost, support massive scale for big data, and charge for storage and compute separately. Three cloud options for storing data are Azure Blob Storage, ADLS Gen 1, and ADLS Gen 2. You can import data using Azure Databricks, Azure Data Factory, AzCopy, and Azure Storage Explorer.
ETL in Azure Made Easy with Data Factory Mapping Data Flows
This session looked at using Azure Data Factory and integrating it with Azure Databricks to transform data. What is Azure Data Factory? It is a service that allows you to perform ETL/ELT processes against a variety of disparate data sources that are either on-premises or in the cloud, somewhat like SSIS. The use of Mapping and Wrangling Data Flows was discussed along with design patterns. If you are interested in moving terabytes of data by scaling up and out, Azure Data Factory is one service to look at.
Inside SQL Server on Kubernetes
This session was with Bob Ward, and if you know who he is, you are familiar with how in-depth his sessions can be. This was a deep dive and covered a lot. He explained that SQL Server containers are portable, lightweight, consistent, and efficient. He also walked through a basic deployment of a SQL Server pod in Kubernetes. One thing I took away from this session? It is a technology that is the future of SQL Server, and I need to learn more about it.
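To give a feel for what a basic deployment involves, here is a minimal sketch of my own, not the demo from the session. It builds a Kubernetes Deployment manifest for a single SQL Server 2019 container in Python and serializes it to JSON, which kubectl accepts alongside YAML. The names mssql-demo and mssql-secret are hypothetical, the secret is assumed to already exist in the cluster, and persistent storage is left out to keep it short.

```python
import json


def sql_server_deployment(name, secret_name,
                          image="mcr.microsoft.com/mssql/server:2019-latest"):
    """Build a single-replica Deployment manifest for a SQL Server container.

    `name` and `secret_name` are illustrative values chosen by the caller;
    the secret is assumed to hold the SA password under the key
    MSSQL_SA_PASSWORD.
    """
    labels = {"app": name}
    container = {
        "name": "mssql",
        "image": image,
        "ports": [{"containerPort": 1433}],  # default SQL Server port
        "env": [
            {"name": "ACCEPT_EULA", "value": "Y"},
            {
                "name": "MSSQL_SA_PASSWORD",
                # Read the password from a Kubernetes secret instead of
                # hard-coding it in the manifest.
                "valueFrom": {
                    "secretKeyRef": {"name": secret_name,
                                     "key": "MSSQL_SA_PASSWORD"}
                },
            },
        ],
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


manifest = sql_server_deployment("mssql-demo", "mssql-secret")
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

Saving the printed JSON to a file and running kubectl apply -f against it would create the pod. Without a persistent volume claim the databases disappear with the pod, so this is only suitable for experimenting.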
Best Practices for Branching Database Code in Git
At the office, we have been talking for some time about different approaches to source controlling database changes and implementing a DevOps approach to deploying them, so I was curious to see what Kendra Little had to say about the subject. She went over the state and migration approaches for storing database code in source control, along with three major branching strategies: Git Flow, GitHub Flow, and Release Flow, which the Azure DevOps team uses.
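As a rough illustration of the migration approach (my own sketch, not code from the session), the idea is to keep an ordered list of change scripts in source control and record in the database which ones have already been applied. To stay self-contained, the example uses Python's built-in sqlite3 module with an in-memory database rather than SQL Server; the schema_version table and the two migrations are hypothetical.

```python
import sqlite3

# Hypothetical, ordered migration scripts. In a real repository each one
# would usually live in its own versioned file under source control.
MIGRATIONS = [
    (1, "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE customer ADD COLUMN email TEXT"),
]


def apply_migrations(conn, migrations):
    """Apply migrations not yet recorded, in version order.

    Returns the list of versions applied by this call.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_version")}
    newly_applied = []
    for version, sql in sorted(migrations):
        if version in applied:
            continue  # already run against this database
        conn.execute(sql)
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
        newly_applied.append(version)
    conn.commit()
    return newly_applied


def column_names(conn, table):
    """Return the column names of a table, in definition order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


conn = sqlite3.connect(":memory:")
first_run = apply_migrations(conn, MIGRATIONS)   # applies both scripts
second_run = apply_migrations(conn, MIGRATIONS)  # nothing left to apply
print(first_run, second_run, column_names(conn, "customer"))
```

The state approach works the other way around: source control holds the desired end state of each object, and a comparison tool generates the change script at deployment time.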
That wraps up my brief recap of PASS Summit 2019. I had a great time networking and learning. I look forward to the next one.