Will Koalas Replace PySpark?

One of the first of many big announcements at the 2020 Spark and AI Summit was the official release of Koalas 1.0, the pandas API on top of Apache Spark. Tori Tompkin explores how Koalas differs from PySpark.

Using NUnit to Automate the Testing of Data Factory Pipelines

In this video, Paul Andrew and Richard Swinbank talk about how you can use an NUnit project in Visual Studio to automate the testing of Data Factory pipelines.

Exploring Azure Synapse

Ginger Grant discusses the biggest architectural design change and the most interesting components of Azure Synapse.

Data Quality from First Principles

If you’ve spent any amount of time in business intelligence, you know that data quality is a perennial challenge. It never really goes away. Cedric Chin talks about the right way to think about data quality.

The What, Why, When, And How of Incremental Loads

When moving data in an extraction, transformation, and loading (ETL) process, the most efficient design pattern is to touch only the data you must, copying just the data that was newly added or modified since the last load was run. This pattern of incremental loads usually presents the least amount of risk, takes less time to run, and preserves the historical accuracy of the data. In this post, Tim Mitchell shares what an incremental load is and why it is the ideal design for most ETL processes.
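As a rough illustration of the pattern, here is a minimal Python sketch that assumes each source row carries a last-modified timestamp and that the previous run saved a high-water mark. This is only one common way to detect changes and is not necessarily the approach taken in the linked post. The function name, the `id` and `modified_at` keys, and the in-memory dictionary standing in for the target table are all hypothetical.

```python
from datetime import datetime


def incremental_load(source_rows, target, last_loaded_at):
    """Copy only rows modified after the previous load's watermark.

    source_rows: iterable of dicts with hypothetical keys "id" and "modified_at"
    target: dict keyed by id, standing in for the destination table
    last_loaded_at: watermark saved by the previous run
    Returns the new watermark to store for the next run.
    """
    new_watermark = last_loaded_at
    for row in source_rows:
        if row["modified_at"] > last_loaded_at:
            # Insert new rows and overwrite changed ones; untouched rows are skipped.
            target[row["id"]] = dict(row)
            new_watermark = max(new_watermark, row["modified_at"])
    return new_watermark


source = [
    {"id": 1, "name": "alpha", "modified_at": datetime(2020, 9, 1)},
    {"id": 2, "name": "beta", "modified_at": datetime(2020, 9, 5)},
    {"id": 3, "name": "gamma", "modified_at": datetime(2020, 9, 8)},
]
target = {1: dict(source[0])}      # row 1 arrived in an earlier load
watermark = datetime(2020, 9, 1)   # high-water mark saved by that load

# Only rows 2 and 3 are newer than the watermark, so only they are copied.
watermark = incremental_load(source, target, watermark)
```

A timestamp watermark like this does not notice rows deleted from the source, so a real pipeline would likely need a separate mechanism for deletes.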

Prediction in Azure Machine Learning

After discussing the basic features of Azure Machine Learning and how to clean data in Azure Machine Learning, Dinesh Asanka looks at how to perform prediction in Azure Machine Learning. Prediction is an important aspect of machine learning because it helps inform strategic decisions.

Getting Requirements Right with KickOffs and Desk Checks

As any developer or QA engineer can tell you, building something that doesn’t match the requirements leads to rework and retesting, which hurts a team’s ability to meet its sprint commitment. Maria Rios explains how you can minimize wasted time by adopting two practices: Story Kick-Off and Story Desk-Check.

Making Buildings Smarter with Azure IoT

Dandy Weyn reflects on how smart buildings provide insights that enable real estate developers, commercial building owners, facilities managers, and tenants to save energy, reduce operational expenses, increase occupant comfort, and meet regulatory and sustainability goals.

Upcoming Virtual Events

Data Modeling with Data Lakes and Power BI - Ike Ellis

Wednesday, September 9, 18:00 UTC.

 


We are recruiting community contributors for all PASS Insights newsletter editions: DBA, BI, Developer, and Analyst.
We’re looking for engaging, in-depth, and technically focused ideas to feature each month. If you are interested, submit your ideas to insights@pass.org.
You just might see yourself featured in an upcoming edition!

Looking for more than just BI educational and technical content?
Log in to your PASS account and subscribe to the new Analyst, DBA, and Developer PASS Insights newsletters.