Work with DataFrames in Azure Databricks

In Azure Databricks, you process data by defining DataFrames that read and transform it. Learn how to apply transformations to DataFrames and run actions to display the transformed data.

Data Engineer
Databricks

Module Objectives

In this module, you will:

  • Use the count() method to count rows in a DataFrame
  • Use the display() function to display a DataFrame in the notebook
  • Cache a DataFrame to speed up later operations when the data is needed more than once
  • Use the limit() method to display a small set of rows from a larger DataFrame
  • Use select() to select a subset of columns from a DataFrame
  • Use distinct() and dropDuplicates() to remove duplicate data
  • Use drop() to remove columns from a DataFrame

Prerequisites

None