Work with DataFrames in Azure Databricks
In Azure Databricks, you process data by defining DataFrames that read and transform it. Learn how to apply transformations to DataFrames and run actions to display the transformed data.
Data Engineer
Databricks
Module Objectives
In this module, you will:
- Use the count() method to count rows in a DataFrame
- Use the display() function to display a DataFrame in the notebook
- Cache a DataFrame to speed up later operations when the same data is needed more than once
- Use the limit() method to display a small set of rows from a larger DataFrame
- Use select() to select a subset of columns from a DataFrame
- Use distinct() and dropDuplicates() to remove duplicate data
- Use drop() to remove columns from a DataFrame
Prerequisites
None