Final answer:
To create a Delta table in Databricks with PySpark, import the required PySpark SQL classes and types, create a SparkSession, define a schema, build a DataFrame from your data, and save it as a Delta table with df.write.format("delta").save().
Step-by-step explanation:
To create a Delta table in Databricks using PySpark, follow the steps outlined below:
- Start by importing the necessary PySpark SQL classes and schema types:
- from pyspark.sql import SparkSession
- from pyspark.sql.types import StructType, StructField, StringType, IntegerType
- Next, create a SparkSession:
- spark = SparkSession.builder.appName("DeltaTableExample").getOrCreate()
- Define the schema for your table:
- schema = StructType([
      StructField("name", StringType(), True),
      StructField("age", IntegerType(), True),
      StructField("city", StringType(), True)
  ])
- Create a DataFrame with some data:
- data = [("John Doe", 30, "New York"), ("Jane Smith", 25, "Los Angeles")]
- # Apply the schema to the data and create a DataFrame
- df = spark.createDataFrame(data, schema)
- To save the DataFrame as a Delta table, use df.write with .format("delta") and call .save() with the path where the table should be stored:
- df.write.format("delta").save("/delta/events")
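- To verify the write, you can read the same path back as a Delta table and display its rows (a quick check reusing the /delta/events path from the step above):
- events_df = spark.read.format("delta").load("/delta/events")
- events_df.show()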
After performing these steps, you will have successfully created a Delta table in Databricks using PySpark.
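Putting it all together, a minimal end-to-end sketch could look like the following. It assumes Delta Lake is available on your cluster (it is by default on Databricks Runtime) and reuses the example path /delta/events; adjust the path for your own workspace.

    from pyspark.sql import SparkSession
    from pyspark.sql.types import StructType, StructField, StringType, IntegerType

    # On Databricks a SparkSession named `spark` already exists; getOrCreate() reuses it
    spark = SparkSession.builder.appName("DeltaTableExample").getOrCreate()

    # Schema and sample rows
    schema = StructType([
        StructField("name", StringType(), True),
        StructField("age", IntegerType(), True),
        StructField("city", StringType(), True)
    ])
    data = [("John Doe", 30, "New York"), ("Jane Smith", 25, "Los Angeles")]

    # Build the DataFrame and write it out in Delta format
    df = spark.createDataFrame(data, schema)
    df.write.format("delta").save("/delta/events")

    # Read the Delta table back to confirm it was created
    spark.read.format("delta").load("/delta/events").show()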