Module: Organize a Fabric lakehouse using medallion architecture design
Lab/Demo: 03b - Create a medallion architecture in a Microsoft Fabric lakehouse
Task: Transform data and load to silver Delta table
Step: 09
Description of issue
The cell fails with an AnalysisException [SCHEMA_NOT_FOUND]: the `sales` schema referenced in `tableName("sales.sales_silver")` does not exist in the lakehouse.
---------------------------------------------------------------------------
AnalysisException Traceback (most recent call last)
Cell In[17], line 21
3 from pyspark.sql.types import *
4 from delta.tables import *
6 DeltaTable.createIfNotExists(spark) \
7 .tableName("sales.sales_silver") \
8 .addColumn("SalesOrderNumber", StringType()) \
9 .addColumn("SalesOrderLineNumber", IntegerType()) \
10 .addColumn("OrderDate", DateType()) \
11 .addColumn("CustomerName", StringType()) \
12 .addColumn("Email", StringType()) \
13 .addColumn("Item", StringType()) \
14 .addColumn("Quantity", IntegerType()) \
15 .addColumn("UnitPrice", FloatType()) \
16 .addColumn("Tax", FloatType()) \
17 .addColumn("FileName", StringType()) \
18 .addColumn("IsFlagged", BooleanType()) \
19 .addColumn("CreatedTS", DateType()) \
20 .addColumn("ModifiedTS", DateType()) \
---> 21 .execute()
File /usr/hdp/current/spark3-client/jars/delta-core_2.12-2.4.0.8.jar/delta/tables.py:1330, in DeltaTableBuilder.execute(self)
1321 @since(1.0) # type: ignore[arg-type]
1322 def execute(self) -> DeltaTable:
1323 """
1324 Execute Table Creation.
1325
(...)
1328 .. note:: Evolving
1329 """
-> 1330 jdt = self._jbuilder.execute()
1331 return DeltaTable(self._spark, jdt)
File ~/cluster-env/trident_env/lib/python3.10/site-packages/py4j/java_gateway.py:1322, in JavaMember.__call__(self, *args)
1316 command = proto.CALL_COMMAND_NAME +\
1317 self.command_header +\
1318 args_command +\
1319 proto.END_COMMAND_PART
1321 answer = self.gateway_client.send_command(command)
-> 1322 return_value = get_return_value(
1323 answer, self.gateway_client, self.target_id, self.name)
1325 for temp_arg in temp_args:
1326 if hasattr(temp_arg, "_detach"):
File /opt/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py:175, in capture_sql_exception.<locals>.deco(*a, **kw)
171 converted = convert_exception(e.java_exception)
172 if not isinstance(converted, UnknownException):
173 # Hide where the exception came from that shows a non-Pythonic
174 # JVM exception message.
--> 175 raise converted from None
176 else:
177 raise
AnalysisException: [SCHEMA_NOT_FOUND] The schema `sales` cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a catalog, verify the current_schema() output, or qualify the name with the correct catalog.
To tolerate the error on drop use DROP SCHEMA IF EXISTS.
Repro steps:
- Follow the lab steps through this cell
- Run the cell
- The error above is generated
Workaround: remove the `sales.` schema prefix from the `tableName` argument (or create the schema first).
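The workaround can be sketched as a small helper; a minimal sketch, assuming the lakehouse was created without schema support so only the default schema exists. The helper name `strip_schema` is hypothetical, not part of the lab:

```python
def strip_schema(qualified: str) -> str:
    """Return the bare table name, e.g. 'sales.sales_silver' -> 'sales_silver'.

    In a lakehouse with no 'sales' schema, pass the result to
    DeltaTable.createIfNotExists(spark).tableName(...) instead of the
    qualified name. Alternatively, keep the qualified name and run
    spark.sql("CREATE SCHEMA IF NOT EXISTS sales") first.
    """
    return qualified.rsplit(".", 1)[-1]
```

For example, `strip_schema("sales.sales_silver")` yields `"sales_silver"`, which the lab cell then creates in the default schema.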
Task: Transform data for gold layer
Step: 03
Description of issue
The cell fails with an AnalysisException [TABLE_OR_VIEW_NOT_FOUND]: `Sales.sales_silver` cannot be resolved, likely because the silver table was created without the schema qualifier in the previous task, so no `Sales` schema exists to look it up in.
---------------------------------------------------------------------------
AnalysisException Traceback (most recent call last)
Cell In[8], line 2
1 # Load data to the dataframe as a starting point to create the gold layer
----> 2 df = spark.read.table("Sales.sales_silver")
File /opt/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py:471, in DataFrameReader.table(self, tableName)
437 def table(self, tableName: str) -> "DataFrame":
438 """Returns the specified table as a :class:`DataFrame`.
439
440 .. versionadded:: 1.4.0
(...)
469 >>> _ = spark.sql("DROP TABLE tblA")
470 """
--> 471 return self._df(self._jreader.table(tableName))
File ~/cluster-env/trident_env/lib/python3.10/site-packages/py4j/java_gateway.py:1322, in JavaMember.__call__(self, *args)
1316 command = proto.CALL_COMMAND_NAME +\
1317 self.command_header +\
1318 args_command +\
1319 proto.END_COMMAND_PART
1321 answer = self.gateway_client.send_command(command)
-> 1322 return_value = get_return_value(
1323 answer, self.gateway_client, self.target_id, self.name)
1325 for temp_arg in temp_args:
1326 if hasattr(temp_arg, "_detach"):
File /opt/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py:175, in capture_sql_exception.<locals>.deco(*a, **kw)
171 converted = convert_exception(e.java_exception)
172 if not isinstance(converted, UnknownException):
173 # Hide where the exception came from that shows a non-Pythonic
174 # JVM exception message.
--> 175 raise converted from None
176 else:
177 raise
AnalysisException: [TABLE_OR_VIEW_NOT_FOUND] The table or view `Sales`.`sales_silver` cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.
To tolerate the error on drop use DROP VIEW IF EXISTS or DROP TABLE IF EXISTS.;
'UnresolvedRelation [Sales, sales_silver], [], false
Repro steps:
- Follow the lab steps through this cell
- Run the cell
- The error above is generated
Workaround: remove the `Sales.` schema prefix from the table name passed to `spark.read.table`.
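A defensive sketch of the read, assuming PySpark 3.3+ where `spark.catalog.tableExists` is available; `read_silver` is a hypothetical helper, not lab code:

```python
def read_silver(spark, qualified: str = "Sales.sales_silver"):
    """Read the silver table, falling back to the unqualified name.

    tableExists returns False instead of raising, so we can probe the
    qualified name first and, if it is absent, drop the schema prefix.
    """
    name = qualified if spark.catalog.tableExists(qualified) else qualified.rsplit(".", 1)[-1]
    return spark.read.table(name)
```

In the lab notebook this is called as `df = read_silver(spark)`; when `Sales.sales_silver` does not resolve, it reads `sales_silver` from the default schema instead.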
Step: 04
Description of issue
The cell fails with the same AnalysisException [SCHEMA_NOT_FOUND] as in Step 09: the `sales` schema referenced in `tableName("sales.dimdate_gold")` does not exist in the lakehouse.
---------------------------------------------------------------------------
AnalysisException Traceback (most recent call last)
Cell In[14], line 13
2 from delta.tables import*
4 # Define the schema for the dimdate_gold table
5 DeltaTable.createIfNotExists(spark) \
6 .tableName("sales.dimdate_gold") \
7 .addColumn("OrderDate", DateType()) \
8 .addColumn("Day", IntegerType()) \
9 .addColumn("Month", IntegerType()) \
10 .addColumn("Year", IntegerType()) \
11 .addColumn("mmmyyyy", StringType()) \
12 .addColumn("yyyymm", StringType()) \
---> 13 .execute()
File /usr/hdp/current/spark3-client/jars/delta-core_2.12-2.4.0.8.jar/delta/tables.py:1330, in DeltaTableBuilder.execute(self)
1321 @since(1.0) # type: ignore[arg-type]
1322 def execute(self) -> DeltaTable:
1323 """
1324 Execute Table Creation.
1325
(...)
1328 .. note:: Evolving
1329 """
-> 1330 jdt = self._jbuilder.execute()
1331 return DeltaTable(self._spark, jdt)
File ~/cluster-env/trident_env/lib/python3.10/site-packages/py4j/java_gateway.py:1322, in JavaMember.__call__(self, *args)
1316 command = proto.CALL_COMMAND_NAME +\
1317 self.command_header +\
1318 args_command +\
1319 proto.END_COMMAND_PART
1321 answer = self.gateway_client.send_command(command)
-> 1322 return_value = get_return_value(
1323 answer, self.gateway_client, self.target_id, self.name)
1325 for temp_arg in temp_args:
1326 if hasattr(temp_arg, "_detach"):
File /opt/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py:175, in capture_sql_exception.<locals>.deco(*a, **kw)
171 converted = convert_exception(e.java_exception)
172 if not isinstance(converted, UnknownException):
173 # Hide where the exception came from that shows a non-Pythonic
174 # JVM exception message.
--> 175 raise converted from None
176 else:
177 raise
AnalysisException: [SCHEMA_NOT_FOUND] The schema `sales` cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a catalog, verify the current_schema() output, or qualify the name with the correct catalog.
To tolerate the error on drop use DROP SCHEMA IF EXISTS.
Repro steps:
- Follow the lab steps through this cell
- Run the cell
- The error above is generated
Workaround: remove the `sales.` schema prefix from the `tableName` argument (or create the schema first).
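An alternative to stripping the prefix is to create the schema up front so the lab's qualified names resolve as written. This assumes the lakehouse supports schemas (in Fabric, schema support currently has to be enabled when the lakehouse is created); `ensure_schema` is a hypothetical helper, not lab code:

```python
def ensure_schema(spark, schema: str = "sales") -> None:
    """Create the schema if it is missing so qualified names like
    'sales.dimdate_gold' resolve; a no-op when it already exists."""
    spark.sql(f"CREATE SCHEMA IF NOT EXISTS {schema}")
```

Running `ensure_schema(spark)` once before the `DeltaTable.createIfNotExists` cells would let both `sales.sales_silver` and `sales.dimdate_gold` be created as the lab intends.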