Comments (2)
🤖
Hello @spbansal,
I'm Dosu, a bot here to assist you while a human maintainer is not available. I'm designed to help resolve issues, answer questions, and guide you on how to contribute to our projects. I'm here to help, so let's work together to find a solution to your issue.
Thank you for reporting this issue. It seems like the problem is due to the '\n' and single quote characters not being properly escaped when the pandas dataframe is converted to a JSON string. This can be fixed by modifying the _pandas_df_to_json method in the SendPandasDfToSparkCommand class to properly escape these characters.
Here's a patch that should fix the issue:
from pandas.io.json import dumps

class SendPandasDfToSparkCommand(Command):
    # ...
    def _pandas_df_to_json(self, records):
        return dumps(records, ensure_ascii=False).replace('\n', '\\n').replace("'", "\\'")
This patch modifies the _pandas_df_to_json method to replace each newline character with a literal backslash followed by n, and each single quote with a backslash followed by a single quote. The dumps function from pandas.io.json is used to convert the records to a JSON string.
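One way to check what the two replace calls actually do is to run them on a small record. The sketch below uses the standard library's json.dumps as a stand-in for pandas.io.json.dumps (an assumption; the two are not guaranteed to produce identical output), and escape_like_patch is a hypothetical helper name, not sparkmagic code.

```python
import json


def escape_like_patch(records):
    """Hypothetical stand-in for the patched method, using stdlib json."""
    text = json.dumps(records, ensure_ascii=False)
    return text.replace("\n", "\\n").replace("'", "\\'")


records = [{"note": "it's line one\nline two", "say": 'a "quoted" word'}]
plain = json.dumps(records, ensure_ascii=False)
patched = escape_like_patch(records)

# The serializer already writes a newline inside a string as backslash + n,
# so there is no raw newline left for the first replace() to act on.
assert "\n" not in plain

# Only single quotes change; double quotes and backslashes are left exactly
# as the serializer wrote them.
assert patched == plain.replace("'", "\\'")
```

Whether this is enough depends on how the string is later embedded in the generated code, which this sketch does not model.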
Please apply this patch to your local copy of sparkmagic and test it with your use case. If it works as expected, you may want to submit it as a pull request to the sparkmagic repository so that it can be included in future versions of the software.
As for your question about how the '%%send-to-spark' command is implemented in the sparkmagic codebase, it is implemented in the SendToSparkCommand class in the sendtosparkcommand.py file. This class extends the Command class and overrides the execute method. The execute method takes a session as an argument and tries to execute a command that is returned by the to_command method. The to_command method takes the session kind, input variable name, input variable value, and output variable name as arguments. Depending on the session kind, it returns a PySpark, Scala, or R command. These commands are created by the _pyspark_command, _scala_command, and _r_command methods respectively. These methods are marked as abstract and need to be implemented in a subclass. They raise a NotImplementedError if they are not overridden.
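The dispatch described above can be pictured roughly as follows. This is a minimal standalone sketch, not the sparkmagic source: the class names SendToSparkSketch and SendStringSketch are hypothetical, the kind strings "pyspark", "spark" and "sparkr" are assumptions, and the execute method and session handling are left out.

```python
class SendToSparkSketch:
    """Hypothetical base class that picks a command builder by session kind."""

    def to_command(self, kind, input_variable_name, input_variable_value,
                   output_variable_name):
        # Map each assumed session kind to the method that builds its code.
        builders = {
            "pyspark": self._pyspark_command,
            "spark": self._scala_command,
            "sparkr": self._r_command,
        }
        if kind not in builders:
            raise ValueError(f"Unsupported session kind: {kind!r}")
        return builders[kind](input_variable_name, input_variable_value,
                              output_variable_name)

    # Subclasses override the builders they support.
    def _pyspark_command(self, name, value, output_name):
        raise NotImplementedError

    def _scala_command(self, name, value, output_name):
        raise NotImplementedError

    def _r_command(self, name, value, output_name):
        raise NotImplementedError


class SendStringSketch(SendToSparkSketch):
    """Hypothetical subclass that only supports PySpark sessions."""

    def _pyspark_command(self, name, value, output_name):
        # repr() yields a valid Python string literal, quotes included.
        return f"{output_name} = {value!r}"


command = SendStringSketch().to_command("pyspark", "local_text", "it's", "remote_text")
```

Calling to_command with "spark" or "sparkr" on SendStringSketch raises NotImplementedError, because that subclass leaves the Scala and R builders untouched.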
I hope this helps! If you have any other questions, feel free to ask.
Best,
Dosu
Sources
- sparkmagic/sparkmagic/tests/test_sendpandasdftosparkcommand.py
- sparkmagic/sparkmagic/livyclientlib/sendtosparkcommand.py
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Based on the error message, I think that a cell containing double quotes would also be a problem.
I prefer the human suggestion to encode everything in base64 and decode on the other end. It would be safer.
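For illustration, the base64 idea could look something like the sketch below. It assumes the data is sent as Python source text that the remote side executes; build_remote_code and the variable names are hypothetical, and the local exec call only simulates the remote interpreter.

```python
import base64
import json


def build_remote_code(records, output_name="df_records"):
    """Return Python source that rebuilds `records` on the other end.

    Hypothetical helper, not part of sparkmagic.
    """
    payload = json.dumps(records, ensure_ascii=False).encode("utf-8")
    # Standard base64 output contains only A-Z, a-z, 0-9, '+', '/' and '=',
    # so it can sit inside a quoted string literal without any escaping.
    encoded = base64.b64encode(payload).decode("ascii")
    return (
        "import base64, json\n"
        f"{output_name} = json.loads(base64.b64decode('{encoded}').decode('utf-8'))\n"
    )


records = [{"text": "line one\nline two", "quote": "it's \"quoted\" \\ here"}]
code = build_remote_code(records)

# Simulate the remote interpreter by executing the generated source locally.
namespace = {}
exec(code, namespace)
assert namespace["df_records"] == records
```

Newlines, single quotes, double quotes and backslashes all survive the round trip because none of them ever appears in the generated string literal.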