Comments (15)
I think @Hamaze was specifically asking why the API was changed that way. Maybe you can point to the API review?
I know that I'd rather write:
df["Int3"] = df["Int1"] * 2 + df["Int2"];
As opposed to:
df.Columns["Int3"] = df.Columns["Int1"] * 2 + df.Columns["Int2"];
from corefxlab.
There's enough support here to consider bringing back the column name indexer on DataFrame.
I agree. Personally I like the ease of use of df["Int1"] as well, so I'm glad I'm not alone.
It should be pretty easy to add the API back as a wrapper over the .Columns[string] indexer, and a test or two. Anyone want to make a PR for that?
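For anyone picking this up, here is a minimal sketch of the wrapper idea. It uses hypothetical stand-in types (`Frame` and `ColumnCollection`, with plain `double[]` columns) rather than the real DataFrame classes, so treat it as an illustration of the delegation pattern and not as the actual change:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the column collection; NOT the real
// Microsoft.Data.Analysis type.
public sealed class ColumnCollection
{
    private readonly Dictionary<string, double[]> _columns =
        new Dictionary<string, double[]>();

    // String indexer: look a column up by name, or add/replace it on assignment.
    public double[] this[string name]
    {
        get => _columns[name];
        set => _columns[name] = value;
    }
}

// Hypothetical stand-in for the frame type.
public sealed class Frame
{
    public ColumnCollection Columns { get; } = new ColumnCollection();

    // The wrapper: forwards straight to Columns[name], so both spellings
    // read and write the same underlying column.
    public double[] this[string name]
    {
        get => Columns[name];
        set => Columns[name] = value;
    }
}

public static class Demo
{
    public static void Main()
    {
        var df = new Frame();
        df.Columns["Int1"] = new double[] { 1, 2, 3 }; // long form
        df["Int2"] = new double[] { 10, 20, 30 };       // short form

        // Both forms resolve to the same stored array.
        Console.WriteLine(ReferenceEquals(df["Int1"], df.Columns["Int1"])); // True
    }
}
```

Since the short form only forwards to the collection, a test or two comparing the two spellings should be enough to cover it.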
Hello! Which API are you talking about? I still see this:
I referred above to the change I mean.
Ah I see. It still exists! We just moved that to the DataFrameColumnCollection class. See this for an example:
Yeah, the former is more pandas-esque and comfortable IMO.
Agreed.
I suspect the reason is for the row filter to work.
But you are more often interested in column selection than row selection, so it's better for row selection to carry the extra verbosity instead:
df.Rows[2 .. 12]
df["Int3"] = df["Int1"] * 2 + df["Int2"];
Just out of curiosity, are you using DataFrame in a notebook? Reason I ask is that we've worked on a cool extension for DataFrame in notebooks that'll let you write df.Int1 * 2 + df.Int2. To be specific, with the new extension you can now refer to a column as a field of a DataFrame object. With intellisense enabled in notebooks, this will be very discoverable.
I'm not using it in a notebook context for this exercise. I'm using it in an ASP.NET app.
Haven't tested the new notebook support yet but will do.
I would say that even though notebooks are very useful, I much prefer the experience to be the same when writing normal software and when working in notebooks.
Usually I prototype in notebooks and then structure the code and copy it into something more production-like. So I would avoid using any notebook extensions except interactive ones such as plotting.
I concur with @MikaelUmaN.
For instance, I would expect code that runs in an F# kernel notebook to run directly in FSI under Visual Studio. I would also expect to be able to compile it as part of a bigger production system that mixes C# and F#. That's how I explore my problems in an IfSharp notebook and put them in production all the time.
However, if syntax involving dataframe relies on an extra notebook extension that only works in notebook, the beauty of production-ready scripts is no longer feasible.
cc @cartermp @dsyme to chime in.
Just tagging @eerhardt for visibility here. This is great feedback! We're busy helping out with .NET 5 stuff this week, but I'll revisit this next week. There's enough support here to consider bringing back the column name indexer on DataFrame.
Just out of curiosity, are you using DataFrame in a notebook? Reason I ask is that we've worked on a cool extension for DataFrame in notebooks that'll let you write df.Int1 * 2 + df.Int2. To be specific, with the new extension you can now refer to a column as a field of a DataFrame object. With intellisense enabled in notebooks, this will be very discoverable.
However, if syntax involving dataframe relies on an extra notebook extension that only works in notebook, the beauty of production-ready scripts is no longer feasible.
Yes, we need to be very careful about promoting non-standard extensions to the programming model for C# or F# which are deployed only through select channels. Notebook programming should ideally not be using variations of these programming languages, though these things are subtle.
This is a tricky area because there is a notable tendency to use the incremental dynamicity of notebook programming
@pgovind What APIs are you using to craft this language variation? Please discuss this with @MadsTorgersen, @jaredpar and myself. We can't have random variations on C# and F# floating around that fragment the overall programming experience.
So, just to be clear, the extension I'm talking about here is only a prototype to explore the dotnet-interactive extension APIs. There are no immediate plans to productize it right now, and we definitely don't want to create fragmentation. It lives here: https://github.com/dotnet/interactive/blob/main/src/Microsoft.DotNet.Interactive.ExtensionLab/DataFrameTypeGeneratorExtension.cs
What APIs are you using to craft this language variation?
It's not really a variation. It's a prototype right now (and not part of the type itself). Basically, given a DataFrame object, it looks at the types of the columns and spits out code to create a new SomeNameDataFrame type with the column names as properties. This code is then compiled on demand and the dotnet-interactive shell then exposes this type for use in the notebook.
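To make the "spits out code" step concrete, here is a rough sketch of what such a generator could look like. This is not the code from the linked extension: `TypedFrameSketch` and `Generate` are hypothetical names, the column list is passed in by hand instead of being read from a DataFrame, and the sketch assumes every column name is already a valid C# identifier. The on-demand compile step is left out entirely.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; not part of any shipped library.
public static class TypedFrameSketch
{
    // Emits C# source text for a wrapper class that exposes one property
    // per column. Each pair maps a column name to the C# type name of its
    // elements (for example "Int1" -> "int").
    public static string Generate(
        string typeName,
        IEnumerable<KeyValuePair<string, string>> columns)
    {
        var sb = new StringBuilder();
        sb.AppendLine($"public class {typeName}");
        sb.AppendLine("{");
        sb.AppendLine("    private readonly DataFrame _df;");
        sb.AppendLine($"    public {typeName}(DataFrame df) {{ _df = df; }}");
        foreach (var column in columns)
        {
            // Assumption: column.Key is a valid identifier and the column
            // really holds elements of type column.Value.
            sb.AppendLine(
                $"    public PrimitiveDataFrameColumn<{column.Value}> {column.Key} => " +
                $"(PrimitiveDataFrameColumn<{column.Value}>)_df.Columns[\"{column.Key}\"];");
        }
        sb.AppendLine("}");
        return sb.ToString();
    }

    public static void Main()
    {
        var source = Generate("SomeNameDataFrame", new[]
        {
            new KeyValuePair<string, string>("Int1", "int"),
            new KeyValuePair<string, string>("Int2", "int"),
        });
        Console.WriteLine(source); // the generated class text
    }
}
```

The emitted text only becomes usable once something compiles it against the live session, which is the part that needs a compiler available at run time.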
@pgovind The problem is that this sort of "generating API from dynamic data" is a completely new thing in the .NET universe (the closest thing is F# type providers, and then source generators, though those are normally part of the static toolchain).
It doesn't really fit any part of the existing C#/F#/.NET programming model and can never really be incorporated into project-based programming, for example. It can only be done in notebook-like environments that assume a complete compiler toolchain at each stage of execution, even in production scripts.
It's a powerful thing to be sure but we have to be aware of the direction this is going. I understand why you're thinking of doing this but yes, fragmentation of the programming experience is an intrinsic part of this direction, as tempting as it is.
An approach that does fit within existing norms is to drive the code generation off some kind of static schema (declared or acquired).