> Integrating the Jupyter notebook cells inside the Excel grid was a mistake. 100% agreed. this seems like the typical Microsoft behavior where the team A was pushing their product and won over team B, even though it's not really what users needed
> Python in Excel isn’t suitable for Python beginners nor for interactive data analysis. exactly. it is my humble opinion that Microsoft doesn't really understand how Excel is used IRL
also this bit is gold:
> What I find interesting though is the multiline editing experience of PY cells. Why not give the native Excel formula language an upgrade, so I don’t have to write LET expressions such as the following one (that I still find very hard to read):
=LET(x, 1, y, 2, x + y)
> Instead, allow me to write it like so? let x = 1
let y = 2
x + y
and we haven't even talked about =LAMBDA()!> Also, why not turn Excel tables into a native Excel DataFrame? Give them attributes instead of sticking to the functional approach and you’d be looking at something like this:
> =MyTable[#All].GROUPBY(...)
> Lot’s of possibilities to integrate the pandas functionality in a way that feels more native to Excel!
Jackpot. I'm literally building this formula language in a new spreadsheet app (it's early days but I'm eager to share it on HN when the MVP is ready!) and am writing a paper on dataframes vs. spreadsheets, so reading this has warmed my heart on this rainy afternoon
also as a former Django lover (I still love it, i just don't use it as much), `=MyTable[#All].GROUPBY(...)` reminds me of its ORM...
"hmm emoji" indeed.....
=LET(
x,1, // assign 1 to y
y,2, // assign 2 to y
x+y // add x and y
)Same with LAMBDAs
[0] https://www.ablebits.com/office-addins-blog/formula-bar-exce...
I'd love to read your paper when it's ready.
> We wanted an alternative to VBA, but got an alternative to the Excel formula language
Optimistically, I would guess that the powers that be inside MSFT wanted to show they "integrated" Python into Excel but didn't really want to mess too much with a product that has stabilized over the past 30 years. Cynically, I would say that they've messed up the implementation on purpose so they can put a bullet through the "Python in Excel" idea without actually doing anything useful.
A much better way to add Python (or any modern scripting language really) to Excel is to 1. Make a special library that can be called from the language and which only works with Excel. 2. Bundle a minimal interpreter so that the scripts can be run locally without pushing everything to some godforsaken Azure datacenter. Make this a downloadable plugin for all currently supported versions to ensure backwards-compatibility.
This is a much better way to ensure that Python becomes the glue language for Excel. There are so many applications where reporting, analysis and visualization can be better achieved through automation using scripting languages. VBA is old, limited and is yet another thing for someone to learn if they want to script any Office product. All of these are hurdles that can be easily removed. I guess though that the part that really sucks is that these are all things that the MSFT of the 80s/early 90s would have done in a heart beat if it meant more people would use Excel (they built in bugs from Lotus 1-2-3 for God's sake) but I guess it's a different story today when everybody does use Excel.
This describes the feeling of all major features added to the native app incumbents that built their foundations in the 90s: Office, Creative Cloud, etc
You really get the feeling the core dev teams of these apps have a poor understanding of their own foundations and even adding a button is an ordeal for them.
It probably is, though..
There are plenty of those libraries without the "only works with Excel" part, if you mean to have Excel running on the machine. As a Python example, a customer of mine is using XlsxWriter and openpyxl for .xlsx files, xlrd for .xls plus python-docx for dealing with docx files.
I don't remember why the two modules for xlsx files and not just one. My customer runs that software on Linux inside a Django app. I expect that Microsoft is interested only in Excel on Windows and in running Python inside Excel. What I expected was a VBA editor for Python and maybe a library for Windows to access Excel objects from a Python script in a cmd or powershell prompt.
This is exactly it. The killer feature for including any modern scripting language in Excel. Both XlxsWriter and openpyxl can r/w from Excel files but I have to manipulate the data using another library like `pandas`. Instead if MSFT gave a library which I can import into a Python script and use like
import msft_excel_lib as xl
data = xl.get('A1:A3')
sum = xl.sum(data)
xl.write("B3", sum)
would be much better than whatever it is they have shipped today without having to make much changes to anything else. I wouldn't even grudge them if they say that this library can run under some weird virtual environment found only within Excel to maintain product retention.Last time I checked OpenPyXl doesn't deal correctly with .xlsm files - there's a parameter for that but I believe it's still experimental. In my case this meant that, on a Mac, Excel would complain that a file generated with OpenPyXl was corrupt and then successfully "recover" every generated file.
My wild guess is: your customer reads the files with OpenPyXl, processes the data with Pandas, and then uses XlsxWriter as the custom Excel writing engine.
Sounds exactly right. Embrace/extinguish. It's The Way.
[1] https://notes.iunknown.com/python-in-excel/Book+of+Python+in...
Incidentally I've worked on many products in the past, and I've never seen anything that approaches the level of product-market-fit that this feature has.
Also, this is the work of many people at the company. To them go the real credit of shipping and getting it out the door to customers.
Thanks for the post. Very insightful!
I'm looking forward to trying out your pywasm module, or should it be ESP (Excel Subsystem for Python)?
The spreadsheet paradigm is immensely intuitive and arguably the only alternative to the standard procedural programming currently in use in number crumching.
But therein lies also a major weakness when used for important tasks: hard to validate.
Once you further combine it with API calls and whatnot, the situation gets totally out of hand: how do you reproduce anything?
The landscape around user interfaces, computational capability and (most importantly) the ever deeper embedding of such tools in decision making suggests to start taking the humble spreadsheet seriously and maybe that requires going back to the drawing board.
We’re working on a more approachable implementation of that paradigm at https://causal.app
Oh, no. This is MS Excel 4.0 Macro sheets (Ctrl+F11) all over again, except somehow with a worse execution order.
> It prevents you from referencing a cell with a Python object directly, as you’re always running the risk that someday, someone switches that cell to Values mode, which would break any formula that references this cell in object mode.
That's not a major concern. Excel already has a dozen footguns painted exactly this colour.
> So which output is my df1, and which one is my df2? I have no idea unless I look up the code that is sitting in cells L2 and K2
Excel still lets you name cells, does it not?
> In fact, I agree that the cloud is the best way to bring Python to Excel.
Heretic!
It does, but that too is a half assed feature so users struggle with it IRL
Half the features of Excel cannot deal with named cells. Try for example to use named cells in Conditional Formatting. Doesn't work at all, ranges cannot be named cells or tables, and condition formulas can only reference named cells with extremely ugly contortions.
There are lots and lots of additional examples of similar problems.
Excel is a loose agglomeration of unrelated features hidden under a pretty GUI.
- make a python library/driver for Excel sheets that imitates Pandas Dataframe API but (a) has reactive cells (b) imports equations from Excel
- make a notebook IDE that integrates excel spreadsheets. Make everything reactive. Have a split notebook/spreadsheet view.
A little off-topic but it would be rad if Excel had more Airtable/Baserow/Grist features.
I rarely want to access Python from my Excel. I quite often want to access Excel from my Python.
A spreadsheet is a great GUI for a lot of things. A lot of people are employed creating "shitty version of Excel but can be driven from any of <web, desktop, application, websockets, WASM, etc.>"
Being able to easily drive Excel from an external Python program would make for a nice cross-platform application substrate.
I agree that Excel as a GUI can be phenomenal.
Being able to steadily enhance a spreadsheet into a real application without rewriting from scratch could be a real game changer.
I’ve used it at work some years ago, and it’s a great product.
:)
i am baffled. power query / M is taking over the corporate world and its extremely, extremely useful. its one of the most useful things that nobody in open source world has tried to copy (that im aware of).
I attempted it once and stop as I found error checking impossible, source control impossible and ability to see code changes impossible - compared to a simple python script that is just text and can be source controlled.
Interested in a different opinion as maybe I missed something.
Or more broadly, the inner-platform effect. I'm left wondering "why would you use python within the lousy coding interface of excel, when Pandas already exists?"
Students graduating today are comfortable in Jupyter notebooks but not a CLI.
Also on HN front-page is a full implementation of llm.c in Python-superset Mojo.
Once Python takes off in the browser, either by compiling to WASM (via something like Mojo) or interpreted by PyScript, it's over.
Total GvR victory.
I would not either, given the attitude of Python core towards security and correctness.
F2 should enable the user to trace calculations back to the original data, not only the current calculation step.
In basis spreadsheet, this would yield a jumble of steps that are undocumented and hard to parse. But that's where the python, lambdas and custome functions come in: They enable legibility of turboF2.
I have used excel in the past, and I am a long term python user. But if you asked me today what I really wanted to make my life easier and ultimately a product or business better using only excel? I would ask for lua or scheme. I don’t need a batteries included environment embedded into a spreadsheet. I just want sane syntax for common functionality which does not require arcane knowledge and long forgotten wisdom.
I don't get your "shouldn't need batteries-included environment" objection; MSFT is bundling Anaconda distribution libraries with Excel. I'd expect it works seamlessly online and offline, as far as everything supported by Python stdlibs. (Can you actually point to any real problem with the batteries?) Really the only part I see you can quibble is things that are currently only implemented in uncommon third-party libraries, i.e. not stdlibs/numpy/scipy/scikit-learn/pandas/polars and the main plotting, data-science, ML, DB and web libraries.
> I just want sane syntax for common functionality which does not require arcane knowledge and long forgotten wisdom.
Show us some Python syntax for common functionality in Excel which does require arcane knowledge and long forgotten wisdom. Otherwise, this is purely your conjecture.
(If anything, bundling Python with Excel will stimulate healthy discussion towards which Python stdlibs need to be added/enhanced/changed, and which third-party libraries should be upgraded to stdlibs.)
I personally don’t think python is the problem here, but if their users can learn python they can certainly learn lua.
We can't all be special snowflakes, python and excel are lingua francas.
I get why they chose Python for this and it's not all that hard to embed, well the interpreter anyway, compiled modules are another story.
Scheme? Did somebody say... Scheme?
https://apexdatasolutions.com/home2/acce%CE%BBerate/
It's a paid addon, but still...
Excel's changes will end up eating their reason to exist.
And anyway, Microsoft did this in reaction to increasing demand. The pie will grow faster than Microsoft's pie slice will take from this guy.
1. Sandboxing and dependencies. Python is extremely unsafe to share, so you need to sandbox execution. There's also the environment/package management problem (does the user you're sharing your workbook with have the same version of pandas as you?). We run workbooks in the cloud to solve both of these.
2. The type system. You need a way to natively interop between Excel's type system and Python's much richer type system. The problem with Excel is there are only two types - numbers and strings. Even dates are just numbers in Excel. Python has rich types like pandas Dataframes, lists, and dictionaries, which Excel can't represent natively. We solved this in a similar way to how Typescript evolved Javascript. We support the Excel formula language and all of its types and also added support for lists, dictionaries, structs, and dataframes.
3. Performance. Our goal was to build a spreadsheet 1000x faster than Excel. Early on we used Python as our formula language but were constantly fighting the GIL and slow interpreter performance. Instead we implemented the spreadsheet engine in Rust as a columnar engine and seamlessly marshal Python types to the spreadsheet type system and back.
It's the hardest systems problem our team's ever worked on. Previously we wrote the S3 file system, so it's not like this was our first rodeo. There's just a ton of details you need to get right to make it feel seamless.
You can try it free here: https://rowzero.io/new?feature=code
I wanted a spreadsheet interface, which my business partners need, but with a way for power users (me) to do more complicated stuff in Python instead of VBA.
To borrow your phrasing, our thesis is that it has to be Excel-compatible spreadsheet + something, not necessarily Excel + something. It's early days for us, but we've seen a couple publicly traded companies switch off Excel to Row Zero to eliminate the security risks that come with Excel's desktop model.
I would say Typescript is a more obvious choice, or potentially Dart. Maybe even something more obscure like Nim (though I have no experience of that).
I get that you want compatibility with Pandas, Numpy, etc. but you're going to pay for that with endless pain.
do you have a desktop app in the works?
Show HN: I've built a C# IDE, Runtime, and AppStore inside Excel
The pro version has quite a price tag, so I’m assuming they have a big value add.
They seem to be pouring money into LAMBDA.
Yes, we do need some kind of tabular data UIs but SMALL UIs elements, not the base of the UI. Beside that the obscene "wrapping" of Python to make it "user-safe" makes also it next to useless.
A less ugly approach to tabular UIs is the one from R-Studio, while it's limited, another is org-mode tables while again they are limited in UI terms. Both are a bit better than Jupyter REPL model witch is itself far better than a spreadsheet.
But why?
The most basic error is WYSIWYG, it can work to a certain extent, like CAD systems, where you can "free draw" something but any line have defined parameters, we have snap concept to state a line is connected to another, defined properties like length and so on so in the end we have "a model", a free draw witch is also a set of data constrained/in relation with a set of functions but in general it's a nightmare. The second basic error is try to compensate the lack of integration, a system designed for commercial purpose in witch any software is a standalone closed product, by adding features that in theory can only grow to create "the complete thing" witch is an impossible goal and makes in practice just bloatware. The third basic error is consider users a bunch of imbeciles who can only learn a thing "click around on some pictogram and enter text like a chimp plunging fingers slow and hard". If you craft something simple and explain the principle behind people will learn and act. Oh, of course you need a bit of INITIAL training but after it they are ready, like teaching how to fish vs giving a fish to a hungry person. Oh, of course commercially hungry people are nice, they are tied to the vendor and they keep spending, being locked where they are, but locked people are not productive, like slaves do not work well compared to free employees.
That's the spreadsheet mistakes and modern IT mistake in general: the will to keep users ignorant while trying to made them productive anyway instead of teaching them, giving them powerful tools. It's not transforming anyone in an engineer, it's just the same concept of schooling for all, ensuring anyone have a minimum culture needed to be a Citizen in a society.
Also, there are a few excel handling packages for Python. I've even made graphs and everything. Because sometimes other people like to use Excel and that's fine. I'll just stick with those instead.
I suggest you do what my brother does and unanswer the question! He works for quite a large firm and is surrounded by quite a lot of data. He is a Windows user by routine and asked me for some help. I got Python + MS Visual Code installed and integrated for him and off he went.
He is an expert with the data and its "knowledge" and was willing to roll up his sleeves and get to grips with a grubby data processing facility. He decided on Python and he is my "customer" so I did the best I could too hence anaconda and VS code.
He grabs data out of SAP (he's senior enough to get IT to do his bidding) and then passes that through Python scripts and then passes that on to Excel for reporting.
What many seem to forget is that most apps are chainable. It's often referred to as the "unix philosophy" - each component should do one job and do it well. That's nice but also bollocks when abused as I have just done.
If Python is not integrated properly within Excel then do it yourself via whatever interfaces are available. Pass in and out with .csv or whatever. You just need some imagination.
For me: I use Libre Office and despite owning my company, I don't require everyone else to do so. I believe in freedom and expression of choice.
If there are some self service BI, most likel self service is only by name.
Add some variants related to security, max number of licenses allowed (hilarity ensures if IT bears the cost on behalf of the business without being able to charge back), etc
As it stands excel is a better presentation layer than almost all BI tools once you're past the modelling and analysis stages.
I've worked for other firms and then my own for the last 24 years. My job title is Managing Director but I am under no illusion that my word is final. It is really final but only when I say so and I never do.
Oh a BI thingie. Yes that will fix everything. No it wont.
See the SFDocuments.Calc service there. ScriptForge makes it more convenient to work with the API.
Using it with Python: https://help.libreoffice.org/latest/en-US/text/sbasic/shared...