Back when I worked in LA, my CTO used to joke that most places use Microsoft Outlook as a database and Excel as a BI tool.
[Source: I was friends with the guy who wrote it as well as other EToys employees. God that was a trainwreck.]
a32c/4214/585e/9cb7/a554/74133a5fc986
a32c/
  4214/
    585e/
      9cb7/
        a554/
          74133a5fc986
The advantage of this kind of structure is that you never need to manually scan a directory, since you know exactly what path you're trying to open. You still incur the OS lookup time for the inode-equivalent in the directory entry, but a deeper hierarchy keeps that faster. You can trade off time to traverse the hierarchy versus number of entries in the final directories by adjusting the length of the hash chunk you use at each level. Two characters will put vastly fewer entries at a given level (at most 256 with hex digits, versus 65,536 with four), but vastly increase your directory depth.
Basically, if you're manually scanning the hierarchy for anything but a consistency check or garbage collection, you've already lost.
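A minimal sketch of the path mapping described above, assuming the identifier is already a hex string like the one in the example. The function name `shard_path` and its `chunk` and `levels` parameters are made up for illustration; they just expose the chunk-length trade-off.

```python
from pathlib import PurePosixPath


def shard_path(hex_id: str, chunk: int = 4, levels: int = 5) -> PurePosixPath:
    """Split a hex identifier into `levels` directories of `chunk` characters.

    Whatever is left over after the directory chunks becomes the file name,
    so the full path is known up front and no directory scan is needed.
    """
    if len(hex_id) <= chunk * levels:
        raise ValueError("identifier too short for this chunk/levels setting")
    dirs = [hex_id[i * chunk:(i + 1) * chunk] for i in range(levels)]
    leaf = hex_id[levels * chunk:]
    return PurePosixPath(*dirs, leaf)


# Reproduces the layout from the example path.
print(shard_path("a32c4214585e9cb7a55474133a5fc986"))
# Smaller chunks: fewer possible entries per directory, shorter prefix per level.
print(shard_path("a32c4214585e9cb7a55474133a5fc986", chunk=2))
```

With `chunk=2` each directory can hold at most 256 hex-named children, so you need more levels to spread the same number of files as thinly.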
18:35 $ tree .git/objects/
.git/objects/
├── 02
│   └── 9581d0c8ecb87cf1771afc0b4c2f1d9f7bfa82
├── 3b
│   └── 97b950623230bd218cef6aebd983eb826b2078
(...)
├── info
└── pack
    ├── pack-b1fe2364423805afb6b1c03be0811c93b19dedc9.idx
    └── pack-b1fe2364423805afb6b1c03be0811c93b19dedc9.pack
10 directories, 10 files
Would love to talk to anyone on the EToys team or anyone who has done something similar.
I'm @akamaozu on twitter.
I had to work on a tool that shows what's wrong with an assembly line (missing parts, delays, etc.) so that management can take corrective action. Typical "BI" stuff, but in a more industrial setting.
The company went all out on new technologies. Web front-end, responsive design, "big data", distributed computing, etc. My job was to use PySpark to extract indicators from a variety of data sources. Nothing complex, but the development environment was so terrible it turned the simplest task into a challenge.
One day, the project manager (sorry, "scrum master") came in, opened an Excel sheet, imported the data sets, and in about 5 minutes showed me what I had to do. It took me several days to implement...
So basically, my manager with Excel was hundreds of times more efficient than I was with all that shiny new technology.
That experience made me respect Excel and people who know how to use it a lot more, and modern stacks a lot less.
I am fully aware that Excel is not always the right tool for the job, and that modern stacks have a place. For example, Excel does not scale, but there are cases where you don't need scalability. An assembly line isn't going to start processing 100x more parts anytime soon, and one that does will be very different. There are physical limits.
The devil is in the details, and software is nothing but details. The product owner at the company I work for likens it (somewhat illogically, but it works) to constructing walls. You can pick whatever stones you have lying around, and then you'll spend a lot of time trying to fit them together and you'll have a hell of a time trying to repair the wall when a section breaks. Or you can build it from perfectly rectangular bricks, and it will be easy to make it taller one layer at a time.
Using whatever rocks you have lying around is like building a prototype in Excel. Carefully crafting layers of abstraction using proper software engineering procedures means taking the time to make those rectangular bricks before building the wall. The end result is more predictable when life happens to the wall.
Unfortunately, which specific features of Excel are acceptable to remove is unknown until you have already way over-invested in the project.
The best I've seen this done is having Excel as a client for your data store, where read access is straightforward and writes can be done via CSV upload (with heavy validation and maybe history rollback).
That way the business can self-service every permutation of dashboard/report they need, and only when a very specific use case arises do you need to start putting engineering effort behind it.
I suppose you can also supplement the Excel workflow with a pared-down CRUD interface for the inevitable employee allergic to Excel.
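For what the "heavy validation" on the CSV upload path might look like, here is a rough sketch using only the Python standard library. The column names in `SCHEMA` and the `validate_csv` function are invented for illustration, not anything from a real system. The idea is all-or-nothing: only write to the data store if the error list comes back empty.

```python
import csv
import io

# Hypothetical schema: column name -> converter that raises on bad input.
SCHEMA = {"part_id": str, "quantity": int, "delay_minutes": float}


def validate_csv(text, schema=SCHEMA):
    """Return (rows, errors). Rows are only safe to write when errors is empty."""
    reader = csv.DictReader(io.StringIO(text))
    missing = set(schema) - set(reader.fieldnames or [])
    if missing:
        return [], [f"missing columns: {sorted(missing)}"]
    rows, errors = [], []
    for line_no, raw in enumerate(reader, start=2):  # line 1 is the header
        try:
            rows.append({col: conv(raw[col]) for col, conv in schema.items()})
        except (TypeError, ValueError) as exc:
            errors.append(f"line {line_no}: {exc}")
    return rows, errors


rows, errors = validate_csv("part_id,quantity,delay_minutes\nA1,3,1.5\nB2,three,0\n")
print(rows)    # the rows that parsed cleanly
print(errors)  # one entry per rejected line
```

History rollback is out of scope here; one way to get it would be to store each accepted upload as a new version rather than overwriting in place.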
Here is another option that we use instead of CSV import.
Our applications support custom reports and custom fields.
Users can define new reports and run them on demand.
They can also define custom field types with validation, data entry support, etc.
This combination provides some of the extensibility of Excel while retaining the advantages of an application.
Edited for wording changes.
You can complain about their solution or see it as an opportunity.
I posted elsewhere[0] in this thread about my employer's practice of replacing shared spreadsheets with web applications.
This approach works quite well for us and I would encourage you to consider it as an option.
Confluent, the company behind Kafka, is 100% serious about Kafka being a database. It is, however, a far better database than MongoDB.
Many of my employer's applications started out as a shared spreadsheet or Access database.
Our development team worked with the users and built a web application to solve the same problem.
This approach has a lot of advantages:
* The market exists and has an incumbent. There's a lower risk of a write-off.
* The users are open to process changes. You still have to migrate people off of the spreadsheet, though.
* It's easy to add value with reporting, error checking, concurrent access, and access control.
* You can import the existing data to make the transition easier. This will require a lot of data cleaning.
Edited to add the following text from another post.
You can cover most of the requirements with a set of fixed fields.
The last 10% to 20% of the use cases require custom reports and custom fields.
Users should be able to define their own reports and run them without your involvement.
They should also be able to define custom field types with validation, data entry support, etc.
If your web application has these two features and other advantages then you should be able to replace Excel.
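To make "custom field types with validation" a bit more concrete, here is one possible shape for it. This is only a sketch: `FieldType`, `FieldRegistry`, and the "percent" example are hypothetical names, not how our applications actually implement it.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class FieldType:
    name: str
    parse: Callable[[str], Any]                         # converts raw user input
    check: Callable[[Any], bool] = lambda value: True   # optional extra rule


class FieldRegistry:
    """Holds user-defined field types and validates input against them."""

    def __init__(self):
        self._types = {}

    def define(self, field_type):
        self._types[field_type.name] = field_type

    def validate(self, type_name, raw):
        field_type = self._types[type_name]
        value = field_type.parse(raw)  # may raise ValueError on bad input
        if not field_type.check(value):
            raise ValueError(f"{raw!r} is not a valid {type_name}")
        return value


registry = FieldRegistry()
# A user-defined "percent" field: a number between 0 and 100.
registry.define(FieldType("percent", float, lambda v: 0 <= v <= 100))
print(registry.validate("percent", "42.5"))
```

In a real application the definitions would live in the database and be editable by users, which is what gets you some of Excel's flexibility without giving up validation.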