It's definitely not a problem of resources, but of application architecture.
(Indeed, the right application is a DBMS, not a spreadsheet.)
I use something like this to fix up column names and import:
sed -i '' -e '1 s/ \/ /_/g; 1 s/[() ]/_/g' "$csv_file"   # BSD/macOS sed; with GNU sed drop the ''
sqlite3 "$db_file" <<SQL
.mode csv
.import $csv_file $table_name
SQL
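For what it's worth, here's a self-contained version of that flow you can sanity-check end to end; the file and table names below are placeholders, not anything from the thread:

```shell
# Placeholder names; substitute your own. sqlite3's .import uses the CSV's
# first row as column names when the table doesn't exist yet.
printf 'name,age\nalice,30\nbob,25\n' > demo.csv
sqlite3 demo.db <<SQL
.mode csv
.import demo.csv people
SQL
sqlite3 demo.db 'SELECT COUNT(*) FROM people;'
```

From there you can run arbitrary SQL against the table instead of fighting the spreadsheet.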
[1] https://sqlitebrowser.org/

Because developers handling gigabyte-size data, and wanting to reliably manipulate it in a GUI, can't possibly be expected to pay (or afford) the $7/month to Microsoft.
That said, the recommended solution is probably the best option for developers, not just because it's free, but for the ability to run complex SQL statements and visualize the results.
If I were to edit this article, that'd be my takeaway: use tool X for snappy visualization of SQL queries, even on multi-gigabyte CSVs.
Especially after paying $2000+ for a Mac.
It takes a lot of sed/awk skill to merge two columns or delete a column, if it's possible at all.
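On a well-behaved file (no quoted delimiters), the awk incantation looks something like this; the column positions and data are made up for illustration:

```shell
printf 'id,first,last,junk\n1,Ada,Lovelace,x\n' > demo.csv
# Merge columns 2 and 3 into one, drop column 4.
# Breaks the moment a field contains a quoted comma.
awk -F, 'BEGIN{OFS=","} {print $1, $2" "$3}' demo.csv
# prints:
# id,first last
# 1,Ada Lovelace
```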
- comma separated, nothing escaped (crashes when a field contains a comma)
- comma separated, quotes around all elements, quotes not escaped
- comma separated, double-quotes around all elements, double-quotes not escaped
- comma separated, quotes around some elements, quotes not escaped
- comma separated, double-quotes around some elements, double-quotes not escaped
- comma separated, quotes around all elements, quotes escaped (using '')
- comma separated, double-quotes around all elements, double-quotes escaped (using "")
- comma separated, quotes around some elements, quotes escaped (using '')
- comma separated, double-quotes around some elements, double-quotes escaped (using "")
- comma separated, quotes around all elements, quotes escaped (using \')
- comma separated, double-quotes around all elements, double-quotes escaped (using \")
- comma separated, quotes around some elements, quotes escaped (using \')
- comma separated, double-quotes around some elements, double-quotes escaped (using \")
And the Microsoft format (where comma == semi-colon):
- semi-colon separated, nothing escaped (crashes when a field contains a semi-colon)
- semi-colon separated, quotes around all elements, quotes not escaped
- semi-colon separated, double-quotes around all elements, double-quotes not escaped
- semi-colon separated, quotes around some elements, quotes not escaped
- semi-colon separated, double-quotes around some elements, double-quotes not escaped
- semi-colon separated, quotes around all elements, quotes escaped (using '')
- semi-colon separated, double-quotes around all elements, double-quotes escaped (using "")
- semi-colon separated, quotes around some elements, quotes escaped (using '')
- semi-colon separated, double-quotes around some elements, double-quotes escaped (using "")
- semi-colon separated, quotes around all elements, quotes escaped (using \')
- semi-colon separated, double-quotes around all elements, double-quotes escaped (using \")
- semi-colon separated, quotes around some elements, quotes escaped (using \')
- semi-colon separated, double-quotes around some elements, double-quotes escaped (using \")
And I'm not even talking about weird custom CSV variants that support multi-line fields, for example, or any other "I want to fit a circle in a square" mentality.
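To see why that quoting zoo bites: any naive split on the delimiter mangles a field that contains it. A sketch:

```shell
printf 'name,comment\nalice,"hello, world"\n' > quoted.csv
cut -d, -f2 quoted.csv
# cut treats the quoted comma as a separator, so the
# second output line is the truncated field: "hello
```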
I don't know why people don't simply create TSV (tab-separated) files. No character-escaping mess. MUCH easier to parse.
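A tab-delimited version of the same data illustrates the point: commas in fields need no quoting, and cut handles it with its defaults (tab is cut's default delimiter):

```shell
printf 'name\tcomment\nalice\thello, world\n' > demo.tsv
cut -f2 demo.tsv
# prints:
# comment
# hello, world   (the comma survives intact, no quoting needed)
```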
EDIT: Formatting
A fast viewer a friend of mine created to view large CSVs in a GUI - might be useful to someone.
That said, worth mentioning CSV syntax highlighters for text editors. I’ve found rainbow csv quite helpful (https://github.com/mechatroner/sublime_rainbow_csv)
Seems like a real gap in the software ecosystem atm:
- fast
- no limit on file size
- spreadsheet style layout
- command line
- easily edit and update individual cell => save
I've tried VIM's CSV plugins many times and have never been satisfied.

[1] https://stackoverflow.com/questions/41213869/where-to-find-t...
https://www.tadviewer.com/, it's a desktop app
edit: one note after seeing other comments based on CLI tools (xsv, sed, awk, etc.) — the OP's use case is something that marketing/PM/business stakeholders can use. my favorite tools are CLI-based; however, that does not fly with business teams, so forget that option.
Use cat, pipe, grep, awk. Problem solved.
The first tools I reach for when dealing with CSVs of these and larger magnitudes are less, cut, awk, etc. They also tend to be the last tools I end up needing.
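A typical first pass with those tools, on a hypothetical log-style CSV (the file name and column layout here are invented for the example):

```shell
printf 'ts,level,msg\n1,error,boom\n2,info,ok\n3,error,bad\n' > big.csv
head -2 big.csv                        # peek at the header and first row
cut -d, -f2 big.csv | sort | uniq -c   # value distribution of one column
grep -c ',error,' big.csv              # count rows matching a pattern
```

None of this loads the file into memory, which is why it scales to sizes that choke a spreadsheet.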
Having to build a table in a database and import the CSV into that feels a bit like hitting a house fly with a sledgehammer, but it's the most effective way I've seen.
You should be using the Data -> From Text importer, not just double-clicking. Also use the latest 64-bit version of Excel.