Pair it with a better, more modern, and safer native-compiled language and get the same effect. Zig, Nim, Go, hell even Carp.
I love how trollish it is not to talk about Rust in that context.
There’s been more than one time where I’m in some large autotools-based project trying to figure something out, and there’s a call out to some dependency I’ve never heard of.
Also, many of the projects lack any sort of documentation or source-code commenting. These aren’t someone’s pet projects, either: one was from a notable name in the open-source community, and the other was a de facto driver in a certain hardware space.
Then there are tools like SourceGraph, CppDepend among others.
That being said, I love seeing a push for simple stacks like this.
*gasp!* Such lack of symmetry... it disturbs something deep in my soul.
https://github.com/antirez/sds
However the moment you call into other C libraries, they naturally only expect a char *.
You will have to define "good". My string library[1][2] is "good" for me because:
1. It's compatible with all the usual string functions (doesn't define a new type `string_t` or similar, uses existing `char *`).
2. It does what I want: a) Works on multiple strings so repeated operations are easy, and b) Allocates as necessary so that the caller only has to free, and not calculate how much memory is needed beforehand.
The combination of the above means that many common string operations that I want to do in my programs are both easy to do and easy to visually inspect for correctness in the caller.
Others will say that this is not good, because it still uses and exposes `char *`.
[1] https://github.com/lelanthran/libds/blob/master/src/ds_str.h
[2] Currently the only bug I know of is the quadratic runtime in many of the functions. I intend to fix this at some point.
Using Go, I thought I was getting back to low-level stuff, but this C experience made me appreciate strings in Go. Web servers in C are a crazy bad idea, especially if they are spitting out HTML. Lisp would be better. Node would be better. Go would be better.
Otherwise, yes using anything safer, where lack of bounds checking isn't considered a feature is a much better option.
At least if you are going to use C, you (should) know to be extremely paranoid about how you process anything received from the user. That doesn't remove the risk but at least you are focused on it.
Will generate Rust and TypeScript if ya want.
Unfortunately, in the flux of user requirements—each addition or modification of a table column changing select routines, insertions, validation, exporting, regression tests, and even (especially?) front-end JavaScript—we make mistakes. What BCHS tools beyond the usual can help in this perennial test of our patience?
;)
I checked out BCHS a few years back; the key piece is that it's OpenBSD. If it were Linux, it might have caught on, due to Linux's popularity, good or bad. This could be useful for embedded devices, for example, but not many embedded devices run OpenBSD, if any at all.
I can't speak to why you'd want to use C in a web stack, but I can weigh in in the more general sense:
A while ago I thought I'd try my hand at the Cryptopals challenges, and I figured, hey all the security guys know C (and python, but ugh) so I'll use this as an opportunity to really learn C. Prior to starting that project, I "knew" C, in the sense that I took CS036 which was taught in that small subset of C++ that might as well be C.
So I jumped in and it felt really liberating, at first. You want to compare chars? It's just a number, use >= and <= ! Cast it back and forth to different types! Implement base64 by just using a character array of the encoding alphabet and just walk through it like an array and do magic casts! No stupid boilerplate like namespaces or static class main{}!
Then by about the 2nd set where you have to start attacking AES ECB I realized I was spending more time debugging my mallocs/frees and my fucking off-by-one errors than I was spending on actually learning crypto. I stuck with it until I think part way through the third set but by that point I couldn't take it any more.
So I bailed out of C and never looked back. I can see how a certain type of programmer (who is more practiced with malloc/fastidious with their array indices and loop bounds than I am) can really enjoy C for a certain type of work. But I can actually say now, hand on heart, that I know C; and I don't like it.
Is there any language (other than assembly) that is faster at runtime than C today?
In what way does C make you pay for what you don't use?
[1] Wapp - A Web-Application Framework for TCL:
[2] EuroTcl2019: Wapp - A framework for web applications in Tcl (Richard Hipp):
(They do have "pledge" but even in the most restricted case, this still leaves full access to database)
It's definitely contrary to modern assumptions about web app security, but it's interesting to see web apps that are secure because they use OS security features as they were designed to be used, rather than web apps that do things that are insecure from an OS-perspective, like handling requests from multiple users in the same process, but are secure because they do it with safe programming languages.
So no, the web apps cannot be made secure via OS support alone, because the OS security features are not adequate for high-level problems. Any sort of code exploit allows attacker to trivially access the entire database -- either to read anything, or to overwrite anything.
"pledge" and "unveil" can prevent new processes from being spawned, but they cannot prevent authentication bypass, database dumpling or database deletion.
C is a tremendous tool, but I don't think it's the best for customer facing web apps.
Not only was “not so long ago” kind of at the very beginning of meaningful web history but it was also for a very brief moment in time ( if we are talking pre-Perl ). Pre-Perl CGI may have never been a thing though as Perl is older than CGI.
I recall PHP being the next wave after Perl. One could argue it never lost its place even if it now has many neighbours.
Not a Perl advocate by the way though it did generate some pretty magical “dynamic” web pages from text files and directory listings back in the day. Similar story with PHP.
By 1999 I was already using our own version of mod_tcl and unfortunately fixing exploits every now and then in our native libs called by Tcl.
There's nothing fundamentally insecure about allowing C or any arbitrary code to execute on behalf of a user -- this is basically what cloud computing (especially "serverless") is.
As you identify, though, you need a Controlled Interface (CI) which accounts for this model for all resources and all kinds of resources and many tools do not (yet) allow for it.
Compare it with C, where the bugs are likely unique per app, and require non-trivial effort to detect and fix.
Execution of user-specific code by serverless services requires non-trivial isolation, and is predicated on "each user has its own separated area" to work. This is not the case with most websites. Take HN for example -- there is a shared state (list of posts) and app-specific logic of who can edit the posts (original owner or moderator). No OS-based service can enforce this for you.
For uptime.is I’ve used a stack which I’ve started calling BLAH because of LISP instead of C.
There are other examples of this kind of approach, too: straight C Common Gateway Interface web applications in public-facing production use. What comes to mind is cgit [2], the version control system web frontend that the WireGuard people use. If it's really so crazy, then how come the OpenBSD and WireGuard people - presumably better hackers than you - are just out there doing it?
Other places you see C web application interfaces include embedded devices (SCADA, etc.) and even the web interfaces of routers, which unfortunately ARE crazy - check out all the security problems! Good thing the people at our favorite good old research operating system have done the whole pledge(2) [3] syscall to try to mitigate things when those applications go awry; understanding this part of the whole stack is probably key to seeing how any of it makes sense at all in 2022. It sure would be nicer if those programs just crashed instead of opening up wider holes. Maybe we can hope that these mitigations, and the higher code quality that limited-resource device constraints demand, both become more widespread.
[1] http://undeadly.org/src/ [2] https://git.zx2c4.com/cgit/ [3] https://learnbchs.org/pledge.html
Probably precisely because they're better? I can see why people who are struggling with malloc and off-by-ones (https://news.ycombinator.com/item?id=29990985) would think it's crazy.
pkg_add sqlite3
Can't get away.
Berkeley DB with a header date of 1994 :) In base, and of course it still works.
SQLite was removed from base, again, in 6.1 (2017) -- https://www.openbsd.org/faq/upgrade61.html
with this BSDCAN '18 pdf briefly explaining the issues (unmaintainable) -- https://www.openbsd.org/papers/bsdcan18-mandoc.pdf
We like seeing what we can get away with using what's available in the base distribution and a few well-chosen, well-audited packages.
I think the correct pronunciation is "Breaches". Using C in this context, as others have mentioned, is very, very likely to lead to security issues. Even C++, with its better string handling, would be a step up.
Database stuff took a good deal of doing, but with little in terms of abstraction, it was also quite fast.
I would like to see a renaissance of using protocols other than HTTP and content markup other than HTML.
I've been reading about / hacking on CGI recently, and it's been kinda fun!
Question: One thing I keep reading is how inefficient it is to start a new process for each incoming connection. Could someone explain to me why that's such a bottleneck? I imagine it being an issue back when CGI was used everywhere, people moving away from CGI, and forgetting about it. But hasn't there been improvements in the meantime? Computers from today can run circles around those from a few decades back. Has everything improved except the speed / efficiency of starting a new process?
(I don't have a computer science background, but I guess you could already tell from the above.)
> I've been reading about / hacking on CGI recently [...] One thing I keep reading is how inefficient it is to start a new process for each incoming connection. Could someone explain to me why that's such a bottleneck?
It's not as bad as you think it is; just change the webserver to pre-fork. From this link[1], and the nice summary table in this link[2] - I note the following:
1. Pre-forked servers perform very consistently (little variation before being overwhelmed) and, at a glance, appear to be less consistent only than epoll.
2. For up to 2000 concurrent requests, the pre-forked server performed either within a negligible margin against the best performer, or was the best performer itself.
3. The threaded solution had the best graceful degradation; if a script were monitoring the ratio of successful responses, it would know well beforehand that failure was imminent.
4. The epoll solution is objectively the best, providing both graceful degradation as well as managing to keep up with 15k concurrent requests without complete failure.
With all of the above said, it seems that using CGI with a pre-forked server is the second best option you can choose.
I suppose that you then only have to factor in the execution of the CGI program (don't use Java, C#, Perl, Python, Ruby, etc - very slow startup times).
[1] https://unixism.net/2019/04/linux-applications-performance-i...
[2] https://unixism.net/2019/04/linux-applications-performance-p...
Currently the CGI stuff I'm working on is to run stuff on a cheap shared host, so I'll have to check which category of servers that Apache falls in.
Once an application I'm running on a shared host becomes successful enough, I'm probably going to want to move to a different environment, but I'm still interested in what that would mean for performance :)
You might want to look at using FastCGI:
https://en.wikipedia.org/wiki/FastCGI
Basically, the CGI processes stay alive and the servers supporting FastCGI ( like Apache and nginx ) communicate with an existing FastCGI process that's waiting for more work, if available.
For my current use-case* that wouldn't be an issue, so CGI could probably be OK there, then!
* A side project that uses SQLite (1 file per user), and no other external resources.
Yes, it’s less efficient than having a persistent server, but as all things are, it exists in a spectrum.
The load time for one of these processes is going to be almost trivial. I’m on mobile right now, but I would guess that it would be in a handful of milliseconds, especially when the binary is already in cache (due to other requests).
But if you want to compare this against a lot of the prevailing systems, it'll still probably win on single-request efficiency. Network hops, for example, are frequently quite slow and, if efficiency is your primary metric, should be avoided as much as possible. Things like serverless go the opposite way and route both your incoming request and your backend database requests through a complex set of hops.
I guess I should do some benchmarks comparing different technologies.
> Things like serverless go the opposite way and route both your incoming request and your backend database requests through a complex set of hops.
I didn't know about that, thanks. If you know some good resources on the topic, feel free to put them in a reply to this message!
- I feel that they are Linux-only. On my macOS system I can't rely on man x being the man page for the right version of x. I know that in principle there are environment variables that make sure I'm getting the GNU coreutils version or the Homebrew version rather than the system BSD version, but it's too many moving parts. Furthermore, even if I get it right, I can't expect people I'm working with or mentoring to get it right, hence I can't recommend man to them for documentation. God knows about man pages on Windows.
- I feel that a small amount of plain-text documentation should be stored in the executable, not separately. Isn't keeping man pages separate from the executable a holdover from the vastly more constrained computing environments of the 70s and 80s? It's just asking to get out of sync / incorrectly paired up.
Also, man pages are for more than just system utilities (man(1)). Which binary should hold pledge(2) (https://man.openbsd.org/pledge), exactly?
Your man pages should be updated when the associated tool is updated.
You are describing a MacOS issue, with its terrible package management, and frustrating toolchains.
In fact MacOS has an excellent package manager -- it's called homebrew. I don't really want to argue about it but you're the one who made an unjustified assertion about an OS which I bet you don't use. People like you insist that it's bad but no-one who uses it knows why. I maintained my own Linux laptop for 10 years, and for the last 10 years I've used homebrew on a Mac. It has literally never given me any problems! I've never even searched the issues on Github for a problem as far as I can remember.
Honestly I think that the thought processes of most Linux/Unix enthusiasts like you who criticize homebrew are
1. We hate MacOS because of childish anti-capitalist ideologies
2. Therefore we will not admit that a nice command-line development environment can be created on MacOS
3. Therefore homebrew is bad
They're actually better on FreeBSD/OpenBSD in my experience. As stupid as it makes me sound, I often struggle to parse Linux man pages, but I've had no trouble with the BSDs' across a variety of topics.
> - I feel that a small amount of plain text documentation should be stored in the executable,
Isn't this how --help usually works? I would also rather have more documentation embedded, at least for some executables.
Comparatively, I’ve found NetBSD documentation to be lacking, although NetBSD seems to take the cake on code quality and legacy architectures (an area I find myself delving into right now).
On the wider discussion of docs, I’ve found Linux kernel documentation to be a pain in the ass, and sometimes even worse than Windows kernel documentation (which I won’t even bother to get into).
But isn't that an issue with macOS, not an issue with man pages?
Why don’t more folks use Nim for web development? It seems like the perfect blend of performance, ergonomics, and productivity.
I am sure the D and V guys are asking themselves the same question.
I _just_finished_ my own comparative benchmarks to (re)check my projects from ~7 years ago, all in similar stack.
Back then I wrote the logic as Apache modules, in C. It used Cairo to draw charts (surprisingly, the traces of my trigonometry knowledge were enough for me to code that :-), and I had absolutely crazy "hybrids" of bubble charts with bars, alpha-channel overlays, etc. It was extremely useful for my projects back then, and I've never seen any library able to produce what I "tailored" ...)
The 7-years-ago end-to-end page generation time was ~300 µs (1e-6 sec), with graphics, data-store IO and request processing, preparing the "bucket brigade" and passing it down the Apache chain.
This Jan I re-visited my code and implemented logic for OpenBSD httpd as:
1) An OpenBSD httpd "patch" to hijack the request processing internally, do the necessary data and graph ops, and push the result directly into the bufferevent buffer before httpd serves it up to the client.
2) An FCGI responder app, talking to httpd over a Unix socket. BTW: this is the most secure version I know of; I could chroot / pledge / unveil, and, IMO, it beats SELinux and anything else.
3) CGI script in ksh<=>slowcgi<=>FCGI=>httpd
4) CGI program (statically linked) in pure C<=>slowcgi<=>FCGI=>httpd
5) PHP :-) page (no frameworks)<=>php-fpm (with OpCache)<=>FCGI=>httpd
To my extreme surprise, the outcome was clear: it did not matter what I wrote my logic in. _Anything today_ (including a CGI shell script) is so fast that 90% of the time was spent on network communication between the web server and the browser. (And with TLS it's like a 2x penalty ...)
All options above gave me end-to-end page generation time about 1-1.5 ms.
Guess what? Beyond "Hello World", with page sizes of 500 KB+, PHP was faster than anything else, including the native "httpd patch" in C.
As a side effect, I also confirmed that the libevent-based, absolutely gorgeous OpenBSD httpd works slightly slower than the standard pre-fork Apache httpd from pkg_add. (It gave me sub-ms times, just like 7 years ago.)
Who would have guessed ...
What also happened is that any framework (PHP, or I even tried Node.js), or writing CGI in Python, increased my end-to-end page generation time 10x, to double-digit ms.
I remember last week someone here was talking about writing business applications / servers for clients in C++, delivering them as single executable file.
I would be very interested to hear how that person's observations correlate with mine above.
G'day everyone!
Hello world apps don't mean much.
"Just because you can, doesn't mean you should."
Ironically I could say the same about the JS ecosystem.