All of these things took me some serious time to learn. Partly because of my sheer incredulity at the stringliness of it all. Finding good GL tutorials is hard, after all: "surely", I thought, "these are just bad examples, and I should keep looking for a better way".
A quick, frank throwdown of "this is the way it is" like this would've gotten me over that hump much faster. Even if it's crushing, I'm filing this link away for "must read" when someone asks me where to get started in GL programming.
To me this low boilerplate API implies something closer to Processing, Openframeworks, THREE.js or Cinder, but I rarely see those recommended as options in OpenGL threads like this. Is this a marketing problem? Is it because programmers are taught that real programmers use mostly raw OpenGL if they want to take advantage of the graphics pipeline? Is it because HN users are curious and this is a learning exercise? I'm asking because I'd like to look into writing a library, framework or set of tutorials that appeals to programmers that are interested in graphics programming, find modern OpenGL too verbose or intimidating and whose needs are not met by an existing library, framework or game engine. This might end up being an educational resource that helps people unfamiliar with graphics or game technology understand their options based on their use case. I imagine that many of the people that think they need to learn OpenGL would be completely happy with Unity3D, for example.
You can download just the Unity shaders via the download dropdown: https://unity3d.com/get-unity/download/archive
Those shaders give you (battle tested) base cases for really common functionality.
"This library's sole purpose is to make using the WebGL API less verbose."
It's really good, and I wish that I'd known about it years ago.
In gfx-rs [1] one defines a Pipeline State Object with a macro, and then all the input/output data and resources are just fields in a regular Rust struct [2]. So assignment pretty much works as one expects. All the compatibility checks are done at run (init) time, when the shader programs are linked/reflected.
In vulkano [3] the Rust structures are generated at compile time by analyzing your SPIR-V code.
[1] https://github.com/gfx-rs/gfx
[2] https://github.com/gfx-rs/gfx/blob/2c00b52568e5e7da3df227d415eab9f55feba5a9/examples/shadow/main.rs#L314
[3] https://github.com/tomaka/vulkano

I say this as someone who's written ~2 commercial renderers to varying degrees from scratch. Normally it's 1-2 days just to set up all the various infra for handling shaders/uniforms/etc.
https://github.com/tomaka/glium/blob/master/examples/tutoria...
That includes: window creation and event handling, fragment shader, vertex shader, varying declaration and binding. Uniforms are pretty trivial on top of this (also done via macros, so there are no raw strings).
The whole tutorial is really solid: http://tomaka.github.io/glium/book/tuto-01-getting-started.h...
Next time I'm starting up a GL project it's going to be hard for me to argue for C/C++, given how nice some of the semantics around Rust's OpenGL libraries are.
[1] http://blog.theincredibleholk.org/blog/2012/12/05/compiling-...
Although I'm not a big fan of CUDA, it does have an advantage in that it combines both worlds in a single language. This permits optimisations that cross device boundaries, though I have no idea whether the CUDA compiler actually does any of this.
Apologies for tooting my own horn, but I have been working on a language that can optimise on both sides of the CPU-GPU boundary: http://futhark-lang.org/ (Although it is more high-level and less graphics-oriented than what I think the author is looking for.)
Case in point: I recently learned that "#line 0" is a spec violation. A single OS revision of a single Android device refused to compile it. This is after shipping that line in every shader in multiple games for multiple years to millions of happy customers.
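One cheap line of defense is linting shader sources at build time for constructs some drivers are known to reject. A minimal sketch in Python (the check list is illustrative, not exhaustive; you'd extend it as you discover new driver quirks):

```python
import re

def lint_shader(source: str) -> list:
    """Flag shader-source constructs known to upset some driver compilers."""
    problems = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        # Some GLSL ES compilers reject "#line 0": the spec implies
        # line numbers start at 1.
        if re.match(r'\s*#\s*line\s+0\b', line):
            problems.append((lineno, '#line 0 is a spec violation on some compilers'))
    return problems

src = "#version 100\n#line 0\nvoid main() { gl_FragColor = vec4(1.0); }\n"
print(lint_shader(src))  # -> [(2, '#line 0 is a spec violation on some compilers')]
```

Running this in your content pipeline means the one device that refuses to compile the shader never sees it in the first place.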
Apparently, Unreal Engine's boot process involves compiling a series of test shaders to determine which common compiler bugs the game needs to work around for this particular run.
This doesn't have anything to do with shaders as strings. It's been true since the beginning of shader programming, whether bytecode shaders or otherwise.
I can confirm this happens a LOT. (At least with ESSL.) In its defense, it is usually a mistake I made, but some implementations are definitely more lenient than others.
But it has no authors! I assume it was a double-blind submission to ICFP, which required removing the author names. When posting it publicly, though, I think you should put the author names back on. You don't want your work floating around the internet without your names attached.
(I'm currently working on a Python hpc codebase with pyopencl and would be interested in reducing some of the manual labor required for cpu-gpu coordination)
If your asset can be built, then your asset should be built. Content pipelines.
> Shaders are Strings
If you're using Vulkan, compile your shaders to SPIR-V alongside your project. If you're using OpenGL you're mostly out of luck - but there's still no reason to inline the shaders even with the most rudimentary content manager. Possibly do a syntax pass while building your project.
> Stringly Typed Binding Boilerplate
If you build your assets first then you can generate the boilerplate code with strong bindings (grabbing positions etc. in the ctor, publicly exposed getters/setters). I prototyped this in XNA but never really made a full solution.
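The idea can be sketched in a few lines of Python (hypothetical names; a real tool would use a proper GLSL parser rather than a regex): scan the shader source for `uniform` declarations and emit a binding class that looks up every location once in the ctor, so a misspelled name shows up at codegen time instead of as a silent -1 location at runtime.

```python
import re

GLSL = """
uniform mat4 u_mvp;
uniform vec3 u_lightDir;
uniform float u_time;
"""

def gen_bindings(source: str, class_name: str = "ShaderBindings") -> str:
    """Emit source for a strongly-named binding class from uniform declarations."""
    uniforms = re.findall(r'uniform\s+(\w+)\s+(\w+)\s*;', source)
    lines = [f"class {class_name}:",
             "    def __init__(self, gl, program):"]
    for _type, name in uniforms:
        # One location lookup per uniform, done once in the ctor;
        # callers then use self.loc_<name> instead of a raw string.
        lines.append(
            f"        self.loc_{name} = gl.glGetUniformLocation(program, '{name}')")
    return "\n".join(lines)

print(gen_bindings(GLSL))
```

Generating this as part of the asset build means the string names live in exactly one place: the shader itself.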
Graphics APIs are zero-assumption APIs - they don't care how you retrieve assets. Either write a framework or use an opinionated off-the-shelf engine. Keeping it this abstract allows for any sort of imaginable scenario. For example: in my XNA prototype, the effects (shaders) were deeply aware of my camera system. In the presence of strong bindings I wouldn't have been able to do that.
Changing the way that the APIs work (to something not heterogeneous) would require Khronos, Microsoft and Apple to define the interaction interface for every single language that can currently use those APIs.
It's loosely defined like this for a reason - these are low-level APIs. It's up to you to make a stronger binding for your language.
https://www.opengl.org/sdk/docs/man/html/glShaderBinary.xhtm...
The API was built so that the cost of compilation by the driver would only be paid once (likely at install time, or at driver update time). So the idea is that when you are installing your PC game, it compiles all the shaders against the current driver rev and the hardware in your machine. Then when you play the game, it just loads the locally compiled binaries, so your game doesn't hitch at random times when the driver compiler decides to kick off a compilation pass.
However, the binary is very sensitive to the driver version (as it owns the compiler), and the hardware itself. The binaries will differ across vendors, vendor architectures, and even sub-revisions inside of a single architecture.
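That sensitivity means any on-disk binary cache has to be keyed on everything the binary depends on. A sketch of the cache-key logic (the vendor/renderer/version strings stand in for what you'd query via glGetString; the names are illustrative):

```python
import hashlib

def binary_cache_key(vendor: str, renderer: str, driver_version: str,
                     shader_source: str) -> str:
    """Key a cached program binary on driver + hardware + source.

    If any component changes (driver update, different GPU, edited
    shader), the key changes and the program gets recompiled.
    """
    h = hashlib.sha256()
    for part in (vendor, renderer, driver_version, shader_source):
        h.update(part.encode("utf-8"))
        h.update(b"\x00")  # separator so fields can't run together
    return h.hexdigest()

k1 = binary_cache_key("VendorA", "GPU-1", "driver 1.0", "void main(){}")
k2 = binary_cache_key("VendorA", "GPU-1", "driver 1.1", "void main(){}")
print(k1 != k2)  # a driver update invalidates the cached binary
```

A robust implementation would still treat glProgramBinary failure as a cache miss and fall back to source compilation, since drivers are free to reject stale binaries.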
Bonus: This API was provided as part of the GL_ARB_ES2_compatibility extension, in order to ease porting from OpenGL ES 2 to (big) OpenGL. OpenGL has a different API/extension, glProgramBinary, that is preferred for OpenGL. glShaderBinary might not even really work on any desktop implementation.
Probably the best source for this, though this information is relatively...obfuscated: https://community.amd.com/thread/160440
This whole article is whinging against things that people have known for a while and have been working to fix. Sometimes it's not easy.
CEPL demos: https://www.youtube.com/watch?v=2Z4GfOUWEuA&list=PL2VAYZE_4w...
I get it, there are many good reasons to stick to C++ if you want to ship a game today.
Granted that's not the case across the board, but people have been pushing for data driven C++ game engines for years.
Then there's a rich variety of third party tooling for specific things you become aware of when you start to have studio levels of money to buy licenses, and those typically don't have Lisp wrappers.
Still, CEPL is pretty sweet, and in the related bin it's neat to see things like Flappybird-from-the-ground-up-in-the-browser through ClojureScript and Figwheel. For a beginner who doesn't have access to the best tooling in the industry, or even an amateur who just wants to make a simple game, there are a lot of good (and good enough) options. Even behemoth IDEs that force you to relaunch a scene to load your newly compiled changes aren't so bad: since the language you're working in probably starts with C, state management is harder than in more functional-programming-friendly languages, and it's much easier to reason about your game scene when you always start with fresh state. Raw OpenGL with C++ should probably be discouraged unless the person is explicitly aiming to become a pro at a big studio (in which case they might be better off learning DirectX)...
I am not enough of a gpu guy to know, but I thought that was what Harlan [1] was trying to address. It had used miniKanren originally as a region and type inferencer.
[1] https://github.com/eholk/harlan

And clearly storing C in strings is a saner way to handle code than as Lisp stored in S-expressions. Who could possibly doubt it?
What is required is a language where the compiler can analyse the code and decide what to do on the CPU and what to do on the GPU as part of its optimisation. It could even emit different versions for different CPU/GPU combinations, or JIT compile on the target machine once it knows what the actual capabilities are (maybe in some cases it makes sense to do more on the CPU if the GPU is less powerful). You could possibly also run more or even all of the code on the CPU to enable easier debugging.
The language could be defined at high enough level that it can be statically analysed and verified, which I think would answer the criticisms in the article.
There's not a lot of documentation around it at the moment but the author explains a little about it in this video https://youtu.be/-WeGME_T9Ew?t=31m49s
Source code on github https://github.com/ncannasse/heaps/tree/master/hxsl
It might solve some other problems with GPU programming, though?
The key is that you're dealing with the shader language by using a language, rather than strings. You can use static type information. You can refactor more easily. You can create abstractions over the minor variations in card behaviors more easily.
While hxsl doesn't directly address debugging, it greatly improves your ability to create code examples to isolate the problem.
hxsl does about as much as you could hope for with shader languages, without requiring changes to the cards themselves (e.g. using bytecode, which hasn't been standardized yet, and which I wouldn't hold my breath for).
In some sense, the arrangement between app logic and shader logic is still a bit like client-server serialization. You still have to pass the data and the commands in a payload, since the two platforms typically do not share memory or have useful common architecture.
Ideally this sort of system would be part of the standard graphics pipeline but looks like we could be waiting a while to see that.
My only qualm is that he treats these aspects that he deems outdated as unnecessary, which is just not true. For a long time there was no alternative. Introducing high-level languages for programmable shading was a huge deal, and it actually decreased a lot of complexity. In reality it simplified a lot of stuff that was quite difficult before GLSL/HLSL came along.
He seems to be making some kind of rallying call for change but the next generation is already here. We'll have to keep supporting that old approach for a while longer but the problem really has been solved (to some extent).
Also, old person rant: "Back in my day, we only had 8 register combiners and 4 blend states. And we liked it!"
I think in order to solve this, we need three things: 1) Intermediate representation for compiled GPU code. Bytecode mentioned in the article sounds like it. 2) Cross-platform GPU programming support in our programming languages' standard libraries. 3) Compiler plugins and DSLs that output the IR and link to the library.
Huh, actually the amount of boilerplate required for getting a triangle to the screen with OpenGL is minimal.
- Create a context
- Fill a buffer with the triangle vertex data
- Load the buffer into OpenGL
- Create a vertex shader implementing the vertex-to-position transformation
- Create a fragment shader implementing pixel colorization
- Issue draw commands
The fact that future iterations of OpenGL and OpenGL ES have nuked the fixed function pipeline is sort of annoying. I'm not saying there should be more codepaths in the implementation, but if there were a default compiled fixed function vertex shader and fragment shader, it would be helpful.
What you've just described for rendering one primitive includes compiling two shaders, linking arguments, and specifying byte arrays before issuing commands, when this task used to be as simple as:
glBegin(GL_TRIANGLES);
glColor3f(1.0f, 0.0f, 0.0f);
glVertex3f(0.0f, 1.0f, 1.0f);
glColor3f(0.0f, 1.0f, 0.0f);
glVertex3f(-0.866f, -0.5f, 1.0f);
glColor3f(0.0f, 0.0f, 1.0f);
glVertex3f(0.866f, -0.5f, 1.0f);
glEnd();
Being able to draw a triangle in eight lines of code is much more encouraging to someone wanting to make a 3D game for mobile or the web than an estimated 50 lines, including two shaders they had to borrow from a textbook.

I would understand if immediate mode were gone, which might be better regardless. Still, things should be as simple as a glDrawArrays call without compiling shaders.
Not to mention, when working as a solo game developer, shaders feel like an orphan between artists and programmers.
- spend a few hours working out why your display window just shows black
> Compiler plugins and DSLs
Name me a non-Lisp language that supports that. A lot of LLVM-based languages are making progress in that direction (by providing the ability to hook into the LLVM IR, at least). But this problem afflicts all programming languages; graphics is just a pain point because of it.
I suppose this is one of the biggest pain points. Not a verbose API, but the fact that the drivers are so complex that they often break the expectation that you can trust the underlying implementation if it's provided by a vendor.
No linguistic overlay is going to fix the underlying drivers provided by graphics card providers.
Before figuring out how to make the interface elegant I claim it should first become robust. If it's not robust then wrapping it in an elegant interface will only make things worse.
I've always sort of suspected AMD's Mantle initiative, which grew into the above two technologies, was at least partly motivated by their inability to ship a driver with the same performance as nVidia's...
Somebody in this thread mentioned CUDA, which is great, but it also has the downside that you practically have to use C++ on the CPU side. You also can use e.g. python, but when you do you are back to compiling CUDA code at runtime.
Sure, you could argue that we can simply create a new programming language for applications that use the graphics card (or use C++, like CUDA). The problem with this is that few people would use it just because it's a bit easier to do CPU-GPU communication. And there is a second, much larger problem: different graphics API vendors would create different programming languages, which makes it much harder for an application to support multiple graphics APIs, as a lot of games do today with DirectX and OpenGL.
Perhaps there is another better solution, but I can't see it right now.
That isn't an attempt at snark--RISC-V is interesting. It isn't competitive. GPU technology is very interesting. This, without dump trucks of cash, won't be competitive, either.
There is a question of cui bono here, and you need to answer that to make what you advocate make sense.
How about $30,000[0]? Seriously, it's not the 90s anymore. The cost of making a chip is not insane and it's going to go down every year. As Moore's Law slows to a halt, we enter the golden age of computer architecture[1].
Time to go learn a new API :)
Direct3D and the next generation of graphics APIs—Mantle, Metal, and Vulkan—clean up some of the mess by using a bytecode to ship shaders instead of raw source code. But pre-compiling shader programs to an IR doesn’t solve the fundamental problem: the interface between the CPU and GPU code is needlessly dynamic, so you can’t reason statically about the whole, heterogeneous program.
I'm not really sure what kind of API the author wants. Sure, it's annoying that data in GPU-land and CPU-land are difficult to get to work together. But that's not the API's fault, it's because they are physically very far removed from each other. They don't share memory (and if they did you'd still have to lock and manage it). You could make an API that made them seem more transparent to the programmer, but then you're back to the GL 1.0 mess where you have zero control and the driver does dumb things at random times because it doesn't know anything about the program that is executing.
That said, it doesn't really change a whole lot: each platform has its own unique performance characteristics, and I don't think you'll ever correctly represent that at the API level. APIs tend to move much slower than the hardware platforms that power them.
Once you get to shipping IR instead of strings, which is nice I guess in that it should make initialization somewhat faster and the GPU drivers somewhat smaller, I'm not sure what you're going to get from being able to treat a CPU/GPU program as a single whole. Typically the stuff running on the GPU or any other sort of accelerator does not call back into the CPU - these are "leaf functions", optimized as a separate program, pretty much. I guess it'd be nice to be able to optimize a given call to such a function automatically ("here the CPU passes a constant so let's constant-fold all this stuff over here.") The same effect can be achieved today by creating a GPU wrapper code passing the constants and having the CPU call that, avoiding this doesn't sound like a huge deal. Other than that, what big improvement opportunities am I missing due to the CPU compiler not caring what the GPU compiler is doing and what code the GPU is running?
(Not a rhetorical question, I expect to be missing something; I work on accelerator design but I haven't ever thought very deeply about an integrated CPU+accelerator compiler, except for a high-level language producing code for both the CPU and some accelerators - but this is a different situation, and there you don't care what say OpenGL or OpenCL do, you generate both and you're the compiler and you use whatever opportunities for program analysis that your higher level language gives you. Here I think the point was that we miss opportunities due to not having a single compiler analyzing our C/C++ CPU code and our shader/OpenCL/... code as a single whole - it's this area where I don't see a lot of missed opportunities and asking where I'm wrong.)
It's also handy for the compiler to be able to check for errors in shader invocations. Looking up parameters by name and setting them dynamically adds a whole class of potential runtime errors that don't exist in straight c++ programs.
As to runtime errors because of misspelling a parameter name, this is going to be fixed the first time you run the program, and you live happily ever after; I don't mind these errors in Python very much and I don't mind them in C programs using strings to refer to variable names in whatever context that may happen.
Overall the amount of "dynamism" in CPU/GPU programming IMO does not result in wasting almost any optimization opportunities nor does it make the program significantly harder to get right, but I'm prepared to change my mind given counterexamples (certainly in Python it's really easy to demonstrate missed optimization opportunities relatively to C due to its dynamism... The amount of dynamism in CPU/GPU programming however is IMO trivial and hence the issues resulting from it are also rather trivial.)
The hardware developers (AMD/nVidia/Intel) have an interest in not changing their device drivers, the software vendors (OpenGL and DirectX) have little interest in redeveloping technology, and the software developers with the most capital, game engine developers, have already found workarounds and hire enough engineers to plug leaks in their lifeboats. The state of tools for game developers is so shoddy as well that trying to retrofit language compilers and shader compilers to work together seems like a drawn out task.
It's sort of a David and Goliath situation if you think you can change the graphics programming landscape on your own. Plus, we all know how poorly these standards are developed over time.
Hrm...I've thought about this a lot, and I don't think it's so much a lack of interest as a lack of time. Or at least...not being able to monetarily demonstrate to their employers that it would be worth the engineers' time to generate educational resources and better tools.
I do think AMD is trying to get better in this area with their GPUOpen stuff, though of course, you never know what might happen if AMD suddenly becomes top dog.
Anyway, as far as I know it is limited to 2D raster images so far, yes. Since it's part research language with some very interesting features I do hope that it gets extended to more general cases, and at least make it easier to work with 3D use cases (image volumes and such) too.
The author just says that OpenGL and "shaders as strings" are bad, without explaining why. Then the author speaks about some mysterious "heterogeneity" without specifying what it means.
"We need programming models that let us write one program that spans multiple execution contexts" - what does that mean? What should that model look like, and how should it be different? It is like saying "cars with 4 wheels are bad, cars with a different number of wheels would be better".
> [...] doesn’t solve the fundamental problem: the interface between the CPU and GPU code is needlessly dynamic, so you can’t reason statically about the whole, heterogeneous program.
It's the same thing as raw SQL vs abstraction (could be ORM, but not necessarily). Not only are your assumptions not verified (is there table X, will I receive the columns I expect, etc.), you cannot tell if the constructs are even valid until runtime (is "SELECT %s from %s" valid?). Abstractions let you do things like `table.where(abc=123)` which is guaranteed to be a valid construct and can be verified against schema at compile time. It can still fail at runtime, but has much fewer failure modes and is clearer when reading.
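A toy sketch of that contrast in Python (the tiny query builder here is hypothetical, not any particular ORM): the builder can only ever produce well-formed SQL, and values never enter the query string at all.

```python
class Table:
    """A minimal query builder: constructs parameterized SQL."""
    def __init__(self, name):
        self.name = name

    def where(self, **conditions):
        # Sort columns for a deterministic query string.
        cols = sorted(conditions)
        clause = " AND ".join(f"{c} = ?" for c in cols)
        sql = f"SELECT * FROM {self.name} WHERE {clause}"
        params = [conditions[c] for c in cols]
        return sql, params

sql, params = Table("employees").where(abc=123)
print(sql)     # SELECT * FROM employees WHERE abc = ?
print(params)  # [123]
```

Compare that to interpolating user input into "SELECT %s from %s": the builder version can't produce a syntax error, and the parameterization makes injection structurally impossible. The analogy to shader binding is direct: a generated, typed interface versus a free-form string handed to a runtime compiler.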
PGSQL(dbh) "insert into employees (name, salary, email)
values ($name, $salary, $?email)"
and it (at compile time) creates the correct PREPARE+EXECUTE pair and checks types across the languages. (It's not obvious, but it is also free of SQL injections.)

For example, if your OCaml 'salary' variable were a string but the column is an int, then the above program wouldn't compile. If you change the type of the db column after compiling the program, then you get a runtime error instead.
Still around although now maintained by a group of authors: http://pgocaml.forge.ocamlcore.org/ (git source: https://github.com/darioteixeira/pgocaml)
It's only possible because OCaml has real macros. It wouldn't be easy to do this in C/C++ (I guess? - maybe with an LLVM-based pre-processing parser or something)
This paper is from a few years ago about verifying programs that span multiple languages / runtimes:
http://www.ccs.neu.edu/home/amal/papers/verifcomp.pdf
We need to get these concepts into mainstream tools. SQL certainly seems to be the low-hanging fruit (simple, straightforward typing, widely used and understood, frequent errors lead to security problems).
The shader API is an API where the developer provides source code at runtime and then modifies inputs to the program compiled from the source code that runs on the user's GPU. It's a source-code level interface because it makes nearly zero assumptions about what capabilities the underlying hardware will have; interpretation and compilation of the source is left largely up to the vendor (literally, in the sense that the compiler is part of the GPU's driver kit and not part of the OpenGL library---an issue that has caused me HOURS of headache when encountering a bug in a closed-source graphics card's compiler, let me tell ya ;) ).
I strongly suspect a more compile-time-checkable API would lack the flexibility needed to capture all shader behavior---both when the API was codified and in the future. It's a pain in the ass to use, but I'm not convinced it's avoidable pain.
It's a useful exercise in understanding why such a thing wasn't done.
"DreemGL is an open-source multi-screen prototyping framework for mediated environments, with a visual editor and shader styling for webGL and DALi runtimes written in JavaScript." [3]
[1] https://github.com/dreemproject/dreemgl
[2] http://docs.dreemproject.org/docs/api/index.html#!/guide/dre...
Made by the guy behind MathBox.
> To define an object’s appearance in a 3D scene, real-time graphics applications use shaders...

Eh, the shaders are just a part of the GPU pipeline that transforms your vertices, textures, and shaders into something interesting on the screen.
This is already oversimplifying what GPUs are trying to do for the base case of graphics.
> the interface between the CPU and GPU code is needlessly dynamic, so you can’t reason statically about the whole, heterogeneous program.
Ok, so what is the proposed solution here? You have a variety of IHVs (NV, AMD, Intel, ImgTec, ARM, Samsung, Qualcomm, etc). Each vendor has a set of active architectures that each have their own ISA. And even then, there are sub-archs that likely require different accommodations in ISA generation depending on the sub-rev.
So in the author's view of just the shader code, you already have the large problem of unifying the varieties of ISA under some...homogeneous ISA, like x86. That's a non-trivial problem. What's the motivation here? How will you get vendors to comply?
I think right now, SPIR-V, OpenCL, and CUDA aren't doing a _bad_ job in trying to create a common programming model where you can target multiple hardware revs with some intermediate representation, but until all the vendors team up and agree on an ISA, I don't see how to fix this.
On top of that, that isn't even really the only important bit of programming that happens on GPUs. GPUs primarily operate on command buffers, of which, there is nary a mention of in the article. So even if we address the shader cores inside the GPU, what about a common model for programming command buffers directly? Good luck getting vendors to unify on that. Vulkan/DX12/Metal are good (even great) efforts in exposing the command buffer model. You couldn't even _see_ this stuff in OpenGL and pre-DX12 (though there were display lists and deferred contexts, which kinda exposed the command buffer programming model).
> To use those parameters, the host program’s first step is to look up location handles for each variable...
Ok, I don't blame the author for complaining about this model, but this is an introductory complaint. You can bind shader inputs to 'registers', which map to API slots. So with some planning, you don't need to query location handles if you specify them in the shader in advance. I think this functionality existed in Shader Model 1.0, though I can't find any old example code for it (2001?).
That being said, I certainly don't blame the author for not knowing this, as I think this is a common mistake made by introductory graphics programmers, because the educational resources are poor. I don't think I ever learned it in school...only in the industry was this exposed to me, to my great joy. Though I am certain many smarter engineers figured it out unprompted.
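To make the technique concrete, here's a small sketch: the GLSL string below is illustrative (explicit uniform locations require GLSL 4.3 / ARB_explicit_uniform_location; HLSL has the analogous register() syntax), and the Python parse just demonstrates that the host side can know every slot number without a single name query.

```python
import re

# With explicit layout qualifiers, the host side uses the numeric
# location directly instead of calling glGetUniformLocation by name.
VERT = """
#version 430
layout(location = 0) in vec3 a_position;
layout(location = 2) uniform mat4 u_mvp;
void main() { gl_Position = u_mvp * vec4(a_position, 1.0); }
"""

locations = {name: int(loc) for loc, name in re.findall(
    r'layout\(location\s*=\s*(\d+)\)\s*(?:in|uniform)\s+\w+\s+(\w+)', VERT)}
print(locations)  # {'a_position': 0, 'u_mvp': 2}
```

With the locations fixed in the shader itself, the binding agreement between CPU and GPU code becomes a set of constants you can check at build time rather than strings you look up at runtime.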
> OpenGL’s programming model espouses the simplistic view that heterogeneous software should comprise multiple, loosely coupled, independent programs.
Eh, I don't think I want a common programming model across CPUs and GPUs. They are fundamentally different machines, and I don't think it makes sense to lump them together. We don't assume we can use the same programming methodologies for single- and multi-threaded programs; I know plenty have tried, but I thought the best way of addressing the differences was education and tools. Likewise, I'd advocate that the most effective way GPU programming will become more accessible is through education and tools. I have hope that the architecture of the modern 'explicit' APIs will facilitate that movement.
1) Fixed-function utterly failed to capture the explosion of features as graphics card capability took off, having to fall back on nasty APIs like the OpenGL extensions system (where at a certain level of sophistication, more of your code is calling Glext functions than core OpenGL functions---when everything is an extension, nothing is)
2) Compile-time-checkable behavior was accomplished via function calls, but it turned out that making a lot of function calls created significantly non-trivial runtime overhead, and performance is king in graphics. It's hard to get good static checking on a model of "Rip a bunch of bits into a binary buffer and then ship that whole buffer to the GPU," but that model is a lot faster than many-function-call-based alternatives.
Well, except for APU's, which I think is an unstated target for this discussion. Or on-board graphics chips.
In these cases, there is almost certainly extra work being done, regardless of your programming philosophy.
Thanks
I agree that it's a very nice presentation of docs+code.
[0] - https://github.com/sampsyo/tinygl/blob/master/Makefile#L25