GCC's impact was possible because, together with GAS (the assembler), it made a 100% open source toolchain feasible. Yes, more software was necessary for a complete system (linker, libc, etc.), but GCC made it possible to build from the ground up.
Also, yes, the initial GCC was worse than any decent proprietary tool chain at the time, but it got better and better because each improvement built on all the earlier open sourced efforts.
Think about how hard Linux kernel development would have been if it had to rely on different proprietary tool chains for every target architecture (and possibly chip version).
Hardware definition languages (Verilog/VHDL, etc) enable high level chip design like high level programming languages, but making the physical chip requires a PDK (process design kit) that encodes how each critical silicon feature is built.
So a chip built for TSMC 28nm contains TSMC proprietary material and is essentially unportable. It can take several years to move a major chip from one foundry to another (or even a shrink at the same foundry), and the proprietary tool chains preclude a development process that can incrementally improve portability.
This announcement is a major step toward a similar foundation being available for silicon design. It is very important that it is a large complex chip, rather than just a research development vehicle.
[disclaimer - past life as OpenPOWER participant]
we've developed a dynamically SIMD-partitionable-maskable set of "base primitives" for example, so you set a "mask" and it automatically subdivides the 64-bit adder into two halves. but we didn't leave it there, we did shift, multiply, less-than, greater-than - everything.
https://git.libre-soc.org/?p=ieee754fpu.git;a=blob;f=src/iee... https://git.libre-soc.org/?p=ieee754fpu.git;a=blob;f=src/iee...
can you imagine doing that in VHDL or Verilog? tens of engineers needed, or some sort of macro-auto-generated code (treating VHDL / Verilog as a machine-code compiler target).
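to make the "dynamic partitioning" idea concrete, here's a minimal plain-Python model of the behaviour described above (a 64-bit adder that a mask bit splits into two independent 32-bit lanes). this is an illustration only, not the actual nmigen code in the ieee754fpu repository, and the single 32-bit split point is an assumption for simplicity:

```python
MASK64 = (1 << 64) - 1

def partitioned_add(a, b, split_at_32):
    """Model of a dynamically partitionable 64-bit adder.

    In hardware the partitioning is done by gating the carry chain
    at the partition point; here we just model the observable result.
    """
    if not split_at_32:
        # mask clear: one ordinary 64-bit add
        return (a + b) & MASK64
    # mask set: two independent 32-bit lanes, with no carry
    # propagating from bit 31 into bit 32
    lo = (a + b) & 0xFFFFFFFF
    hi = ((a >> 32) + (b >> 32)) & 0xFFFFFFFF
    return (hi << 32) | lo
```

the key observable difference is the carry: with the mask set, an overflow in the low lane wraps inside that lane instead of rippling into the high lane.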
the reason for doing this - planning it well in advance - is because we're doing Cray-style Vectors (Draft SVP64) with polymorphic element-width over-rides. yes, really. the "base" operation is 64-bit, but you can over-ride the source and destination operation width.
the reason why we're using our own Cell Library is actually down to transparency. we want customers to be able to compile the GDS-II files themselves, fully automated, no involvement from us, no manual intervention.
ironically, as an aside: Staf's Cells are 30% smaller (by area) than the Foundry equivalents.
There is a huge amount of great stuff going on in this area.
Tim Ansell - Skywater PDK: Fully open source manufacturable PDK for a 130nm process
Staf developed actual IOpad Cells (from scratch), actual Standard Cells and a 4k SRAM block: we did not use the NDA'd TSMC Cell Libraries here.
if we had used Skywater 130nm we would have been forced to ditch LIP6.fr (i cannot express enough how hard Jean-Paul Chaput has worked on coriolis2 for the past 18 months), we would not have been able to test the IOpads that Staf developed... yeah.
bottom line is we used a complete independent VLSI toolchain - fully automated - that has nothing to do with the USA or DARPA Military funding - and was developed with European expertise.
It may be 180nm (1999-era technology), but that's still hugely important. The world of semiconductor design is incredibly closed source and secretive.
Staf will also "protect" you from the Foundry NDAs. you develop with a "symbolic" version of the Cell Library, he runs the "Real" one and sends it to IMEC on your behalf. here's Staf's "symbolic" Cell Library, it's based on FreePDK45 https://gitlab.com/Chips4Makers/c4m-pdk-freepdk45/-/releases
Coriolis2 - http://coriolis.lip6.fr/ - is entirely Libre-Licensed. it's fully automated, you don't have to do any "hand-editing", it has unit tests (so you have demos you can look at and also check you installed everything right). we have some automated setup scripts for it if you're interested: https://git.libre-soc.org/?p=dev-env-setup.git;a=blob;f=cori...
LIP6 have a Silicon-proven ENTIRELY Libre Cell Library called nsxlib, if you really want to go that route. it's Silicon-proven in 360nm and 180nm.
Also, LIP6 have a relationship with a small town in Japan, which has a 2 micron fab used for "training" of the town's employees. submission for that is entirely free. i know this exists but have not used it, and don't know more details, but i can probably put you in touch with Sorbonne University if you're serious.
and if you really really want to do "at home" stuff, Libre-Silicon is developing a 2in wafer fab, using Ultra-Violet DLPs and high-accuracy stepper motors, that you'll be able to buy and operate from your garage or lab. think "3D printing". i think they're aiming for 2000 nm or something (2 micron)? really big, but proves the concept.
Neither one has published an easily-replicable process, meaning I can't really repeat what they've done. IMO what this space needs is an open source build plan/BoM, with a cottage industry of people selling DiY and pre-assembled kits. Once the 3d printing community got there, that's when things took off -- before kits or at least build guides with proper BoMs, it was just disparate individuals doing their own thing.
Connect me with anyone who's got a good approach to building some sort of replicable open-source fab though, and I'll quit my job and join the project full-time (that's not a joke: I'm serious).
[1] http://sam.zeloof.xyz/category/semiconductor/ [2] https://libresilicon.com/
180nm is still far and away the world's most heavily-used geometry, because the price-performance (bang per buck, however you want to put it) is so extremely high.
an 8in wafer is USD 600 and that's extremely low. for any power MOSFET, power transistor, diode or other high-current semiconductor you absolutely don't want small features (detailed tiny tracks): you want MASSIVE ones.
why on earth would you waste money on tiny features, it's like using the latest 0.15mm 3D printing nozzles to 3D print a massive 300x300x300 mm cube that's going to be used for nothing more than a foot-stool. you want a 1.2mm nozzle for that!
then for any processor below 300 mhz, you can get away with 180nm. need only an 8 mhz 8-bit or 4-bit washing machine or microwave processor, or something to go in a cheap digital watch? 180nm is your best bet: you'll get tens of thousands of < 1 mm^2 ASICs on a single wafer which means you're well below $0.05 per individual die.
a 28nm 8in wafer would be about... 10x that cost, you'd end up with exactly the same transistor (or 8 mhz 8-bit processor), why would you pay more money for what you don't need?
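the per-die arithmetic above is easy to check. a back-of-envelope sketch, using the $600 wafer figure from the comment and an assumed 85% usable-area fraction (edge loss, scribe lines, yield margin - my assumption, not a quoted foundry number):

```python
import math

wafer_cost_usd = 600.0       # 180nm 8-inch wafer, figure from the comment
wafer_diameter_mm = 200.0    # 8 inches ~= 200 mm
die_area_mm2 = 1.0           # "< 1 mm^2" ASIC, rounded up to be conservative
usable_fraction = 0.85       # assumed: edge loss, scribe lines, yield margin

wafer_area_mm2 = math.pi * (wafer_diameter_mm / 2) ** 2   # ~31,416 mm^2
dies_per_wafer = int(wafer_area_mm2 * usable_fraction / die_area_mm2)
cost_per_die = wafer_cost_usd / dies_per_wafer            # ~$0.02
```

even with conservative assumptions you land comfortably under the $0.05-per-die figure.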
btw the real reason why there's a chip shortage: the Automotive industry, who are cheap bar-stewards, wanted even lower than $600 per 8in wafer so they went with 360nm and cruder geometry. that's equipment that's even older than the 1990s, like 40+ years in some cases.
so then the stupidity hit, and they stopped ordering. then 18 months later they phone up these old Foundries and say, "ok, we're ready to start ordering again". and the Foundries say, "oh, we switched off the equipment, and it cooled down and got damaged (just like that massive Electric plant in S. Australia that was de-commissioned, the concrete cracked when they switched it off, and it's completely unsafe to start up again). you were our only customer for the past 30 years, so we scrapped it all. you'll have to now compete with the consumer-grade smaller geometry Fabs like everyone else".
which is something that none of the Automotive companies have told their Governments, because then they can't go crying "boo hoo hoo, we can't make chips any more at the price that we demand, waaa, waaaa, i wannnt myyy monneeeeey"
and now of course they can't use the old masks, because those were designed for 360nm and cruder geometries, they have to redesign the entire ASIC for 180nm and that's why you can't now get onto 180nm and other MPW Programmes because the frickin Automotive Industry has jammed them all to hell.
In my opinion, an area of interest going forward into the next decade of more safety-critical software written by smaller and smaller orgs (e.g. eVTOL companies, sensor companies, etc) is continuing to push forward which objectives can be accomplished by formal means instead of primarily through testing.
An NXP or IBM processor might be great, and might be mature, and might be very well tested -- but I, as a safety-critical software developer, have little way of demonstrating that to certification authorities. The availability of open-source processor designs and, in the future, traceable and accountable conversion from those HDL designs to netlists, to masks, and then to silicon, gives a path to showing that portions of a processor are correct-by-design, and thus a path to the goal of showing that my machine-code-as-authored(-by-an-assembler) and machine-code-as-executed(-by-a-processor) semantics match.
The Talos is currently the only fully libre computer available for high-perf computing, and it uses POWER9 CPUs. If you want a fully free CPU, your choices are either very dated CPUs or POWER.
Many distros (inc. Debian, and most source-based ones) support ppc64/POWER officially quite well and go out of their way to ensure a high degree of portability.
The fact that the POWER architecture may be niche is not a problem since so much software can be compiled for it. See the Talos workstations: https://www.raptorcs.com/TALOSII/ and the PowerPC notebook: https://www.powerpc-notebook.org/en/
For people who are willing to use niche hardware for more control over what is running, this seems like a very important step.
What's good about this is that the source is available and can be verified to some degree against the hardware (by decapping it). That puts a lot of constraints on what kinds of secret back doors people can build that we didn't have before.
Off topic: where did you get this rule?
In other words, this chip isn't even remotely open-source.
What they sent to the foundry isn't the "ghost cells" (which don't have transistors in them and therefore don't work).
This fails the most basic requirements of being open source.
Coriolis2 source code: http://coriolis.lip6.fr/
Chips4Makers FlexLib Cell Library based on FreePDK45: https://gitlab.com/Chips4Makers/c4m-pdk-freepdk45/-/releases
Automated Layout scripts for generation of GDS-II Files: https://git.libre-soc.org/?p=soclayout.git;a=summary
please do try to get your facts right and not mislead people by making false claims, eh?
the problem with this particular irate individual is that he's assumed that because TSMC's DRC rules are only accessible under NDA that automatically absof*** everything was also "fake open source".
idiot.
sigh.
clearly didn't read the article.
whilst both Staf Verhaegen and LIP6.fr signed the TSMC Foundry NDA, we in the Libre-SOC team did not. we therefore worked entirely in the Libre world, honoured our commitment to full transparency, whilst Staf and Jean-Paul and the rest of the team from LIP6 worked extremely hard "in parallel".
the ASIC can therefore be compiled with three different Cell Libraries:
* LIP6.fr's 180nm "nsxlib" - this is a silicon-proven 180nm Cell Library

* Staf's FreePDK45 "symbolic" cell library using FlexLib (as the name says, it uses the Academic FreePDK45 DRC)

* the NDA'd TSMC 180nm "real" variant of Staf's FlexLib
i was therefore able to "prepare" work for Jean-Paul, via the parallel track, commit it to the PUBLIC REPOSITORY (the one that's open, that our resident idiot didn't bother to check existed or even ask where it is), which saved Jean-Paul time whilst he focussed on fixing issues in coriolis2.
it was a LOT of work.
on top of that, because it's an entirely separate processor, to get it to do anything you actually have to have a Remote Procedure Call system, operating over Shared Memory!
oink.
so the process for running a GPU shader binary is as follows:
step 1: fire up a compiler (in userspace)

step 2: compiler takes the shader IR and turns it into GPU assembler

step 3: the userspace program (game, blender, whatever) triggers the linux kernel (or windows kernel) to upload that GPU binary to the GPU

step 4: the kernel copies that GPU binary over Shared Memory Bus (usually PCIe)

step 5: now we unwind back to userspace (with a context-switch) and want to actually run something (OpenGL call)

step 6: the OpenGL call (or Vulkan) gets some function call parameters and some data

step 7: the userspace library (MESA) "packs" (marshalls) those function call parameters into serialised data

step 8: the userspace library triggers the linux (windows) kernel to "upload" the serialised function call parameters - again over Shared Memory Bus

step 9: the kernel waits for that to happen

step 10: the userspace proceeds (after a context-switch) and waits for notification that the function call has completed...
... i'm not going to bother filling in the rest of the details, you get the general idea that this is completely insane and goes a long way towards explaining why GPU Cards are so expensive and why it takes YEARS to reverse-engineer GPU drivers.
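the "marshalling" in step 7 can be sketched in a few lines. this is a hypothetical minimal wire format for illustration only - it is emphatically NOT MESA's actual protocol, and the opcode number is made up:

```python
import struct

# hypothetical opcode number for a draw call - not MESA's real wire format
OP_DRAW_ARRAYS = 0x42

def marshal_call(opcode, *args):
    """Pack a function call into bytes for the shared-memory buffer (step 7):
    little-endian opcode, argument count, then the 32-bit arguments."""
    return struct.pack(f"<II{len(args)}i", opcode, len(args), *args)

def unmarshal_call(data):
    """The GPU-side firmware unpacks the same buffer before dispatching."""
    opcode, argc = struct.unpack_from("<II", data)
    args = struct.unpack_from(f"<{argc}i", data, 8)
    return opcode, list(args)
```

every single function call pays this pack/copy/unpack round-trip (plus the kernel context-switches around it) - which is exactly the overhead the hybrid architecture below avoids.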
in the Libre-SOC architecture - which is termed a "Hybrid" one, the following happens:
step 1: the compiler is fired up (in userspace, just like above)

step 2: compiler takes the shader IR and turns it into *NATIVE* (Power ISA with Cray-style Vectors and some custom opcodes) assembler

step 3: userspace program JIT EXECUTES THAT BINARY NATIVELY RIGHT THERE RIGHT THEN
done.
did you see any kernel context-switches in that simple 3-step process? that's because there aren't any needed.
now, the thing is - answering your question a bit more - that "just having vector capabilities" is nowhere near enough. the lesson has been learned from Nyuzi, Larrabee, and others: if you simply create a high-performance general-purpose Vector ISA, you have successfully created something that absolutely sucks at GPU workloads: about TWENTY FIVE PERCENT (one quarter) of the capability of a modern GPU for the same power consumption.
therefore, you need to add SIN, COS, ATAN2, LOG2, and other opcodes, but you need to add them with "reduced accuracy" (like, only 12 bit or so) because that's all that's needed for 3D.
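to show what "reduced accuracy" buys you: a degree-7 polynomial is already good to roughly 12 bits on the first quadrant, where a full IEEE754 sin needs far more work. this is an illustration of the accuracy trade-off, not the actual Libre-SOC opcode implementation:

```python
def sin_12bit(x):
    """Degree-7 Taylor polynomial for sin(x) on [0, pi/2], in Horner form.

    Worst-case error is about 1.6e-4, i.e. roughly 12 good bits -
    plenty for 3D shading, and far cheaper than full-precision sin.
    """
    x2 = x * x
    return x * (1 - x2 / 6 * (1 - x2 / 20 * (1 - x2 / 42)))
```

in hardware the equivalent would be a handful of multiply-adds (or a small table plus interpolation), versus the large range-reduction-plus-polynomial pipeline a full-accuracy sin requires.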
you need to add Texture caches, and Texture interpolation opcodes (takes 4 pixels @ 00 01 10 11 square coordinates, plus two FP XY numbers between 0.0 and 1.0, and interpolates the pixels in 2D).
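what that interpolation opcode computes, per colour channel, is a standard bilinear blend (the encoding of the real opcode is not shown here; this is just the arithmetic):

```python
def texture_interp(p00, p01, p10, p11, x, y):
    """Bilinear blend of four corner pixels at (0,0) (0,1) (1,0) (1,1).

    x and y are the fractional FP coordinates in [0.0, 1.0]; a texture
    opcode does this blend per colour channel in one instruction.
    """
    top = p00 * (1.0 - x) + p10 * x      # interpolate along x at y = 0
    bottom = p01 * (1.0 - x) + p11 * x   # interpolate along x at y = 1
    return top * (1.0 - y) + bottom * y  # then blend the two rows along y
```

doing this with general-purpose scalar instructions costs around eight multiplies and a load of adds per channel per pixel, which is exactly why GPUs fold it into one opcode backed by a Texture cache.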
you need to add YUV2RGB and other pixel-format-conversion opcodes that are in the Vulkan Specification...
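for reference, here is the standard BT.601 full-range YUV-to-RGB arithmetic such an opcode would implement (one common variant - Vulkan actually specifies several Y'CbCr models and ranges):

```python
def yuv2rgb(y, u, v):
    """BT.601 full-range YUV -> RGB: the per-pixel conversion a single
    pixel-format opcode would perform on an entire vector of pixels."""
    d = u - 128  # centre the chroma channels
    e = v - 128
    r = y + 1.402 * e
    g = y - 0.344136 * d - 0.714136 * e
    b = y + 1.772 * d
    # clamp each channel back into the valid 8-bit range
    return tuple(max(0, min(255, round(ch))) for ch in (r, g, b))
```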
and many more.
but, we first had to actually, like, y'know, have a core that can actually execute instructions at all? :) and that's what this first Test ASIC is: a first step.
https://git.libre-soc.org/?p=openpower-isa.git;a=tree;f=src/...
i'm currently in the middle of a rabbit-hole exploration of being able to do in-place RADIX-2 FFT, DCT and DFT butterflies, the target is a general purpose function to cover each of those, in around 25 Vector instructions.
not 2,000 optimised loop-unrolled instructions specifically crafted for RADIX-8, another for RADIX-16, another for RADIX-32 ..... RADIX-4096 (as is the case in ffmpeg): 25 instructions FOR ANY 2^N FFT.
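for anyone unfamiliar with what's being vectorised: here's the classic iterative in-place radix-2 butterfly loop, in Python for readability (this is the textbook Cooley-Tukey algorithm, not the SVP64 assembler itself - the point is that SVP64 aims to express this whole routine, for ANY power-of-two size, in roughly 25 vector instructions):

```python
import cmath

def fft_inplace(a):
    """Iterative in-place radix-2 DIT FFT; len(a) must be a power of two.

    The inner loop is the "butterfly": u+v and u-v with a twiddle factor,
    written back over the inputs, so no scratch buffer is needed.
    """
    n = len(a)
    # bit-reversal permutation so the butterflies can work in place
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    # butterfly stages: span 2, 4, 8, ... n
    length = 2
    while length <= n:
        w = cmath.exp(-2j * cmath.pi / length)  # twiddle for this stage
        for start in range(0, n, length):
            wn = 1.0
            for k in range(length // 2):
                u = a[start + k]
                v = a[start + k + length // 2] * wn
                a[start + k] = u + v                 # butterfly, in place
                a[start + k + length // 2] = u - v
                wn *= w
        length <<= 1
```

the three nested loops (stage, block, butterfly) are what the hand-unrolled per-radix assembler in ffmpeg flattens out; a general vector ISA with the right loop primitives keeps them as loops.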
btw if you're interested in "real-world" SVP64 Vector Assembler we have the beginnings of an ffmpeg MP3 CODEC inner loop:
https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=medi...
that's under 100 instructions, more than 4x less assembler for the same job in PPC64. and 6.5 times less assembler than ffmpeg's optimised x86 apply_window_float.S
you will no doubt be aware of the huge power savings that brings due to reduced L1 cache usage.
it's 64-bit, LE/BE, and it's implementing a "Finite State Machine" (similar technique to picorv32, if you know that design). this because we wanted to keep it REALLY basic, and also very clear as a Reference Design, none of the "optimised pipelined decoders and issuers" that you normally find, which make it really, really difficult to see what the hell is going on.
bear in mind this includes SVP64: https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/simple...
if you go back several revisions, the non-Vectorised version is like... 400 lines?
* In a few years (maybe 5?), it might be possible to build a computer that you can trust has no intentional back doors in the CPU, but is modern enough to run software from within the last decade.
* If this catches on, and is used by enough people, economies of scale might kick in, and bring costs for advanced custom chips down by an order of magnitude (if the cpu is small enough, and if more fab capacity is built). Not Intel/AMD/ARM parts - those prices will remain stable, at first.
* Maybe we can have another decent consumer-grade router? No, this is a pipe-dream.
* Our Amiga accelerator boards will become SMOKING fast.