Future of parallel compute? net mb blade cell stream GPU CPU

Postby Ben.F.Rayfield » Tue Apr 28, 2015 3:09 am

A complete redesign of networks, motherboards, and chips is coming, one that will dissolve our concepts of these separate devices and layers into a continuous global computer, or at least into one continuous computer within each server room or device, so your games run faster.

Computers today are extremely bottlenecked by one big set of wires, the bus, which runs between all the major devices, especially the computing chips and the memory.
http://en.wikipedia.org/wiki/Address_bus

Only two devices can use the bus at a time, sending bits to each other. A bus is normally 32 or 64 wires wide, which is how many bits it can transfer per cycle of the motherboard clock (in many designs a transfer takes a few cycles). On a 1 GHz computer, the clock ticks 1 billion times per second.
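As a back-of-envelope sketch of that limit, here is the idealized width-times-clock model described above (real buses differ; the numbers are illustrative, not from any datasheet):

```python
# Rough theoretical peak for a simple shared bus, under the idealized
# model above: width in bits x clock rate, divided by cycles per transfer.

def bus_peak_bytes_per_sec(width_bits: int, clock_hz: float,
                           cycles_per_transfer: int = 1) -> float:
    """Peak bytes/second a shared bus can move between two devices."""
    return (width_bits / 8) * clock_hz / cycles_per_transfer

# A 64-bit bus clocked at 1 GHz, one transfer per cycle:
peak = bus_peak_bytes_per_sec(64, 1e9)
print(f"{peak / 1e9:.1f} GB/s")  # 8.0 GB/s
```

Every device on the motherboard has to share that total, which is why the bus dominates everything else in the machine's design.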

Since GPUs do trillions of operations per second, versus a few billion for a CPU at a few GHz, they are extremely bottlenecked in how many bits per second they can exchange with main memory.
http://en.wikipedia.org/wiki/Graphics_processing_unit
http://en.wikipedia.org/wiki/Stream_processing
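To put a rough number on that compute-vs-memory gap, here is a quick check using made-up round figures (not any specific GPU's specs):

```python
# Back-of-envelope check of the compute-vs-memory gap described above.
# Both numbers are hypothetical round figures for illustration.

flops_per_sec = 4e12      # 4 TFLOP/s of compute
mem_bytes_per_sec = 2e11  # 200 GB/s of memory bandwidth

# Operations the chip must perform per byte fetched just to stay busy:
flops_per_byte = flops_per_sec / mem_bytes_per_sec
print(flops_per_byte)  # 20.0

# Any kernel doing fewer than ~20 operations per byte of data it touches
# is limited by the memory link, not by the compute units.
```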

Any bottlenecked device can run local calculations much faster by feeding its outputs back into its inputs, usually through locally cached memory, but this is useful only in limited ways, since it prevents the calculations from talking to each other except rarely. Any large structure in memory that many of them need amplifies the bottleneck, since they can't all cache it completely.
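A sketch of why local reuse helps, using the classic example of blocking a matrix multiply into tiles that fit in a local cache (an idealized counting model, not a measurement):

```python
# Estimated main-memory traffic for an n x n matrix multiply,
# naive vs. blocked into b x b tiles held in a local cache.

def naive_traffic(n: int) -> int:
    # Each of the n*n output elements streams a row of A and a column
    # of B from memory: roughly 2*n loads per output element.
    return 2 * n * n * n

def tiled_traffic(n: int, b: int) -> int:
    # With b x b tiles cached locally, each pair of tiles is loaded once
    # per tile-level product: (n/b)^3 products, 2*b*b loads each.
    blocks = n // b
    return 2 * blocks**3 * b * b

n, b = 1024, 64
print(naive_traffic(n) / tiled_traffic(n, b))  # 64.0
```

Tiling cuts memory traffic by a factor of the tile size b, which is exactly the "feed outputs back into inputs locally" trick, and exactly why it only works when the structure fits in the cache.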

This is the motivation for a complete redesign of networks, motherboards, and chips specialized in parallel computing, a paradigm shift making huge waves in the computing industry.

Cell processors are the most parallel. Each is a computing chip, a little memory, and a network router all in one, and they compute as they route bits to each other. But routing in arbitrarily shaped networks, which are called mesh networks, is not easy.
http://en.wikipedia.org/wiki/Cell_%28microprocessor%29
http://en.wikipedia.org/wiki/Mesh_networking
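On a regular 2D grid of cores, like the Epiphany chips on the Parallella, routing gets much easier than in an arbitrary mesh. Here is a minimal sketch of the textbook dimension-ordered (X-then-Y) scheme, shown for illustration, not the exact hardware algorithm:

```python
# Deterministic X-then-Y routing on a 2D mesh of cores: move along the
# x axis until the column matches, then along the y axis.

def xy_route(src: tuple, dst: tuple) -> list:
    """Return the list of (x, y) hops from src to dst."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

print(xy_route((0, 0), (2, 1)))
# [(0, 0), (1, 0), (2, 0), (2, 1)]
```

The appeal is that every router can decide the next hop from the destination address alone, with no global routing tables, which is what makes grids so much friendlier than randomly shaped meshes.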

Cell processors:
http://en.wikipedia.org/wiki/Adapteva#P ... la_project Parallella open-sourced some motherboard designs, which include proprietary Parallella chips specialized in 32-bit float math and in routing bits between each other.
http://electronics.howstuffworks.com/pl ... three1.htm The PlayStation 3 used a small network of cell processors.

The level below cell processors is blade servers, thin motherboards that plug into larger grandmotherboards. They're normally used in rooms full of stacked blocks of these blades, and the blocks send bits to each other like separate computers, through the network.
http://en.wikipedia.org/wiki/Blade_server

USB is Universal Serial Bus: serial, the opposite of parallel. That's why it's only for hooking small networks together; the devices have to take turns on the wire.
http://en.wikipedia.org/wiki/USB

A huge change in the computing industry is in progress, and we're far enough into it to know it's here to stay. Sequential computing will always be available, but it is expected to advance much more slowly than parallel systems.

Many years ago, programmers had to spend lots of time rebuilding their programs in faster ways. Over the years, automated ways of optimizing were found, which take a description of what we want a computer to do and find a fast way to do it automatically. That doesn't exist yet for parallel processes, except maybe a little in Hadoop and big-data tech, but it's just getting started. We are moving toward a new kind of parallel system that should be fully backward compatible, but for now it isn't, because the optimizing processes haven't been invented yet, so we have to manually rewrite sequential code as parallel code. Compilers were invented for sequential code, and they will be for parallel systems too, taking sequential code as a statement of what we want done and finding a way to run the bottlenecks in parallel automatically.
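A small illustration of the manual rewriting described above, using Python's standard multiprocessing pool (the workload here is made up, a stand-in for any expensive, independent computation):

```python
# Rewriting a sequential loop as a parallel map by hand, the kind of
# transformation a future parallel compiler would do automatically.

from multiprocessing import Pool

def work(n: int) -> int:
    # Stand-in for an expensive, independent computation.
    return sum(i * i for i in range(n))

inputs = [10_000] * 8

# Sequential version: one item at a time.
sequential = [work(n) for n in inputs]

# Parallel version: same results, items spread across worker processes.
if __name__ == "__main__":
    with Pool() as pool:
        parallel = pool.map(work, inputs)
    print(parallel == sequential)  # True
```

The rewrite is only this easy because the items are independent; when calculations need to talk to each other, that is where the hard, not-yet-automated part begins.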

What will this merging of networks, motherboards, blades, cell, stream, GPU, and CPU look like? I would be interested in exploring such designs in open-source prototypes and/or business models, with anyone who takes this paradigm shift seriously and wants to advance and simplify the cutting edge.
Ben.F.Rayfield
 
Posts: 2
Joined: Tue Apr 28, 2015 3:05 am

Re: Future of parallel compute? net mb blade cell stream GPU

Postby piotr5 » Mon May 11, 2015 9:46 am

Does that mean we can write a PlayStation emulator on the Epiphany? :D
But seriously: (older) AMD CPUs actually are cell processors too; they have a cache which avoids this bottleneck if your output is small enough. As far as I know, on (older) Intel processors there is a bottleneck between the cache and the computing cores. The real problem with designing new motherboards is, according to my own very limited knowledge, that you cannot scale up the bus! No matter how you connect the bus to the chip, you'll always have limited bandwidth, and eventually the processor will have too many cores for that connection. The only solution is to integrate memory into the CPU chip. But memory isn't the only bottleneck: if there were an Epiphany with thousands of cores, they would share only 4 eLinks connecting them with other Epiphany chips. At that point some more efficient bus needs to be found, with fewer pins...
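The scaling problem here can be sketched in a few lines: a fixed off-chip link shared by a growing number of cores (the link bandwidth is a made-up round number, not a real eLink spec):

```python
# Per-core share of a fixed off-chip connection as core counts grow.
# 4 shared links at a hypothetical ~1 GB/s each.

link_bytes_per_sec = 4 * 1e9

for cores in (16, 64, 1024, 4096):
    per_core = link_bytes_per_sec / cores
    print(f"{cores:5d} cores -> {per_core / 1e6:8.2f} MB/s per core")
```

However fast the cores get, the fixed pin budget means each one's share of the outside world shrinks linearly with the core count.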

As for USB or whatever connection to your discs, blade computers indeed have an advantage here. But interestingly, there was a time when on-chip space (a few MB) was as big as or bigger than the storage devices in use (floppy discs with 1.4 MB) -- I guess those times will come again with the advent of 3D memory. In the meantime, there is a blade-computer solution using the Raspberry Pi, so why not for the Parallella?
piotr5
 
Posts: 230
Joined: Sun Dec 23, 2012 2:48 pm

