Tuesday, 3 February 2015

Raspberry Pi 2 - Collated FAQs


What’s the new CPU and how does it compare?

It’s a BCM2836, which is a quad-core Cortex-A7 with a Videocore4 GPU. It is almost identical to the BCM2835 in all respects, but with four A7 cores instead of the single-core ARM11. The one other main difference is the memory caches. Each core on the 2836 has a 32KB L1 instruction cache and a 32KB L1 data cache; the 2835 caches were only 16KB each. In addition, the 2836 has a 512KB L2 cache specifically for the ARM cores, whereas on the 2835 the L2 was shared with the VC4. An exclusive-use Videocore L2 cache is retained on the 2836.

What clock speed does it run at, and can I overclock?

By default the clock speed is 900MHz. There is some headroom available for overclocking, and also overvolting, although you risk the warranty with large overvolts. A 1000MHz overclock option is available in raspi-config.

What’s this about more memory?

The new Pi has 1GB of RAM, so double that of the Mk1. This is an upper limit - the 2836 cannot address more than 1GB, so there will not be a memory upgrade.

How much faster is it?

The new A7 cores, with their ARMv7 instruction set, are more efficient than the ARM11, so at the same clock speed a single core will be 1.6-2.0 times faster. Add on the extra cores, and with a multithreaded application you should get as much as a 6 times speed increase. This isn’t the only story though. Even a single threaded application will run much faster because a lot of background tasks can be moved to another core. For example, when running under X Windows (the desktop), a lot of the graphics processing will run on another core, and a lot of USB and SD card processing will move to another core, so a single threaded app will generally have a whole core to itself.
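As a rough way to see how the per-core and multi-core figures combine, here is a minimal Python sketch. It uses an Amdahl-style estimate, which is an assumption of this example and not a model taken from the Foundation's figures. The function name and the `parallel_fraction` parameter are hypothetical, and the estimate ignores clock speed differences, memory bandwidth and cache effects.

```python
def estimated_speedup(per_core_gain, cores, parallel_fraction):
    """Estimate speedup over one ARM11 core at the same clock speed.

    per_core_gain     -- how much faster one new core is (1.6 to 2.0 above)
    cores             -- number of cores the application can use
    parallel_fraction -- share of the work that can be spread across cores
                         (0.0 = fully serial, 1.0 = perfectly parallel)
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    # Amdahl's law: only the parallel share is divided between the cores.
    return per_core_gain / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    for fraction in (0.0, 0.5, 0.9, 1.0):
        low = estimated_speedup(1.6, 4, fraction)
        high = estimated_speedup(2.0, 4, fraction)
        print(f"parallel fraction {fraction:.1f}: {low:.1f}x to {high:.1f}x")
```

With a perfectly parallel workload the lower per-core figure gives 1.6 x 4 = 6.4, which is in line with the "as much as 6 times" quoted above; real applications will usually sit somewhere below that.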

Do I need to rebuild to get this extra performance?

Whilst applications built for the B+ will run faster on the Pi2 for the reasons explained above, rebuilding with the correct compiler options to target the specific processor architecture can find extra speed, and quite a lot of it if the applications can use NEON! Note that you can run either type of build on the Pi2 at any time, but A7/NEON optimised code will not run on Pi1s.
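To make the "correct compiler options" a little more concrete, here is a small Python sketch that picks a set of GCC flags depending on whether the CPU reports NEON. It assumes the kernel lists "neon" on the Features line of /proc/cpuinfo, and the flag sets shown are common choices for a hard-float Raspbian build rather than anything prescribed in this FAQ. The function names are made up for illustration.

```python
def has_neon(cpuinfo_text):
    """Return True if a 'Features' line in /proc/cpuinfo text lists 'neon'."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "features" and "neon" in value.split():
            return True
    return False


def gcc_flags(cpuinfo_text):
    """Suggest GCC target flags for the CPU described by cpuinfo_text."""
    if has_neon(cpuinfo_text):
        # Cortex-A7 with NEON (Pi2). Code built this way will not run on a Pi1.
        return ["-mcpu=cortex-a7", "-mfpu=neon-vfpv4", "-mfloat-abi=hard"]
    # ARM1176 without NEON (Pi1). Code built this way runs on both models.
    return ["-mcpu=arm1176jzf-s", "-mfpu=vfp", "-mfloat-abi=hard"]


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(" ".join(gcc_flags(f.read())))
    except OSError:
        print("/proc/cpuinfo not available on this system")
```

The flags could then be passed to the compiler by hand or through CFLAGS in a build script. Keeping the Pi1 set as the fallback means a wrong guess still produces a binary that runs on every model.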

What’s this about NEON?

NEON is an ARM processor extension to allow vector operations on data. The 2836 does have single issue NEON extensions built in and these can provide a huge increase in performance for some tasks (often graphical). For example, the x264 encoder library showed up to a 30x increase in performance when built with Cortex-A7 and NEON optimisations.

What about the USB and ethernet?

The USB and ethernet connections are the same as the B+ (2835). This means a maximum of 100Mbits/s for the ethernet, and USB2.0. However, since the USB can be constrained by the speed of the CPU, the added processing capacity means the work required to run the USB will be a lower proportion of total CPU time, so higher network and USB speeds are expected. For example, a USB SSD goes from about 27MB/s on Pi1 to 31MB/s on Pi2.

How big an SD card can I use?

The SD card interface is the same as on the B+, so the limit is the same too. Cards up to 64GB have worked successfully.

Do I still need CODEC licences?

Yes, if you want to use the HW decoders. However, the higher speed of the device MIGHT mean a SW decoder can be used. This will depend on the resolution of the video you are trying to display.

What about H265 (HEVC)?

H265 is currently being tested, but this would be entirely SW decoded, as there is no HW support, so it will not be capable of much over 720p25, if that, without extensive optimisation that will take some time to do.

What about the compute module and Model A?

The CM is under development, but there is no release date as yet. Model A will follow after the CM.

Can I use an SD card from my Pi1?

Yes, but ensure your SD card is fully up to date, as the Pi2 needs a specific kernel build that is only available on the latest releases. To update, run: sudo apt-get update && sudo apt-get upgrade

Is it really 100% backwards compatible?

Sort of! The only area that may cause problems is that the memory mapped registers are now in a different place, so code that accesses these registers will need to have a new base address set. This address has been made available using a call at run time, so code can be modified to read this address and use it rather than having hardwired numbers. It’s a simple change that ensures code runs on any model of Pi.
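As an illustration of reading the base address at run time rather than hardwiring it, here is a minimal Python sketch. It assumes the kernel exposes the device tree at /proc/device-tree/soc/ranges, with the ARM physical peripheral base stored as a big-endian 32-bit value at byte offset 4, and it falls back to the Pi1 address if that file cannot be read. This is one way of doing it and not necessarily the call referred to above; C code would more likely use bcm_host_get_peripheral_address() from the firmware's userland library. The Python function names here are made up for the example.

```python
import struct

PI1_PERIPHERAL_BASE = 0x20000000  # BCM2835 address, used as the fallback


def parse_peripheral_base(ranges):
    """Extract the peripheral base address from device-tree 'ranges' bytes.

    Assumed layout: <bus address> <ARM physical address> <size>, each a
    big-endian 32-bit cell, so the base sits at byte offset 4.
    """
    if len(ranges) < 8:
        raise ValueError("ranges property is too short")
    (base,) = struct.unpack(">I", ranges[4:8])
    return base


def peripheral_base(path="/proc/device-tree/soc/ranges",
                    default=PI1_PERIPHERAL_BASE):
    """Return the peripheral base address, or default if it cannot be read."""
    try:
        with open(path, "rb") as f:
            return parse_peripheral_base(f.read())
    except (OSError, ValueError):
        return default


if __name__ == "__main__":
    print(hex(peripheral_base()))
```

Register addresses can then be computed as offsets from the returned value, so the same code works whether the peripherals sit at the Pi1 or the Pi2 location.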

What about Android?

Whilst the Foundation has no plans to port Android themselves, the new processor may mean Android will work a bit better than before, but there is still work to be done to use the GPU.

And what on earth is Windows 10?

Microsoft have been busy porting a version of Windows 10 to the Pi2. This is the Windows On Devices version, for Internet of Things applications. It has no desktop, and a normal Windows 10 version will not install on the Raspberry Pi. In no way is this a replacement for the existing Linux OS; it’s an additional OS that people may wish to use. Note that the Foundation have not been sponsored in any way on this; it's all from Microsoft, with infrequent tech support from the RPF.

What’s going to happen to the older models?

These will continue to be built as long as there is demand for them. Software will continue to be produced and optimised for the 4.5M Raspberry Pis already out there.

Is Power over Ethernet/Wake on LAN supported?

No. Everything else is the same as the B+.

So, why didn't the RPF add Gigabit ethernet or SATA?

That would require a new chip with both of the features built in to the hardware in order to take advantage of the speed. It wasn't possible to do that on the BCM2836 in the timescale. It's a LOT of work in HW design, in implementation, and in software to support it. The only other option would have been to use a chip with it already on, but there are no appropriate chips in the Videocore range that have them. So a new chip would mean a complete break with the Pi1 architecture, destroying any backward compatibility - no camera for example.

Monday, 2 February 2015

Raspberry Pi 2 B - First Impressions

The recently announced Raspberry Pi 2, at first look, appears exactly the same as its B+ predecessor. All the sockets and LEDs appear in the same place, which means most B+ cases and third party add-ons should work fine. (One exception is the Pibow Coupe, which is being modified as we speak to ensure it works on the 2B.) Only when you look a little closer do you see the Broadcom logo on top of the slightly larger CPU, and realise that the device no longer uses a PoP memory chip. Look underneath, and the expanded 1GB memory chip comes into view.

The new SoC, the BCM2836, is a souped-up version of the BCM2835 chip used on the original Pi. It uses the same GPU (the Videocore4), but instead of the single ARMv6 core, now has four A7 cores with NEON instruction support. The cores are the recent P5 release, so very up to date. This, combined with the higher memory capacity (upped from 512MB to 1GB), means a lot more capability. The clock speed of the device is set by default to 900MHz, which is a bit quicker than the 2835. However, the added performance of the A7 cores means even single threaded applications will run faster on the new chip. Multithreaded applications that take advantage of the 4 cores should see up to a 6x improvement in pure speed. At this stage, the overclocking capabilities of the new CPU are relatively untested, but 1000MHz is certainly available, and I suspect that if you are willing to risk damage through overvolting, similar overclocks to the 2835 will be possible.

That’s not all though, there have also been improvements to the processor’s caches. The BCM2835 has a 16K instruction and a 16K data Level 1 cache to help with memory accesses, but the BCM2836 now has 32KB L1 data and 32KB L1 instruction caches per core and, in addition, a 512K Level 2 cache used by all 4 cores, which is for the ARM only and is no longer shared with the GPU. So that is a total of 768K of cache (4 x 64K of L1 plus 512K of L2) dedicated to the ARMs, compared to 32K. This should make a huge difference to the average memory access speed.

It’s worth comparing the A7 cores with the A5 cores used on some other SBCs. The A5 is a dramatically cut down version of the A7. It uses the same ARMv7 instruction set but has a much smaller silicon area and has a lot of performance features removed, resulting in capabilities not much better than an ARMv6 device. A5 cores are capable of higher clock speeds, but this benefit is offset by the poor instruction execution speed. So an A7 usually outperforms an A5, even one running at a higher clock speed.

The processor and memory change is the big headline. All other peripherals remain the same, which means the same GPIOs, the same camera, same SD card, same interoperability with no changes required to the software (for 99% of applications). Given the educational nature of the device, this backwards compatibility is essential to ensure a smooth transition for schools and other learning establishments.

Everything that works on the current Pi will (or should!) work on the new one, because the A7 is backwards compatible with the v6. Almost all the current software packages available in Raspbian should work fine on the new Pi. The kernel has been recompiled with all the new features required for the A7s (multi core etc) but appears to the end user in the same way. At the moment, there is no intention by the Foundation to move to a specific ARMv7 repository as this would mean maintaining two huge repos, so real performance users will want to recompile their applications and libraries with the A7 as the target - this should give a small amount of extra speed. It should be possible to use something like the Ubuntu ARMv7 repo to download precompiled packages, but I have not attempted to do this.

So, what difference does the new processor make? Well, first impressions are very favourable indeed. During boot, instead of the single Raspberry Pi logo appearing at the top of the console, you now have 4, one for each core. Boot time of Raspbian is about the same, perhaps a little quicker than previously. Response on the console is quicker and programs start up with much less delay than before. Running the standard LXDE as supplied shows it is immediately snappier to use. Applications again start up faster, and running more than one application at the same time doesn’t slow the device down; the faster multiple cores and extra memory make a huge difference here. Web browsing is considerably faster, especially with the optimised browser, and Scratch shows definite improvements in speed. Keeping an eye on the CPU meter shows a much lower CPU usage overall than on the original device. Compiling the Chapel compiler (C++) took just under 10 minutes on the 2B, but well over an hour on the B+, so the extra horsepower and double the memory (and I think the memory is what makes the real difference) make compiling a pleasure rather than a chore.

A quick camera test using the Raspberry Pi camera with the standard demo apps, as well as some gstreamer pipelines, showed this works fine. The extra faster cores should mean faster performance in image processing with the right software. We also tried a software only H264 encode using the x264 library (compiled for ARMv6 so not completely optimised with A7 instructions and NEON etc), and managed over 20fps at VGA resolution, which is pretty impressive.

Anecdotal tests of Sonic Pi done by the Foundation show CPU usage dropping from 80% to 3% (yes, 3%). Another impressive speed improvement.

In precis, this is a great upgrade. A quad core device, with decent graphics and multimedia capabilities, completely backwards compatible with the 4M Raspberry Pis already out there. Is this now a desktop replacement? Well, for some, yes, I think it is. For more power hungry users, perhaps not, but for $35, there is nothing to touch the Raspberry Pi 2 Model B, and I suspect an awful lot more Raspberries are about to be sold!