The SKA - How to Process a Universe of Data
Nick Gilbert
at 12:49 PM 20 Feb 2012
Comments 2
Artist's impression of the Australian SKA Pathfinder
IMAGE BY Swinburne University/CSIRO

There's been a lot of talk over the last week about the Square Kilometre Array, with the final showdown in the bid to host the primary array now down to just Australia and South Africa. A team involving a number of scientists in Western Australia, however, is already in the throes of planning how to deal with the mass of data the SKA will generate, possibly reaching up to an exabyte a day through the front-end antennas, with the final array likely requiring a supercomputer faster than any currently in existence.
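To get a feel for what "an exabyte a day" means as a sustained data rate, here's a back-of-envelope conversion. The exabyte-per-day figure comes from the article; everything else is plain unit arithmetic, not an SKA specification.

```python
# Illustrative arithmetic only: convert the article's ~1 exabyte/day figure
# into a sustained per-second rate.
EXABYTE = 10**18  # bytes (decimal convention)

bytes_per_day = 1 * EXABYTE
seconds_per_day = 24 * 60 * 60
bytes_per_second = bytes_per_day / seconds_per_day

# Express as terabytes per second for a sense of scale.
tb_per_second = bytes_per_second / 10**12
print(f"{tb_per_second:.1f} TB/s sustained")  # roughly 11.6 TB/s
```

That works out to around 11.6 TB every second, around the clock, which is why so little of it can be kept.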

We spoke to Professor Andreas Wicenec, Head of Computing at the International Centre for Radio Astronomy Research, one of two bodies developing the project. He said that the rollout of the Array infrastructure will happen in a number of stages, with each stage requiring increasingly complex systems to deal with the vast flow of data.

“The third stage is the central processing,” he told PopSci.com.au. “This will turn the complex numbers [these come from the previous two stages] into 3-dimensional image cubes and other data products. It will also perform the required calibrations and perform source-finding, that means it will run one or more detection algorithms on the data in order to find the actual astronomical objects.

“For the full SKA this will potentially require one of the fastest supercomputers at the time when it goes operational. It definitely will require exascale computing.”

This vast amount of data, however, will not be stored locally for any length of time. As Professor Wicenec says, “the data rate does not allow this”. Instead, the data will be moved around the world to various data centres in near real-time, each of these also requiring supercomputers just to handle the massive volume of data. Each of these centres, says Professor Wicenec, will be built and maintained using funding from other regional institutions, along the lines of CERN's approach to handling data from the Large Hadron Collider.

Apart from the computing requirements, though, is the simple issue of power. Supercomputers, sadly, don't run on joy and goodwill, instead sucking down vast amounts of electrical energy. Professor Wicenec told us that, at this point in time, no computing system exists that is power-efficient enough to run the SKA's processing at an affordable cost.

“Currently one of the major challenges is the power consumption of the overall system, but in particular the computing system,” he said. “Essentially we would not be able to pay the power bill, if we would extrapolate from currently available computers, even if we would take technological progress comparable to the last years into account. We will need to improve the power consumption by at least a factor of 10 to 100.”
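To see why the power bill dominates the conversation, here's a rough extrapolation. The efficiency and tariff figures below are our own illustrative assumptions (roughly the best energy efficiency of supercomputers around 2012, and a nominal industrial electricity price), not numbers from the professor.

```python
# Illustrative only: the efficiency and tariff figures are assumptions made
# for the sake of the arithmetic, not figures from the article.
exaflops_target = 1e18          # FLOP/s for an exascale machine
flops_per_watt_2012 = 2e9       # ~2 GFLOPS/W, roughly the best ~2012 systems

power_watts = exaflops_target / flops_per_watt_2012   # 5e8 W = 500 MW
power_mw = power_watts / 1e6

# Annual energy cost at an assumed $0.10 per kWh industrial tariff.
kwh_per_year = power_watts / 1000 * 24 * 365
annual_bill = kwh_per_year * 0.10
print(f"{power_mw:.0f} MW, ~${annual_bill / 1e6:.0f}M per year")
```

Under those assumptions an exascale machine draws about 500 MW, a power-station-sized load, and the factor of 10 to 100 improvement the professor mentions is what would bring it down to something an observatory could actually pay for.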

At this stage, it might sound so complicated that actually doing anything with the data seems almost impossible. That's almost true. Professor Wicenec also told us that a lot of compromises will have to be made in terms of what data to cut out of the system as soon as possible, because in some cases it might actually be easier to simply turn the SKA back towards the same sliver of sky for a second look, instead of scrambling to find disk space to store the information the first time around.

“We will have to compromise on what to keep, since we will simply not be able to keep all the data for a long time. This will change with time as well and we may be able to keep more after a few years of operations. Apart from transient and variable sources it might be cheaper and more effective to simply re-observe the sky.”

The thing that boggles our minds, in the end, is the simple fact that supercomputers around the world will be joined together in a massive, distributed network, all working to keep a lid on the vast flow of information coming in from this single array of telescopes. The scale of the project might prove daunting for some, but for scientists in this kind of field, it's old hat.

“I had been working for the European Southern Observatory (ESO) for 13 years and our observatories had all been built in Chile, not in Europe, mostly based on a purely scientific site selection. For me and quite a number of my colleagues it would be normal to work in such a situation.”

So what, then, is his preference for the SKA site? South Africa, or Australia?

“Foremost we are scientists and we would like to build and operate this magnificent observatory on the best possible continent,” he said, before adding, “Obviously we all believe that we DO have the best site on this planet in WA! So, let's keep our fingers crossed.”

ICRAR entered into an agreement last week with DataDirect Networks to develop one part of the system.

Read on after the break for a more complete run-through from the Professor on each stage of the SKA data processing pathway.

[via UWA press release]

---------------------------------------------------------------------

"The processing will be done in several stages and will dramatically reduce the data volume along that path. The first stage right after the antennas and depending a bit on the antenna technologies, will be a so-called beam-former. It is used to electronically phase the signal of a number of receiver elements into a single beam pointing to a certain direction of the sky. Since this combines many individual signals into a single one, this also means that the data rate is reduced by the same factor after the beam-former. In general there will be one such beam-former per antenna or aperture array and they will be located very close to the antenna or even inside the antenna itself, also because of the very high data rate. The beam-formers might be implemented using a variety of technologies ranging from ASICs and FPGAs to GPUs. The decision about what's going to be used will be part of the initial design and architecture phase starting this year.
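The phasing idea the professor describes can be sketched in a few lines. This is a hypothetical toy: a narrowband delay-and-sum beam-former for a 1-D line of receiver elements, with made-up element counts and signal parameters. Real SKA beam-formers run in dedicated hardware at vastly higher rates; the point here is only how applying per-element phase weights collapses many element streams into one beam stream.

```python
import numpy as np

# Toy narrowband delay-and-sum beam-former. All parameters are illustrative.
rng = np.random.default_rng(0)

n_elements = 16
spacing = 0.5          # element spacing, in wavelengths
steer_deg = 20.0       # direction we want the beam to point

# Simulate a signal arriving from the steering direction, plus receiver noise.
n_samples = 1024
signal = np.exp(2j * np.pi * 0.05 * np.arange(n_samples))
geometric_phase = (2j * np.pi * spacing * np.arange(n_elements)
                   * np.sin(np.radians(steer_deg)))
element_data = np.outer(np.exp(geometric_phase), signal)
element_data += 0.1 * (rng.standard_normal(element_data.shape)
                       + 1j * rng.standard_normal(element_data.shape))

# Beam-forming: apply conjugate phase weights and sum. Sixteen element
# streams collapse into one beam stream, cutting the data rate 16-fold.
weights = np.exp(-geometric_phase)
beam = weights @ element_data / n_elements

print(beam.shape)  # a single time series instead of sixteen
```

The data-rate reduction the professor mentions falls straight out of the shapes: sixteen input streams, one output stream.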

"The next stage is the central correlator, which combines the signals of every beam with every other beam, essentially producing a stream of complex numbers. Same story here, it could be implemented using various technologies like the beam-formers. There is also a possible scenario of having correlators for every station separately, but this would require detailed mathematical research about the implications for the final products.
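The "every beam with every other beam" step can also be sketched. Below is a toy "FX"-style correlator, which is our assumption about the general approach (channelise with an FFT, then cross-multiply), not the actual SKA design; antenna counts and block sizes are arbitrary.

```python
import numpy as np

# Toy FX correlator sketch: F step channelises each stream with an FFT,
# X step multiplies every channelised stream by the conjugate of every
# other and averages over time, giving one complex visibility per pair
# of streams per frequency channel.
rng = np.random.default_rng(1)

n_ant, n_samples, n_chan = 4, 4096, 64
voltages = rng.standard_normal((n_ant, n_samples))

# F step: split each stream into blocks and FFT to frequency channels.
blocks = voltages.reshape(n_ant, -1, n_chan)
spectra = np.fft.fft(blocks, axis=2)          # (ant, time-block, channel)

# X step: cross-multiply and time-average -> visibility matrix per channel.
vis = np.einsum('atc,btc->abc', spectra, spectra.conj()) / blocks.shape[1]

print(vis.shape)  # (4, 4, 64): every stream against every other
```

Note that the number of pairs grows with the square of the number of inputs, which is part of why the correlator is such a serious piece of hardware for a full-scale array.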

"The third stage is the central processing. This will turn the complex numbers into 3-dimensional image cubes and other data products. It will also perform the required calibrations and perform source-finding, that means it will run one or more detection algorithms on the data in order to find the actual astronomical objects. For the full SKA this will potentially require one of the fastest supercomputers at the time when it goes operational. It definitely will require exascale computing. The SKA1 requirements [SKA1 is the first phase of the array system, comprising 10% of the final array] are more moderate, but still very challenging.
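The source-finding the professor mentions can be illustrated with a minimal sigma-threshold finder run on a synthetic image cube. Real pipelines use far more sophisticated detection algorithms; every name and number below is made up for the sketch.

```python
import numpy as np

# Minimal illustrative source finder on a synthetic image cube.
rng = np.random.default_rng(2)

cube = rng.standard_normal((8, 128, 128))   # (frequency channel, y, x), noise
cube[3, 40, 60] += 25.0                     # inject one bright "source"

# Flag every voxel more than 5 sigma above a robust noise estimate
# (sigma from the median absolute deviation).
noise = np.median(np.abs(cube)) / 0.6745
detections = np.argwhere(cube > 5 * noise)

for chan, y, x in detections:
    print(f"source candidate at channel {chan}, pixel ({y}, {x})")
```

The interesting part is that the cube itself can then be thrown away: the detections and other derived data products are a tiny fraction of its size, which is exactly the kind of data reduction the whole pipeline is built around.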

"After this step the data will most probably be persisted for the first time. The amount of data at this stage is very much dependent on the actual science project and can vary substantially. Could well be between several 100 TB and several PB/day."

2 COMMENTS
Russ
03 March, 2012, 11:54 PM
You could always ask people to donate time similar to what the SETI organisation did.
Nick Gilbert
05 March, 2012, 12:00 PM
They have talked about it in regards to using the CPU cycles of home computers to do processing, like folding@home, (you can read about it here: http://www.popsci.com.au/science/astronomy/astronomical-studies-tap-into-the-skynet-literally ) but it seems there still needs to be a lot of processing done at the site itself because of the sheer amount of raw data that will come in from the array.
