

'Data Smashing' algorithm might help declutter Big Data noise without Human Intervention


Henry Conrad
October 29, 2014 @ 8:34 am


Humanity is sitting on an immense well of information, and it is growing exponentially. To make sense of all the noise, whether we’re talking about speech recognition, the identification of cosmic bodies or search engine results, we need sophisticated algorithms that save processing power by hitting the bull’s eye, or getting as close to it as possible. In the future, such algorithms will likely be built on machine learning technology that gets smarter with each pass over the data, and they will most likely employ quantum computing as well. Until then, we have to make do with conventional algorithms, and a most exciting paper detailing one such technique was recently published.

Smashing data – the bits and pieces that follow are the most important

[Image: Big Data. Credit: 33rd Square]

Called ‘data smashing’, the algorithm tries to fix one major flaw in today’s information processing. Immense amounts of data are currently being fed into computers, and while algorithms help us declutter it, at the end of the day companies and governments still need experts to oversee the process and add a much-needed human touch. Basically, computers are still pretty bad at recognizing complex patterns. Sure, they’re awesome at crunching numbers, but in the end humans need to compare the scenarios they output and pick the most relevant answer. As more and more processes are monitored and fed into large data sets, however, this task is becoming ever more difficult, and human experts are in short supply.


The algorithm, developed by Hod Lipson, associate professor of mechanical engineering and of computing and information science, and Ishanu Chattopadhyay, a former postdoctoral associate with Lipson who is now at the University of Chicago, is nothing short of brilliant. It works by estimating the similarities between streams of arbitrary data without human intervention, and even without access to the data sources.

Basically, streams of data are ‘smashed’ against one another to tease out unique information by measuring what remains after each ‘collision’. The more information is left standing, the less likely it is that the two streams originated from the same source.
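To make the idea of scoring two raw streams against each other more concrete, here is a minimal Python sketch. It is not the authors' data smashing algorithm. It swaps in a much simpler stand-in technique: comparing how often each pair of consecutive symbols occurs in two streams and taking the total variation distance between those frequencies. The function names, the two-symbol alphabet and the `p_repeat` parameter are all hypothetical and chosen only for illustration. What the sketch shares with the description above is the shape of the result: a single number computed from the raw streams alone, small when they likely come from the same source and large when they do not.

```python
import random
from collections import Counter


def bigram_distribution(stream):
    """Relative frequency of each pair of consecutive symbols in a stream."""
    if len(stream) < 2:
        raise ValueError("stream must contain at least two symbols")
    pairs = Counter(zip(stream, stream[1:]))
    total = sum(pairs.values())
    return {pair: count / total for pair, count in pairs.items()}


def stream_distance(a, b):
    """Total variation distance between the bigram distributions of a and b.

    0.0 means the streams have identical bigram statistics; 1.0 means they
    share no bigrams at all. Stand-in measure, not the published algorithm.
    """
    pa, pb = bigram_distribution(a), bigram_distribution(b)
    keys = set(pa) | set(pb)
    return 0.5 * sum(abs(pa.get(k, 0.0) - pb.get(k, 0.0)) for k in keys)


def sample_stream(p_repeat, length, seed):
    """Hypothetical source: emits 'a' or 'b', repeating the previous symbol
    with probability p_repeat and switching otherwise."""
    rng = random.Random(seed)
    stream = [rng.choice("ab")]
    for _ in range(length - 1):
        prev = stream[-1]
        if rng.random() < p_repeat:
            stream.append(prev)
        else:
            stream.append("b" if prev == "a" else "a")
    return stream


if __name__ == "__main__":
    x = sample_stream(0.9, 20000, seed=1)
    y = sample_stream(0.9, 20000, seed=2)  # same kind of source as x
    z = sample_stream(0.5, 20000, seed=3)  # a different kind of source
    print("same source:     ", round(stream_distance(x, y), 3))
    print("different source:", round(stream_distance(x, z), 3))
```

Run as a script, the first number should come out much smaller than the second, since `x` and `y` are drawn from the same hypothetical source while `z` is not. Nothing in the comparison required anyone to decide in advance which features of the streams matter.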

Data smashing could open doors to a new body of research. It doesn’t just help experts sort through data more easily; through sheer computational brute force, it might also identify anomalies that are impossible for humans to spot. The researchers demonstrated data smashing on data from real-world problems, including the detection of anomalous cardiac activity from heart recordings and the classification of astronomical objects from raw photometry. The results were on par with the accuracy of specialized algorithms and heuristics hand-tuned by experts.
