Tech News

Why you should use Presto for ad hoc analytics

Presto! It’s not only an incantation to excite your audience after a magic trick, but also a name being used more and more when discussing how to churn through big data. While there are many deployments of Presto in the wild, the technology — a distributed SQL query engine that supports all kinds of data sources — remains unfamiliar to many developers and data analysts who could benefit from using it.

In this article, I’ll be discussing Presto: what it is, where it came from, how it is different from other data warehousing solutions, and why you should consider

Read More

Oracle open-sources Java machine learning library

Looking to meet enterprise needs in the machine learning space, Oracle is making its Tribuo Java machine learning library available free under an open source license.

With Tribuo, Oracle aims to make it easier to build and deploy machine learning models in Java, similar to what already has happened with Python. Released under an Apache 2.0 license and developed by Oracle Labs, Tribuo is accessible from GitHub and Maven Central.

Tribuo provides standard machine learning functionality including algorithms for classification, clustering, anomaly detection, and regression. Tribuo also includes pipelines for loading and transforming data and provides a suite of

Read More

Amazon, Google, and Microsoft take their clouds to the edge

It might surprise you to learn that the big three public clouds – AWS, Google Cloud Platform, and Microsoft Azure – are all starting to provide edge computing capabilities. It’s puzzling, because the phrase “edge computing” implies a mini datacenter, typically connected to IoT devices and deployed at an enterprise network’s edge rather than in the cloud.

The big three clouds have only partial control over such key edge attributes as location, network, and infrastructure. Can they truly provide edge computing capabilities?

The answer is yes, although the public cloud providers are developing their edge computing services via strategic partnerships

Read More

How to create crosstab reports in R

Hi. I’m Sharon Machlis at IDG Communications, here with Episode 52 of Do More With R: Crosstabs.

Crosstab reports summarize data by two or more variables. For example: How did people vote on Ballot Question 1, broken down by gender and age group? There are a few ways to generate a crosstab report in R; I’d like to show you some of my favorites.

For this demo I’ll use a subset of the Stack Overflow Developers survey, with columns for Languages, Gender, and whether they code as a hobby. I also added a LanguageGroup column for whether a developer reported using

Read More

How to count by group in R

Counting by multiple groups — sometimes called crosstab reports — can be a useful way to look at data ranging from public opinion surveys to medical tests. For example, how did people vote by gender and age group? How many software developers who use both R and Python are men vs. women?
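
To make the idea concrete, here is a minimal sketch of what a crosstab computes. It is not the article's method (the article's demos are in R); it uses plain Python so the counting logic is visible. The survey rows are made up for illustration, and the `crosstab` helper is a hypothetical name, not a library function.

```python
from collections import Counter

# Made-up survey responses for illustration only: (gender, age_group, vote).
responses = [
    ("Woman", "18-34", "Yes"),
    ("Woman", "35-54", "No"),
    ("Man", "18-34", "Yes"),
    ("Man", "18-34", "No"),
    ("Woman", "18-34", "Yes"),
    ("Man", "35-54", "No"),
]


def crosstab(rows, row_key, col_key):
    """Count how many rows fall into each (row value, column value) pair.

    Hypothetical helper: row_key and col_key are functions that pull the
    two grouping variables out of a single row.
    """
    counts = Counter((row_key(r), col_key(r)) for r in rows)
    row_labels = sorted({r for r, _ in counts})
    col_labels = sorted({c for _, c in counts})
    # Fill in zero for combinations that never occur in the data.
    return {
        r: {c: counts.get((r, c), 0) for c in col_labels}
        for r in row_labels
    }


# Votes broken down by gender: rows are genders, columns are vote values.
votes_by_gender = crosstab(responses, lambda r: r[0], lambda r: r[2])

for gender, row in votes_by_gender.items():
    print(gender, row)
```

Swapping the key functions (for example, `lambda r: r[1]` for age group) gives a different breakdown of the same rows, which is the essence of counting by multiple groups.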

There are a lot of ways to do this kind of counting by categories in R. Here, I’d like to share some of my favorites.

For the demos in this article, I’ll use a subset of the Stack Overflow Developers survey, which surveys developers on dozens of

Read More

What is quantum computing? Solutions to impossible problems

There’s no lack of hype in the computer industry, although even I have to admit that sometimes the technology does catch up to the promises. Machine learning is a good example. Machine learning has been hyped since the 1950s, and has finally become generally useful in the last decade.

Quantum computing was proposed in the 1980s, but still isn’t practical, although that hasn’t dampened the hype. There are experimental quantum computers at a small number of research labs, and a few commercial quantum computers and quantum simulators produced by IBM and others, but even the commercial quantum computers still have

Read More