An analytical approach to data mining

This was published 20 years ago

The fledgling Institute of Analytics Professionals of Australia is hoping its work on a new data-mining certification will soon distinguish trained professionals from cowboys.

Founded last October, the IAPA has members working with PricewaterhouseCoopers, Ernst & Young, the Australian Tax Office, Telstra, AAPT, banks, insurance companies and universities.

IAPA chairwoman Dr Inna Kolyshkina, who works with PricewaterhouseCoopers' global risk management solutions group, says proper certification should reduce marketplace confusion.

"Someone who is doing OLAP (online analytical processing) is only doing two-dimensional processing," she says. "At best this is data reporting - not analytical - because it's not building a predictive model.

"You want to give something a client can run their data through and have results predicted. This is just one example of people being cowboys in the area."

Kolyshkina says because data mining is all very new - only about two years old - its boundaries aren't yet clearly defined. As a profession, it's only now starting to emerge as "analytics". So the chairwoman is keen to distinguish those with true data-mining expertise from programmers who simply generate reports, and says there is an order-of-magnitude difference between them.

"Our methods are less dependent on statistical methods," she says. "These say: 'Let's assume we are in an ideal world and a number of assumptions about our data are correct. Then we can use an elegant theory to model it.'

"But with data mining, we are saying 'This is my data. Let's apply some brute-force computing to choose the best possible model from hundreds of thousands of possible combinations.' The computer will then choose one automatically."

This type of search-driven modelling started when companies realised that with widely available high-powered computers they could analyse and detect patterns in large bodies of data. For example, with customer churn, data mining can predict not only how many people will leave a bank or insurance company, but also which customers are most likely to go. The same methods can be used to trawl through millions of financial transactions to single out fraudsters or even terror suspects.
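To make the churn example concrete, here is a minimal sketch in Python. It is not any bank's or insurer's actual method: the customer fields, the weights and the bias are all hypothetical, and in practice the weights would be learned from historical records. It shows how one set of per-customer scores can answer both questions at once: summing the scores estimates how many customers will leave, and sorting them shows who is most at risk.

```python
import math

# Hypothetical weights for illustration only; a real model would
# learn these from historical customer data.
WEIGHTS = {"complaints": 0.8, "years_as_customer": -0.3, "products_held": -0.5}
BIAS = -0.5


def churn_probability(customer):
    """Logistic score: estimated probability that this customer leaves."""
    z = BIAS + sum(w * customer[name] for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def expected_leavers(customers):
    """How many: the sum of individual probabilities."""
    return sum(churn_probability(c) for c in customers)


def most_at_risk(customers, top_n):
    """Who: customer ids ranked from highest to lowest churn probability."""
    ranked = sorted(customers, key=churn_probability, reverse=True)
    return [c["id"] for c in ranked[:top_n]]


# Made-up customers, purely for demonstration.
CUSTOMERS = [
    {"id": "A", "complaints": 3, "years_as_customer": 1, "products_held": 1},
    {"id": "B", "complaints": 0, "years_as_customer": 10, "products_held": 3},
    {"id": "C", "complaints": 1, "years_as_customer": 2, "products_held": 1},
    {"id": "D", "complaints": 2, "years_as_customer": 5, "products_held": 2},
]

if __name__ == "__main__":
    print("Expected leavers:", round(expected_leavers(CUSTOMERS), 2))
    print("Most at risk:", most_at_risk(CUSTOMERS, 2))
```

With these made-up figures, the customer with several complaints and a short history ranks first, while the long-standing customer holding three products ranks last.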

This is possible because data mining allows researchers to throw a thousand possible factors into an equation, and the data-mining application itself will find the most important ones. Traditional reporting methods choke when considering even 20 variables over large data sets. The IAPA says likening data mining to traditional reporting is akin to comparing a warplane with a rifle.
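The brute-force search Kolyshkina describes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method of any product mentioned here: the data is synthetic (only factors 0 and 2 actually drive the outcome), the model is a simple nearest-centroid classifier chosen for brevity, and every function name is invented for the example. The program tries every combination of factors, scores each on rows it was not trained on, and keeps the best.

```python
import itertools
import random


def make_data(n_rows, n_features, rng):
    """Synthetic rows: only factors 0 and 2 carry signal; the rest are noise."""
    rows, labels = [], []
    for _ in range(n_rows):
        x = [rng.gauss(0, 1) for _ in range(n_features)]
        rows.append(x)
        labels.append(1 if x[0] + x[2] > 0 else 0)
    return rows, labels


def fit(rows, labels, cols):
    """Nearest-centroid model: one mean point per class, on chosen columns."""
    model = {}
    for k in set(labels):
        members = [r for r, y in zip(rows, labels) if y == k]
        model[k] = [sum(r[c] for r in members) / len(members) for c in cols]
    return model


def predict(model, row, cols):
    """Return the class whose centroid is closest to the row."""
    def dist(centre):
        return sum((row[c] - m) ** 2 for c, m in zip(cols, centre))
    return min(model, key=lambda k: dist(model[k]))


def accuracy(model, rows, labels, cols):
    hits = sum(predict(model, r, cols) == y for r, y in zip(rows, labels))
    return hits / len(rows)


def brute_force_search(train, holdout, n_features):
    """Try every non-empty subset of factors; keep the best holdout score."""
    train_rows, train_labels = train
    hold_rows, hold_labels = holdout
    scores = {}
    best_cols, best_acc = None, -1.0
    for size in range(1, n_features + 1):
        for cols in itertools.combinations(range(n_features), size):
            model = fit(train_rows, train_labels, cols)
            acc = accuracy(model, hold_rows, hold_labels, cols)
            scores[cols] = acc
            if acc > best_acc:  # strict '>' keeps the smaller subset on ties
                best_cols, best_acc = cols, acc
    return best_cols, best_acc, scores


if __name__ == "__main__":
    rng = random.Random(0)
    train = make_data(200, 6, rng)
    holdout = make_data(100, 6, rng)
    cols, acc, scores = brute_force_search(train, holdout, 6)
    print(f"Tried {len(scores)} combinations; best factors {cols}, accuracy {acc:.2f}")
```

Six factors give only 63 combinations, but the count doubles with every factor added, which is why a search like this depends on cheap computing power and, at real scale, on smarter search strategies than plain enumeration. Picking the winner on the same holdout rows used for scoring also flatters the result, so a careful analyst would confirm it on a further untouched sample.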

"Data mining is the answer to the industry need of managing large data sets," Kolyshkina says. "But it doesn't mean any monkey can press the data-mining button. If a warplane landed in a remote place where the people are uneducated, they wouldn't know how to fly it. But in the hands of a knowledgeable person, it can do much more than a gun."

The IAPA has so far recruited about 50 members - mainly industry specialists, researchers and academics. Only two members are students. But IAPA secretary Eugene Dubossarsky, who works as a senior consultant in Ernst & Young's actuarial business, is bullish about their prospects.

"Data mining as a profession is definitely growing because data is growing," he says. "And data is becoming more and more usable because of data warehousing (where information from many locations can be centrally mined). So the only way is up."

PricewaterhouseCoopers has 50 people working in its data-mining area, and anticipates it will hire more as industry demand increases. IAPA committee member Warwick Graco confirms his employer, the Tax Office, might also soon be hiring.

"The ATO intends having a network of about 30 data miners working with another 70 or so analytics staff, such as OLAP people," he says.

"And these numbers do not include those who perform less quantitatively oriented intelligence analysis and risk analysis. The ATO has recognised that it needs skilled staff that can extract value and meaning from its large holdings of data."

The good news is these new data-mining jobs will probably stay in Australia. This is because privacy considerations limit the distribution of data, while an intimate knowledge of business conditions is advantageous to making sense of it.

However, because data mining is such a new field in computing, people with formal qualifications in it are practically non-existent.

One proposal before the IAPA is for a new "Accredited Data Mining Professional" certification to be based on holding a related university degree, in conjunction with demonstrable field experience.

"We want to be very inclusive," Kolyshkina says. "We don't want to make it a power game or use it for leverage. But we want to make sure someone off the street can't claim to be the data miner of the year. There is no degree in data mining, so it's very hard to say who is a data miner and who is not."

To promote better understanding of the field, the institute aims to help students find their way into the industry by providing mentoring.
