Machine Learning 1.0 over Coffee

Article aimed at anyone (technical or non-technical) who wants to understand the steps in Machine Learning at a high level. Readable in five minutes over coffee. I think.

What is Machine Learning?

Today we live in a world of seemingly infinite connected devices, in both personal and commercial environments. The currency associated with these devices is data, which whizzes around in near real time and is stored both locally and in cloud environments. The types of data vary greatly: text, audio, video and numerical data are just a sample of the data modalities generated.

As this data is a currency, there is value associated with it, but how do we extract this value? A high-growth area called data science is used to extract value and insight from this data. It has numerous ingredients in the recipe, with data mining, data optimization, statistics and machine learning key to generating any successful flavor. And like any good recipe, you need a good chef. These chefs, in data terms, are called Data Scientists, who use a wide variety of tools to glean insight from the data and deliver impact for your business. The data-sets themselves can be either univariate (a single variable or feature) or multivariate (multiple variables or features). A person’s age would be an example of univariate data, whereas a multivariate data-set would expand a person’s feature set to include, for example, age, weight and waist size.
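To make the univariate versus multivariate distinction concrete, here is a minimal Python sketch. The people and their measurements are invented purely for illustration, and the field names (age, weight_kg, waist_cm) are my own assumptions rather than any standard.

```python
# Hypothetical sample data, invented purely for illustration.

# Univariate: one feature (age) per person.
ages = [34, 51, 27]

# Multivariate: several features per person.
people = [
    {"age": 34, "weight_kg": 72.5, "waist_cm": 81.0},
    {"age": 51, "weight_kg": 88.0, "waist_cm": 97.5},
    {"age": 27, "weight_kg": 64.2, "waist_cm": 74.0},
]

# Many algorithms expect each record as a fixed-order list of numbers
# (a feature vector), so we flatten the dictionaries in a known order.
feature_names = ["age", "weight_kg", "waist_cm"]
feature_vectors = [[person[name] for name in feature_names] for person in people]

print(len(ages), "univariate observations")
print(len(feature_vectors), "records with", len(feature_names), "features each")
```

Each inner list in feature_vectors is one person described by three numbers, which is the shape most learning algorithms work with.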

Why do it?

Machine learning (ML) is born out of the perspective that instead of telling computers how to perform every task, perhaps we can teach them to learn for themselves. Examples range from predicting the sale price of your house based on a set of features (sq. feet, number of bedrooms, area), to determining whether an image is of a dog rather than a cat, to determining whether the sentiment of a set of restaurant reviews is positive or negative. There are a host of applications across many industries; some of these are shown below (source: Forbes).
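As a rough illustration of the house price example, the sketch below fits a straight line to a handful of made-up sales using ordinary least squares on a single feature (square feet). The numbers and the function names fit_line and predict_price are hypothetical; a real model would use many more features and far more data.

```python
# Hypothetical training data: square feet and sale price, invented for illustration.
sq_feet = [1000.0, 1500.0, 2000.0, 2500.0]
prices = [200000.0, 290000.0, 410000.0, 500000.0]


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Learning" here simply means estimating the two numbers that define the line.
slope, intercept = fit_line(sq_feet, prices)


def predict_price(size):
    """Predict a sale price for a house of the given size using the fitted line."""
    return slope * size + intercept


print("Estimated price for 1800 sq. feet:", round(predict_price(1800.0)))
```

Nobody told the program what a square foot is worth; it estimated that from the examples, which is the core idea of learning from data.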

Before any magic comes from the algorithms, perhaps the most important step in any machine learning problem is the upfront data transformation and mining, towards optimization. Optimization is required because most of the algorithms that “learn” are sensitive to what they receive as input, and the quality of that input can greatly impact the accuracy of the model that you build. This step also ensures you have a thorough understanding of your data-set and the challenge you are trying to solve. Some of the data transformation and mining techniques include record linkage, feature derivation, outlier detection, missing value management and vector representation. All this is sometimes called “Exploratory Data Analysis”.
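Two of the techniques named above, missing value management and outlier detection, can be sketched in a few lines of Python. This is only one possible approach under my own assumptions: the readings are invented, gaps are filled with the median, and outliers are flagged with the common 1.5 x interquartile range rule. Other choices (dropping rows, using the mean, z-scores) are equally valid depending on the data.

```python
import statistics

# Hypothetical sensor readings; None marks a missing value.
readings = [21.0, 22.5, None, 21.8, 95.0, 22.1, None, 21.4]

# Missing value management: fill each gap with the median of the observed values.
observed = [r for r in readings if r is not None]
median = statistics.median(observed)
filled = [median if r is None else r for r in readings]

# Outlier detection: flag anything more than 1.5 * IQR outside the quartiles.
q1, _, q3 = statistics.quantiles(observed, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = [r for r in filled if r < low or r > high]
cleaned = [r for r in filled if low <= r <= high]

print("Outliers:", outliers)
print("Cleaned readings:", cleaned)
```

The median is used for filling because, unlike the mean, it is not dragged upwards by the one extreme reading.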

Techniques once Optimized

Once data is presented in the right manner, there are a number of machine learning techniques one can apply. They are broadly split into supervised and unsupervised techniques. Supervised learning takes an input data-set in which each example comes with a known answer (a label) and trains your model on it; with unsupervised learning no labels are provided, and the algorithm has to find structure in the data by itself. Unsupervised techniques include vector quantization and clustering. Supervised techniques include nearest neighbors and decision trees. Another technique is reinforcement learning, a type of algorithm that allows software agents and/or machines to automatically determine the ideal behavior within a specific context in order to maximize their performance.
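The difference between the two families can be shown on the same six points. In this sketch, which uses invented data and hypothetical function names, the supervised part is a one-nearest-neighbor classifier that relies on labels, while the unsupervised part is a basic k-means clustering that never sees them. The starting centroids are picked by hand here to keep the example deterministic.

```python
import math

# Hypothetical 2-D points (for example, two scaled features per item).
points = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (5.0, 5.2), (5.3, 4.9), (4.8, 5.1)]

# Supervised: every training point comes with a known label.
labels = ["small", "small", "small", "large", "large", "large"]


def nearest_neighbor(query, train_points, train_labels):
    """Return the label of the training point closest to the query."""
    distances = [math.dist(query, p) for p in train_points]
    return train_labels[distances.index(min(distances))]


# Unsupervised: no labels, the algorithm groups the points by itself.
def k_means(data, centroids, iterations=10):
    """Basic k-means: returns the final centroids and the last cluster assignment."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in data:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


predicted_label = nearest_neighbor((4.9, 5.0), points, labels)
centroids, clusters = k_means(points, [points[0], points[3]])

print("Nearest neighbor says:", predicted_label)
print("Cluster sizes found without labels:", [len(c) for c in clusters])
```

Note that k-means only returns groups; it is still up to a human to decide what each group means.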

Verifying your model is also an important step, and for classification problems we often use confusion matrices to do that. A set of test data is applied to the classifier and the results are tabulated into four counts: true positives, true negatives, false positives and false negatives, which are then analysed to assess performance. Sometimes, the result of the model is still questionable. When this happens, machine learning has an answer in the form of ensemble methods, in which you essentially build a series of models and combine them to produce your final prediction. Examples here include bagging and boosting on the training data. Bagging draws multiple random samples (with replacement) from the training data and trains a model on each, while boosting builds models in sequence, each one concentrating on the examples its predecessors got wrong.
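A confusion matrix for a yes/no classifier is simple enough to build by hand. The sketch below assumes made-up test results (1 for positive, 0 for negative) and derives three commonly used scores from the four counts; the function name confusion_matrix is my own and not taken from any particular library.

```python
# Hypothetical test-set results: 1 = positive, 0 = negative.
actual = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]


def confusion_matrix(actual, predicted):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1  # correctly predicted positive
        elif a == 0 and p == 0:
            counts["tn"] += 1  # correctly predicted negative
        elif a == 0 and p == 1:
            counts["fp"] += 1  # predicted positive, actually negative
        else:
            counts["fn"] += 1  # predicted negative, actually positive
    return counts


matrix = confusion_matrix(actual, predicted)

# Scores derived from the four counts.
accuracy = (matrix["tp"] + matrix["tn"]) / len(actual)
precision = matrix["tp"] / (matrix["tp"] + matrix["fp"])
recall = matrix["tp"] / (matrix["tp"] + matrix["fn"])

print(matrix)
print("accuracy:", accuracy, "precision:", precision, "recall:", recall)
```

Looking at the four counts separately matters because a model can score well on accuracy while still missing most of the positives you actually care about.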

There are complementary techniques used in any successful machine learning problem, including data management and visualization, and programming languages such as Python and Java have a variety of libraries that can be used for your projects.

Going further

Taking a step further from machine learning, you are into a complementary area called artificial intelligence (AI), which leans more on methods such as natural language processing and neural networks, the latter loosely inspired by the operation of the human brain. This shows how human-centric design in technology is evolving, and how much excitement there is about how humans and technology will work together in the future. It can be said that this excitement is born from the realization that, as we evolve our understanding of what it means to be human, that understanding outweighs anything that technology alone can deliver. People have always been at the core of innovation, and this has led to steady improvement in our lives.

Published by

deniscanty

Denis Canty is excited to begin in July 2017 with McKesson, a Fortune 5 company, as their Senior Director of Cyber Software Engineering in Cork. His last role (to June 2017) was as the Lead Technologist for IoT with Johnson Controls Innovation Group, based in Cork, Ireland. That role meant collaborating extensively between his technical and sales teams to drive further commercialisation opportunity through technology (both the company's own and that of partners/startups) into its sales channels, specifically looking at the emerging smart building market. The projects included existing technologies (building security, retail, HVAC and building energy) and emerging technologies such as IoT, AR and machine learning. A key component was taking input from numerous stakeholders and processes to deliver ROI for customers and partners. He then led the team to build and deploy the solutions in a lean agile manner. Denis spoke on the national and international circuit for Johnson Controls at numerous technology conferences. His leadership style is leadership through trust and delivery: he takes responsibility for his team, and in his opinion compassion and humility are also important in a leader. He likes to build a balanced culture, with people's personalities as important inputs into that. Denis has a degree in Electronic Engineering (2H) from Cork Institute of Technology, a Masters in Microelectronic Chip Design (1H) from University College Cork and a Masters in Computer Science (1H) from Dublin City University. Prior to Johnson Controls, Denis held a position of Principal Data Architect and Development Manager with EMC from 2010 to 2015, spending 2011 in Silicon Valley. He led a team focused on reducing and consuming nine test automation platforms from external manufacturers into one EMC cloud-hosted platform. He also worked on a number of workflow automation software projects (written in Python), replacing tedious manual extract, search and report compilation, which resulted in efficiency gains. He also built predictive analytics applications in manufacturing and data science models for the customer vertical with the CTO office. Denis brought microservices-based design along with distributed storage and processing to the group, changing the development culture in the process. Denis was also a member of EMC’s Global Innovation Council and an ambassador with their Office of the CTO, leading their customer insight software development. Denis won two global innovation awards in his time with EMC, in the areas of sustainability and e-services, and has a patent in intelligent power management on storage architecture. He also worked previously for ALPS Automotive Division from 2005 to 2010, in a variety of roles, including as the Lead Computer Vision Engineer and the Lead Technologist on European research projects in the areas of in-vehicle distraction monitoring and smart home devices. Denis also spent time consulting in the start-up world, for example as a Healthcare Informatics Consultant with ACE Health, leading the development of an application which helps healthcare service providers achieve better patient outcomes and cut costs through a regulator-approved predictive analytics platform in the Dutch and US markets. He has also helped numerous startups build their technology roadmaps to align with defined target markets and customer bases.
