Switching focus back to technical blog posts: over the next five or six posts (there may be some Web Summit updates intertwined!) I aim to demystify “all things data”. That includes reporting, analytics, data science and business intelligence, the key differences and dependencies between these terms, and an introduction to where machine learning fits into your company’s data model. Governance, security and data management will also be covered.
To begin, a short post with 10 perspectives that will (hopefully!) get you thinking.
1: Big Data is just a tool.
2: Analytics is utilized by Data Science and Business Intelligence
3: Data is never clean. You will spend more of your time cleaning and preparing data (up to 90%) than anything else.
4: 90% of tasks do not require deep machine learning
5: More data beats a cleverer algorithm
6: Data Science + Decision Science + Analytics = Business Impact
7: You should embrace the Bayesian approach
8: Academia and Business are two different worlds – know this.
9: Presentation/Visualisation is key (know your audience)
10: There is no fully automated Data Science. You need to get your hands dirty.
Over the past decade, hybrid cloud adoption has steadily increased, with closed networks becoming less and less the option of choice. But this comes at a cost to security and trust metrics. As we become more dependent on intelligent devices in our lives, how do we ensure that the data within this web of devices is not compromised by external threats that could endanger our personal safety?
As the adoption of IoT increases, so does the risk of hackers getting at our personal information. As Alan Webber points out on his RSA blog [6], there are three key risk areas or bubbles that companies need to be aware of.
1: Fully enabled Linux/Windows OS systems: This area concerns devices that are not part of a normal IT infrastructure but still run full operating systems, such as Linux or Windows. As everyone knows, these operating systems had vulnerabilities long before IoT, and when they are deployed in the “free world” they are not as visible to IT admins.
2: Building Management Systems (BMS): This pertains to infrastructure systems that assist in the management of buildings, such as fire detection, suppression, physical security systems and more. These are not usually classified as being under threat, yet shutting down a fire escape alarm system could enable a break-in.
3: Industry Specific Devices: This area covers devices that assist a particular industry, such as manufacturing, navigation, or supply chain management systems. For example, in a supply chain management system, route and departure times for shipments can be intercepted, which could lead to a shipment being intercepted and rerouted to another geographical location.
So, how do we guard against these types of risks, and make both the devices themselves and the web of connected devices less dumb? Security must be looked at holistically to begin with: end-to-end security systems should be employed to ensure system-level safety, and device-level embedded control software should ensure data integrity from edge to cloud.
Data routing must also be taken seriously from a security standpoint. For example, smart meters generally do not push their data to a gateway continuously, but send it to a data collection hub, which then forwards it to the gateway in a single bulk packet. Whilst the gateway might have an acceptable security policy, what about the data collection hub? This raises a major challenge: how does one micromanage all the various security systems that one’s data might migrate across?
Security Design Considerations
Early-stage IoT devices unfortunately carried the potential for loss of physical security in their design, so security officers need to be aware of where their security provisioning is focused and located.
Applying security design to the devices themselves is not the most utilized method (similar to internal storage), as the cost and capacity of these devices work against it. Instead, the devices look to ensure consistency of communication and message integrity. Usually, one would deploy the more complex security design upfront within the web services that sit in front of and interact with the devices. It is predicted that as the devices themselves evolve, and nanotechnology becomes more and more of an enabler in the space, the security design will move closer to the devices, before eventually becoming embedded.
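As a rough illustration of what lightweight message integrity on a constrained device could look like, here is a minimal Python sketch using an HMAC. It assumes a pre-shared key between the device and the web service; the key value, the field names and the `sign_reading` / `verify_reading` helpers are hypothetical and only there for illustration, not taken from any particular product.

```python
import hashlib
import hmac
import json

# Hypothetical pre-shared key (an assumption for this sketch); a real
# deployment would provision and protect a unique key per device.
DEVICE_KEY = b"example-shared-secret"


def sign_reading(reading: dict, key: bytes = DEVICE_KEY) -> dict:
    """Device side: serialise a reading and attach an HMAC-SHA256 tag."""
    payload = json.dumps(reading, sort_keys=True, separators=(",", ":"))
    tag = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify_reading(message: dict, key: bytes = DEVICE_KEY) -> bool:
    """Service side: recompute the tag and compare in constant time."""
    expected = hmac.new(
        key, message["payload"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, message["tag"])


if __name__ == "__main__":
    msg = sign_reading({"meter_id": "m-1", "kwh": 12.5})
    print(verify_reading(msg))  # the untouched message verifies
```

The device only has to compute one hash per message, while the heavier work (key management, policy, auditing) stays in the web services in front of it. Note that an HMAC covers integrity and authenticity only; it does not hide the payload.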
It is proposed that shared cloud-based storage will play a pivotal role in coping with the data volume challenge, but not without its issues. How do we handle identification and authentication? How do we ensure adequate data governance? Partnerships will be necessary between security officers and cloud providers to ensure these questions are answered.
Searching for the holy grail of 100% threat avoidance is futile, given the number of players in an entire IoT ecosystem. Whilst cloud service providers own their own infrastructure, it is very difficult for them to know whether the data they receive has been compromised. There are ways to reduce this risk, by using metadata and building “smarts” into the data from typical known sets as it transitions from edge to cloud. A useful analogy is a nightclub security guard checking potential clients at the door: “What’s your name?” (what type of data are you?), “Where have you been tonight?” (what’s your migration path?), “How many drinks have you had?” (what transactions happened on your data?).
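To make the nightclub analogy concrete, here is a minimal Python sketch of a “door check” on the metadata travelling with a piece of data. Everything in it is an assumption for illustration: the `Envelope` structure, the allow-lists and the `check_envelope` function are hypothetical names, and real known-good sets would come from your own system.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical allow-lists, standing in for the "typical known sets".
KNOWN_TYPES = {"meter_reading"}
TRUSTED_HOPS = {"meter", "collection-hub", "gateway"}
ALLOWED_TRANSACTIONS = {"aggregate", "compress"}


@dataclass
class Envelope:
    """Metadata carried alongside the data from edge to cloud."""
    data_type: str                      # "What's your name?"
    path: List[str]                     # "Where have you been tonight?"
    transactions: List[str] = field(default_factory=list)  # "How many drinks?"


def check_envelope(envelope: Envelope) -> List[str]:
    """Return the reasons to refuse entry; an empty list means admit."""
    problems = []
    if envelope.data_type not in KNOWN_TYPES:
        problems.append(f"unknown data type: {envelope.data_type}")
    for hop in envelope.path:
        if hop not in TRUSTED_HOPS:
            problems.append(f"untrusted hop: {hop}")
    for operation in envelope.transactions:
        if operation not in ALLOWED_TRANSACTIONS:
            problems.append(f"unexpected transaction: {operation}")
    return problems


if __name__ == "__main__":
    suspect = Envelope(
        "meter_reading", ["meter", "unknown-relay", "gateway"], ["rewrite"]
    )
    for problem in check_envelope(suspect):
        print(problem)
```

A check like this is only as trustworthy as the metadata itself, so in practice the envelope would also need to be protected against forgery, for example by signing it at each hop.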
IoT Security and Chip Design
One area that could bring about increased data privacy is wider use of “Trusted Execution Environments”, or TEEs. A TEE is a secure area in the main processor of the device, which ensures that independent processing can occur on critical data within the silicon itself. This enables trusted applications to run that enforce confidentiality and integrity, and protects against unauthorized cloning or object impersonation by removal and replacement. As a real-world example, a home owner tampering with their smart meter to reduce their energy bill is one scenario that TEEs would help to prevent.
If cloud services companies can somehow increase their influence on IoT device design (outside of the popularity of TEEs in cellular applications), then utilizing technology such as this will reduce risk once the data reaches the cloud. Collaboration efforts should be increased between all parties so that best practice can be established across the entire IoT landscape.