Archive for the ‘Work Stuff’ Category

Whoa

Posted: June 16, 2017 in Audio, HA, House, Work Stuff

Well, apparently it’s been almost 6 months since I last posted… where did the time go? New job, new projects and general family stuff, I guess.

Anyway, I had been playing with the HiFiPi, but I got sidetracked by trying to get it working with the Logitech Media Server that runs on the family server… that way I can integrate it with OpenHAB and control it around the house (read: turn off the kids’ music after bedtime!). That diverted into trying to get a touchscreen working, and then the Pi got requisitioned for a different project…

A MagicMirror project… which then involved me writing a couple of public transport modules – bus stop info and railway info – just because I felt the need to write something! Not being a developer/R&D engineer occasionally bites.

That led to me rebuilding OpenHAB (again) using the new OpenHABian RPi image and then trying to tidy up the sprawl that our OH installation had become. Basically trying to make it a bit more ‘logical’: grouping rooms by floor and use, new targeted sitemaps and stuff like that, rather than having one huge file for all items, one for rules, one multi-level sitemap, etc. I’m now involved in helping to test a new OH binding that is used to control the Honeywell EvoHome system.

I’ve also finally got a reasonable amp and speakers in the lounge… but of course that meant cleaning up the rat’s nest of cables behind the AV cupboard and retiring the old kit as part of installing the amp!

So I have been doing stuff… I’ve just been too busy actually doing it to blog about it.

In Part 1 I discussed the technologies that make up the Blocks themselves, how a block is added to the end of the chain, and how the distribution of the final BlockChain is handled. Now we can consider how we protect each block in the chain and how we can ensure that the chain stays unbroken.

Changing A Block

We have a new block with the contents all secured and consensus has agreed that this block will be added to the end of the chain. What is to stop someone from removing a previously created block from the chain and replacing it with something else?

If we calculate the hash of the contents of the previous block we can then use this ‘signature hash’ value as part of the process of adding the new block on the end of the chain.

 


If the contents of a block are changed, its ‘signature hash’ will no longer match the one stored in the next block of the chain, and everyone will know that it has been changed. Consensus ensures that only consistent chains exist and that broken chains get dropped.
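
As a rough illustration, here is a minimal Clojure sketch of that check. The field names and the choice of SHA-256 are assumptions purely for clarity, not how any particular BlockChain actually lays out its blocks:

(import 'java.security.MessageDigest)

(defn sha256-hex [s]
  (->> (.digest (MessageDigest/getInstance "SHA-256") (.getBytes s "UTF-8"))
       (map #(format "%02x" %))
       (apply str)))

;; every block stores the 'signature hash' of the block before it
(def genesis {:data "first block" :prev-hash nil})
(def block-2 {:data "second block" :prev-hash (sha256-hex (pr-str genesis))})

(defn chain-valid? [chain]
  (every? (fn [[prev curr]]
            (= (:prev-hash curr) (sha256-hex (pr-str prev))))
          (partition 2 1 chain)))

(chain-valid? [genesis block-2])  ;; => true
;; alter the contents of genesis and its hash no longer matches the
;; :prev-hash stored in block-2, so chain-valid? returns false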

Vulnerabilities

We’ve shown that the BlockChain won’t allow alterations to existing blocks, so how could someone add a false block to the end? They would need to:

  • Create a false block that contains the fake data and the correct references to the previous block in the chain.
  • Control 51% of all the nodes in the network and force them all to agree that the false block should be the new one.
  • Do all this before the network decides which block to add to the chain as part of the normal process. FYI – BitCoin currently processes around 9,304 transactions per hour, or roughly 2.5 every second.

Smart Contracts

Smart Contracts are the next iteration of the original BlockChain concept.

They work on the idea of storing a small program within the BlockChain that can then run in its own virtual machine when required. When invoked, this contract program can be used to validate, enforce and manage transactions between two or more parties in a trusted way, without requiring the services of a middleman.

The outcome of this invocation can then be written back to the BlockChain or to local contract storage where it will remain.

Smart Contracts Vulnerabilities

While seen as the next big thing in an already big thing, Smart Contracts are still trying to establish themselves in real-world use.

As they use ‘Turing Complete’ languages to define the contract, they are vulnerable to poor coding or flaws in the underlying virtual machine. Development environments are still immature, increasing the risks further.

Ethereum ‘lost’ ~$53 million because the VM and language it built had flaws that were exploited to make contracts do unexpected things – in this case, transferring the holdings to another account within the Ethereum ecosystem.

N.B. This issue has since been rectified by rolling the ledger back to a point prior to the loss and then forcing a fork in the BlockChain in which the monies never left the DAO. This has caused debate, as it proves that the BlockChain is not inviolate – if enough people say so, the BlockChain can be changed – and, more importantly, there have been no attempts to fix the underlying issues in the VM and language itself. So, in theory, this loss could be repeated in the future.

BlockChain is the new must-use buzzword in technology, and it claims to provide the capability for building trust between disparate and non-connected systems.

This post is based on the presentation I wrote to explain the potential of BlockChain for my employer.

So what is BlockChain, where did it come from and why the sudden interest?

Why?

The sudden interest comes from the fact that in theory a BlockChain allows for a trusted, accessible, permanent, encrypted, distributed record that is almost impossible to modify. It allows two parties to perform transactions without the need for a trusted third party – something that can add both costs and complexity to the original transaction.

BlockChain is being used for everything from currency and diamond registry through to medical record management and music rights control. Large scale schemes such as welfare payment control are being investigated by the UK government.

Where?

BlockChain is the underlying technology for a virtual currency called BitCoin. As a technology used for the manipulation of funds, it has to be transparent, reliable and secure. While BitCoin initially suffered from bad press relating to its association with criminal organisations (Silk Road) and the high profile failure of businesses (Mt Gox), it is now established as a true virtual currency.

What?

At its most basic BlockChain is the combination of four current technologies in a new and unique way:

  • Asymmetric encryption (A.K.A. Public Key Encryption)
  • Hashing
  • Peer to peer networks
  • Consensus

Asymmetric Encryption

Public key encryption uses two keys to protect content.

One key is Private and is held ONLY by the user, the other key is Public and is available to everyone. These two keys are created together at the same time and are a mathematical pair – they will only work together.

If I want to send a message to someone, I encrypt it using their public key. Only the matching private key can then open it. Any attempt to change the encrypted message will result in the private key failing to open it – and that failure implicitly tells us that tampering has occurred.
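
As a rough sketch of that exchange (plain Clojure on the JVM’s standard RSA classes; the 2048-bit key size is just a sensible default and nothing BlockChain-specific):

(import '[java.security KeyPairGenerator]
        '[javax.crypto Cipher])

;; the public/private pair is generated together
(def key-pair (.generateKeyPair (doto (KeyPairGenerator/getInstance "RSA")
                                  (.initialize 2048))))

;; anyone can encrypt with the PUBLIC key...
(def encrypted
  (-> (doto (Cipher/getInstance "RSA") (.init Cipher/ENCRYPT_MODE (.getPublic key-pair)))
      (.doFinal (.getBytes "Hello world" "UTF-8"))))

;; ...but only the holder of the PRIVATE key can read it back
(String. (-> (doto (Cipher/getInstance "RSA") (.init Cipher/DECRYPT_MODE (.getPrivate key-pair)))
             (.doFinal encrypted))
         "UTF-8")
;; => "Hello world"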

Software such as Pretty Good Privacy (PGP) made this incredibly complex topic easy for everyone to use, much to the horror of law enforcement agencies worldwide. At one point the PGP software was classed as a weapon and placed on a restricted list limiting its availability outside of the US!

Hashing

Cryptographic hashing converts a value of variable length into an alphanumeric string of fixed length. Hashes are a ONE WAY function – you can’t take a hash and recover the original value. Hashes are also very quick to calculate, and the ability to do so is available in practically every programming language.

The specific method of hashing used determines the length of the final hash output. SHA1, MD5 & SHA256 are all common hashing methods, each providing different levels of complexity and output size.

For example using SHA1 to hash a simple string:

Hello world => 7b502c3a1f48c8609ae212cdfb639dee39673f5e

Changing even the smallest thing (“H” to “h”) produces a completely different result!

hello world => 2aae6c35c94fcfb415dbe95f408b9ce91ee846ed
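
Just to illustrate, a quick Clojure sketch using the JVM’s built-in MessageDigest reproduces the values above (any language’s standard hashing library would do the same):

(import 'java.security.MessageDigest)

(defn sha1-hex [s]
  (->> (.digest (MessageDigest/getInstance "SHA-1") (.getBytes s "UTF-8"))
       (map #(format "%02x" %))
       (apply str)))

(sha1-hex "Hello world")  ;; => "7b502c3a1f48c8609ae212cdfb639dee39673f5e"
(sha1-hex "hello world")  ;; => "2aae6c35c94fcfb415dbe95f408b9ce91ee846ed"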

Hashing can be used in everything from password validation to ensuring the file you download has not been altered by malicious third-parties.

Peer to peer

Using Peer to peer technologies allows the BlockChain to be distributed across hundreds if not thousands of nodes.

A node can be a simple device making use of the BlockChain or a more complex or powerful device that is used to manage the BlockChain itself.

As only 4 nodes are required to ensure that the BlockChain remains available, accurate and up to date, any additional nodes hosting it serve only to increase reliability, speed and robustness.

Consensus

Consensus is a technique in which a set of nodes can reach an agreed outcome without a designated leader and with automatic detection of tampering.

When a new block is proposed for the BlockChain, all nodes must add it to their own copy of the BlockChain. If any node tries to lie, that node becomes tainted. As long as no single party controls 51% or more of the nodes, the BlockChain is always up-to-date and truthful.

Quorum systems are applied in BlockChain to reduce the complexity of the consensus algorithms with no penalty in consistency, partitioning the nodes in a similar way to electoral constituencies.

Consensus must happen before a new block can be added to a chain, so the only way to add a ‘bad block’ to the chain is to control the majority of the nodes.

As the nodes are distributed throughout the internet, this becomes impractical, which means we have established effective trust between all the users of the BlockChain without requiring a third party.
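
As a toy illustration of the voting idea (a simple majority count in Clojure – not the proof-of-work scheme BitCoin actually uses; the node names and proposal shapes are made up):

;; each node proposes the hash of the block it believes should be added next
(def proposals {"node-1" "abc123" "node-2" "abc123" "node-3" "f00baa"})

(defn agreed-block [proposals]
  (let [[block-hash votes] (apply max-key val (frequencies (vals proposals)))]
    ;; only accept a block backed by a strict majority of the nodes
    (when (> votes (/ (count proposals) 2))
      block-hash)))

(agreed-block proposals)  ;; => "abc123"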

Building the Block

By combining these technologies we can create the next block for inclusion on a chain (a rough sketch follows the list below):

  • The contents can be encrypted using Public Key Encryption
  • Hashing protects against the contents changing
  • Consensus decides which block will be the new one on the end of the chain and ensures that everyone agrees this is true
  • Peer-to-peer networking distributes the new block to all locations for inclusion
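
Pulling those pieces together, a very rough Clojure sketch of assembling the next block might look like this. The field names are illustrative only, and the encryption step is just indicated in a comment:

(import 'java.security.MessageDigest)

(defn sha256-hex [s]
  (->> (.digest (MessageDigest/getInstance "SHA-256") (.getBytes s "UTF-8"))
       (map #(format "%02x" %))
       (apply str)))

(defn build-block [prev-block payload]
  ;; 1. in a real chain the payload would already be encrypted with the
  ;;    recipient's public key
  ;; 2. the hash of the previous block ties the new block to the chain
  {:data      payload
   :prev-hash (sha256-hex (pr-str prev-block))
   :timestamp (System/currentTimeMillis)})

;; 3. consensus then agrees that this really is the next block,
;; 4. and the peer-to-peer network distributes it to every node for inclusion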

Low maintenance blog

Posted: July 28, 2016 in Work Stuff

This uses a version of Ghost in a Docker container that then serves up an external folder as the blog root.

It allows Ghost to do what it does best – simple blogging tools and site management – while external scripts ensure the content is backed up.

If we lose the blog server, it is very easy to rapidly redeploy the blog and get back up again.

Start the blog

We will create the basic blog and associate it with a remote repo for backup. This assumes that Docker is installed on the host machine!

1. create a remote repo that will be used to store the blog
2. check out the repo to the box that will serve the blog e.g. /home/blog/content
3. start up a Docker for the Ghost blog

docker run -d -p 80:2368 -v /home/blog/content:/var/lib/ghost -e "NODE_ENV=production" --name myblog ghost:latest

This starts the Docker and tells it to map the Docker Ghost dir (/var/lib/ghost) to the real directory (/home/blog/content). It also maps traffic from port 80 through to the normal Ghost port of 2368 and runs everything in daemon mode.

Backup the blog

We can now install a script to monitor the blog and push changes up to the remote repo. This assumes that either username/password can be used in the https git calls or that the appropriate .ssh keys have been setup and the repo uses ssh to connect.

4. install gitwatch and its dependencies (git & inotify-tools)
5. point gitwatch at the blog directory

gitwatch -s 300 -r git@github.com:username/your-repo.git /home/blog/content/

This waits 300 seconds after a change has been detected in /home/blog/content/ and then pushes it out to the specified repo.

6. the above command will need to be run in the background either using screen or nohup to make sure it runs all the time.

Belt & Braces

We could also install a script called buster to automatically convert a Ghost blog to static pages compatible with Jekyll and/or GitHub Pages. This would allow us to publish the blog on other platforms if we decided that Ghost wasn’t our platform of choice anymore.

As buster generates a static folder in the blog dir, it will also be pushed back to the remote repo if anything changes.

Redeploy

If the worst happens, the Docker can easily be restarted with the same command as above. If the entire box is lost, the repo can be checked out on to a new box and the Docker started as normal. The only thing that may need outside assistance is ensuring that the domain name points to the new machine.

Contracts on the BlockChain

Posted: July 20, 2016 in Work Stuff

Introduction

Initial investigations into using the Ethereum BlockChain were targeted towards the use of Smart Contracts.

These Smart Contracts have the potential to help my employer manage entitlements and digital/asset rights without requiring a single point of approval.

The Contract

As a paper-based exercise prior to trying to convert the theory into practice, the following three contract types have been considered.

  1. Generic shop
  2. Asset based
  3. Customer based

As Ethereum contracts come with basic storage capabilities, we can store data within the contract in a form defined by us. This can be anything from simple key-value pairs through to complex structures.

Depending on the contract type, the contract would then store the information about the transaction in its storage – the customer, the asset and any access rights associated with the transaction.

Playback would depend on checking that the user requesting the asset on the current device meets the access rights. Success would allow the player to continue; failure would cause playback to stop with an appropriate message.
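
To make that check concrete, here is a minimal sketch in Clojure (deliberately not Solidity – the record shape, field names and values are illustrative assumptions only):

;; what a single entitlement held in contract storage might look like
(def entitlements
  [{:customer "0xCUSTOMER" :asset "asset-123"
    :devices #{"device-1" "device-2"} :expires 1500000000}])

(defn playback-allowed? [store customer asset device now]
  (boolean (some (fn [{:keys [devices expires] :as entry}]
                   (and (= (:customer entry) customer)
                        (= (:asset entry) asset)
                        (contains? devices device)
                        (< now expires)))
                 store)))

(playback-allowed? entitlements "0xCUSTOMER" "asset-123" "device-1" 1468972800)
;; => true, so the player continues; false would stop playback with a message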

In the case of the generic shop contract, the assumption is that there would be a single version of this contract on the BlockChain that then handles the duties for the entire marketplace.

For asset based contracts, each asset could have its own contract deployed on release to the marketplace. Each transaction for that asset would be stored in the contract.

For customer based contracts, each customer would have an account contract deployed on signup that then contained a list of available assets.

In all three scenarios, the assumption is that contract storage would be used to store the relevant information. However, if the contract is part of a private BlockChain, it could add blocks to that BlockChain rather than using contract storage.

In either case, storing information back to the BlockChain – either in contract storage or as blocks – has a cost associated with it that is proportional to the amount of data to be stored.

Each version has its benefits and drawbacks as outlined below.

Generic Shop Contracts

A generic contract like this acts as the central store in the asset lifecycle. The system would allow Customers to sign in with their account address (allowing access to the customer’s wallet) and then handle the transaction between the buyer and seller of the asset.

This would allow for open purchases, limited purchases (time, device, resolution and/or playback limits), as well as rentals or downloads.

Pros
1. Potential for single contract for all transactions.
2. If necessary, each Provider could extend the basic contract via inheritance to enable additional access rights that are relevant to them.

Cons
1. Would require extensive storage within the contract.
2. Speed of searching the storage to find a match could be detrimental to playback.
3. Single point of vulnerability – contract would need to be complex to handle all provider requirements.
4. Adding new requirements or providers could be problematic.

Asset Based Contracts

These contracts are associated with a specific asset or title. This would allow for a contract that holds the basic asset information, which can then be extended using inheritance to add contracts for specific versions or use-cases.

The provider can modify the contracts to better reflect their actual requirements on a per asset basis. New or popular titles can have tighter requirements. As assets get older, new contracts can be issued that reflect the changes.

Star Wars
  |
  |--- Star Wars UHD
  |--- Star Wars HD
  \--- Star Wars SD

Potentially, the entire contract could be updated, allowing existing owners to benefit without changes being made for individuals. For example, if the SD version was no longer available, all the current SD owners could be migrated across to the HD version without any action on their part.

The sale of the asset would need to be registered within the storage of the associated contract.

Playback would require searching the relevant asset contract for permission to proceed.

Pros
1. Allows for fine control over assets.
2. Can deploy multiple contracts & sub-contracts to limit risk of exposure or speed up validation searches.
3. Can revoke entire contract or sub-contract if asset security breached.

Cons
1. Lots of storage if asset is popular.
2. Speed of searching through storage.
3. More management for providers.

Customer Based Contracts

These contracts are associated with a specific customer. It allows for a contract that holds basic information about the customer. If necessary, the providers could extend this info via sub-contracts to ensure that they had information relevant to their access requirements.

Customer Account
  |
  |--- Studio 1 Account
  \--- Studio 2 Account

Pros
1. All of a customer’s assets in a single storage tree.
2. Quicker rights searches.
3. Single master account for all providers.
4. Potentially easier to validate all assets on start-up allowing for status messages/alerts and removal of expired assets.

Cons
1. Customer loses the account, loses all the content.
2. Assumes all providers use the same system.
3. Account breached, all content exposed.

Ethereum – A Quick Overview

Posted: July 18, 2016 in Work Stuff

Introduction

Ethereum is a technology that extends the concept of the secure BlockChain (made popular by BitCoin) by the addition of smart contracts. These contracts allow agreements between two or more parties to be facilitated, enforced and validated.

The idea is that the smart contracts are written to the BlockChain and then deployed into a virtual machine that then runs inputs against the contracts and returns results. The contracts are written using one of two Ethereum-specific languages and then compiled into a bytecode version for use by the Ethereum ecosystem in the form of DApps.

Once deployed to the BlockChain the contracts cannot be changed, but can be used by anyone.

The Idea

The initial concept behind investigating the use of smart contracts in Ethereum was the implementation of a marketplace for video assets. By using a smart contract, we could potentially enable both digital and access rights management for an asset, without the need for third-parties to provide ‘trusted services’.

This would allow Studios to sell niche or older assets directly to customers without the costs associated with going through the normal video sale channels.


The Issues

Investigations into using Ethereum have run into issues that have effectively prevented us from building anything that works, let alone an actual prototype.

The platform was evolving fairly rapidly, with all the associated issues that this causes – out-of-date tutorials and documentation, broken tools, fragmented development environments and frameworks. Given the scale of the platform, this fragmentation is more obvious and problematic.

From A 101 Noob Intro to Programming Smart Contracts on Ethereum

A development cycle in Ethereum would be to develop a contract locally, with testing to ensure it works. You can then deploy either to the Ethereum TestNet or to a private BlockChain that you have created on your own network. This allows access by multiple users simultaneously to ensure that the contract still behaves as expected. Finally, the contract would be deployed into the real world for use by the DApp clients.

Tools

To build a contract you need a set of tools that work – even at this early stage we would expect the ability to write a contract and debug it without significant issues.

However, the toolset to build and deploy contracts is not complete. Issues include the deprecation of key components without working replacements. Tools such as AlethZero that supposedly allowed you to deploy contracts and interact with them are being replaced by a new IDE called Mix.

Mix needs to be built from source, but fails to build on most occasions. Once it does build, it is unstable and exits without warning when trying to write and test contracts. Mix is apparently the only tool that allows you to see inside a BlockChain for debugging. Without this ability, it becomes difficult to properly test contracts and the libraries trying to interact with them.

While AlethZero still exists, it is no longer updated and now seems to be missing functionality as the platform evolves, limiting its usefulness.

The core Ethereum ecosystem is written in Go and C++, with two distinct languages created for writing contracts – Serpent (with a syntax similar to Python) and Solidity (with a syntax similar to JavaScript). Both languages need to be compiled to EVM bytecode (along with an ABI definition describing the contract’s interface) prior to use.

Converting to bytecode requires a compiler that also needs to be built from source – although this is a more reliable process than building Mix, it is still unpredictable. Tools such as Mix have a compiler built in, but as they are unstable, getting a contract written in another editor and then converting it is an awkward process.

Frameworks

There are two significant frameworks written by third-parties that attempt to make the contract creation process much less painful – Truffle and Embark. Both are NodeJS based apps that allow you to undertake some contract development. They allow for the creation of a contract, compilation to a JavaScript library and deployment as part of a web-based app. However, neither allows you to actually see the resulting BlockChain and its contents for debugging, and neither is of any use if you want to write an app that is not HTML based.

Both frameworks also contain bugs that have caused issues when trying to build contracts.


Summary

While the idea behind Ethereum and the use of Smart Contracts has a huge amount of potential, it is currently hampered by a poor toolset and development environment. This situation is not likely to improve in the near future, as the platform is frozen while the community tries to find a solution to the problems caused by the loss of almost $53 million in funds from the DAO core system through a hack.

This hack also served to underline some significant flaws in the virtual machine that runs everything and the languages that have been built upon it. It appears that changes to the underlying VM will be needed to remove the revealed security flaws. Developing a new VM would mean rebuilding the BlockChain from scratch on the new VM, something that is unlikely to happen given the amount of money currently locked into the Ethereum ecosystem.

The available languages, although presented as being especially tailored for Smart Contracts, are not much different from regular programming languages (in fact, they are Turing complete). This means that everything written in them is entirely procedural and that no notions, terminology or vocabulary of high-level contracting concepts can be used in the contracts’ code. While this might not seem important at first sight, it actually entails two big problems:

1. The procedural code can potentially be too complex to understand or verify prior to execution. This is one of the main reasons that the DAO code, even after being audited twice, had important vulnerabilities that caused the problems mentioned above.

2. Contracts cannot be properly monitored or formally checked without the use of high-level contracting notions (fulfilment, violation, parties, clauses) at the language level. This is widely accepted in the field of electronic contracting research, but the Ethereum developers have completely ignored it.

Given the weakness and instability in the platform and its toolset, we feel that further investigation into this technology needs to wait, either for an alternative platform to emerge or for Ethereum to address these issues and create a more stable toolset.

CLJ vs CLJS

Posted: May 10, 2016 in Work Stuff

The default languages in the team are Clojure and NodeJS; these provide both the power and speed necessary for rapid prototyping and concept development.

Clojure (clj) has a sub-project that allows you to write code in Clojure and then compile it into JavaScript that can be run in any normal browser. This is called ClojureScript (cljs), and it appears to give you all the benefits of Clojure combined with the benefits of Node.

A recent pure Clojure project provided an API with an in-memory (with persistent file-based storage) database. While this works perfectly, is stable and runs without any issues, the time to deploy it on our Kubernetes cloud can be in excess of 10 minutes.

As an extended spike, we decided to see how easy it would be to migrate the app to NodeJS via ClojureScript.

The Pros

  • Re-use of large chunks of the existing Clojure code base without change
  • Ability to create CLJC (common) files that can be used in either pure Clojure or in ClojureScript
  • Asynchronous application
  • Fast deployment and start times

The Cons

Problem Number 1 – Data Storage

There are few reliable, in-memory-with-persistence, JS-only database libraries. There are even fewer that also compile happily with ClojureScript. Most of the databases that JS uses in Node rely on external servers – MongoDB, CouchDB.

Given the lightweight nature of the data being stored, and the ability to run it without relying on additional servers, I initially used SQLite via the SQLite3 NPM library. While this appeared to compile fine, run-time errors meant that the DB was never properly built or connected. Using native NodeJS to create, write and read back the same data did not give us errors, so it appears that the library uses something that the compilation process doesn’t handle.

Swapping to the lighter-weight DBLite NPM library allowed us to build the new db, populate it and then use it via the app.

The app followed the usual path of being Dockerised and then run in Kubernetes. Here we hit a strange situation that is probably NOT related to ClojureScript, but is worth flagging: while the SQLite DB would initialise quite happily in a Docker container run locally, deploying the same container to the Kubernetes platform consistently gave us a DB error saying the file was locked.

Eventually, I removed DBLite and tried a third database system called NeDB. While this required re-writing sections of the DB system to use a MongoDB/Key-value style of record handling, the app now deploys and runs without issue.
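
For reference, a minimal ClojureScript sketch of driving NeDB through Node interop looks something like this (the file path and record shape are illustrative, and it assumes a NodeJS compilation target):

(def Datastore (js/require "nedb"))

;; in-memory store persisted to a file, loaded up front
(def db (Datastore. #js {:filename "/data/assets.db" :autoload true}))

;; MongoDB/key-value style record handling
(.insert db #js {:id 1 :title "example"}
         (fn [err doc] (when err (println "insert failed:" err))))

(.find db #js {:id 1}
       (fn [err docs] (println "found:" (js->clj docs))))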

Problem Number 2 – Platform Bias

ClojureScript is written to compile to browser JavaScript, not NodeJS. This means that things can fail with strange error messages. Initially I tried to use the Request NPM library to make HTTP GET and POST calls to an external REST API. Although the code didn’t throw any errors during the compilation process, when it was used in the actual program it would return an error:

“XMLHTTP Not found”

It appears that as ClojureScript is browser focused, it makes use of, or at least assumes the presence of, libraries built into the browser. Using NPM dependencies that also rely on them under NodeJS – where they don’t exist – naturally causes problems. As these issues don’t prevent compilation, errors only appear at run-time.

Reverting to built-in NodeJS functionality – in this case replacing the Request NPM library with the native http module’s request function – removes this as an issue, but does mean that additional work is required to write and then use HTTP calls.
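
A rough ClojureScript sketch of what that looks like with the built-in http module (the host, path and handler here are hypothetical, and error handling is omitted):

(def http (js/require "http"))

(defn get-body [host path handler]
  (.get http #js {:host host :path path}
        (fn [res]
          (let [body (atom "")]
            ;; collect the chunks, then hand the whole body to the handler
            (.on res "data" (fn [chunk] (swap! body str chunk)))
            (.on res "end"  (fn [] (handler @body)))))))

(get-body "example.com" "/api/things" #(println "response:" %))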

Problem Number 3 – Logging frameworks might not run

Normal NodeJS development will often make use of a third-party logging framework or tool, but these appear to suffer from having their output curtailed. For example, the ‘Sexylog’ NPM, when used in TRACE mode, did not seem to produce as much output as I would expect from previous experience.

However, the built-in logging and use of the normal ‘console.log’ probably limit the effect this will have on development. As with the platform bias above, it pays to keep it simple and utilise the native capabilities of NodeJS & ClojureScript rather than add complexity in the form of external libraries.

Problem Number 4 – Scaffolding

Clojure requires a fair amount of scaffolding to set up, compile and run the final code. While most of this is handled by Leiningen (or Boot), the number and variety of frameworks/task runners/helpers that can be called on to set up the initial project is overwhelming.

Granted, most of these are targeted at specific outcomes – use Om if you want to build a React-based app – but it is not always obvious which framework will be the best to use in a given situation.

This is more of a personal peeve than a technical one – as someone who is new to Clojure I seem to spend a lot of time at the beginning trying to remember what I should do to start off a new project. Do I use figwheel or not? What was figwheel? Is it an app or a library? What’s the difference?

Now add the additional complexity on top of that to ensure that we can use ClojureScript… I’ve still not figured out how I can scaffold a Clojure backend with a ClojureScript front end.

Problem Number 5 – Clojure Libraries

Like the Node NPM libraries discussed above under Platform Bias, not all Clojure libraries will run with ClojureScript. So this often leaves us with a Clojure library we may know and use that we cannot use via ClojureScript.

The Gotchas

Other little things that have caught me out:

1) During compilation, missing dependencies (NPM or Clojure) don’t get flagged.

So a missing NPM/Clojar dependency won’t show up until everything falls over at runtime.

2) EDN -> JSON data conversion will break if the EDN uses numbers as keywords.

The EDN {:1 "some data", :2 "other thing"} won’t convert cleanly to JSON, because JSON object keys must be strings.

{1:"some data", 2:"other thing"} is NOT valid JSON.

3) REPL is missing

One of the best things about developing in Clojure is the REPL. It allows you to immediately see the effect changes to the code will have. As the ClojureScript compiles down to JavaScript, it appears that this level of immediate feedback isn’t possible – although this might just be in the Atom-based Proto REPL add-on.

4) The ’new’ keyword and callbacks

We’ve all dealt with callback hell in JavaScript… thankfully, as ClojureScript is asynchronous, callbacks are not normally an issue. However, when we try to use some NodeJS libraries or functions, callbacks are essential. This leaves us no option but to ‘hack’ our way around the problem by using either Clojure channels or anonymous functions.

Eg:

JS:   db.loadDatabase(function (err) { console.log(err); });
CLJS: (.loadDatabase db (fn [err] (println err)))

Of the two options, channels are by far the more efficient, as they allow the output to be piped directly to the calling function without additional processing – something that is more in line with functional programming.
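
A short sketch of the channel approach (the namespace, datastore setup and the :ok sentinel are assumptions for illustration – the point is that the NodeJS callback just drops its result onto a channel for the caller to read):

(ns example.db
  (:require [cljs.core.async :refer [chan put! <!]])
  (:require-macros [cljs.core.async.macros :refer [go]]))

(def Datastore (js/require "nedb"))
(def db (Datastore. #js {:filename "/data/assets.db"}))

(defn load-db! [datastore]
  (let [c (chan)]
    ;; the callback's only job is to put its result on the channel
    (.loadDatabase datastore (fn [err] (put! c (or err :ok))))
    c))

;; the caller reads from the channel instead of nesting callbacks
(go (println "load result:" (<! (load-db! db))))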

The Outcome

Although the outcome was a working version of the API, the extra time spent trying to find solutions to the differences between Browser JS and NodeJS engines was considerable.

The ability to migrate a significant amount of the internal processes – data structures, validation and algorithms – across without change initially made the re-write very easy. It was when issues started to appear at the two ‘ends’ of the program that things began to slow down.

Getting data in and out via the web interface and getting data in and out of the storage system both showed where the NodeJS compiled ClojureScript wasn’t quite working.

Based on this spike, we have decided to continue the development of the API as a Clojure project for now… but with a view to using more CLJC (common) files that will allow us to migrate to ClojureScript at a later date if we need to. It is our belief that the development of the ClojureScript compiler is moving forward and that the differences highlighted between the browser and Node JavaScript output will soon be handled.