Pain and the Saint

Hardly had the news of Mother Theresa of Calcutta’s beatification reached around the world when the age-old criticisms were wheeled out once again and dusted off, as befits the tired tropes they are. Many of them are somewhere on the road between Ridiculousville and Obscenetown, and some are just plain exaggerations. There is one, however, that is a little more complex, and that hits a little closer to home.

This one goes something like this: “Mother Theresa, because of her beliefs, glorified pain, and therefore let poor people die without adequate pain relief”. The first part is pure speculation, premised on the second, which derives from a letter by the surprisingly undistinguished Dr Robin Fox, editor of the Lancet from 1990 to 1995, examining the state of medical care in Mother Theresa’s Calcutta institution.[1] Four things before all:

  • One, it’s what is called a ‘letter to the editor’, and as such not subject to peer review before publication – especially not when it’s by the editor himself. It is not a peer-reviewed study. It is not scientific evidence. It is not science. It is a travel report, presented without corroboration.
  • Two, Dr Fox is not an anaesthesiologist, a specialist in palliative care or an expert in terminal care analgesia. He did not bother to take one along, domestic or foreign.
  • Three, and most damningly: Dr Fox did not compare what he saw in Calcutta with what he would have seen at any other Indian hospital. Had he done so, he might have learned a little about the difficulty of obtaining adequate pain medication in India.
  • Four, saints are not perfect, and neither is the Lancet. In recent years, the Lancet has committed a number of fairly egregious sins against good research, viz. the Burnham Iraq mortality paper[2], the Bristol Cancer Centre study[3] and more than a few other blunders. These are just the biggies – and these were actual papers, i.e. peer-reviewed. Imagine the non-peer-reviewed stuff. Non-scientists tend to have an elevated image of journals, especially well-known ones with a high Impact Factor, like Science, the Lancet or the NEJM, ignoring that they, too, are fairly flawed products of fairly flawed human institutions.

The religious angle

Let’s dispose of one of the more pernicious arguments right here. It is sometimes argued that Mother Theresa intentionally let people die in pain because suffering is great and Catholics hate adequate analgesia. That’s false on both counts.

There’s a difference between saying ‘suffering is meaningful’ and ‘suffering is great’. There’s nothing positive about avoidable suffering. Consider §2279 of the Catechism of the Catholic Church:

Even if death is thought imminent, the ordinary care owed to a sick person cannot be legitimately interrupted. The use of painkillers to alleviate the sufferings of the dying, even at the risk of shortening their days, can be morally in conformity with human dignity if death is not willed as either an end or a means, but only foreseen and tolerated as inevitable. Palliative care is a special form of disinterested charity. As such it should be encouraged.

The above is completely in line with modern medical ethics, which permits ‘terminal sedation’ or ‘terminal analgesia’ – administering adequate pain medication (which can often mean rather high doses) even if this will almost inevitably hasten death (owing to the respiratory-suppressive effects of opiates/opioids) – but not the intentional use of pain medication to kill. There is a whole realm of ethics, medical and otherwise, on the doctrine of double effect involved here (as a good starter, interested readers should consider Gillon (1986)), but what is clear beyond doubt is that this position of the Church, binding on Mother Theresa and, insofar as one can tell, flawlessly applied, does not oppose proper analgesia at any point. She was not, in other words, getting kicks out of people dying in pain. She started her work to do what she could to keep people from dying an undignified and horrible death on the streets, and to let them instead spend their last days or hours in dignity. The cost of this was, as her own writings reveal, extreme emotional distress, nightmares and what could without doubt be diagnosed as a severe anxiety disorder. If anything about dispensing medicine at her house is strange, it is that she did not start dipping into the Xanax jar.

The medical angle

Perhaps it deserves mention that Mother Theresa’s order did not run a hospital, or a hospice in the modern sense of the word. In fact, hospices in the Western sense, which Dr Fox seems to compare Mother Theresa’s institutions with, did not exist in India at the time and remain fairly scarce. She ran an institution with very modest means, staffed by volunteers, that aimed at giving dying people some dignity. None of them were forcibly picked up on the street by jack-booted nuns and told they were going to go to Mother T’s, or else. It was up to each of them to decide whether they wanted to come or not.

As such, the criticism that the institution did not distinguish between the terminal and the non-terminal is rather strange, because 1) it was not a hospital or hospice in the modern, Western sense of the word, 2) its care was not specific to the dying – the sick can derive significant help from being in a clean, safe environment – and 3) it lacked the medical resources.

In an alternate universe, Mother Theresa would have had at her avail the sums needed to run a properly staffed medical institution, with doctors and referrals and all the drugs in the world. In that perfect universe, perhaps the abject poverty that made the alternative dying on the streets would not have existed, either. Of course the care she administered was, judged from the perspective of a hospital, inadequate. But she was at no point running a hospital. Much as you don’t expect your hairdresser to have an M.D., her institution was what it was. Equally, canonisation is what it is – it is not a medical doctorate, nor the Church sending ‘atta girls’ for a hospital well run.

The personal angle

There’s a reason why this story hits home for me. I tend not to speak about this publicly, but I have been living for a long, long time now with extremely severe, often intractable, pain. The international politics of pain medication and its availability, closely linked with narcopolitics, is one of those pet topics of mine I can bore people with into a stone-cold stupor. What it boils down to is this: proper pain relief is an integral part of human dignity. A patient in unmanaged or inadequately managed pain is an inadequately treated patient, and ignoring pain relief is medical malpractice of the worst kind.

With that said, I’ve also been at the forefront of research into pain, both as a subject and as a participant. I have had the pleasure to try quite a few modern approaches to pain management, and I’m hoping to be able to find some better ones. Throughout all this, I have been aware of the risks and complexity of analgesia. With a frail, terminal patient, it gets even more complicated.

Strong pain medication is not like Tylenol, where you can simply pop a few and things get better. In general, patients are started on a low-dose PRN (‘as needed’) oral opioid in conjunction with an NSAID; the PRN dose is then gradually increased (a process called ‘titration’) until it manages their pain. Then the PRN opioid is converted into a long-term opioid, such as a matrix patch, which releases the drug into the fatty tissues over time (usually three to five days), or a long-acting time-release opioid formulation, together with low doses of the PRN opioid for ‘breakthrough pain’ – pain spikes no longer treated adequately by the long-term pain medication. Alternatively, severely ill or bedbound patients may be offered a solution like PCA, which infuses a constant stream of an opioid with the option for the patient to trigger a limited number of ‘bolus’ doses for pain spikes – these are used, e.g., in the post-surgical context, for the first day or so after an operation. More complex solutions exist for chronic complex pain, such as spinal catheters, neurosurgery, implantable pain pumps, implantable spinal cord stimulators and so on. This is the state of the art, today, in the West, in 2016. In Mother Theresa’s day, pain patches barely existed and were certainly not available to her. The only thing her houses could have had was oral morphine sulphate or IV morphine.

Pain medicine is one of the most expensive branches of medical care, despite the fact that most pain medications are cheap as chips. The reason is the incredible attention required, and the risk involved, in administering pain medication. Especially before antagonists like naloxone became widely available and financially feasible, it would have taken a host of highly qualified doctors to dispense pain medication appropriately in Mother Theresa’s institutions. Dr Fox admonishes her for not stocking strong opioids – but really, should she not be praised instead for not stocking potentially fatal pain medicine that takes specialist care to administer, specialist care her people lacked and that legally requires doctors her houses did not have and could not afford to have on staff?

Perhaps the more appropriate response is immense personal gratitude that, in our world, we have ways of managing pain that the poorest of Calcutta had no access to. Truly, we are blessed. As were they, when Mother Theresa gave them the small mercy of a modicum of care and respect for their dignity in their last hours.

One should never stop hoping, and demanding, that the world improve. But nor should one confuse a desire for a better world with a blanket condemnation of those who made do with the world they were handed. In a demanding and dire situation – so dire that it drove Mother Theresa herself to anxiety and insomnia in the beginning – she did the best she could with what she had. The price of aspiring to a better world should not be the denigration of those who had to live in this one.

References

1. http://www.sciencedirect.com/science/article/pii/S0140673694923531
2. http://www.thelancet.com/journals/lancet/article/PIIS0140-6736(06)69491-9/abstract
3. http://www.thelancet.com/journals/lancet/article/PII0140-6736(90)93402-B/abstract

The one study you shouldn’t write

I might have my own set of ideological prejudices,[1] but of this I am more certain than I am of any of them: show me proof that contradicts my most cherished beliefs, and I will read it, evaluate it critically and, if it holds up, learn from it. This, incidentally, is how I ended up believing in God and casting away the atheism of my early teens – but that’s a lateral point.

As such, I’m in support of every kind of inquiry that does not, in its process, harm humans (I am, you may be shocked to learn, far more supportive of torturing raw data than people). There’s one exception. There is one study for every sociologist, every data scientist, every statistician, every psychologist, everyone – the one study that you should never write: the study that proves that your ideological opponents are morons, psychotics and/or terminally flawed human beings.[2]

Virginia Commonwealth University scholar Brad Verhulst, Pete Hatemi (now at Penn State, my sources tell me) and poor old Lindon Eaves – who, of all of the aforementioned, should really know better than to darken his reputation with this sort of nonsense – have just learned this lesson, at what I believe will be a minuscule cost to their careers compared to what this error ought to cost any researcher in any field.

In 2012, the trio published an article in the American Journal of Political Science, titled Correlation not causation: the relationship between personality traits and political ideologies. Its conclusion was, erm, ground-breaking for anyone who knows conservatives from more than the caricatures they have been reduced to in the media:

First, in line with our expectations, higher P scores correlate with more conservative military attitudes and more socially conservative beliefs for both females and males. For males, the relationship between P and military attitudes (r = 0.388) is larger than the relationship between P and social attitudes (r = 0.292). Alternatively, for females, social attitudes correlate more highly with P (r = 0.383) than military attitudes (r = 0.302).

Further, we find a negative relationship between Neuroticism and economic conservatism (r_{females} = −0.242, r_{males} = −0.239). People higher in Neuroticism tend to be more economically liberal.

(P, in the above, being the score in Eysenck’s psychoticism inventory.)

The most damning words in the above were among the very first. I am not sure what’s worse here: that actual educated people believe psychoticism correlates with conservative military attitudes (because the military is known for courting psychotics, am I right? No? NO?!), or that they think it helps any case to disclose so blatant a bias quite so openly (‘in line with our expectations’). In my lawyering years, if a prosecution expert had stated that the fingerprints on the murder weapon “matched those of that dirty crook over there, as I expected”, I’d have torn him to shreds, and so would any good lawyer. And that’s not because we’re born and raised bloodhounds, but because we prefer people not to have biases in what they are supposed to opine on in a dispassionate, clear, clinical manner.

And this story confirms why that matters.

Four years after the paper came into print (why so late?), an erratum had to be published (one that, by the way, is still not reflected on a lot of sites that republished the piece). It turns out that the gentlemen writing the study had ‘misread’ their numbers. Like, real bad.

The authors regret that there is an error in the published version of “Correlation not Causation: The Relationship between Personality Traits and Political Ideologies” American Journal of Political Science 56 (1), 34–51. The interpretation of the coding of the political attitude items in the descriptive and preliminary analyses portion of the manuscript was exactly reversed. Thus, where we indicated that higher scores in Table 1 (page 40) reflect a more conservative response, they actually reflect a more liberal response. Specifically, in the original manuscript, the descriptive analyses report that those higher in Eysenck’s psychoticism are more conservative, but they are actually more liberal; and where the original manuscript reports those higher in neuroticism and social desirability are more liberal, they are, in fact, more conservative. We highlight the specific errors and corrections by page number below:

It also magically turns out that the military is not full of psychotics.[3] Whodda thunk.

…P is substantially correlated with liberal military and social attitudes, while Social Desirability is related to conservative social attitudes, and Neuroticism is related to conservative economic attitudes.

“No shit, Sherlock,” as they say.

The authors’ explanation is that the dog ate their homework. Ok, only a little bit better: the responses were “miscoded”, i.e. it’s all the poor grad student sods’ fault. Their academic highnesses remain faultless:

The potential for an error in our article initially was pointed out by Steven G. Ludeke and Stig H. R. Rasmussen in their manuscript, “(Mis)understanding the relationship between personality and sociopolitical attitudes.” We found the source of the error only after an investigation going back to the original copies of the data. The data for the current paper and an earlier paper (Verhulst, Hatemi and Martin (2010) “The nature of the relationship between personality traits and political attitudes.” Personality and Individual Differences 49:306–316) were collected through two independent studies by Lindon Eaves in the U.S. and Nicholas Martin in Australia. Data collection began in the 1980s and finished in the 1990s. The questionnaires were designed in collaboration, with one of the goals being to compare and combine the data for specific analyses. The data were combined into a single data set in the 2000s to achieve this goal. Data are extracted on a project-by-project basis, and we found that during the extraction for the personality and attitudes project, the specific codebook used for the project was developed in error.

As a working data scientist and statistician, I’m not buying this. The study has, for all its faults, intricate statistical methods; it is well done from a technical standpoint. It uses Cholesky decomposition and displays a relatively sophisticated statistical approach, even if at times it borders on the bizarre. The causal analysis is an absolute mess, and I have no idea where the authors got the idea that a correlation over 0.2 is “large enough for further consideration” – that is not a scientifically accepted notion. A correlation is either significant or not significant; there is no weird middle way of “give us more money, let’s look into it more”. The point remains, however, that the authors, while practising a good deal of cargo-cult science, managed to overlook an epic blunder like this. How could that have happened?

Well, really, how could it have happened? I trust the explanation lies in the words I pointed out earlier. The authors suffered from what forensic science calls “cognitive contamination”: they had an idea of what conservatives and liberals are like, and those ideas were caricaturesque in the extreme. They were blind as bats, blinded by their own ideological biases.

And that is my point: there are, sometimes, articles that you shouldn’t write.

Let me give you an analogy. My religion has some pretty clear rules about what married people are, and aren’t, allowed to do. What my religion also happens to say is that it’s easier not to mess these things up if you do not court temptation. If you are a drug addict, you should not hang out with cokeheads. If you are a recovering alcoholic, you will not exactly benefit from joining your friends on a drunken revelry. And if you’ve got political convictions, you are more prone to say stupid things when you find a result that confirms your ideas. The term for this is ‘confirmation bias’; the reality is the simple human proneness to see what we want to see.

Do you remember how, as a child, you used to play the game of seeing shapes in clouds? Puppies, cows, elephants and horses? The human brain works on the Gestalt principle of reification, allowing us to reconstruct known things from their parts. It’s essential to the way our brain works. But it also makes us see the things we want to see, not what is actually there.

And that’s why you should never write that one article. The one where you explain why the other side is dumb, evil or has psychotic and/or neurotic traits.

References

1. Largely, they presume outlandish stuff like ‘human life is exceptional and always worth defending’ or ‘death does not cure illnesses’, you get my drift.
2. For starters, I maintain we all are at the very least the latter, quite probably the middle one at least a portion of the time and, frankly, the first one more often than we would believe ourselves.
3. Yes, I know a high Eysenck P score does not mean a person is ‘psychotic’ and Eysenck’s test is a personality trait test, not a test to diagnose a psychotic disorder.

Give your Twitter account a memory wipe… for free.

The other day, my wife decided to get rid of all the tweets on one of her Twitter accounts, while of course retaining all her followers. But bulk-deleting tweets is far from easy. There are, fortunately, plenty of tools that offer to bulk-delete your tweets… for a price, of course. One had a freemium model that allowed three free deletes per day. I quickly calculated that it would have taken my wife something on the order of twelve years to get rid of all her tweets. No, seriously. That’s silly. I can write some Python code to do that faster, can’t I?

Turns out you can. First, of course, you’ll need to create a Twitter app from the account you wish to wipe and generate an access token, since we’ll also be performing actions on behalf of the account.

import tweepy
import time

# Fill these in from your Twitter app's settings page.
CONSUMER_KEY = "<your consumer key>"
CONSUMER_SECRET = "<your consumer secret>"
ACCESS_TOKEN = "<your access token>"
ACCESS_TOKEN_SECRET = "<your access token secret>"
SCREEN_NAME = "<your screen name, without the @>"

Time to use tweepy’s OAuth handler to connect to the Twitter API:

auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)

api = tweepy.API(auth)

Now, we could technically write an extremely sophisticated script that looks at the returned headers to determine when the API throttle will cut us off… but we’ll take the easy and brutish route of holding off for a whole hour whenever we do get cut off. At 350 requests per hour, each capable of deleting 100 tweets, we could clear a 35,000-tweet account in a single hour with no waiting time, which is fairly decent.

The approach is simple: we ask for batches of 100 tweets, then call the .destroy() method on each of them – thanks to tweepy, the method is bound onto the object representing each tweet we receive. If we encounter errors, we respond accordingly: for a RateLimitError – an error tweepy raises when, as its name suggests, the rate limit has been exceeded – we hold off for an hour (we could elicit the reset time from the headers, but this is much simpler… and we’ve got time!); if no status is found, we simply leap over it (that sometimes happens, especially when someone is doing some manual deleting at the same time); and for anything else, we break out of the loop.

def destroy():
    while True:
        q = api.user_timeline(screen_name=SCREEN_NAME,
                              count=100)
        if not q:
            # No tweets left: we're done.
            break
        for each in q:
            try:
                each.destroy()
            except tweepy.RateLimitError as e:
                # Throttled: wait out the rate limit window.
                print(u"Rate limit exceeded: {0:s}".format(e.message))
                time.sleep(3600)
            except tweepy.TweepError as e:
                # Tweet already gone (e.g. deleted manually): skip it.
                if e.message == "No status found with that ID.":
                    continue
            except Exception as e:
                print(u"Encountered undefined error: {0!s}".format(e))
                break

Finally, we’ll make sure this is called as the module default:

if __name__ == '__main__':
    destroy()

Happy destruction!


Immortal questions

When asked for a title for his 1979 collection of philosophical papers, my all-time favourite philosopher,[1] Thomas Nagel, chose the title Mortal Questions – an apt one, for most of our philosophical preoccupations (and especially those pertaining to the broad realm of moral philosophy) stem from the simple fact that we’re all mortal, and human life is, as such, an irreplaceable good. By extension, most things that can be created by humans are capable of being destroyed by humans.

That time is ending, and we need a new ethics for that.

Consider the internet. We all know it’s vulnerable, but is it existentially vulnerable?[2] The answer is probably no. Neither would any significantly distributed, self-provisioning pseudo-AI be. And by pseudo-AI, I don’t even mean a particularly clever or futuristic or independently reasoning system, but simply a system that can provision resources for itself in response to threat factors – just as certain load balancers and computational systems we write and use on a day-to-day basis can commission themselves new cloud resources to carry out their mandate. Depending on their mandate, such systems are potentially existentially immortal, or existentially indestructible.[3]

The human factor in this is that such a system will be constrained by the mandates we give it. Ergo,[4] those mandates are as fit a subject for human moral reasoning as any other human action.

Which means we’re going to need that new ethics pretty darn fast, for there isn’t a lot of time left. Distributed systems, smart contracts, trustless M2M protocols, and the plethora of algorithms that each bring us a bit closer to a machine capable of drawing subtle conclusions from source data (hidden Markov models, 21st-century incarnations of fuzzy logic, certain sorts of programmatic higher-order logic and a few others) are all moving towards an expansion of what we as humans can create and the freedom we can give our applications. Who, even ten years ago, would have thought that one day I would be able to give a computing cluster my credit card, and if it ran out of juice, it could commission additional resources until it bled me dry and I had to field angry questions from my wife? And that was a simple, dumb computing cluster. Can you teach a computing cluster to defend itself? Why the heck not, right?

Geeks who grew up on Asimov’s laws of robotics, myself included, think of this sort of problem as largely being one of giving the ‘right’ mandates to the system: overriding mandates to keep itself safe, not to harm humans[5] or the like. But any sufficiently well-written system will eventually reach the level of the annoying six-year-old who lives for the sole purpose of twisting and redefining his parents’ words to mean the opposite of what they intended.[6] In the human world, a mandate takes place in a context. A writ is executed within a legal system. An order by a superior officer is executed according to the applicable rules of military justice, including the circumstances in which the order ought not be carried out. Passing these complex human contexts – which most of us ignore, as we do all the things we grew up with and take for granted – into a more complicated model may not be feasible. Rules cannot be formulated exhaustively,[7] as such a formulation would by definition have to encompass all past, present and future – all that can potentially happen. Thus the issue soon moves on from merely providing mandates to what in the human world is known as ‘statutory construction’, the interpretation of legislative works. How are computers to be equipped to reason about symbolic propositions according to rules that we humans can predict? In other words, how can we teach rules for reasoning about rules in a way that does not itself recurse into this question (i.e. is not based on a simple conditional rule-based framework)?

Which means that the best that can be provided in such a situation is a framework based on values, plus target-optimisation algorithms (i.e. what is the best way to reach the overriding objective with the least damage to other objectives, and so on). Which in turn will need a good bit of rethinking of ethical norms.

But the bottom line is quite simple: we’re about to start creating immortals. Right now, you can put data on distributed file infrastructures like IPFS that is effectively impossible to destroy with a reasonable amount of resources. Equally, distributed applications running on survivable infrastructures such as the blockchain, as well as smart contract platforms, are relatively immortal. The creation of these is within the power of just about everyone with a modicum of computing skills. The rise of powerful distributed execution engines for smart contracts, like Maverick Labs’ Aletheia Platform,[8] will give a burst of impetus to systems’ ability to self-provision, enter into contracts, procure services and thus even effect their own protection (or destruction). They are incarnate, and they are immortal. For what it’s worth, man is steps away from creating his own brand of deities.[9]

What are the ethics of creating a god? What is right and wrong in this odd, novel context? What is good and evil to a device?

The time to figure out these questions is running out with merciless rapidity.

Title image: God the Architect of the Universe, Codex Vindobonensis 2554, f1.v

References

1. That does not mean I agree with even half of what he’s saying. But I do undoubtedly acknowledge his talent, agility of mind, style of writing, his knowledge and his ability to write good and engaging papers that have not yet fallen victim to the neo-sophistry dominating universities.
2. I define existential vulnerability as being capable of being destroyed by an adversary that does not require the adversary to accept an immense loss or undertake a nonsensically arduous task. For example, it is possible to kill the internet by nuking the whole planet, but that would be rather disproportionate. Equally, destruction of major lines of transmission may at best isolate bits of the internet (think of it in graph theory terms as turning the internet from a connected graph into a spanning acyclic tree), but it takes rather more to kill off everything. On the other hand, your home network is existentially vulnerable. I kill router, game over, good night and good luck.
3. As in, lack existential vulnerability.
4. According to my professors at Oxford, my impatience towards others who don’t see the connections I do has led me to try to make up for it by the rather annoying verbal tic of overusing ‘thus’ at the start of every other sentence. I wrote a TeX macro that automatically replaced it with neatly italicised ‘Ergo‘. Sometimes, I wonder why they never decided to drown me in the Cherwell.
5. …or at least not to harm a given list of humans or a given type of humans.
6. Many of these, myself included, are at risk of becoming lawyers. Parents, talk to your kids. If you don’t talk to them about the evils of law school, who will?
7. H.L.A. Hart makes some good points regarding this
8. Mandatory disclosure: I’m one of the creators of Aletheia, and a shareholder and CTO of its parent corporation.
9. For the avoidance of doubt: as a Christian, a scientist and a developer of some pretty darn complex things, I do not believe that these constructs, even if omnipotent, omniscient and omnipresent as they someday will be by leveraging IoT and surveillance networks, are anything like my capital-G God. For lack of space, there’s no way to go into an exhaustive level of detail here, but my God is not defined by its omniscience and omnipotence, it’s defined by his grace, mercy and love for us. I’d like to see an AI become incarnate and then suffer and die for the salvation of all of humanity and the forgiveness of sins. The true power of God, which no machine will ever come close to, was never as strongly demonstrated as when the child Jesus lay in the manger, among animals, ready to give Himself up to save a fallen, broken humanity. And I don’t see any machine ever coming close to that.

Actually, yes, you should sometimes share your talent for free.

Adam Hess is a ‘comedian’. I don’t know what that means these days, so I’ll give him the benefit of the doubt here and assume he’s someone paid to be funny, rather than someone living with their parents and occasionally embarrassing themselves at Saturday Night Open Mic. I came across his tweet from yesterday, in which he attempted some sarcasm aimed at an advertisement in which Sainsbury’s was looking for an artist to refurbish their Camden canteen free of charge.

Now, I’m married to an artist. I have dabbled in art myself, though with the acute awareness that I’ll never make a darn penny at it, given my utter lack of a) skills and b) talent. As such, I have a good deal of compassion for artists who are upset when clients, especially fairly wealthy ones, ask young artists and designers at the beginning of their careers to create something for free. You wouldn’t tell a junior solicitor or a freshly qualified accountant to handle your legal matters or your accounts for free to ‘gain experience’, ‘get some exposure’ and ‘perhaps get some future business’. Such requests disregard the fact that artists are, like members of any other profession, working for a living and have bills to pay.

Then there’s the reverse of the medal. I spend my life in a profession that has a whole culture of giving its knowledge, skills and time away for free. The result is an immense body of code and knowledge that is, I repeat, publicly available for free. Perhaps, if you’re not in the tech industry, you might want to stop and think about that for five minutes. The multi-trillion-dollar industry that is the internet and its associated revenue streams – from e-commerce through Netflix to, uh, porn (regrettably, a major source of internet-based revenue) – relies for its very operation on software that people built for no recompense at all, and/or which was open-sourced by large companies. Over half of all web servers globally run Apache or nginx, both under open-source licences.[1] To put it in other words: over half the servers on the internet run software whose creators are not paid a single penny.

The most widespread blog engine, WordPress, is open source. Most servers running SaaS products use an open-source OS, usually something *nix-based. Virtually all programming languages are open source – freely available and provided for no recompense. Closer to the base layer of the internet, the entire TCP/IP stack is open, as is BIND, the de facto gold standard for DNS servers.[2] And whatever your field, chances are there is a significant open-source community in it.

Over the last decade and a bit, I have open-sourced quite a bit of code myself. That’s, to use Mr Hess’s snark, free stuff I produced to, among other things, ‘impress’ employers. A few years ago, I attended an interview for the data department of a food retailer. As a ‘show and tell’ piece, I brought them a client for their API that I had built and open-sourced over the days preceding the interview.[3] They were ready to offer me the job right there and then. But it takes patience and faith – patience to understand that the rewards for this sort of work are not immediate, and faith in one’s own skills to know that they will someday be recognised. That is, of course, not the sole reason – or even the main reason – why I open-source software, but I would lie if I pretended it was not sometimes at the back of my head.

At which point it’s somewhat ironic to see Mr Hess complain about an artist being asked to do something for free (and he wasn’t even approached – this is a public advertisement in a local fishwrap!) while using a software pipeline worth millions that people have built, and simply given away, for free, for the betterment of our species and our shared humanity.

Worse, it’s quite clear that this seems to be an initiative not by Sainsbury’s but by a few workers who want slightly nicer surroundings and cannot afford to pay for them. Note that it’s the staff canteen, rather than the customer areas, that is to be decorated. At this point, Mr Hess sounds greedier than Sainsbury’s. Who, really, is ‘exploiting’ whom here?

In my business life, I would estimate the long-term return on work I have done free of charge at 200-300%. That includes, for the avoidance of doubt, people for whom I’ve done work and who ended up never paying me anything at all. I’m not sure how it works in comedy, but in the real world, occasionally doing something for someone else without demanding recompense is not only lucrative, it’s also beneficial in other ways:

  • It builds connections because it personalises a business relationship.
  • It builds character because it teaches the value of selflessness.
  • And it’s fun. Frankly, the best times I’ve had during my working career usually involved unpaid engagements, free-of-charge investments of time, open-source contributions or volunteer work.

The sad fact is that many, like Mr Hess, confuse righteous indignation about those who seek to profit off ‘young artists’ by exploiting them with the terrific, horrific, scary prospect of doing something for free just once in a blue moon.

Fortunately, there are plenty of young artists eager to show their skills who have either more business acumen than Mr Hess or more common sense than to publicly turn their noses up at the fearsome prospect of actually doing something they are [supposed to be] enjoying for free. As such, I doubt that the Camden Sainsbury’s canteen will go undecorated.

Of the 800 or so who retweeted him, I suspect few would heed a word of wisdom (the retweets are awash with remarks in various degrees confused, irate or just full of creative smuggity smugness), but for the rest, I’d venture the following:[4]

If you want to make a million dollars, you’ve got to first make a million people happy.

The much-envied wealth of Silicon Valley did not happen because anyone greedily demanded an hourly rate for every line of code they ever produced. It happened because of the realisation that we are all but dwarfs on the shoulders of giants, and ultimately our lives will be made not by what we secret away but by what others share to lift us up, and by what we share to lift up others.

You are the light of the world. A city seated on a mountain cannot be hid. Neither do men light a candle and put it under a bushel, but upon a candlestick, that it may shine to all that are in the house. So let your light shine before men, that they may see your good works, and glorify your Father who is in heaven.

Matthew 5:14-16

Title image: The blind Orion carries Cedalion on his shoulders, from Nicolas Poussin’s The Blind Orion Searching for the Rising Sun, 1658. Oil on canvas; 46 7/8 x 72 in. (119.1 x 182.9 cm), Metropolitan Museum of Art.

References

1. Apache and a BSD variant licence, respectively.
2. DNS servers translate verbose and easy-to-remember domain names to IP addresses, which are not that easy to remember.
3. An API is the way third-party software can communicate with a service. API wrappers or API clients are applications written for a particular language that translate the API to objects native to that language.
4. Credited to Dale Carnegie, but reportedly in use much earlier.

Diffie-Hellman in under 25 lines

How can you and I agree on a secret without an eavesdropper being able to intercept it? At first, the idea sounds absurd – for the longest time, encryption without a pre-shared secret was seen as impossible. In World War II, the Enigma machines relied on a fairly complex pre-shared secret: the Enigma configurations (consisting of the rotor drum wirings and number of rotors specific to the model, the Ringstellung of the day, and the Steckerbrett configuration) were effectively the pre-shared key. During the Cold War, field operatives were provided with randomly (if they were lucky) or pseudorandomly (if they weren’t, which was most of the time) generated[1] one-time pads (OTPs) with which to encrypt their messages. Cold War-era Soviet OTPs were, of course, vulnerable, because like most Soviet things they were manufactured sloppily.[2] But OTPs suffer from a bigger problem: if the key is known, the entire scheme of encryption is defeated. And somehow, you need to get that key to your field operative.
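To make the pad-reuse point concrete, here is a toy Python sketch (the messages and pad are invented for illustration): a one-time pad is just a byte-wise XOR, and XORing two ciphertexts encrypted under the same pad cancels the pad out entirely, leaking the XOR of the two plaintexts.

import secrets

def otp(message: bytes, pad: bytes) -> bytes:
    # XOR each byte with the pad. With a truly random, never-reused pad
    # as long as the message, this is information-theoretically secure;
    # the same function decrypts, since XOR is its own inverse.
    return bytes(m ^ k for m, k in zip(message, pad))

pad = secrets.token_bytes(16)
c1 = otp(b"ATTACK AT DAWN!!", pad)
c2 = otp(b"RETREAT AT DUSK!", pad)   # pad reuse: the cardinal sin

# An eavesdropper XORing the two ciphertexts recovers the XOR of the two
# plaintexts without ever touching the key -- the flaw Venona exploited.
leaked = bytes(a ^ b for a, b in zip(c1, c2))
assert leaked == otp(b"ATTACK AT DAWN!!", b"RETREAT AT DUSK!")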

Enter the triad of Merkle, Diffie and Hellman, who in 1976 found a way to exploit the fact that modular exponentiation is computationally easy, while its reverse – the discrete logarithm – is computationally hard. From this, they derived the algorithm that came to be known as the Diffie-Hellman key exchange.[3]


How to cook up a key exchange algorithm

The idea of a key exchange algorithm is to end up with a shared secret without ever transmitting that secret. In other words, the assumption is that the communication channel is unsafe. The algorithm must withstand an eavesdropper seeing every single exchange.

Alice and Bob must first agree on a modulus p and a base g, such that the base is a primitive root modulo the modulus.

Alice and Bob each choose a secret key a and b respectively – ideally, randomly generated. The parties then exchange A = g^a \mod(p) (for Alice) and B = g^b \mod(p) (for Bob).

Alice now has received B. She goes on to compute the shared secret s by calculating B^a \mod(p) and Bob computes it by calculating A^b \mod(p).

The whole story is premised on the equality of

A^b \mod(p) = B^a \mod(p)

That this holds nearly trivially true should be evident from substituting g^b for B and g^a for A. Then,

g^{ab} \mod(p) = g^{ba} \mod(p)

Thus, both parties get the same shared secret. An eavesdropper would only be able to get A and B. Given a sufficiently large prime p, in the range of 600-700 digits, the discrete logarithm problem of retrieving a from A = g^a \mod(p), even knowing g and p, is not efficiently solvable, not even given fairly extensive computing resources.
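True to the title of this post, the whole exchange fits in well under 25 lines of Python. Here is a minimal sketch, using the tiny textbook parameters p = 23 and g = 5 so the numbers stay legible – a real implementation would, of course, use a standardised group with a prime hundreds of digits long.

import secrets

# Toy textbook parameters -- fine for illustration, useless for security.
p = 23   # the modulus: a prime
g = 5    # the base: a primitive root modulo 23

# Each party picks a random private exponent...
a = secrets.randbelow(p - 2) + 1   # Alice's secret
b = secrets.randbelow(p - 2) + 1   # Bob's secret

# ...and transmits only g^x mod p. An eavesdropper sees A and B, but
# recovering a or b from them is the discrete logarithm problem.
A = pow(g, a, p)   # Alice -> Bob
B = pow(g, b, p)   # Bob -> Alice

# Both sides now derive the same secret without ever transmitting it:
# B^a = g^(ba) = g^(ab) = A^b (mod p)
s_alice = pow(B, a, p)
s_bob = pow(A, b, p)

assert s_alice == s_bob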

References

1. As a child, I once built a pseudorandom number generator from a sound card, a piece of wire and some stray radio electronics, which basically rested on a sampling of atmospheric noise. I was surprised to learn much later that this was the method the KGB used as well.
2. Under pressure from the advancing German Wehrmacht in 1941, they had duplicated over 30,000 pages worth of OTP code. This broke the golden rule of OTPs of never, ever reusing code, and ended up with a backdoor that two of the most eminent female cryptanalysts of the 20th, Genevieve Grotjan Feinstein and Meredith Gardner, on whose shoulders the success of the Venona project rested, could exploit.
3. It deserves noting that the D-H key exchange algorithm was another of those inventions that were invented twice but published once. By 1975, Malcolm Williamson of GCHQ, a colleague of Clifford Cocks, had devised an equivalent scheme, but was barred from publishing it. Their achievements weren’t recognised until the work was declassified in 1997.
Panna cotta time!

Summertime is panna cotta time! A panna cotta (Italian for ‘cooked cream’) is a great dessert for hot days: it’s light, it does not melt (as chocolate does), and it feels cool without weighing your tummy down. It can even stand in for a full meal, as it’s fairly rich.


Time: 15 minutes, plus 3-5 hours in the fridge
Difficulty: easy peasy

Ingredients

  • 3 cups of heavy cream (‘double cream’ for Limeys) or mascarpone
  • 1/3 cup fine sugar
  • 35ml milk
  • 2 teaspoonfuls of vanilla extract, ideally alcoholic
  • 1 tablespoon or 2 normal sheets of gelatin (be sure to get one you trust, bad gelatin is worse than no gelatin!)
  • Frozen fruit (raspberries, blueberries and forest fruits are generally the best) – alternatively, simply keep the fruit in the fridge for 3-4 hours
  • Finely grated lemon peel (the real thing, not freeze-dried crap)

  1. Add the milk to a saucepan and warm it gently. Stir in the mascarpone or cream until dissolved, using a whisk if needed.
  2. Add the vanilla extract.
  3. In a separate saucepan, warm up 25-30ml water and dissolve the gelatin.
  4. Pour gelatin into the milk/cream mixture and gently dissolve.
  5. Divide among 6-8 ramekin dishes or small Kilner jars.
  6. Drop in the cold fruits.
  7. Sprinkle lemon peel over the mixture.
  8. Put into fridge, covering it either only very gently with a paper towel or not at all.
  9. Leave to cool for 3-4 hours. Enjoy cold, with a root beer or as a treat on a hot summer day.

What’s the value of a conditional clause?

No, seriously, bear with me. I haven’t lost my mind. Consider the following.

Joe, a citizen of Utopia, makes Ut$142,000 a year. In Utopia, you pay 25% on your first Ut$120,000 and 35% on all earnings above that. Let’s calculate Joe’s tax.

Trivial, no? JavaScript:

var income = 142000;
var tax_to_pay = 0;

if (income <= 120000) {
  tax_to_pay = income * 0.25;
} else {
  tax_to_pay = 30000 + (income - 120000) * 0.35;
}

console.log(tax_to_pay);

And Python:

income = 142000

if income <= 120000:
    tax_to_pay = income * 0.25
else:
    tax_to_pay = 30000 + (income - 120000) * 0.35

print(tax_to_pay)

And so on. Now let’s consider the weirdo in the ranks, Julia:

income = 142000

if income <= 120000
    tax_to_pay = income * 0.25
else
    tax_to_pay = 30000 + (income - 120000) * 0.35
end

Returns 37700.0, all right. But now watch what Julia can do that the other languages (mostly) can’t! The following is perfectly valid Julia code.

income = 142000

tax_to_pay = (if income <= 120000
                  income * 0.25
              else
                  30000 + (income - 120000) * 0.35
              end)

print(tax_to_pay)

This, too, will return 37700.0. Now, you might say that’s basically no different from a ternary operator. Except that unlike ternary ops in most languages, you can put as much code in there as you want, with as many side effects as your heart desires, and still assign the result of the last expression evaluated within the executed branch to the variable at the head.
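For comparison, here is the same tax rule as a Python conditional expression – a quick sketch to show the limitation: each branch may hold exactly one expression, with no room for additional statements or side effects.

income = 142000

# Python's conditional expression: one expression per branch, no more.
tax_to_pay = income * 0.25 if income <= 120000 else 30000 + (income - 120000) * 0.35

print(tax_to_pay)  # 37700.0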

Now, that raises the question of what the value of a while expression is. Any guesses?

i = 0

digits = (while i < 10
              print(i)   # print each digit so we can watch the loop run
              i = i + 1
          end)

Well, Julia prints 0123456789 when executing it, so digits surely must be…

julia> digits


Wait, wat?! That must be wrong. Let’s type check it.

julia> typeof(digits)
Void

So there you have it: a conditional has a value, a while loop doesn’t – even though both are expressions. Sometimes, you’ve gotta love Julia, kick back with a stiff gin and listen to Gary Bernhardt.


10 tips for passing the Neo4j Certified Professional examination

Everybody loves a good certification. Twice so when it’s free, and quadruply so when it’s in a cool new technology like Neo4j. In case you’re unfamiliar with it, Neo4j is a graph database – a novel database concept that belongs to the NoSQL class of databases, i.e. it does not follow the relational model. Rather, it allows for the storage of, and computation on, graphs.

From a purely mathematical perspective, a graph G(V, E) is formally defined as an ordered pair of a set of vertices V (called nodes in Neo4j) and a set of edges E (known as relationships in Neo4j). In other words, the first-class citizens of a graph are ‘things’ and ‘connections between things’. No doubt you can already think of a lot of problems that can be conceptualised as graph problems. Indeed, a surprising number of things that don’t sound very graph-y at all can make use of graph databases. Not that you always should (no single technology is a panacea to every problem, and I would look very suspiciously at someone who implemented time series in a graph database), but that does not mean it’s not possible in most cases.

Which leads me to the appeal of Neo4j. Until graph databases entered the scene, you generally had two approaches to graph operations. One was to write your own graph object model and keep it in memory. That’s not bad, but a database it sure ain’t. The alternative was to decompose the graph into a table of vertices and their properties and another table of connections between vertices (an edge list), and store those in a regular RDBMS or, somewhat more efficiently, in a NoSQL key-value store. That’s a little better, but it still requires considerable reinvention of the wheel.

The strength of graph databases is that they facilitate more complex operations, way beyond the storage and retrieval of graphs: searching for patterns, properties and paths. One done-to-death example is the famous problem known as Six Degrees of Kevin Bacon, a pop-culture version of Erdős numbers: for an actor A and a Kevin Bacon K within a graph G_{Actors} with A, K \in G_{Actors}, what is the shortest path (and is it under six hops?) from A to K? Graph databases turn this into a simple query. Neo4j is one of the first industrial-grade graph DBs, with an enterprise-grade product that you can safely deploy in a production system without worrying too much about it. Written in Java, it’s stable, fast and has enough API wrappers to have some left over for presents next Christmas. Alongside the more traditional APIs, it’s got a very friendly, very visual web-based interface that immediately plots your query results, and a somewhat weird but ultimately not very counter-intuitive query language known as Cypher. As such, if graph problems are the kind you deal with on a regular basis, taking Neo4j for a spin might be a very good idea.
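To give a taste of how compact that “simple query” gets, here is a minimal sketch against the stock Movie sample dataset, using the official Python driver; the connection URI and credentials are placeholders for your own instance.

from neo4j import GraphDatabase

# Placeholder connection details -- substitute your own.
driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "<password>"))

# In the Movie graph, actors link to films via ACTED_IN, so a
# person-to-person hop costs two relationships: the Bacon number
# is half the length of the shortest undirected path.
QUERY = """
MATCH (a:Person {name: $actor}), (k:Person {name: 'Kevin Bacon'}),
      p = shortestPath((a)-[:ACTED_IN*]-(k))
RETURN length(p) / 2 AS bacon_number
"""

with driver.session() as session:
    record = session.run(QUERY, actor="Meg Ryan").single()
    print(record["bacon_number"])

Note how shortestPath does the traversal natively – no joins, no recursion, no adjacency bookkeeping.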

Which in turn leads me to the Neo4j certification. For the unbeatable price of $0.00, you can now sit for the esteemed title of Neo4j Certified Professional – that is, if you pass the 80-question, 60-minute, time-capped test with a score of 80% or above. Let not the fact that it’s offered for free deter you: the test is pretty ferocious. It takes a fairly in-depth knowledge of Neo4j to pass (I’ve been around Neo4j ever since it has been around, and though I passed at first try recently, it was surprisingly hard even for me!), and the time cap means that even if you do decide to refer to your notes (I’m not sure that isn’t cheating – I personally did not, as it is far too time-intensive anyway), you won’t be able to pass merely from notes. Worse, there are no test exams, and preparation material is scarce outside (rather pricey!) trainings. As such, I’ve written up the ten things I wish I had known before embarking upon the exam. While I did pass at the first try, it was a lot harder than I expected, and I would definitely have prepared for it differently had I known what it would be like! Fortunately, you can attempt it as often as you like at no cost, so it’s by no means an impossible task (though I’ve been told that feedback on failed tests is fairly terrible – there is no feedback on most questions, and you’re not given the correct answers), but you’re in for a ride if you wish to pass with a good score. Fasten your seat belt, flip up the tray table and put your seat in a fully upright position – it’s time to get Neo4j’d!

1. This is not a user test… it’s a user and DBA test.

I haven’t heard of a single Neo4j shop that has a dedicated Neo4j DBA to support graph operations. Which is OK – compared to the relatively arcane art of (enterprise) RDBMS administration, Neo4j is a breeze to configure. At the same time, the exam seems to expect users to know what they’re doing for themselves and to be comfortable with some close-to-the-metal database tweaking. Good.

The downside is that about a quarter of the questions have to do with the configuration of Neo4j, and they do get into the nitty-gritty. You’re expected, for instance, to know fairly detailed minutiae of the Enterprise edition’s High Availability server settings.

2. Pay attention to Cypher queries. The devil’s in the details.

If you’ve done as many multiple-choice tests as I have, you’ve learned one thing for sure: they all follow the same pattern. Two answers are complete bunk, and anyone who’s done their reading can spot them. The remaining two are deceptively similar, however, and both sound ‘correct enough’. In the Neo4j test, this plays out mainly in the realm of Cypher queries. A number of questions describe a ‘problem’ and offer four possible Cypher queries; the candidate must then spot which one (or which several) of these answers the problem description. Often the correct answer can be distinguished from the incorrect one by as little as a correctly placed colon or a bracket closed in the right order. When in doubt, look very sharply at the Cypher syntax.

Oh, incidentally: the test makes relatively liberal use of the ‘both directions match’ (a)-[:RELATION]-(b) query pattern. This catches (a)-[:RELATION]->(b) as well as (b)-[:RELATION]->(a). The lack of the little arrow is easy to overlook and can lead you down the wrong path…

3. Develop query equivalence to second nature.

Python was built so that there would be one, and exactly one, obvious way to do everything. Sort of. Cypher is the opposite – there are dozens of ways to express certain relations, largely owing to the many equivalent ways of writing the same pattern. As such, be aware of two equivalences in particular. One is the equivalence of inline property matching and WHERE clauses:

// Inline property match…
MATCH (a:Person {name: "John Smith"})-[:REL]->(b)
RETURN a;

// …is the same as filtering with WHERE:
MATCH (a:Person)-[:REL]->(b)
WHERE a.name = "John Smith"
RETURN a;

Similarly, the following partial patterns are equivalent – but not always:

// One chained pattern…
(a)-[:FIRST_REL]->(b)<-[:SECOND_REL]-(c)

// …versus the same thing split into two patterns:
(a)-[:FIRST_REL]->(b)
(c)-[:SECOND_REL]->(b)
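
For concreteness, here is the second pair embedded in full statements – a sketch with placeholder names; note that the split form can live either in one MATCH as a comma-separated list or in two separate MATCH clauses, which is exactly where the ‘not always’ comes in:

// One MATCH, one chained pattern
MATCH (a)-[:FIRST_REL]->(b)<-[:SECOND_REL]-(c)
RETURN a, b, c;

// One MATCH, two comma-separated patterns
MATCH (a)-[:FIRST_REL]->(b), (c)-[:SECOND_REL]->(b)
RETURN a, b, c;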

When you see a Cypher statement, you should be able to see all of its forms. Recap question: when are the statements in the second pair NOT equivalent?

4. The test is designed on the basis of the Enterprise edition.

Neo4j comes in two ‘flavours’ – Community and Enterprise. The latter has a lot of cool features, such as the error-resilient, distributed ‘High Availability’ mode. The certification’s premise is that you are familiar – to a fairly high degree, actually! – with many of the Enterprise-only features of Neo4j. As such, unless you’re fortunate enough to be an Enterprise user already, it may well pay to download the 30-day evaluation version of Neo4j Enterprise.

5. The test is generally well-written.

In other words, most things are fairly clear. By fairly clear, I mean that there is little ambiguity and the test uses the same language as the reference documentation (although, comparing the test questions with phrases that stuck in my head and that I checked after the test, just enough words are changed to deter would-be cheaters from Ctrl+F-ing through the manual!). There are no trick questions – so try to understand each question in its most ‘mundane’, ‘trivial’ sense. Yes, sometimes it is that simple!

6. TRUNCATE BRAINSPACE sql_clauses;

A lot of traditional SQL clauses come up as red herrings in the Cypher application questions – yes, TRUNCATE is one example, and so is JOIN with its multifarious siblings, which describe a concept that simply does not exist in Neo4j. Try to force your brain to switch from SQL to Cypher – and don’t fall into the trap of instinctively reaching for the clauses of the SQL solution! Forget SQL. Most of all, forget its logic of selection – MATCHing is something rather different from SELECTing in SQL.
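
To see the difference in mindset – a hypothetical friend-of-a-friend lookup, with a made-up Person/FRIEND_OF schema (where SQL would need a self-join through a junction table, Cypher just extends the pattern):

// Friends: one hop
MATCH (p:Person {name: "John Smith"})-[:FRIEND_OF]->(f)
RETURN f.name;

// Friends of friends: just lengthen the pattern – no JOIN in sight
MATCH (p:Person {name: "John Smith"})-[:FRIEND_OF]->()-[:FRIEND_OF]->(fof)
RETURN DISTINCT fof.name;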

7. Have a 30,000ft overview of the subject

In particular, have an overview of what your options are for getting particular things done. How can you access Neo4j? You might have spent 99% of your time in the web interface and/or interacting through an SDK, but there is also a shell. How can you back up Neo4j, and what does a backup actually do? What are your options for monitoring Neo4j? Once again, most users will think of one solution, perhaps two, when there are several more. The difficult thing about this test is that it requires you to be exhaustive – both in breadth and in depth.
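
As a flavour of that breadth – two sketches, assuming a Neo4j 2.x Enterprise install (tool names and flags from that era; hosts and directories are placeholders):

# Interactive shell against a locally running instance
bin/neo4j-shell

# Online backup (Enterprise only): full on the first run, incremental afterwards
bin/neo4j-backup -host 127.0.0.1 -to /mnt/backups/neo4j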

8. Algorithms, statistics and aggregation

As far as I’m aware, everyone gets a slightly different set of questions, but my test did not include anything about the graph algorithms inherent in Neo4j (good news for philistines – sorry, people who want to get stuff done). It did, however, include quite a bit of detail about aggregation functions. Make of that what you will.
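
If aggregation is rusty, replay the basics – a quick sketch against the sample Movie graph (grouping in Cypher is implicit: everything in the RETURN that is not an aggregate, here m.title, becomes the grouping key):

MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
RETURN m.title, count(p) AS cast_size, collect(p.name) AS cast
ORDER BY cast_size DESC;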

9. Practice on Northwind but know the Movie DB like the back of your hand.

Out of the box, if you install Neo4j Community on your computer, there are two sample databases the Browser offers to load into your instance – Movie and Northwind. The latter should be highly familiar if you have a past in relational databases, while the former is a Neo4j favourite, not least for the Kevin Bacon angle. If you did the self-paced Getting Started training (as you should have!), you’ll have used the Movie DB enough to get a good grip on it. Most of the questions on the test pertain or relate in some way to that graph, so a degree of familiarity helps you spot errors faster. At the same time, Northwind is both a better and a bigger database, more fun to use, and it allows for more complex queries. Northwind should therefore be your educational tool, but you should know Movie rather well for that little plus of familiarity that can make the difference between passing and failing. Oh, by the way – while Getting Started is a great course, you will not stand a snowball’s chance in hell without the Production course. This is so even if you’ve done your fill of deployments and integrations – quite simply put, the breadth of the test is statistically very likely to exceed your own experience, even if you’ve done e.g. High Availability deployments yourself. In the real world, we specialise – for the test, however, you must be a generalist.
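
If memory serves, both graphs are one Browser command away (guide names as I recall them – do verify in your own Browser); the first loads the Movie graph, the second walks you through importing Northwind from CSV:

:play movie-graph
:play northwind-graph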

10. Refcards are your friends.

Start with the one for Cypher. Then build your own for High Availability. Laminate them and carry them around if need be – or take the few functions or clauses that are your weak spots, put them on post-its and plaster them on your wall. Whatever helps – unless you’re writing Cypher code 24/7 (in which case, what are you doing here?), which I doubt happens a lot, there’s quite simply no substitute for seeing correct code and developing a feel for good versus bad code. The test is incredibly fast-paced – 80 questions over 60 minutes gives you 45 seconds per question, turnkey. At least 15–20 seconds of that is reading the question, if not more (it definitely was more for me – as noted, most questions repay a thorough reading!). Realistically, if you want to make that and still have time to think about the more complex questions, you’ve got to be able to bang out the simple Cypher questions in seconds (I’d say there were about 8–10 of them altogether, worth an average number of points, though – and I do regret this now – I didn’t count them).


While the Neo4j certification exam is far from easy, it is doable (hey, if I can do it, so can you!). Graph databases are becoming increasingly important, both because they can accelerate certain computations on graph data and because many natural processes are in reality closer to relationship-driven interactions than to the static picture that traditional RDBMS logic conveys. Knowing Neo4j is therefore a definite asset for you and your team. Regardless of your intent to get certified and/or your view of certifications in general (mine, too, tends towards the less complimentary side), what you learn can be an indispensable asset in research and operations alike. Of course, I’m happy to answer any questions about Neo4j and the certification exam, insofar as my subjective views can make a valid contribution to the matter.

Update 15.02.2016: Neo4j community caretaker Michael Hunger has been kind enough to leave a comment on this article, pointing out that the scant feedback is intentional – it prevents re-takers from simply banging in the correct answers from the feedback e-mail. That makes perfect sense – and is not something I had thought of. Thanks, Michael! He is also encouraging recent test takers to propose questions for the test – to me, it’s unprecedented (and amazing) for a certificate provider to actually ask the community what they believe to be the cornerstones and benchmarks of knowledge in a particular field. So do take him up on that offer – his e-mail is in his comment below.


Title image credits: Dr Tamás Nepusz, Reconstructing the structure of the world-wide music scene with Last.fm.

References

1. I’ve been told that feedback on failed tests is fairly terrible – there is no feedback to most questions, and you’re not given the correct answers.

(I)IoT security is not SCADA security

The other day, at the annual Worldwide Threats hearing of the Senate Armed Services Committee (SASC) – the literal sum of all fears of the intelligence community and the US military – the testimony of DNI James Clapper took note of the emerging threat of hacking the Internet of Things:

“Smart” devices incorporated into the electric grid, vehicles—including autonomous vehicles—and household appliances are improving efficiency, energy conservation, and convenience. However, security industry analysts have demonstrated that many of these new systems can threaten data privacy, data integrity, or continuity of services. In the future, intelligence services might use the IoT for identification, surveillance, monitoring, location tracking, and targeting for recruitment, or to gain access to networks or user credentials.1

It’s good to hear a degree of concern for safety where IoT applications are concerned. The problem is that the responses come in two flavours, neither of which is helpful.

One is the “I can’t see your fridge killing you” approach. Beloved of Silicon Valley, it is a generally weak security posture based on the notion that, for what it’s worth, most of IoT is a toy you can live without. So what if your internet-connected egg monitor stops sending messages to your MQTT server about your current egg stock? So what if your fridge misregulates its temperature because some teenager is playing a prank on you?

The problem with this approach is that it entirely ignores the fact that Industrial IoT exists. Beyond the toys, you find a multiverse of incredibly safety-critical devices with IoT connectivity: pain pumps, anaesthesia machines and vitals monitors, navigation systems, road controls and so on. A few of these are air-gapped, but if Stuxnet is anything to go by, that will be of little avail. In other words, just because some IoT devices are toys doesn’t mean all of them are.

The other does take security seriously… but it smacks of SCADA talk. Yes, interfacing a computer with the controls of any real-world object – up to and including fridges, cat feeders and nuclear power plants – carries particular dangers, and whether that interface is a 1980s protocol or the Internet of Things makes no great difference. But the threat of this perspective is that it ignores the IoT-specific part of the risk profile. This includes, for instance, the vulnerabilities inherent in the #1 carrier medium of non-industrial IoT traffic – 802.11b/g. Any vulnerability in the main wireless protocols is in turn a vulnerability of IoT-connected devices. I’d wager that wasn’t much of an issue back when Reagan was President.

Moreover, a scary part of IoT is that a lot of the devices interface directly with consumers rather than professionals. You cannot rely on anything that’s not in the box: the wireless network will be badly configured, the passwords will be the user’s birthday, and generally everything will be as unsafe as it gets. Of course, this may only be true for a minority of users – but it is the worst-case scenario, and the worst case is what determines the risk profile of a component.

The bottom line is that if you ignore the incredible diversity of devices caught by the new favourite buzzword that IoT has become, you get a risk profile slanted either towards toys (IoT-connected floating thingies in your pool that measure your cannonballs? Are you serious?) or towards an aging system of industrial control protocols that some hope will be revived by, rather than supplanted by, IoT.

That’s a problem the moment you scratch anything but the top layer. Below the very specific top layers lie a number of fairly generic interconnection layers. And the consequence is that every layer, protocol or method of communication that can conceivably be used for IoT data transmission eventually will be used for it – and as such must have a defensible risk profile.

And in turn, every (I)IoT application must consider, in its risk profile, the risk profiles of the stack it is built on. Every element of the stack contributes to the overall risk, and quite often these risks are encapsulated in the vendor risk estimates of higher-up layers sold as a composite. Thus an entire IoT appliance might be sold with a particular risk profile, but in reality its risk profile is the sum of the risk profiles of all of its layers: its sensors, its hardware (especially where sex via hex is concerned – something that might arouse older engineers entirely the wrong way), the radio frequencies, the susceptibility of the radio hardware to certain unpleasant hacks, heck, even the possibility of van Eck phreaking on screens.2 As a recovering lawyer, I fully expect that the standard demanded of non-private users of IoT devices, as well as of those who build them, will be to examine the whole stack rather than to rely on upstream promises. It may be turtles all the way down, but you’re expected to know all about those turtles now. Merely subscribing to one of the two prevalent risk models will not suffice.

References

1. Testimony of DNI James R. Clapper to the Senate Armed Services Committee, 9 February 2016. Available here.
2. A few years ago, a large (and undisclosed) metallurgical company was concerned about information leaking onto the open market that suggested possible hacking of the output values at one of its largest plants – values that were, owing to some oddities in the markets, quite significant for global commodity prices. After an enormously long hunt, scouring the entire corporate and production network, they were about to give up when they found a van Eck loop antenna curled to the backside of a subsidiary monitor that an engineer had ‘hacked’ never to go into power save.