MELLODDY and the accuracy and privacy frontier

Also: HealthVerity acquires Curisium, Chronicled partners with Deloitte, and Sonoco launches a supply chain solution

Hi everyone,

Every week I write about blockchain technology, healthcare, and related topics. On Sundays I analyze the latest updates and ideas from that week and every few weeks I publish additional essays on related topics.

If you find this newsletter valuable please share it with someone else who would also find it valuable.

Blockchain and healthcare updates

MELLODDY launches first compute plan

MELLODDY is a federated learning network where 10 major pharmaceutical companies are collaboratively training an algorithm on their collective pre-clinical data, with the goal of accelerating drug discovery. MELLODDY was born out of the Innovative Medicines Initiative and is a 3 year project that has just hit its 1 year milestone. With this milestone they also launched their first run of live federated learning across MELLODDY’s members’ datasets.

The accompanying article is a good reminder of the difficulty of standing up a federated learning network. MELLODDY has spent a full year performing audits to ensure their software actually met the privacy and security standards that each party required to go live. Only after doing so could the MELLODDY network start to train their algorithms, and their first live run is planned to go on for months. I wonder what the bill is for the computing power they are using!

Turning to the future, they plan to spend the next 2 years focusing on “maximizing predictive gains from securely harnessing our joint private data.” I thought this was interesting and worth a bit of elaboration.

As you may recall, one of the challenges in training a federated learning algorithm is that algorithms may contain some of the information used to train them. If you are trying to fit an algorithm closely to some data then you may end up embedding that data in your algorithm. In turn, that means someone else could take your algorithm and reverse engineer it to yield the data you used to train it! Obviously this is a bad thing if your algorithm is trained on sensitive data that you want to keep private.
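To make that concrete, here is a deliberately extreme toy sketch in Python. It is not MELLODDY's method, and the function names and numbers are made up for illustration. One "model" is fit so closely that it passes through every training point, and another only learns the average. Anyone who holds the first model and knows the inputs can read the sensitive values straight back out.

```python
from fractions import Fraction


def fit_interpolating_model(xs, ys):
    """Fit a polynomial that passes exactly through every (x, y) pair.

    Hypothetical helper for illustration; assumes all xs are distinct.
    Uses the Lagrange form with exact fractions to avoid rounding noise.
    """
    xs = [Fraction(x) for x in xs]
    ys = [Fraction(y) for y in ys]

    def model(x):
        x = Fraction(x)
        total = Fraction(0)
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total

    return model


def fit_mean_model(ys):
    """Fit a very coarse model that only remembers the average of ys."""
    mean = Fraction(sum(ys), len(ys))
    return lambda x: mean


# Made-up "sensitive" measurements for four known inputs.
inputs = [1, 2, 3, 4]
secret_values = [7, 3, 9, 4]

overfit = fit_interpolating_model(inputs, secret_values)
coarse = fit_mean_model(secret_values)

# An attacker who has the model and knows the inputs simply queries it.
leaked = [overfit(x) for x in inputs]
blurred = [coarse(x) for x in inputs]

print("leaked by the closely fit model:", [int(v) for v in leaked])
print("seen through the coarse model:  ", [float(v) for v in blurred])
```

The closely fit model hands back the original values exactly, while the coarse model only reveals their average. Real models sit somewhere in between, which is what makes the question of how closely to fit a hard one.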

In simple terms you could think of this tradeoff like this:

As you fit your algorithm more closely to the underlying data it becomes more accurate. But, at the same time, there is an increasing risk of someone reverse engineering your algorithm and recovering some of your sensitive data from it. This tradeoff between privacy and accuracy forms a frontier, which is the purple line above. Note: for a given level of privacy or accuracy you can never go beyond the purple line, but you can fall short of it if your algorithm is not optimally designed.

In the case of MELLODDY the underlying data is highly sensitive and there is a minimum level of privacy that the companies involved demand:

No matter what accuracy improvements they would get, these companies are not willing to cross this line, and MELLODDY has spent the year so far performing audits to show that they can reach this minimum level of privacy. Their algorithm is represented by the blue dot “A” below.

The reason why A is not on the purple line is that the focus of MELLODDY to date has been proving they can meet a minimum level of privacy, not optimizing their algorithm for accuracy. The next two years of MELLODDY are focused on improving accuracy while still ensuring that same level of privacy:

In other words, MELLODDY is trying to move from point A, where they are now, to point B, which is on the accuracy and privacy frontier. At this point they would be maximizing the accuracy of their algorithm given a certain minimum level of privacy.
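The move from A to B can be sketched as a small constrained selection problem. The Python below is only an illustration under stated assumptions: the `Candidate` class, the 0 to 1 privacy and accuracy scores, and every number in it are hypothetical, and nothing here comes from MELLODDY itself. It finds which candidate algorithms sit on the frontier, then picks the most accurate one that still meets a privacy floor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    privacy: float   # hypothetical score, higher = harder to reverse engineer
    accuracy: float  # hypothetical score, higher = better predictions


def frontier(candidates):
    """Return the candidates that no other candidate beats on both axes."""
    def dominated(c):
        return any(
            o.privacy >= c.privacy
            and o.accuracy >= c.accuracy
            and (o.privacy > c.privacy or o.accuracy > c.accuracy)
            for o in candidates
        )
    return [c for c in candidates if not dominated(c)]


def best_under_privacy_floor(candidates, min_privacy):
    """Return the most accurate candidate meeting the privacy floor, or None."""
    allowed = [c for c in candidates if c.privacy >= min_privacy]
    return max(allowed, key=lambda c: c.accuracy) if allowed else None


# Made-up candidates; "A" and "B" mirror the two points discussed above.
candidates = [
    Candidate("A", privacy=0.80, accuracy=0.60),
    Candidate("B", privacy=0.80, accuracy=0.75),
    Candidate("C", privacy=0.50, accuracy=0.90),
    Candidate("D", privacy=0.95, accuracy=0.55),
    Candidate("E", privacy=0.60, accuracy=0.70),
]

on_frontier = [c.name for c in frontier(candidates)]
choice = best_under_privacy_floor(candidates, min_privacy=0.80)

print("on the frontier:", on_frontier)
print("best choice at the privacy floor:", choice.name)
```

In this toy data, C is the most accurate option overall, but it falls below the privacy floor, so it is never chosen. A meets the floor but is beaten by B at the same level of privacy, which is the sense in which A sits inside the frontier rather than on it.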

These principles are broadly applicable across federated learning networks. Each network faces the same tradeoff between privacy and accuracy and will need to decide what level of risk its members are willing to bear. Moreover, each network will need to do the work of showing they have achieved their desired level of privacy. This is a non-trivial thing to do and will likely require independent audits by experts. Lastly, each network will need to do the work of optimizing their algorithms to the edge of the privacy-accuracy frontier, or moving from A to B, with the constraint of meeting their privacy needs.

Sonoco Thermosafe creating cold-chain track and trace solution with blockchain tech from IBM

Sonoco, a publicly traded company, is the largest global provider of temperature assurance packaging for pharmaceutical distribution. They are building “PharmaPortal,” which they describe as a “vendor-neutral blockchain platform for pharmaceutical manufacturers and carriers” to trace assets across a supply chain. Initially they are focused on “temperature-controlled drugs, such as vaccines.” That is not all that surprising given Sonoco’s background and the race for a COVID-19 vaccine.

In addition to their solution they hope to stand up a new “open” and “neutral” network, and expect to appoint an “advisory council” with industry representatives on it. But unless Sonoco is willing to give up control and let others participate in governance to a significant degree, their solution won’t be either “open” or “neutral.”

HealthVerity acquires Curisium

Curisium uses blockchain technology to help healthcare organizations to enter into and execute innovative contracting agreements. The goal of these contracts is to tie payment for some good or service to a pre-specified outcome with the ultimate goal of improving patients’ health. These contracts are quite hard to get in place and then bring a ton of administrative overhead when executing. I wrote about some of the problems and how blockchains could help here.

HealthVerity’s acquisition of Curisium is interesting. HealthVerity’s current product offerings focus on accessing, consenting, discovering, and making sense of patient data — and the datasets they are integrated with are pretty large. It makes sense for HealthVerity to offer contracting solutions as well as they can use their health data offerings to help their customers acquire the data necessary to execute a contract and then perform the contract’s analytics.

Taking a step back, it is nice to see the HealthVerity team secure an acquisition. They were an early party to the blockchain and healthcare space; among venture capital funded startups, they were preceded only by Chronicled and Hashed Health. Showing that entrepreneurs can successfully start and exit businesses (either through an acquisition or IPO) in a space is very important for investors.

Chronicled partners with Deloitte

The details of this relationship are not super clear, but here is a notable excerpt from the press release:

Part of the new alliance between Chronicled and Deloitte includes a solution to help fight counterfeits and fraud in the medication used for the treatment of COVID-19, an issue that has increased dramatically since the start of the pandemic.

On a related note: Chronicled has a webinar this week (August 4th) on ‘The MediLedger Network & The Future of Doing "Business Together" in Life Sciences and Healthcare.’

What I’m reading this weekend

The State of Ethereum from Delphi Digital

An excellent read on the State of Ethereum (they mean the public blockchain) full of data and interesting insights. Here’s one that I found interesting:

For about 50 days the fees that people have paid to use Ethereum have exceeded Bitcoin’s fees. This was quite surprising to me given that Bitcoin is worth far more. In the long run fees are the only metric of usage/demand that can’t be gamed, so this is a quite strong signal of the network’s health.

Altogether the state of Ethereum is strong. Decentralized finance is growing exponentially and looks to be Ethereum’s first killer app, ETH2.0 looks within reach, and it seems there is fresh talent and users coming to the space. Anecdotally to me the last time the space felt like it had this much energy was ~2017. That energy has both good and bad consequences. The good is we can expect more money to be invested in the blockchain space and more talent to come. The bad is that it could kick off the same sort of crypto craze that 2017 saw, which brings both bubbles and charlatans.

PolkaDot raises $43m - PolkaDot is one of the next generation smart contract platforms I follow. Looking forward to their launch (hopefully) soon.

These slides on Hyperledger Cactus - which is a framework for blockchain interoperability that I am following.

Henrietta Lacks and Her Remarkable Cells Will Finally See Some Payback - Lacks would have been 100 years old yesterday. Her story is a remarkable one and worth reflecting on this weekend. I enjoyed this book and am told there is a movie out about it, though I haven’t seen that.

STAT dives into Mayo’s projects that use patient data

From the article

STAT interviewed Mayo executives and outside ethics experts to examine the tension between developing AI tools and the fundamental privacy rights of patients, including questions at the heart of a broader push by U.S. hospitals to use patient data and AI technology to improve care.  Should the details of data deals with outside companies be disclosed to patients? Should they be allowed to opt out? And what, if anything, is owed to patients if their data are used in products that generate a windfall for Mayo and its private partners?

“If your data and biospecimens are valuable, they are yours,” said Kayte Spector-Bagdady, a bioethicist and lawyer at the University of Michigan Medical School. “There is a harm of respect for people to use your stuff without your permission, or make money from your stuff without giving some back to you.”

But compensating patients for their data raises a potential flashpoint with academic medical centers. Mayo executives said it could slow innovation and undermine development of new treatments and digital services — a need whose urgency has been reinforced by the Covid-19 pandemic.

I thought this was a very weak argument from Mayo on both economic and moral grounds. I think compensating people for their data could actually speed up innovation by encouraging people to share more data and to produce higher quality data because they know their data will be used for a good purpose. But even if you disagree with that, there are all kinds of things we do that slow innovation because they are the right thing to do!

A piece on why the OCC’s notice is a big deal


If you found this newsletter valuable then you can click the button below to sign up for free.

If you’re an existing reader I would deeply appreciate it if you shared this with people who would find it a valuable resource. You can also “like” this newsletter by clicking the heart just below this, which helps me get visibility on Substack.

My content is free, but if you would like to support me, you can do so on Patreon here.

Feel free to reply to this email with comments, questions, or feedback. I host a blockchain and healthcare Telegram group that you can join. You can also find me on Twitter here.