Linux for Health, Project Alvearie & Watson Health | Ted Tanner and Adam Orentlicher
Luke Schantz: In this episode of In The Open, join us for our conversation with Ted Tanner and Adam Orentlicher. Ted is the Global CTO and Chief Architect for Watson Health, and Adam is the VP of Development. We will be discussing the open source projects, Linux for Health and Alvearie. But before we welcome our guests, let's say hello to my co-host, Joe Sepi.
Joe Sepi: Hey, Luke. How are you my friend?
Luke Schantz: I'm doing well, Joe. How are you?
Joe Sepi: I'm okay. I'm okay. How's the weather over there? Did you get hit by Henri? How'd you fare?
Luke Schantz: Just missed us. We battened down the hatches. We put sandbags around the greenhouse. We like everything, but it's hot now, I would say. I was complaining about it not being, or I was excited it was not hot a few weeks ago. It's hot now.
Joe Sepi: Yeah. Yeah. I spent part of my time in Peekskill and part over in Connecticut and it seemed like it parked right on where I am here in Peekskill and just kept cycling around dumping rain. But generally speaking, it was fine. We had no trouble.
Luke Schantz: Well, without further ado, let's welcome our guests. Hello, Adam. Hello, Ted. Welcome.
Adam Orentlicher: Hey.
Ted Tanner: Hello there.
Joe Sepi: How are you, gents? How's the weather by you?
Adam Orentlicher: It is, let's see, 85 degrees and cloudy.
Ted Tanner: It is hot. It is extremely hot here in Charleston, South Carolina.
Joe Sepi: Oh, yeah. It stays warm down there.
Ted Tanner: Yeah. Hurricane central, too, man.
Joe Sepi: Yeah.
Adam Orentlicher: Yeah, we're a little inland so we don't get to deal with that. We just deal with the tornadoes here.
Joe Sepi: Yeah.
Adam Orentlicher: Yeah.
Luke Schantz: So maybe we kick off the conversation with just self introductions to let the audience know who they're talking to. Maybe Ted, you want to get started?
Ted Tanner: Yeah, sure. And everybody, I'm super excited to be here. Thank you, Joe, Luke very much, and I'm humbled to be here. Ted Tanner, Global CTO and Chief Architect of Watson Health. I've spent a while in technology. My previous company was in the healthcare industry. Some of you may know I was co-founder and CTO of PokitDok with Lisa Maki, who now is the GM of Health Alliances at Microsoft. I've also worked at Apple and Microsoft. I did another machine learning startup, BeliefNetworks, which some of you may know about. We were talking previously backstage. I also worked at a company, if you're in music, called Digidesign, and I worked there originally helping with the old Pro Tools. So, super psyched to be here and ready for a great conversation.
Adam Orentlicher: And I'm Adam Orentlicher and I lead engineering for payer provider, life sciences and the platform units of Watson Health. Makes up a large portion of Watson Health and what we deliver. And the reason that I'm here is really to talk about how we are truly looking to enable the open source community, our developer communities with some of our, candidly some of our wares, some of what we have been developing and what we've invested in heavily within Watson Health over many years. So, I'm glad to be here and glad to engage in this conversation with all of you.
Luke Schantz: Yeah. Oh, go ahead, Joe. I was going to cut you off.
Joe Sepi: No. No, go ahead, Luke.
Luke Schantz: Well, I was going to say it's exciting and it's great for our show, too, because we talk a lot about Kubernetes, we talk a lot about JavaScript. We talk a lot about what's going on in sort of the hybrid cloud side, and it's really interesting to hear that there's a similar, really developer-driven and open source movement going on in the Watson Health side.
Ted Tanner: Yeah. I think being what I call a developer-first CTO, I think the [inaudible] is in the developer, allowing them to build stuff. So whether people can build things, if you refer to something called Metcalfe's law, value is proportional to the number of nodes squared. So the higher, if you build stuff so other people can build things, you can get that scalability and then hyperscale, and ultimately I, and I hope everybody else out there, want to be able to abstract all of the "sausage making" in healthcare so people can think more about the great new business models coming down the pipe and also just write great code, ship code, change the world.
Joe Sepi: I imagine that that is super hard, Ted, thinking about all the sort of regulations and things that you need to be concerned about in the healthcare industry, similar to like banking and finance. Like what's your experience in terms of orienting things for the developer?
Ted Tanner: Joe, that's a great question. You have to be extremely patient and it's not a technical problem. You can't just slap Node.js, Kafka and some Python code on it and we're done. You can't do that. You have to, I even think terminologies like minimum viable product in healthcare should be changed to maximum viable product because you want that thing to be solid. Because ultimately, ethically, in healthcare every single line of code that a developer writes has an effect on somebody. The interesting thing versus other industries, let's say telecom, finance or media, you will never... The person is at the center of the universe and persons are at the center of it, so ethically you have to be very patient. Now, do some of the compliances and regulatory items meet current day technologies, and do they need to be shined off? And I'll say this transparently. I think they do. I've been successful in going through, you know, we worked on the latest legislation for transparency in APIs. I think that you need to be extremely patient and it's not a matter of just having a CI/CD pipeline and cranking out code. So, you have to be very thoughtful, very ethical, very patient and educate and communicate, and I think it's a two-way street. Did that make sense?
Joe Sepi: It does, and I imagine that's not easy in a lot of ways, and one thing I'm just thinking about, it's something we actually talked about in the last call with James Snell, is like the community in GitHub as an example, but the community overall, oftentimes they're wanting things to happen, wanting them to happen now, and they don't care about why it may take time and why you need to work through all these sorts of hurdles. I imagine it's probably difficult to satisfy developers and ship those sorts of APIs and stuff.
Ted Tanner: Incentivizing, and I overloaded a very important word here, communicate. As technical executives, our first and foremost job should be to amplify other people, that you should amplify careers, and along with that I always say communicate, communicate, and it's very important both from an internal perspective and an external perspective to let everybody know what's going on so they don't get up in their head, because software is hard, and especially in a highly regulated environment like this, software is extremely hard. And so that in and of itself, and Adam, feel free to jump in here.
Adam Orentlicher: Yeah.
Ted Tanner: I think is what that give and take, that push me, pull you, if you will, of the regulated environment versus the speed of thought is what we have to balance.
Adam Orentlicher: Yeah. Ted is absolutely right. If you look at the regulated environment that exists, 2020, 2021 saw like the CMS Cures Act that was pushed by predominantly the U.S. government, but FHIR as a standard, aligned with the U.S. Core set of health resources, has been driven for quite some time. But it was really codified and mandated in the last 12 to 18 months. And now what we're seeing is those mandates and the drive towards open standards and interoperability are really driving a significant sea change in the way that the market reacts. When we say market in this context, the market is then you have the developers, all of us who are actually writing code in response to what the businesses that we work for and work with are requesting. So as an example, I was reading a study recently that showed that each patient generates about 80 megabytes or so, give or take a little bit, of data per year, if my memory serves on this. That is a lot of data, and trying to think about all of us, we go to a primary care physician and then we go see a specialist. How simple do you think it is for those two physicians to be sharing data amongst one another? You switch employers in the middle of the year and your health plan changes. Do you think it's really simple for plans to be able to share data with one another? It is not that simple. It's complicated because of things like patient consent and what exact data is needed to be shared and authentication and authorization. It's really, it is quite complicated. And really what we're driving towards, what Ted and I and our teams are really driving towards here is really enabling end-to-end health data pipelines through open source, enabling all of you to take advantage of both technologies like federated learning and even centralized data lakes, enabling centralized data lakes. But it's not just data lakes and federated learning in air quotes.
It's truly a set of composable services that are based on open standards that you can use and leverage as you are enabling these scenarios such as interoperability, such as enabling democratization of healthcare data, enabling consumer applications that might be making use of health data. That's really what we're driving towards here with Linux for Health and Project Alvearie.
Joe Sepi: This may be a dumb question, but your focus is not just the U.S. It's more of a global thing, or?
Adam Orentlicher: It definitely is. In fact, yeah, I'll jump in on this one, Teddy. You can add on. In fact, I was on a call this morning with some of our APAC colleagues and what they were saying was even FHIR as a standard, because the U.S. and a lot of other countries in the world are starting to adopt it, is also being looked at and well adopted in specific countries in AsiaPac. Now, it's much slower because in reality the mandates don't exist. However, over time they even think that that's going to change and they're seeing an increase of utilization of FHIR and other technologies, even like CQL-based cohorting as an example. So, they are seeing those types of technologies as well being used in open standards.
Joe Sepi: And just real quickly, can you just explain very quickly what a FHIR server is? Because I had to look that up after our last call.
Adam Orentlicher: Ted, do you want to take that one?
Ted Tanner: Well, it is both a taxonomy schema and a payload methodology, if you will, to allow standardization of healthcare transactions. And is it globally agreed to? Ultimately the issue, and I'll go back to one of the things Adam was talking about, but it's a protocol. At the end of the day, FHIR... there's several different protocols. There's ASC X12 5010. There's C-CDA. There's HL7, and FHIR is attempting to encapsulate these and things into a payload and hook it up to various areas and be the crossbar transmission source and sink, if you will, to allow this interoperability. And it's really important that... Joe, that's a great question, and back to your global comment, it's my hope that we can create through open source a global trust protocol that's borderless, because I want to be able to fly to Ireland, hop off and have my healthcare record with me. But there again, in the United States we have incentive misalignments. Globally, we have incentive misalignments. So once again, the politico and regulatory environment is definitely involved here.
Adam Orentlicher: Yeah. So like a good use case to explain where a FHIR server fits in. So, you're an employee of a Fortune 500 company and you're switching to another Fortune 500 company and your health plan is, I'm saying it's Aetna, and the company you're moving to is Anthem. This is the secure exchange of data about you, claims data and claims history and any clinical data that might be relevant, that you give consent to share across the different entities. You think of it as, it's a protocol as well as a semantic and syntactical standard for securely sharing health data.
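Editor's note: To make "a protocol as well as a semantic and syntactical standard" a little more concrete, here is a minimal sketch of what a FHIR R4 Patient resource and its "read" URL can look like. The base URL, the helper names and the patient details are hypothetical and chosen only for illustration; they are not taken from the IBM FHIR Server or from anything the speakers described.

```python
import json

# Hypothetical base URL; a real deployment would supply its own endpoint.
FHIR_BASE = "https://fhir.example.org/r4"


def make_patient(patient_id, family, given, birth_date):
    """Build a minimal FHIR R4 Patient resource as a plain dict."""
    return {
        "resourceType": "Patient",  # every FHIR resource names its type
        "id": patient_id,
        "name": [{"family": family, "given": [given]}],
        "birthDate": birth_date,  # FHIR dates are written YYYY-MM-DD
    }


def read_url(resource):
    """URL for the FHIR 'read' interaction: GET [base]/[type]/[id]."""
    return f"{FHIR_BASE}/{resource['resourceType']}/{resource['id']}"


# Illustrative data only.
patient = make_patient("example-1", "Doe", "Jane", "1980-04-02")
payload = json.dumps(patient, indent=2)

print(read_url(patient))
print(payload)
```

The JSON body is the "semantic and syntactical" half (agreed field names and shapes), and the REST URL pattern is the "protocol" half. Consent, authentication and authorization, which Adam calls out as the hard parts, are deliberately left out of this sketch.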
Luke Schantz: And just to chime in here, it was so funny. After we had a little conversation earlier this week and I was about to go look up what FHIR was, and it turns out I was also producing, we have a digital developer conference coming up next month, and there's a session, track five in the workshop, Paul Bastide's doing a workshop on the FHIR server. So literally after our prep call earlier this week, I got off and got onto an hour-long workshop about the FHIR server, and it was totally unplanned, and I found it fascinating because I have an interest in ontology and semantics and it really reminded me of JSON-LD and the fact that you're able to, in something that's not a full graph, be able to store how things are related, and it's a great workshop so I'll put this in the show notes and I'll put it out on the chat. But check out Paul's workshop. And the other thing I wanted to mention is if anybody does have any questions for Ted or Adam, please drop it in the chat, and if you're catching this as a playback later as a podcast, in the show notes we'll have all of our Twitter handles, so feel free to tweet at us any questions you have.
Ted Tanner: Cool.
Joe Sepi: Yeah. I'm curious when you're thinking about this sort of from a global perspective, is there... Because I imagine it's very hard to change regulations and things like that, but are you focusing on one particular model and trying to get people to come to that or trying to fit everyone's needs? Or how do you approach that sort of thing?
Ted Tanner: Well, Joe, so I don't think there's one size fits all. If you look across sectors, the only industry that I think's really been successful, well, possibly two, the FinTech industry and the media industry with MPEG. I think MPEG is probably, it's allowing us to see and hear each other right now. But look how long that took to get to that "standard," and they had to align incentives at that level. We are just getting out of the box and gate. You know, to be really transparent, to overload a term here, relative to open source in the U.S. we have technologies that use flat files, they use EDI, 1970s EDI, and just introducing something like a JSON file format could be anathema. So, it really depends where we draw the line, but we have to be very volitional in our statements, and once... I'll go back to it again. We have to be very conservative in how we view things because you have to time things correctly in the market for them to grab hold. So, I don't think that there's going to be like, here's the one thing that fixes everything. I don't think that's the case. It's my hope that we do have distributed transactional integrity. I'm sure we're going to get into it talking about the projects and everything, but that is my hope with the various projects in the open source. If you look at, you've mentioned James Snell. He's awesome. He's the head of tech on the Linux Foundation Public Health Board of Directors. We are addressing that distributed transactional integrity initially via track and trace, but ultimately we want to have services and frameworks globally to hopefully enable that global trust protocol because it is a matter of trust and having the right data at the right time.
Joe Sepi: Yeah. That makes a lot of sense, and good luck with that. It doesn't sound like [inaudible] problems.
Ted Tanner: Well, if it was easy, everybody would be doing it, right?
Joe Sepi: Yeah, exactly. I think it would make sense to get into the projects that you are both working on, but I remember you sharing some stats in our prep call and they were pretty mind-bending. I don't know if you want to share any of those before.
Ted Tanner: Yeah. These are, get ready folks. These are probably about four or five years old and we can provide references to each, but let's run down the bullet list. 35 to 45% of all data at rest and in transit in healthcare is in clear text, and that was done, that was performed by the [inaudible] cybersecurity work group. Okay? And they found that there are on average about 17,000 breaches a day. I'm sure that number's gone up. 90% of the data in healthcare actually is unused from a machine learning standpoint, because it's a balance, right? It's like a seesaw. You're going to get so compliant and risk averse that you can't use it anywhere, so it's data hoarding, not data fluidity. And then the one that caused me almost to leave the healthcare industry, in 2015, there were 15 billion faxes. At least for your children's children's carbon footprint, please kill the fax. And then this is a really cool one, and this is more of a colloquial statistic, but this is really cool and it juxtaposed kind of my belief in a lot of things because historically I've been a very distributed agent machine learning process person. 80% of the transactions in healthcare in the U.S. eventually go through an IBM mainframe.
Adam Orentlicher: Yeah. So the one thing I just want to add on top of what Ted was stating here, the fax comment. All right? It truly was mind-blowing even when I heard that. But the other side of me says, let's look at what we see, the reality here with our customers in Watson Health. I mean, the reality of the situation is we are not getting a lot of FHIR data. You know, we talked about FHIR as a standard. We're just not getting a lot of FHIR data. We're getting flat files. We're getting CSV. Could be flat. You could keep going down the list. X12 messages. 837s. ASCII. It is as broad and as dirty as it could conceptually get. The level of dirtiness is skewed differently. It's much more dirty in the clinical sense because the physicians are not exactly the best at data entry, especially into their EHRs. But in reality here also, even in the claims space, the financial space, there's still a degree of dirtiness, and the way we think about this really in Watson Health is it's a matter of getting the dirty water clean, and then as we get the dirty water clean, we use standards as much as possible because then it allows all of you, it allows us ease of integration, ease of AI and ML when we're not dealing with all these disparate data formats and standards all over the place. So, I just wanted to latch onto that.
Joe Sepi: Yeah, I appreciate you digging into it because it's a lot here. From someone who doesn't deal with this regularly, there's a lot to sort through. So, let's talk about the projects you're both working on, and maybe it makes sense to talk about the Linux for Health one first.
Ted Tanner: Sure, I'll go for it. So the way we're looking at it, first of all, for the developers, it's aimed at abstracting all of that sausage making that Adam was talking about. So you can write code, grab an API and go. You can pass in the data format that you... It is going to automatically read. We use pydantic a lot to validate the data coming in. But the four main attributes, it's the world's first health OS, and I mean that in a literal sense, and the four main attributes that we look at for Linux for Health, the first is distributed transactional integrity. That is very important because of what Adam was talking about. There's so many different formats and all of those formats are encapsulated in Linux for Health. We abstract all of those, but when the transmission happens, we hook up a mainframe, we hook up LinuxONE, which is one of the best, if not the best. I think it's the best trusted execution environment in the industry from a hyper protect standpoint and a hardware standpoint. It's got FIPS 140 compliance, and then a hybrid cloud and then devices. So, you can actually have that same longitudinal patient record across all of those things agnostic to the data type connection. The next thing is move the data compute to the data. Move the compute to where the data resides via the transactional integrity, and then zero config footprint, I hope. I want to see this thing put in routers. You're not waiting weeks for a VPN to be configured, for instance. I want the configurations to be with your build files. And then the thing that's near and dear to my heart is I want to scale developers. I don't want you to think you can't get in this game just because you know a high level language. Give all the client libraries, Node, Ruby, Haskell, Lua, whatever it is, get it and go, copy pasta, go get your hello world done and build an application as fast as possible, and think about it in protocols. The future of the tech industry lies in fat protocols.
I truly believe that in distributed nature. But ultimately all this is open sourced. It's completely transparent. We're going out to the industry and evangelizing. But once again, distributed transactional integrity. Move the compute to where the data is. Zero config footprint. Developer first mentality, all on an OS. And then Adam, feel free to jump in with Alvearie, because that's a perfect balance for our bookends, if you will.
Adam Orentlicher: Exactly. So then the other side of that bookend is what we're calling Project Alvearie, and Project Alvearie is really meant to be... Think of it as the project that enables developers, once the data is ingested, to then essentially make sense of it and to use it quite effectively. Project Alvearie is all about modularized, componentized, extensible multi-cloud services that can be constructed in either pipelines or distributed networks to enable making sense of the data. So, it uses industry standard models both at the protocol level and at the semantic and syntactic level. We have reference implementations that are built as well. So, what Project Alvearie is, there's a set of services that are part of it, like I'll give you some example services that are part of it. So, we have a CQL-based cohort and measure service. CQL is, it's the industry standard that is rising in importance. Like HEDIS measures for example are now being published in CQL format, and then you can essentially use that to run a cohort analysis and/or measures. Cohorts are groups of patients. Measures are really just, simply put, a numerator-over-denominator calculation. Give me all the patients that have Type 1 diabetes and are on a specific type of medication. That's an example cohort. How many patients have Type 1 diabetes in a specific zip plus three, let's call it zip code tranches, as a measure. Other examples that we have here that are part of this, we have things like HL7 to FHIR converters. We have a data quality engine, right? Because data quality in healthcare is really sacrosanct, right? We need to be able to look at streams of patients that come in as an example and be able to determine when there are things like pipe delimiters that are added to an HL7 field or a record that are causing some downstream issue. So, you can look at that. I mean, there's a whole set of these. The IBM FHIR server is part of it as well.
We have a knowledge map, which is a graph-based server that enables medical knowledge queries and relationships. Okay?
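Editor's note: Adam's cohort and measure definitions can be sketched in a few lines of plain Python. This is not CQL and not the Alvearie cohort service; the `Patient` class, the field names, the sample data, and the choice of "all patients in the zip-3 area" as the measure's denominator are assumptions made only to illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    # Hypothetical, simplified patient record for illustration only.
    patient_id: str
    zip_code: str
    conditions: set = field(default_factory=set)
    medications: set = field(default_factory=set)


def cohort(patients, condition, medication):
    """A cohort is a group of patients that match some criteria."""
    return [
        p for p in patients
        if condition in p.conditions and medication in p.medications
    ]


def measure(patients, condition, zip3):
    """A measure is a numerator over a denominator.

    Assumed here: denominator = all patients whose zip code starts with
    zip3; numerator = those in the denominator who have the condition.
    """
    denominator = [p for p in patients if p.zip_code.startswith(zip3)]
    numerator = [p for p in denominator if condition in p.conditions]
    return len(numerator), len(denominator)


# Made-up sample data.
patients = [
    Patient("p1", "29401", {"type-1-diabetes"}, {"insulin"}),
    Patient("p2", "29403", {"type-1-diabetes"}, set()),
    Patient("p3", "29407", {"hypertension"}, {"lisinopril"}),
    Patient("p4", "10566", {"type-1-diabetes"}, {"insulin"}),
]

on_insulin = cohort(patients, "type-1-diabetes", "insulin")
num, den = measure(patients, "type-1-diabetes", "294")
print([p.patient_id for p in on_insulin], f"{num}/{den}")
```

In a CQL-based service the criteria would be expressed declaratively in a shared, published format rather than hard-coded like this, which is the point Adam makes about HEDIS measures being distributed as CQL.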
Ted Tanner: Yeah. The exciting part of these dual projects is because, like in Linux for Health, right, we have both the IBM FHIR server and the Microsoft FHIR server, and because the Apple CareKit uses LinuxONE, we can connect all those up in an efficient pipeline all the way back to the mainframe. But it leverages all the components. We can easily leverage all the components Adam was talking about or other engagements and other pull requests. But this is the great thing about this balance here. You know, focusing on components versus the entire... Either verticalize it or horizontalize it. The main thing that I hope we move toward is when we do move the compute to where the data resides, we can finally get around to distributed machine learning and federated learning, so that's what I'm personally excited about.
Adam Orentlicher: Yeah, I just want to add one more thing. One thing to comment on. Right? The act of this being composable and extensible and components is really fundamental, because the needs of, for example, a life sciences company or a pharma that is attempting to do a statistical de-ID scenario, like removal of names, street address, the obfuscation of social security numbers in the United States, restricting the granularity of geographic regions like truncating zip codes, looking at using age bins as an example over data and restricting the use of birth and death dates. Really important for de-identification, right? I mean, to be compliant with either Safe Harbor or the HIPAA expert rule. That is not necessarily a scenario that is as relevant to a health plan that is looking at enabling value-based care over a large population. A health plan is much more interested in interoperability and FHIR and perhaps even a business intelligence integration than they would be in this type of de-ID. Now they would use de-ID in another side of their business, but they're very different scenarios.
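Editor's note: The de-identification steps Adam lists (dropping names and street addresses, obfuscating social security numbers, truncating zip codes, using age bins, and restricting birth and death dates) can be sketched as follows. This is an illustrative toy, not the Alvearie de-identification service and not a compliant Safe Harbor implementation; the record layout, field names and ten-year bin width are assumptions. Real Safe Harbor handling has further requirements, for example special treatment of low-population zip-3 areas, that this sketch ignores.

```python
import re
from datetime import date

# Hypothetical field names for direct identifiers that are dropped outright.
DIRECT_IDENTIFIERS = ("name", "street_address")


def age_bin(age, width=10):
    """Map an age to a coarse bin; ages of 90 and over share one bin."""
    if age >= 90:
        return "90+"
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def deidentify(record, as_of):
    """Return a copy of record with the identifiers discussed above reduced."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "ssn" in out:
        # Obfuscate every digit but keep the shape of the value.
        out["ssn"] = re.sub(r"\d", "X", out["ssn"])
    if "zip_code" in out:
        # Restrict geographic granularity to the first three digits.
        out["zip_code"] = out["zip_code"][:3]
    birth = out.pop("birth_date", None)
    if birth is not None:
        # Replace the exact birth date with an age bin.
        had_birthday = (as_of.month, as_of.day) >= (birth.month, birth.day)
        age = as_of.year - birth.year - (0 if had_birthday else 1)
        out["age_bin"] = age_bin(age)
    out.pop("death_date", None)  # drop exact death dates entirely
    return out


# Made-up record for illustration.
record = {
    "name": "Jane Doe",
    "street_address": "1 Main St",
    "ssn": "123-45-6789",
    "zip_code": "29401",
    "birth_date": date(1980, 4, 2),
    "diagnosis": "E10.9",
}
clean = deidentify(record, as_of=date(2021, 8, 27))
print(clean)
```

Keeping each rule as a small independent step mirrors Adam's point about composability: a pharma pipeline might apply all of these, while a health plan's interoperability pipeline might apply none.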
Ted Tanner: Adam, that's a great point. And architecturally for those who've worked in varying degrees of big and small data, population health versus individuation, those architectures are at odds with each other. Just like it takes a certain amount of data to pre-cache and prewarm Spark for queries. You know, you just don't automagically throw Spark at something or HDFS for Hadoop and it's automagic. So, there's architecturally a give and take, right? So, it is a season to taste scenario, but fundamentally the substrate by development is from OS to componentization.
Adam Orentlicher: Exactly. Ted's absolutely right. If you're creating an analytic that you need to serve up data at the point of care, real time data, you would not necessarily use Spark unless it was a batch analytic. You would look at using something like Flink. But if you really are running analytics off of batch map terabytes, petabytes of data, that's really where Spark shines, and it's different technologies through the stack [inaudible].
Ted Tanner: Yeah. Like for instance, speaking of dropping down a click on the Linux for Health thing, we added IPFS. That's the most used drug database and distributed ledger, and we've added multiple blockchain connectors. Obviously Hyperledger. We added Ethereum. We're going to be adding Daml because it's used by the Hong Kong Stock Exchange and the NASDAQ. So, you can see how that distributed transactional acuity from componentization down to base substrate with that Metcalfe backend value proportional to N squared. The higher N squared, the better the value. Because somebody, for instance, somebody might just be vehemently opposed to using Cassandra. Like for instance. Does that make sense?
Joe Sepi: Yeah, yeah, absolutely. Did you have something, Luke?
Luke Schantz: Yeah. I was going to ask, something Ted said along the way I was curious about. So you mentioned doing compute where the data is and having the OS on routers and I was just wondering, explore that a little bit, because what comes to mind obviously is edge, right? The idea and the factors for doing edge, right? Sometimes it's network connectivity. Sometimes it's, like Adam mentioned, huge amounts of data. How would you move the right data? You don't even know what the right data is. You have to process it here to get the right data, and then I'm imagining also security and compliance and where you're moving that data. So I guess, and then the other thing that came to mind with this was homomorphic encryption. Is this considered edge computing when you do it that way, and what are the drivers and what are the use cases around which strategy you're using and why?
Ted Tanner: Yeah. A lot to unpack there, Luke, especially with the homomorphic encryption. We're getting there. Having started on that bus about seven years ago, we're getting there. And for those out there, that allows you to do analytics over encrypted data essentially. Let me unpack that. So, we can get into things like trust over IP and what I call IoT, Internet of Things. But not to buzzword everybody to death. I do think it's going to be a, let's start with real time and batch because I've done technologies in the past and several people have that ended up DDoSing certain corporations because they just didn't run it real time. So if you think about it, most consumer APIs run at no more than 400 milliseconds. Okay? If you look at a lot of, let's say an eligibility check, and an eligibility check is something called the 270/271 ASC X12 5010. Sorry to be getting so detailed. But the upper bound on some of those is 10 to 15 seconds. I could have a pedicure in between that. You know what I mean? It's crazy how that delay we somehow tolerate. So to your question, do I think it's, it's an and and or. So, I think depending on how you compile it, and we are going to put in an Alpine build first. This is going to be Linux Alpine first. We're also looking at Tiny Linux as well as ACRN, which is a fully distributed VM support model. But initially we'll want to put it in Alpine and look at it and say, "Okay. How do you connect up your connectors," and remember, this isn't us going to the industry and saying this is how it is. This is the developers doing pull requests and feature requests and saying, "Hey, can you connect up this heart monitor device and here's the PR for it," for instance. So, it is going to be possibly an and and or to your point, but I think that edge devices are going to play a critical role. You can't overlook what Nvidia is doing in the industry with their machine learning pipeline and purchasing Arm. I think they're going to change computing.
So, we must deal with that, and it's not just going to be like we suck everything up into this cloud somewhere. So, the deployment models are going to be very important. We're not dictating if you use Docker Swarm, Kubernetes or even OpenShift. You can use all three. You can use Podman, whatever you want to, because I want the developer and we want the developer to make applications as fast as possible, but it might not be possible if it's on-prem, on a mainframe and up in AWS. You see the difference? And then like I mentioned, the CareKit running on LinuxONE. That's got an ARM64 processor on it. One of the first things we did was run it on a PyCharm to make sure it was in the ballpark. I tried to unpack your question by going bottom up on that. Does that make sense, Luke?
Luke Schantz: I think it does, and I see how it's and and or.
Ted Tanner: Yeah.
Luke Schantz: It depends on the situation and it's driven by development and the need. Right?
Ted Tanner: Absolutely.
Luke Schantz: It's not, "Hey, we're going to make something and then deploy it. Everybody's going to use it." It's coming from the bottom up.
Ted Tanner: Absolutely.
Luke Schantz: Where the actual need is and the motivation for developers and integrators to work with this stuff.
Adam Orentlicher: Yeah.
Luke Schantz: Cool.
Adam Orentlicher: Luke, on the Alvearie side, and I think this is really important, right? The reason for and the services that we contributed to Alvearie are driven by the solutions that we have in Watson Health. So as an example, this is like us eating our own dog food in some respects. Right? So we are innovating in the open and then we are leveraging those technologies in the products that our customers use. So as a couple of examples, I gave the payer value-based care example. The services that I rattled off are being used in our Health Insights product. I mentioned our de-identification service that's being used in our MarketScan products. So, that's the base of the way we're thinking because we're dealing with petabytes of health data. We're dealing with the diversity of health data across these different scenarios. We're dealing with streaming as well as batch models. We're dealing with the fact that we have to have compliance. It's sacrosanct. There has to be trust that's built, so latching on to what Ted was saying earlier. You know, we are truly enabling a health data lakehouse here with a lakehouse architecture and we are using these services at scale and then we're continuing to innovate in the open.
Joe Sepi: Yeah, I'm glad that you put it that way too, Adam, because I feel like that's the IBM way, or at least how I see it in a lot of areas that you innovate in the open, you work in the open, you build these platforms and foundations in the open and then you build on top of that. But the open source work is a core part of what we do.
Adam Orentlicher: Exactly. And then Joe, the other thing we've tried to do here, these are very hard problems that we're solving. Make no bones about it. These are very hard technical problems. These are very hard business problems that we're looking to solve. So one of the things we've tried to do, in addition to just putting out all of our services in the open, we have what's called patterns, Alvearie patterns. Some of these patterns are actually, they're bringing Linux for Health and Alvearie together in these patterns. But those patterns are based on what we have implemented. So for example, clinical data ingestion, clinical data enrichment, quality measures and cohorting, clinical data access. So we are enabling these patterns to make it easier for developers to contribute to and leverage what we have already done using of course industry standards and our open services. And also, I want to be clear, these are all Apache licensed.
Joe Sepi: Did you have something to add, Ted, or should I jump into my question?
Ted Tanner: Yeah, yeah. Just I'll do an advertisement for CATB, The Cathedral and the Bazaar, right? I read it recently for the third time I think, and I always learn something new and I'm always humbled by it, but it is truly something I want to emphasize here. People still confuse open source with free. That is not the case. So, I just want to emphasize that there is no free lunch here. That's all I wanted to say.
Joe Sepi: No, I think that's a great point. I wanted to, and I think this kind of ties into that as well, but maybe you were sort of answering this already, Adam, with the patterns, but I'm curious, and maybe both of you can talk about this. Like, how do developers get involved in this? You know, it's open source, it's on GitHub, but what do you think a developer would do coming to these projects and maybe you have some advice for where they might start or how they would get involved?
Ted Tanner: Yeah. Oh, go ahead, Adam.
Adam Orentlicher: Okay, I'll go first. I'll say there's two things that I would suggest. First one is all this code and all of everything I'm talking about is up on Git. Okay? So that's where I would start. If I were a developer, I would start at Git. Okay? We also have set up a Slack server. Okay? alvearie.slack.com, if my memory serves. I'm pretty sure it's right. But we have set up a Slack server where you can converse and work with our developers, our development teams directly on the use of these services. Okay? That's the way I would get involved. It's really those two things. That's how I would start.
Ted Tanner: And so I'll jump in. Same thing. All the features for Linux for Health are on ZenHub. Go through all the epics. If you don't see a feature request or epic, hop on there. One thing I will say, just out of sheer propriety and courtesy of the open source industry, don't fork and leave please. If you fork something and change it, please give back to the community because ultimately we know where you forked it, so it's a drag. But I will say that we are making show up on Linux for Health. We got a very concise hello world and the concept of transactional routing. Most people can write a new route in a day or two and connect up a new thing. That's my goal. I don't want these developers wasting any more time hooking up the sausage making. You dig?
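To make the "write a new route and connect up a new thing" idea concrete, here is a minimal Python sketch of a route registry with a hello-world route. It is only an illustration under stated assumptions: the names `route`, `ROUTES`, `dispatch`, and both handlers are hypothetical and are not taken from the Linux for Health codebase, whose actual routing API may look quite different.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a data-format name to its handler.
Handler = Callable[[str], dict]
ROUTES: Dict[str, Handler] = {}


def route(data_format: str) -> Callable[[Handler], Handler]:
    """Decorator that registers a handler for one inbound data format."""
    def register(handler: Handler) -> Handler:
        ROUTES[data_format] = handler
        return handler
    return register


@route("hello")
def handle_hello(payload: str) -> dict:
    # The smallest possible route: echo a greeting back.
    return {"format": "hello", "message": f"Hello, {payload}!"}


@route("hl7v2")
def handle_hl7v2(payload: str) -> dict:
    # HL7 v2 segments are separated by carriage returns; the text before
    # the first "|" in each segment is the segment name (MSH, PID, ...).
    segments = [s for s in payload.replace("\n", "\r").split("\r") if s]
    return {"format": "hl7v2", "segments": [s.split("|", 1)[0] for s in segments]}


def dispatch(data_format: str, payload: str) -> dict:
    """Send a payload to whichever handler is registered for its format."""
    try:
        handler = ROUTES[data_format]
    except KeyError:
        raise ValueError(f"no route registered for {data_format!r}") from None
    return handler(payload)
```

In this sketch, connecting a new data source means adding one decorated function; the dispatcher itself never changes.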
Joe Sepi: Yeah, and I think that kind of gets to the question too a little bit more. So hello world, just briefly, what does that look like and what could you imagine someone doing, sort of playing with it? Or do you expect people to just come in and help build the platform and find APIs that need to be built out or whatnot?
Ted Tanner: So it could be an or. I prefer it's an and, but let's say it's an or in this case. Right out of the box, right out of the proverbial OS box, you can do an eligibility check on multiple FHIR servers with multiple types of connectors that Adam and us have went through, so that immediately gets you there, and then you can start actually looking at a longitudinal patient record. The provenance of record in this case, we're using Kafka. We looked at several things. I have had experience with protobuf, so nothing is off limits here right now. But getting in and actually doing, let's say an eligibility check on multiple FHIR servers with multiple types of connectors, CCDA, HL7, et cetera, and we even have a complete clinical data pipeline for those that are interested in the clinical side, complete clinical data pipeline out there. So you can do both administrative transactions and clinical transactions in the same LPR and also have a logical partition on mainframe LinuxONE and then as flat containers. But we also have OpenShift as well. Flat containers so you can just chunk it up on, go get an AWS server, chunk it up there and transact, and also automagically build it for ARM64. That's right out of the box. We have a generalized NLP REST API, so if you got some type of NLP, you want to go unstructured to structured data. And we also have our shining jewel, which I think's the best in the industry, although it is not open source, it's our NLP called ACD, and I think that's the best that we have connectors for that even, so if you're a customer for ACD already, we have connectors for that. So, you can do a ton out of the box with both of these projects.
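As a rough illustration of an eligibility check fanned out across several FHIR servers, the Python sketch below builds a standard FHIR search for a patient's active `Coverage` resources and runs it against each server. This is an assumption-laden sketch, not the Linux for Health implementation: the server names and URLs are placeholders, `coverage_search_url` and `check_eligibility` are hypothetical helpers, and treating "has an active Coverage" as "eligible" is a simplification. The HTTP call is injected as `fetch` so the logic can be exercised without a network.

```python
from typing import Callable, Dict
from urllib.parse import urlencode

# Placeholder endpoints for illustration only; not real servers.
FHIR_SERVERS: Dict[str, str] = {
    "payer-a": "https://fhir.payer-a.example/r4",
    "payer-b": "https://fhir.payer-b.example/r4",
}


def coverage_search_url(base_url: str, patient_id: str) -> str:
    """Build a FHIR search URL for a patient's active Coverage resources."""
    query = urlencode({"beneficiary": f"Patient/{patient_id}", "status": "active"})
    return f"{base_url.rstrip('/')}/Coverage?{query}"


def check_eligibility(
    patient_id: str,
    servers: Dict[str, str],
    fetch: Callable[[str], dict],
) -> Dict[str, bool]:
    """Ask every server for active coverage; True where any is returned.

    `fetch` takes a URL and returns the parsed FHIR Bundle as a dict.
    """
    results: Dict[str, bool] = {}
    for name, base_url in servers.items():
        bundle = fetch(coverage_search_url(base_url, patient_id))
        entries = bundle.get("entry", [])
        results[name] = any(
            entry.get("resource", {}).get("resourceType") == "Coverage"
            for entry in entries
        )
    return results
```

In practice `fetch` would wrap an authenticated HTTP GET that returns JSON; a real eligibility transaction might instead use the FHIR `CoverageEligibilityRequest` resource or an X12 270/271 exchange.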
Joe Sepi: Very cool. I come from an advocacy standpoint, background, and it sounds like this would be interesting for like a hackathon for lack of a better term. Not that it would have to be a competition, but just getting a bunch of people involved and ramped up. I guess I just wonder if anything like that is happening, any sort of events and getting people engaged.
Adam Orentlicher: It's in the cards, Joe.
Ted Tanner: Yep.
Adam Orentlicher: Be on the lookout for it. It's already in the cards. Yeah, we're working on it.
Joe Sepi: Yeah. I just think especially with the pandemic and everything, this is really front of mind for a lot of people and especially developers are always like, how can I help? What can I do? I want to make things better if it's really interesting.
Ted Tanner: Yeah, the network, as I always say, she who has distribution wins in software. Doesn't matter if you're making pencils or [inaudible] or paperclips. If you got the distribution, win. So, technical evangelism is paramount. The two best companies in the industry are Apple and Microsoft and we work closely with them. But technical evangelism at the most highest scale in advocacy, if you will, is paramount, and I believe getting the word out there, doing demos, doing hackathons, really ramping things up. And you've had Chris Ferris on here. I mean, he's awesome. So, this is a case where we have to get the word out on the usage of open source, the usage that this is possible even, and the usage that the community and the industry writ large is supporting it. Because I will tell you, the hyperscalers and other companies are definitely, I mean, look at what all of the other companies have put into the open source. So completely agree. Yeah.
Luke Schantz: And just to echo that, it makes so much sense for a hackathon project too, because if you're building in that environment, well, first of all, it's easy to have the connections and get started and not worry about all the underground sausage, but now the semblance of your idea is actually in a place that you can grow from and it can become real, because sometimes these projects coming out of hackathons, they turn into startups, they turn into open source projects and... I wore my Health Hackathon shirt today for the podcast. This was IBM and Johnson & Johnson sponsored at Cornell Health and it was right at the January, right when the pandemic was hitting and we did a pivot to a COVID-19 theme. John Walicki was there helping the teams and I was actually blown away by the sophistication of the solutions that these teams came up with in 48 hours. Remarkable.
Ted Tanner: And that's a great point. One of the things that has ushered in along with the new legislation is the pandemic has compressed the timeline of urgency relative to how fast we should be developing in the market. So, that's a very cogent point, so completely agree. I think that's greatly compressed things like the time to market for telehealth and allowing the virtualization of the provider to the consumer. I don't use the word patient because eventually I want healthcare to be preemptive and incentivized against the consumer. The other thing is that I just want to clear the air is we talk about providers a lot, and it's the goal of Watson Health to augment the healthcare professional. Full stop. I implore any developer out there. Listen to the pharmacist, the nurse, the healthcare provider, the surgeon, because they are the front lines, they are the rock stars, they know the use cases, and take those use cases in features and put them up on ZenHub and start doing PRs. As simple as that, Luke.
Adam Orentlicher: Yeah.
Joe Sepi: Let me ask a pointed question and I'll be blunt, and when I ask this I'm talking about the industry. Does healthcare care about open source?
Ted Tanner: You want to go first, Adam?
Adam Orentlicher: I'll go first on this one.
Ted Tanner: Rock on.
Adam Orentlicher: Think it's a yes and no. Okay. As usual. Developers care about open source because developers care that... Open source is a force multiplier. Open source is a way that you can mitigate bugs, using a Ted term. It realistically is, right? It's a way to accelerate what you do. It's a way to gain knowledge from the best of the best in an open community session, an open community setting. It really is. But the question becomes do enterprises care about open source? It starts trickling more down. I wouldn't say no, but it's now towards the maybe. Okay? And the reason I would answer it is for the maybe is because the enterprise are trying to solve a business problem. If the open source can help them solve a business problem in a cost effective way while enabling them to remain compliant and differentiated and competitive with whatever they're doing in the market, then it becomes a yes. Otherwise it becomes one of these where maybe they have to look at commercial alternatives. Those commercial alternatives could be using open source. So, it becomes this ladder effect, and I think healthcare cares about open standards much more universally right now, but it's really driven by mandate. Like there was a recent New York Times article that showed that if you look at providers, how many providers, how many hospital systems are adhering to the price transparency mandates? Very few. And those that are are not even issuing MRFs in standard formats. Some of the files are gigabytes in size as opposed to megabytes in size. So, open standards in support of value-based care and support of money generation, yes. It's drive to improve patient care. Yes. Mandates. Yes. Open source in support of open standards to me is yes and maybe.
Ted Tanner: Yeah. I am fond of quoting from CATB, enough eyes on glass make all bugs shallow. I just want to emphasize that. Do I think they care? Well, it's interesting given the number of years I've been doing this stuff, which is more than two. Inevitably you walk in and they're using open source. Do they care about the magnitude? And I'll keep going off of Metcalfe's law of value proportional to N squared of the developer ecosystem community. Per Adam's commentary, the problem of disintermediation lurks. Okay? And where that demarcation line is, and I'm rephrasing what you said, Adam, perfectly, is what they're concerned about, right? Now, a lot of people for years have been talking about the great health IT washout. Okay? Let's choose the two bookends. Commercial off-the-shelf software. I'm going to shrink wrap something and give it to you and put it in a managed environment. Some people might want that. Is that where most things are heading now and people are starting to talk about that from an enterprise standpoint? No. So, I think they're realizing like, okay, there's some there there and we must address it because here's the issue. Almost every developer that's worth her salt is contributing to some open source project, and I think from an ethics standpoint, a transparency standpoint and an interest standpoint, they're going to start asking about that as part of their interview process. That was a double answer, but I hope you caught the nuance there.
Joe Sepi: Absolutely. Yeah. And I'm curious, too, we talked about a lot of open source and we even touched on IBM's leadership there. Another thing that IBM cares a lot about, and I see this a lot in the work that I do in open tech, is open governance. And I'm curious, the work that you all are doing, it's in GitHub, it's open source. We talked about Linux Foundation for Public Health earlier. Are you working with them at all? Is there any plans there that you want to talk about?
Ted Tanner: Yeah. So we're working closely with them. Obviously I'm on the board of directors. Jeffrey [inaudible] is too. We're very involved. Just stuff for all the Linux Foundation of Public Health and Linux Foundation folks. We just re-upped our membership today, so rock on. You know, IBM has been a huge proponent of open source. Eclipse. If it wasn't years ago when the elephant Hadoop was the big data thing, I said Spark's going to run it over. IBM contributed hugely to Spark, and now that they have they are contributing to this effort and it is a greater good, we're being asked to possibly present LFH and Alvearie to LFPH. But I think the magnitude of the Linux Foundation in general is starting to really be visible in the industry, and so my hats off to everybody out there who's tangentially or directly touched that foundation. So thank you. It has been very humbling.
Joe Sepi: Yeah, it is amazing. I work with the OpenJS Foundation, which is an LF project and I work with LF folks a lot and it's really impressive what they do over there and the amount that they do and CNCF and all the other projects. It's really quite impressive.
Ted Tanner: And you got to give to get, Joe. You know that.
Joe Sepi: Yep. Absolutely. I feel we try to not evangelize IBM products here per se, but you both are working on Watson Health and I wonder if we should just touch on that as a way to wrap this up.
Ted Tanner: Yeah. I'm personally like a kid in the candy store. I think our research, our care and hope team, shout out to Dr. Gretchen Jackson, our Chief Science Officer, total rockstar. Hey, Gretchen. Dr. Henry Feldman, our CMIO. He is a coding practicing nocturnist. Literally he comes off of rounds and does PRs. From an algorithmic standpoint, we have things like pain management. I think we have the best disease propagation and causal inference models in the industry. I think we have the best NLP in the industry and we compute trust in the industry. We have the best subject matter experts and we know how to deliver trust and enterprise level health software unequivocally. I'll just say that. And that's payer, provider, life sciences, et cetera.
Adam Orentlicher: Yeah. And what we're really building here is ecosystems. We're building health ecosystems, right? In the end we have subject matter experts in health plans, in providers, point of care applications, and we referenced Dr. Henry Feldman and Dr. Gretchen Jackson that we have on staff. We have some of the largest pharmas who are using our real world data solutions as well as our clinical development solutions.
Ted Tanner: Drug repurposing stuff is world class.
Adam Orentlicher: Right? We have blockchain based solutions. You've seen, for example, the Excelsior Pass that the state of New York is using as a way to protect its citizens from COVID. So, it's really AI, blockchain, data and analytics plus all of the compliance, i.e., security, trust, that really is enabling us to transform our customers.
Ted Tanner: And it is amazingly exciting because I think now, once again due to the pandemic, we are uniquely positioned to deliver all of those use cases both horizontally and vertically through open source.
Adam Orentlicher: Yeah.
Luke Schantz: Sadly, we're out of time. This has been such a fascinating conversation.
Ted Tanner: Aw, c'mon, man.
Luke Schantz: I know. The hour goes so fast, and I would say definitely would love to have you both back on the podcast again. Also, from what you just mentioned, there's so many interesting characters and specialists within this ecosystem you're building. I would love recommendations for other folks within that, like this doctor who's doing pull requests, I would love to talk to that person. So, please send those on. And anyone listening, if we didn't get to any of your questions and you're hearing this on the podcast, please feel free to message us directly. And with that, we say farewell and it's been a great episode. Thank you, gentlemen.
Ted Tanner: Awesome. See you.
Adam Orentlicher: Cheers.
DESCRIPTION
In this episode of In the Open, please join us for a conversation with Ted Tanner and Adam Orentlicher. Ted is the Global CTO and Chief Architect of Watson Health, and Adam is the VP of Development for Watson Health. We will be discussing the open source projects Linux for Health, Project Alvearie and more.
Theodore Tanner, Global CTO & Chief Architect, Watson Health, @tctjr
Adam Orentlicher, VP of Development, Watson Health, @thisisadamo
Joe Sepi, Open Source Engineer & Advocate, @joe_sepi
Luke Schantz, Quantum Ambassador, @IBMDeveloper, @lukeschantz