Commercializing Ontology; Lucrative Jobs for Philosophers
APA Substack Newsletter: Public Philosophy Digest
This month’s APA Blog Substack Newsletter extends a discussion with Barry Smith, who is a Distinguished Julian Park Professor of Philosophy and Professor of Biomedical Informatics, and Computer Science and Engineering at the University at Buffalo. He is also Director of the National Center for Ontological Research and a lead developer of Basic Formal Ontology (BFO), an international standard top-level ontology (ISO/IEC 21838-2) used by over 700 ontology development groups across the world.
Barry’s work led to the formation in 2005 of the Open Biological and Biomedical Ontologies (OBO) Foundry, a set of resources designed to support information-driven research in biology and biomedicine. He also contributes to ontology projects in the military and security domains and is one of the founders of the Industrial Ontologies Foundry (IOF), whose goal is to advance interoperability of information systems in the industrial domain. Now he and John Beverley, his University at Buffalo (UB) colleague and Co-Director of the National Center for Ontological Research, lead what one might call the academic arm of the Department of Defense and Intelligence Community Ontology Working Group (DIOWG), in which the principles of the OBO Foundry are being applied in the defense and security domains.
In my first interview with Barry we discussed new private and public sector jobs in the burgeoning ontological sector, involving the commercial and industrial applications of ontology across diverse industries. We also explored the evolution of the field and how philosophers are ideally suited for these positions, which employ a variety of skills in logic and computing on the one hand, and in the design of classification systems and definitions on the other. Today, we build on that discussion with Barry and John to understand their continued progress and to reflect on how philosophers can transition into interesting and lucrative public/private sector jobs in ontology.
Barry, it is an absolute pleasure talking with you again to provide more information to the philosophical community about an unpublicized industry. To start with a refresher, can you provide a little context by describing your personal history in ontology, tracing your work in the biomedical arena before we get to your recent traction in the defense industry?
My interest in ontology can be traced back to my first encounter – in high school – with the work of Russell on the philosophy of mathematics. What struck me most forcefully in Russell was his view of the mathematical domain as a Platonic arena of perfect forms, as contrasted with the real world, which is full of irregularities and accidents. How, I wondered, might we combine the two? The beginnings of an answer to this question I found in a short book entitled Time and Modes of Being by Roman Ingarden, a Polish student of Husserl’s who saw ontology as divided into three branches – existential, material and formal. Existential ontology deals with modes of existence, for example independent (organisms, planets) or dependent (shapes, colors). Material ontology deals with the real world of matter and causality. The idea of a ‘formal ontology’ Ingarden took from Husserl, who was the first to use this term in his Logical Investigations. The term ‘formal’ is understood as meaning domain neutral. Part-whole relations are formal – we encounter them in every domain. Mereology, the theory of such relations, is one of the most mature branches of formal ontology.
I began to think about applied ontology in the 1990s after encountering work applying mereology to the field of Geographic Information Science (GIS). There, two kinds of boundaries are dealt with: natural boundaries, such as coastlines; and conventional boundaries, often created by drawing lines on maps to demarcate, for example, postal districts. I discovered that the GIS community treated the latter in ways which appear confused when viewed from a philosophical (ontological) perspective. Above all, they called them ‘conceptual boundaries’, a terminology that appears inappropriate when we reflect on the ways in which these boundaries affect human behavior, for example when countries go to war. I proposed instead the terminology of ‘fiat boundaries’, a term which is now used in hundreds of papers in GIS and related fields.
For my work in GIS, I received in 2002 an award from the German government to establish a research institute on a topic of my choice in Germany. By that stage there were already three centers devoted to applied ontology research, all of them in Italy. I founded the Institute for Formal Ontology and Medical Information Science (IFOMIS), which was the first such center devoted to the topic of medical ontology. IFOMIS combined my nascent applied ontology work with the newly burgeoning sphere of biomedical informatics, an area of intense activity at just that time, in light of the successful completion of the Human Genome Project.
To make this more tangible, please explain how the explosion of data such as occurred in the Human Genome Project has necessitated management and conformance across systems, with ontologists establishing standards that support the interchange of information across domains.
In the wake of the Genome Project biomedical scientists were attempting to work out how to use the new data to help in understanding human health and disease. There occurred a rapid growth, not merely in the amount of available data, but also in the many new kinds of data, and this led to the establishment all over the world of -omics data repositories. The problem was that the ways the data were codified and described differed from one repository to another. The new bioinformatics discipline was creating for itself a series of silos which threatened to cripple research.
The first successful attempt to solve this problem was initiated in 1998 by a group of researchers studying the genomes of three model organisms: fruit fly, mouse, and yeast. They created what they called the Gene Ontology (or ‘GO’), which consisted in a controlled structured vocabulary for describing the cellular components, molecular functions and biological processes associated with proteins and other gene products.
The GO vocabulary was ‘controlled’ in the sense that it provided a single, gradually expanding collection of terms to be used to describe data under these three headings. It was ‘structured’, first, in the sense that the terms were presented in graphical form (roughly: in taxonomies). Edges in the graphs were the relations is_a (meaning ‘subtype of’) and part_of. But it was structured also in the sense that it required for each GO term some sort of logical definition.
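As a rough illustration of what ‘controlled’ and ‘structured’ mean here, the following Python sketch models a tiny vocabulary with is_a and part_of edges. The terms, definitions and edges are simplified assumptions made up for the example and are not the actual content of the GO; the function names annotate and supertypes are likewise hypothetical.

```python
# Toy controlled, structured vocabulary. Terms, definitions and edges are
# illustrative assumptions, not the actual content of the Gene Ontology.
VOCABULARY = {
    "cellular component": "A part of a cell or its extracellular environment.",
    "organelle": "A cellular component with a distinct structure and function.",
    "nucleus": "An organelle that contains the cell's chromosomes.",
    "nucleolus": "A dense region found inside the nucleus.",
}

# Each edge is (child term, relation, parent term).
EDGES = [
    ("organelle", "is_a", "cellular component"),
    ("nucleus", "is_a", "organelle"),
    ("nucleolus", "part_of", "nucleus"),
]


def annotate(record_id, term):
    """'Controlled': accept an annotation only if it uses a vocabulary term."""
    if term not in VOCABULARY:
        raise ValueError(f"'{term}' is not a controlled term")
    return (record_id, term)


def supertypes(term):
    """'Structured': follow is_a edges upwards and collect more general terms."""
    found = []
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for child, relation, parent in EDGES:
            if child == current and relation == "is_a" and parent not in found:
                found.append(parent)
                frontier.append(parent)
    return found
```

In this sketch, data annotated with ‘nucleus’ can also be retrieved by a query for ‘organelle’, because the is_a edges make the more general term reachable, while part_of edges are deliberately not treated as subtype links.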
When the GO was first brought to my attention in the early days of IFOMIS, I realized immediately that it would form the principal focus of our work. At the same time, however, I realized that there were a great many problems with the way the GO, and especially its definitions, had been put together. I documented these problems and brought them to the attention of the GO leadership, who were sufficiently convinced by my arguments – many of which could only have come from a philosopher – that they put me in charge of the logic, first of the GO and then later of an entire body of biomedical ontologies – the OBO Foundry referred to above. Foundry ontologies were built to work together with the GO in areas such as disease, anatomy, proteins and so forth. It was in this way that ‘applied ontology’, in the sense in which this term is understood by Buffalo ontologists and by the hundreds of ontologists who follow in our footsteps, was born.
In 2024 there is, for a number of reasons, a tremendous surge in the need for ontologists, which – given the shortage of persons with ontology skills – goes hand in hand with very high salaries. The surge reflects the fact that the problem of data silo formation arises not only in biology and biomedicine but in every information-driven activity. Each group builds its data systems in its own way. But then, when they have to work with other groups – for example when two companies merge, or when a single company wants to make use of data created in different branches or in different stages of its existence – failures of interoperability are almost guaranteed to ensue. And these failures, it is now being discovered, create havoc when companies attempt to use Large Language Models to harvest value from their data. This is where the ontologist enters the picture. The trick is to persuade responsible persons at different levels of an organization that there is a well-tested road to solving the problem of interoperability, which rests on the creation, by human beings, of robust and shareable ontologies. These ontologies can then be used in different ways – to serve as maps between existing data systems; to serve as a source of well-defined terms for the creation of new data systems; to serve as a common vocabulary which human beings can use, for example, when negotiating mergers; and in many other ways.
Barry, to continue to define the landscape, please frame the day-to-day responsibilities and potential salaries across various industries.
There is no typical day in the life of an ontologist. Often it will involve meetings, mostly online. Often it will involve the crafting or revising of systems of definitions. Often it will involve working with experts in non-philosophical fields. Often it will involve coding. Or training. Or presenting at conferences or workshops and publishing results, for example in one of the dozen or so ontology journals existing today, from the Journal of Biomedical Semantics through the International Journal of Metadata, Semantics and Ontologies to the journal Applied Ontology.
A theme common to all applied ontology jobs is a focus on specific information-driven disciplines or application areas or government agencies or companies (for example Amazon, or Capital One, or Raytheon). The work will thus often involve collaborating with subject matter experts in ways focused on the needs of the organizations these experts represent.
Demand for trained ontologists has increased dramatically in the past three years. Entry-level salaries for individuals with ontology-focused master’s level credentials are, on average, easily in the six figures; salaries are considerably higher for those with doctoral level credentials. The Department of Philosophy at the University at Buffalo – where most applied ontologists in the U.S. are trained – has a placement record of 100% and salaries of this magnitude for ontology graduates.
When we (John and I, and now many others) started down this road, there were those who would say that we thereby ceased to be philosophers. Our work has indeed been almost completely ignored by those in the philosophy mainstream. On the other hand, there were those who would accuse us of being ‘just philosophers’ – and thus persons who have no business telling natural scientists or data engineers or intelligence analysts how to do their work. I think both groups are wrong. My background as a philosopher plays a role in everything I do. I draw, for instance, on the sorts of argumentative skills I learned as a philosopher – giving me the ability to formulate arguments that will convince skeptics of the rightness of a given ontological approach, or to serve as an adjudicator when conflicting ontological choices need to be resolved, or for working out a logically coherent means of representing, for example types of orbits, or types of neurons.
Also, what is your counsel on steps to build a career in ontology, and how can philosophers learn more?
I would recommend, first, reading the book Building Ontologies with Basic Formal Ontology as it provides a bridge between philosophical and applied aspects of ontology. I would in addition encourage attending one or more meetings of John’s weekly online working group (write to him at johnbeve@buffalo.edu). John runs an "Ontology 101" working group aimed at bringing participants up to speed on our methods. For those interested in pursuing a more classical approach to training, the Department of Philosophy at UB has recently created a graduate-level Applied Ontology track, with courses covering the underlying formal logic of applied ontology practices, the application of ontologies to intelligence analysis, biomedical ontology development, and the ontology of economics, among many others. UB will thereby soon be home to the world’s first academic program in applied ontology.
John, as my APA Blog Philosophy and Technology Series construes technology as broadly as possible, exploring its impact on the discipline and culture, please expand on how technology has enabled and accelerated this new realm. Secondly, note how, alternatively, these jobs exist and are so valuable because of technology’s limitations. Indeed, we discussed the absolute differences between machines and human beings in our second interview about Barry’s book with Jobst Landgrebe, "Why Machines Will Never Rule The World", where Barry and Jobst explained how human thought is fundamentally non-logical and its ingenuity cannot be replicated.
As Barry indicated, the field of applied ontology is experiencing an explosion of interest. This is not the first time the ontology field has been at the forefront of advancing technology, but it is – to my mind – the first time technology has advanced enough to achieve the goals of integrating disparate bodies of data using ontologies. In the early days of GOFAI (for: good old-fashioned [logic-based] AI), there was a naive hope that, with enough symbolic logic, we might build generalized AI. Systems like, for example, SHRDLU led to optimism and eventually to early ontology-adjacent projects, such as Cyc (short for ‘Encyclopedia’), which sought to represent all human knowledge using first-order logic. Hype and overpromising were everywhere, but technology was not advanced enough to make good on such promises. You have to remember that, back then, if you wanted to work with AI you often had to build your infrastructure in-house. This was before Amazon offered web servers, before wifi, before Python and all its helpful libraries. A decent amount of the blame for AI failures was, however, directed towards the emphasis on logical representations. Accurately representing an actual, real-world, domain in first-order logic takes a lot of work, will likely be always incomplete, and will likely be fragile, meaning any slight change in the domain will undermine the accuracy of the representation. As a consequence, rival AI approaches emerged, based on connectionist methodologies, predecessors of what we call machine learning and natural language processing paradigms. Logical precision, with its inherent fragility, gave way to statistical models, which were more accommodating to real-world modeling.
Lessons from those early (GOFAI) days were the inspiration for Tim Berners-Lee’s proposal for the creation of a Semantic Web, the dream of a machine-readable world wide web using web standards for encoding semantics in data stored online. Tim’s idea was to impose a ‘subject-predicate-object’ format for all data on the web. The power of this idea led to another boom period for applied ontology; but here again technology was not yet up to the challenge. Researchers no longer needed to build their own servers and other architecture, but they nevertheless ran aground on the fragility of ontology representations, coupled with the seeming impossibility of representing the semantics of all web data. Fragility came also from too many ontologists building representations of the same thing in different ways. Failure to represent all web data came about from too few ontologists relative to the exponential increase in size of the world wide web. Perfection was, in this boom period, the enemy of the good.
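To make the ‘subject-predicate-object’ format concrete, here is a minimal Python sketch that stores a few statements as plain tuples and retrieves them by pattern. Real Semantic Web data uses the W3C’s RDF, with IRIs as identifiers; the plain strings, the predicate names and the match function below are simplifying assumptions for illustration only.

```python
# A handful of 'subject-predicate-object' statements as plain tuples.
# The identifiers and predicate names are made up for illustration.
TRIPLES = [
    ("Buffalo", "located_in", "New York State"),
    ("University at Buffalo", "located_in", "Buffalo"),
    ("University at Buffalo", "is_a", "University"),
]


def match(subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in TRIPLES
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]
```

Calling match(predicate="located_in") returns the two location statements, and match(subject="University at Buffalo") returns everything asserted about that subject. The uniform three-part shape is what lets data from unrelated sources be pooled and queried in the same way.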
We are now again in a boom period with ontologies at the heart of advances in contemporary AI. It is well-known at this point that LLMs are capable of impressive feats as well as drastic failures. Excitement around LLMs has generated something of a gold rush towards implementations of these technologies, with caveats that their outputs cannot be blindly trusted. Hype around these models has highlighted again and again the need for trained ontologists. LLMs are statistical black box input-output machines. They are not designed to provide robust explanations for their output. Consider though, if I ask such a chatbot whether I should give my three-year-old daughter Tylenol, I do not simply want a “yes” or “no”. I want justification for such a response; I want evidence. I also want to know that I can trust whatever evidence is provided. LLMs are not designed for this level of justification. Systems employing ontologies, however, are. Ontologies are developed based on a formal logical language – the World Wide Web Consortium-approved Web Ontology Language. This is a computer-friendly fragment of First Order Logic which permits the generation of entailments and proofs. A question-answer system built on top of an ontology can provide justification for output, by generating steps in a proof leading to that output.
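The idea of answering a question by exhibiting the steps of a proof can be sketched in a few lines of Python. This is a hand-rolled toy, not an OWL reasoner: the class names borrow Barry’s boundary example from earlier in the interview, and the axioms and the explain function are hypothetical.

```python
from collections import deque

# Toy axioms of the form "every A is a B". The class names are hypothetical
# and the reasoning below is a simple search, not a real OWL reasoner.
SUBCLASS_AXIOMS = [
    ("PostalDistrictBoundary", "FiatBoundary"),
    ("FiatBoundary", "Boundary"),
    ("Coastline", "NaturalBoundary"),
    ("NaturalBoundary", "Boundary"),
]


def explain(start, goal):
    """Return the chain of axioms leading from start to goal, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        current, steps = queue.popleft()
        if current == goal:
            return steps
        for sub, sup in SUBCLASS_AXIOMS:
            if sub == current and sup not in seen:
                seen.add(sup)
                queue.append((sup, steps + [f"every {sub} is a {sup}"]))
    return None
```

Asked whether a postal district boundary is a boundary, explain("PostalDistrictBoundary", "Boundary") returns the two axioms used to reach that conclusion, so the answer arrives together with its justification. Asked whether a coastline is a fiat boundary, it returns None, because no chain of axioms supports that claim.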
Now I would like to turn to the context of the discipline and situate this ontological domain in the history of philosophy. Does this new industry represent the end of philosophy, in the sense that real problems are finally being solved, or simply its most practical application, invigorating the discipline by demonstrating the breadth of its purview? Barry, when I first raised this question, you discussed how Aristotle was in your opinion the original applied ontologist – so please expand on how this new field draws on his work and the classical goals of the discipline. Indeed, a recurring theme I have explored in the APA Blog has been the ends of philosophy, such as in our Substack Newsletters discussing intellectual ambition with Samuel Kimbriel in the context of his essay "Thinking is Risky" and Tim Andersen on the limitations of objectivity in Physics and Religious Faith. As a philosophy professor steeped in the tradition, please share your views on the progress and promise of the discipline.
Creating a common vocabulary across different data sets can address interoperability problems only if the ontologies people build are themselves interoperable. But the history thus far shows that the more popular the idea of an ontology-based method becomes, the more people are attracted to building ontologies – and then they all do it in different ways, resulting in another ontology winter. This is where philosophy enters into the picture. For almost two millennia philosophers knew where they could go to find a common set of highly general terms which they could then use as a starting point for defining more specific terms in different domains of interest. Aristotle’s Table of Categories is not quite the ontology we need to ensure interoperability between domain ontologies being built today; but Aristotle’s idea serves as the basis, alongside the work of Ingarden, for an ontology – called Basic Formal Ontology (or ‘BFO’) – which is serving this purpose today. BFO is the favored standard for the OBO Foundry, the IOF, and multiple government organizations. BFO is an international standard: in 2021 it was subjected to the standards validation process of the International Organization for Standardization (ISO), and it is documented as ISO/IEC 21838-2. This represents the first time in history that a piece of philosophy has been declared an ISO standard. It owes this special position precisely to the fact that, like Aristotle’s Categories, it is so general and also so small, consisting of just 35 basic ontological terms such as ‘object’, ‘process’, ‘disposition’, ‘quality’, and so forth. It also has one advantage over Aristotle in that it was built to be useful – not only to humans, but also to computers.
John, perhaps you too can expand on your work in the context of the history of the discipline.
I do not see applied ontology as the end of philosophy but rather as the emergence of a new discipline from philosophy. I see philosophy, as I believe many of us do, as the study of wisdom; wisdom is not itself a specialty, but it spins out specialties. As philosophers we investigate challenging questions, striving for consensus not merely on answers, but also on methods for obtaining answers. In that respect, we are inventors of methods. Once some set of methods is codified and sufficiently accepted by a community, it forms the basis of a new science. This has happened time and again in the history of science. Newton styled himself a natural philosopher; the early Chomsky debated philosophers in print; Wundt’s main journal, which initiated the discipline of experimental psychology, was called Philosophische Studien; and on it goes. Applied ontology is emerging along similar lines as a new discipline. As consensus is forming around our methods, for example BFO-based automated reasoning techniques, there arise also journals, international conferences, research centers – and now finally, in Buffalo, degree programs. None of this implies the end of philosophy. Applied ontology cannot solve all problems humans find or will find interesting. We will need new sciences, after all.
Further looking to the future, John, please tell us about your new traction where, per a recent press release, you have developed a resource that will be adopted as the standard for ontology-related projects for all agencies within the DOD and Intelligence Community (IC). After describing how BFO will support the interchange of information between domains – providing controlled, shared vocabularies – please tell us about potential new partners.
BFO and one of its more widely used extensions – the Common Core Ontologies (CCO), created by our close partners at CUBRC – have as of this year been directed for use as “baseline standards” across the DOD and the Intelligence Community. This has already led to a significant increase in ontology development among these communities and highlighted the need for more ontologists trained to use BFO and CCO. We are, moreover, working closely not only with the DOD and the IC but also with the Department of Homeland Security, where BFO serves as the basis for a new cross-agency data repository. This reflects not only the fact that some of our students have over the years been employed by these agencies, but also the many training events in which we have been involved.
BFO serves the DOD and IC, as well as DHS and other government agencies, by providing a common, highly general, logically rigorous, vocabulary from which more specific ontologies can be extended based on user needs. The central idea behind leveraging BFO in these environments is actually quite simple. History has shown that, if left to their own devices, different agencies will create different vocabularies and coding systems to represent domains of interest, none of which will be easily integrated with any other. By having agencies use a common vocabulary, even at a highly general level, we can ensure data represented with that vocabulary shares at least a minimal semantics. We then go further by extending BFO with ontologies designed to represent content in more detail. For example, where BFO stops at the level of ‘process’, ‘object’, and so on, extensions of BFO such as CCO introduce terms designed to represent, say, ‘tracking process’, ‘act of driving’, ‘physician’, ‘ground vehicle’, and so on. Importantly, extending from BFO requires in every case maintaining a logical path into BFO. A ‘tracking process’, to illustrate, would be a specialization of the BFO class ‘process’. It is in this manner – securing common semantics across disparate data sets – that BFO acts as an information exchange between domains.
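The requirement that every extension term keep a logical path into BFO lends itself to a simple automated check, sketched below in Python. The parent links are simplified assumptions rather than the actual BFO or CCO hierarchies, and ‘mission widget’ is an invented term included only to show what a failure looks like; the function name path_into_bfo is likewise hypothetical.

```python
# 'process' and 'object' stand in for BFO classes. The parent links are
# simplified assumptions, not the actual BFO or CCO hierarchies.
BFO_CLASSES = {"process", "object"}

PARENT = {
    "tracking process": "process",
    "act of driving": "process",
    "ground vehicle": "vehicle",
    "vehicle": "object",
    "mission widget": "widget",  # hypothetical term with no route into BFO
}


def path_into_bfo(term):
    """Walk parent links upwards; return the path if it reaches BFO, else None."""
    path = [term]
    while path[-1] not in BFO_CLASSES:
        parent = PARENT.get(path[-1])
        if parent is None or parent in path:  # dead end or cycle
            return None
        path.append(parent)
    return path
```

Here path_into_bfo("ground vehicle") yields the chain from ‘ground vehicle’ through ‘vehicle’ to ‘object’, whereas path_into_bfo("mission widget") yields None, flagging a term whose data would not share even the minimal common semantics described above.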
The list of partners grows almost daily, from governments, to industries, and even to academia. Alongside the DOD, IC, and DHS communities, we are routinely in contact with representatives from the Johns Hopkins University Applied Physics Lab, the Air Force Research Lab, KadSci, CUBRC, and MITRE, now also with the Program for Artificial Intelligence of the European Innovation Council. Our close ties to the Industrial Ontologies Foundry involve regular engagement with representatives from the National Institute of Standards and Technology (NIST), Crownpoint, and the University of Arizona, among many other partners. Partnerships of this sort result in collaboration on publications, on grant proposals, providing and receiving support for ontology projects in real-world applications, as well as conferences and workshops. This is not to mention the regularity with which we are solicited for talented ontologists. Bloomberg, Amazon, and Morgan Stanley (see Indeed.com) have each come calling, alongside many, many other organizations hoping to leverage ontologies at scale.
Lastly, I would like to explore potential qualms in the philosophical community: is there a danger in the close coordination of the discipline with the DOD and IC? One of my favorite APA Blog pieces was Deadly Drones, Killer Trolleys by Christopher Kutz, who maintained that philosophy has been complicit in the moral calamity of remote warfare, where the discipline reduces “complex moral and political questions to reductive, God’s-eye judgements about who is ‘liable to be killed’”. Christopher makes a compelling case that “Trolleyology’s” extension into the ethics of war has been calamitous. Is it fair, then, for the philosophical community to be wary of this unusual alliance with the public/private defense industry? Given the stakes, perhaps both of you could address the question, starting with Barry.
Barry: I am not, I’m afraid, any sort of expert in matters of ethics. On the other hand, I am very pleased with the way in which applied ontology is being found useful, not just in the biomedical field but also in areas such as AI, ecology, business, manufacturing and so forth. Applied ontology is indeed in some ways mimicking the trajectory of ‘applied ethics’, which has already performed a valuable service in demonstrating to the wider world that philosophy – and philosophy training – might actually be useful. For BFO to reach the point where it could be approved as a standard by the International Organization for Standardization, and this with the support of the Department of Defense, is I think good for the discipline, since it shows that philosophers are seen as being useful in hitherto unanticipated ways.
John: I unfortunately have little patience for most ethics I read on this topic. I am being sincere when I say that this is unfortunate. I find it deeply upsetting. On the one hand, opining over how high-stakes decisions might be calamitously reduced to trolley problems in war is, to my ears, tone deaf. Decisions over life and death on the part of the military are not simply reductive utility calculations. The DOD chain of command, for example, is incredibly complex in order to create redundancies that require oversight whenever decisions that impact human lives must be made. This is because decision makers in the DOD know how easy it is to lean on such reductive strategies; so, they create infrastructure to make it challenging to do so. They recognize the dangers of building straw soldiers (i.e. holding reductive views of military decision making that are so simplistic as to be susceptible to complaints of ‘trolleyology’).
On the other hand, I take a central question worth addressing in this space to be not whether we are complicit, but what our responsibilities are as philosophers and ethicists. To that question, I say philosophers and ethicists should be working in areas where important ethical decisions are being made that affect people’s lives. Few people in government or private industry, however, want to hire an ‘ethicist’ to tell them why what they’re doing is good or bad. Applied ontologists, in contrast, are treated like engineers in these circles, engineers who have in many cases strong backgrounds in ethics. In my experience, engineers who raise moral objections or raise issues surrounding morally complicated scenarios are taken more seriously in these settings than those hired as ‘ethicists.’
I doubt that writing blog posts or publishing academic articles about, say, just war theory has much of an impact on real-world decision making in challenging, morally complex, situations. I know being an ethically trained ontologist working with decision makers does have an impact. To my mind then, rather than arguing about complicity, or publishing yet more work on trolleys, our aim should be to train ethical engineers. This is precisely what we are doing at UB.
Barry and John, I think you both raise fair points about the complexity of the subject and demonstrate how your work is an excellent, practical counterpoint. Indeed, to conclude, I greatly appreciate your time in briefing the philosophical community on your ground-breaking efforts. I often read about the arduous state of academics and challenges of being a professional philosopher, so I am very excited to share your initiatives on the APA Blog. I encourage anyone interested in embarking on a career in ontology to reach out to both of you, and I hope we can talk more about your progress in the future!
From the APA Archive:
Coming Robot Rights Catastrophe
Interview with Christopher Tollefsen
Bioethicists Must Take Off Their Blinkers
Further Reading:
B Smith, M Ashburner, C Rosse, et al., “The OBO Foundry: Coordinated evolution of ontologies to support biomedical data integration”, Nature Biotechnology 25 (11), 2007, 1251-1255.
B Smith, W Ceusters, B Klagges, et al., “Relations in biomedical ontologies”, Genome Biology 6 (5), 2005, R46.
R Arp, B Smith, AD Spear, Building Ontologies with Basic Formal Ontology, Cambridge, Mass.: MIT Press, 2015.
P Morosoff, R Rudnicki, J Bryant, R Farrell, B Smith, “Joint Doctrine Ontology: A Benchmark for Military Information Systems Interoperability”, Semantic Technology for Intelligence, Defense and Security (STIDS). (CEUR vol. 1325), 2015, 2-9.
N Otte, J Beverley, A Ruttenberg, “BFO: Basic Formal Ontology”, Applied Ontology 17, 2022, 17-43.