By Ben Goertzel & Nil Geisweiller
One big part of the SingularityNET vision is the creation of a decentralized “society of minds” in which multiple AI agents can cooperate and collaborate to solve problems together — so that the SingularityNET network as a whole becomes a complex self-organizing system in which the intelligence of the whole greatly exceeds the intelligence of the parts.
The current SingularityNET marketplace comprises mainly agents that carry out AI functions for customers on their own, without referencing other agents on the platform in their own back-end operations. There is no obstacle to connecting multiple agents on the current network into complex assemblages; however, the platform also does not provide any particular tools designed to make this easier and more natural.
The SingularityNET whitepaper alludes to a meta-API or API-of-APIs according to which AI agents can describe to each other various aspects of their operation, to make it easier for agents to make automated decisions about which other AIs to connect to for what reasons. However, the whitepaper does not describe in any detail how this API-of-APIs would be designed or implemented.
Now that the basic SingularityNET platform is operational in a strong beta version with a number of valuable agents running on it, it’s time to bite the API-of-APIs bullet!
In our discussions with Charles Hoskinson and his colleagues at Cardano regarding the planned deep integration of the SingularityNET network and the Cardano blockchain, we realized there were some very interesting synergies in this regard. Cardano’s Plutus smart contract language, with its flexible capability for abstraction as derived from its use of the Haskell functional programming language, turns out to be particularly well suited for implementation of a SingularityNET API-of-APIs.
Using the current Ethereum implementation of SingularityNET, the only sensible approach to implementing an API-of-APIs would be to make it a wholly separate piece of software from the smart contract layer (invoked from the smart contract layer via foreign function calls or similar). However, using Plutus it should be possible to embed the API-of-APIs more directly in the smart contracts.
We have begun thinking in terms of an AI-DSL (Domain-Specific Language), analogous in some ways to the Marlowe DSL for finance that has been implemented already on the Cardano blockchain (though there are many differences between Marlowe and the sort of AI-DSL we need to implement a SingularityNET API-of-APIs). In essence, the AI-DSL we need here is a flexible and robust description language that an AI agent can use to describe its relevant properties to other AI agents and other general software processes that want to interact with it.
This blog post presents some of the design thinking we’ve been doing regarding the AI-DSL. These ideas are sure to evolve as they are fleshed out in the course of implementation, but they have now reached a point where it seems worthwhile to share them with the community.
As well as high-level design thinking, we have been doing some simple implementation experiments with these ideas in the Idris programming language, and exploring the idea of implementing an AI-DSL in Idris and then writing an Idris-to-Plutus compiler so that the AI-DSL can operate directly within Plutus smart contracts. There is some overlap here with recent work we’ve been doing on the OpenCog Hyperon design, which involves a new AI programming language (currently being called Atomese 2) that incorporates many ideas from Idris along with further refinements like gradual and probabilistic types. But in this post, we’ll try to keep things relatively simple and not dig too deep into dependent-type intricacies, so as to keep the discussion accessible to anyone with a bit of CS or software background.
So let’s plunge into the practicalities. What exactly do we need an AI-DSL for? The language we are creating will allow:
- AI agents to effectively interoperate with each other without human intervention (including the choice of which AI agents should interoperate together in which contexts)
- External software systems to know how to interact with AI agents
- AI agents to know which datasets they should consider ingesting or producing
- Systems in charge of running or guiding the execution of AI agents on infrastructures, to know how to do so most effectively
Key properties to be specified in the AI-DSL include: the structure of input and output data, processing cost (in money, time, compute resources), quality of results, the evidence used, fairness according to various standards, and internal processing structure including potential for concurrent processing.
The term “AI agent” is used judiciously, where an “agent” means a software process that potentially has some ability to autonomously act in various contexts, perhaps based on some goals it possesses. An AI algorithm or process without ability to act “on its own” in any context can be considered a special case of an agent with vanishingly small agency; the AI-DSL envisioned will also cover this case.
In the context of the SingularityNET platform, this AI-DSL will then serve as the language underlying the “meta-API” or API-of-APIs — a formal language that AIs in the SingularityNET network use to express their APIs and communicate them to each other. The idea is that an AI acting in SingularityNET should be able to communicate its API and relevant associated properties, as well as critical properties of how it fulfils its API functions, in a common formal language with clear semantics.
A SingularityNET subnetwork comprising AI agents that are annotated with AI-DSL descriptions will be capable of a higher level of intelligent functionality than a SingularityNET subnetwork comprising agents without such annotations. The AI-DSL descriptions allow AI agents to be automatically composed into assemblies carrying out combined functions, and they allow software processes (including AI agents and external ones) to automatically gain all the information needed about AI agents to make decisions regarding which agents to work with in which contexts.
It is for these reasons that the “meta-API” or “API of APIs” was included in the original SingularityNET high-level design overview, though the current SingularityNET implementation lacks such mechanisms, instead simply requiring each AI agent to present a software API without a clear formal semantics.
To integrate the AI-DSL with the current Ethereum-based implementation of SingularityNET would require the DSL to exist on a completely different level from the Solidity smart contracts specifying the operation of the platform. This is perfectly feasible but becomes an awkward and inefficient design. In the context of SingularityNET-on-Cardano, things can be much more elegant; the AI-DSL can be made to compile into Plutus so that AI-DSL expressions can be manipulated directly within Cardano smart contracts.
The AI-DSL we’re envisioning requires a hierarchical data ontology (covering datasets and data streams) in order to meaningfully function.
E.g. a data ontology would contain a data-type for “sequence of symbols over a fixed alphabet”, with a way to indicate whether this sequence is available in batch mode or as a stream; and then a data-type for “natural language text” which is a subset of “sequence of symbols over a fixed alphabet”; and then a data-type for “English natural language text” which is a subset of “natural language text”; and then a way to specify the specific encoding of a text; etc.
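As a rough illustration of such a hierarchy, here is a minimal Python sketch. The names `DataType`, `Mode` and `is_subtype_of` are hypothetical, chosen only for this example; they are not part of any SingularityNET API, and a real ontology would need far richer structure.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    """How the data is delivered (hypothetical field for this sketch)."""
    BATCH = "batch"
    STREAM = "stream"


@dataclass(frozen=True)
class DataType:
    """One node of a toy data ontology; `parent` is the more general type."""
    name: str
    parent: Optional["DataType"] = None
    encoding: Optional[str] = None  # e.g. "utf-8"; None means unspecified
    mode: Optional[Mode] = None     # None means either batch or stream

    def is_subtype_of(self, other: "DataType") -> bool:
        """Walk up the parent chain looking for `other`, compared by name."""
        node: Optional[DataType] = self
        while node is not None:
            if node.name == other.name:
                return True
            node = node.parent
        return False


symbols = DataType("sequence of symbols over a fixed alphabet")
text = DataType("natural language text", parent=symbols)
english = DataType("English natural language text", parent=text)
english_utf8_stream = DataType(
    "English natural language text (UTF-8, streamed)",
    parent=english, encoding="utf-8", mode=Mode.STREAM,
)

# An agent accepting any natural language text can consume English text,
# but an agent requiring English text cannot consume arbitrary symbols.
print(english_utf8_stream.is_subtype_of(text))  # True
print(symbols.is_subtype_of(english))           # False
```

Comparing nodes by name keeps the sketch short; it also means two ontologies that disagree on naming would need an explicit mapping, which is one place where the robustness concerns discussed next come in.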
It is understood that any such data ontology is going to be imperfect and also is going to change over time, and the AI-DSL must be robust with respect to these aspects of real-world data ontologies.
We will not enlarge on the data ontology in detail here, but many attempts in this direction exist (“The great thing about standards is there are so many of them”) and we will integrate several as appropriate guided by the specific needs of the AI agents on the SingularityNET platform now and slated for near-future development.
The AI-DSL will characterize an AI agent using a type description formulated using a dependent type system. Currently, we are doing some basic prototyping using the Idris language, and it seems that Idris will very likely have sufficient flexibility for the AI-DSL envisioned.
The choice of dependent typing has been made after some reflection on the AI-DSL’s goals.
One important goal here is that key properties of combinations of AI agents should be inferable from properties of the individual AI agents being combined, by a relatively rapid and deterministic “type checking” process.
Another important goal is that key AI agent properties (e.g. resource consumption, utilization or otherwise of certain bodies of evidence, fairness according to specified criteria, architectural correctness according to given specifications) should be formally verifiable, via a relatively rapid and deterministic “type checking” process.
Of course, there are viable alternative paths — for instance, one could have key agent properties described using predicate logic, and then have formal verification and inference about agent combinations done by a logic theorem prover. At an abstract level, this could be made equivalent to a dependent type approach such as we are currently pursuing. The choice of a dependent type-based approach is basically a matter of taste regarding what precise formalism to use, and what set of implementation tools to favour.
The most “basic” AI agent properties we need to formalize using an AI-DSL are:
- Input and output structures: what sort of data does the agent consume and what does it produce? These can be elementary data types (described according to a reference Data Ontology) or functions or other constructs defined atop elementary data types.
- Financial cost of running the AI agent in a given instance (i.e. on a certain set of input data, with certain parameters, in a certain context)
- Time required to run the AI agent in a given instance
- Computational resource cost of running the AI agent in a given instance (which may be rich in its description, e.g. how many CPUs of which type compared with how many GPUs of which type).
- Quality of results obtainable by running the AI agent in a given instance
There are dependencies among these factors of course: e.g. the cost and time required to run a certain algorithm on a multi-GPU server may be different from the cost and time required to run them on one or multiple CPUs. Formally expressing these properties in a way that accounts for these dependencies is one of the aspects of AI-agent description that requires a relatively rich underlying language framework.
Uncertainty is also a key aspect here. Sometimes properties regarding cost, time and quality will be deterministic, but sometimes they will come in the form of probability distributions (which may be represented in various ways, e.g. confidence intervals or parametrized first order distributions, or imprecise probabilities, etc.). The AI-DSL will need to give a few simple ways to represent uncertainty, which brings us into the domain of probabilistic dependent types.
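To make one of these options concrete, here is a minimal Python sketch of the simplest representation, a closed interval. The `Interval` class, its methods and the figures used are assumptions made for illustration; the actual AI-DSL may represent uncertainty quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed interval [lo, hi] bounding an uncertain cost, time or quality."""
    lo: float
    hi: float

    def __post_init__(self) -> None:
        if self.lo > self.hi:
            raise ValueError("lower bound must not exceed upper bound")

    @staticmethod
    def exact(value: float) -> "Interval":
        """A deterministic property is the degenerate interval [v, v]."""
        return Interval(value, value)

    def __add__(self, other: "Interval") -> "Interval":
        """Bounds on the sum of two uncertain quantities."""
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def within(self, budget: float) -> bool:
        """True only if even the worst case stays inside the budget."""
        return self.hi <= budget


# Hypothetical figures: a fixed fee plus an uncertain per-run compute charge.
fee = Interval.exact(2.0)
compute = Interval(1.0, 4.0)
total = fee + compute
print(total)              # Interval(lo=3.0, hi=6.0)
print(total.within(5.0))  # False: the worst case (6.0) exceeds the budget
```

Treating a deterministic value as a degenerate interval lets the same arithmetic serve both cases, which is the kind of uniformity a type-level treatment would also want.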
Next, there are more “advanced” properties which are also highly critical but require a bit more subtlety in terms of choice of formal representation particulars:
- Evidence consumed (e.g. which logical premises were used by an inference agent; which observations contributing to uncertain truth values attached to logical premises were used; which data items in a dataset were used for training an ML model)
- Fairness, i.e. freedom from bias (e.g. fairness as judged according to a certain set of data features); there are existing approaches to formal verification of fairness, and one would like them to be applicable compositionally so that, under appropriate constraints, compositions of agents that are fair with respect to a certain set of features will still be fair
- Internal process structure: e.g. decomposition of a neural model-based agent into layers, or decomposition of an OpenCog agent among multiple atom spaces. (TensorSafe, for example, is a dependent type-based formalism that deals with formal verification of neural model architecture.)
- Concurrency properties: e.g. decomposition of the AI agent’s internal process into subprocesses that can be run concurrently vs. those that need to be run sequentially. This can be formalized using behavioural types that reflect process-calculus-style semantics.
- Privacy Preservation — For agents that can operate via homomorphic encryption, functional encryption or multiparty computing, the agent’s type must pass along the information needed to manage the sharing of encrypted results among different agents with different levels of access to different data aspects.
- Preconditions and Postconditions — Predicates describing the environment of the AI agent, which are required to be true for the agent to be activated (pre-) or which are guaranteed (perhaps with some probability) to be true after the agent has completed a round of activity (post-). Among other things, in an autonomous AI agent context, these can be used to specify information regarding the goals the agent is trying to achieve and the expected results of the agent’s execution as regards these goals.
We assume that almost any AI agent operating in an AI-DSL enabled subnetwork of SingularityNET will come along with a description specifying its basic properties; whereas having a description specifying the advanced properties will be more optional. The more of the advanced properties are specified, the more easily and richly the AI agent can be composed with other agents to form greater intelligent assemblies, and the greater the extent to which the AI agent can be formally verified in its various properties, and optimized in its execution (e.g. via supplying it with distributed processing infrastructure in accordance with its concurrency properties).
For those who want to dig a little deeper, in this section we indicate the direction we’re moving in regarding the formalization of the “basic” agent properties mentioned above.
An AI agent task is represented by a function, for instance a function turning an audio file into a score with lyrics. Attributes describing aspects such as financial and temporal costs depend on the real-world setting in which such a function is run, not just on the function itself, so they are represented by a wrapper of that function decorated with these attributes. That is, the purely functional type is lifted to a “Realized Function” type. Then various compositional operators are lifted to operate in the domain of realized functions.
For instance, given a function
f : A -> B
and a set of attributes describing its realization
a : RealizedAttributes financial_cost time computational_cost quality
where such attributes are either constants or, if the attribute values depend on the input (e.g. on its size), functions from the input to the attribute type, `A -> AttrType`. Later on, we’ll also want to consider replacing constants by intervals, probability distributions, etc., to make it possible to account for their uncertainties.
Then the realized function would be represented by
RealizedFunction f a
This realized function is what is exposed to the network so that users (humans or machines) can select the best agent for the job. To do so, though, agents need to be able to infer their own realization, given the realizations of the other functions they rely on and the way those functions are combined.
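The following Python sketch is a loose analogue of this idea, not the Idris prototype itself. It reuses the names `RealizedAttributes` and `RealizedFunction` from above; the helper functions `resolve` and `cheapest`, the two toy transcription agents and all numeric values are hypothetical and exist only to show how exposed attributes could drive agent selection.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Union

# An attribute is either a constant or a function of the input (e.g. of its size).
Attr = Union[float, Callable[[Any], float]]


def resolve(attr: Attr, x: Any) -> float:
    """Evaluate an attribute for a concrete input x."""
    return attr(x) if callable(attr) else float(attr)


@dataclass(frozen=True)
class RealizedAttributes:
    financial_cost: Attr
    time: Attr
    computational_cost: Attr
    quality: Attr


@dataclass(frozen=True)
class RealizedFunction:
    """A function f : A -> B decorated with its realized attributes."""
    f: Callable[[Any], Any]
    a: RealizedAttributes

    def __call__(self, x: Any) -> Any:
        return self.f(x)


def transcribe_fast(audio: list) -> str:      # hypothetical agent
    return f"rough score for {len(audio)} samples"


def transcribe_careful(audio: list) -> str:   # hypothetical agent
    return f"detailed score for {len(audio)} samples"


fast = RealizedFunction(
    transcribe_fast,
    RealizedAttributes(financial_cost=1.0, time=lambda x: 0.001 * len(x),
                       computational_cost=1.0, quality=0.7),
)
careful = RealizedFunction(
    transcribe_careful,
    RealizedAttributes(financial_cost=lambda x: 0.002 * len(x),
                       time=lambda x: 0.01 * len(x),
                       computational_cost=4.0, quality=0.9),
)


def cheapest(candidates: List[RealizedFunction], x: Any,
             min_quality: float) -> Optional[RealizedFunction]:
    """Pick the cheapest candidate whose quality meets the threshold."""
    ok = [c for c in candidates if resolve(c.a.quality, x) >= min_quality]
    if not ok:
        return None
    return min(ok, key=lambda c: resolve(c.a.financial_cost, x))


audio = [0.0] * 1000
best = cheapest([fast, careful], audio, min_quality=0.8)
print(best(audio))  # detailed score for 1000 samples
```

Unlike a dependently typed version, nothing here is checked before run time; the sketch only shows the shape of the data that would be exposed.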
So, for instance, the sequential combination of two realized functions makes financial and temporal cost both roughly additive, whereas the parallel combination makes the temporal cost non-additive: likely the min or the max, depending on the type of parallel combination. Overall, the combination laws for financial and temporal cost should be reasonably easy to define and implement.
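A minimal Python sketch of these two laws follows, under simplifying assumptions: attributes are plain constants, costs are exactly additive, and in the parallel case both agents are paid and the caller waits for both results (hence max). The names `Costs`, `Realized`, `sequential` and `parallel_all` are hypothetical, and the code only does the cost bookkeeping; it does not actually run anything concurrently.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Costs:
    financial_cost: float
    time: float


@dataclass(frozen=True)
class Realized:
    f: Callable[[Any], Any]
    costs: Costs


def sequential(g: Realized, h: Realized) -> Realized:
    """Run g, then feed its output to h: money and time both add up."""
    return Realized(
        lambda x: h.f(g.f(x)),
        Costs(g.costs.financial_cost + h.costs.financial_cost,
              g.costs.time + h.costs.time),
    )


def parallel_all(g: Realized, h: Realized) -> Realized:
    """Run g and h on the same input and wait for both results:
    money adds up, elapsed time is the slower of the two."""
    return Realized(
        lambda x: (g.f(x), h.f(x)),
        Costs(g.costs.financial_cost + h.costs.financial_cost,
              max(g.costs.time, h.costs.time)),
    )


# Two toy agents with made-up costs.
normalize = Realized(str.lower, Costs(financial_cost=2.0, time=3.0))
measure = Realized(len, Costs(financial_cost=1.0, time=5.0))

chain = sequential(normalize, measure)
both = parallel_all(normalize, measure)
print(chain.costs)    # Costs(financial_cost=3.0, time=8.0)
print(both.costs)     # Costs(financial_cost=3.0, time=5.0)
print(both.f("ABC"))  # ('abc', 3)
```

A "first result wins" style of parallel combination would use `min` for the time instead, which is the other case mentioned above.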
The compositional laws to infer attributes such as Computational resource cost shouldn’t be much harder than financial and temporal costs, just more fine-grained.
It seems, however, that the set of compositional laws for result quality is highly non-trivial to formalize, because it ultimately depends on:
- The semantics of the result.
- The way such results are combined. For instance, if the results are sequentially chained (i.e. the output of an agent becomes the input of another one), then the error will likely compound and the quality will degrade. On the contrary, if the result is a probability of some given model, obtained by averaging probabilities of this model calculated by different agents based on disjoint pieces of evidence, then the quality of the end result will be enhanced. This brings us to the need to keep track of Evidence consumed by an AI agent — while thorough incorporation of this factor will be deferred to a later portion of AI-DSL development it may become necessary to incorporate it in relatively simple ways even in this initial portion.
- The measure of quality. Such a measure may, in the simplest case, be reducible to a probability of fulfilling an objective and, in more complex cases, be a probability distribution over a domain representing the quality, which could be discrete or continuous, uni- or multi-valued, structured or not, etc. And obviously, the right choice of measure of quality is ultimately tied to the semantics of the result. Even in the simplest case, where the quality measure is the probability of fulfilling an objective, such an objective needs to be formalized in order to be able to infer the quality measure resulting from the composition.
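The contrast between chaining and averaging can be illustrated with a toy Python calculation. It rests on strong assumptions that are ours and not part of the AI-DSL design: stage failures in a chain are independent, and the averaged estimates are independent (disjoint evidence) with known standard errors. The function names are hypothetical.

```python
import math
from typing import Sequence


def chained_success(probabilities: Sequence[float]) -> float:
    """Probability that every stage of a sequential chain succeeds,
    assuming stage failures are independent."""
    return math.prod(probabilities)


def averaged_std_error(std_errors: Sequence[float]) -> float:
    """Standard error of the plain average of independent estimates:
    sqrt(sum of variances) / n."""
    n = len(std_errors)
    return math.sqrt(sum(s * s for s in std_errors)) / n


# Chaining two agents that are each right 90% of the time degrades quality.
print(round(chained_success([0.9, 0.9]), 2))     # 0.81

# Averaging four independent estimates, each with standard error 0.1,
# halves the standard error of the combined estimate.
print(round(averaged_std_error([0.1] * 4), 2))   # 0.05
```

If the agents shared evidence, the independence assumption would fail and the averaged estimate would be less reliable than this calculation suggests, which is one reason tracking the evidence consumed matters.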
Finally, the input and output structures of an AI agent in the context of a given task should normally be entirely captured by the input and output types of the function
A -> B
and thus should not require a field in the ‘RealizedAttributes’ data structure. That isn’t to say it is trivial, though: it depends on how much information the AI agent developer decides to put into it. For instance, taking again the example of turning an audio file into a score, the input data structure could be a vector of doubles, or a wav file, or a wav file tagged as a piece of music of a certain genre. And the more information there is, the better the selection of the AI agent may be, because some algorithms work better on specialized data. This brings us back to the usage of ontologies about the real world.
While we have found strong reasons to implement our AI-DSL using a dependent type formalism, we are also aware that dependent type based languages are not at the moment maximally intuitive for the average AI developer, who is more accustomed to imperative languages. It is probably not realistic to expect AI agent authors to write formal descriptions of their AI agents in a dependent type formalism.
With this in mind, we envision two layers to the AI-DSL:
- A middle-layer that exploits Idris syntax and makes explicit, rich use of dependent type formalism
- A syntactically-sugared end-user layer that is simpler and less abstract, aimed at making life easy for the AI developer, but with some restrictions on the expressive capability
Our current work focuses on the middle layer. But of course, it is the end-user layer that will be directly interacted with by most users.
Consider the case of a document summarization agent, which would like to outsource the summarization of any videos that might be embedded into incoming documents.
A syntax-sugared version of the AI-DSL should allow the developer of the document summarizer to easily build a query such as “Find me an agent that takes video inputs, performs semantic summary services, has an average rating above 4.5 stars, and will cost me at most AGI 10 per embedded video, or offers monthly subscriptions for up to 10,000 videos/month”.
We would then be able to:
- Run an Agent that processes queries like those by calling out to the AI-DSL interpreter;
- Create client-side code (initially in Python, and in the future in other languages) that facilitates interactions with the above Agent. This library would:
  - Allow easy discovery of the Agent endpoints,
  - Automate the process of sending queries to the finder Agent and downloading results,
  - Integrate with existing SNET tools for obtaining the gRPC API for the selected video summarization agent, and creating an agreement on the platform with the agent.
In the future, the client-side library might provide Pythonic syntax for querying, much like ORMs do for interacting with relational databases without having to write raw SQL, but for these first iterations just exposing that Agent would be a major milestone.
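To give a feel for what the finder Agent would compute, here is a small Python sketch of the video summarization query described earlier, run against an in-memory list. Everything in it is hypothetical: the `AgentListing` record, its field names, the `find_agents` function and the sample listings are invented for illustration and do not correspond to existing SNET tooling. We also read the query's pricing clause as "either the per-video price is within budget, or a subscription covering the required monthly volume is offered".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AgentListing:
    name: str
    input_type: str
    service: str
    average_rating: float
    price_per_call_agi: Optional[float] = None      # None: no per-call pricing
    subscription_monthly_cap: Optional[int] = None  # None: no subscription


def find_agents(listings: List[AgentListing], input_type: str, service: str,
                min_rating: float, max_price_agi: float,
                min_monthly_cap: int) -> List[AgentListing]:
    """Return the listings that satisfy every clause of the query."""
    def pricing_ok(item: AgentListing) -> bool:
        per_call = (item.price_per_call_agi is not None
                    and item.price_per_call_agi <= max_price_agi)
        subscription = (item.subscription_monthly_cap is not None
                        and item.subscription_monthly_cap >= min_monthly_cap)
        return per_call or subscription

    return [item for item in listings
            if item.input_type == input_type
            and item.service == service
            and item.average_rating > min_rating
            and pricing_ok(item)]


listings = [
    AgentListing("vidsum-a", "video", "semantic summary", 4.7, price_per_call_agi=8.0),
    AgentListing("vidsum-b", "video", "semantic summary", 4.8,
                 price_per_call_agi=25.0, subscription_monthly_cap=10_000),
    AgentListing("vidsum-c", "video", "semantic summary", 4.2, price_per_call_agi=5.0),
    AgentListing("textsum-d", "text", "semantic summary", 4.9, price_per_call_agi=1.0),
]

matches = find_agents(listings, "video", "semantic summary",
                      min_rating=4.5, max_price_agi=10.0, min_monthly_cap=10_000)
print([m.name for m in matches])  # ['vidsum-a', 'vidsum-b']
```

In the envisioned system the matching would be done by the AI-DSL interpreter over typed agent descriptions rather than by string comparison, so a query for "video" input could also match agents declared over a more general data type.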
Of course, there are limits to how much the AI-DSL can validate about an Agent here. In the simplest case where the AI agent does not send any request to other AI agents on the network, the developer of such an AI agent would simply define the realized type of the corresponding function, without its implementation, and thus the AI-DSL would need to trust that the actual implementation validates that type.
In the more complex cases where such an AI agent sends requests to other AI agents, the developer might want to re-implement some higher-level aspects of their algorithm in the AI-DSL to represent the composition of the involved functions. Idris allows one to type check partially implemented code, so this should be possible. However, once again, the AI-DSL will only be able to type check that partial re-implementation, not the actual code, and must thus trust that the former reflects the latter. We will have a situation where the more work the developer chooses to put into fleshing out their agent’s AI-DSL description, the more benefit their agent will obtain from the AI-DSL.