There exist a small number of approaches to the authorisation problem, that of deciding, once a user has been authenticated, what that user is permitted to do. This problem is very naturally viewed as an (ontological) subsumption problem: `is this user provably a member of the class of entities allowed access to this resource?'. This approach provides a flexible solution, in which delegation and federation are natural, and which fits into a broad range of architectures. I also describe a service, Quaestor, which implements the required reasoning.
Once a user has been authenticated -- that is, once a resource has decided that a user really is who they claim to be -- there exists a separate problem of deciding what that user is and is not allowed to do with or to the resource. There are various approaches which address this (including Shibboleth and PERMIS), but the problem can very naturally be expressed in ontological terms, as a straightforward subsumption problem: `is this user provably a member of the class of entities allowed access to this resource?'.
The account below describes:
The simple, but non-trivial, use-case which this demo addresses is the following. A database is to be accessible to researchers at institutions in the UK and researchers who are members of a particular collaboration. Certain tagged rows are accessible to researchers at African institutions.
There are a few other use-cases on the VOTech wiki.
On the right is the asserted hierarchy for an
access-control ontology, as displayed by the ontology editor Protégé.
Although it is not obvious from this screenshot, Protégé writes its
ontologies out as OWL ontologies [std:owl], which are
layered on top of RDF [std:rdf].
The various locations are represented as classes, gathered
together under other classes representing continents. The
GroupOfPeople class represents either collaborations or
institutional groups, and the Person class has subclasses
based on location and on access rights. The goal is to end up
assigning individuals into the CanSeeAllData class, the
CanSeeTaggedData class, or neither.
To this class hierarchy, we add further conditions. We
declare a locatedIn property, which has
Person as its domain, and
GeographicalLocation as its range, or co-domain. We then
declare as a necessary condition of membership of the
UniversityOfLeicesterPerson class that an entity has a
locatedIn property whose range is specifically
UnitedKingdom, with similar necessary conditions on the
other institutional groups. We can then add as a necessary and
sufficient condition on the PersonAtUKInstitution
class that they have a locatedIn property whose range is
UnitedKingdom. If we subsequently assert that
#norman is a UniversityOfLeicesterPerson,
then we can deduce that he must have the given
locatedIn property, and this is sufficient to then
deduce that he is a member of the PersonAtUKInstitution
class. If we then assert that a necessary and sufficient condition
for membership of the CanSeeAllData class is that an
individual is a member of the union of the
PersonAtUKInstitution and
CollaborationXMember classes, and that a member of the
CanSeeTaggedData class is in the union of the
CanSeeAllData and PersonAtAfricanInstitution
classes, then we are finished.
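The chain of deductions described above can be traced by hand with a small sketch. The Python below is not an OWL reasoner, and it is not how Protégé or any real reasoner works: it simply hard-codes the conditions of this section as set operations, to make the individual inference steps concrete. The function name `classify`, the `LOCATED_IN` table, and the continent-level class name `Africa` are assumptions made for illustration; the other class names are those used in the text.

```python
# Illustrative only: a hand-coded rendering of the conditions described above.
# The data layout, the function name and the class name "Africa" are hypothetical.

# Necessary conditions: membership of an institutional class implies a
# locatedIn value (simplified here to one location per institution).
LOCATED_IN = {
    "UniversityOfLeicesterPerson": "UnitedKingdom",
    "CambridgeUniversityPerson": "UnitedKingdom",
    "UniversityOfCapeTownPerson": "Africa",
    "UniversityOfCairoPerson": "Africa",
}


def classify(asserted_classes):
    """Return the asserted classes plus those deducible from the conditions."""
    inferred = set(asserted_classes)
    locations = {LOCATED_IN[c] for c in inferred if c in LOCATED_IN}

    # Necessary and sufficient: locatedIn UnitedKingdom <=> PersonAtUKInstitution.
    if "UnitedKingdom" in locations:
        inferred.add("PersonAtUKInstitution")
    if "Africa" in locations:
        inferred.add("PersonAtAfricanInstitution")

    # CanSeeAllData is the union of PersonAtUKInstitution and CollaborationXMember.
    if inferred & {"PersonAtUKInstitution", "CollaborationXMember"}:
        inferred.add("CanSeeAllData")

    # CanSeeTaggedData is the union of CanSeeAllData and PersonAtAfricanInstitution.
    if inferred & {"CanSeeAllData", "PersonAtAfricanInstitution"}:
        inferred.add("CanSeeTaggedData")
    return inferred


norman = classify({"UniversityOfLeicesterPerson"})
tutankhamun = classify({"UniversityOfCairoPerson"})
```

In this sketch `norman` ends up in both access classes, while `tutankhamun` ends up only in CanSeeTaggedData. A real reasoner derives the same memberships from the OWL axioms themselves rather than from hand-written rules.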
These various conditions are all compactly asserted as extra
statements about the classes shown in the hierarchy here. Indeed, the
displayed tree is just the visualisation of the assertion that, for
example, a UniversityOfLeicesterPerson is necessarily a
member of the InstitutionalGroup class.
We (or rather
Protégé) can give this collection of assertions (which corresponds to
a single RDF graph) to a
reasoner, and ask it to deduce the inferred subclass
hierarchy which these extra conditions impose on the asserted
hierarchy we have displayed above. That results in the hierarchy shown
here. Observe that the Person class has been
restructured, with various InstitutionalGroup subclasses
appearing under Person, and several of them appearing
also under PersonWithAccessRights. You can see that
someone who is a member of the
UniversityOfLeicesterPerson class is also a member of the
CanSeeAllData and CanSeeFlaggedData
classes. We can see that, with a hierarchy of classes plus a
few extra conditions, the reasoner has done most of our
authorisation work for us.
Although we have presented this as a single ontology, this is only for the purposes of this demo, and in practice this would most reasonably be split amongst several ontologies, maintained by different actors.
The GeographicalLocation and
InstitutionalGroup classes are rather generic, and could
be managed centrally. Note that defining the class
UniversityOfLeicesterPerson is distinct from asserting
that a given individual is a member of it; the former can be done
centrally and mechanically, while the latter should be done only by
the appropriate authority at Leicester (there are complications here,
see below).
Classes such as PersonAtUKInstitution are `utility'
classes, in the sense that they provide useful bits of logic which
build on the simple generic classes above.
Classes such as CanSeeAllData are meaningful only to
a specific resource owner, and are the place where that resource owner
actually articulates her access policy. They can be defined only by
(or on behalf of) that resource owner.
How, then, do we exploit this as a component of an authorisation architecture?
Once the ontology is created, we can add assertions about individuals. For example, here are some assertions written in Notation 3 [std:n3]:
@prefix : <urn:example#> .
@prefix ac: <http://eurovotech.org/access-control.owl#> .
:Norman a ac:UniversityOfLeicesterPerson, ac:CollaborationXMember.
:Guy a ac:CambridgeUniversityPerson.
:Markus a ac:EuropeanSouthernObservatoryPerson.
:Sébastien a ac:CentreDeDonnéesDeStrasbourgPerson.
:Jonathan a ac:HarvardUniversityPerson;
    a ac:CollaborationXMember.
:Nelson a ac:UniversityOfCapeTownPerson;
    a ac:CollaborationXMember.
:Tutankhamun a ac:UniversityOfCairoPerson.
We can add further assertions such as:
<urn:example#Norman> = <mailto:norman@astro.gla.ac.uk>.
This indicates that these two URIs are to be deemed to be equivalent, in the sense that any assertion made about one can be taken to be made about the other also.
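The effect of such an equivalence can be illustrated with a short sketch: group the URIs into equivalence classes, then let every URI in a group share the class assertions made about any member of the group. This is only an approximation of what a reasoner does with such statements, and the function name `merge_same_as` and the dictionary layout are hypothetical.

```python
# Sketch: treat "=" as an equivalence relation over URIs and share class
# assertions across each equivalence class. All names here are hypothetical.

def merge_same_as(types, same_as):
    """types: {uri: set of class names}; same_as: iterable of (uri, uri) pairs."""
    parent = {}

    def find(uri):
        # Union-find lookup with path halving.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]
            uri = parent[uri]
        return uri

    for a, b in same_as:
        parent[find(a)] = find(b)

    # Gather every assertion under the representative of its equivalence class.
    merged = {}
    for uri, classes in types.items():
        merged.setdefault(find(uri), set()).update(classes)

    # Every known URI sees the assertions made about any of its equivalents.
    return {uri: set(merged.get(find(uri), set())) for uri in list(parent)}


types = {
    "urn:example#Norman": {"UniversityOfLeicesterPerson", "CollaborationXMember"},
}
same_as = [("urn:example#Norman", "mailto:norman@astro.gla.ac.uk")]
merged = merge_same_as(types, same_as)
```

After the merge, the `mailto:` URI carries the same two class memberships as `urn:example#Norman`, which is what allows a query phrased in terms of the email address to succeed.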
As with the ontology above, these various assertions would be made
in practice by different actors. Assertions that
<urn:example#Norman> a
ac:UniversityOfLeicesterPerson would be made by (a proxy of)
the Leicester personnel department, and an equivalence relation
similar to the one above might be made by the resource owner to link
the URI that the Leicester authorities use to a different local name
for the same individual, such as a local username or, as in this case,
an email address.
So we have an ontology plus some individuals. How do we get this information out? How do we go about actually plumbing this in to the architecture of our resource-owner's system?
Enter SPARQL [std:sparql].
SPARQL is a vaguely SQL-like language for querying RDF triple-stores. A query against the access-control ontology might be:
prefix : <http://eurovotech.org/access-control.owl#>
select ?person
where { ?person a :CanSeeFlaggedData }
This would return a list of all the individuals in the triple-store
which were members of the CanSeeFlaggedData class.
Alternatively,
ask { <mailto:norman@astro.gla.ac.uk>
a <http://eurovotech.org/access-control.owl#CanSeeAllData> }
would return a yes or no answer indicating whether
norman@astro.gla.ac.uk is in the class
of individuals who can see all the data (it should be `yes').
There are other types of query which return RDF graphs, and various ways of filtering and enhancing the results. As of April 2006, SPARQL is not yet standardised, but it is an advanced W3C Working Draft, with multiple working implementations.
I have created a generic SPARQL endpoint, called Quaestor, which can be given multiple ontologies and instance assertions, and run SPARQL queries against the merged result. It has both RESTful [fielding00] and XML-RPC [std:xmlrpc] interfaces, and runs within Tomcat. Once the ontologies have been uploaded to it, via HTTP PUT requests, a client can make SPARQL queries of the merged result using either HTTP POST or GET queries.
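The upload-then-query interaction might look roughly like the following from a Python client. This is a sketch under stated assumptions, not a description of Quaestor's actual interface: the base URL, the path layout, the `query` parameter name and the `text/rdf+n3` content type are all hypothetical, and the code only constructs the requests without sending them.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical location of a knowledge base within the service.
BASE = "http://localhost:8080/quaestor/kb/access-control"


def upload_request(submodel, n3_text):
    """Build (but do not send) an HTTP PUT which uploads one set of assertions."""
    return Request(
        f"{BASE}/{submodel}",
        data=n3_text.encode("utf-8"),
        headers={"Content-Type": "text/rdf+n3"},  # assumed media type
        method="PUT",
    )


def query_request(sparql):
    """Build (but do not send) an HTTP GET carrying a SPARQL query."""
    return Request(f"{BASE}?{urlencode({'query': sparql})}", method="GET")


upload = upload_request(
    "individuals",
    "@prefix ac: <http://eurovotech.org/access-control.owl#> .\n"
    "<urn:example#Norman> a ac:UniversityOfLeicesterPerson .\n",
)
query = query_request(
    "ask { <mailto:norman@astro.gla.ac.uk> "
    "a <http://eurovotech.org/access-control.owl#CanSeeAllData> }"
)
```

Passing either object to `urllib.request.urlopen` would perform the request. The walkthrough mentioned below is the authoritative source for the real URLs and parameters.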
This service is generic in the sense that it is not tied to any particular ontology -- in particular, it is not tied to just this access-control problem. It is designed to provide OWL-based reasoning services as part of a larger infrastructure, and so its interface has been designed with generality and extensibility in mind.
There is a walkthrough of the interaction with Quaestor, and you can download the demo files and the service .war file from here.
This approach is heavily standards-based, and builds on pre-existing standards rather than new ones.
OWL, as used here, is essentially a logic programming language, and so the architecture described here is essentially one which relies on mobile code, though it is safe because the language is sufficiently restricted. This flexibility also means that resource owners can be as sophisticated as they wish in defining their security policies, and are not restricted to a pre-existing authorisation language.
Because the relevant assertions are given to the reasoner in the form of OWL/RDF, which is a very low-level format, it is possible to extract assertions from a wide variety of other sources, such as SAML assertions, X.509 certificates, and PERMIS policies (I expect -- I haven't yet tried this, and so don't know just how much preprocessing would be required).
For the same reason, federation of authorisation logic is (again,
should be) relatively simple, and flexible. If, for example,
institution A has a class a:CanSeeEJournals,
and wishes to give e-journal access to members of another institution,
B, without re-registering all the relevant members of that
institution, then it can do so in multiple ways. If the other
institution (or `identity provider', IdP) maintains a class
b:LibraryUser and gives
access to its e-journals to individuals it asserts to be members of that
class, then institution A could simply declare class
b:LibraryUser to be a subclass of
a:CanSeeEJournals, at which point any individuals
asserted to be members of b:LibraryUser can be
immediately deduced to be members of a:CanSeeEJournals
also. Alternatively, it might be more suitable for institution B to
assert individuals' membership of a:CanSeeEJournals
directly. In either case the set of assertions would be transmitted
to institution A in a discrete packet of RDF assertions (a
single RDF graph), and
in each case the trust is isolated into A's decision whether or not
to trust that particular set of B's assertions (this is
expanded on in the discussion of security below).
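The first of these options, in which institution A adds a single subclass declaration, can be sketched as a transitive closure over subclass axioms. As before this is an illustration, not a reasoner: the function name `inferred_classes`, the dictionary layout and the individual `urn:example:b#alice` are hypothetical.

```python
# Sketch: class membership propagated along subclass axioms.
# The names and data layout are hypothetical.

def inferred_classes(asserted, subclass_of):
    """Return asserted classes plus all their (transitive) superclasses."""
    result = set(asserted)
    frontier = list(asserted)
    while frontier:
        cls = frontier.pop()
        for parent in subclass_of.get(cls, ()):
            if parent not in result:
                result.add(parent)
                frontier.append(parent)
    return result


# The single axiom which institution A adds to its own ontology.
a_axioms = {"b:LibraryUser": {"a:CanSeeEJournals"}}

# The graph of assertions received from institution B.
b_assertions = {"urn:example:b#alice": {"b:LibraryUser"}}

alice = inferred_classes(b_assertions["urn:example:b#alice"], a_axioms)
```

Here `alice` contains a:CanSeeEJournals without B having said anything about A's classes. The trust decision stays with A, which chooses whether to load `b_assertions` at all.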
An infrastructure based on these standards allows delegation in other ways.
We have described an architecture in which the reasoning is done
locally to the resource, using RDF graphs which may originate
from multiple sources. Alternatively a decision could be delegated, in
whole or in part, to a remote IdP. Continuing the example above,
the resource A could wholly or partially decide to allow
an entity access to its e-journals by simply asking B whether
they would allow that entity access to their e-journals; that
is, by sending a SPARQL query to ask B whether that entity is
in b:LibraryUser.
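A delegated check of this kind amounts to constructing an `ask` query and sending it to B's endpoint. The sketch below covers only the construction step; the function name `ask_membership` and the namespace `http://example.org/b#` are hypothetical. Because the entity's identifier may come from an untrusted request, the sketch refuses any string which could break out of the `<...>` delimiters.

```python
import re

# Characters which may not appear inside an IRI reference in SPARQL.
_IRI_FORBIDDEN = re.compile(r'[\x00-\x20<>"{}|^`\\]')


def ask_membership(entity_iri, class_iri):
    """Build a SPARQL ASK query testing whether entity_iri is in class_iri."""
    for iri in (entity_iri, class_iri):
        if _IRI_FORBIDDEN.search(iri):
            raise ValueError(f"not a safe IRI: {iri!r}")
    return f"ask {{ <{entity_iri}> a <{class_iri}> }}"


query = ask_membership(
    "mailto:norman@astro.gla.ac.uk",
    "http://example.org/b#LibraryUser",  # hypothetical namespace for b:
)
```

The resulting string has the same shape as the `ask` query shown earlier, and B's yes or no answer can then be used directly, or combined with A's own local reasoning.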
The problem is not of course completely solved. The following problems need to be addressed.
In the simplest scenario, the reasoning service described here would sit well away from the open internet, and the graphs which it handles would be either generated locally (in the case of the resource owner's own rules), obtained from known-good sources (in the case of utility ontologies), or obtained from otherwise secure sources, such as a graph extracted from a signed X.509 certificate.
In contrast to this, the delegation example above required an RDF
graph to be sent from one institution to another. This could either
be done through a separately secured channel, or by signing the graph
using one of the relevant emerging standards (see, for example,
http://xmlns.com/wot/0.1/).
Since the parsed RDF graphs are programmatically manipulable, it would be possible for a resource owner to constrain or filter the set of assertions which a remote entity makes, to ensure that the graph is not only from a known source, but also that it does not assert anything it shouldn't.
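One simple form of such a filter is a whitelist: accept from a remote source only class-membership statements, and only for classes which that source is entitled to make assertions about. The sketch below represents a parsed graph as a list of (subject, predicate, object) tuples; that representation, the function name `filter_graph`, and the example URIs are assumptions for illustration.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def filter_graph(triples, allowed_classes):
    """Split triples into those the remote source may assert and the rest."""
    accepted, rejected = [], []
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE and obj in allowed_classes:
            accepted.append((subject, predicate, obj))
        else:
            rejected.append((subject, predicate, obj))
    return accepted, rejected


# A graph received from institution B (hypothetical URIs throughout).
received = [
    ("urn:example:b#alice", RDF_TYPE, "http://example.org/b#LibraryUser"),
    # B should not be able to grant membership of A's classes directly...
    ("urn:example:b#mallory", RDF_TYPE, "http://example.org/a#CanSeeEJournals"),
    # ...nor to alter A's class hierarchy.
    ("http://example.org/b#Visitor",
     "http://www.w3.org/2000/01/rdf-schema#subClassOf",
     "http://example.org/a#CanSeeEJournals"),
]
accepted, rejected = filter_graph(received, {"http://example.org/b#LibraryUser"})
```

Only the first triple survives the filter. The rejected triples can be logged or used to refuse the whole graph, depending on how strict the resource owner wishes to be.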
Privacy: This architecture suffers from some of the same information-leakage problems that SAML assertions do. It is not immediately obvious how an IdP should restrict the set of RDF assertions it makes available to those which are relevant to the properties a remote resource needs or wishes. A possible solution to this is to allow the resource to make more indirect SPARQL queries of the IdP, such as `would you allow this person in to your library?', since these do not expose the underlying assertions. However a malevolent user of such an interface could still build up a substantial amount of information through such a channel, through multiple crafted queries. Combining queries of this type with the Shibboleth handle system [proj:shibboleth] would provide most of the required security.
The Shibboleth system [proj:shibboleth] has defined an intricate infrastructure for access control. When a user requests access to a resource, the resource owner may securely query an appropriate IdP, as guided by the user, to discover the set of attributes, transported in a SAML assertion, which the IdP will warrant applies to the user. The resource owner will then allow or deny access based on those attributes. The Shibboleth system concerns itself with the mechanism for negotiating and transporting the attribute sets, and does not cover any support for the resource owner's reasoning.
The PERMIS system [proj:permis] focuses on the resource owner's specification of their access policy, and provides algorithmic support for the reasoning involved. The PERMIS system does not provide easy support for dynamic or delegated authorisation frameworks, though it is possible to add such support indirectly.
Since RDF functions at a rather low level, it will be possible to transform PERMIS policies and SAML assertions into equivalent OWL/RDF graphs, so that an OWL-based reasoning infrastructure would be possible as a plug-in replacement for the reasoning in these other authorisation frameworks. This would have the advantage that the resource owner is limited only by their ingenuity in the type and structure of the access controls they wish to impose.
[This section needs to be expanded; add refs to NESC federation experiments. Add pointers to demos/downloads]
$Log: access-control.xml,v $
Revision 1.3 2008/08/18 22:33:30 norman
Substantial reworking, to fit with newer stylesheets. I'm coming back to this work with the AGAST project, so this has become live again.
Revision 1.2 2006/04/13 17:26:11 norman
The ontology prefix has changed from access-control2.owl to access-control.owl. Point to Quaestor demo/walkthrough.
Revision 1.1 2006/04/07 11:05:31 norman
Initial version