|
Privacy-aware Identity Lifecycle Management
This R&D project is about the management of the
lifecycle of identity information in enterprises driven by “privacy
obligation” policies (i.e. policies dictating expectations and
duties on how this data should be handled, based on privacy
preferences and guidelines). How can enterprises ensure
that personal data is managed according to users'
preferences and legislation, and deal with data retention and deletion,
notifications and complex data workflows (involving human and system
interactions)? How can current enterprise identity
management solutions be leveraged to achieve this?
Access control solutions
cannot deal with all aspects of privacy policy enforcement. In
particular, access control solutions are not designed to handle
constraints dictated by obligations, such as on data deletion, data
retention, data transformation, notifications, etc. Privacy obligations
introduce the need to deal with privacy-aware information lifecycle
management, i.e., ensuring that the creation, storage, modification
and deletion of data are driven by privacy criteria.
This work focuses on the
explicit modeling and representation of obligation policies (so that
they can be reasoned about) and on their scheduling, enforcement and
monitoring (for compliance reasons), by means of an obligation
management system. Requirements such as
the scalability of obligation policy management over large
sets of data have been taken into account.
A prototype of a Scalable
Obligation Management System (SOMS) has been implemented and
integrated, as a proof of concept, with HP OpenView Select Identity,
in an enterprise provisioning and user account context. We are
currently exploring with HP business groups and customers how to
move towards the productisation of this technology. More details
follow.

Personal data, digital
identities and users’ profiles are collected by enterprises and
other organizations to enable their business processes and provide
required services. Privacy laws and legislation dictate policies and
constraints on how this personal data should be handled, stored,
processed and disclosed by enterprises. Some of these policies
affect access control, i.e. how data should be
accessed, based on data subjects’ consent, stated purposes for
collecting data, etc. Others dictate
obligations that enterprises need to fulfill on collected data, i.e.
expectations and duties on how to handle this data in terms of data
retention/deletion, notifications, data transformation, etc.
This R&D work focuses on
privacy obligation policies. The management of obligations has
an impact on how the lifecycle of personal data is handled in
distributed data repositories and systems within enterprises. This
area is still underestimated and open to innovation. HP Labs has
been working on this topic over the last few years, both in the
context of the EU PRIME project and in internal R&D projects. Our aim
is to provide a pragmatic approach to the representation, management
and enforcement of obligation policies that can be deployed within
enterprise IT infrastructures, in particular state-of-the-art
identity management solutions. Enterprises regard this as a key
requirement, along with automation and cost reduction.
In our vision privacy
obligations are explicit policies that dictate constraints,
expectations and duties on how personal data must be managed by
enterprises. They require dealing with data deletion, data
retention, data transformation and minimisation, notifications,
execution of (potentially complex) workflows on data by involving
human and system interactions, etc. Privacy obligations can be
short-term, long-term or have ongoing implications. Their
management and enforcement are central to enabling
privacy-aware information lifecycle management in enterprises.
Our approach has been
refined and implemented both in the EU PRIME project and in HP Labs
projects. In our vision, a privacy obligation policy is a
self-contained “entity” that has a unique identifier and consists of
Target, Events and Actions sections. Simple examples of privacy
obligations are: (1) “Delete credit card details of User X at time T
and Notify this User”; (2) “Notify Administrator A if financial
details of User X have been accessed more than Y times in T hours”;
(3) “Execute Workflow W on Information X of User Y if Context C has
property P”.
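As an illustration only, the three-section structure described above can be sketched as a small Python data model. The class, field names and dictionary layout are assumptions made for this example; they are not the project's actual representation.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ObligationPolicy:
    """Hypothetical in-memory form of a privacy obligation policy."""
    policy_id: str                                # unique identifier
    target: dict                                  # data the policy applies to
    events: list = field(default_factory=list)    # conditions that trigger it
    actions: list = field(default_factory=list)   # what to do when triggered


# Example (1): delete credit card details of User X at time T, then notify.
policy = ObligationPolicy(
    policy_id="obl-001",
    target={"user": "X", "attribute": "credit_card"},
    events=[{"type": "time", "at": datetime(2007, 1, 1)}],
    actions=[{"type": "delete"}, {"type": "notify", "recipient": "X"}],
)


def is_triggered(p: ObligationPolicy, now: datetime) -> bool:
    """True once any time event in the policy has been reached."""
    return any(e["type"] == "time" and e["at"] <= now for e in p.events)
```

Keeping the three sections as separate fields mirrors the idea that a policy is a self-contained entity: a scheduler only needs the Events section to decide when to act, and an enforcer only needs Target and Actions.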
From an operational
perspective (i.e. the actual representation of privacy obligation
policies in a format that can be programmatically interpreted,
managed and enforced) we proposed an explicit representation of
obligation policies in an XML format, as reactive rules: WHEN
Events happen THEN trigger the execution of Actions on Target.
Based on our XML representation of obligation policies, we have
defined an obligation management framework model and a related
obligation management system to interpret, schedule, enforce and
monitor these policies. A high-level overview of the architecture of
the obligation management system follows:

Our obligation management
technology and framework have been designed to allow users (at the
time of disclosing their personal data or afterwards) to express
privacy preferences (e.g. on the deletion time of some of their
attributes or on notifications) on how their personal data
should be handled by the enterprise. Our obligation management
system is then able to automatically derive and instantiate related
obligation policies based on these privacy preferences. We have
achieved this capability by introducing the concept of an obligation
policy template. A template is essentially
an obligation policy that contains simple “placeholders” in its
Events and Actions sections. Templates are defined upfront, by
privacy administrators, to cover all the types of obligations
supported by an enterprise. A template is
instantiated simply by replacing its placeholders with the actual
privacy preference values (for example a deletion date or a
notification preference). An “instantiated”
obligation policy (1) is uniquely associated with a piece of data and
(2) embeds privacy preferences in its Events and Actions
sections. The resulting “instantiated” obligation policies are then
scheduled, enforced and monitored by our obligation management
system. A working prototype has been fully implemented and
integrated with HP OpenView Select Identity, a state-of-the-art
identity management solution, to demonstrate the feasibility of our
ideas and their deployment in enterprise contexts.
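The template mechanism can be illustrated with a minimal Python sketch. The XML element names, the `$placeholder` syntax and the `instantiate` helper are hypothetical and chosen only for this example; the actual XML format used by the system is not reproduced here.

```python
from string import Template
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Hypothetical template: element names and $placeholders are illustrative
# assumptions, not the project's actual XML schema.
DELETE_TEMPLATE = Template("""\
<obligation id="$policy_id">
  <target user="$user" attribute="$attribute"/>
  <events><time at="$deletion_date"/></events>
  <actions><delete/><notify channel="$notify_channel"/></actions>
</obligation>""")


def instantiate(template: Template, policy_id: str, prefs: dict) -> ET.Element:
    """Replace placeholders with preference values and parse the result."""
    # Escape values so that they are safe inside XML attributes.
    safe = {k: escape(str(v), {'"': "&quot;"}) for k, v in prefs.items()}
    # substitute() raises KeyError if a placeholder has no value,
    # so an incomplete set of preferences is rejected early.
    xml_text = template.substitute(policy_id=escape(policy_id), **safe)
    return ET.fromstring(xml_text)


prefs = {
    "user": "alice",
    "attribute": "credit_card",
    "deletion_date": "2007-12-31",
    "notify_channel": "email",
}
policy = instantiate(DELETE_TEMPLATE, "obl-alice-1", prefs)
```

Each call produces one policy instance bound to one piece of data, with the preference values embedded in its Events and Actions sections.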

The implementation of an
initial prototype (and a related demonstrator), related tests and
feedback received from HP customers and third parties helped us to
identify another key problem: the scalability of our approach. On
the one hand, our approach provided great flexibility in defining a broad
range of privacy obligation policies, potentially customisable to
users’ needs and directly associated with personal data. On the other
hand, for each piece of managed data (and related privacy
preferences), one or more “instances” of our obligation policies had
to be created and associated with this data.
In real-world scenarios,
large amounts of users’ data (more than 100K records) are
collected and managed by enterprises. In our approach, this meant
having to deal with a similarly large number of associated
obligation policies, with negative implications for the
resources and processing power required to run our obligation
management system. Additional feedback highlighted the need not only
to passively monitor failures in enforcing privacy obligations (i.e.
spotting cases where the enforcement of stated Actions fails or
changes in the status of managed data invalidate previously
enforced actions) but also to proactively remediate
these failures (e.g. by notifying administrators or trying to
re-enforce failed actions).
We realized that it is
necessary to manage obligation policies in a scalable way, on a
potentially large set of personal data stored in various enterprise
data repositories. To address this problem and take into account
related requirements, we introduced the concept of parametric
obligation policies. A parametric obligation policy builds on
the concepts of our previous version of obligation
policies, and the same categories of obligation policies are managed.
However, the key differences are:
- A parametric obligation policy can be associated with a potentially
  large set of personal data (i.e. no multiple instantiations) and,
  at the same time, it can dictate customized obligation constraints
  (based on users’ privacy preferences) on each data item;
- A parametric obligation policy does not embed privacy preferences
  in its Events and Actions sections (as happens in our previous
  version of obligation policies). Instead, this policy contains
  explicit references to these preferences, which are stored
  elsewhere, in data repositories;
- The Target section of a parametric obligation policy explicitly
  models and describes the data repositories that contain the
  preference values pointed to by these references, in addition to
  the repositories containing personal data;
- A new “On Violation” section has been introduced to explicitly
  automate the process of “remediation” of violated obligations, in
  line with the remediation requirement described above.
The key feature introduced
by parametric obligations is that privacy preferences are
stored separately from parametric obligation policies:
references are used to retrieve these preferences. This ensures that
a parametric obligation policy can apply to a potentially
large set of personal data – as defined in its Target element
– and, at the same time, allows the “customization” of its Events
and Actions based on references to external privacy
preferences. Parametric obligation policies still need to be
deployed in an obligation management framework for their
interpretation, enforcement and monitoring. A Scalable Obligation
Management System (SOMS) has been fully implemented to deal with
these tasks.
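The difference from instantiated policies can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the in-memory dictionaries stand in for enterprise data repositories, and `ParametricPolicy`, `enforce` and the field names are hypothetical, not part of SOMS.

```python
from dataclasses import dataclass

# Hypothetical stores standing in for enterprise data repositories.
personal_data = {"u1": {"credit_card": "card-A"}, "u2": {"credit_card": "card-B"}}
preferences = {"u1": {"delete_after_days": 30}, "u2": {"delete_after_days": 365}}


@dataclass(frozen=True)
class ParametricPolicy:
    policy_id: str
    target_attribute: str   # attribute in the personal data repository
    event_ref: str          # reference to a preference, not a literal value
    on_violation: str       # remediation to run if the action fails


def enforce(policy, age_days, delete, remediate):
    """Apply one policy to every targeted item, resolving preferences per item."""
    outcome = {}
    for user_id, record in personal_data.items():
        if policy.target_attribute not in record:
            continue
        limit = preferences[user_id][policy.event_ref]   # resolve the reference
        if age_days < limit:
            outcome[user_id] = "not due"
            continue
        try:
            delete(user_id, policy.target_attribute)
            outcome[user_id] = "deleted"
        except Exception:
            remediate(user_id, policy.on_violation)      # On Violation section
            outcome[user_id] = "violation handled"
    return outcome


policy = ParametricPolicy("obl-param-1", "credit_card",
                          "delete_after_days", "notify_admin")


def delete(user_id, attribute):
    del personal_data[user_id][attribute]


alerts = []
result = enforce(policy, age_days=90, delete=delete,
                 remediate=lambda uid, action: alerts.append((uid, action)))
```

A single policy object covers both users, yet each user's own retention preference decides whether the deletion is due. No per-user policy instance is created.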

The key innovation
introduced in the SOMS system is its capability to dynamically
interpret parametric obligation policies (i.e. their Target,
Events, Actions and OnViolation Actions
sections) and map their references onto actual “targeted” data and
preferences. This is done efficiently, via SQL queries that
are instantiated on the fly, based on targeted data and related
preferences. The following figure provides a high-level view of
the related process implemented in the SOMS system, triggered by the
occurrence of external events of relevance for a given parametric
obligation policy.

When external events happen
for a given parametric obligation, the SOMS system identifies the
targeted personal data and related preferences. Based on this
context, a few SQL queries are dynamically built to resolve any
reference in the Events section and, at the same time, check
the resolved values against the stated Events conditions. For each piece
of data (targeted by this parametric obligation) where the
“customized” Events section triggers the enforcement of
Actions, the system dynamically builds SQL queries to resolve
references in the Actions section and enforces the Actions.
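The query-building step can be illustrated with a short Python sketch. It uses the standard-library `sqlite3` module in place of the MySQL databases used by the prototype, and the table names, column names and policy dictionary are hypothetical; the actual SOMS queries are not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE personal_data (user_id TEXT PRIMARY KEY, credit_card TEXT);
CREATE TABLE preferences   (user_id TEXT PRIMARY KEY, delete_on TEXT);
INSERT INTO personal_data VALUES ('u1', 'card-A'), ('u2', 'card-B');
INSERT INTO preferences   VALUES ('u1', '2007-01-31'), ('u2', '2008-01-31');
""")

# Hypothetical parametric policy: the Target names the repositories, and the
# Events section refers to a preference column instead of embedding a value.
policy = {
    "data_table": "personal_data",
    "prefs_table": "preferences",
    "attribute": "credit_card",
    "event_ref": "delete_on",
}


def due_items(conn, policy, today):
    """Build one query that resolves the reference and tests the condition."""
    # Table and column names come from the administrator-defined policy;
    # only the event value is passed as a bound parameter.
    sql = (
        f"SELECT d.user_id FROM {policy['data_table']} AS d "
        f"JOIN {policy['prefs_table']} AS p ON p.user_id = d.user_id "
        f"WHERE d.{policy['attribute']} IS NOT NULL "
        f"AND p.{policy['event_ref']} <= ? ORDER BY d.user_id"
    )
    return [row[0] for row in conn.execute(sql, (today,))]


def enforce_delete(conn, policy, user_ids):
    """Enforce a delete action by blanking the attribute for each due item."""
    sql = (f"UPDATE {policy['data_table']} "
           f"SET {policy['attribute']} = NULL WHERE user_id = ?")
    conn.executemany(sql, [(u,) for u in user_ids])


due = due_items(conn, policy, "2007-06-01")
enforce_delete(conn, policy, due)
```

Resolving the reference and checking the condition in a single join lets the database select the affected items, rather than the application iterating over every record. This is one plausible reading of how dynamically built queries can help with scale.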
A full working prototype of
our SOMS system has been implemented and re-integrated with the HP
OpenView Select Identity solution, a state-of-the-art User Account
and Provisioning solution for enterprises. This shows the
feasibility of our approach in a real-world environment.
Initial results are very encouraging. Although at this
stage we cannot yet provide a quantitative analysis of SOMS
performance, our prototype has already been tested with about 100K
items of personal data, in a context where about 10 parametric
obligation policies had been deployed (covering the most common
combinations of event and action types). Each item of personal data
was associated with specific privacy preferences. The SOMS system
(installed on a “standard” PC running MS Windows XP Professional, with
data stored in MySQL databases) went through all the required
steps of event processing, action enforcement and
monitoring without noticeable problems.
We are currently performing
additional tests on larger datasets and different types of
parametric obligations and collecting information on the behavior of
the system (future papers will provide this information). Future
work includes further extensions of managed policies, performance
tests and R&D in PRIME.
Further information and details about this
project can be found in the following HPL Technical Reports:
- Marco Casassa Mont, Filipe Beato - On Parametric Obligation
  Policies: Enabling Privacy-aware Information Lifecycle Management
  in Enterprises - HPL-2007-7, 2007
- Marco Casassa Mont - On Privacy-aware Information Lifecycle
  Management in Enterprises: Setting the Context - HPL-2006-109, 2006
- Marco Casassa Mont, Robert Thyne - A Systemic Approach to Automate
  Privacy Policy Enforcement in Enterprises - HPL-2006-51, 2006
- Marco Casassa Mont - Towards Scalable Management of Privacy
  Obligations in Enterprises - HPL-2006-45, 2006
- Marco Casassa Mont - A System to Handle Privacy Obligations in
  Enterprises - HPL-2005-180, 2005
- Marco Casassa Mont, Robert Thyne, Kwok Chan, Pete Bramhall -
  Extending HP Identity Management Solutions to Enforce Privacy
  Policies and Obligations for Regulatory Compliance by Enterprises -
  HPL-2005-110, 2005
- Marco Casassa Mont - Dealing with Privacy Obligations: Important
  Aspects and Technical Approaches - HPL-2004-34, 2004
|