Difference between revisions of "PDS Architecture"

(Managed data service)
(HBX)
 
(139 intermediate revisions by the same user not shown)
Line 1: Line 1:
{{#eclipseproject:technology.higgins|eclipse_custom_style.css}} [[Image:Higgins.funnell.PNG|right]] __NOTOC__
+
{{#eclipseproject:technology.higgins|eclipse_custom_style.css}} [[Image:Higgins.funnell.PNG|right]]  
  
What is often called a '''personal data store''' lies at the heart of a '''Personal Data Service (PDS)'''. This page describes both the PDS and the personal and 3rd party managed data stores that it manages.
+
This document describes the top level Higgins 2.0 PDS components under active development. Here are the bugzilla component names:
 +
* H2-Client
 +
* H2-HBX
 +
* H2-PDS
 +
* H2-PDS Support
 +
* H2-ADS
 +
* H2-Data Model
  
=== Introduction ===
+
== Front End  ==
  
A PDS provides a central point of control for information about a person, their life, friends, interests, affiliations and so on. It provides a dashboard for this information. It also provides the ability to update self-asserted data as well as a way to manage authorizations and set policies under which 3rd parties gain access to selected portions of the user’s information. A PDS implements a ''discovery'' API that allows the user to be discoverable by other people, organizations, apps and exchanges when the incoming inquiries meet criteria the user specifies.
+
There are two front end components: a web client, and a browser extension.  
  
A PDS manages both personal and managed data stores. A '''personal data store''' is a data service where self-asserted information about the user is stored and protected. A '''managed data store''' is a third party service that manages information about the user.
+
[[Image:Higgins client 2.0.222.png|center]]
  
[[Image:PDS intro 2.0.121.png|center]]
+
=== Client ===
  
A PDS enables the user to participate as a peer within a distributed personal data ecosystem of interoperable PDSes developed and operated by multiple organizations. We share a common vision with other open source projects of an internetwork of PDSes that can exchange personal information under the control of the individual. A person's PDS will include links to data objects stored in their friends' PDSes. These links, taken together, form a social graph that is distributed across these PDSes. Every person could choose to host their own PDS or have it operated by a trusted organization, ideally one that acts as an agent of the individual.
+
The client is written in HTML and JavaScript and runs in any desktop browser (e.g. IE, FF, Safari, Chrome). In the future we also plan to make it display well on the limited screen size of smartphone mobile browser (e.g. iPhone, Android, etc.).
  
=== PDS  ===
+
* [[Org.eclipse.higgins.js.pds.client | .js.pds.client]]
  
Information from a variety of data sources (e.g. social networks, telco and health data sources) is virtually integrated by the PDS and presented in a "dashboard" application in a browser or in desktop and mobile clients. The PDS gives you control over your own information by allowing you to share selected subsets of it with other people and organizations that you trust.
+
=== HBX ===
  
* Is a service that enables the user to participate as a peer within a distributed personal data ecosystem
+
The Higgins browser extension makes possible functionality like browser-side integration with other web APIs and sites, scraping and form filling.
* Provides a web portal that offers a dashboard view of the user’s data, the ability to update self-asserted data, and a way to manage authorizations (e.g. using something like a UMA Authorization Manager) and set policies under which 3rd parties (e.g. apps) gain access to portions of the user’s information
+
* Implements a Discovery API that allows the user to be discoverable by other people, organizations, apps and exchanges whose inquiries meet user-defined criteria
+
* Provides an identity provider (IdP) endpoint (e.g. OpenID OP)
+
* Implements two factor authentication
+
* Provides a run-time environment for Kynetx-like apps that run within the PDS itself
+
* Decrypts data from the user's personal data stores (using a local key) to allow their attributes to be managed in the PDS's dashboard UI.
+
  
===Personal data store===
+
* .chrome.bx - Chrome-only Higgins Browser Extension
* Provides a personal data abstraction layer mapping internal and external data sources into a consistent data model based around notions of personas
+
* .js.pds.cde - Connection Data Engine 1. Loads CDE1-compatible JSON Scripts (See [[App-data vocabulary]]) from templates and uses them to implement auto-login, auto-registration, form filling, etc.
* Manages a set of your personas (e.g. Work, Home & Friends, Citizen, Health, Anonymous)
+
* [[org.eclipse.higgins.js.pds.cde2|.js.pds.cde2]] - Connection Data Engine 2. Loads CDE2-compatible JSON Scripts (See [[App-data vocabulary]]) from templates and uses them to implement auto-login, auto-registration, form filling, etc.
* Provides an encrypted "lock box" in the cloud such that certain internally stored data in the store (e.g. persona definitions) cannot be read by the store's operator
+
* .js.pds.connector.common
* Backs up personal data stored on your desktop and mobile devices
+
* Synchronizes personal data to other devices and computers owned by the person using a variety of network protocols.
+
* Links information from personas to accounts (profiles) that the user has at services providers, websites, social networking sites, etc. and over which the user has joint control and rights
+
* Links information from the user's personas with the personas of the user's friends and colleagues
+
  
===Managed data store===
+
====Functionality====
* Gives the user control over their information stored and managed by external organizations (aka silos)
+
* Provides a data abstraction layer mapping external data sources into a consistent data model
+
  
=== PDS Apps ===
+
=====Browser interactions=====
 +
When the user's browser lands on a new webpage it:
 +
* Determines if the current PDS user is currently logged in.
 +
** This requires there be a template for the current site (domain) and that it contains an IsLoggedIn script
 +
** It is possible that a different PDS user (not the current PDS user) is currently logged in.
 +
* If the user is not logged in then
 +
** It automatically logs the user in (or should it just auto-fill in the userid/password and wait for the user to click?)
 +
* Looks for every appropriate form on the page
 +
** Automatically fills in each form as best it can  -- this requires there be a template for the current site (domain) and that it contains a Fill script for this form (is there one fill script container with lots of per-form-submit-URL scripts? Or are there lots of Fill scripts each with a for-this-form-submit-URL attribute?)
 +
* Waits for the user to submit a form (including a login form with or without a custom template?)
 +
** Scrapes the form submit data and writes it into the PDS. If it is a login form then it writes into the proxy object, else the corresponding context
  
The following kinds of apps are shown in the diagram above:
+
=====Web client interactions=====
# PDS App – A web app that consumes data from the PDS
+
When the user opens a connection editor page (e.g. to edit the nytimes.com connection):
# Exchange – A kind of PDS App that is involved in creating personal data exchanges analogous to a stock exchange. An exchange itself is a platform that supports yet another layer of apps above it [this is not shown above].
+
* The BX immediately starts a background process to login and scrape the latest data values from the site.
# Data Refinery – A kind of PDS App that reads datasets from the PDS, refines them, and writes them back to the PDS user. The refinery process includes analytics, inferencing, segmentation, etc. Refineries generally create higher-value, more refined data from rawer forms of data, while often also making the data sets less personally identifying.
+
** This is necessary because the user may have gone to the site directly (not using the PDS) and updated data values. A progress bar shows this background process.
 +
* If the user edits an attribute it writes the updated attribute value to the site.  
 +
** If this "write" operation happens before the background sync completes, there is some possibility for sync collisions and confusion.
  
=== Active Clients ===
+
== Back End Components ==
  
As shown at the top of the diagram above, we are developing native Windows, Mac and mobile active clients for the PDS. These clients have two advantages over the web-based PDA. '''First''', data stored on these devices is entirely under your control without the need to rely on third party hosted services. '''Second''', the client is closely integrated with the browser and other local apps. This allows the client to capture information about you as you browse and can augment your web experience through web augmentation (overlaying context-specific information within your browser) as well as through automatic form filling (e.g. filling in your passwords).
+
There are three back end components mostly written in Java and running in the cloud (e.g. Amazon AWS):
  
=== Data model for information about a person ===
+
*PDS
 +
*PDS Support
 +
*ADS
  
We all play different roles and share different sub-sets of our social graph and attributes depending on who we're interacting with. For this reason a single person is represented as a set of partial identities that are used in different situations. For this reason, the heart of the model used by the personal data store and managed data stores is based on a set of objects called ''contexts.'' Each context holds a partial digital identity called a ''persona''. Each persona instance has a set of attributes and values. Thus one individual (natural person, data subject) is represented as multiple personas each in its own context-container.  
+
[[Image:Higgins server 2.0.230.png|center]]
  
These contexts are usually rendered as digital cards in a user interface. A context/card could hold the attributes of a person's driver's license, home address, credit card. They might simply hold a verified assertion that a person is over 21 years of age. Contexts may also be about friends and colleagues, not just about you.
+
===PDS===
 +
PDS Subcomponents:
  
The user can also choose to collect a set of these partial identities into something called a ''persona''. For example the user could group together a home address card, an AMEX credit card, a proof of age-over-21 and a card holding a set of "shopping friends" into an "eCommerce" persona. This is done by tagging each of these cards with the "eCommerce" label. When the user goes to a new eCommerce site, it can "project" (either by form filling or something more sophisticated!) the minimal set of required attributes from these "eCommerce" cards to the site without tedious data entry.  
+
*.pds.usermanager.ws - simple web service to manage user accounts, change password, etc.
  
What's more, if the user desires, the user can give a semi-permanent (revocable) permission to the relying site, app or system to be able to access an approved set of attributes. The user can basically send a "pointer" to these cards to the relying site. The relying site can de-reference the pointer and read (and in some cases update) selected attributes.
+
===PDS Support===
 +
PDS Support Subcomponents:
  
The data in these contexts adheres to the Higgins [[Persona Data Model 2.0]], a general purpose vocabulary for describing identity and social networking data.
+
*.pds.client - wrapper around Open Anzo java client
  
== Components  ==
+
===Attribute Data Storage===
 +
ADS Subcomponents:
  
*[[PDS 2.0]]: Personal Data Store core service
+
*PLANNED: .ads.ld - Linked Data endpoint
*PDA: We have not even started developing this component for Higgins 2.0. We developed something similar in [[Cloud Selector 1.1|Cloud Selector]] from Higgins 1.1. Similar in that it was a pure web app and that it was a "client" of the core PDS.
+
*[[PDS Client 2.0]]: a library used to access the [[Personal Data Store 2.0]]. It is incorporated into the PDS agent as well as PC and mobile PDS clients.
+
*[[Authentication Service 2.0]]: is an OAuth web service that authenticates PDS users and returns an access token that is relied on by the PDS Agent and the PDS Vault.
+
  
=== Data Models ===
+
== Data Model ==
  
Data models used in Higgins code and services:
+
Data attributes, whether created by the user or imported from an external service, are stored in a common data model. This allows them to be consistently displayed to, and in some cases edited by, the user irrespective of their original source. We call this the [[Persona Data Model 2.0]].
  
[[Image:Higgins data models.png|center]]
+
[[Category:Higgins 2]]
 
+
*[[Persona Data Model 2.0]]
+
*[[Higgins Data Model 2.0]]
+
*[[Context Data Model 2.0]]
+
 
+
=== IdAS   ===
+
 
+
The IdAS solution is a testbed for exercising the IdAS Java framework.
+
 
+
*Higgins 1.1: See [[Higgins 1.1 Plan#IdAS_Solution_1.1]]
+
*Higgins 1.0: [[IdAS Solution 1.0]]: a basic configuration of the [[Identity Attribute Service 1.0]] (IdAS). IdAS is a java framework that provides a common interface to identity, profile, and relationship data from external data sources (e.g. websites, databases, directories).
+
 
+
=== XDI4J  ===
+
 
+
XDI4J is a java library for working with XDI.
+
 
+
*Higgins 1.1: [[XDI4j 1.1]]
+

Latest revision as of 12:12, 4 January 2012


This document describes the top level Higgins 2.0 PDS components under active development. Here are the bugzilla component names:

  • H2-Client
  • H2-HBX
  • H2-PDS
  • H2-PDS Support
  • H2-ADS
  • H2-Data Model

Front End

There are two front end components: a web client, and a browser extension.

Higgins client 2.0.222.png

Client

The client is written in HTML and JavaScript and runs in any desktop browser (e.g. IE, FF, Safari, Chrome). In the future we also plan to make it display well on the limited screen size of smartphone browsers (e.g. iPhone, Android).

HBX

The Higgins browser extension (HBX) enables browser-side integration with other web APIs and sites, as well as scraping and form filling.

  • .chrome.bx - Chrome-only Higgins Browser Extension
  • .js.pds.cde - Connection Data Engine 1. Loads CDE1-compatible JSON Scripts (See App-data vocabulary) from templates and uses them to implement auto-login, auto-registration, form filling, etc.
  • .js.pds.cde2 - Connection Data Engine 2. Loads CDE2-compatible JSON Scripts (See App-data vocabulary) from templates and uses them to implement auto-login, auto-registration, form filling, etc.
  • .js.pds.connector.common

Functionality

Browser interactions

When the user's browser lands on a new webpage, the HBX:

  • Determines whether the current PDS user is logged in to the site.
    • This requires a template for the current site (domain) that contains an IsLoggedIn script.
    • It is possible that a different PDS user (not the current PDS user) is logged in.
  • If the user is not logged in:
    • Automatically logs the user in. (Open question: should it instead just auto-fill the userid/password and wait for the user to click?)
  • Looks for every appropriate form on the page.
    • Automatically fills in each form as best it can. This requires a template for the current site (domain) that contains a Fill script for this form. (Open question: is there one Fill script container with many per-form-submit-URL scripts, or are there many Fill scripts, each with a for-this-form-submit-URL attribute?)
  • Waits for the user to submit a form. (Open question: does this include a login form with or without a custom template?)
    • Scrapes the form submit data and writes it into the PDS. If it is a login form, the data is written into the proxy object; otherwise it is written into the corresponding context.
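
The page-load steps above can be sketched as a small planning function. This is only an illustration under stated assumptions, not the HBX implementation: the `Page`, `SiteTemplate` and `Action` shapes and the `planPageActions` and `submitTarget` names are all hypothetical, the IsLoggedIn and Fill scripts are stood in for by plain functions and objects rather than CDE JSON scripts, and Fill data is keyed by form submit URL (one of the two layouts the text leaves open).

```typescript
// Hypothetical shapes. The real CDE template format is defined by the
// App-data vocabulary and is not reproduced here.
interface Page {
  domain: string;
  loggedInUser: string | null; // who the site reports as logged in, if anyone
  formSubmitUrls: string[];    // one entry per form found on the page
}

interface SiteTemplate {
  domain: string;
  // Stand-in for an IsLoggedIn script: returns the logged-in user id or null.
  isLoggedIn?: (page: Page) => string | null;
  // Stand-in for Fill scripts, keyed by form submit URL (an assumption).
  fillScripts: Record<string, Record<string, string>>;
}

type Action =
  | { kind: "login"; user: string }
  | { kind: "fill"; submitUrl: string; values: Record<string, string> }
  | { kind: "watchSubmit"; submitUrl: string };

// Decide what the extension would do for one page load.
function planPageActions(
  page: Page,
  template: SiteTemplate | undefined,
  pdsUser: string
): Action[] {
  const actions: Action[] = [];

  // The login check is only possible when the template has an IsLoggedIn script.
  if (template && template.isLoggedIn) {
    const siteUser = template.isLoggedIn(page);
    if (siteUser === null) {
      actions.push({ kind: "login", user: pdsUser });
    }
    // A different user being logged in is deliberately left unhandled here,
    // because the text does not say what should happen in that case.
  }

  for (const submitUrl of page.formSubmitUrls) {
    // Fill only forms for which the template carries Fill data.
    const values = template ? template.fillScripts[submitUrl] : undefined;
    if (values) {
      actions.push({ kind: "fill", submitUrl, values });
    }
    // Every form is watched so that submitted data can be scraped.
    actions.push({ kind: "watchSubmit", submitUrl });
  }
  return actions;
}

// Where scraped submit data is written: login forms go to the proxy object,
// everything else to the corresponding context.
function submitTarget(isLoginForm: boolean): "proxy" | "context" {
  return isLoginForm ? "proxy" : "context";
}
```

Returning a list of planned actions instead of touching the page directly keeps the decision logic testable without a browser; a content script would then carry out each action against the DOM.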

Web client interactions

When the user opens a connection editor page (e.g. to edit the nytimes.com connection):

  • The BX immediately starts a background process to log in and scrape the latest data values from the site.
    • This is necessary because the user may have gone to the site directly (not using the PDS) and updated data values. A progress bar shows the state of this background process.
  • If the user edits an attribute, the BX writes the updated attribute value to the site.
    • If this "write" operation happens before the background sync completes, sync collisions and confusion are possible.
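
One way to avoid the collision described above is to hold back writes until the background sync has finished. The sketch below shows that idea only; the text does not say how Higgins resolves such collisions, so the `ConnectionEditorSync` class, its method names, and the "user edits win over scraped values" policy are all assumptions made for illustration.

```typescript
// Hypothetical model of a connection editor page. It starts in the "syncing"
// state, mirroring the background scrape that begins when the page opens.
class ConnectionEditorSync {
  private syncing = true;
  private pending: Record<string, string> = {};

  // Values currently shown in the editor.
  values: Record<string, string> = {};
  // Record of writes that would be sent to the site, in order.
  sentToSite: Array<[string, string]> = [];

  // The user edits an attribute in the editor.
  edit(attr: string, value: string): void {
    this.values[attr] = value;
    if (this.syncing) {
      // Defer the write: sending it now could collide with the scrape.
      this.pending[attr] = value;
    } else {
      this.sentToSite.push([attr, value]);
    }
  }

  // The background scrape finishes and delivers the site's latest values.
  completeSync(scraped: Record<string, string>): void {
    for (const attr of Object.keys(scraped)) {
      // Assumed policy: an edit made during the sync wins over the scraped value.
      if (!(attr in this.pending)) {
        this.values[attr] = scraped[attr];
      }
    }
    this.syncing = false;
    // Flush the deferred writes now that the scrape can no longer overwrite them.
    for (const attr of Object.keys(this.pending)) {
      this.sentToSite.push([attr, this.pending[attr]]);
    }
    this.pending = {};
  }
}
```

Other policies are equally plausible, for example disabling editing until the progress bar completes, or prompting the user when a scraped value differs from a pending edit.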

Back End Components

There are three back end components, mostly written in Java and running in the cloud (e.g. Amazon AWS):

  • PDS
  • PDS Support
  • ADS

Higgins server 2.0.230.png

PDS

PDS Subcomponents:

  • .pds.usermanager.ws - simple web service to manage user accounts, change password, etc.

PDS Support

PDS Support Subcomponents:

  • .pds.client - wrapper around Open Anzo Java client

Attribute Data Storage

ADS Subcomponents:

  • PLANNED: .ads.ld - Linked Data endpoint

Data Model

Data attributes, whether created by the user or imported from an external service, are stored in a common data model. This allows them to be consistently displayed to, and in some cases edited by, the user irrespective of their original source. We call this the Persona Data Model 2.0.
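
As a rough illustration of why a common model helps, the sketch below stores self-asserted and imported attributes in one record type so that a single display routine can handle both. It does not reflect the actual Persona Data Model 2.0 vocabulary: the `Attribute` shape, its field names, and the helper functions are hypothetical.

```typescript
// Hypothetical common record for an attribute, whatever its source.
interface Attribute {
  name: string;
  value: string;
  source: "self-asserted" | "imported";
  origin?: string;   // the external service, for imported attributes
  editable: boolean; // imported attributes are only editable in some cases
}

// Attributes the user created are assumed to be always editable.
function selfAsserted(name: string, value: string): Attribute {
  return { name, value, source: "self-asserted", editable: true };
}

// Imported attributes default to read-only unless marked otherwise.
function imported(
  name: string,
  value: string,
  origin: string,
  editable: boolean = false
): Attribute {
  return { name, value, source: "imported", origin, editable };
}

// One rendering path for every attribute, regardless of source.
function renderAttribute(a: Attribute): string {
  const from = a.origin ? ` (from ${a.origin})` : "";
  const edit = a.editable ? " [edit]" : "";
  return `${a.name}: ${a.value}${from}${edit}`;
}
```

For example, `renderAttribute(selfAsserted("nickname", "Al"))` and `renderAttribute(imported("email", "al@example.com", "example.com"))` go through the same code path, which is the consistency the common model is meant to provide.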
