May 29, 2012
Editor: Ka-Ping Yee (ping
spam@zesty.ca)
URL of this specification:
http://zesty.ca/pfif/1.4
FAQ, examples, and other information on PFIF:
http://zesty.ca/pfif
This document is licensed under the GNU Free Documentation License 1.2.
1. Abstract
This document defines the People Finder Interchange Format,
which consists of a data model and an XML-based exchange format
for sharing data about people who are missing or displaced
by natural or human-made disasters.
The data model is first described in a manner
independent of implementation style (object-oriented, relational, or XML),
then the PFIF XML format is specified by a RELAX NG schema.
This document also offers an example
of a possible relational database schema for PFIF data.
2. Design principles
- The purpose of PFIF is to bring people and data together.
The design aims to promote convergence:
convergence of people who seek the same person,
convergence of information about a person obtained from various sources,
convergence of duplicated data,
and ultimately convergence of missing people with their loved ones.
- Data should be traceable.
Since data comes from sources of unknown reliability and accountability,
information on the origins of data should be maintained,
to help users ascertain its trustworthiness.
- Each record belongs to an original repository,
which is the (PFIF or non-PFIF) repository where the record was first entered.
The record may be copied to other places,
but the original repository remains the authority on the record.
Only the original repository should ever change the contents of a record.
- Each aggregator of data has its own perspective on the world
and is responsible for choosing which data sources to trust.
It is not possible to dictate truths
about all data from a single central authority.
- Because multiple records might refer to the same person,
PFIF allows such records to be associated with each other.
But, by the preceding principle,
each aggregator makes its own decisions about
which records to associate; there is no central authority.
- It should be possible to resolve multiple copies of the same record
that have been imported via different data paths.
- All dates and times must be in UTC, never in a local time zone,
because data records will be transmitted among many different time zones.
This format uses dates in the
RFC 3339 format,
with only UTC allowed.
Front-ends can convert dates and times to the local time zone for display.
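As an illustration of the UTC-only rule, the following Python sketch converts between time-zone-aware datetimes and the "yyyy-mm-ddThh:mm:ssZ" form used in this document. The helper names (format_pfif_time, parse_pfif_time) are assumptions made for this example and are not part of PFIF; the parser handles whole seconds only.

```python
from datetime import datetime, timedelta, timezone

PFIF_TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # hypothetical constant name


def format_pfif_time(dt: datetime) -> str:
    """Convert an aware datetime to a UTC timestamp string."""
    if dt.tzinfo is None:
        # A naive datetime has no time zone, so it cannot be converted safely.
        raise ValueError("naive datetime: time zone is ambiguous")
    return dt.astimezone(timezone.utc).strftime(PFIF_TIME_FORMAT)


def parse_pfif_time(value: str) -> datetime:
    """Parse a whole-second UTC timestamp into an aware datetime."""
    parsed = datetime.strptime(value, PFIF_TIME_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


# Usage: a front-end time in UTC-7 is stored as UTC.
local_time = datetime(2012, 5, 29, 9, 30, 0, tzinfo=timezone(timedelta(hours=-7)))
stored = format_pfif_time(local_time)
```

A front-end would call astimezone() on the parsed value to display it in the user's local time zone.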
3. Data life cycle
Each PFIF repository
may contain original records and clone records.
An original record is a record residing in its original repository;
a clone record is a copy of a record that originated in another repository.
The following diagram describes the life of a PFIF record
as it is created and then travels to other repositories.
                       .----------------------.
                       | 1. real-world facts  |
                       '----------------------'
                            |             |
        entered by a human  |             |  entered by a human
    into a PFIF repository  |             |  into a non-PFIF repository
                            |             |
  entry_date, source_date,  |             |
  source_name, source_url   |             |
  are set by the repository |             |
                            v             v
.------------------------------.     .---------------------------------.
| 2a. original PFIF record in  |     | 2b. original non-PFIF record in |
| record's original repository |     | record's original repository    |
'------------------------------'     '---------------------------------'
                            |             |
        exported as a PFIF  |             |  parsed and converted to the PFIF
          document or feed  |             |  data model by a human or program
                            |             |
                            |             |  source_date, source_name, source_url
                            |             |  are set by the human or program
                            v             v
                          .-----------------.
        .------------->   | 3. PFIF record  |
        |                 '-----------------'
        |                          |
        |                          |  loaded into a PFIF repository
        |                          |
        |                          |  entry_date is set to date/time of import
        |                          v
        |       .--------------------------------------.
        |       | 4. clone record in a PFIF repository |
        |       '--------------------------------------'
        |                          |
        |                          |  exported as a PFIF document or feed
        |                          |
        '--------------------------'
3.1. Incremental export mechanism
Whenever a PFIF repository adds a new original record or clone record,
it must set the entry_date field
to the current time.
This time value must never decrease as records are added.
A client can incrementally update its copy of a repository
by querying for all records with an
entry_date greater than or equal to the
entry_date of the last received record.
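The incremental update described above might be sketched as follows. This is a minimal in-memory illustration, not a prescribed API: records are plain dictionaries, the function names are hypothetical, and timestamps are assumed to be whole-second "yyyy-mm-ddThh:mm:ssZ" strings, which sort chronologically when compared as text.

```python
def records_since(records, last_entry_date=None):
    """Return records with entry_date >= last_entry_date, oldest first."""
    selected = [
        r for r in records
        if last_entry_date is None or r["entry_date"] >= last_entry_date
    ]
    return sorted(selected, key=lambda r: r["entry_date"])


def sync(local_copy, remote_records, last_entry_date=None):
    """Merge newly fetched records into local_copy; return the new high-water mark."""
    for record in records_since(remote_records, last_entry_date):
        # Records received again (same entry_date as last time) simply overwrite.
        local_copy[record["person_record_id"]] = record
        last_entry_date = record["entry_date"]
    return last_entry_date
```

Using "greater than or equal to" rather than "greater than" means that a record stored in the same second as the last received record is not missed, at the cost of receiving a few records twice.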
3.2. Data update mechanism
The original repository for a record (2a or 2b in the diagram above)
can update any of the fields on a record after it is created,
except the person_record_id field.
Whenever a PFIF repository creates or updates an original record,
it must set both the source_date and
entry_date fields to the current time.
When a repository imports a PFIF record
that has the same record identifier as an existing record,
it should keep the version
with the latest source_date.
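A minimal sketch of this import rule, assuming records are held as dictionaries keyed by record identifier and that timestamps are whole-second UTC strings. The function name import_record is hypothetical, and the handling of equal source_date values (the incoming copy wins) is a choice made for this example, since the rule above does not address ties.

```python
def import_record(store, incoming, id_field="person_record_id"):
    """Store `incoming` unless the existing copy has a later source_date.

    Returns True if the incoming record was stored.
    """
    record_id = incoming[id_field]
    existing = store.get(record_id)
    if existing is not None and existing["source_date"] > incoming["source_date"]:
        return False  # keep the newer version already held
    store[record_id] = incoming
    return True
```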
3.3. Data expiry mechanism
If present, the expiry_date field
indicates when a record should be deleted
to preserve the privacy of the personal information it contains.
Conforming PFIF implementations must meet the following requirements:
-
Within one day after expiry_date,
a PFIF repository must make the contents
of the person record and
any associated note records
inaccessible to all external clients,
including users and machine API clients.
-
Thereafter,
if the repository exports its data through an API,
it should continue to export a placeholder record
in the place of the expired person record.
This placeholder should keep the same
person_record_id and
expiry_date values,
and have both source_date
and entry_date
set to the time that the placeholder was created.
All other fields should be empty or omitted.
-
Within 60 days after expiry_date,
a PFIF repository must permanently and unrecoverably delete
all its copies (including backups)
of the contents of the person record
and any associated note records,
except for the
person_record_id,
source_date,
entry_date,
and expiry_date
fields needed to produce the placeholder.
To satisfy a user request to delete an existing original record,
a PFIF repository should set the record's
expiry_date to the current time.
(In accordance
with the preceding section,
it would also set the source_date
and entry_date to the current time.)
The expiry mechanism described above would then cause the deletion
to propagate to other conforming PFIF repositories.
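The placeholder and deletion-request rules above could be sketched as follows. The function names are assumptions for this example, records are plain dictionaries, and `now` is a UTC timestamp string supplied by the caller.

```python
def make_placeholder(person, now):
    """Build the placeholder that replaces an expired person record.

    Only the four fields named in this section are kept; everything else
    is omitted.
    """
    return {
        "person_record_id": person["person_record_id"],
        "expiry_date": person["expiry_date"],
        "source_date": now,
        "entry_date": now,
    }


def request_deletion(person, now):
    """Handle a user deletion request on an original record by expiring it now."""
    updated = dict(person)
    updated["expiry_date"] = now
    # Updating an original record also resets source_date and entry_date.
    updated["source_date"] = now
    updated["entry_date"] = now
    return updated
```

A repository would create the placeholder once and store it, so that its source_date and entry_date reflect the time of creation rather than the time of each export.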
4. Data Model
There are two types of records.
person records
are for information that identifies a person.
note records
are for information about the current status of a person.
Each note record belongs to a particular
person, and
a person record may have
any number of associated note records.
person records may be created
both by those who seek a missing person
and by those who have information on a missing person.
The person record for a person is the
point of convergence for all parties;
the note records on that person are the
growing pool of shared knowledge.
A person record should only be updated
if the information in the record is incorrect.
If the status or location
of a particular person has changed,
this should be indicated by adding a new note record
associated with that person record.
4.1. person records
A person record contains 25 fields.
There may be multiple
person records for the same person.
In fact, any given application that imports data from multiple sources
is likely to acquire multiple
person records for the same person.
It is up to the application to associate such records
(see Suggested relational database schema below).
It is recommended that applications keep copies of all the records,
and separately keep track of which records correspond to the same person.
Metadata about the record itself (9 fields)
This metadata is necessary to enable users of the data
to trace and ascertain its reliability.
- person_record_id
(ASCII string, required):
-
Unique identifier for this record,
which consists of a lowercase ASCII domain name
followed by a slash and a local identifier.
The domain name identifies this record's original repository,
which is the authority for this record.
The format of the local identifier is up to the original repository.
When the person_record_id
begins with a domain other than the application's own domain,
it means this record is a clone record.
- entry_date
(ASCII string in the form "yyyy-mm-ddThh:mm:ssZ", optional):
-
Date in UTC that this copy of this record was stored
(see Incremental export mechanism above).
- expiry_date
(ASCII string in the form "yyyy-mm-ddThh:mm:ssZ", optional):
-
Date in UTC after which this record should be deleted
(see Data expiry mechanism above).
- author_name (Unicode string, optional):
-
The full name of the person who entered this record.
- author_email
(e-mail address string, optional):
-
The preferred contact e-mail address of the person who entered this record.
- author_phone (ASCII string, optional):
-
The preferred contact phone number of the person who entered this record.
- source_name (Unicode string, optional):
-
The human-readable name of the original repository of this record.
- source_date
(ASCII string in the form "yyyy-mm-ddThh:mm:ssZ", required):
-
The date in UTC that the original copy of this record was created
in its original repository.
- source_url (URL string, optional):
-
The URL to this record in its original repository
(as specific as possible, down to the URL of the individual record).
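As a sketch of how an application might use the structure of person_record_id, the following splits an identifier at its first slash and tests whether a record is a clone. The function names and the example domains are hypothetical.

```python
def split_record_id(record_id):
    """Split "domain/local-id" into (domain, local_id).

    The split is made at the first slash, since a domain name cannot
    contain one; the local identifier may contain further slashes.
    """
    domain, sep, local_id = record_id.partition("/")
    if not sep or not domain or not local_id:
        raise ValueError("not a valid record identifier: %r" % record_id)
    return domain, local_id


def is_clone(record_id, own_domain):
    """Return True if the record originated in another repository."""
    domain, _ = split_record_id(record_id)
    return domain != own_domain.lower()
```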
Identifying information about a missing person (16 fields)
These fields contain information that is used to identify a person;
this is information that is not expected to change unless it is incorrect.
Searches for person records
should search over these fields.
- full_name (Unicode string, required):
-
The full name of the person sought or found,
combined in the order and fashion
customary to the person, language, and culture.
For example, a typical English name would be formatted
as a first, middle, and last name with spaces between them,
whereas a typical Chinese name would be formatted
with the family name first and no spaces between characters.
If the person has more than one full name
(for example, both an English name and a Chinese name),
begin with the most commonly used full name
and use newline (Unicode U+000A) characters to separate the full names.
- given_name (Unicode string, optional):
-
Given name of the person sought or found,
optionally followed by a space and any middle names or middle initials.
- family_name (Unicode string, optional):
-
Family name of the person sought or found.
- alternate_names (Unicode string, optional):
-
Any other names associated with the person
that are not part of the person's usual full name,
such as nicknames, alternate spellings, and transliterations.
For example, hiragana readings of Japanese kanji could go in this field.
Use newline (Unicode U+000A) characters to separate names.
- description (Unicode string, optional):
-
Description of the person in free-form text.
In entry forms, a multi-line text box is appropriate for this field.
- sex (ASCII string, optional):
-
Physical sex of the person sought or found,
specified as one of the three strings
female, male, or other.
If the sex is unknown, omit this field.
- date_of_birth
(ASCII string in the form "yyyy", "yyyy-mm", or "yyyy-mm-dd", optional):
-
Exact or approximate date of birth
of the person sought or found.
- age
(integer, or ASCII string in the form "min-max", optional):
-
Approximate age of the person sought or found,
in years since birth as of
the source_date of this record.
The value of this field is either a single decimal integer
or an inclusive range given as two decimal integers separated by a hyphen.
This field has no defined meaning
when source_date is missing.
- home_street (Unicode string, optional):
-
Street name of the home address of the person sought or found.
To protect user privacy,
applications should generally avoid including a street number in this field.
- home_neighborhood (Unicode string, optional):
-
Name of the home neighborhood of the person sought or found.
Use this field for the names of official or unofficial geographic regions
not captured by the other address fields.
- home_city (Unicode string, optional):
-
Home city of the person sought or found.
- home_state (Unicode string, optional):
-
Home state, province, territory, district,
region, parish, county, or department
of the person sought or found,
specified as an uppercase ISO 3166-2 code (preferred) or by its name.
- home_postal_code (ASCII string, optional):
-
Postal code of the home address of the person sought or found,
in the format most commonly used in the country.
- home_country
(ASCII ISO-3166-1 country code, optional):
-
Home country of the person sought or found,
specified as an uppercase two-letter ISO-3166-1 country code.
- photo_url (URL string, optional):
-
URL to an image of an identifying photograph of the person sought or found.
- profile_urls (URL string, optional):
-
URLs to the person's profile pages on other websites.
Use newline (Unicode U+000A) characters to separate URLs.
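Two of the field conventions above lend themselves to small helpers: the age field, which is either a single integer or an inclusive "min-max" range, and the newline-separated fields (full_name, alternate_names, profile_urls). The following is a sketch only; the function names are assumptions for this example.

```python
import re


def parse_age(value):
    """Return (min_age, max_age) for an age such as "35" or "30-40"."""
    match = re.fullmatch(r"([0-9]+)(?:-([0-9]+))?", value)
    if match is None:
        raise ValueError("not a valid age: %r" % value)
    low = int(match.group(1))
    # A single integer is treated as a range of one value.
    high = int(match.group(2)) if match.group(2) is not None else low
    return low, high


def split_multi(value):
    """Split a newline-separated field into a list, dropping blank entries."""
    return [part.strip() for part in value.split("\n") if part.strip()]
```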
4.2. note records
Each note record belongs to
exactly one person record.
There may be any number of
note records
associated with a particular
person record.
(See below for implementation notes.
A database might implement this by including a
foreign key, person_record_id,
that refers to the person record.
An object-oriented representation might implement this
by embedding a list of
note objects
within the
person object.)
note records are used to provide
updated, current information on a missing person.
Every note has a timestamp and information on the author of the note.
Applications can use the timestamp to
determine the most recent value of a given field.
Users can use the author information to
ascertain the reliability of a given field.
Metadata about the record itself (8 fields)
- note_record_id (ASCII string, required):
-
Unique identifier for this record,
which consists of a domain name followed by a slash and a local identifier.
The domain name identifies this record's home repository,
which is the authority for this record.
The format of the local identifier is up to the home repository.
When the note_record_id
begins with a domain other than the application's own domain,
it means this record is a clone of a record from another source.
- person_record_id (ASCII string, required):
-
The person_record_id
of the person record
to which this note belongs.
- linked_person_record_id (ASCII string, optional):
-
The person_record_id
of another person record
to associate with the record to which this note belongs.
When this field is present,
it signifies that the author of this note
believes that the two records identified by
person_record_id and
linked_person_record_id
refer to the same person.
If this field is present,
the text field should
explain how these records were determined to refer to the same person.
- entry_date
(ASCII string in the form "yyyy-mm-ddThh:mm:ssZ", optional):
-
Date in UTC that this copy of this record was stored.
A PFIF repository must guarantee that this value increases monotonically
as records are added,
so that a client can update a copy of a repository
by querying for all records with an
entry_date greater than or equal to the
entry_date of the last received record.
- author_name (Unicode string, required):
-
The full name of the person who entered this note.
- author_email
(e-mail address string, optional):
-
The preferred contact e-mail address of the person who entered this note.
- author_phone (ASCII string, optional):
-
The preferred contact phone number of the person who entered this note.
- source_date
(ASCII string in the form "yyyy-mm-ddThh:mm:ssZ", required):
-
The date in UTC that the original copy of this note was created
in its home repository.
In most cases, notes should be sorted by this field for display.
Status information about a missing person (7 fields)
The
author_made_contact,
status,
email_of_found_person,
phone_of_found_person and
last_known_location fields
store data that changes over time.
When these fields are present in a note record,
the record is specifying new values for these fields,
and the source_date field indicates
the date that the new values took effect.
So, for example, an application that wants to display the most recent
known location can look for the
note
with the latest
source_date
that has a non-empty
last_known_location
field.
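The lookup described in this paragraph might be sketched as follows, assuming notes are dictionaries whose source_date values are whole-second UTC strings (which compare chronologically as text). The function name latest_value is hypothetical.

```python
def latest_value(notes, field):
    """Return the value of `field` from the most recent note that sets it.

    Notes where the field is missing or empty are ignored; None is
    returned if no note supplies a value.
    """
    candidates = [note for note in notes if note.get(field)]
    if not candidates:
        return None
    newest = max(candidates, key=lambda note: note["source_date"])
    return newest[field]
```

The same function works for status, author_made_contact, and the other fields that change over time.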
- author_made_contact (ASCII string, optional):
-
This value is the string true
if the author of this note has personally contacted the missing person,
or false otherwise.
If this field is true,
the text field of this note should
describe HOW and WHEN the person was contacted or seen.
- status (ASCII string, optional):
- Status of the person sought or found, specified as one of the
following five strings:
information_sought
- The author of the note is seeking information
on the person in question.
is_note_author
- The author of the note is the person in question.
believed_alive
- The author of the note has received information that
the person in question is alive.
believed_missing
- The author of the note has reason to believe that
the person in question is still missing.
believed_dead
- The author of the note has received information that
the person in question is dead.
- email_of_found_person
(e-mail address string, optional):
-
The current preferred contact e-mail address of the FOUND person.
This field is present ONLY if the person has been FOUND.
If this field is present,
the text field of this note should
describe HOW the person's contact information was determined.
- phone_of_found_person
(ASCII string, optional):
-
The current preferred contact phone number of the FOUND person.
This field is present ONLY if the person has been FOUND.
If this field is present,
the text field of this note should
describe HOW the person's contact information was determined.
- last_known_location
(Unicode string, optional):
-
A free-form description of the last known location of the person.
To specify geographic coordinates in this field,
give the latitude in decimal degrees (positive for north), then a comma,
then the longitude in decimal degrees (positive for east).
If this field is present,
the text field of this note should
describe HOW the person's location was determined.
- text (Unicode string, required):
-
Free-form text description of the person's current condition,
situation and location details, where they were last seen,
corrections to other information, and so on.
In entry forms, a multi-line text box is appropriate for this field.
- photo_url (URL string, optional):
-
URL to an image to include with this note.
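Because last_known_location is free-form text that may optionally hold coordinates, an application could try to recognize the "latitude,longitude" convention and fall back to treating the value as a description. This is a sketch under that assumption; the function name and the range checks are choices made for this example.

```python
import re

_COORDS = re.compile(
    r"\s*([-+]?[0-9]+(?:\.[0-9]+)?)\s*,\s*([-+]?[0-9]+(?:\.[0-9]+)?)\s*"
)


def parse_coordinates(location):
    """Return (latitude, longitude) in decimal degrees, or None for free text."""
    match = _COORDS.fullmatch(location)
    if match is None:
        return None
    latitude, longitude = float(match.group(1)), float(match.group(2))
    # Positive latitude is north and positive longitude is east.
    if not (-90 <= latitude <= 90 and -180 <= longitude <= 180):
        return None
    return latitude, longitude
```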
5. XML format specification
The XML Namespace for PFIF is: http://zesty.ca/pfif/1.4
The MIME type for a PFIF document is:
A valid PFIF XML document consists
of a single pfif element
containing one or more
person or note
elements, each of which contains child elements for the fields described above.
In a person element,
the person_record_id,
source_date, and
full_name fields are mandatory.
In a note element,
the note_record_id,
author_name, and
source_date fields are mandatory.
All other fields are optional.
The order of the child elements
within a person or
note element is not significant.
A note element
can exist inside or outside a person element.
When a note element appears
outside a person element,
the note
must contain a person_record_id.
Otherwise, the person_record_id field is optional,
and if present, must match the person_record_id
of the enclosing person.
The RELAX NG Schema for PFIF,
given in
RELAX NG Compact Syntax,
is as follows:
namespace pfif = "http://zesty.ca/pfif/1.4"

start = element pfif:pfif { person* & note* }

person = element pfif:person {
  element pfif:person_record_id { record_id } &
  element pfif:entry_date { time } ? &
  element pfif:expiry_date { time } ? &
  element pfif:author_name { text } ? &
  element pfif:author_email { email } ? &
  element pfif:author_phone { phone } ? &
  element pfif:source_name { text } ? &
  element pfif:source_date { time } &
  element pfif:source_url { url } ? &
  element pfif:full_name { text } &
  element pfif:given_name { text } ? &
  element pfif:family_name { text } ? &
  element pfif:alternate_names { text } ? &
  element pfif:description { text } ? &
  element pfif:sex { sex } ? &
  element pfif:date_of_birth { approx_date } ? &
  element pfif:age { approx_age } ? &
  element pfif:home_street { text } ? &
  element pfif:home_neighborhood { text } ? &
  element pfif:home_city { text } ? &
  element pfif:home_state { text } ? &
  element pfif:home_postal_code { text } ? &
  element pfif:home_country { country_code } ? &
  element pfif:photo_url { url } ? &
  element pfif:profile_urls { text } ? &
  note*
}

note = element pfif:note {
  element pfif:note_record_id { record_id } &
  element pfif:person_record_id { record_id } ? &
  element pfif:linked_person_record_id { record_id } ? &
  element pfif:entry_date { time } ? &
  element pfif:author_name { text } &
  element pfif:author_email { email } ? &
  element pfif:author_phone { phone } ? &
  element pfif:source_date { time } &
  element pfif:author_made_contact { boolean } ? &
  element pfif:status { status } ? &
  element pfif:email_of_found_person { email } ? &
  element pfif:phone_of_found_person { phone } ? &
  element pfif:last_known_location { text } ? &
  element pfif:text { text } &
  element pfif:photo_url { url } ?
}

record_id = xsd:string { pattern = ".+/.+" }
time = xsd:dateTime { pattern = "\d\d\d\d-\d\d-\d\dT\d\d:\d\d:\d\d(\.\d+)?Z" }
email = xsd:string { pattern = ".+@.+" }
phone = xsd:string { pattern = "[\-+()\d ]+" }
url = text
sex = "female" | "male" | "other"
approx_date = xsd:string { pattern = "\d\d\d\d(-\d\d(-\d\d)?)?" }
approx_age = xsd:string { pattern = "\d+(-\d+)?" }
country_code = xsd:string { pattern = "[A-Z][A-Z]" }
boolean = "true" | "false"
status = "information_sought" | "is_note_author" |
         "believed_alive" | "believed_missing" | "believed_dead"
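For applications without a RELAX NG validator at hand, the pattern facets of the schema can be approximated with Python regular expressions. This is a rough sketch: the dictionary and function names are assumptions, Python's regex dialect is not identical to that of XML Schema, and a pattern match on "time" does not check that the value is a real calendar date as xsd:dateTime would.

```python
import re

# Patterns copied from the schema's datatype definitions.
PATTERNS = {
    "record_id": r".+/.+",
    "time": r"\d\d\d\d-\d\d-\d\dT\d\d:\d\d:\d\d(\.\d+)?Z",
    "email": r".+@.+",
    "phone": r"[\-+()\d ]+",
    "approx_date": r"\d\d\d\d(-\d\d(-\d\d)?)?",
    "approx_age": r"\d+(-\d+)?",
    "country_code": r"[A-Z][A-Z]",
}

ENUMERATIONS = {
    "sex": {"female", "male", "other"},
    "boolean": {"true", "false"},
    "status": {
        "information_sought", "is_note_author",
        "believed_alive", "believed_missing", "believed_dead",
    },
}


def matches(type_name, value):
    """Return True if `value` fits the named schema datatype."""
    if type_name in ENUMERATIONS:
        return value in ENUMERATIONS[type_name]
    # XML Schema patterns match the whole value, hence fullmatch.
    return re.fullmatch(PATTERNS[type_name], value) is not None
```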
6. Atom feed specifications
PFIF XML documents can be embedded into
Atom 1.0 feeds.
The PFIF document should be embedded using an XML namespace
and inserted as an immediate child
of the entry element.
Atom 1.0 defines a top-level feed element
that contains any number
of entry elements.
The top-level element should declare the PFIF namespace.
The recommended prefix is pfif,
so the top-level element should look like this:
<feed xmlns="http://www.w3.org/2005/Atom"
xmlns:pfif="http://zesty.ca/pfif/1.4">
...
</feed>
The rest of this section offers recommendations
on how applications should populate the standard Atom elements
so that the feed will make sense to existing feed-reading software.
Nonetheless, the embedded PFIF document takes precedence
over any redundant information that appears in Atom elements.
Two kinds of PFIF Atom feeds are defined here:
person feeds in which each entry contains a person,
and note feeds in which each entry contains a note.
A person feed is roughly analogous
to a blog feed containing blog entries;
a note feed is roughly analogous
to a comment feed on a particular blog entry.
For example, one application might subscribe to a person feed
in order to aggregate missing person records from other databases;
another application might subscribe to a note feed
in order to display a stream of notes with updates about a particular person.
6.1. Atom person feeds
An Atom person feed provides at least the following elements
within the feed element:
- id
-
This element should contain a unique URI associated with this feed.
This might be the URL to the website
that corresponds to the database or service providing this feed.
- title
-
This element should contain the name of this feed.
This should include the title of the database or service
providing this feed.
- subtitle
-
This element should contain a phrase or sentence describing this feed.
This would be the place to explain how this feed is produced,
for example:
"Scraped daily by FooMatic 2.3 from http://example.org/".
- updated
-
This element should contain the date and time in UTC
that this feed was last updated,
given in "yyyy-mm-ddThh:mm:ssZ" format.
- link
-
This element should contain a URL from which this feed can be retrieved.
This element should have a rel attribute whose value is self.
An Atom person feed provides at least the following elements
within each entry element:
- pfif:person
-
This element contains child elements for the fields of the
person record,
as well as zero or more
pfif:note elements.
A service wishing to provide a complete export
would include all the note records
associated with the person here.
- id
-
This element should contain a URI string consisting of the
scheme "pfif:" followed by the value of the
person_record_id field.
- title
-
This element should contain the value of the
full_name field.
- author
-
This element should contain a
name element containing the value of the
author_name field and an
email element containing the value of the
author_email field
in the person record.
- updated
-
This element should contain the value of the
source_date field
in the person record.
- content
-
This element should contain a human-readable
HTML formatting of the information
in the person record.
It is up to the application to decide how to format the content.
- source
-
This element should contain a copy of the
title element of this feed.
This element may also contain copies of any
other child elements of the feed element.
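The entry-level recommendations above might be applied as in the following sketch, which builds one Atom entry from a person record held as a dictionary, using Python's standard xml.etree.ElementTree module. The function name, the dictionary representation, and the very simple HTML content are assumptions made for this example.

```python
import html
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
PFIF_NS = "http://zesty.ca/pfif/1.4"
ET.register_namespace("", ATOM_NS)       # Atom as the default namespace
ET.register_namespace("pfif", PFIF_NS)   # recommended prefix


def atom_person_entry(person, feed_title):
    """Build an Atom entry element for one person record."""
    def atom(parent, tag, text=None):
        element = ET.SubElement(parent, f"{{{ATOM_NS}}}{tag}")
        element.text = text
        return element

    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    pfif_person = ET.SubElement(entry, f"{{{PFIF_NS}}}person")
    for name, value in person.items():
        ET.SubElement(pfif_person, f"{{{PFIF_NS}}}{name}").text = value
    atom(entry, "id", "pfif:" + person["person_record_id"])
    atom(entry, "title", person["full_name"])
    author = atom(entry, "author")
    atom(author, "name", person.get("author_name", ""))
    if person.get("author_email"):
        atom(author, "email", person["author_email"])
    atom(entry, "updated", person["source_date"])
    # The formatting of the content is left to the application.
    content = atom(entry, "content", "<p>" + html.escape(person["full_name"]) + "</p>")
    content.set("type", "html")
    source = atom(entry, "source")
    atom(source, "title", feed_title)
    return entry
```

Calling ET.tostring(entry, encoding="unicode") on the result serializes the entry for inclusion in a feed.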
6.2. Atom note feeds
An Atom note feed provides at least the following elements
within the feed element:
- id
-
This element should contain a unique URI associated with this feed.
This might be the URL to the website
that corresponds to the database or service providing this feed.
- title
-
This element should contain the name of this feed.
This should include the title of the database or service
providing this feed,
followed by a more specific title that describes how the notes
were selected from the database or service.
For example, for a note feed about a particular person,
the title could be the title of the service
followed by the first name and last name of the person in question.
- subtitle
-
This element should contain a phrase or sentence describing this feed.
This would be the place to explain how this feed is produced,
for example: "Exported by CiviCRM 1.1, http://www.example.org/."
- updated
-
This element should contain the date and time in UTC
that this feed was last updated,
given in "yyyy-mm-ddThh:mm:ssZ" format.
- link
-
This element should contain a URL from which this feed can be retrieved.
This element should have a rel attribute whose value is self.
An Atom note feed provides at least the following elements
within each entry element:
- pfif:note
-
This element contains child elements for the fields of the
note record.
- id
-
This element should contain a URI string consisting of the
scheme "pfif:" followed by the value of the
note_record_id field.
- title
-
This element should contain an excerpt
of the text field.
- author
-
This element should contain a name element
containing the value of the author_name field
and an email element
containing the value of the author_email field
in the note record.
- updated
-
This element should contain the value of the
source_date field
in the note record.
- content
-
This element should contain an HTML formatting of the
text field
in the note record.
It is up to the application to decide how to format the content.
7. RSS feed specifications
PFIF XML documents can be embedded into
RSS 2.0 feeds.
(In RSS 2.0 terminology, this section defines an RSS 2.0 module.)
The PFIF document should be specified using an XML namespace
and embedded as an immediate child
of the item element.
RSS 2.0 defines two main elements,
channel and item,
that are enclosed in a top-level rss element.
The top-level element should declare the PFIF namespace.
The recommended prefix is pfif,
so the top-level element should look like this:
<rss version="2.0" xmlns:pfif="http://zesty.ca/pfif/1.4">
...
</rss>
The rest of this section offers recommendations
on how applications should populate the standard RSS elements
so that the feed will make sense to existing feed-reading software.
Nonetheless, the embedded PFIF document takes precedence
over any redundant information that appears in RSS elements.
As in the preceding section,
two kinds of PFIF RSS feeds are defined here:
person feeds in which each item contains a person,
and note feeds in which each item contains a note.
7.1. RSS person feeds
An RSS person feed provides at least the following elements
within the channel element:
- title
-
This element should contain the name of this feed,
which should include the title of the database or service
providing this feed.
- description
-
This element should contain a phrase or sentence describing this feed.
This is the place to explain how this feed is produced, for example:
"Scraped daily by FooMatic 2.3 from http://example.org/".
- lastBuildDate
-
This element should contain the date and time in UTC
that this feed was last updated,
given in RFC 822
date format, for example: "Sat, 07 Sep 2002 00:00:01 GMT".
- link
-
This element should contain a URL to the website
that corresponds to the database or service providing this feed.
An RSS person feed provides at least the following elements
within each item element:
- pfif:person
-
This element contains child elements for the fields of the
person record,
as well as zero or more
pfif:note elements.
A service wishing to provide a complete export
would include all the note records
associated with the person here.
- guid
-
This element should contain the value of the
person_record_id field.
- title
-
This element should contain the value of the
full_name field.
- author
-
This element should contain the value of the
author_email field,
followed by a space and the value of the
author_name field enclosed in parentheses.
- pubDate
-
This element should contain the date in the
source_date field
in the person record,
converted to RFC 822
date format, for example: "Sat, 07 Sep 2002 00:00:01 GMT".
The timezone MUST be GMT and the year MUST have four digits.
- description
-
This element should contain a human-readable
HTML formatting of the information
in the person record.
It is up to the application to decide how to format the description.
- source
-
This element should contain the value of the
source_name field.
- link
-
This element should contain the value of the
source_url field.
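The pubDate and author conventions above can be produced with Python's standard library, as in this sketch. The function names are hypothetical, and the source_date is assumed to carry whole seconds only.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def to_rfc822_gmt(pfif_time):
    """Convert "yyyy-mm-ddThh:mm:ssZ" to an RFC 822 date with a GMT zone."""
    parsed = datetime.strptime(pfif_time, "%Y-%m-%dT%H:%M:%SZ")
    # usegmt=True requires an aware datetime in UTC and emits "GMT"
    # with a four-digit year.
    return format_datetime(parsed.replace(tzinfo=timezone.utc), usegmt=True)


def rss_author(author_email, author_name):
    """Format the RSS author element: e-mail, space, name in parentheses."""
    return f"{author_email} ({author_name})"
```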
7.2. RSS note feeds
An RSS note feed provides at least the following elements
within the channel element:
- title
-
This element should contain the name of this feed.
This should include the title of the database or service
providing this feed,
followed by a more specific title that describes how the notes
were selected from the database or service.
For example, for a note feed about a particular person,
the title could be the title of the service
followed by the first name and last name of the person in question.
- description
-
This element should contain a phrase or sentence describing the feed.
This is the place to explain how the feed is produced, for example:
"Scraped daily by FooMatic 2.3 from http://www.example.org/".
- lastBuildDate
-
This element should contain the date and time in UTC
that this feed was last updated,
given in RFC 822
date format, for example: "Sat, 07 Sep 2002 00:00:01 GMT".
- link
-
This element should contain a URL to the website
that corresponds to the database or service providing this feed.
For a note feed about a particular person,
this link could point to the web page for that person's record.
An RSS note feed provides at least the following elements
within each item element:
- pfif:note
-
This element contains child elements for the fields of the
note record.
- guid
-
This element should contain the value of the
note_record_id field.
- author
-
This element should contain the value of the
author_email field,
followed by a space and the value of the
author_name field enclosed in parentheses.
- pubDate
-
This element should contain the date in the
source_date field
in the note record,
converted to RFC 822
date format, for example: "Sat, 07 Sep 2002 00:00:01 GMT".
The timezone MUST be GMT and the year MUST have four digits.
- description
-
This element should contain an HTML formatting of the
text field
in the note record.
It is up to the application to decide how to format the description.
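The item elements listed above could be assembled along the following lines.
This is a minimal sketch, not a reference implementation:
the function names rfc822_gmt and note_item are hypothetical,
the note is assumed to be a plain dictionary keyed by PFIF field name,
and source_date is assumed to be a UTC timestamp
of the form yyyy-mm-ddThh:mm:ssZ.
The pfif:note child element is omitted for brevity.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from html import escape
import xml.etree.ElementTree as ET


def rfc822_gmt(pfif_date):
    """Convert 'yyyy-mm-ddThh:mm:ssZ' (assumed input format) to an
    RFC 822 date with a four-digit year and the GMT timezone."""
    parsed = datetime.strptime(pfif_date, "%Y-%m-%dT%H:%M:%SZ")
    return format_datetime(parsed.replace(tzinfo=timezone.utc), usegmt=True)


def note_item(note):
    """Build the guid, author, pubDate, and description children
    of an RSS item from a note record given as a dict."""
    item = ET.Element("item")
    ET.SubElement(item, "guid").text = note["note_record_id"]
    # author_email, a space, then author_name in parentheses.
    ET.SubElement(item, "author").text = "%s (%s)" % (
        note["author_email"], note["author_name"])
    ET.SubElement(item, "pubDate").text = rfc822_gmt(note["source_date"])
    # One possible HTML formatting of the text field; the choice of
    # markup is up to the application.
    ET.SubElement(item, "description").text = "<p>%s</p>" % escape(note["text"])
    return item
```

Calling ET.tostring(note_item(note)) serializes the item;
ElementTree escapes the HTML in the description element when it does so.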
8. Suggested relational database schema
This section suggests a possible relational database schema
for storing PFIF data.
The exact details of a database design are up to each application;
this is one possible starting point.
A relational database could store PFIF records in two tables,
person and
note, for the two types of records.
PERSON table:
string person_record_id primary key
datetime entry_date
datetime expiry_date
string author_name
string author_email
string author_phone
string source_name
datetime source_date
string source_url
string full_name
string given_name
string family_name
string alternate_names
text description
string sex
string date_of_birth
string age
string home_street
string home_neighborhood
string home_city
string home_state
string home_postal_code
string home_country
string photo_url
string profile_urls
NOTE table:
string note_record_id primary key
string person_record_id foreign key not null
string linked_person_record_id foreign key or null
datetime entry_date
string author_name
string author_email
string author_phone
datetime source_date
boolean author_made_contact
string status
string email_of_found_person
string phone_of_found_person
string last_known_location
text text
string photo_url
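As one illustration, the two tables above could be created in SQLite
from Python as sketched below.
The column types are assumptions made for this example:
string, text, and datetime columns are mapped to TEXT
and the boolean column to INTEGER,
and the function name create_pfif_tables is hypothetical.
Other database engines will call for different types.

```python
import sqlite3

# SQLite parses the REFERENCES clauses but only enforces them
# when "PRAGMA foreign_keys = ON" is set on the connection.
SCHEMA = """
CREATE TABLE person (
    person_record_id TEXT PRIMARY KEY,
    entry_date TEXT,
    expiry_date TEXT,
    author_name TEXT,
    author_email TEXT,
    author_phone TEXT,
    source_name TEXT,
    source_date TEXT,
    source_url TEXT,
    full_name TEXT,
    given_name TEXT,
    family_name TEXT,
    alternate_names TEXT,
    description TEXT,
    sex TEXT,
    date_of_birth TEXT,
    age TEXT,
    home_street TEXT,
    home_neighborhood TEXT,
    home_city TEXT,
    home_state TEXT,
    home_postal_code TEXT,
    home_country TEXT,
    photo_url TEXT,
    profile_urls TEXT
);
CREATE TABLE note (
    note_record_id TEXT PRIMARY KEY,
    person_record_id TEXT NOT NULL REFERENCES person(person_record_id),
    linked_person_record_id TEXT REFERENCES person(person_record_id),
    entry_date TEXT,
    author_name TEXT,
    author_email TEXT,
    author_phone TEXT,
    source_date TEXT,
    author_made_contact INTEGER,
    status TEXT,
    email_of_found_person TEXT,
    phone_of_found_person TEXT,
    last_known_location TEXT,
    "text" TEXT,
    photo_url TEXT
);
"""


def create_pfif_tables(conn):
    """Create the person and note tables on an open connection."""
    conn.executescript(SCHEMA)
```

For example, create_pfif_tables(sqlite3.connect(":memory:"))
sets up an empty in-memory database with both tables.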
To link a foreign person record
with a local person record,
the application adds a note
associated with the local person record,
with a linked_person_record_id field
containing the person_record_id
of the foreign record.
The other fields of the note
describe the circumstances of the decision to merge:
source_date indicates the date of the decision,
text gives the reason for the decision,
and author_name
names the person, program, or other entity that made the decision.
This specification does not dictate how an application
would decide whether to merge two records;
a merge could be initiated by a human operator
or by a software algorithm that looks for records with similar data.
Recording the merge decision
in a note record
makes it possible to back out of a bad merge decision,
and recording the name of the person or program in the
author_name field
makes it possible to track down the cause of an incorrect merge.
When displaying a person record,
the application can then look for
all the non-empty linked_person_record_id
fields among the notes that belong to that person record,
and display all the linked records or a merged view of the linked records.
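The linking procedure described above can be sketched as follows.
This is an illustration only:
records are assumed to be plain dictionaries keyed by PFIF field name,
and the helper names make_link_note and linked_person_ids
are hypothetical.

```python
def make_link_note(note_record_id, local_id, foreign_id,
                   author_name, reason, decided_at):
    """Build a note that links a foreign person record to a local one."""
    return {
        "note_record_id": note_record_id,
        "person_record_id": local_id,           # the local record
        "linked_person_record_id": foreign_id,  # the foreign record
        "author_name": author_name,             # who or what decided
        "source_date": decided_at,              # when the decision was made
        "text": reason,                         # why the records were linked
    }


def linked_person_ids(notes, person_record_id):
    """Return the non-empty linked_person_record_id values among
    the notes that belong to the given person record."""
    return [
        note["linked_person_record_id"]
        for note in notes
        if note.get("person_record_id") == person_record_id
        and note.get("linked_person_record_id")
    ]
```

Because the link lives in an ordinary note,
an application that ignores or removes that note
no longer treats the two records as linked,
which is what makes a bad merge reversible.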
9. Changes from previous versions
9.1. Changes from PFIF 1.1 to PFIF 1.2
person records gained four new fields:
sex,
date_of_birth,
age, and
home_country.
The home_zip field
was replaced with home_postal_code.
To upgrade from a PFIF 1.1 repository,
export the old home_zip values in the
home_postal_code field
and set the home_country field
to US
in records
whose home_state refers to a U.S. state
or whose home_postal_code field contains a U.S. ZIP code.
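A minimal sketch of this upgrade step, assuming person records
are held as dictionaries keyed by PFIF field name.
The function name upgrade_person is hypothetical,
the set of state abbreviations is a deliberately incomplete placeholder,
and the ZIP code pattern is only a heuristic,
since other countries also use five-digit postal codes.

```python
import re

# Illustrative subset only; a real upgrade would list every U.S. state
# in whatever spellings the repository actually contains.
US_STATES = {"CA", "LA", "MS", "NY", "TX"}

# Five digits, optionally followed by a four-digit extension.
US_ZIP = re.compile(r"^\d{5}(-\d{4})?$")


def upgrade_person(record):
    """Return a copy of a PFIF 1.1 person record using PFIF 1.2 field names."""
    upgraded = {k: v for k, v in record.items() if k != "home_zip"}
    postal_code = record.get("home_zip", "")
    if postal_code:
        upgraded["home_postal_code"] = postal_code
    state = record.get("home_state", "").strip().upper()
    if state in US_STATES or US_ZIP.match(postal_code.strip()):
        upgraded["home_country"] = "US"
    return upgraded
```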
note records gained three new fields:
person_record_id,
linked_person_record_id, and
status.
In the PFIF XML format,
note elements became allowed
outside of person elements.
Aside from the
note_record_id and
person_record_id fields,
which had to appear first,
the rest of the child elements became permissible in any order.
Atom entries and RSS items came to contain
individual pfif:person and
pfif:note elements
with no enclosing pfif:pfif element.
9.2. Changes from PFIF 1.2 to PFIF 1.3
The source_date field became mandatory
on person records.
Records can be updated by (and only by) their original repository,
and the source_date must be updated
when a record changes.
person records
gained the mandatory full_name field;
first_name and
last_name became optional.
person records
gained the new expiry_date field,
with conformance requirements for data deletion
and propagation of the expiry date.
In the PFIF XML format,
all the child elements of
person elements and
note elements
became permissible in any order.
9.3. Changes from PFIF 1.3 to PFIF 1.4
person records
gained the optional alternate_names field
and the optional profile_urls field.
In person records,
the first_name field
was renamed to given_name
and the last_name field
was renamed to family_name.
In person records,
the description field
replaced the old other field.
note records
gained the photo_url field.
In note records,
the found field was renamed to
author_made_contact.
In note records,
there is now a convention for specifying geographic coordinates
in the existing last_known_location field.
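The field renames listed above lend themselves to a simple mapping.
The sketch below assumes records are dictionaries keyed by field name;
rename_fields and the two mapping constants are hypothetical names.
It covers only the pure renames.
The change from other to description is left out
because it is described as a replacement rather than a rename.

```python
# PFIF 1.3 field name -> PFIF 1.4 field name
PERSON_RENAMES = {"first_name": "given_name", "last_name": "family_name"}
NOTE_RENAMES = {"found": "author_made_contact"}


def rename_fields(record, renames):
    """Return a copy of the record with old field names replaced;
    fields without an entry in the mapping are kept unchanged."""
    return {renames.get(name, name): value for name, value in record.items()}
```

For example, rename_fields(person, PERSON_RENAMES) turns a record
with first_name and last_name into one with given_name and family_name.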
10. Acknowledgements
The initial data model on which the first version of PFIF was based
is due to the CiviCRM team, David Geilhufe, and Kieran Lal.
Luke Blanshard, Tony Chang, Josh Kleinpeter,
Kieran Lal, Jonathan Plax, Gabe Wachob, Ka-Ping Yee,
Steve Hakusa, Mark Prutsalis, Lee Schumacher,
the Missing Persons Community of Interest
(tci_missingpersonsspam@googlegroups.com),
and other participants on the working group list
(pfifspam@googlegroups.com)
contributed to the current design of PFIF.