The mset Attribute

Editor’s Draft, 24 July 2014

This version:
http://berkmancenter.github.io/cache-link/mset-attribute.html
Editors:
Ryan Westphal (Berkman Center for Internet and Society)
Herbert Van de Sompel (Los Alamos National Laboratory)
Michael L. Nelson (Old Dominion University)
Version History:
Commit History
Participate:
Public mailing list
Github

Abstract

The mset attribute is intended to give authors a way to provide temporal context to the target content of a hyperlink, links to copies of the target content of a hyperlink, or both. The primary purpose of the mset attribute is to address Internet censorship, link rot, content drift, and denial-of-service attacks.

1 Introduction

This specification provides developers with a means to declare a temporal reference for a hyperlink and zero or more copy links.

This is achieved by introducing mset to HTML, as an attribute for the a element to support specifying dates and multiple copies of an external resource. Copies can vary in presentation, format, and origin.

The a element remains backwards compatible with legacy user agents, which ignore the mset attribute. The presence of the mset attribute does not affect the default user agent behavior for navigating a hyperlink.

The mset attribute differs significantly from the link element, which also relates documents to one another: the mset attribute relates the target of the a element to other documents, whereas the link element relates the current document to other documents.

1.1 When to use mset

The mset attribute can be used any time a publisher authors a document which references other content via hyperlinks. At its most basic, the publisher can provide only the date when target content was referenced while authoring a document.

If one or more copies of the target content are known to exist, the mset attribute should be used to embed the locations of these copies into the a element in the current document.

The mset attribute is not a replacement for the href attribute. When the original date of reference is unknown and there are no known copies of target content, developers should omit the mset attribute.

1.2 Examples of usage

The mset attribute can be set to one or more reference candidates separated by "," (U+002C) characters. These examples show some (but not all) variations of the mset attribute.

This example shows the most basic usage of the mset attribute to declare the date external content was referenced while authoring the current document. The locations of existing copies can be omitted.
<a
  href="http://www.w3.org/MarkUp/html-spec/html-spec_toc.html"
  mset="2014-03-17">
  HTML 2.0
</a>
This example shows a basic usage of the mset attribute to declare the location of a copy of target content. The date of reference while authoring the current document can be omitted; however, including it is recommended.
<a
  href="http://www.w3.org/MarkUp/html-spec/html-spec_toc.html"
  mset="http://web.archive.org/web/20140211110349/http://www.w3.org/MarkUp/html-spec/html-spec_toc.html">
  HTML 2.0
</a>
This example shows that both the date of reference and location of a copy can be used in the same mset attribute. They should be separated by a "," (U+002C) character.
<a
  href="http://www.w3.org/MarkUp/html-spec/html-spec_toc.html"
  mset="2014-03-17,
    http://web.archive.org/web/20140211110349/http://www.w3.org/MarkUp/html-spec/html-spec_toc.html">
  HTML 2.0
</a>
This example points to more than one copy of external target content by separating multiple reference candidates with "," (U+002C) characters.
<a
  href="http://www.w3.org/MarkUp/html-spec/html-spec_toc.html"
  mset="http://web.archive.org/web/20140211110349/http://www.w3.org/MarkUp/html-spec/html-spec_toc.html,
    http://perma.cc/RL33-W794">
  HTML 2.0
</a>
This example shows usage of the optional datetime value to declare the date the copy of the external content was made, if it is known. A datetime value is separated from the copy [URL] by one or more space characters.
<a
  href="http://www.w3.org/MarkUp/html-spec/html-spec_toc.html"
  mset="http://web.archive.org/web/20140211110349/http://www.w3.org/MarkUp/html-spec/html-spec_toc.html 2014-02-14,
  http://perma.cc/RL33-W794 2014-03-10">
  HTML 2.0
</a>
This example declares the date external content was referenced as well as two copies and the dates those copies were made.
<a
  href="http://www.w3.org/MarkUp/html-spec/html-spec_toc.html"
  mset="2014-03-17,
  http://web.archive.org/web/20140211110349/http://www.w3.org/MarkUp/html-spec/html-spec_toc.html 2014-02-14,
  http://perma.cc/RL33-W794 2014-03-10">
  HTML 2.0
</a>
This example shows that, if a location of a copy of target content is known, a developer can include the relationship between the target content and the copy. The relationship must be one of the keywords defined by link type or HTML5 link type extension definitions.

The memento link type, for example, is an extension which declares that the copy is "a fixed resource that will no longer change state."

<a
  href="http://www.w3.org/MarkUp/html-spec/html-spec_toc.html"
  mset="http://perma.cc/RL33-W794 memento">
  HTML 2.0
</a>
This example shows that, if a location of a copy of target content is known, the date the copy of external content was made is known, and the relationship between the target content and the copy is known, a developer can include all three pieces of information in one reference candidate.
<a
  href="http://www.w3.org/MarkUp/html-spec/html-spec_toc.html"
  mset="http://perma.cc/RL33-W794 2014-03-10 memento">
  HTML 2.0
</a>
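
This example is informative. The variations above share one shape: an optional temporal reference followed by zero or more copy links. A minimal sketch of a helper that assembles such a value is given here in Python. The function name build_mset and the tuple layout are hypothetical and are not defined by this specification.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical layout: (copy URL, date the copy was made, link type).
Copy = Tuple[str, Optional[str], Optional[str]]


def build_mset(referenced: Optional[str] = None,
               copies: Iterable[Copy] = ()) -> Optional[str]:
    """Assemble an mset value, or return None when the attribute should be omitted."""
    candidates = []
    if referenced:
        candidates.append(referenced)  # temporal reference
    for url, made, link_type in copies:
        # Copy link: URL first, then the optional date, then the optional link type.
        parts = [url] + [p for p in (made, link_type) if p]
        candidates.append(" ".join(parts))
    # Reference candidates are separated by "," (U+002C).
    return ", ".join(candidates) if candidates else None


print(build_mset("2014-03-17",
                 [("http://perma.cc/RL33-W794", "2014-03-10", "memento")]))
# prints: 2014-03-17, http://perma.cc/RL33-W794 2014-03-10 memento
```

Returning None when neither a date nor a copy is known mirrors the advice in section 1.1 to omit the attribute in that case.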

2 Definitions

The following terms are used throughout this specification so they are gathered here for the reader’s convenience. The following list of terms is not exhaustive; other terms are defined throughout this specification.

The following terms are defined by the [HTML] specification:

browsing context, skip whitespace, collect a sequence of characters, set of comma-separated tokens, space characters, split a string on spaces, hyperlink, following hyperlinks, resolve a URL, navigate, fetch, datetime value, valid non-empty URL, link type, HTML5 link type extension, Context menus, menu element, contextmenu event, show event, contextmenu attribute, and show.

3 Additions to the a Element

The a element (when an href attribute is present) represents a hyperlink to other parts of the current document or a hyperlink to other documents.

3.1 The mset Attribute

partial interface HTMLAnchorElement {
  attribute DOMString mset;
};

If the a element is a hyperlink, the mset attribute can also be present. If present, its value must consist of one or more reference candidates, each separated from the next by a "," (U+002C) COMMA character. This attribute allows authors to provide a temporal context to the target content of a hyperlink, links to copies of the target content of a hyperlink, or both.

The mset attribute must be omitted if the href attribute is not present.
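
This example is informative. An authoring tool could check the requirement above with a small lint pass. The sketch below uses Python's standard html.parser module; the class name MsetLint and its problems list are hypothetical.

```python
from html.parser import HTMLParser


class MsetLint(HTMLParser):
    """Record the line number of every a element that has mset but no href."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # HTMLParser lowercases tag and attribute names before calling this hook.
        names = {name for name, _ in attrs}
        if "mset" in names and "href" not in names:
            line, _offset = self.getpos()
            self.problems.append(line)


lint = MsetLint()
lint.feed('<a mset="2014-03-17">no target</a>\n'
          '<a href="http://example.org/" mset="2014-03-17">ok</a>')
print(len(lint.problems))  # one offending element
```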

A reference candidate declares one of two additional properties of the external resource of a hyperlink:

temporal reference
The date on which the external resource was referenced while the current document, article, or section was being authored.
copy link
The location of a copy of the content of the external resource.

A reference candidate must not be both a temporal reference and a copy link.

If the reference candidate is a temporal reference, it must consist of the following components, in order:

  1. Zero or more space characters.
  2. A datetime value representing the date the external resource was referenced.
  3. Zero or more space characters.

If the reference candidate is a copy link, it must consist of the following components, in order:

  1. Zero or more space characters.
  2. A valid non-empty URL referencing a copy of the content of the hyperlink’s external resource.
  3. Zero or more space characters.
  4. Optionally, a datetime value representing the date at which the copy was made.
  5. Zero or more space characters.
  6. Optionally, a link type representing the relationship between the external resource of the hyperlink and the copy specified in this reference candidate.
  7. Zero or more space characters.
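
This example is informative. The two component lists above can be read as a small grammar. The sketch below parses an mset value under stated assumptions: only YYYY-MM-DD dates are recognised (the full datetime value syntax is wider), copy URLs contain no literal comma, and Python's str.split() stands in for splitting on space characters. The names TemporalReference, CopyLink, and parse_mset are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Union

# Assumption: dates look like the ones in the examples (YYYY-MM-DD).
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


@dataclass
class TemporalReference:
    date: str


@dataclass
class CopyLink:
    url: str
    date: Optional[str] = None
    link_type: Optional[str] = None


def parse_mset(value: str) -> List[Union[TemporalReference, CopyLink]]:
    candidates = []
    for raw in value.split(","):      # candidates are separated by U+002C
        tokens = raw.split()          # drops leading and trailing whitespace
        if not tokens:
            continue                  # tolerate an empty candidate
        if DATE_RE.fullmatch(tokens[0]):
            # A temporal reference is a datetime value and nothing else.
            if len(tokens) != 1:
                raise ValueError("unexpected tokens after date: %r" % raw)
            candidates.append(TemporalReference(tokens[0]))
            continue
        url, rest = tokens[0], tokens[1:]
        date = rest.pop(0) if rest and DATE_RE.fullmatch(rest[0]) else None
        link_type = rest.pop(0) if rest else None
        if rest:
            raise ValueError("too many tokens in copy link: %r" % raw)
        candidates.append(CopyLink(url, date, link_type))
    return candidates


for candidate in parse_mset(
        "2014-03-17,\n  http://perma.cc/RL33-W794 2014-03-10 memento"):
    print(candidate)
```

A candidate is classified by its first token, so a reference candidate can never be both a temporal reference and a copy link.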

4 Processing model

This specification defines a method for including additional references to copies of a hyperlink’s target content. How a user is given access to these references is not defined by this specification. However, this section suggests behaviors which user agents can choose to adopt to make the references discoverable.

The presence of the mset attribute must not affect the default user agent behavior described in following hyperlinks.

For example, following a hyperlink to missing content must still trigger a GET request to the target, which results in a response with a 404 status code, and present the user with the target server’s 404 page if one exists.

4.2 Suggested primary user agent behavior

In the case where, after following a hyperlink, the target content either no longer exists or is not what was expected, the user agent can provide access to additional references to copies of the content should the user go back and look for them.

4.2.1 Context menu item

A user agent can reveal additional references in a new context menu item titled, e.g., Open cached copy of link. The menu item can expand to list each copy link and, if provided in the mset attribute, the date on which the copy was made.

Figure: Primary user agent behavior, context menu listing mset references

If a user selects a copy link from the list, the user agent must perform the steps described in following hyperlinks and navigate to the selected [URL].
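
This example is informative. One way a user agent might derive the menu from already parsed copy links is sketched below. The dictionary layout and the function name context_menu_entry are hypothetical; the menu title is the one suggested above.

```python
from typing import Iterable, Optional, Tuple


def context_menu_entry(
        copies: Iterable[Tuple[str, Optional[str]]]) -> Optional[dict]:
    """Build a menu description from (copy URL, date the copy was made) pairs."""
    items = []
    for url, made in copies:
        # Show the date only when the mset attribute provided one.
        label = "%s (%s)" % (url, made) if made else url
        items.append({"label": label, "navigate_to": url})
    if not items:
        return None  # no copy links, so no extra menu item
    return {"title": "Open cached copy of link", "items": items}


entry = context_menu_entry([
    ("http://web.archive.org/web/20140211110349/"
     "http://www.w3.org/MarkUp/html-spec/html-spec_toc.html", None),
    ("http://perma.cc/RL33-W794", "2014-03-10"),
])
for item in entry["items"]:
    print(item["label"])
```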

4.2.2 Relationship to the contextmenu event and the contextmenu attribute

When a user requests a context menu for an a element with an mset attribute (e.g., by right-click or tap and hold), the user agent must first perform the following actions:

  1. fire a contextmenu event at the element, and then
  2. if that event is not canceled and the a element has a contextmenu attribute, fire a show event at the corresponding menu element as described in Context menus

If the contextmenu event is not canceled and the a element does not have a contextmenu attribute, then the user agent should display its own context menu for the hyperlink and should include the suggested primary behavior described above.
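
This example is informative. The ordering described in this section can be summarised as a small decision function. The function name and the action strings below are hypothetical labels, not events or APIs defined by this specification.

```python
from typing import List


def context_menu_actions(event_canceled: bool,
                         has_contextmenu_attribute: bool) -> List[str]:
    """List, in order, what happens when a menu is requested on a[mset]."""
    actions = ["fire contextmenu event at the a element"]
    if event_canceled:
        return actions  # the page handled the request; nothing else is shown
    if has_contextmenu_attribute:
        actions.append("fire show event at the corresponding menu element")
    else:
        actions.append("display user agent menu including copy links")
    return actions


print(context_menu_actions(event_canceled=False,
                           has_contextmenu_attribute=False))
```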

4.3 Suggested secondary user agent behavior

If the user agent encounters an error while following a hyperlink, the user agent should display the copy links defined in the mset attribute of the hyperlink that was clicked. Such errors include cases where the user agent cannot fetch the resource because the web server of the resource cannot be found, can be found but does not respond, or can be found but responds with a server error.
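
This example is informative. The sketch below separates the error cases listed above from an ordinary response. It assumes that "server error" means a 5xx status code and that the absence of any response is represented as None; both are assumptions of this sketch, and the function name should_offer_copies is hypothetical.

```python
from typing import Optional


def should_offer_copies(status: Optional[int]) -> bool:
    """Decide whether the secondary behavior applies to a fetch outcome."""
    if status is None:
        # The server could not be found, or was found but did not respond.
        return True
    # Assumption: a server error is any 5xx response.
    return 500 <= status <= 599


print(should_offer_copies(None), should_offer_copies(503),
      should_offer_copies(404))
# prints: True True False
```

A 404 response is left to the default behavior described at the start of section 4, where the target server’s own 404 page is presented.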

If the user agent elects to do so, then at the time it queues a task to navigate the browsing context to an error page, it should first store the list of copy links with the browsing context. On the error page, the user agent should display hyperlinks or buttons which convey all data provided by each copy link, i.e., the copy link [URL], the date the copy was made, and the link type.

Figure: Example button to open list of copy links
Figure: After clicking, copy links are listed as hyperlinks

If a user clicks a copy link’s hyperlink, the user agent should follow the hyperlink.
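
This example is informative. An error page could list the stored copy links as ordinary hyperlinks. The sketch below renders a list fragment from (URL, date, link type) tuples; the function name render_copy_links and the markup it produces are hypothetical and only one possible presentation.

```python
from html import escape
from typing import Iterable, Optional, Tuple

Copy = Tuple[str, Optional[str], Optional[str]]  # (URL, date made, link type)


def render_copy_links(copies: Iterable[Copy]) -> str:
    """Render copy links as an HTML list for display on an error page."""
    rows = []
    for url, made, link_type in copies:
        # Convey every piece of data the copy link provided.
        details = ", ".join(p for p in (made, link_type) if p)
        text = escape(url) + (" (%s)" % escape(details) if details else "")
        rows.append('<li><a href="%s">%s</a></li>' % (escape(url), text))
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


print(render_copy_links([("http://perma.cc/RL33-W794", "2014-03-10", "memento")]))
```

Because each entry is a plain hyperlink, activating it goes through the usual steps for following hyperlinks.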

5 Acknowledgements

Contributions from: Ryan Westphal, Genève Bergeron, Herbert Van de Sompel, Jack Cushman, Michael Nelson

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119 [RFC2119]. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes.

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[HTML]
Ian Hickson. HTML. Living Standard. URL: http://www.whatwg.org/specs/web-apps/current-work/multipage/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. URL: http://www.ietf.org/rfc/rfc2119.txt
[URL]
Erik Arvidsson; Michael. URL. 24 May 2012. W3C Working Draft. (Work in progress.) URL: http://www.w3.org/TR/2012/WD-url-20120524/
