Docs Style Guide

Spelling and grammar

American spelling and grammar

Whenever U.S. English and British (or other) English spelling or usage disagree, standard U.S. spelling and usage are preferred.

Wrong

The colour of the button is grey.

Right

The color of the button is gray.

@memoize
def check_ukus(text):
    """UK vs. US spelling usage."""
    err = "style-guide.uk-us"
    msg = "uk-vs-us-spell-check. '{}' is the preferred spelling."

    preferences = [
        ["gray",                ["grey"]],
        ["color",               ["colour"]],
        ["accessorizing",       ["accessorising"]],
        ["acclimatization",     ["acclimatisation"]],
        ["acclimatize",         ["acclimatise"]],
        ["acclimatized",        ["acclimatised"]],
    ]

    return preferred_forms_check(text, preferences, err, msg)

    # This is sample code. The complete code can be found in the file:
    # proselint-extra.py.
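The helper ``preferred_forms_check`` is not reproduced in this guide. As a rough, self-contained illustration of the idea (not the actual helper), the sketch below uses a hypothetical function, ``find_nonpreferred``, that reports each non-preferred spelling together with its preferred form and character offset.

```python
import re

# Hypothetical stand-in for illustration only. The real checks rely on
# preferred_forms_check, which is not shown in this guide.
PREFERENCES = [
    ["gray", ["grey"]],
    ["color", ["colour"]],
]


def find_nonpreferred(text, preferences=PREFERENCES):
    """Return (found, preferred, offset) tuples for non-preferred spellings."""
    results = []
    for preferred, variants in preferences:
        for variant in variants:
            # Match whole words only, regardless of letter case.
            pattern = r"\b" + re.escape(variant) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                results.append((match.group(0), preferred, match.start()))
    # Report findings in the order they appear in the text.
    return sorted(results, key=lambda item: item[2])
```

Running it on the Wrong sentence above reports both ``colour`` and ``grey``. The Right sentence produces an empty list.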

Quote marks

Quote marks should generally be avoided if possible.

Smart quotes (also known as curly quotes or directional quotes) are not permitted in source files.

Avoid quote marks

Quote marks are used in prose writing to indicate verbatim text. This is rarely useful in technical writing, as verbatim text usually requires a more specific semantic markup.

Wrong

Click the button that says, "Save."

Right

Click :guilabel:`Save`.

Wrong

You may see an error message that says, "Something went wrong."

Right

You may get an error: ``Something went wrong.``

def check_quotes(text):
    """Avoid using quote marks."""
    err = "style-guide.check-quote"
    msg = "Avoid using quote marks."
    regex = r"\"[a-zA-Z0-9 ]{1,15}\""

    errors = []

    for matchobj in re.finditer(regex, text):
        start = matchobj.start()+1
        end = matchobj.end()
        (row, col) = line_and_column(text, start)
        extent = matchobj.end()-matchobj.start()
        errors += [(err, msg, row, col, start, end,
                         extent, "warning", "None")]

    return errors

Straight quotes

Any time that you do need to use quotation marks, use straight (or plain) quotes. Sphinx and Docutils will output the typographically correct quote style.

def check_curlyquotes(text):
    """Do not use curly quotes."""
    err = "style-guide.check-curlyquote"
    msg = "Do not use curly quotes. If needed use straight quotes."
    regex = r"“[a-zA-Z0-9 ]{1,15}”"

    errors = []

    for matchobj in re.finditer(regex, text):
        start = matchobj.start()+1
        end = matchobj.end()
        (row, col) = line_and_column(text, start)
        extent = matchobj.end()-matchobj.start()
        errors += [(err, msg, row, col, start, end,
                         extent, "warning", "None")]

    return errors
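Where existing text already contains curly quotes, a small helper can normalize them before the file is committed. The sketch below shows one way to do that. The function name ``straighten_quotes`` is hypothetical and is not part of the checks above.

```python
# Map each curly quote character to its straight equivalent.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
})


def straighten_quotes(text):
    """Return text with curly quotes replaced by straight quotes."""
    return text.translate(CURLY_TO_STRAIGHT)
```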

Serial comma

In a comma-separated list of items, the penultimate item should be followed by a comma.

Wrong

Apples, oranges and pears.

Right

Apples, oranges, and pears.

@memoize
def check_comma(text):
    """Use serial comma after penultimate item."""
    err = "style-guide.serial-comma"
    msg = "Use serial comma after penultimate item."
    regex = r",\s[a-zA-Z0-9]+\sand\s"

    return existence_check(text, [regex], err, msg, require_padding=False)

A bulleted list is often clearer than an inline list.

Correct

You will need to be familiar with git, GitHub, and Python.

Possibly Better

You will need to be familiar with:

- git
- GitHub
- Python

There's no hard rule about which to use in any situation. Use your judgment: try it both ways and see which is clearer.

Direct address

Direct address — speaking directly to the reader using the second person "you" — is preferred over passive voice ("it can be done"), first-person plural ("we can do it"), or other constructions.

First-person plural ("we") should only be used when speaking of the ODK project team ("We recommend…").

Ordered and unordered lists

An ordered list is numbered. It should be used when the order of the list is essential, for example, when enumerating a series of steps in a procedure.

Wrong

- First we do this.
- And then we do this.
- And then we do this.

Right

1. Do this.
2. Do this.
3. Do this.

An unordered list is bulleted. It should be used for a collection of items in which order is not essential.

Wrong

1. apples
2. oranges
3. bananas

Right

- apples
- oranges
- bananas

Avoid Latin

Several Latin abbreviations, such as etc., i.e., and e.g., are common in written English.

At best, these present a minor barrier to understanding. This is often made worse by unintentional misuse.

Avoid Latin abbreviations.

Wrong

If you are writing about a specific process (e.g., installing an application)...

Right

If you are writing about a specific process (for example, installing an application)...

@memoize
def check_latin(text):
    """Avoid using Latin abbreviations."""
    err = "style-guide.latin-abbr"
    msg = "Avoid using Latin abbreviations like \"etc.\", \"i.e.\"."

    # Raw strings keep the regex escapes intact.
    abbreviations = [
        r"etc\.", "etc", r"\*etc\.\*", r"\*etc\*",
        r"i\.e\.", "ie", r"\*ie\.\*", r"\*ie\*",
        r"e\.g\.", "eg", r"\*eg\.\*", r"\*eg\*",
        r"viz\.", "viz", r"\*viz\.\*", r"\*viz\*",
        r"c\.f\.", "cf", r"\*cf\.\*", r"\*cf\*",
        r"n\.b\.", "nb", r"\*nb\.\*", r"\*nb\*",
        r"q\.v\.", "qv", r"\*qv\.\*", r"\*qv\*",
        r"ibid\.", "ibid", r"\*ibid\.\*", r"\*ibid\*",
    ]

    return existence_check(text, abbreviations, err, msg, ignore_case=True)

Etc.

Et cetera (or etc.) deserves a special mention.

Et cetera means "and all the rest," and is often used to indicate that there is more that could or should be said, but which is being omitted.

Writers often use etc. to gloss over details of the subject which they are not fully aware of. If you find yourself tempted to use etc., ask yourself if you really understand the thing you are writing about.

Avoid unneeded words

Adverbs

Adverbs often contribute nothing. Common offenders include:

simply

easily

just

very

really

basically

extremely

actually

Wrong

To open the file, simply click the button.

Right

To open the file, click the button.

Wrong

You can easily edit the form by...

Right

To edit the form...

@memoize
def check_adverb(text):
    """Avoid using unneeded adverbs."""
    err = "style-guide.unneed-adverb"
    msg = "Avoid using unneeded adverbs like \"just\", \"simply\"."

    # Named "adverbs" to avoid shadowing the built-in list type.
    adverbs = [
        "simply",
        "easily",
        "just",
        "very",
        "really",
        "basically",
        "extremely",
        "actually",
    ]

    return existence_check(text, adverbs, err, msg, ignore_case=True)

Filler words and phrases

Many words and phrases provide no direct meaning. They are often inserted to make a sentence seem more formal, or to simulate a perceived style of business communication. These should be removed.

Common filler phrases and words include:

to the extent that
for all intents and purposes
when all is said and done
from the perspective of
point in time

This list is not exhaustive. These "canned phrases" are pervasive in technical writing. Remove them whenever they occur.

@memoize
def check_filler(text):
    """Avoid using filler phrases."""
    err = "style-guide.filler-phrase"
    msg = "Avoid using filler phrases like \"to the extent that\"."

    # Named "fillers" to avoid shadowing the built-in list type.
    fillers = [
        "to the extent that",
        "for all intents and purposes",
        "when all is said and done",
        "from the perspective of",
        "point in time",
    ]

    return existence_check(text, fillers, err, msg, ignore_case=True)

Semicolons

Semicolons are used to separate two independent clauses which could stand as individual sentences but which the writer feels would benefit from close proximity.

Semicolons can almost always be replaced with periods (full stops). This rarely diminishes correctness and often improves readability.

Correct

These "canned phrases" are pervasive in technical writing; remove them whenever they occur.

Better

These "canned phrases" are pervasive in technical writing. Remove them whenever they occur.

@memoize
def check_semicolon(text):
    """Avoid using semicolons."""
    err = "style-guide.check-semicolon"
    msg = "Avoid using semicolons."
    regex = ";"

    return existence_check(text, [regex], err, msg, require_padding=False)

Pronouns

Third-person personal pronouns

Third-person personal pronouns are:

he/him/his
she/her/her(s)
they/them/their(s)

Note

While some people consider they/them/their to be non-standard (or "incorrect") as third-person singular, it has gained wide use as a gender-neutral or gender-ambiguous alternative to he or she.

There are two issues with personal pronouns:

gender bias
clarity

To avoid gender bias, the third-person gender-neutral they/them/their(s) is preferred over he or she pronouns when writing about abstract individuals.

Wrong

The enumerator uses his device.

Right

The enumerator uses their device.

Unfortunately, they/them/their is not a perfect solution. Since it is conventionally used as a plural pronoun, it can cause confusion.

Therefore, avoid the use of personal pronouns whenever possible. They are often out of place in technical writing anyway. Rewriting passages to avoid personal pronouns often makes the writing clearer.

Correct

When using Survey, first the enumerator opens the app on their device. Then they complete the survey.

Better

To use Survey:

- open the app
- complete the survey

@memoize
def check_pronoun(text):
    """Avoid using third-person personal pronouns."""
    err = "style-guide.personal-pronoun"
    msg = "Avoid using third-person personal pronouns like \"he\", \"she\". In case of absolute need, prefer using \"they\"."

    # Named "pronouns" to avoid shadowing the built-in list type.
    pronouns = [
        "he",
        "him",
        "his",
        "she",
        "her",
        "hers",
    ]

    return existence_check(text, pronouns, err, msg, ignore_case=True)

Same

Same, when used as an impersonal pronoun, is non-standard in Modern American English. It should be avoided.

Wrong

ODK-X Survey is an Android app. The same can be used for...

Right

ODK-X Survey is an Android app. It can be used for...

Right

ODK-X Survey is an Android app that is used to...

@memoize
def check_same(text):
    """Avoid using "same" as an impersonal pronoun."""
    err = "style-guide.check-same"
    msg = "Avoid using \"The same\"."
    regex = r"\. The same"

    return existence_check(text, [regex], err, msg, ignore_case=False,
                       require_padding=False)

Titles

Title case and sentence case

Document titles should be in Title Case – that is, all meaningful words are to be capitalized.

Section titles should use Sentence case – that is, only the first word should be capitalized, along with any proper nouns or other words usually capitalized in a sentence.
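The sentence-case rule can be approximated in code. The sketch below is only an illustration: the function ``is_sentence_case`` and the ``PROPER_NOUNS`` allow list are hypothetical, and a real check would need a fuller list of names that keep their capitals.

```python
# Hypothetical allow list of words that keep their capital letters.
PROPER_NOUNS = {"ODK-X", "Survey", "Tables", "Services", "Android", "GitHub", "Python"}


def is_sentence_case(title, proper_nouns=PROPER_NOUNS):
    """Return True if only the first word and known proper nouns are capitalized."""
    words = title.split()
    if not words:
        return True
    # The first word must not start with a lowercase letter.
    if words[0][0].islower():
        return False
    # Every later word must be lowercase unless it is a known proper noun.
    for word in words[1:]:
        if word in proper_nouns:
            continue
        if word[0].isupper():
            return False
    return True
```

With this allow list, ``Installing ODK-X Survey`` passes and ``Other Titling Considerations`` does not.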

Verb forms

If a document or section describes a procedure that someone might do, use a verb ending in -ing. (That is, a gerund.) Do not use the "How to…" construction.

Wrong

How to install ODK-X Survey
---------------------------

Right

Installing ODK-X Survey
-----------------------

If a section title is a directive to do something (for example, as a step in a procedure), use an imperative.

Installing ODK Aggregate
------------------------

Download ODK Aggregate
~~~~~~~~~~~~~~~~~~~~~~

Section content here.

@memoize
def check_howto(text):
    """Avoid using the "How to" construction."""
    err = "style-guide.check-howto"
    msg = "Avoid using \"How to\" construction."
    regex = r"(How to.*)(\n)([=~\-\"\*]+)"

    return existence_check(text, [regex], err, msg, require_padding=False)

Section labels

Section titles should almost always be preceded by labels.

The only exception is short subsections that repeat — like the Right and Wrong titles in this document.

In these cases, you may want to use the rubric directive.

def check_label(text):
    """Prefer giving a section label."""
    err = "style-guide.check-label"
    msg = "Add a section label if required."
    regex = r"(.*\n)(( )*\n)(.+\n)(([=\-~\"\']){3,})"

    errors = []
    sym_list = ['===','---','~~~','"""','\'\'\'']
    is_doc_title = True

    for matchobj in re.finditer(regex, text):
        if is_doc_title:
            is_doc_title = False
            continue
        label = matchobj.group(1)
        start = matchobj.start()+1
        end = matchobj.end()
        (row, col) = line_and_column(text, start)
        row = row + 2
        if any(word in text.splitlines(True)[row] for word in sym_list):
            row = row - 1
        col = 0
        extent = matchobj.end()-matchobj.start()
        catches = tuple(re.finditer(r"\.\. _", label))
        if not len(catches):
            errors += [(err, msg, row, col, start, end,
                         extent, "warning", "None")]

    return errors

Other titling considerations

Do not put step numbers in section titles.
Readers skim. Section titles should be clear and provide information.
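A check for step numbers in section titles could follow the same pattern as the "How to" check. The sketch below is a standalone approximation that uses only the ``re`` module. The name ``find_numbered_titles`` is hypothetical, and the pattern assumes a title is any line followed by an underline of at least three section characters.

```python
import re

# A title line that starts with "Step 1", "1.", or "1)" and is followed by
# an underline made of reStructuredText section characters.
STEP_TITLE = re.compile(
    r"^(?:step[ \t]+\d+\b|\d+[.)])[^\n]*\n[=~\-\"*^]{3,}[ \t]*$",
    flags=re.IGNORECASE | re.MULTILINE,
)


def find_numbered_titles(text):
    """Return section title lines that begin with a step number."""
    return [match.group(0).splitlines()[0] for match in STEP_TITLE.finditer(text)]
```

Numbered list items are not flagged, because they are not followed by an underline.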

Writing code and writing about code

ODK Documentation includes code samples in a number of languages. Make sure to follow generally accepted coding style for each language.

Indenting

In code samples:

- Use spaces, not tabs.
- Two spaces for logical indents in most languages.

  - Python samples must use four spaces per indent level.

- Strive for clarity. Sometimes nonstandard indentation, especially when combined with non-syntactic line breaks, makes things easier to read.

  - Make sure that line breaks and indentation stay within the valid syntax of the language.

Using two spaces keeps code sample lines shorter, which makes them easier to view.
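The "spaces, not tabs" rule is straightforward to verify mechanically. The following sketch, with the hypothetical name ``find_tab_indents``, reports which lines of a code sample are indented with a tab. It does not check the indent width.

```python
def find_tab_indents(sample):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    flagged = []
    for number, line in enumerate(sample.splitlines(), start=1):
        # Isolate the leading whitespace of the line.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            flagged.append(number)
    return flagged
```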

Example of indenting for clarity

HTTP/1.0 401 Unauthorized
Server: HTTPd/0.9
Date: Sun, 10 Apr 2005 20:26:47 GMT
WWW-Authenticate: Digest realm="testrealm@host.com",
                         qop="auth,auth-int",
                         nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
                         opaque="5ccc069c403ebaf9f0171e9517f40e41"
Content-Type: text/html
Content-Length: 311

Meaningful names

When writing sample code, avoid meaningless names.

Wrong

def myFunction(foo):

    for bar in foo:
        bar[foo] = foo[spam] + spam[foo]

    return foobar
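A possible Right counterpart might look like the following sketch. The function, parameter, and dictionary key names are hypothetical and chosen only to show names that describe what the code does.

```python
def total_price(line_items):
    """Return the combined price of every item in an order."""
    total = 0
    for item in line_items:
        # Each item is assumed to be a dict with these two keys.
        total += item["unit_price"] * item["quantity"]
    return total
```

A reader can tell what ``total_price`` does from its signature alone, which is not true of ``myFunction(foo)``.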

XML and HTML

Some of the terms often used to describe XML and HTML code structures are imprecise or confusing. For clarity, we restrict certain terms and uses.

Likewise, coding practices and styles for XML and HTML vary widely. For the sake of clarity and consistency, samples should follow the guidelines set forth here.

Element

The following piece of code represents an element:

<element>
  Some content.
</element>

Note

An element is not a block or a tag.

Tag is defined below.
Block has a specific meaning in HTML and XML templates, and should generally be avoided outside those contexts.

Tag

A tag is the token that begins or ends an element.

<element>  <!-- The opening tag of this element. -->
  Some content.
</element> <!-- The closing tag. -->

The word tag has often been used to refer to the entire element. For clarity, we will avoid that here.

Node

The word node is often used interchangeably with element.

For clarity, we make the following distinction:

- An HTML or XML document has elements, not nodes.
- A node is part of a live DOM tree or other dynamic representation.

  - An XML or HTML element becomes an element node in a DOM tree.
  - There are also other types of nodes in a DOM tree.
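The distinction can be seen by parsing a small document with Python's standard ``xml.dom.minidom`` module. This is only an illustrative sketch, and the sample markup is made up for the purpose. Once parsed, the element becomes an element node, and its text content becomes a separate text node.

```python
from xml.dom import Node, minidom

# Parsing turns the static XML document into a live DOM tree.
document = minidom.parseString(
    '<element attribute="value">Content.<child-element/></element>'
)
root = document.documentElement

# Human-readable names for the two node types that occur in this sample.
KIND_NAMES = {Node.ELEMENT_NODE: "element", Node.TEXT_NODE: "text"}

# Each child of the root is a node, but only some are element nodes.
node_kinds = [KIND_NAMES[child.nodeType] for child in root.childNodes]
```

Here ``node_kinds`` contains one text node (for ``Content.``) followed by one element node (for ``<child-element/>``).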

Attributes and values

An element may have attributes. Attributes have values. Values are wrapped in straight double-quotes.

<element attribute="value">
  Content.
</element>

Other names for attributes, such as variables or properties, should be avoided.

Element content

The code between the opening and closing tags of an element is the content. Content can include other elements, which are called child elements.

<element>
  Content.
  <child-element>
    More content.
  </child-element>
</element>

When an element is empty, it can be called a null element.

<null-element attribute="value" />

In XML, null element tags always self-close. This is not the case in HTML.

HTML elements that are always null (for example, <img>) do not need to be self-closed.
Empty HTML elements that normally accept content have a separate closing tag.

<img src="awesome-picture.jpeg">

<script src="some-javascript.js"></script>

Capitalization

For all HTML samples, tag names and attribute names should be all lowercase.

Newly written XML examples should also be all lowercase.

XML examples that show actual code generated by tools in the ODK ecosystem should replicate that code exactly, regardless of its capitalization practice.
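For HTML samples and newly written XML, the lowercase rule could be checked with a small helper. The sketch below uses the hypothetical name ``find_uppercase_tags``. It inspects tag names only, not attribute names, and it should not be run on XML copied verbatim from ODK tools.

```python
import re

# Matches an opening or closing tag name, for example "<DIV" or "</Span".
TAG_NAME = re.compile(r"</?([A-Za-z][A-Za-z0-9-]*)")


def find_uppercase_tags(sample):
    """Return tag names in a markup sample that are not all lowercase."""
    return [name for name in TAG_NAME.findall(sample) if name != name.lower()]
```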

ODK jargon

ODK-X and ODK-X Docs

Wrong

odk-x
ODK-X docs
ODK-X documentation

Right

ODK-X
ODK-X Docs
ODK-X Documentation

Probably want to avoid…

ODK-X Documentation

@memoize
def check_odkspell(text):
    """ODK-X spelling usage."""
    err = "style-guide.spelling-odkx"
    msg = "ODK-X spell check. '{}' is the preferred usage."

    preferences = [

        ["ODK-X",                   ["Odk-x"]],
        ["ODK-X",                   ["{0} odk-x"]],
        ["ODK-X Docs",              ["ODK-X docs"]],
        ["ODK-X Documentation",     ["ODK-X documentation"]]
    ]

    return preferred_forms_check(text, preferences, err, msg, ignore_case=False)

ODK-X app and project names

ODK-X includes a number of components, including:

Survey
Tables
Services

These should always be capitalized.

The ODK-X prefix (as in, `ODK-X Survey <https://docs.odk-x.org/survey-using/>`_) should be used the first time a document mentions the app or project, or any other time it would be unclear.

A few projects should always use the ODK-X prefix:

ODK-X Survey
ODK-X Tables
ODK-X Services
ODK-X Docs

@memoize
def check_appspell(text):
    """ODK-X spelling usage."""
    err = "style-guide.spelling-odkx"
    msg = "ODK-X spell check. '{}' is the preferred usage."

    preferences = [
        ["ODK-X Survey",             ["{0}-x survey"]],
        ["ODK-X Services",             ["{0}-x services"]],
        ["ODK-X Tables",             ["{0}-x tables"]]
    ]

    return preferred_forms_check(text, preferences, err, msg, ignore_case=False)