A Four-Fold Classification of Data

One of the vexing issues in any discussion of data protection is the number of ways in which data can be classified. Public or private, identified or identifiable, data or metadata: the glut of classifications makes it difficult to achieve consistency.

In The Master Algorithm, Pedro Domingos classifies data into four types:

  • Data that is shared with the world at large. This includes blogs, tweets, reviews on websites, etc.
  • Data that is shared with a limited circle of people. The simplest example under this category would be interactions on social media platforms.
  • Data that is shared with companies. This can often happen without the individual knowing that a company holds data belonging to them.
  • Data that individuals do not wish to share with anyone, which is quite self-explanatory.

This is a simple enough classification. That said, there is scope for overlap between the four types of data. As Domingos is quick to point out, data from the second category could equally fit the third, since modes of communication are often controlled by a few companies.

Despite this caveat, the classification is useful. Its simplicity works in its favour by helping more people take part in the conversation around data protection. And one can hope that more conversations will act as a springboard for fresh insights into how we look at our data.

