And Why Should Data Custodians Share Non-Personal Data (NPD)? IP and Other Policy Concerns

Dr. Sunanda Bharti
Mar 22, 2021

The Ministry of Electronics & Information Technology (MeitY) constituted a Committee of Experts (Committee) in September 2019, to devise a framework for governing non-personal data (NPD). This Committee released an initial report and later a revised report, which aims to provide more clarity on the definition of NPD and its categorisation—and specifically attempts to delineate personal data (PD) from NPD.

The present post briefly makes a few submissions about what intellectual property aspects should not be overlooked when it comes to any Policy (and eventually legislation) pertaining to NPD.

1. What are Policies? How are they different from Law?

Policy ideas are often the starting place for law. They often precede law; and are transformed into law. Policies should be easy to understand because they are devoid of legal jargon—the fancy language of law. Hence, while law can afford to be cryptically worded, a policy has to avoid that pitfall.

A maxim often attributed to Albert Einstein holds that if you cannot explain a concept, an idea, or anything else to a six-year-old, you have probably not understood it yourself. A Policy has to be that simple to understand. We need to explore whether the proposed/revised NPD Policy comes up to that benchmark.

‘Public Policy’ Concerns about NPD Policy

An essential requirement of a Policy is a clear idea of who its beneficiaries are. The answer that the NPD Policy Report gives is the ‘Data Principal’: you and I, about whom the data is being collected, are collectively identified as the ‘Community’, which is declared to be the beneficiary.

Can we, in aggregate, be the owners of NPD? Is it even possible to personify community? Does that not mean that community has a right over that data? Is Community a legal person then? If not, ‘community consent’ appears to be fanciful and rhetorical—it’s just a feel good phrase that can possibly have no legal connotations.

A related issue, awareness of the policy, needs to be highlighted: is the community aware that a policy projected to benefit it is being formulated? Community consent, which has been projected as the ‘opt out’ option, is another dimension that needs to be taken into consideration. That is, individuals are currently merely given the choice to opt out of having their data anonymised. This is far removed from reality. Effectively, the people who would form these communities have little control over how their data is used, more so because India is a huge country, a large percentage of the population is illiterate, and even those who have had an education might not have any sense of what exactly NPD is or of the implications of sharing that data. Moreover, as has been cautioned, researchers and advanced AI or other technology can be used to trace the data back to the data holder or the data principal, making the whole anonymisation exercise futile.

So consent is problematic. Think of it: most of us might just end up clicking ‘yes’ or ‘no’ in the dialog box that asks for consent. From the legal (contract law) perspective, one needs to actively question whether clicking an option in a dialog box even qualifies as consent.

What Terms are defined in the Policy?

Ideally, the task of providing definitions is more properly the reserve of law and not of policy. Law should define in clear and unambiguous terms the important concepts around which it is framed. However, that does not mean that a policy paper can afford to brush the matter aside and not throw light on the ‘core concepts’ for which the policy took birth in the first place, and the NPD Policy does not fulfil that parameter. For instance, the revised Policy fails to satisfactorily define ‘Public Good’. Likewise, ‘HVD’ or High Value Data (which is the focal point of the NPD Policy) is not defined in concrete terms. It has been loosely and vaguely described as data that is for public good, which benefits the community at large. Further, there is a need to seek out the reasoning behind defining public good in the way the Policy has defined it. A Policy needs to explain, so that the law need not.

To elaborate, the Report goes on to give various examples of what constitutes ‘public good’, such as poverty alleviation, job creation, education, the creation of new businesses and so on. The author submits that it would do the country much good if the meaning of ‘public good’ is aligned or steered towards something concrete, instead of it being so delimited. One way to achieve that would be to assess the gaps where urgent action is required, so that start-ups and new businesses spring up and flourish around those gaps. Examples include climate change, e-waste management, anti-poaching mechanisms, renewable energy creation, garbage disposal or civic waste management, senior citizens’ homes, the stray animal population and so on.

The author maintains that the battle of poverty alleviation, job creation and education has already been fought, and work is largely being done on those fronts. So, a Policy touted to be for ‘public good’ needs to think along certain priorities that need urgent management.

How is the Policy Proposed to be Implemented?

For NPD, this essentially translates into inquiring: (1) to whom would the data go; and (2) what is the grievance redressal mechanism?

The answer to the first question is that this NPD (in the form of HVD) would remain with the Data Trustees. The NPD Report mandates that such aggregate data be shared with data trustees who will then create an HVD for public access out of it. And this Data Trustee could be any government organisation or non-profit private organisation, that is, a Section 8 company/Society/Trust, which would be responsible for the creation, maintenance and data-sharing of HVDs in India.

What is the need for such an intermediary? When the Data Principal (X) engaged with the Data Custodian (Uber), X agreed to the data being used by Uber, so long as it provides X with the services that s/he wants. X never signed up for that data, even in anonymised form, to be shared with anyone, whether a Data Trustee or anybody else. By creating the intermediary of the Data Trustee, the Policy proposal creates an additional potential [data] leak point. The author further submits that the [persons from the] Community who want to start a new business by relying on some HVD, say, owned by Uber, may directly approach Uber for it, and there is no need for any intermediary.

2. Rationale behind the Policy—What is the need? Is there any need?

Should a Policy be devised only when there is a palpable need for the same amongst the members of the community? Or can a Policy be a harbinger of change in itself, with policy makers taking the lead in deciding what is best for the community? The author would like to align with this latter understanding. After all, it is one way in which a welfare state may work.

But we need to assess it in the context of the NPD Policy. It is stated that the NPD Policy attacks data hoarding and, by doing that, the author asserts, it depicts ordinary business entities as monsters for possessing something that is rightfully theirs, something on which their business depends. It is also alleged that these (so-called) data hoarders/businesses pose entry barriers to start-ups and other new entrants in digital markets. That is the rationale proposed for why it should be made mandatory for businesses to share their raw data (subject to defined grounds).

The Data Custodians, the businesses which collect this high value data, might not have any incentive for sharing. They might not have any inclination to share: it is their Intellectual Property (IP)! To proceed on the broad presumption that they would not require the data once it is made non-personal (anonymised) would mean having ZERO understanding of what IP is and how IP operates.

Take the example of Uber, the taxi service that thrives on how it uses and exploits collected data. Let's look at what data it might be collecting.

There would be data on (i) who takes the trip (man, woman or child), (ii) what destination they mostly go to (airport, malls, shopping, work; outside the city or inside the city), and so on. In order to anonymise it, you remove the name, the house address and the phone number, and what does that leave us with? The author thinks that there is still aggregate data that is highly valuable for Uber, such as:

data on the availability of public transport in an area, so that they can focus on cities that have poor transportation;
data on the traffic situation, which they can use to enhance the customer service experience;
data on time-saving routes, blocked routes and short-cuts which might not otherwise be thoroughfares and hence might not be there on Google Maps etc;
data on which neighbourhoods will be the busiest at what time, so that they can station their cabs and drivers accordingly;
data on the transport preferences of people, so that more cabs are available for, say, going to the airports or railway stations.

One can always add more to this list.
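
To make the point concrete, here is a minimal Python sketch of how removing direct identifiers still leaves commercially useful aggregates. It is an illustration only: the records, the field names (`pickup_area`, `hour`, `destination_type`) and the helper functions are hypothetical and do not reflect Uber's actual data or systems. Dropping names, addresses and phone numbers is also the simplest possible form of anonymisation and, as cautioned earlier in this post, may not withstand re-identification.

```python
from collections import Counter

# Hypothetical trip records; every field name and value is an assumption
# made for illustration, not a real schema.
TRIPS = [
    {"name": "Rider A", "phone": "0000000001", "home_address": "Address 1",
     "pickup_area": "Saket", "hour": 8, "destination_type": "airport"},
    {"name": "Rider B", "phone": "0000000002", "home_address": "Address 2",
     "pickup_area": "Saket", "hour": 8, "destination_type": "work"},
    {"name": "Rider C", "phone": "0000000003", "home_address": "Address 3",
     "pickup_area": "Saket", "hour": 18, "destination_type": "mall"},
    {"name": "Rider D", "phone": "0000000004", "home_address": "Address 4",
     "pickup_area": "Dwarka", "hour": 8, "destination_type": "airport"},
]

# Fields treated as direct identifiers in this sketch.
DIRECT_IDENTIFIERS = {"name", "phone", "home_address"}


def strip_identifiers(trip):
    """Return a copy of the record without the direct identifiers."""
    return {k: v for k, v in trip.items() if k not in DIRECT_IDENTIFIERS}


def demand_by_area_and_hour(trips):
    """Count trips per (pickup area, hour of day)."""
    return Counter((t["pickup_area"], t["hour"]) for t in trips)


def destination_preferences(trips):
    """Count trips per type of destination."""
    return Counter(t["destination_type"] for t in trips)


anonymised = [strip_identifiers(t) for t in TRIPS]
demand = demand_by_area_and_hour(anonymised)
preferences = destination_preferences(anonymised)

# The busiest neighbourhood and hour is still recoverable after anonymisation.
busiest_slot, trips_in_slot = demand.most_common(1)[0]
print(busiest_slot, trips_in_slot)
print(preferences)
```

Nothing in the two aggregates depends on who the riders were, which is precisely why such data remains valuable to the business that collected it.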

All this is Uber’s bread and butter! There is every reason why Uber would treat it as a literary work (a computer database) under section 2(o) of the Copyright Act, 1957 and refuse to part with it. Or, if sharing is made mandatory, there could be a strong inclination not to anonymise it. Also, it is only practical and fair if Uber quotes a price for parting with it.

These are pertinent IP issues that need to be sorted out before one plunges into legislating on NPD.

How feasible is the implementation of the Policy?

Even if one accepts the ‘data hoarding’ allegation, one needs to devise judiciously balanced and prudent means to fetch this data. Three questions, giving rise to three possible solutions, are proposed:

1. Should sharing be mandatory, but for some remuneration? This would translate into letting the Data Custodian enjoy a monopoly over the data for, say, the first 5 years after collecting it (that is, maintain exclusivity over NPD for a limited period of time, as in the case of intellectual property) before the mandate to share applies.

That is essentially a compulsory licence sort of mechanism. Under the Copyright Act, 1957 the Intellectual Property Appellate Board (IPAB) ascertains the situations where a compulsory licence may be granted, and it does so on a case-by-case basis. The NPD Policy could propose the same technique for data sharing.

Section 2(o) of the Copyright Act, 1957 defines literary works to include computer databases. By that logic, all collected data should qualify for protection as literary work, provided it is sufficiently original.

Further, IPAB’s decisions can be appealed before the relevant High Court, so the appellate mechanism is also sufficiently equipped.

2. Should the data be made automatically shareable after a few years of gestation, through a sort of statutory licensing like the one we have for cover versions in the Copyright Act, 1957? Or,

3. Should it remain completely voluntary, as a matter of corporate social responsibility? This can be incentivised through tax deductions, subsidies etc. The idea here is that by voluntarily sharing the non-personalised data, businesses can demonstrate social commitment and hence add to their reputation.

In conclusion, the author submits that data is copyrightable subject matter as per section 2(o) of the Copyright Act, 1957. And this is beyond dispute. It would be wise on the part of the policy makers to respect the sanctity of the IP involved. At the same time, it should be acknowledged that monopoly rights over the concerned NPD need not necessarily be antagonistically positioned against public interest. The IP regime has long managed to balance the two competing interests in a host of other protected subject matters. We hence have a mechanism already in place for tackling issues such as refusal on the part of the Data Custodian to share HVD. The same mechanism may be deployed in case of misuse of NPD, instead of reinventing the wheel!