LIVE | MediaNama is discussing the impact of the Personal Data Protection Bill and the Non-Personal Data Framework on Data and Artificial Intelligence #DataPoliciesandAI
Current algorithms do push us to the limit of trade-offs. However, ideas such as differential privacy can help us in navigating some of those trade-offs - @RahulAPanicker, @WadhwaniAI #DataPoliciesandAI #NAMA
If you want to find the average compensation in a company, one way is to give the sum total, and that doesn't allow you to identify an individual. Federated learning is another example, where your data stays with you and raw data doesn't go into the cloud - @RahulAPanicker, @WadhwaniAI
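To make the aggregation example above concrete, here is a minimal sketch (not from the panel; the salary bounds and epsilon are assumptions) of how a differentially private average could be released using the Laplace mechanism:

```python
# Illustrative sketch only: release an aggregate (average compensation) with
# calibrated noise so no single individual can be identified from it.
import numpy as np

def dp_average(salaries, lower=0.0, upper=5_000_000.0, epsilon=1.0, rng=None):
    rng = rng or np.random.default_rng()
    clipped = np.clip(salaries, lower, upper)        # bound each person's influence
    sensitivity = (upper - lower) / len(clipped)     # max change one individual can cause
    noise = rng.laplace(0.0, sensitivity / epsilon)  # Laplace mechanism
    return clipped.mean() + noise

print(dp_average(np.array([900_000, 1_200_000, 750_000, 2_000_000])))
```

The noise scale is calibrated to how much any one person can move the average, which is the sense in which the released aggregate "doesn't allow you to identify an individual".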
Should AI be regulated? I don't think it is the right way to look at it. We should be looking at regulating applications. Do we regulate Chemistry, or do we regulate the Pharma industry which is an application area? - @RahulAPanicker, @WadhwaniAI #DataPoliciesandAI
The regulations in the financial sector will have to be different from those in health. It can't all be under the same criteria... bringing in checks and balances is also important -- access control etc. - @RahulAPanicker, @WadhwaniAI #DataPoliciesandAI
One of the common ways privacy is violated is by data breaches, and AI is usually used to prevent these breaches. There are algorithmic ways to mitigate some of these. We can even mitigate biases using certain techniques so that algorithms make more unbiased decisions...
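The tweet above does not name the techniques; one widely cited pre-processing approach is reweighing, sketched here as an illustrative assumption rather than anything discussed on the panel. Each training example is weighted so that, after weighting, the sensitive attribute and the label are statistically independent:

```python
# Minimal reweighing sketch: weight each example by P(group) * P(label) / P(group, label).
from collections import Counter

def reweigh(groups, labels):
    n = len(labels)
    p_group = Counter(groups)
    p_label = Counter(labels)
    p_joint = Counter(zip(groups, labels))
    return [
        (p_group[g] / n) * (p_label[y] / n) / (p_joint[(g, y)] / n)
        for g, y in zip(groups, labels)
    ]

# The resulting weights can be passed as sample_weight to most classifiers.
weights = reweigh(["m", "m", "f", "f", "f"], [1, 0, 1, 1, 0])
```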
Do we need more data for better AI? The naive answer is yes, more data always helps. But the more sophisticated answer is that there are diminishing returns -- the most interesting information comes from the initial samples compared to the later samples - @RahulAPanicker, @WadhwaniAI
Deep learning algorithms are extremely data hungry. A child doesn't need to be shown 1,000 cats to recognise a cat, so we know that algorithms today are data inefficient - @RahulAPanicker, @WadhwaniAI #DataPoliciesandAI
In informed consent, the tricky bit is the "informed" part. How do you inform people of all potential issues? - @RahulAPanicker, @WadhwaniAI
In medical device regulation it's not just the hardware or software, but also the literature around the use of these things. All this comes from the perspective of risk mitigation. Some of the risk will come from algorithms, some from data, and some from outsiders - @RahulAPanicker
There are many domains where low-risk applications are allowed to be governed under self-regulation. What that self-regulation entails is also well defined. Self-regulation is not the same as no regulation, and there is perhaps a large space...
The amount of diversity in data is necessary so that the claims you make about the application can hold up. If I say I have an automated radiology reading algorithm, then I have to make sure that the test set for it has enough diversity - @RahulAPanicker
It is also on the developers to anticipate unexpected consequences of algorithms. Technologists bear responsibility - @RahulAPanicker, @WadhwaniAI
LIVE | We have started with our first panel for the day, where we are discussing Data and Utility. Experts from the lending industry will speak on how they collect data, and use AI to generate insights about borrowers.
Nothing in the public domain is personal information. If we have to help out customers and use data that is not in the public domain, then we follow a consent mechanism. What we do is offer tech to help lenders in assessing a loan application - @meghnaskumar, @crediwatch
When we collect information not in the public domain, then we only offer our clients the technology to collect it. We don't collect it ourselves. We also give them the tech to purge this data - @meghnaskumar, @crediwatch #DataPoliciesandAI
I don't think AI per se can be regulated. Use cases need to be regulated. Can you take data of a person to use for an illegitimate purpose? The answer is no. We need to come out with a privacy law. After that, look at use cases of AI...
Bloomberg takes a lot of data from the public market and creates a lot of insights using those, and they have ownership over that. - @meghnaskumar, @crediwatch #DataPoliciesandAI
At the end of the day, what lenders really want to understand is the ability of people to repay, and their willingness to do so - Balakrishnan Narayanan, @Early_Salary #DataPoliciesandAI
Personal data and NPD is situational and context-based. If I went to a doctor, I might have to reveal what I ate, who I met etc. If I go to a banker, he might need much more information...
...However, the problem is when that doctor and the banker start sharing information - Balakrishnan Narayanan, @Early_Salary
We are also looking at the impact that a person's "maturity level" and greying of hair might have on a person's delinquency - Balakrishnan Narayanan, @Early_Salary pscp.tv/w/1OyKAEvYMnWKb
We ask customers to submit a selfie to assess their intention to pay a loan back - Balakrishnan Narayanan, @Early_Salary
But how do you assess that by looking at a selfie? - @nixxin
Narayanan: This project is currently under pilot, we haven't launched it yet.
As a lender, the KYC image is a part of the loan-giving process. We do have images of customers and use them to understand the system at our own level - Balakrishnan Narayanan, @Early_Salary #DataPoliciesandAI
The spike in alternative lending and credit scoring should make us think a little about RBI's lending guidelines for NBFCs. We also need to ask why the PDP Bill makes an exemption for credit scoring - @Lyciast, @tattlemade #DataPoliciesandAI
For a class of algorithms, the way current systems are set up, they do need a lot of data. But for the kind of algorithms that the credit scoring industry really needs, there isn't a lot of need for massive data collection - @Lyciast, @tattlemade #DataPoliciesandAI
In 2018, the Supreme Court said that we have a right to privacy even in public spaces. Personal data doesn't cease to be personal data just because it is placed in the public domain. - @VidushiMarda, #DataPoliciesandAI
The problem is that we're building tech based on extremely problematic science. And instead of thinking about foundational issues with that, we end up talking about optimising operational processes. - @VidushiMarda, #DataPoliciesandAI
There is an assumption that you can have robust self-regulation norms and that should be enough protection. I'm sceptical of that. Self-regulation contemplates people in power deciding how they would act. - @VidushiMarda #DataPoliciesandAI pscp.tv/w/1OyKAEvYMnWKb
If you have 20,000 CCTVs on the road, and then talk about data protection etc., I think by that time we're already late to the party. Design and deployment of these tools is as important as operationalisation itself - @VidushiMarda, #DataPoliciesandAI
If you have to regulate a use case, then you can do that. If you look at the data, we have 60 million small biz in India and 85% of them don't have access to credit because there isn't enough data available with the lender on them. - @meghnaskumar, @crediwatch
The finance industry has always been a data-driven industry. Now we're just replacing that with tech-driven solutions to collect that data. The fintech industry exists today to increase financial inclusion - @meghnaskumar, @crediwatch #DataPoliciesandAI
The pushback is because people are asking to look at the costs. What if, to give loans to more people, we are targeting a certain segment of society? You can disproportionately benefit a certain section. You can give more loans to men, for instance - @Lyciast, @tattlemade
Alternate lending does concentrate power. You're rejecting or accepting people's loan applications on the basis of something that they can never contest with you on - @VidushiMarda #DataPoliciesandAI pscp.tv/w/1OyKAEvYMnWKb
How do you check your system for bias? While we create the model, we expose it to more than 3,000 variables. At some point in time, more variables might not be the right way to look at it - Balakrishnan Narayanan, @Early_Salary
When there's consent, I don't think we should be debating what is right or what is not. We live in a world where FB and Google are making billions off of people's data. In fintech, people will benefit from this data - @meghnaskumar, @crediwatch pscp.tv/w/1OyKAEvYMnWKb
With that, we now move on to our second panel for the day, where we will focus on the privacy implications of using AI in the context of the Personal Data Protection Bill, and the Non Personal Data Framework.
The vast majority of people have absolutely no idea of what is happening to their data. Zero. The reason is that the way data is being collected for advertising purposes is so complex. This brings me to the more frustrating part of the GDPR - a massive enforcement gap. - @F_Kaltheuner
In a world that's made of data, data protection authorities are left to regulate everything, which really exceeds their capacity - @F_Kaltheuner #DataPoliciesandAI
From a policy perspective, it does not make sense to talk about AI in the abstract. That's because if we ask, is AI good or bad for privacy? I'd say it depends. Use cases range from inherently dangerous to absolutely wonderful things for science...
...We need to push governments and companies towards the more ethical side of the debate. We also need to talk about market dominance, since the biggest AI companies in the world today are the biggest tech companies - @F_Kaltheuner #DataPoliciesandAI pscp.tv/w/1OyKAEvYMnWKb
We need to find ways to protect people from inaccurate inferences about them. If you incorrectly put someone in a group of 'reckless' people who shouldn't get credit, that has real-world consequences - @F_Kaltheuner #DataPoliciesandAI
A lot of companies in the credit industry who say they use AI actually don't use AI. In India, 90% of lending happens via PSUs or NBFCs. These companies manually look at the files before giving a loan...
From a credit perspective, we are far behind in terms of what AI can achieve. Our algorithms are so premature that they don't even take transactional data into account. - @StatsAndPushups, @creditvidya #DataPoliciesandAI
There are attributes and there are derived attributes. You go to a bank and show them a bank statement. Then a person would check that to see how financially disciplined you are (number of cheque bounces) before giving a loan...
By using algorithms: what is the average daily balance of this person over the last 30 days? A machine can instantly make that calculation. Does this person look like someone who could pay his loan back? These are derived attributes...
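As an illustration of the derived-attribute idea described in the two tweets above, here is a minimal sketch; the column names and the "bounce" keyword match are assumptions, not anything the speaker specified:

```python
# Illustrative sketch: turn raw bank-statement rows into derived attributes such as
# average daily balance over the last 30 days and a count of cheque bounces.
import pandas as pd

def derived_attributes(statement: pd.DataFrame, as_of: str) -> dict:
    """statement is assumed to have columns: date, balance_after_txn, description."""
    df = statement.copy()
    df["date"] = pd.to_datetime(df["date"])
    df = df.sort_values("date")

    # End-of-day balance, carried forward over days with no transactions.
    daily = (df.set_index("date")["balance_after_txn"]
               .resample("D").last().ffill())

    end = pd.Timestamp(as_of)
    window = daily.loc[end - pd.Timedelta(days=29): end]
    bounces = df["description"].str.contains("bounce|return", case=False, na=False).sum()

    return {
        "avg_daily_balance_30d": float(window.mean()),
        "cheque_bounces": int(bounces),
    }
```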
The actual use of AI in India is not some super-intelligent computers working. To a large extent it's being used for content moderation on platforms. In areas like facial and emotion recognition, it is increasing because these tools are getting cheaper - @divijualsuspect
There are 3 main things impacted by AI tech:
1. AI's capacity to process, capture and store information; 2. The implications of data-driven decisions on personal autonomy; 3. Inferences and group privacy - @divijualsuspect #DataPoliciesandAI
I don't think we should think of regulating AI in general. It really depends on what we think AI does; we should regulate it from that perspective. - @divijualsuspect #DataPoliciesandAI
I would start with the use case of AI, because every single use case has different implications on the rights of a person. Usually, there is a huge disproportionality in power differentials between the impacted party and the end user. - @BasuArindrajit
It is a tragedy that as a nation that's trying to get adequacy status, and as a nation that's trying to get some soft power, we are allowing state agencies to have carte blanche access to data. @BasuArindrajit on Clause 35 of PDP Bill. pscp.tv/w/1OyKAEvYMnWKb
In the Indian lending system, there are more biases in humans compared to machines - @StatsAndPushups, @creditvidya
If you see CCTNS, that relies upon records that were created during colonial times - how people are rowdies or history-sheeters because they belong to a certain tribe. There are a number of biases encoded in these databases - @divijualsuspect #DataPoliciesandAI
When we talk about emotion recognition, we say that it isn't really science. However, it will become an exact science at some point. Do we even want to use that as an argument as of now? asks Lahar Appiah #DataPoliciesandAI
And that’s a wrap! We’d like to thank Flipkart, Microsoft & Facebook for their support for this discussion. We’d also like to thank our community partners The Centre for Internet and Society for their support. You can watch the discussion here:
We are about to start with our discussion on the impact of the Personal Data Protection Bill and the Non-Personal Data Framework on Data and Artificial Intelligence. #DataPoliciesandAI
Then, we’ll move on to our first panel discussion for the day where we’ll discuss data collection, using #AI to generate insights & the utility for real-world decision making. #DataPoliciesandAI
REGISTRATIONS ARE NOW CLOSED for our discussion on Data & #AI, that we are hosting tomorrow (Jan 28) - we have a great line up of speakers & a tremendous response for attending these sessions - our agenda & reading list here: medianama.com/2021/01/223-ev…
We are looking at the Impact of Data Policies on Artificial Intelligence, in context of upcoming regulations (#PDP Bill, #NPD Framework) and will kick-start our discussion with opening remarks by @RahulAPanicker from @WadhwaniAI at 2:15 pm IST