Credit card and banking companies that reject a loan application are required to explain why the decision was made. This is a good idea, because it can help teach applicants how to repair their damaged credit. The requirement comes from a federal law, the Equal Credit Opportunity Act. Getting an answer wasn't much of a problem in the past, when humans made these decisions. But today, as artificial intelligence systems increasingly assist or replace the humans making credit decisions, getting those explanations has become much harder.
In the past, a loan officer who rejected an application could tell a would-be borrower that there was a problem with their income level, or employment history, or whatever the issue was. But computerized systems that use complex machine learning models are difficult to explain, even for experts.
Consumer credit decisions are just one way this problem arises. Similar concerns exist in health care, online marketing, and even criminal justice. My own interest in this area began when a research group I was part of discovered gender bias in how online ads were targeted, but could not explain why it happened.
All those industries, and many others that use machine learning to analyze processes and make decisions, have about a year to get much better at explaining how their systems work. In May 2018, the new European Union General Data Protection Regulation takes effect, including a provision giving people a right to an explanation of automated decisions that affect their lives. What shape should these explanations take, and can we actually provide them?
Identifying the main factors
One way to explain why an automated decision came out the way it did is to identify the factors that were most influential in the outcome. How much of a credit denial was because the applicant didn't earn enough money, or because she had failed to repay loans in the past?
My research group at Carnegie Mellon University, including Ph.D. student Shayak Sen and then-postdoc Yair Zick, developed a way to measure the relative influence of each factor. We call it Quantitative Input Influence.
In addition to giving better insight into an individual decision, the measurement can also shed light on a group of decisions: Did an algorithm deny credit primarily because of financial concerns, such as how much an applicant already owes on other debts? Or was the applicant's ZIP code more important, suggesting that more basic demographics, such as race, may have come into play?
Capturing causation
When a system makes decisions based on multiple factors, it is important to identify which factors actually cause the decision and how much each one contributes to the outcome.
For example, consider a credit-decision system that takes just two inputs, an applicant's debt-to-income ratio and her race, and that has been shown to approve loans only for Caucasians. Knowing how much each factor contributed to the decision can help us understand whether the system is legitimate or discriminatory.
One possible explanation could simply look at the inputs and the outcomes and observe a correlation: non-Caucasians did not receive loans. But this explanation is too simplistic. Suppose the non-Caucasians who were denied loans also had much lower incomes than the Caucasians whose applications were approved. Then this explanation cannot tell us whether the applicants' race or their debt-to-income ratio caused the denials.
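A small simulation can illustrate why correlation alone fails here. Everything below is invented for illustration: the decision rule, the field names, and the population numbers are all hypothetical.

```python
import random

random.seed(0)

# Hypothetical lender whose rule looks ONLY at the debt-to-income ratio (DTI);
# race never enters the rule at all.
def lender(dti):
    return "approved" if dti < 0.4 else "denied"

# Assumed population in which race and DTI happen to be correlated:
# one group carries higher DTI on average (the confound).
applicants = []
for _ in range(1000):
    race = random.choice(["caucasian", "non_caucasian"])
    low, high = (0.1, 0.5) if race == "caucasian" else (0.3, 0.7)
    applicants.append((race, lender(random.uniform(low, high))))

# Looking only at inputs and outcomes shows a stark racial disparity ...
rates = {}
for group in ("caucasian", "non_caucasian"):
    rows = [r for r in applicants if r[0] == group]
    rates[group] = sum(1 for r in rows if r[1] == "approved") / len(rows)
print(rates)
# ... even though the rule never consulted race. The correlation by itself
# cannot say whether race or DTI caused the denials.
```

The disparity in approval rates appears even though race is never read by the rule, which is exactly why a causal test is needed.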
Our method can provide this information. Telling the difference lets us determine whether the system is unjustly discriminating or looking at legitimate criteria, such as applicants' finances.
To measure the influence of race on a specific credit decision, we rerun the application process, keeping the debt-to-income ratio the same but changing the applicant's race. If changing the race affects the outcome, we know race is a deciding factor. If not, we can conclude the algorithm is looking only at the financial information.
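A minimal sketch of this intervention follows. The decision rule and the applicant's numbers are made up; the rule is deliberately written to use race, which is exactly the behavior the intervention is meant to expose.

```python
# Hypothetical decision rule that secretly uses race.
def decide(applicant):
    return ("approved" if applicant["race"] == "caucasian"
            and applicant["dti"] < 0.4 else "denied")

applicant = {"race": "non_caucasian", "dti": 0.3}
counterfactual = dict(applicant, race="caucasian")  # same DTI, race flipped

before = decide(applicant)       # "denied"
after = decide(counterfactual)   # "approved"
print(before, after)
# The decision changed when only race changed, so race is a deciding factor
# for this applicant.
```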
Beyond identifying which factors are causes, we can measure their relative causal influence on an outcome. We do this by randomly varying the factor (race, for example) and measuring the probability that the outcome changes. The higher the probability, the greater the factor's influence.
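This "probability that the outcome changes" can be estimated by simulation. The sketch below uses an invented rule that consults only the debt-to-income ratio, so the randomized test should attribute zero influence to race and substantial influence to DTI.

```python
import random

random.seed(1)

# Hypothetical rule that uses only the debt-to-income ratio.
def decide(applicant):
    return "approved" if applicant["dti"] < 0.4 else "denied"

applicant = {"race": "non_caucasian", "dti": 0.55}

def influence(factor, sampler, applicant, n=10000):
    """Estimate how often randomly resampling one factor flips the decision."""
    base = decide(applicant)
    flips = sum(decide(dict(applicant, **{factor: sampler()})) != base
                for _ in range(n))
    return flips / n

race_influence = influence(
    "race", lambda: random.choice(["caucasian", "non_caucasian"]), applicant)
dti_influence = influence(
    "dti", lambda: random.uniform(0.0, 1.0), applicant)

print(race_influence)  # 0.0 -- changing race never flips this rule
print(dti_influence)   # about 0.4 -- a resampled DTI falls below 0.4 that often
```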
Influence of a group
Our method can also incorporate multiple factors that work together. Consider a decision system that grants credit to applicants who meet at least two of three criteria: a credit score above 600, ownership of a car, and full repayment of a home loan. Say an applicant, Alice, with a credit score of 730 but no car and no repaid home loan, is denied credit. She wonders whether her car-ownership status or her home-loan repayment history is the principal reason.
An analogy can help explain how we analyze this question. Consider a court whose decisions are made by the majority vote of a panel of three judges, one conservative, one liberal, and one swing vote who may side with either of her colleagues. In a 2-1 conservative decision, the swing judge had a greater influence on the outcome than the liberal judge did.
The factors in our credit example are like the three judges. The first judge typically votes in favor of the loan, because many applicants have a high enough credit score. The second judge almost always votes against it, because very few applicants have ever paid off a home loan. So the decision falls to the swing judge, who in Alice's case denies the loan because she does not own a car.
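Encoding this toy rule makes Alice's puzzle concrete. The rule below is a hypothetical rendering of the example (approve when at least two of the three criteria hold); flipping either of her two failing factors alone would flip the decision, which is why a cruder notion of influence cannot single out the principal reason.

```python
# Hypothetical majority-vote credit rule from the example.
def approve(applicant):
    votes = [
        applicant["credit_score"] > 600,  # usually votes in favor
        applicant["paid_home_loan"],      # almost always votes against
        applicant["owns_car"],            # the "swing" vote
    ]
    return sum(votes) >= 2  # approve on a majority

alice = {"credit_score": 730, "paid_home_loan": False, "owns_car": False}

denied = approve(alice)                                # False: one vote is not enough
with_car = approve(dict(alice, owns_car=True))         # True
with_loan = approve(dict(alice, paid_home_loan=True))  # True
print(denied, with_car, with_loan)
# Either flip alone changes the outcome, so answering Alice's question
# requires weighing the factors more carefully.
```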
This reasoning can be made precise using cooperative game theory, a way of analyzing how multiple factors contribute to a single outcome. In particular, we combine our measure of relative causal influence with the Shapley value, a principled way of dividing credit for an outcome among several contributing factors. Together, these form our Quantitative Input Influence measure.
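Here is a rough sketch of a Shapley-style computation for Alice's case. Everything is invented for illustration: the majority-vote rule, the population distribution, and all the numbers. The coalition value pins a subset of factors to Alice's values and resamples the rest from the assumed population; this is in the spirit of, but much simpler than, the published QII definitions.

```python
import itertools
import random

random.seed(2)

FACTORS = ["credit_score", "paid_home_loan", "owns_car"]

# Hypothetical majority-vote rule: approve when at least two criteria hold.
def approve(a):
    return ((a["credit_score"] > 600) + a["paid_home_loan"] + a["owns_car"]) >= 2

# Assumed population: decent credit scores are common, repaid home loans rare.
def sample_applicant():
    return {
        "credit_score": random.gauss(650, 80),
        "paid_home_loan": random.random() < 0.1,
        "owns_car": random.random() < 0.5,
    }

alice = {"credit_score": 730, "paid_home_loan": False, "owns_car": False}

def v(coalition, n=3000):
    """Approval rate when the factors in `coalition` are pinned to Alice's
    values and the remaining factors are resampled from the population."""
    hits = 0
    for _ in range(n):
        probe = sample_applicant()
        probe.update({f: alice[f] for f in coalition})
        hits += approve(probe)
    return hits / n

def shapley(factor):
    """Average marginal contribution of `factor` over all factor orderings."""
    perms = list(itertools.permutations(FACTORS))
    total = 0.0
    for order in perms:
        i = order.index(factor)
        before = set(order[:i])
        total += v(before | {factor}) - v(before)
    return total / len(perms)

scores = {f: round(shapley(f), 2) for f in FACTORS}
print(scores)
```

Under these assumed numbers, the car-ownership factor carries by far the largest (negative) influence on Alice's denial, matching the swing-judge intuition, while her good credit score contributes a small positive amount.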
So far, we have evaluated our method on decision systems we created by training common machine learning algorithms on real-world data sets. Evaluating algorithms at work in the real world is a topic for future study.
An open challenge
Our method of analyzing and explaining algorithms' decisions works best when the factors are readily understandable by humans, such as the debt-to-income ratio and other financial criteria.
However, explaining the decision-making process of more complex algorithms remains a significant challenge. Take, for example, an image-recognition system, such as one that detects tumors. It is not very useful to explain an image's evaluation in terms of particular pixels. We would prefer an explanation that provides more insight into the decision, perhaps identifying specific tumor characteristics in the image. Indeed, designing explanations for such automated decision-making tasks is keeping many researchers busy.