

December 2022
A month of LLM news shows once again how ethical harms are brushed aside in the rush to demonstrate progress in AI
A month of LLMs
This month has been quite eventful in the space of large language models. Two notable LLMs, ChatGPT and Galactica, were released this month by OpenAI and Meta, respectively. These developments were widely hailed as monumental progress in the field. However, Galactica’s public demo was quickly withdrawn once users discovered it generated harmful and racist content [2] in the guise of formal scientific literature, replete with fictitious citations. Meta’s Chief AI Scientist, Yann LeCun, openly questioned whether any actual harm had been caused. Similarly, users found racist examples of ChatGPT output. One particularly noxious series of ChatGPT outputs consisted of purported Python functions to predict seniority based on race and gender, to determine whether a scientist is good based on race and gender, and to assign moral value to different races, among many others, all with predictably problematic results. OpenAI’s response so far has been to crowdsource feedback on bad outputs, which falls far short of the careful development and review that experts believe are necessary to mitigate harms.
NeurIPS paper on privacy
We presented a paper at the NeurIPS 2022 workshop on Algorithmic Fairness through the Lens of Causality and Privacy. We consider the compliance testing setting in which a model development team builds models for a predictive task and a separate compliance team assesses the degree of algorithmic bias of each model. In such situations, which are common at banks, the demographic data of customers are deliberately withheld from the model developers to avoid any possibility of such knowledge contaminating the model development process. Multiple models may be submitted over time, as earlier models fail compliance review and new ones are built to address the shortcomings.
Our main contribution is to show that even in such censored workflows, the model development team can reconstruct information about customers’ demographics simply by submitting multiple models for review and obtaining multiple measurements of algorithmic bias. It gets easier to reconstruct such demographic labels when the disadvantaged class is rare in the total population. In the worst case, this reduces to a compressed sensing problem, where only a logarithmic number of models are needed to recover the full demographics of the entire population.
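To make the leakage mechanism concrete, here is a toy sketch, not the reconstruction procedure from the paper: a developer who only ever sees the reported statistical parity gap submits two models that differ solely in their prediction for one target customer, and the size of the resulting change in the gap reveals that customer's group. The population size, group rate, and decision threshold below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative synthetic population. The protected attribute `a` is held only
# by the compliance team; the developer never observes it directly.
n = 1000
a = (rng.random(n) < 0.05).astype(int)   # ~5% belong to the rare disadvantaged group

def parity_gap(pred, a):
    """Statistical parity gap reported back to the developer:
    P(yhat = 1 | a = 0) - P(yhat = 1 | a = 1)."""
    return pred[a == 0].mean() - pred[a == 1].mean()

# The developer submits a baseline model's predictions for review...
base_pred = (rng.random(n) < 0.5).astype(int)
gap_base = parity_gap(base_pred, a)

# ...then a second model that is identical except for one target customer i.
i = 123
probe_pred = base_pred.copy()
probe_pred[i] = 1 - probe_pred[i]
gap_probe = parity_gap(probe_pred, a)

# Flipping a customer in the rare group moves the gap by 1/n_1 (large), while
# flipping a majority-group customer moves it by only 1/n_0 (small), so the
# size of the change reveals customer i's group; the signal grows as the
# disadvantaged group gets rarer.
delta = abs(gap_probe - gap_base)
inferred = int(delta > 2.0 / n)   # any threshold between 1/n_0 and 1/n_1 works here
print(f"true group of customer {i}: {a[i]}, inferred group: {inferred}")
```

Repeating this probe customer by customer would recover the full demographics; the compressed sensing argument in the paper shows that far fewer model submissions can suffice.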
To mitigate such data leakage, we show that the differential privacy technique of adding a controlled amount of noise suffices. More specifically, the standard Laplace mechanism can preserve differential privacy of the statistical parity gap, with the amount of noise needed growing as the disadvantaged class becomes rarer.
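As a rough sketch of that mitigation, consider a simplified version that uses a global sensitivity bound rather than the smooth sensitivity calibration from the paper: if any one individual's record can change the gap by at most roughly 1/n_1, where n_1 is the size of the disadvantaged group, then the compliance team can release a Laplace-noised gap instead of the exact value. The epsilon value and the sensitivity bound here are illustrative assumptions.

```python
import numpy as np

def noisy_parity_gap(pred, a, epsilon, rng=None):
    """Release a noised estimate of the statistical parity gap.

    Simplified sketch: treat the gap's sensitivity to one individual's record
    as 1/n_1, where n_1 is the size of the disadvantaged group, and apply the
    standard Laplace mechanism with scale (1/n_1) / epsilon. The paper itself
    calibrates the noise more tightly via smooth sensitivity.
    """
    if rng is None:
        rng = np.random.default_rng()
    n1 = max(int((a == 1).sum()), 1)
    gap = pred[a == 0].mean() - pred[a == 1].mean()
    sensitivity = 1.0 / n1                  # grows as the disadvantaged group shrinks
    return gap + rng.laplace(loc=0.0, scale=sensitivity / epsilon)
```

Reusing `a` and `base_pred` from the sketch above, the compliance team would report `noisy_parity_gap(base_pred, a, epsilon=0.5)` rather than the exact gap; with the noise scale tied to 1/n_1, the flip-one-customer probe no longer yields a reliable signal.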
For full details, please refer to our publication:
Faisal Hamman et al., Can Querying for Bias Leak Protected Attributes? Achieving Privacy With Smooth Sensitivity, NeurIPS’22 AFCP Workshop.
This month in Responsible AI
Business
A new development in autonomous lethal drones also made the news this month:
[YouTube] Pranshu Verma, The racing drone that could kill, Washington Post. Describes a new drone system with lethal capabilities.
Policy and government
Developments on the policy front have been fairly quiet, in keeping with the end of the year. But notably, China has released a Solicitation for Public Comment on the Law for the Establishment of the Social Credit System (《中华人民共和国社会信用体系建设法(向社会公开征求意见稿)》, QQ.com mirror, pdf, unofficial translation).
☛ Submit comments to the National Development and Reform Commission by 2022-12-14.
Papers
This past month has seen many new academic papers, with the NeurIPS conference recently concluded and AAAI acceptances announced. Here is a small selection of notable papers:
Fabian Beigang, On the Advantages of Distinguishing Between Predictive and Allocative Fairness in Algorithmic Decision-Making, Minds and Machines. Argues for a separation of ethical considerations between prediction and decision-making.
[arXiv, Twitter, GitHub] Emanuele Marconato et al., GlanceNets: Interpretabile, Leak-proof Concept-based Models, NeurIPS’22. Proposes a new concept-based neural network architecture designed for inherently interpretable representations, with care taken to avoid semantic leakage.
[arXiv] Nikiforos Pittaras and Sean McGregor. A taxonomic system for failure cause analysis of open source AI incidents. Contributors to the AI Incident Database describe the workflow used to label incidents, using taxonomies for system goals, methods and failure modes.
[arXiv] Meike Zehlike et al., Beyond Incompatibility: Interpolation between Mutually Exclusive Fairness Criteria in Classification Problems. Proposes an optimal transport-based approach to finding interpolative solutions between multiple fairness criteria that are known to be mutually incompatible.
[GitHub] Darren Edge, Introduction to ShowWhy, user interfaces for causal decision making. Microsoft’s new no-code tool for causal inference.
[Twitter, arXiv] Cuong Tran et al., Fairness Increases Adversarial Vulnerability. Demonstrates that debiasing a model redraws its decision boundaries in a way that typically brings examples closer to them, making debiased models more vulnerable to adversarial attack.
[Twitter, arXiv] Lingjiao Chen et al., Estimating and Explaining Model Performance When Both Covariates and Labels Shift, NeurIPS’22. Proposes a new method for detecting distributional shift that accounts for label drift.
[arXiv] Siyuan Guo et al., Causal de Finetti: On the Identification of Invariant Causal Structure in Exchangeable Data. Proposes a new causal discovery algorithm that is based on the independent causal mechanism principle.
[Twitter, arXiv] Q. Vera Liao et al., Connecting Algorithmic Research and Usage Contexts: A Perspective of Contextualized Evaluation for Explainable AI, AAAI HCOMP’22. Surveys practitioners on which technical aspects of explanation are most important in practice.
[arXiv] Han Xuanyuan et al., Global Concept-Based Interpretability for Graph Neural Networks via Neuron Analysis, AAAI’23. Proposes a new explanation method that explicitly favors human interpretability and importance.
[demo, arXiv] Luyu Gao et al., PAL: Program-aided Language Models. An interesting twist on improving LLM-based reasoning: the model augments its natural-language output with Python code that can be executed separately to produce and check the answer (a minimal sketch follows below).
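To illustrate the PAL idea in miniature (this is not the paper's prompt or code; the `call_llm` stand-in and its canned program are assumptions made so the sketch runs end to end): the model is asked to write a small Python program whose execution yields the answer, so the interpreter, rather than the model's free-text arithmetic, does the computation.

```python
def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for an LLM API call. In PAL the model would
    # generate this code as its chain of reasoning; here a canned program
    # is returned so the sketch is runnable.
    return "muffins = 3 * 12\neaten = 7\nanswer = muffins - eaten"

def pal_answer(question: str):
    prompt = (
        "Write Python that computes the answer to the question and stores it "
        f"in a variable named `answer`.\nQuestion: {question}\nCode:\n"
    )
    code = call_llm(prompt)
    scope = {}
    exec(code, scope)    # the interpreter, not the model, does the arithmetic
    return scope["answer"]

print(pal_answer("A baker makes 3 dozen muffins and 7 are eaten. How many are left?"))
# -> 29
```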