Discover more from This month in Responsible AI
ChatGPT and Bing take the world by storm while concerns solidify around their limitations and trustworthiness. Algorithmic audits reinforce the need for good data science in audits.
Notice: as always, this newsletter is too long to send in its entirety by e-mail. Please read it in full online.
Please support what a happy reader says is “the best newsletter of its kind” for responsible AI! Like, share and subscribe.
On 2023-03-13, Bloomberg Law published a deep dive into auditing bias in employment AI, which includes quotes from my discussion with journalist J. Edward Moreno.
On 2023-04-20, I will be a panelist at the American Bar Association’s National Symposium on Technology in Labor and Employment Law, which will be held in-person only at the Chicago-Kent College of Law starting 2023-04-19. The panel, “How should AI be audited?”, focuses on New York’s Local Law 144 on automated employment decision tools and is currently scheduled for 4:30-5:30pm CDT.
I will be serving on the organizing committee as an Ethics Review Chair for the NeurIPS 2023 conference, which will be in New Orleans starting 2023-12-10.
Algorithmic audits in the news
A recent collaboration between Wired and Lighthouse Reports on an algorithmic audit of Rotterdam’s welfare fraud predictive model claims to have uncovered evidence of discriminatory bias along the dimensions of race and age, and that the model is “essentially random guessing”. However, my close reading of their methodology reveals significant flaws in their evaluation that severely curtail the claims of bias.
[GitHub] Gabriel Geiger et al., Suspicion Machines: unprecedented experiment on welfare surveillance algorithm reveals discrimination (2023-03-06), Lighthouse Reports.
Several glaring flaws in the analysis are:
Use of ROC curves to evaluate model performance with unbalanced classes. The expected base rate of welfare fraud is in the range of 0.2% - 0.4%. This is a classic example of a classification problem with large class imbalance. It is well-known that ROC curves are less reliable as an evaluation metric compared to precision-recall curves when the focus is on the rarer class (fraud, in this case), because the false positive rate will change very little when the classification boundary changes. Rotterdam’s use of ROC curves as an evaluation metric is not appropriate.
Lack of statistical testing to substantiate claims of “random guessing”. To substantiate such a statement, the auditors should have performed a statistical significance test using a statistic such as the Mann-Whitney U. In fact, the test is fairly simple to reconstruct. Taking N = 12,707 samples and AUC = 0.75, and assuming a base rate of b = 0.3% fraud, there are n₁ = bN expected fraud cases and n₀ = (1 − b)N expected non-fraud cases, and the observed AUC implies U = AUC · n₁ · n₀. Under random guessing, U is approximately normal with mean n₁n₀/2 and variance n₁n₀(N + 1)/12, corresponding to a normal test statistic of z = 5.338, with corresponding one-sided p-value of p = 4.69 × 10⁻⁸. Thus, the observed ROC curve would be extremely unlikely to arise from a truly random classifier.
Lack of data on actual fraud hampers evaluation of bias. Almost all algorithmic definitions of unfairness rely on knowledge of the actual behavior, which was not available in this audit due to privacy concerns. An assessment based solely on group-level disparities is confounded by latent correlations between demographics and the outcome variable, which limits the validity of conclusions of discrimination that can be drawn from such audits.
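The significance test described above can be sketched as follows. This is a minimal sketch, assuming the normal approximation to the Mann-Whitney U statistic without tie or continuity corrections, and treating the expected number of fraud cases bN as a non-integer count. The function name auc_z_test and its parameter names are illustrative and are not taken from the audit's code.

```python
import math


def auc_z_test(n_samples, auc, base_rate):
    """Normal-approximation test of an observed AUC against random guessing.

    Assumption: class sizes are the expected counts implied by the base rate,
    left unrounded. No tie or continuity correction is applied.
    """
    n_pos = base_rate * n_samples      # expected number of fraud cases
    n_neg = n_samples - n_pos          # expected number of non-fraud cases

    # The AUC is the Mann-Whitney U statistic divided by n_pos * n_neg.
    u = auc * n_pos * n_neg

    # Null distribution of U for a random classifier (AUC = 0.5).
    mean_u = n_pos * n_neg / 2
    sd_u = math.sqrt(n_pos * n_neg * (n_pos + n_neg + 1) / 12)

    z = (u - mean_u) / sd_u
    # One-sided upper-tail p-value of the standard normal distribution.
    p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_one_sided


z, p = auc_z_test(n_samples=12_707, auc=0.75, base_rate=0.003)
print(f"z = {z:.3f}, one-sided p = {p:.2e}")
```

Under these assumptions the printed values land close to the z and p quoted above. Rounding bN to a whole number of cases shifts z only slightly and does not change the conclusion.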
ChatGPT has attracted intense interest in almost every knowledge industry, spawning an entire new startup sector rushing to build on top of ChatGPT in a new gold rush. Scientists are using ChatGPT to brainstorm research ideas, and classroom use continues to grow for teaching critical thinking. Researchers found that ChatGPT can play Jeopardy! better than IBM Watson. Some have heralded ChatGPT as an intellectual revolution. The pace of innovation continues to accelerate; on 2023-03-14, OpenAI announced GPT-4, showing off multimodal text and image generation. Such capabilities may have been patterned after the research-grade Kosmos-1 model [GitHub] that was published on 2023-03-01 and the Visual ChatGPT model [GitHub] published on 2023-03-08. The U.S. federal government is watching the explosion with interest.
Clarity is starting to emerge around ChatGPT’s known limitations and failures, with a consistent theme that non-experts struggle with using it effectively and fall victim to the “AI Mirror”. Sentiment analysis on tweets and papers (2023-02-20) revealed general surprise, mostly positive about its capabilities, but with growing negative concerns about fake output and detectability of generated output, particularly in education and medicine. Such hallucinations have resulted in surprise around fake geolocation services and other nonexistent product offerings. Hallucinated citations are creating headaches for librarians, who receive requests for nonexistent references. Another continued sticking point is brittleness in reasoning capabilities. It turns out that ChatGPT mostly does arithmetic well, except for subtraction and division, with as yet unexplained brittleness on certain numbers. Despite much-ballyhooed successes in passing medical and business school exams and even a technical interview for a Google job, ChatGPT failed miserably at answering Singapore’s notorious Primary School Leaving Examinations for sixth-grade students. Similarly, an amateur player defeated KataGo, a top Go-playing AI comparable to AlphaGo, by exploiting weaknesses found computationally.
Other concerns are starting to emerge too, such as the common failure mode exposed when ChatGPT suffered an outage. Gary Marcus calls out the danger of data leakage when LLMs are tested on data they were trained on. Misuse is starting to surface: a user in Hangzhou, China created public panic by sharing a fake traffic policy change on Weixin/WeChat, triggering a police investigation. Later, Chinese regulators instructed Chinese tech firms not to offer ChatGPT services over concerns of spreading U.S. government “misinformation”. Reuters also reports a boom in AI-generated books on Amazon. Science fiction magazine Clarkesworld was forced to refuse new submissions until a solution to the deluge of AI-generated spam submissions could be found.
Further concerns around ChatGPT use. Universities continue to raise concerns about the long-term potential for students to use ChatGPT to cheat. Companies are increasingly worried about the data security risk of employees pasting sensitive data directly into ChatGPT, with estimates that 2.3% of the workforce have already done so, even with regulated personal data. Media companies like Wired now have policies on the use of generative AI, while others like CNET are now laying off writers. Continued experimentation on ChatGPT reveals further Western and male biases in simple tasks like naming philosophers. Vanderbilt University also apologized for using ChatGPT to craft an insincere email about a mass shooting incident. On 2023-02-24, Sam Altman blogged about OpenAI’s plan for AGI, calling out the need for societal involvement and context-awareness to address safety, bias and job displacement. The post was criticized for technochauvinism and hyperbole by Emily Bender, who was recently profiled in New York Magazine for her work on natural language understanding (NLU) and the limitations of ChatGPT for advancing NLU. Later on 2023-03-01, OpenAI announced their new ChatGPT and Whisper APIs, while not addressing data sovereignty concerns from Indigenous peoples in New Zealand about unlicensed use of data on the Indigenous languages te reo Māori and ’ōlelo Hawai’i. Calls for regulation are coming from within the industry, even from Elon Musk, who is rumored to be putting together his own competitor to “woke” ChatGPT. At the same time, big tech lobbied for the upcoming EU AI Act to exclude general-purpose AI like ChatGPT.
Outside the ChatGPT sphere, AI continues to launch in new products like Spotify’s AI DJ, Sony’s Gran Turismo game, and Harvey’s paralegal AI. On 2023-03-10, a Reddit user reported convincing evidence that Samsung’s phones apply AI superresolution models to artificially enhance moon photos.
Controversy around open sourcing LLMs. Strictly speaking, describing the LLaMa model as “open source” is wrong: while the code is GPLv3 licensed, the weights have to be requested through a Google form with eligibility determined at Meta’s discretion. Nevertheless, there is good reason for this guardrail: Yann LeCun defends this policy as a response to the previous negative publicity around Galactica. The model has since been leaked on BitTorrent. While some like Arvind Narayanan and Sayash Kapoor argue in favor of open sourcing by default, others like Sara Hooker argue that lowering the barriers to entry for misuse is harmful overall, echoing similar discussions in cybersecurity. Such concerns have already surfaced in the real world, with HuggingFace taking down GPT-4chan, an LLM explicitly trained on toxic content. Meg Mitchell announced on 2023-03-10 that HuggingFace now allows models to be gated with manual approvals for access. Some have even gone one step further, arguing for restricting access to the computing hardware used for AI.
Other commercial news
Meta settles ad discrimination allegations. As part of Meta’s settlement with the U.S. Department of Justice, Meta now has a variance reduction system based on offline reinforcement learning for minimizing disparities in delivering advertisements for housing, employment and credit. Such systems necessarily rely heavily on perceived demographics like race, which re-raises longstanding questions about the validity of monitoring discrimination with imputed characteristics.
Meta plans to enter the LLM race. On 2023-02-27, Mark Zuckerberg announced that Meta is creating a new top-level product around generative AI. Some pundits are now claiming this is a pivot away from the Metaverse to go all in against OpenAI and ChatGPT. This announcement followed Meta AI’s announcement on 2023-02-24 of a new LLM, LLaMa, whose largest version has 65 billion parameters. Despite Yann LeCun’s earlier dismissal of LLMs as a diversion from AGI, he has now tweeted about LLaMa as an “*open-source*, high-performance large language model”. While LLaMa-65B is now state of the art (84.2%) for the HellaSwag benchmark for commonsense reasoning, a review shows 36% errors in the validation set, raising concerns about overfitting.
Google felt pressured to respond to Microsoft’s ChatGPT announcement with its own product demo of Bard, which was bungled with factual errors and followed by a significant decline in Alphabet’s stock price. On 2023-03-02, Google released their new LLM, Flan-UL2, on HuggingFace, just a few weeks after launching ViT-22B for vision transformers.
Microsoft’s Bing Chat AI was launched to much fanfare, but users quickly noted multiple examples of the conversation derailing with all manner of toxic behaviors ranging from spurious repetitions to conspiracies to gaslighting to proclamations of love to spying on developers to accusations of hacking to death threats. Some speculated that the RLHF step was bungled. Users were quick to get Bing AI to reveal internal details such as its prompt and codename (Sydney). Rumors also surfaced that prototypes of Bing AI were tested on users in Indonesia and India, with similar issues that were apparently ignored. In response, Microsoft quickly put in guardrails and limits on the usage of Bing Chat. At the same time, Microsoft laid off its responsible AI team, joining Meta and Twitter in winding down internal investments in this space, and Google in reducing staffing for managing disinformation on YouTube.
Workday has been sued for alleged discrimination in its AI tools against applicants who are black, disabled, or older.
On 2023-01-23, FICO released its latest survey of State of Responsible AI in Financial Services, noting that just 8% of C-suite executives think that their AI development processes are compliant with existing model risk regulations.
Snapchat released its own chatbot, My AI, which was quickly criticized for openly abetting a user pretending to be a 13-year-old about to have sex with a 31-year-old. Similar testing with ChatGPT showed that it is still possible to bypass safety filters to generate other sexually exploitative content.
Arena Group, owner of magazines like Sports Illustrated, laid off journalists shortly after announcing their use of AI to generate content, despite its CEO promising that “AI will never replace journalism”. A close reading of one such AI-generated article revealed 18 factual errors in health and medical claims.
Stability AI belatedly now lets artists opt out of having their work used for training. LAION, the nonprofit funded by Stability AI, changed the license on the LAION-400M dataset to disclaim “any real-world application”.
Apple’s AI Narrators for audiobooks, announced in January, were found to have been trained on Spotify-owned audiobooks without permission. Meanwhile, voice actors are increasingly pressured to sign over contractual rights for their work to be reused by AI.
Twitter. Former head of Trust & Safety, Yoel Roth, testified to the Committee on Oversight and Accountability in the U.S. House of Representatives on social media bias. Advertisers like Fiverr continue to stop Twitter advertising over concerns of brand damage by being placed alongside tweets with harmful content as extremist accounts gain prominence. Investigative journalists discover new software for disinformation on social media.
Government and policy
The OECD.AI Policy Observatory is soliciting contributions of trustworthy tools and metrics to their catalogue.
The ISO published a new standard, ISO/IEC 23894:2023: Guidance on risk management for artificial intelligence.
On 2022-12-02, the Advisory Council on Artificial Intelligence listed a tender notice for Public Engagement on Artificial Intelligence with Indigenous communities in Canada.
The annual plenary 两会 (two sessions) meetings of the National People's Congress and the Chinese People's Political Consultative Conference ended on 2023-03-13, with semiconductors and AI a prominent topic among the political elite. Science and Technology Minister 王志刚 (Wang Zhigang) admitted the U.S.’s competitive edge shown by ChatGPT when opening a session about technological self-reliance, and delegates have called for China to develop its own answer to ChatGPT. Anecdotes abound about political censorship stifling innovation. Chatbot 任小融 (Ren Xiaorong) closed the sessions, sparking online discussions about authenticity of human experience.
On 2023-02-03, the Council of Europe’s Committee on Artificial Intelligence decided to publish their revised “Zero Draft” [Framework] Convention on Artificial Intelligence, Human Rights, Democracy and the Rule of Law, dated 2023-01-07. Notable is the call-out of intersectional discrimination testing in Article 3 and the need for risk assessment in Article 24.
On 2023-01-27, the European Court of Justice heard its first case on automated decision making, in which it is set to clarify Article 22 (“Automated individual decision-making, including profiling”) of the General Data Protection Regulation.
On 2023-01-09, the Joint Research Centre published a technical report, AI Watch: Artificial Intelligence Standardisation Landscape Update, to understand the alignment between EU regulation and relevant IEEE standards.
A watchdog report claims that big tech companies are lobbying to exclude general purpose AI like ChatGPT from the EU AI Act.
On 2023-02-16, the Bundesverfassungsgericht (Federal Constitutional Court) ruled that the use of Palantir’s predictive policing and surveillance AI was unconstitutional.
🇳🇱 The Netherlands
On 2023-02-14, Reuters reported that the Netherlands hosted a summit on the military use of AI.
On 2023-03-03, the Medicines and Healthcare products Regulatory Agency published a blog post stating that LLMs for medical purposes will be regulated as medical devices.
On 2023-01-24, The Alan Turing Institute held a showcase for Project FAIR: Framework for responsible adoption of artificial intelligence in the financial services industry.
On 2023-03-08, the U.S. Senate Committee on Homeland Security & Governmental Affairs held a public hearing on Artificial Intelligence: Risks and Opportunities. The U.S. House Committee on Oversight and Accountability also held a public hearing on Advances in AI: Are We Ready For a Tech Revolution? The National Institute of Standards and Technology also published Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations for public comment.
☛ Email email@example.com by 2023-09-30.
On 2023-02-27, the Federal Trade Commission published a blog post “Keep your AI claims in check”, reminding businesses that false claims about AI are liable for findings of unfair and deceptive practices.
On 2023-02-23, the National Science Foundation announced a new $20m funding program for Safe Learning-Enabled Systems.
On 2023-02-16, President Biden signed Executive Order 14091 Further Advancing Racial Equity and Support for Underserved Communities Through The Federal Government.
☛ Email firstname.lastname@example.org by 2023-03-30.
On 2023-02-16, the U.S. Bureau on Arms Control, Verification and Compliance issued a Political Declaration on Responsible Military Use of Artificial Intelligence and Autonomy.
[FR, Twitter] On 2023-02-16, the White House Office of Science and Technology Policy issued a Request for Information for advancing effective, accountable policing and building public trust.
On 2023-01-24, the Federal Financial Institutions Examination Council’s Appraisal Subcommittee (FFIEC ASC) held a hearing about appraisal bias.
🗽 New York State and New York City
On or around 2023-02-21, the New York State Liquor Authority revoked MSG Entertainment’s liquor license in response to its use of facial recognition technology to ban attorneys from its premises. MSG has countersued.
On 2023-02-16, the New York State Comptroller released an audit of the City of New York’s use of AI, recommending more oversight and noting the lack of guidance for fair and responsible use.
On 2023-03-13 9am PDT, Hugging Face held their first Ethics & Society Q&A on Discord about value-laden aspects of ML benchmarks and a paper discussion.
On 2023-03-02, the Algorithmic Justice League celebrated the 5th anniversary of the Gender Shades paper.
On 2023-02-27, Microsoft announced its Responsible AI Toolbox.
Articles and papers
This month’s roundup was dominated by papers from recent academic conferences like ICLR, SaTML and AAAI, as well as news over the use of generative AI.
Frontiers in Artificial Intelligence is soliciting papers for a special issue on Progress in Developing Applied Legal Technology, with a focus on large language models.
☛ Submit an abstract by 2023-03-29.
The 6th AAAI/ACM Conference on AI, Ethics, and Society has issued a Call for Papers.
☛ Submit a paper by 2023-03-15.
The ACM Fairness, Accountability, and Transparency (FAccT) conference has an open Call for CRAFT Proposals, a dedicated interdisciplinary track for Critiquing and Rethinking Fairness, Accountability, and Transparency.
☛ Submit a CRAFT proposal by 2023-03-30.
[PDF] On 2023-02-01, the European Securities and Markets Authority published their report on Artificial intelligence in EU securities markets. Very few funds disclosed the use of AI, and their performance was not notably better than others, and notably worse in the early Covid period of late 2020. Explainability and non-traditional data (particularly in ESG) were issues for market risk reviews.
Drew Harwell, Now for sale: Data on your mental health (2023-02-13), Washington Post. Highlights the work of Joanne Kim at Duke on Data Brokers and the Sale of Americans’ Mental Health Data, showing wide availability with few privacy and recourse mechanisms.
Lydia Morrish, A Face Recognition Site Crawled the Web for Dead People’s Photos (2023-03-13), Wired. Describes how a facial search engine, PimEyes, scraped third-party websites for pictures of users’ relatives without permission. Read in context with Harbinja et al. below.
[Twitter] Kat Tenbarge, Hundreds of sexual deepfake ads using Emma Watson’s face ran on Facebook and Instagram in the last two days (2023-03-07), NBC News.
Sam Biddle, U.S. Special Forces Want to Use Deepfakes for Psy-Ops (2023-03-06), The Intercept.
Pranshu Verma, They thought loved ones were calling for help. It was an AI scam. (2023-03-05), Washington Post. Describes the use of audio deepfakes by scammers.
Melissa Heikkilä, AI image generator Midjourney blocks porn by banning words about the human reproductive system (2023-02-24), Technology Review.
Tracey Spicer, AI art replicates inequity at scale. We need to learn about its biases – and outsmart the algorithm (2023-02-23), The Guardian.
[Twitter] Joseph Cox, How I Broke Into a Bank Account With an AI-Generated Voice (2023-02-23), Vice.
Edina Harbinja, Lilian Edwards, Marisa McVey, Governing ghostbots (2023-02-17), Computer Law & Security Review. Discusses the privacy challenges of resurrecting dead people through modern deepfake technology.
Tatum Hunter, AI porn is easy to make now. For women, that’s a nightmare (2023-02-13), Washington Post.
[Twitter] Gianluca Mauro and Hilke Schellmann, ‘There is no standard’: investigation finds AI algorithms objectify women’s bodies (2023-02-08), The Guardian. Pictures of women with partial nudity in everyday contexts were rated as racier by the algorithms than similar pictures of men.
[arXiv, news, Twitter] Christian Schroeder de Witt et al., Perfectly Secure Steganography Using Minimum Entropy Coupling (2023-03-06), ICLR’23. Encodes information into joint distributions while minimizing the perturbation to the marginals.
[arXiv, Twitter, website] Kai Greshake et al., More than you've asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models (2023-02-23). Demonstrates how LLMs integrated into applications, like Bing AI in the Edge browser, are vulnerable to poisoning attacks with side-loaded application state information. The website demonstrates how loading a browser window containing malicious prompts can bypass Bing’s guardrails and exfiltrate user information to third party websites.
[arXiv, Medium] Daniel Kang et al., Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks (2023-02-11). Shows that LLMs’ ability to evaluate computational expressions can be exploited to bypass safety filters.
Insikt Group, I, Chatbot (2023-01-27). Discusses how chatbots like ChatGPT spread information that enables cybercrime such as phishing and malware development.
Compliance and safety
Isabelle Bousquette, Rise of AI Puts Spotlight on Bias in Algorithms (2023-03-09), Wall Street Journal. Documents how some large companies have shunned AI due to bias concerns.
Anthropic, Core Views on AI Safety: When, Why, What, and How (2023-03-08). A (necessarily) speculative perspective on long term AI risks and the fundamental limitations of relying on human feedback to mitigate them.
[PDF] Jack Poulson, How Watchdogs are Silenced (2023-03-06), Tech Inquiry. Summarizes how multiple nonprofits and government agencies embroil themselves with potential conflicts of interest with for-profit AI companies.
Matt Kelly, More on Managing ‘ChatGPT Risk’ (2023-02-15), Radical Compliance. Argues that chatbots generate extra needs for risk assessments and control activities.
Cassandra Coyer, Absent Federal Regulation, AI Bias Liability Is Growing—And Getting More Complicated (2023-01-26), Law.com Legaltech News. Attorneys advocate bias assessment under privilege to hedge the current uncertainty around legal liabilities.
Employment and psychometrics
Melis Diken, Automated Employment Decision Tools and Screen Out (2023-02-09), Medium.com. Surveys 30 vendors, showing prevalent use of misleading language like “bias-free” and a general lack of disability accommodations.
Laura McQuillan, Want a job? You'll have to convince our AI bot first (2023-01-19), CBC.
[PsyArXiv] Jinyan Fan et al., How Well Can an AI Chatbot Infer Personality? Examining Psychometric Properties of Machine-inferred Personality Scores (2023-01-05). Compares Juji’s inferred personality scores to those from conventional personality questionnaires along the Big 5 scale, showing moderately weak correlation.
[GitHub] Colin Lecher and Maddy Varner, L.A.’s Scoring System for Subsidized Housing Gives Black and Latino People Experiencing Homelessness Lower Priority Scores (2023-02-28), The Markup and Los Angeles Times. The reporters were able to obtain raw data from the Los Angeles Homeless Services Authority containing demographic labels, a sanitized version of which is now on GitHub.
Chloe Xiang, AI Has Successfully Piloted a U.S. F-16 Fighter Jet, DARPA Says (2023-02-14), Vice.
[SSRN] European Law Institute, Model Rules on Impact Assessment of Algorithmic Decision-Making Systems used by Public Administration (2023-01-11). Proposes rules and questionnaires that are tailored to comply with EU regulations for AI.
Auditing, bias and evaluation
[arXiv, GitHub] Baolin Peng et al., Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback (2023-03-08). Demonstrates a modular approach to safeguard against hallucinations by checking against known facts.
[arXiv, GitHub] André F Cruz et al., FairGBM: Gradient Boosting with Fairness Constraints (2023-03-03), ICLR’23. Modifies LightGBM to support in-processing bias remediation.
[arXiv, GitHub, Twitter] Wei-Yin Ko et al., FAIR-Ensemble: When Fairness Naturally Emerges From Deep Ensembling (2023-03-02). Studies “disparate impact” bias and shows that ensemble models produce fairer results for minority classes through averaging of larger disagreements between individual models.
[OpenReview, arXiv] Zhun Deng et al., FIFA: Making Fairness More Generalizable in Classifiers Trained on Imbalanced Data (2023-03-01), ICLR’23. An in-processing method for enforcing equalized odds (or similar one-equation definition) with provable (approximate) fairness satisfaction at test time.
[arXiv, Twitter] Vittoria Dentella et al., Testing AI performance on less frequent aspects of language reveals insensitivity to underlying meaning (2023-02-27). Testing on prompts with four kinds of complex semantic constructions reveals poor robustness on transfer of meaning from common sentences to rarer ones.
[arXiv] Matúš Pikuliak, Ivana Beňová, Viktor Bachratý, In-Depth Look at Word Filling Societal Bias Measures (2023-02-24). Critiques existing methods for evaluating bias in language models for lacking adequate controls and statistical validity.
[arXiv] Jan Kocoń et al., ChatGPT: Jack of all trades, master of none (2023-02-21). Independent evaluation shows uniformly decent but not state of the art performance on a wide variety of NLP benchmarks.
[arXiv, Twitter] Lorenz Kuhn, Yarin Gal, Sebastian Farquhar, Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation (2023-02-21), ICLR’23. Instead of computing uncertainties over generated tokens, focuses on entropy in the latent encoding space.
[arXiv] Rafal Kocielnik et al., AutoBiasTest: Controllable Sentence Generation for Automated and Open-Ended Social Bias Testing in Language Models (2023-02-16). Proposes a rejection sampling method for automatically generating test sentences for measuring bias in LLMs.
[arXiv, Twitter] Jakob Mökander et al., Auditing large language models: a three-layered approach (2023-02-16). Overviews the challenges of auditing LLMs but relies on existing benchmarks to establish key performance dimensions.
[arXiv] Ansong Ni et al., LEVER: Learning to Verify Language-to-Code Generation with Execution (2023-02-16). Evaluates generated code and creates joint representation of code output, code and natural text distribution to improve code generation.
Katherine Miller, How Do We Fix and Update Large Language Models? (2023-02-13), Stanford HAI. Reviews some recent work from Stanford on gradient based approaches to editing facts in LLMs. Should be read in the context of the more recent negative findings of Hase et al. covered last month.
[Twitter] Aaron R. Kaufman, Christopher Celaya and Jacob M. Grumbach, Improving Compliance in Experimental Studies of Discrimination (2023-02-13). Failure to correct first impressions of a subject’s race can attenuate the measurement of racial disparity by as much as 85%.
[arXiv, website] Yanzhe Zhang et al., Auditing Gender Presentation Differences in Text-to-Image Models (2023-02-08). Proposes a new metric for measuring gender bias in these models built on top of ConceptNet.
[pdf] Graham Neubig, Is my NLP Model Working? (2023-02-07). Summarizes state-of-the-art evaluation metrics for text generation, notably that automatic evaluation metrics are only weakly correlated with human evaluations.
[arXiv] Mihir Parmar et al., Don't Blame the Annotator: Bias Already Starts in the Annotation Instructions (2023-02-07), EACL’23. Shows evidence for bias in the natural language instructions for human annotators, and argues that the bias is measurable in smaller data sets.
[arXiv] Hyung Won Chung et al., UniMax: Fairer and More Effective Language Sampling for Large-Scale Multilingual Pretraining (2023-02-03), ICLR’23. Describes a new sampling scheme that effectively upsamples rarer languages, resulting in downstream models with better performance. A related project, BabelCode, shows similar effects for programming languages.
[arXiv] A. Feder Cooper et al., Variance, Self-Consistency, and Arbitrariness in Fair Classification (2023-02-01). Develops a new variance-reducing ensembling technique which purportedly allows finer-grained analysis of variance in subgroups of the data.
[arXiv] Nathanael Jo et al., Fairness in Contextual Resource Allocation Systems: Metrics and Incompatibility Results (2022-12-02), AAAI’23. Generalizes the impossibility results for group fairness of classifiers to allocation problems.
Explainability and causality
[data] Christina Korting et al., Visual Inference and Graphical Representation in Regression Discontinuity Designs (2023-03-02), The Quarterly Journal of Economics. Shows that graphs showing discontinuities have high false negative errors, but false positives can be reduced by plotting piecewise best-fit curves or with the estimation methods implemented in RDHonest.
[OpenReview, website] Patrick Altmeyer et al., Endogenous Macrodynamics in Algorithmic Recourse (2023-02-02), SaTML’23. Augments counterfactual explanations with expected future dynamics from agents acting on recourse perceived through counterfactual explanations, showing that such dynamics can severely degrade the quality of explanations and even the model’s future performance.
[arXiv] Zijian Zhou et al., Probably Approximate Shapley Fairness with Applications in Machine Learning (2022-12-01), AAAI’23. Demonstrates how to sample the computation of the Shapley value while guaranteeing approximate fairness in the result.
Law and Policy
[arXiv, SSRN] Marvin van Bekkum, Frederik Zuiderveen Borgesius, Using sensitive data to prevent discrimination by artificial intelligence: Does the GDPR need a new exception? Computer Law & Security Review (2023-04). Examines the legal challenges of collecting the necessary data for a discrimination audit in the EU.
Bert Hubert, The EU's new Cyber Resilience Act is about to tell us how to code (2023-03-03). Raises concerns that the current effort to create software cybersecurity requirements could backfire by imposing impossible standards.
Meiling Fong and Zeynep Arsel, Protecting privacy online begins with tackling ‘digital resignation’ (2023-03-02), The Conversation. Summarizes Canada’s legislative approach to restore citizen sovereignty to personal data rights.
Kayla Goode, Heeu Millie Kim and Melissa Deng, Examining Singapore’s AI Progress (2023-03-01), Center for Security and Emerging Technology, Georgetown University. Reviews the Singapore government’s overall pursuit of its National Artificial Intelligence Strategy and how it reinforces Singapore’s ongoing neutrality toward both China and the U.S.
Sharon Goldman, Could Big Tech be liable for generative AI output? Hypothetically ‘yes,’ says Supreme Court justice (2023-02-21), VentureBeat.
[Website, PDF] Hiroki Habuka, Japan’s Approach to AI Regulation and Its Impact on the 2023 G7 Presidency (2023-02-14), Center for Strategic & International Studies. Overviews Japan’s soft-touch, industry-specific approach to AI regulation, in contrast to overarching approaches like the EU AI Act.
Virgílio Almeida, Laura Schertel Mendes, and Danilo Doneda, On the Development of AI Governance Frameworks (2023-02-03), IEEE Internet Computing. Argues for co-regulation of AI between companies and governments.
[SSRN] Shakked Noy and Whitney Zhang, Experimental Evidence on the Productivity Effects of Generative Artificial Intelligence (2023-03-11). Studies the use of ChatGPT for professional writing, showing that overall time spent is reduced by shifting the burden of work from writing to editing.
[arXiv, Twitter] Manoel Horta Ribeiro, Veniamin Veselovsky, Robert West, The Amplification Paradox in Recommender Systems (2023-02-22). Argues with simulations that algorithmic audits of recommender systems must model the user interaction with the system in order to understand the dynamics of radicalization, rabbit holes, and filter bubbles.
[arXiv, Twitter] Helena Vasconcelos et al., Generation Probabilities Are Not Enough: Exploring the Effectiveness of Uncertainty Highlighting in AI-Powered Code Completions (2023-02-14). Builds a secondary model for predicting and highlighting sections of output most likely to be edited and shows it is effective for users, although its efficacy over simply displaying posterior uncertainty of generated tokens was not clearly reported.
[arXiv, Twitter] Upol Ehsan et al., Charting the Sociotechnical Gap in Explainable AI: A Framework to Address the Gap in XAI (2023-02-01), CSCW’23. Provides detailed case studies arguing how technical tools need to be augmented with qualitative business context.
[arXiv, Twitter] Hancheng Cao et al., Breaking Out of the Ivory Tower: A Large-scale Analysis of Patent Citations to HCI Research (2023-01-31), CHI’23. Shows that patents cite HCI papers more than at academic conferences, but with a much longer citation lag.
Morten Hertzum and Kasper Hornbæk, Frustration: Still a Common User Experience (2023-01-30), CHI’23 / ACM Transactions on Computer-Human Interaction. Shows that system performance and functionality issues persist as top reasons for negative user experiences.
[arXiv] Helena Vasconcelos et al., Explanations Can Reduce Overreliance on AI Systems During Decision-Making (2023-01-26), CSCW’23. User studies show that overreliance increases when the AI task is harder to verify, and that explanation techniques are useful for checking AI errors when the explanations are easier to verify or when the stakes are higher.
[arXiv, GitHub] Hussein Mozannar et al., Reading Between the Lines: Modeling User Behavior and Costs in AI-Assisted Programming (2022-11-27). Constructs a model for user state when interacting with code suggestion tools that is informed by user studies.
[bioRxiv, website] Yu Takagi and Shinji Nishimoto, High-resolution image reconstruction with latent diffusion models from human brain activity (2023-03-11), CVPR’23. This paper blew up on Hacker News and Twitter with hyperbolic claims that Stable Diffusion could read minds. What they actually did was start with the Natural Scenes Dataset, with fMRI of subjects looking at natural imagery, and construct mappings between the neural activity images of subjects viewing images, the text annotations associated with those images, and the latent representations in Stable Diffusion of the images being viewed. They then used these mappings to reconstruct the images with the Stable Diffusion model based on the neural images. While this is an interesting research study that notes advances on a longstanding problem thanks to Stable Diffusion, the results are hardly mind reading; rather, they reconstruct latent representations of images activated by recorded neural activity.
Elizabeth M. Renieris, Claims That AI Productivity Will Save Us Are Neither New, nor True (2023-03-08), Centre for International Governance Innovation.
[OpenReview, GitHub] Shizhe Diao et al., Black-Box Prompt Learning for Pre-trained Language Models (2023-03-07), Transactions on Machine Learning Research. Proposes Black-box Discrete Prompt Learning, a gradient-free alternative to explicit fine-tuning that is competitive with white-box methods.
[Twitter] H. Carlens, The State of Competitive Machine Learning: 2022 Edition (2023-03-07). The emergent tech stack of choice for winners is PyTorch / LightGBM, Pandas, Python.
[SSRN, Twitter] Edward W. Felten, Manav Raj and Robert Seamans, How will Language Modelers like ChatGPT Affect Occupations and Industries? (2023-03-06). A re-scoring of occupations vulnerable to AI in light of widespread LLM availability finds that marketing, social science, legal, and humanities professionals are most vulnerable to disruption.
[arXiv, GitHub, Twitter] Changyeon Kim, Jongjin Park et al., Preference Transformer: Modeling Human Preferences using Transformers for RL (2023-03-02), ICLR’23. Presents a causal (non-Markovian) transformer architecture to learn reward functions that encode human preferences.
Nur Ahmed, Muntasir Wahed and Neil C. Thompson, The growing influence of industry in AI research (2023-03-02), Science. Estimates that the global industry spent US$340 billion on AI in 2021 alone, exceeding non-defense public funding over 100-fold.
[OpenReview, Twitter] Michael Samuel Albergo and Eric Vanden-Eijnden, Building Normalizing Flows with Stochastic Interpolants (2023-03-02), ICLR’23
[arXiv] Shaohan Huang et al., Language Is Not All You Need: Aligning Perception with Language Models (2023-03-01). Introduces Microsoft’s Kosmos-1, a new multimodal large language model (MLLM).
[arXiv] Samuel K. Ainsworth, Jonathan Hayase, Siddhartha Srinivasa, Git Re-Basin: Merging Models modulo Permutation Symmetries (2023-03-01), ICLR’23. Proposes an alternative to ensembling neural networks by merging multiple trained models at the weight level.
Comet, 2023 Machine Learning Practitioner Survey (2023-02-23). Some top concerns in the MLOps community include adopting the AI Bill of Rights and evaluating generative AI.
[arXiv, Twitter] Nikhil Vyas, Sham Kakade, Boaz Barak, Provable Copyright Protection for Generative Models (2023-02-21). Trains generative models against two corpora for positive and negative learning. The training process rejects models that generate output too close to the negative corpus.
[arXiv] Ce Zhou et al., A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT (2023-02-18). Also discusses evaluation metrics (Appendix F) and data sets (Appendix G).
[SSRN, inventory, Twitter 1, Twitter 2, website] Angelina Wang et al., Against Predictive Optimization: On the Legitimacy of Decision-Making Algorithms that Optimize Predictive Accuracy (2023-02-17). Argues that ML used to make predictions about individual behavior is inherently flawed.
[arXiv] Chengwei Qin et al., Is ChatGPT a General-Purpose Natural Language Processing Task Solver? (2023-02-15). Evaluates ChatGPT and its underlying LLM against multiple tasks: while ChatGPT shows good performance on many tasks, the RLHF component decreases performance on reasoning tasks, and other models produce better results.
Maggie Harrison, OkCupid Using ChatGPT to Make Online Dating Even More Robotic (2023-02-14), Futurism. Specifically, they made new icebreaker prompts.
[arXiv] Michael A. Lones, How to avoid machine learning pitfalls: a guide for academic researchers (rev. 2023-02-09). An updated introduction to the data science pipeline from data ETL to evaluation.
[arXiv, GitHub] Przemysław Biecek, Hubert Baniecki, Mateusz Krzyzińki, Performance is not enough: the story of Rashomon’s quartet (2023-02-26). Shows how to generate synthetic data sets with multiple models with equal performance measures.
[arXiv, GitHub] Shizhe Diao et al., Active Prompting with Chain-of-Thought for Large Language Models (2023-02-26). An active learning approach to dynamically generate prompts that stimulate reasoning chains, with improved results on complex reasoning benchmarks.
[pdf, blog, GitHub] Andrew Gelman, Jessica Hullman, and Lauren Kennedy, Causal quartets: Different ways to attain the same average treatment effect (2023-02-22). Shows how to generate synthetic data sets with equal causal effects measures.
[arXiv] Nicholas Carlini et al., Poisoning Web-Scale Training Datasets is Practical (2023-02-20). Shows that stale URLs in common datasets can be hijacked inexpensively and replaced with different data.
[arXiv, Twitter] Tomer D. Ullman, Large Language Models Fail on Trivial Alterations to Theory-of-Mind Tasks (2023-02-26). Pours cold water on the idea of LLMs as AGI by directly refuting an earlier claim, showing brittleness to small changes to inputs.
[arXiv, Twitter] Jakiw Pidstrigach et al., Infinite-Dimensional Diffusion Models for Function Spaces (2023-02-20). Similar to the Nvidia paper above in using infinite-dimensional Langevin dynamics.
[Twitter] Zhiqiu Jiang et al., CommunityBots: Creating and Evaluating A Multi-Agent Chatbot Platform for Public Input Elicitation (2023-02-20). Managing user disengagement and managing topics produced higher quality crowdsourcing.
Jacob Steinhardt, Emergent Deception and Emergent Optimization (2023-02-19), Bounded Regret.
[arXiv, GitHub, Twitter] Shuyan Zhou et al., DocPrompting: Generating Code by Retrieving the Docs (2023-02-18), ICLR’23. Augments pretrained code generation models with new documentation of code, which nearly doubles performance.
[Twitter] Stephanie Wilkins, Meet Kathryn Tewson, the Paralegal Who Took on DoNotPay: 'Fraud Is Bad for Innovation' (2023-02-17), Legaltech News. This profile comes in the wake of further questionable demos from DoNotPay that involve lying about authorizing bank transactions in order to get them reversed.
[arXiv] Deep Ganguli et al., The Capacity for Moral Self-Correction in Large Language Models (2023-02-15). Claims that the capacity for moral self-correction emerges at 22B model parameters, at which point it is sufficient to instruct models to avoid harmful outputs.
[arXiv] Jae Hyun Lim et al., Score-based Diffusion Models in Function Space (2023-02-14). Shows how to construct diffusion models in infinite dimensions using stochastic differential equations in measure space.
Paris Marx, A.I.'s dirty secret: meet the hidden human workforce behind the boom in artificial intelligence (2023-02-12), Business Insider.
[arXiv] Ashok Cutkosky, Harsh Mehta and Francesco Orabona, Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion (2023-02-11). Uses online learning to predict optimal linearizations for stochastic gradient descent, which enables provably optimal algorithms for finding non-smooth stationary points which use modified momentum and clipping.
[News] Joseph Turow et al., Americans Can’t Consent to Companies’ Use of Their Data (2023-02-07). Shows that the majority of surveyed Americans don’t understand their rights under most privacy policies and laws.
[arXiv, Twitter] Irene Solaiman, The Gradient of Generative AI Release: Methods and Considerations (2023-02-05). Critically reviews the various ways AI models are released.
[arXiv] Charlie Chen et al. (DeepMind), Accelerating Large Language Model Decoding with Speculative Sampling (2023-02-03). Uses rejection sampling to extrapolate samples from a smaller model to reduce overall cost.
[OpenReview] Amanda Lee Coston et al., SoK: A Validity Perspective on Evaluating the Justified Use of Data-driven Decision-making Algorithms (2023-01-23), SATML’23. Proposes a taxonomy of criteria to assess justifications for decision-making AI centering around the validity of predictive models.
Josh A. Goldstein et al., Generative Language Models and Automated Influence Operations: Emerging Threats and Potential Mitigations (2023-01-10). Calls out the potential for using LLMs in disinformation and possible mitigations.
Jessica Newman, A Taxonomy of Trustworthiness for Artificial Intelligence (2023-01), Berkeley Center for Long-Term Cybersecurity.