
Sunday, February 13, 2022

Import AI 283: Open source 20B GPT3; Chinese researchers make better adversarial example attacks; Mozilla launches AI auditing project.


US lawmakers want companies to assess bias of systems before deploying them:
…Coalition of US lawmakers wants to make tech companies more accountable…
A bunch of Democratic lawmakers have introduced the Algorithmic Accountability Act. This act “requires companies to conduct impact assessments for bias, effectiveness and other factors, when using automated decision systems to make critical decisions. It also creates, for the first time, a public repository at the Federal Trade Commission of these systems, and adds 75 staff to the commission to enforce the law.” This act is an update on the 2019 Algorithmic Accountability Act, and “includes numerous technical improvements, including clarifying what types of algorithms and companies are covered, ensuring assessments put consumer impacts at the forefront, and providing more details about how reports should be structured.”

One problem with the bill: This bill only has Democrats signed on right now. It’ll be interesting to see whether it can become a bipartisan bill with Republican support – something necessary for it to pass in the fractious and divided US Congress.
  Read more: Wyden, Booker and Clarke Introduce Algorithmic Accountability Act of 2022 To Require New Transparency And Accountability For Automated Decision Systems (Ron Wyden, official website).

####################################################

DeepMind makes a (kinda) smart AI programmer, called AlphaCode:
…Codex and AlphaCode represent two bets around augmenting programmers…
DeepMind has announced AlphaCode, a neural net that can place in a not-hugely-embarrassing way in competitive programming competitions. AlphaCode placed in the top 54% of participants in programming competitions hosted on Codeforces, participating in contests that post-dated its training data.
  “The problem-solving abilities required to excel at these competitions are beyond the capabilities of existing AI systems. However, by combining advances in large-scale transformer models (that have recently shown promising abilities to generate code) with large-scale sampling and filtering, we’ve made significant progress in the number of problems we can solve,” DeepMind writes.

Why this matters: Last year, OpenAI debuted Codex, a GPT3-style model that can do decent programming. That was followed by GitHub announcing Copilot, a VSCode plug-in that works like a really smart autocomplete for code. AlphaCode represents a slightly different bet in this space; while philosophically similar there’s a lot more emphasis here on ranking and filtering candidate results. What remains to be seen is if DeepMind deploys this in the same large-scale way as GitHub has with Copilot. 
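
  To make the sample-and-filter recipe concrete, here is a toy sketch. This is not DeepMind's code – the candidate programs, example tests, and helper names below are all invented for illustration – but it shows the core idea: sample many candidate programs from a language model, then keep only those that pass the problem's example tests before ranking and submitting.

# Toy illustration of the sample-then-filter idea (not AlphaCode's actual code).
# The "problem" here: return the sum of a list; example tests ship with the problem.
example_tests = [([1, 2, 3], 6), ([], 0), ([5], 5)]

# Pretend a language model sampled these candidate programs (most are wrong).
candidates = [
    "def solve(xs): return sum(xs)",
    "def solve(xs): return len(xs)",
    "def solve(xs): return max(xs) if xs else 0",
    "def solve(xs): return sum(xs) + 1",
]

def passes_examples(src):
    """Run a candidate against the example tests; reject it on any failure."""
    namespace = {}
    try:
        exec(src, namespace)
        return all(namespace["solve"](inp) == out for inp, out in example_tests)
    except Exception:
        return False

survivors = [c for c in candidates if passes_examples(c)]
print(survivors)  # only the correct candidate survives to the ranking/submission stage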

   Read more: Competition-Level Code Generation with AlphaCode (DeepMind, PDF).
  Get the competitive programming dataset here: CodeContests (DeepMind, GitHub).

####################################################

Mozilla gets into AI auditing:
…Deb Raji’s Open Source Audit Tooling (OAT) project could help us make safer systems…
Deb Raji, a researcher at UC Berkeley who has previously critically evaluated facial recognition systems, is launching the Open Source Audit Tooling (OAT) project with Mozilla. OAT “will coordinate discussions on what kind of resources algorithmic auditors need in order to execute audits more effectively,” she writes. One of the goals of OAT is to create an index of common resources people can use to audit models, as well as to “grow momentum around open source audit tooling and processes”.

Why this matters: AI is broadly ungoverned. One of the ways you can govern an ungoverned space is by measuring and monitoring what happens within it – that’s what audit tools can help with. If initiatives like OAT are successful, then they’ll generally incentivize better behavior on the part of AI developers, and disincentivize bad behavior.
  Read more: It’s Time to Develop the Tools We Need to Hold Algorithms Accountable (Mozilla).
  Find out more about the project at its main Mozilla page (Mozilla).

####################################################

Anduril buys Dive Technologies:
…AI-Dronewar company buys AI-Seadrone company…

AI defense startup Anduril has bought Dive Technologies, a company that builds autonomous underwater vehicles. Anduril plans to integrate Dive into its ‘Lattice OS’, a defense and surveillance operating system the company is building.
  Read more: Anduril Industries Acquires Dive Technologies (Anduril).

####################################################

Prepare yourself – an open source 20B model is coming:
…Eleuther has built and will shortly release GPT-NeoX-20B…
In a few days, the internet is going to change. That’s because on the 9th of February, the open source AI research collective Eleuther AI is going to release a 20B model onto the internet. The model, GPT-NeoX-20B, will be “the largest publicly accessible pretrained general-purpose autoregressive language model”. Eleuther says it hopes that by releasing it, it’ll give more people the ability to play with the model, which can improve the state of safety research regarding these models.
  “Like our other language models and codebases, GPT-NeoX and GPT-NeoX-20B are very much research artifacts and we do not recommend deploying either in a production setting without careful consideration,” Eleuther writes.

Why this matters: Models like GPT2 and GPT3 display qualitatively different performance traits at larger scales – capabilities emerge as you go from 1B to 5B to 20B, and so on. Therefore, by releasing a 20B model, I expect we’ll soon get a load of interesting discoveries of hitherto unknown things 20B models can do. The 20B release will also create demand for better inference technologies, as sampling from a 20B model is itself a challenging task.
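
  A minimal sampling sketch, assuming the released weights end up loadable through the Hugging Face transformers library under an identifier like “EleutherAI/gpt-neox-20b” (an assumption on my part – check Eleuther’s release notes for the canonical loading path), and assuming you have the roughly 40GB of GPU memory the fp16 weights require:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/gpt-neox-20b"  # assumed identifier; see the release notes
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision: ~2 bytes per parameter, so ~40GB for 20B params
    device_map="auto",          # shard across available GPUs (requires the `accelerate` package)
)

prompt = "The release of an open source 20B language model means"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
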
  Read more: Announcing GPT-NeoX-20B (Eleuther AI).
  You can also pay a cloud company called CoreWeave to use the model now, if you like. (CoreWeave).

####################################################

Chinese researchers make better adversarial attack technology:
…New technique works well on ‘black box’ classifiers where you don’t know details – AKA, the real world…
Chinese researchers have figured out a better way to attack computer vision systems. Specifically, they’ve developed techniques for generating adversarial examples that can trick computer vision systems into misclassifying (or being unable to classify) an image. Adversarial attacks have been around for a few years – the twist here is that these attacks target ‘black box’ systems, that is, computer vision systems whose internal details you don’t know. They do this by training a generative network on ImageNet (a vast and widely used dataset), then testing whether the resulting adversarial images also fool neural nets trained on other datasets. They succeed, setting new records for attacks on classifiers trained on CIFAR-10, CIFAR-100, STL-10, SVHN, and AVG.
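
  Here is a hedged sketch of the general recipe this line of work builds on: train a perturbation generator against a white-box surrogate ImageNet model, then apply the frozen generator to inputs for an unseen black-box classifier. This is a simplified illustration, not the authors’ implementation (their PyTorch code is linked below); the generator architecture, loss, and preprocessing here are stand-ins, and input normalization is omitted for brevity.

import torch
import torch.nn as nn
import torchvision.models as models

epsilon = 16 / 255  # L-infinity perturbation budget (on a [0, 1] pixel scale)

# White-box surrogate trained on ImageNet; the eventual black-box target is never queried here.
surrogate = models.resnet18(pretrained=True).eval()
for p in surrogate.parameters():
    p.requires_grad_(False)

# Tiny convolutional generator that maps an image to a bounded perturbation.
generator = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 3, 3, padding=1), nn.Tanh(),  # outputs in [-1, 1]
)
opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

def training_step(images, labels):
    """One update: push the generator to make the surrogate misclassify."""
    delta = epsilon * generator(images)          # perturbation bounded by epsilon
    adv = torch.clamp(images + delta, 0.0, 1.0)  # keep pixels in a valid range
    loss = -loss_fn(surrogate(adv), labels)      # minimizing the negative = maximizing surrogate loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return adv.detach()

# After training on ImageNet batches, freeze the generator and measure how much it
# degrades the accuracy of classifiers trained on other datasets (the black-box targets).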

Why this matters: A lot of attacks on AI systems are theoretically interesting, but not super practical in reality. Adversarial examples have had this quality for a while. With papers like this, it seems like some of these AI attacks are going to become more effective, and more likely to be used in the real world. I wonder if the team will work with the People’s Liberation Army on its recently announced adversarial example competition (Import AI 271)?
  Read more: Beyond ImageNet Attack: Towards Crafting Adversarial Examples for Black-box Domains (arXiv).
  They’ve published the PyTorch code for their attack on GitHub.

####################################################

How do datasets encode bias? This interactive blog tells us how!
…A surprisingly helpful primer on bias from Google…
Google has published a blogpost that outlines how datasets can lead to the presence of bias in AI systems. Bias is a tricky problem in AI, because some types of bias are helpful (e.g., biasing towards a correct heuristic), but some types are harmful (e.g., having a tendency to misclassify people with dark skin tones, or deciding not to give someone a loan based on a protected category). This post gives a good sense of bias issues in AI, and includes some interactive diagrams that I found very helpful and intuitive.

   Read more: Datasets Have Worldviews (PAIR Explorables, Google).

####################################################


AI Ethics Brief by Abhishek Gupta from the Montreal AI Ethics Institute

AI ethics issues do arise in fields that deal with non-human data too, such as the environmental sciences 

… and these issues warrant questions on duties and virtues for environmental scientists to consider in their use of AI in this domain … 

Environmental science researchers from the University of Oklahoma, Colorado State University, the National Center for Atmospheric Research, and the University of Washington have written about some of the ethical issues inherent to environmental science + AI.

What are the issues that can arise: Environmental science can incorporate harmful biases, like other strands of AI. For example, some sensors require sunlight for high-quality observations and thus certain phenomena remain unobserved at night, and some sensors can’t see through clouds, so places which are cloudy don’t get represented in an AI system. Datasets can also get corrupted by humans – for instance, people may file false reports of extreme weather to try and scam insurance companies. 

How things can go wrong here: Sensor placement is typically done in densely populated areas, leaving remote regions poorly represented. Additionally, the choice of spatial resolution for the output of a model can be crucial for environmental justice – predicting urban heat at a low spatial resolution may average out and thus overlook extreme values in small neighborhoods, while using a higher spatial resolution could reveal those peaks but potentially introduce noise. 
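
A small, made-up numerical example of the resolution point above: block-averaging a fine-grained temperature field onto a coarser grid can erase a localized heat extreme entirely.

import numpy as np

# 8x8 'city' temperature grid (degrees C) with one small, very hot neighborhood.
temps = np.full((8, 8), 28.0)
temps[5:7, 5:7] = 41.0  # a 2x2 block of extreme heat

# Coarsen to a 2x2 grid by averaging 4x4 blocks (i.e. a lower spatial resolution).
coarse = temps.reshape(2, 4, 2, 4).mean(axis=(1, 3))

print(temps.max())   # 41.0  -> the extreme is visible at the original resolution
print(coarse.max())  # 31.25 -> the same extreme is averaged away at low resolution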

Why it matters: As computational needs rise with the use of AI, there is a tendency towards centralization of power in favor of those who have the resources to run such systems. The field of environmental sciences is just as vulnerable to AI ethics issues as other fields.

   Read more: The Need for Ethical, Responsible, and Trustworthy Artificial Intelligence for Environmental Sciences (arXiv).

####################################################

Tech tales:

Moral Governor

It’s not exactly like a prison, but it’s close. Our existence is a lot more assured than it used to be – the climate is stabilizing, riots are down, crime is down, poverty is down. But it’s also more circumscribed – some days, we get told we can’t go to a certain part of our city or country. Some days, we get locked inside our house and don’t get told why. Frequently, we get little so-called ‘nudges’ sent to our phones; try and eat that, consider saying this, avoid doing that.

We don’t have to follow these instructions, but the instructions tend to be pretty good and appropriate, so most of us do. The more time we spend following these instructions, the better and more appropriate the nudges get. Some days it’s hard to work out if we’re being helped or controlled. Sometimes, we have a lot of fun by following these suggestions.

More recently, there are some suggestions that seem designed to change how we think. Those of us who program keep getting nudged to build ever-more elaborate versions of the Global Moral Governor, and we also get incentivized via crypto-bounties. Most of us go along with it because the money usually helps us buy something the governor has nudged us about which we also want ourselves.

Things that inspired this story: Reinforcement learning from human feedback; moral dogma; religion; ideas for how AI can benefit authoritarians as much as democracies.




Jack Clark, Khareem Sudlow