AI Week: Google Goes All Out at I/O as Regulations Ramp Up

Keeping up with an industry that moves as fast as AI is a difficult task. So until an AI can do it for you, here’s a helpful roundup of the past week’s stories in the world of machine learning, along with notable research and experiments we didn’t cover on their own.

This week, Google dominated the AI news cycle with a range of new products launched at its annual I/O developer conference. They range from a code-generating AI meant to compete with GitHub’s Copilot to an AI music generator that turns text prompts into short songs.

A fair amount of these tools appear to be legitimate labor savers rather than marketing fluff. I’m particularly intrigued by Project Tailwind, a note-taking app that leverages AI to organize, summarize, and analyze files in a personal Google Docs folder. But they also expose the limitations and shortcomings of even today’s best AI technologies.

Take PaLM 2, for example, Google’s newest large language model (LLM). PaLM 2 will power Google’s updated Bard chat tool, the company’s competitor to OpenAI’s ChatGPT, and serve as the base model for most of Google’s new AI features. But while PaLM 2 can write code, emails, and more like comparable LLMs, it also responds to questions in toxic and biased ways.

Google’s music generator is also quite limited in what it can achieve. As I wrote in my hands-on, most of the songs I’ve created with MusicLM sound passable at best, and at worst, like a four-year-old let loose in a DAW.

Much has been written about how AI could replace jobs, potentially the equivalent of 300 million full-time jobs, according to a report by Goldman Sachs. And in a Harris survey, 40% of workers familiar with ChatGPT, OpenAI’s AI-powered chatbot tool, said they’re worried it will replace their jobs entirely.

Google’s AI isn’t the be-all and end-all. Indeed, the company is arguably behind in the AI race. But it’s an undeniable fact that Google employs some of the best AI researchers in the world. And if this is the best they can manage, it’s a testament to the fact that AI is far from a solved problem.

Here are the other AI headlines of note from the past few days:

  • Meta brings generative AI to ads: This week, Meta announced an AI sandbox of sorts for advertisers to help create alternate copy, background generation via text prompts, and image cropping for Facebook or Instagram ads. The company said the features are available to select advertisers right now and will expand access to more advertisers in July.
  • Context added: Anthropic has expanded the context window for Claude, its flagship text-generating AI model, still in preview, from 9,000 tokens to 100,000 tokens. The context window refers to the text the model considers before generating additional text, while tokens represent chunks of raw text (e.g., the word “fantastic” would be split into the tokens “fan,” “tas,” and “tic”). Historically and even today, poor memory has been an impediment to the usefulness of text-generating AI. But larger context windows could change that.
  • Anthropic touts ‘constitutional AI’: Larger context windows aren’t the only differentiator for Anthropic’s models. This week, the company detailed “constitutional AI,” its internal AI training technique that aims to imbue AI systems with “values” defined by a “constitution.” Unlike other approaches, Anthropic argues, constitutional AI makes the behavior of systems easier to understand and simpler to adjust as needed.
  • An LLM built for research: The Allen Institute for AI Research (AI2), a non-profit organization, announced that it plans to train a research-focused LLM called the Open Language Model, adding to the large and growing open source library. AI2 sees the Open Language Model, or OLMo for short, as a platform and not just a model, which will allow the research community to take each component AI2 creates and use it themselves or seek to improve it.
  • New backing for AI: In other AI2 news, AI2 Incubator, the nonprofit’s AI startup fund, is relaunching at three times its previous size: $30 million versus $10 million. Twenty-one companies have passed through the incubator since 2017, attracting about $160 million in further investment and at least one major acquisition: XNOR, an AI acceleration and efficiency outfit that Apple later acquired for about $200 million.
  • The EU introduces rules for generative AI: In a series of votes in the European Parliament, MEPs this week backed a series of amendments to the bloc’s AI bill, including setting out requirements for so-called foundational models that underpin generative AI technologies like OpenAI’s ChatGPT. The amendments place the onus on foundational model providers to apply safety checks, data governance measures, and risk mitigations before putting their models on the market.
  • A universal translator: Google is testing a powerful new translation service that redubs video in a new language while also syncing the speaker’s lips to words they never said. It could be very useful for many reasons, but the company was candid about the potential for abuse and the steps taken to prevent it.
  • Automated explanations: LLMs along the lines of OpenAI’s ChatGPT are often said to be a black box, and certainly, there’s some truth to that. In an effort to peel back their layers, OpenAI is developing a tool to automatically identify which parts of an LLM are responsible for which of its behaviors. The engineers behind it stress that it’s in the early stages, but the code to run it is available in open source on GitHub as of this week.
  • IBM launches new AI services: At its annual Think conference, IBM announced IBM Watsonx, a new platform that offers tools for building AI models and provides access to pre-trained models to generate computer code, text and more. The company says the launch was motivated by the challenges many companies still experience when implementing AI in the workplace.
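The relationship between tokens and context windows in the Claude item above can be made concrete with a toy sketch. The chunking scheme below is purely hypothetical, for illustration; it is not how Claude or any real model actually tokenizes text:

```python
# Toy illustration of tokens vs. context windows (hypothetical tokenizer,
# not Anthropic's actual one).

def toy_tokenize(text, chunk=3):
    """Split each word into fixed-size chunks, e.g. 'fantastic' -> 'fan', 'tas', 'tic'."""
    tokens = []
    for word in text.split():
        tokens.extend(word[i:i + chunk] for i in range(0, len(word), chunk))
    return tokens

def fits_in_context(text, window_tokens):
    """Would this text fit in a context window of `window_tokens` tokens?"""
    return len(toy_tokenize(text)) <= window_tokens

print(toy_tokenize("fantastic"))        # ['fan', 'tas', 'tic']
print(fits_in_context("a " * 50, 100))  # True: 50 single-token words fit in 100
```

The point of the sketch is simply that a context window is measured in tokens, not words or characters, which is why expanding it from 9,000 to 100,000 tokens is such a large jump in how much text the model can consider at once.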

Other machine learning

Image Credits: Landing AI

Andrew Ng’s new company, Landing AI, is taking a more intuitive approach to building computer vision training. Getting a model to understand what you want to identify in images is painstaking work, but its “visual prompting” technique lets you make just a few strokes and it figures out your intent from there. Anyone who has to build segmentation models is saying “OMG, finally!” Probably the many grad students currently spending hours masking organelles and household objects.

Microsoft has applied diffusion models in a unique and interesting way, essentially using them to generate an action vector instead of an image, having trained them on lots of observed human actions. It’s still very early days and diffusion isn’t the obvious solution here, but since diffusion models are stable and versatile, it’s interesting to see how they can be applied beyond purely visual tasks. Their paper will be presented at ICLR later this year.

Image Credits: Meta

Meta is also pushing the limits of AI with ImageBind, which it claims is the first model that can process and integrate data from six different modalities: images and video, audio, 3D depth data, thermal information, and motion or position data. This means that in its shared machine learning embedding space, an image can be associated with a sound, a 3D shape, and various text descriptions, any of which can be queried or used to make a decision. It’s a step toward “general” AI in that it absorbs and associates data more like the brain does, but it’s still basic and experimental, so don’t get too excited just yet.
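The idea of a shared embedding space, where an image, a sound, and a caption land near one another, can be sketched in a few lines. The vectors below are made up for illustration; ImageBind’s real embeddings are learned and high-dimensional:

```python
import math

# Made-up 3-D embeddings standing in for a shared multimodal space.
# In ImageBind these would be learned, high-dimensional vectors.
embeddings = {
    "photo_of_dog": [0.9, 0.1, 0.2],
    "bark_audio":   [0.8, 0.2, 0.1],
    "engine_noise": [0.1, 0.9, 0.3],
}

def cosine(u, v):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def nearest(query, candidates):
    """Cross-modal retrieval: which candidate embedding sits closest to the query?"""
    return max(candidates, key=lambda k: cosine(embeddings[query], embeddings[k]))

# Querying with the dog photo retrieves the bark, not the engine noise.
print(nearest("photo_of_dog", ["bark_audio", "engine_noise"]))
```

Because everything lives in one space, a query in any modality can retrieve its counterparts in the others by simple nearest-neighbor lookup, which is what makes the single-embedding-space design so flexible.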

If these proteins touch each other… what happens?

Everyone was excited about AlphaFold, and for good reason, but structure is really only a small part of the complex science of proteomics. How those proteins interact is just as important and hard to predict, but this new PeSTo model from EPFL attempts to do just that. “It focuses on significant atoms and interactions within the protein structure,” said lead developer Lucien Krapp. “It means that this method effectively captures complex interactions within protein structures to enable accurate prediction of protein binding interfaces.” Even if it isn’t exact or 100% reliable, not having to start from scratch is super useful for researchers.

The feds are getting serious about AI. The president even dropped in on a meeting with a group of top AI CEOs to say how important it is to get this right. Maybe a bunch of corporations aren’t necessarily the right ones to ask, but they’ll at least have some ideas worth considering. But they already have lobbyists, don’t they?

I’m more excited about the new AI research centers popping up with federal funding. Basic research is badly needed to counterbalance the product-focused work being done by the likes of OpenAI and Google, so when you have AI centers with mandates to investigate things like the social sciences (at CMU) or climate change and agriculture (at the University of Minnesota), it feels like green fields (both figuratively and literally). Though I also want to give a little shout out to this Meta research on forest measurement.

Making AI together on a big screen: it’s science!

There are lots of interesting conversations happening about AI. I thought this interview with UCLA (my alma mater, go Bruins) academics Jacob Foster and Danny Snelson was an interesting one. Here’s a great thought on LLMs to pretend you came up with yourself this weekend when people are talking about AI:

These systems reveal just how formally consistent most writing is. The more generic the formats these predictive models simulate, the more successful they are. These developments push us to recognize the normative functions of our forms and potentially transform them. After the introduction of photography, which is very good at capturing a representational space, the medium of painting developed into Impressionism, a style that completely rejected precise representation in favor of the materiality of painting itself.

Definitely using that!
