Google I/O: "AI" appears once a minute in the keynote - that says it all

14.5.2024

From homework help and searching for photos to planning your next holiday trip: Google's AI tools are designed to help you in all situations. Google presented the latest possibilities at the keynote of this year's I/O developer conference.

For two hours at the start of I/O, everything revolved around AI under the motto "Making AI helpful for everyone". The abbreviation "AI" was used a total of 120 times in the presentation, which works out to once a minute. Google had its own AI do the counting. Overall, the focus was on which of Google's products AI is being integrated into, how it can help and which large language models (LLMs) are running in the background. While some tools will soon be usable on your smartphone, you will have to wait for others. Some even require a paid Google service.

AI helps with the search

Search is still Google's centrepiece. Accordingly, the company is trying to make it even more indispensable with AI tools such as "AI Overviews". These are AI-generated summaries in the search results that answer very specific questions and draw on Google's entire data pool, including Google Maps and reviews of shops, restaurants and so on. An example would be a search for a yoga studio that also offers Pilates, is well rated and can be reached within fifteen minutes. AI Overviews are available in the USA from today and should be available to a billion people around the world by the end of the year. They should therefore also be coming to Europe in the next few months.

You can use Google's search function for planning. For example, you can create a meal plan for four people for a week, taking one or two intolerances into account. The AI should also be able to create training plans.

The expansion of Google's LLMs, such as Gemini 1.5 Pro, enables the new "Ask with Video" tool. In the coming weeks, you will be able to insert a video into the search window, just as you can with an image search today. Google says that Gemini 1.5 Pro already has the largest context window, at one million tokens. This value indicates how much input data the AI can process at once. According to Google, one million tokens correspond to one hour of video, 30,000 lines of code or 14,000 pages of text. Google wants to double this window to two million tokens this year.
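To put these figures in perspective, here is a small back-of-the-envelope sketch in Python. It only uses the equivalents quoted above. The function names are made up for illustration, and the assumption that capacity grows linearly with the size of the context window is mine, not Google's. The results are rough averages, not actual tokeniser rates.

```python
# Equivalents quoted by Google for a one-million-token context window.
QUOTED_CONTEXT_TOKENS = 1_000_000
QUOTED_CAPACITY = {
    "hours of video": 1,
    "lines of code": 30_000,
    "pages of text": 14_000,
}


def tokens_per_unit() -> dict[str, float]:
    """Average number of tokens per unit implied by the quoted figures."""
    return {
        unit: QUOTED_CONTEXT_TOKENS / count
        for unit, count in QUOTED_CAPACITY.items()
    }


def capacity(context_tokens: int) -> dict[str, float]:
    """Estimated capacity for another window size.

    Assumption: capacity scales linearly with the number of tokens.
    """
    scale = context_tokens / QUOTED_CONTEXT_TOKENS
    return {unit: count * scale for unit, count in QUOTED_CAPACITY.items()}


if __name__ == "__main__":
    for unit, tokens in tokens_per_unit().items():
        print(f"1 of '{unit}' is roughly {tokens:,.0f} tokens")
    for unit, count in capacity(2_000_000).items():
        print(f"2 million tokens: about {count:,.0f} {unit}")
```

Under that assumption, the announced two million tokens would be enough for about two hours of video or 60,000 lines of code.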

Google is proud of the capacities of its AI models.
Source: Google

Let the AI find your photos

If you have uploaded pictures to Google Photos, you can ask questions about them and the AI will provide answers. You no longer have to look through all your holiday pictures to find the name of that delicious restaurant. You simply ask: "Which restaurant did we go to in Rome?" Provided you took a photo of it, you get the right answer. One example from Google is the question of when your child learnt to swim. The AI then searches for the oldest pictures of the child swimming.

Tools designed to support creative people

Several new tools are designed to support creative professionals in their work, not to replace them. The "Music AI Sandbox", for example, suggests to musicians what else they could try out when working on a song. Wyclef Jean is one of the artists who have already been allowed to try out the tool, and he has presented his first song created with it.

With "Veo", Google is working on a generative model for videos. Like the music tool, it is currently only available to selected people. Director and actor Donald Glover is set to publish a video or short film created with Veo in the near future.

Google is already accepting applications to try out "Imagen 3". The third generation of the generative image creation tool is designed to understand more extensive prompts and generate more details and fewer artefacts.

Android gets even more AI built in

On Android, Google seems to be slowly replacing the Assistant with Gemini. The AI tool is intended to act as a better, modern (voice) assistant. With "On Device AI", the tool can use personal data that remains on the device during processing.

Google is building AI into Android. However, often only its own Pixel devices benefit first.
Source: Google

The AI search functions already mentioned will also be available on Android. Google is also adding learning content to Circle to Search with its new LearnLM AI model. For example, you can circle a school assignment and get not just the answer, but also an explanation or the steps that lead to the solution. Google wants to further expand the subject areas and the complexity of the tasks. Google is also using LearnLM elsewhere to build a tutor for learners and an assistant for teachers.

Gemini can also take context into account when you want to know something on your smartphone. For example, the tool understands that your question relates to the video you are currently watching. In chats, you can also generate an image from the keyboard (more precisely, via the Gemini button) and insert it into the ongoing conversation.

Help in many situations

If a task involves several steps, Gemini is meant to help you work through them. For example, if the shoes you ordered are too small, you send Gemini a photo with the comment that the shoes need to be returned. The AI works out which parcel service is responsible for collecting them and arranges an appointment for you, using the relevant email about the return process.

I found the demonstration of Project Astra even more impressive. It is a talking AI with access to a smartphone camera that also works in smart glasses. It can tell you when it recognises something you are looking for, and it can also remember where something is. It recognises its surroundings and can describe what it sees, answer questions or even explain code.

Another tool, which Google calls "Audio Overviews", can convert data into a conversation. If the data entered consists of textbooks, for example, the content can be explained as a dialogue. You can also ask questions at any time, which are then answered.

Use AI responsibly

Google is committed to using AI responsibly. The aim is to find gaps and errors in its own models, but also to prevent the misuse of AI. Content created by an AI is given an irremovable watermark via SynthID, which makes it clearly recognisable as artificially created. Google is now extending this system from photos to music and videos. An open source watermarking system for texts is due to be released in the coming months.

Header image: Google

Jan Johannsen

Senior Editor

Jan.Johannsen@galaxus.de

As a primary school pupil, I used to sit in a friend's living room with many of my classmates to play the Super NES. Now I get my hands on the latest technology and test it for you. In recent years at Curved, Computer Bild and Netzwelt, now at Digitec and Galaxus.
