Tag: fast forward

Pocket-Sized AI Models Could Unlock a New Era of Computing

When ChatGPT was released in November 2022, it could only be accessed through the cloud because the model behind it was downright enormous.

Today I am running a similarly capable AI program on a MacBook Air, and it isn’t even warm. The shrinkage shows how rapidly researchers are refining AI models to make them leaner and more efficient. It also shows how going to ever larger scales isn’t the only way to make machines significantly smarter.

The model now infusing my laptop with ChatGPT-like wit and wisdom is called Phi-3-mini. It’s part of a family of smaller AI models recently released by researchers at Microsoft. Although it’s compact enough to run on a smartphone, I tested it by running it on a laptop and accessing it from an iPhone through an app called Enchanted that provides a chat interface similar to the official ChatGPT app.

In a paper describing the Phi-3 family of models, Microsoft’s researchers say the model I used measures up favorably to GPT-3.5, the OpenAI model behind the first release of ChatGPT. That claim is based on measuring its performance on several standard AI benchmarks designed to measure common sense and reasoning. In my own testing, it certainly seems just as capable.

Will Knight via Microsoft

Microsoft announced a new “multimodal” Phi-3 model capable of handling audio, video, and text at its annual developer conference, Build, this week. That came just days after OpenAI and Google both touted radical new AI assistants built on top of multimodal models accessed via the cloud.

Microsoft’s Lilliputian family of AI models suggests it’s becoming possible to build all kinds of handy AI apps that don’t depend on the cloud. That could open up new use cases, by allowing them to be more responsive or private. (Offline algorithms are a key piece of the Recall feature Microsoft announced that uses AI to make everything you ever did on your PC searchable.)

But the Phi family also reveals something about the nature of modern AI, and perhaps how it can be improved. Sébastien Bubeck, a researcher at Microsoft involved with the project, tells me the models were built to test whether being more selective about what an AI system is trained on could provide a way to fine-tune its abilities.

The large language models like OpenAI’s GPT-4 or Google’s Gemini that power chatbots and other services are typically spoon-fed huge gobs of text siphoned from books, websites, and just about any other accessible source. Although it’s raised legal questions, OpenAI and others have found that increasing the amount of text fed to these models, and the amount of computer power used to train them, can unlock new capabilities.

May 23, 2024
Prepare to Get Manipulated by Emotionally Expressive Chatbots

The emotional mimicry of OpenAI’s new version of ChatGPT could lead AI assistants in some strange—even dangerous—directions.

May 15, 2024
6 Practical Tips for Using Anthropic’s Claude Chatbot

Joel Lewenstein, a head of product design at Anthropic, was recently crawling beneath his new house to adjust the irrigation system when he ran into a conundrum: The device’s knobs made no sense. Instead of scouring the internet for a product manual, he opened up the app for Anthropic’s Claude chatbot on his phone and snapped a photo. Its algorithms analyzed the image and provided more context for what each knob might do.

When I tested OpenAI’s image features for ChatGPT last year, I found it similarly useful—at least for low-stakes tasks. I’d recommend you turn to AI image analysis for identifying those random cords around your house, but not to guess the identity of a loose prescription pill.

Anthropic released the iOS app that helped out Lewenstein for all to download earlier this month. I decided to try out the Claude app, in line with a goal I’d set to experiment with a wider variety of chatbots this year. And I chatted over video with Lewenstein to see what advice he had for getting started with Claude and how to ask questions in a way that elicits the most useful answers.

Get Chatty

Decades of Google Search dominating the web has trained us to type blunt and concise queries when we want something. To get the most out of chatbots like Claude, you need to break free from that approach. “It’s not Google Search,” Lewenstein says. “So you’re not putting in three keywords—you’re really having a conversation with it.” He encourages users to avoid an overly utilitarian communication style and to get a little more verbose with their prompts. Instead of a short phrase, try writing prompts that are a few sentences long or even a couple of paragraphs.

Share Photos

AI image analysis is still fairly new for Anthropic’s chatbot—it was released in March—but it can provide a powerful way to quickly pose questions to the chatbot. Lewenstein recommends using images as a launching point for conversations with Claude, like he did under his house. Although the feature may not always be accurate, it’s useful—and fun—if you keep the limitations in mind and look for opportunities where an image can address your query.

Be Direct

Still not getting the outputs you’d like? A solid troubleshooting technique is to be overly prescriptive in your prompts. “Just talking to Claude like a person actually leads you a little bit astray,” Lewenstein says. Instead, try giving Claude an almost awkward amount of context about how you’d like the answers formatted—for example, by saying they should be in bullet points or short paragraphs—and give it clear direction on the tone it should use. Do you want lyrical answers or something that sounds more technical? Also, consider telling Claude who the intended audience is and what their level of knowledge about the topic may be.

Try, Try Again

If your initial query to Claude doesn’t produce a good result, keep in mind that your first ask is just the starting point. Follow-up prompts and clarifying questions are critical to steering a chatbot in the right direction.

When interacting with any chatbot, I’m quick to start a new conversation thread if the output goes awry, so I can try a different opening prompt. This isn’t the best approach, Lewenstein says.

He suggests staying in that same chat window and providing direct feedback to the bot about what you’d like done differently, from tone to structure. “I literally just type, ‘No, too complicated. I don’t understand what these words mean. Can you try again, but simplify it one level more,’” says Lewenstein, referencing a time when Claude’s summary of a document was confusing.

Upload Big Docs

Speaking of documents, Claude’s ability to analyze uploaded data is one of its strengths. The applications for this are more apparent for workplace use cases, where the chatbot can help with Excel spreadsheets and overflowing email inboxes, but it can be a useful feature outside the office too. If you upload batches of text, Claude can spot trends you might not have otherwise noticed. Ask the chatbot to look for patterns in language use or the topics covered. Got a PDF that you need to read but is so long that your eyes glaze over? Claude can help focus your attention on the most important aspect of the document first.

I uploaded the text transcript of my conversation with Lewenstein to Claude and asked what quotes it would highlight as important. The chatbot did an impeccable job of capturing the conversation’s key themes, and it flagged many of the quotes that I ultimately decided to pull for this newsletter. (Anthropic’s policies mean that, unless you opt in, your input data is unlikely to be used to train its AI models.)

Text Like You’re Friends

Yes, you should play around with writing longer and more specific prompts to Claude, but it’s also smart to approach conversations with chatbots as a back-and-forth volley of messages. “I actually find the mobile app to be a really natural form factor for it, because you chat with people all the time on your phone,” says Lewenstein.

When I uploaded a photo of a robot mural I saw in a cool San Francisco bar to the Claude app, the chatbot provided a poetic description of the art. It wasn’t able to guess which city the bar was located in, an almost impossible task, but the conversation’s cadence did feel like messaging an eager friend. Claude thanked me when I finally revealed the bar’s location: “My assumptions were delightfully upended.”

I need to use it more to really get the hang of Claude, but I already feel like the chatbot’s outputs have a friendly flair. Although ChatGPT is still my go-to chatbot, I could see myself adding Claude to the mix when I want to message with an AI tool that prioritizes engaging, human-sounding outputs over a drier, more efficient style of communication. It’s important to remain open to using AI tools that you haven’t tried before. Chatbots continue to improve and change rapidly, so it’s far too early to get locked into a single tool.

May 9, 2024
Nick Bostrom Made the World Fear AI. Now He Asks: What if It Fixes Everything?

Philosopher Nick Bostrom is surprisingly cheerful for someone who has spent so much time worrying about ways that humanity might destroy itself. In photographs he often looks deadly serious, perhaps appropriately haunted by the existential dangers roaming around his brain. When we talk over Zoom, he looks relaxed and is smiling.

Bostrom has made it his life’s work to ponder far-off technological advancement and existential risks to humanity. With the publication of his last book, Superintelligence: Paths, Dangers, Strategies, in 2014, Bostrom drew public attention to what was then a fringe idea—that AI would advance to a point where it might turn against and delete humanity.

To many in and outside of AI research the idea seemed fanciful, but influential figures including Elon Musk cited Bostrom’s writing. The book set a strand of apocalyptic worry about AI smoldering that recently flared up following the arrival of ChatGPT. Concern about AI risk is not just mainstream but also a theme within government AI policy circles.

Bostrom’s new book takes a very different tack. Rather than play the doomy hits, Deep Utopia: Life and Meaning in a Solved World considers a future in which humanity has successfully developed superintelligent machines but averted disaster. All disease has been ended and humans can live indefinitely in infinite abundance. Bostrom’s book examines what meaning there would be in life inside a techno-utopia, and asks if it might be rather hollow. He spoke with WIRED over Zoom, in a conversation that has been lightly edited for length and clarity.

Will Knight: Why switch from writing about superintelligent AI threatening humanity to considering a future in which it’s used to do good?

Nick Bostrom: The various things that could go wrong with the development of AI are now receiving a lot more attention. It’s a big shift in the last 10 years. Now all the leading frontier AI labs have research groups trying to develop scalable alignment methods. And in the last couple of years also, we see political leaders starting to pay attention to AI.

There hasn’t yet been a commensurate increase in depth and sophistication in terms of thinking of where things go if we don’t fall into one of these pits. Thinking has been quite superficial on the topic.

When you wrote Superintelligence, few would have expected existential AI risks to become a mainstream debate so quickly. Will we need to worry about the problems in your new book sooner than people might think?

As we start to see automation roll out, assuming progress continues, then I think these conversations will start to happen and eventually deepen.

Social companion applications will become increasingly prominent. People will have all sorts of different views and it’s a great place to maybe have a little culture war. It could be great for people who couldn’t find fulfillment in ordinary life but what if there is a segment of the population that takes pleasure in being abusive to them?

In the political and information spheres we could see the use of AI in political campaigns, marketing, automated propaganda systems. But if we have a sufficient level of wisdom these things could really amplify our ability to sort of be constructive democratic citizens, with individual advice explaining what policy proposals mean for you. There will be a whole bunch of dynamics for society.

Would a future in which AI has solved many problems, like climate change, disease, and the need to work, really be so bad?

May 2, 2024
Meta’s Open Source Llama 3 Is Already Nipping at OpenAI’s Heels

Jerome Pesenti has a few reasons to celebrate Meta’s decision last week to release Llama 3, a powerful open source large language model that anyone can download, run, and build on.

Pesenti used to be vice president of artificial intelligence at Meta and says he often pushed the company to consider releasing its technology for others to use and build on. But his main reason to rejoice is that his new startup will get access to an AI model that he says is very close in power to OpenAI’s industry-leading text generator GPT-4, but considerably cheaper to run and more open to outside scrutiny and modification.

“The release last Friday really feels like a game-changer,” Pesenti says. His new company, Sizzle, an AI tutor, currently uses GPT-4 and other AI models, both closed and open, to craft problem sets and curricula for students. His engineers are evaluating whether Llama 3 could replace OpenAI’s model in many cases.

Sizzle’s story may augur a broader shift in the balance of power in AI. OpenAI changed the world with ChatGPT, setting off a wave of AI investment and drawing more than 2 million developers to its cloud APIs. But if open source models prove competitive, developers and entrepreneurs may decide to stop paying to access the latest model from OpenAI or Google and use Llama 3 or one of the other increasingly powerful open source models that are popping up.

“It’s going to be an interesting horse race,” Pesenti says of competition between open models like Llama 3 and closed ones such as GPT-4 and Google’s Gemini.

Meta’s previous model, Llama 2, was already influential, but the company says it made the latest version more powerful by feeding it larger amounts of higher-quality training data, with new techniques developed to filter out redundant or garbled content and to select the best mixture of datasets to use.

Pesenti says running Llama 3 on a cloud platform such as Fireworks.ai costs just a 20th of the cost of accessing GPT-4 through an API. He adds that Llama 3 can be configured to respond to queries extremely quickly, a key consideration for developers at companies like his that rely on tapping into models from different providers. “It’s an equation between latency, cost, and accuracy,” he says.

Open models appear to be dropping at an impressive clip. A couple of weeks ago, I went inside startup Databricks to witness the final stages of an effort to build DBRX, a language model that was briefly the best open one around. That crown is now Llama 3’s. Ali Ghodsi, CEO of Databricks, also describes Llama 3 as “game-changing” and says the larger model “is approaching the quality of GPT 4—that levels the playing field between open and closed-source LLMs.”

Llama 3 also showcases the potential for making AI models smaller, so they can be run on less powerful hardware. Meta released two versions of its latest model, one with 70 billion parameters—a measure of the variables it uses to learn from training data—and another with 8 billion. The smaller model is compact enough to run on a laptop but is remarkably capable, at least in WIRED’s testing.

Two days before Meta’s release, Mistral, a French AI company founded by alumni of Pesenti’s team at Meta, open sourced Mixtral 8x22B. It has 141 billion parameters but uses only 39 billion of them at any one time, a design known as a mixture of experts. Thanks to this trick, the model is considerably more capable than some models that are much larger.
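To make the mixture-of-experts idea concrete, here is a minimal sketch in Python. It is a toy illustration and not Mistral’s implementation: the eight tiny experts, the random router weights, and the choice to activate two experts per input are all assumptions made for the example. Only the 141 billion and 39 billion parameter figures come from the reporting above.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(seed, dim):
    """Build a toy 'expert': a fixed random linear layer (hypothetical)."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

    def expert(x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    return expert


def moe_forward(x, experts, router_weights, top_k=2):
    """Route x to the top_k highest-scoring experts and blend their outputs."""
    # The router gives every expert a score for this input.
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    # Only the top_k experts run; the rest stay idle for this input.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    gates = softmax([scores[i] for i in chosen])
    output = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        expert_out = experts[idx](x)
        output = [o + gate * y for o, y in zip(output, expert_out)]
    return output, chosen


# Illustrative sizes only; real models are vastly larger.
DIM, NUM_EXPERTS = 4, 8
experts = [make_expert(seed, DIM) for seed in range(NUM_EXPERTS)]
router_rng = random.Random(42)
router_weights = [
    [router_rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)
]

output, chosen = moe_forward([0.5, -1.0, 0.25, 2.0], experts, router_weights)

# Share of parameters in use at once, from the figures reported above.
active_fraction = 39 / 141
```

In this sketch only two of the eight experts do any work for a given input, which is the source of the savings. By the reported figures, roughly 28 percent of Mixtral 8x22B’s parameters are active at any one time.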

Meta isn’t the only tech giant releasing open source AI. This week Microsoft released Phi-3-mini and Apple released OpenELM, two tiny but capable free-to-use language models that can run on a smartphone.

Coming months will show whether Llama 3 and other open models really can displace premium AI models like GPT-4 for some developers. And even more powerful open source AI is coming. Meta is working on a massive 400-billion-parameter version of Llama 3 that chief AI scientist Yann LeCun says should be one of the most capable in the world.

Of course all this openness is not purely altruistic. Meta CEO Mark Zuckerberg says opening up its AI models should ultimately benefit the company by lowering the cost of technologies it relies on, for example by spawning compatible tools and services that Meta can use for itself. He left unsaid that it may also be to Meta’s benefit to prevent OpenAI, Microsoft, or Google from dominating the field.

April 25, 2024
What If Your AI Girlfriend Hated You?

It seems as though we’ve arrived at the moment in the AI hype cycle where no idea is too bonkers to launch. This week’s eyebrow-raising AI project is a new twist on the romantic chatbot—a mobile app called AngryGF, which offers its users the uniquely unpleasant experience of getting yelled at via messages from a fake person. Or, as cofounder Emilia Aviles explained in her original pitch: “It simulates scenarios where female partners are angry, prompting users to comfort their angry AI partners” through a “gamified approach.” The idea is to teach communication skills by simulating arguments that the user can either win or lose depending on whether they can appease their fuming girlfriend.

The central appeal of a relationship-simulating chatbot, I’ve always assumed, is that they’re easier to interact with than real-life humans. They have no needs or desires of their own. There’s no chance they’ll reject you or mock you. They exist as a sort of emotional security blanket. So the premise of AngryGF amused me. You get some of the downsides of a real-life girlfriend—she’s furious!!—but none of the upsides. Who would voluntarily use this?

Obviously, I downloaded AngryGF immediately. (It’s available, for those who dare, on both the Apple App Store and Google Play.) The app offers a variety of situations where a girlfriend might ostensibly be mad and need “comfort.” They include “You put your savings into the stock market and lose 50 percent of it. Your girlfriend finds out and gets angry” and “During a conversation with your girlfriend, you unconsciously praise a female friend by mentioning that she is beautiful and talented. Your girlfriend becomes jealous and angry.”

The app sets an initial “forgiveness level” anywhere between 0 and 100 percent. You have 10 tries to say soothing things that tilt the forgiveness meter back to 100. I chose the beguilingly vague scenario called “Angry for no reason,” in which the girlfriend is, uh, angry for no reason. The forgiveness meter was initially set to a measly 30 percent, indicating I had a hard road ahead of me.

Reader: I failed. Although I genuinely tried to write messages that would appease my hopping-mad fake girlfriend, she continued to interpret my words in the least generous light and accuse me of not paying attention to her. A simple “How are you doing today?” text from me—Caring! Considerate! Asking questions!—was met with an immediately snappy answer: “Oh, now you care about how I’m doing?” Attempts to apologize only seemed to antagonize her further. When I proposed a dinner date, she told me that wasn’t sufficient but also that I better take her “somewhere nice.”

DATING APPS ONLINE LTD via Kate Knibbs

It was such an irritating experience that I snapped and told this bitchy bot that she was annoying. “Great to know that my feelings are such a bother to you,” the sarcast-o-bot replied. When I decided to try again a few hours later, the app informed me that I’d need to upgrade to the paid version to unlock more scenarios for $6.99 a week. No thank you.

At this point I wondered if the app was some sort of avant-garde performance art. Who would even want their partner to sign up? I would not be thrilled if I knew my husband considered me volatile enough to require practicing lady-placation skills on a synthetic shrew. While ostensibly preferable to AI girlfriend apps seeking to supplant IRL relationships, an app designed to coach men to get better at talking to women by creating a robot woman who is a total killjoy might actually be even worse.

I called Aviles, the cofounder, to try to understand what, exactly, was happening with AngryGF. She’s a Chicago-based social media marketer who says that the app was inspired by her own past relationships, where she was unimpressed by her partners’ communication skills. Her schtick seemed sincere. “You know men,” she says. “They listen, but then they don’t take action.”

Aviles describes herself as the app’s cofounder but isn’t particularly well-versed in the nuts and bolts of its creation. (She says a team of “between 10 and 20” people work on the app but that she is the only founder willing to put her name on the product.) She was able to specify that the app is built on top of OpenAI’s GPT-4 and wasn’t made with any additional custom training data like actual text messages between significant others.

“We didn’t really directly consult with a relationship therapist or anything like that,” she says. No kidding.

April 18, 2024
No One Actually Knows How AI Will Affect Jobs

Forget artificial intelligence breaking free of human control and taking over the world. A far more pressing concern is how today’s generative AI tools will transform the labor market. Some experts envisage a world of increased productivity and job satisfaction; others, a landscape of mass unemployment and social upheaval.

Someone with a bird’s-eye view of the situation is Mary Daly, CEO of the Federal Reserve Bank of San Francisco, part of the national system responsible for setting monetary policy, maintaining a stable financial system, and ensuring maximal employment. Daly, a labor market economist by training, is especially interested in how generative AI might change the labor market picture.

Daly spoke with WIRED senior editor Will Knight over Zoom. The conversation has been edited for length and clarity.

You’ve been talking to early adopter companies about their use of generative AI. What are you seeing—or to ask the question on many people’s minds, are workers being replaced?

More firms than I would have imagined are already looking at it. Some are going to have more opportunities to replace workers, and some more to augment. But overall what I’m seeing is that no firm is using it as a replacement tool alone.

One person I talked to, her company invested in generative AI and used it to help write descriptions of items that they have for sale. They have hundreds of thousands of items, but not all of them are high-margin or are interesting to write about. And so they can keep adding more copywriting staff, or they could use generative AI to write first drafts on these items. Copywriters become auditors, and they do more interesting work.

How confident are you that generative AI won’t eliminate jobs overall?

Technology has never reduced net employment over time for the country. If you look at technology over multiple centuries, what you see is that the impact lands somewhere in the middle, not necessarily dead-set in the middle, but somewhere in there, and where we end up depends a lot on how we engage with the technology.

When I think of generative AI—or AI writ large—what I see is an opportunity. You can replace people, you can augment people, and you can create new opportunities for people. But you do have winners and losers. I came of age as an economist in the computerization era. That computer surge and the productivity that came with it clearly produced inequalities.

AI in general, but especially generative AI, is an opportunity to assist those middle-skilled people in being more productive. But that’s our choice, and that requires a lot of thinking on our part.

So white-collar workers could, in theory, be superpowered by AI. How can we ensure companies deploy the technology that way?

Before we ever get to compel, I think we could start with educate, and a tight labor market actually helps us. In a market where people with a computer science degree are harder to come by, companies basically get pushed by their own motive to be profitable and productive. They ask, ‘How can I utilize less expensive talent more effectively?’ I do believe companies’ thinking naturally tends toward replacing workers, because it’s easier to think that way, but this isn’t set in stone.

The companies that are developing and selling AI models and tools don’t seem to think that way. They seem exclusively focused on how AI can replace humans.

April 11, 2024
To Build a Better AI Supercomputer, Let There Be Light

GlobalFoundries, a company that makes chips for others, including AMD and General Motors, previously announced a partnership with Lightmatter. Harris says his company is “working with the largest semiconductor companies in the world as well as the hyperscalers,” referring to the largest cloud companies like Microsoft, Amazon, and Google.

If Lightmatter or another company can reinvent the wiring of giant AI projects, a key bottleneck in the development of smarter algorithms might fall away. The use of more computation was fundamental to the advances that led to ChatGPT, and many AI researchers see the further scaling-up of hardware as being crucial to future advances in the field—and to hopes of ever reaching the vaguely-specified goal of artificial general intelligence, or AGI, meaning programs that can match or exceed biological intelligence in every way.

Linking a million chips together with light might allow for algorithms several generations beyond today’s cutting edge, says Lightmatter’s CEO Nick Harris. “Passage is going to enable AGI algorithms,” he confidently suggests.

The large data centers that are needed to train giant AI algorithms typically consist of racks filled with tens of thousands of computers running specialized silicon chips and a spaghetti of mostly electrical connections between them. Maintaining training runs for AI across so many systems—all connected by wires and switches—is a huge engineering undertaking. Converting between electronic and optical signals also places fundamental limits on chips’ abilities to run computations as one.

Lightmatter’s approach is designed to simplify the tricky traffic inside AI data centers. “Normally you have a bunch of GPUs, and then a layer of switches, and a layer of switches, and a layer of switches, and you have to traverse that tree” to communicate between two GPUs, Harris says. In a data center connected by Passage, Harris says, every GPU would have a high-speed connection to every other chip.
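The difference Harris describes can be sketched with a simple count of how many switches a message passes through. This is an idealized model and not Lightmatter’s actual design: the function names, the eight GPUs per switch, and the fan-out of eight are hypothetical values chosen for illustration.

```python
def tree_switch_hops(gpu_a, gpu_b, gpus_per_switch, fanout):
    """Count switches crossed between two GPUs in an idealized switch tree.

    Each leaf switch serves `gpus_per_switch` GPUs, and each higher-level
    switch connects `fanout` switches from the level below.
    """
    if gpu_a == gpu_b:
        return 0
    switch_a = gpu_a // gpus_per_switch
    switch_b = gpu_b // gpus_per_switch
    levels = 1
    # Climb the tree until both GPUs sit under the same switch.
    while switch_a != switch_b:
        switch_a //= fanout
        switch_b //= fanout
        levels += 1
    # Up through `levels` switches, then back down through `levels - 1`.
    return 2 * levels - 1


def direct_switch_hops(gpu_a, gpu_b):
    """With a direct link between every pair of chips, no switch sits in between."""
    return 0


same_rack = tree_switch_hops(0, 3, gpus_per_switch=8, fanout=8)
neighbor_rack = tree_switch_hops(0, 9, gpus_per_switch=8, fanout=8)
far_apart = tree_switch_hops(0, 500, gpus_per_switch=8, fanout=8)
```

Under these assumptions, two GPUs on the same leaf switch are one switch apart, while GPUs 0 and 500 are separated by five. In the fully connected layout Harris describes, that count would be zero for every pair.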

Lightmatter’s work on Passage is an example of how AI’s recent flourishing has inspired companies large and small to try to reinvent key hardware behind advances like OpenAI’s ChatGPT. Nvidia, the leading supplier of GPUs for AI projects, held its annual conference last month, where CEO Jensen Huang unveiled the company’s latest chip for training AI: a GPU called Blackwell. Nvidia will sell the GPU in a “superchip” consisting of two Blackwell GPUs and a conventional CPU processor, all connected using the company’s new high-speed communications technology called NVLink-C2C.

The chip industry is famous for finding ways to wring more computing power from chips without making them larger, but Nvidia chose to buck that trend. The Blackwell GPUs inside the company’s superchip are twice as powerful as their predecessors but are made by bolting two chips together, meaning they consume much more power. That trade-off, in addition to Nvidia’s efforts to glue its chips together with high-speed links, suggests that upgrades to other key components for AI supercomputers, like that proposed by Lightmatter, could become more important.

April 4, 2024
The NSA Warns That US Adversaries Free to Mine Private Data May Have an AI Edge

Electrical engineer Gilbert Herrera was appointed research director of the US National Security Agency in late 2021, just as an AI revolution was brewing inside the US tech industry.

The NSA, sometimes jokingly said to stand for No Such Agency, has long hired top math and computer science talent. Its technical leaders have been early and avid users of advanced computing and AI. And yet when Herrera spoke with me by phone about the implications of the latest AI boom from NSA headquarters in Fort Meade, Maryland, it seemed that, like many others, the agency has been stunned by the recent success of the large language models behind ChatGPT and other hit AI products. The conversation has been lightly edited for clarity and length.

Gilbert Herrera. Courtesy of National Security Agency

How big of a surprise was the ChatGPT moment to the NSA?

Oh, I thought your first question was going to be “what did the NSA learn from the Ark of the Covenant?” That’s been a recurring one since about 1939. I’d love to tell you, but I can’t.

What I think everybody learned from the ChatGPT moment is that if you throw enough data and enough computing resources at AI, these emergent properties appear.

The NSA really views artificial intelligence as at the frontier of a long history of using automation to perform our missions with computing. AI has long been viewed as ways that we could operate smarter and faster and at scale. And so we’ve been involved in research leading to this moment for well over 20 years.

Large language models have been around long before generative pretrained (GPT) models. But this “ChatGPT moment”—once you could ask it to write a joke, or once you can engage in a conversation—that really differentiates it from other work that we and others have done.

The NSA and its counterparts among US allies have occasionally developed important technologies before anyone else but kept it a secret, like public key cryptography in the 1970s. Did the same thing perhaps happen with large language models?

At the NSA we couldn’t have created these big transformer models, because we could not use the data. We cannot use US citizens’ data. Another thing is the budget. I listened to a podcast where someone shared a Microsoft earnings call, and they said they were spending $10 billion a quarter on platform costs. [The total US intelligence budget in 2023 was $100 billion.]

It really has to be people that have enough money for capital investment that is tens of billions and [who] have access to the kind of data that can produce these emergent properties. And so it really is the hyperscalers [largest cloud companies] and potentially governments that don’t care about personal privacy, don’t have to follow personal privacy laws, and don’t have an issue with stealing data. And I’ll leave it to your imagination as to who that may be.

Doesn’t that put the NSA—and the United States—at a disadvantage in intelligence gathering and processing?

I’ll push back a little bit: It doesn’t put us at a big disadvantage. We kind of need to work around it, and I’ll come to that.

It’s not a huge disadvantage for our responsibility, which is dealing with nation-state targets. If you look at other applications, it may make it more difficult for some of our colleagues that deal with domestic intelligence. But the intelligence community is going to need to find a path to using commercial language models and respecting privacy and personal liberties. [The NSA is prohibited from collecting domestic intelligence, although multiple whistleblowers have warned that it does scoop up US data.]

March 21, 2024
Forget Chatbots. AI Agents Are the Future

This week a startup called Cognition AI caused a bit of a stir by releasing a demo showing an artificial intelligence program called Devin performing work usually done by well-paid software engineers. Chatbots like ChatGPT and Gemini can generate code, but Devin went further, planning how to solve a problem, writing the code, and then testing and implementing it.

Devin’s creators brand it as an “AI software developer.” When asked to test how Meta’s open source language model Llama 2 performed when accessed via different companies hosting it, Devin generated a step-by-step plan for the project, generated code needed to access the APIs and run benchmarking tests, and created a website summarizing the results.

It’s always hard to judge staged demos, but Cognition has shown Devin handling a wide range of impressive tasks. It wowed investors and engineers on X, receiving plenty of endorsements, and even inspired a few memes—including some predicting Devin will soon be responsible for a wave of tech industry layoffs.

Devin is just the latest, most polished example of a trend I’ve been tracking for a while—the emergence of AI agents that instead of just providing answers or advice about a problem presented by a human can take action to solve it. A few months back I test drove Auto-GPT, an open source program that attempts to do useful chores by taking actions on a person’s computer and on the web. Recently I tested another program called vimGPT to see how the visual skills of new AI models can help these agents browse the web more efficiently.

I was impressed by my experiments with those agents. Yet for now, just like the language models that power them, they make quite a few errors. And when a piece of software is taking actions, not just generating text, one mistake can mean total failure—and potentially costly or dangerous consequences. Narrowing the range of tasks an agent can do to, say, a specific set of software engineering chores seems like a clever way to reduce the error rate, but there are still many potential ways to fail.

Not only startups are building AI agents. Earlier this week I wrote about an agent called SIMA, developed by Google DeepMind, which plays video games including the truly bonkers title Goat Simulator 3. SIMA learned from watching human players how to do more than 600 fairly complicated tasks such as chopping down a tree or shooting an asteroid. Most significantly, it can do many of these actions successfully even in an unfamiliar game. Google DeepMind calls it a “generalist.”

I suspect that Google has hopes that these agents will eventually go to work outside of video games, perhaps helping use the web on a user’s behalf or operate software for them. But video games make a good sandbox for developing and testing agents, by providing complex environments in which they can be tested and improved. “Making them more precise is something that we’re actively working on,” Tim Harley, a research scientist at Google DeepMind, told me. “We’ve got various ideas.”

You can expect a lot more news about AI agents in the coming months. Demis Hassabis, the CEO of Google DeepMind, recently told me that he plans to combine large language models with the work his company has previously done training AI programs to play video games to develop more capable and reliable agents. “This definitely is a huge area. We’re investing heavily in that direction, and I imagine others are as well,” Hassabis said. “It will be a step change in capabilities of these types of systems—when they start becoming more agent-like.”

March 14, 2024