Tag: algorithms

  • Students Are Likely Writing Millions of Papers With AI

    Students have submitted more than 22 million papers that may have used generative AI in the past year, new data released by plagiarism detection company Turnitin shows.

    A year ago, Turnitin rolled out an AI writing detection tool that was trained on its trove of papers written by students as well as other AI-generated texts. Since then, more than 200 million papers have been reviewed by the detector, predominantly written by high school and college students. Turnitin found that 11 percent may contain AI-written language in at least 20 percent of their content, with 3 percent of the total papers reviewed getting flagged for having 80 percent or more AI writing. (Turnitin is owned by Advance, which also owns Condé Nast, publisher of WIRED.) Turnitin says its detector has a false positive rate of less than 1 percent when analyzing full documents.
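
    To see how those percentages map onto the headline figure, here is a rough back-of-the-envelope sketch. It uses only the rounded numbers quoted above, so the counts are approximate. The last figure is a loose ceiling that assumes every reviewed paper was human-written; that assumption is ours, and the article does not say how the false positive rate was measured.

```python
# Rounded inputs taken from the article, not Turnitin's raw data.
PAPERS_REVIEWED = 200_000_000      # "more than 200 million papers"
PCT_SOME_AI = 11                   # flagged for AI writing in 20 percent of content
PCT_MOSTLY_AI = 3                  # flagged for 80 percent or more AI writing
PCT_FALSE_POSITIVE_CEILING = 1     # "less than 1 percent" on full documents


def share_of_papers(percent: int, total: int = PAPERS_REVIEWED) -> int:
    """Convert a whole-number percentage of reviewed papers into a count."""
    return total * percent // 100


some_ai = share_of_papers(PCT_SOME_AI)
mostly_ai = share_of_papers(PCT_MOSTLY_AI)
# Assumption for illustration: treat all reviewed papers as human-written,
# which gives an upper bound on wrongly flagged papers, not an estimate.
false_flag_ceiling = share_of_papers(PCT_FALSE_POSITIVE_CEILING)

print(f"Flagged for some AI writing:   {some_ai:,}")
print(f"Flagged for mostly AI writing: {mostly_ai:,}")
print(f"Ceiling on false flags:        {false_flag_ceiling:,}")
```

    Because the article says "more than 200 million," each count is a floor on the real figure rather than an exact value.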

    ChatGPT’s launch was met with knee-jerk fears that the English class essay would die. The chatbot can synthesize information and distill it near-instantly—but that doesn’t mean it always gets it right. Generative AI has been known to hallucinate, creating its own facts and citing academic references that don’t actually exist. Generative AI chatbots have also been caught spitting out biased text on gender and race. Despite those flaws, students have used chatbots for research, organizing ideas, and as a ghostwriter. Traces of chatbots have even been found in peer-reviewed, published academic writing.

    Teachers understandably want to hold students accountable for using generative AI without permission or disclosure. But that requires a reliable way to prove AI was used in a given assignment. Instructors have tried at times to find their own solutions to detecting AI in writing, using messy, untested methods to enforce rules, and distressing students. Further complicating the issue, some teachers are even using generative AI in their grading processes.

    Detecting the use of gen AI is tricky. It’s not as easy as flagging plagiarism, because generated text is still original text. Plus, there’s nuance to how students use gen AI; some may ask chatbots to write their papers for them in large chunks or in full, while others may use the tools as an aid or a brainstorm partner.

    Students also aren’t tempted by only ChatGPT and similar large language models. So-called word spinners are another type of AI software that rewrites text, and may make it less obvious to a teacher that work was plagiarized or generated by AI. Turnitin’s AI detector has also been updated to detect word spinners, says Annie Chechitelli, the company’s chief product officer. It can also flag work that was rewritten by services like spell checker Grammarly, which now has its own generative AI tool. As familiar software increasingly adds generative AI components, what students can and can’t use becomes more muddled.

    Detection tools themselves have a risk of bias. English language learners may be more likely to set them off; a 2023 study found a 61.3 percent false positive rate when evaluating Test of English as a Foreign Language (TOEFL) exams with seven different AI detectors. The study did not examine Turnitin’s version. The company says it has trained its detector on writing from English language learners as well as native English speakers. A study published in October found that Turnitin was among the most accurate of 16 AI language detectors in a test that had the tool examine undergraduate papers and AI-generated papers.

  • OpenAI’s GPT Store Is Triggering Copyright Complaints

    For the past few months, Morten Blichfeldt Andersen has spent many hours scouring OpenAI’s GPT Store. Since it launched in January, the marketplace for bespoke bots has filled up with a deep bench of useful and sometimes quirky AI tools. Cartoon generators spin up New Yorker–style illustrations and vivid anime stills. Programming and writing assistants offer shortcuts for crafting code and prose. There’s also a color analysis bot, a spider identifier, and a dating coach called RizzGPT. Yet Blichfeldt Andersen is hunting only for one very specific type of bot: Those built on his employer’s copyright-protected textbooks without permission.

    Blichfeldt Andersen is publishing director at Praxis, a Danish textbook purveyor. The company has been embracing AI and created its own custom chatbots. But it is currently engaged in a game of whack-a-mole in the GPT Store, and Blichfeldt Andersen is the man holding the mallet.

    “I’ve been personally searching for infringements and reporting them,” Blichfeldt Andersen says. “They just keep coming up.” He suspects the culprits are primarily young people uploading material from textbooks to create custom bots to share with classmates—and that he has uncovered only a tiny fraction of the infringing bots in the GPT Store. “Tip of the iceberg,” Blichfeldt Andersen says.

    It is easy to find bots in the GPT Store whose descriptions suggest they might be tapping copyrighted content in some way, as TechCrunch noted in a recent article claiming OpenAI’s store was overrun with “spam.” Using copyrighted material without permission is permissible in some contexts, but in others rightsholders can take legal action. WIRED found a GPT called Westeros Writer that claims to “write like George R.R. Martin,” the creator of Game of Thrones. Another, Voice of Atwood, claims to imitate the writer Margaret Atwood. Yet another, Write Like Stephen, is intended to emulate Stephen King.

    When WIRED tried to trick the King bot into revealing the “system prompt” that tunes its responses, the output suggested it had access to King’s memoir On Writing. Write Like Stephen was able to reproduce passages from the book verbatim on demand, even noting which page the material came from. (WIRED could not make contact with the bot’s developer, because it did not provide an email address, phone number, or external social profile.)

    OpenAI spokesperson Kayla Wood says it responds to takedown requests against GPTs made with copyrighted content but declined to answer WIRED’s questions about how frequently it fulfills such requests. She also says the company proactively looks for problem GPTs. “We use a combination of automated systems, human review, and user reports to find and assess GPTs that potentially violate our policies, including the use of content from third parties without necessary permission,” Wood says.

    New Disputes

    The GPT store’s copyright problem could add to OpenAI’s existing legal headaches. The company is facing a number of high-profile lawsuits alleging copyright infringement, including one brought by The New York Times and several brought by different groups of fiction and nonfiction authors, including big names like George R.R. Martin.

    Chatbots offered in OpenAI’s GPT Store are based on the same technology as its own ChatGPT but are created by outside developers for specific functions. To tailor their bot, a developer can upload extra information that it can tap to augment the knowledge baked into OpenAI’s technology. The process of consulting this additional information to respond to a person’s queries is called retrieval-augmented generation, or RAG. Blichfeldt Andersen is convinced that the RAG files behind the bots in the GPT Store are a hotbed of copyrighted materials uploaded without permission.
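
    The general idea of retrieval-augmented generation can be shown with a toy sketch: pick the uploaded text that best matches the question, then paste it into the prompt sent to the model. Everything here is a hypothetical illustration, including the file names, the sample text, and the word-overlap scoring. It is not how OpenAI's GPT Store works internally, and production systems typically rank passages with learned embeddings rather than shared words.

```python
import re
from collections import Counter

# Hypothetical stand-ins for files a GPT builder might upload.
DOCUMENTS = {
    "chapter1.txt": "Photosynthesis converts light energy into chemical energy in plants.",
    "chapter2.txt": "Mitochondria release energy from glucose during cellular respiration.",
    "chapter3.txt": "Plate tectonics explains earthquakes and the movement of continents.",
}


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the names of the k documents sharing the most words with the query."""
    query_words = tokenize(query)
    scores = {
        name: sum((query_words & tokenize(body)).values())
        for name, body in DOCUMENTS.items()
    }
    # Highest score first; ties are broken alphabetically by file name.
    return sorted(scores, key=lambda name: (-scores[name], name))[:k]


def build_prompt(query: str) -> str:
    """Prepend the retrieved text to the question before it goes to a model."""
    context = "\n".join(DOCUMENTS[name] for name in retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


print(build_prompt("What causes earthquakes?"))
```

    The sketch also shows why uploaded files matter for copyright: whatever text is retrieved is handed to the model verbatim, so it can resurface in answers.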

  • Here’s Proof the AI Boom Is Real: More People Are Tapping ChatGPT at Work

    Ever since the rollout of ChatGPT in November 2022, many people in science, business, and media have been obsessed with AI. A cursory look at my own published work during that period fingers me as among the guilty. My defense is that I share with those other obsessives a belief that large language models are the leading edge of an epochal transformation. Maybe I’m swimming in generative Kool-Aid, but I believe AI advances within our grasp will change not only the way we work, but the structure of businesses, and ultimately the course of humanity.

    Not everyone agrees, and in recent months there’s been a backlash. AI has been oversold and overhyped, some experts now opine. Self-styled AI-critic-in-chief Gary Marcus recently said of the LLM boom, “It wouldn’t surprise me if, to some extent, this whole thing fizzled out.” Others claim that AI is mired in the “trough of disillusionment.”

    This week we got some data that won’t resolve the larger questions but provides a snapshot of how the US, if not the world, views the advent of AI and large language models. The Pew Research Center—which did similar probes during the rise of the internet, social media, and mobile devices—released a study of how ChatGPT was being used, regarded, and trusted. The sample was taken between February 7 and 11 of this year.

    Some of the numbers at first seem to indicate that the LLM controversy might be a parochial disagreement that most people don’t care about. A third of Americans haven’t heard of ChatGPT. Just under a quarter have used it. Oh, and for all the panic about how AI is going to flood the public square with misinformation about the 2024 election? So far, only 2 percent of Americans have used ChatGPT to get information about the presidential election season already underway.

    More broadly, though, data from the survey indicates that we’re seeing a powerful technology whose rise is just beginning. If you accept Pew’s sample as indicative of all Americans, millions of people are indeed familiar with ChatGPT. And one thing in particular stands out: While 17 percent of respondents said they have used it for entertainment and an identical number says they’ve tried it to learn something new, a full 20 percent of adults say that they have used ChatGPT for work. That’s up dramatically from the 12 percent who responded affirmatively when the same question was asked six months earlier—a rise of two-thirds.
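
    The "two-thirds" figure is a relative increase, which is different from the change in percentage points. A quick check using the two survey numbers quoted above:

```python
# Share of adults who said they had used ChatGPT for work, per the article.
before_pct = 12   # six months earlier
after_pct = 20    # latest survey

point_change = after_pct - before_pct                   # percentage points
relative_rise = (after_pct - before_pct) / before_pct   # fraction of the old value

print(f"Change: {point_change} percentage points")
print(f"Relative rise: {relative_rise:.1%}")
```

    An 8-point change on a base of 12 is what makes the jump look so steep.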

    When I spoke to Colleen McClain, a Pew research associate involved in the study, she agreed that it seems to track with other huge technological shifts. “If you look at our trend charts over time on internet access, smartphones, social media, certainly some of them show this uptick,” she says. For some technologies there had been a leveling off, she adds. But in the ones she mentioned, the plateau came only when so many people came on board that there weren’t many stragglers left.

    What’s crazy about that sudden jump in ChatGPT business use from 12 percent to 20 percent is that we’re only at the beginning stages of humans collaborating with these models. And the tools to fully make use of ChatGPT are in a nascent status. That’s changing fast. OpenAI, ChatGPT’s creator, is going full tilt, and AI giants Microsoft and Google are still in the process of diverting their workforces to redesign every product line to integrate conversational AI. And startups like Sierra, which is building agents for corporate customers, are enabling bespoke usages that take advantage of multiple models. As this process continues, more people will use AI tools. And since the foundation models are getting exponentially better—am I hearing that GPT5 will show up this year?—that will make them even more compelling. This raises the possibility that the quality of virtually all work will reside in how well one can draw out the talents of a robot collaborator.

    What past technology can help us understand the trajectory of the rocket ship we’re on? While the near limitless ceiling of AI makes it hard to find an analog, I suggest the uptake of spreadsheets. Dan Bricklin and Bob Frankston invented them in 1978, and a year later the concept was embodied in VisiCalc, which at the time ran only on Apple computers. Spreadsheets had a phenomenal and disruptive effect on the business world. More than mere accounting tools, they triggered an era of business innovation and shook up the flow of information inside companies. Yet it took a few years before the business world widely adopted spreadsheets. The turning point came with a new and more powerful product called Lotus 1-2-3, which ran on the IBM PC. The current and near-future startups in the AI world, like Sierra, are all hoping to become the Lotuses of our era—but also to be much more consequential and lasting. Spreadsheets are largely limited to the business domain. LLMs can seemingly mess with anything.

  • Perplexity’s Founder Was Inspired by Sundar Pichai. Now They’re Competing to Reinvent Search

    Aravind Srinivas credits Google CEO Sundar Pichai for giving him the freedom to eat eggs.

    Srinivas remembers the moment seven years ago when an interview with Pichai popped up in his YouTube feed. His vegetarian upbringing in India had excluded eggs, as it had for many in the country, but now, in his early twenties, Srinivas wanted to start eating more protein. Here was Pichai, a hero to many aspiring entrepreneurs in India, casually describing his morning: waking up, reading newspapers, drinking tea—and eating an omelet.

    Srinivas shared the video with his mother. OK, she said: You can eat eggs.

    Pichai’s influence reaches far beyond Srinivas’ diet. Srinivas too is CEO of a search company, Perplexity AI, which makes one of the most hyped-up apps of the generative AI era. He is still taking cues from Pichai, the leader of the world’s largest search engine, but his admiration is more complicated.

    “It’s kind of a rivalry now,” Srinivas says. “It’s awkward.”

    Srinivas and Pichai both grew up in Chennai, India, in the south Indian state of Tamil Nadu—though the two were born 22 years apart. By the time Srinivas was working toward his PhD in computer science at UC Berkeley, Pichai had been crowned chief executive of Google.

    For his first research internship, Srinivas worked at Google-owned DeepMind in London. Pichai also got a new job that year, becoming CEO of Alphabet as well as Google. Srinivas found the work at DeepMind invigorating, but he was dismayed to find that the flat he had rented sight unseen was a disaster—a “crappy home, with rats,” he says—so he sometimes slept in DeepMind’s offices.

    He discovered in the office library a book about the development and evolution of Google, called In the Plex, penned by WIRED editor at large Steven Levy. Srinivas read it over and over, deepening his appreciation of Google and its innovations. “Larry and Sergey became my entrepreneurial heroes,” Srinivas says. (He offered to list In the Plex’s chapters and cite passages from memory; WIRED took his word for it.)

    Shortly afterwards, in 2020, Srinivas ended up working at Google’s headquarters in Mountain View, California, as a research intern working on machine learning for computer vision. Slowly, Srinivas was making his way through the Google universe, and putting some of his AI research work to good use.

    Then, in 2022, Srinivas and three cofounders—Denis Yarats, Johnny Ho, and Andy Konwinski—teamed up to try and develop a new approach to search using AI. They started out working on algorithms that could translate natural language into the database language SQL, but determined this was too narrow (or nerdy). Instead they pivoted to a product that combined a traditional search index with the relatively new power of large language models. They called it Perplexity.

    Perplexity is sometimes described as an “answer” engine rather than a search engine, because of the way it uses AI text generation to summarize results. New searches create conversational “threads” on a particular topic. Type in a query, and Perplexity responds with follow-up questions, asking you to refine your request. It eschews direct links in favor of text-based or visual answers that don’t require you to click away to somewhere else to get information.

  • Here’s Proof You Can Train an AI Model Without Slurping Copyrighted Content

    In 2023, OpenAI told the UK parliament that it was “impossible” to train leading AI models without using copyrighted materials. It’s a popular stance in the AI world, where OpenAI and other leading players have used materials slurped up online to train the models powering chatbots and image generators, triggering a wave of lawsuits alleging copyright infringement.

    Two announcements Wednesday offer evidence that large language models can in fact be trained without the permissionless use of copyrighted materials.

    A group of researchers backed by the French government have released what is thought to be the largest AI training dataset composed entirely of text that is in the public domain. And the nonprofit Fairly Trained announced that it has awarded its first certification for a large language model built without copyright infringement, showing that technology like that behind ChatGPT can be built in a different way to the AI industry’s contentious norm.

    “There’s no fundamental reason why someone couldn’t train an LLM fairly,” says Ed Newton-Rex, CEO of Fairly Trained. He founded the nonprofit in January 2024 after quitting his executive role at image generation startup Stability AI because he disagreed with its policy of scraping content without permission.

    Fairly Trained offers a certification to companies willing to prove that they’ve trained their AI models on data that they either own, have licensed, or is in the public domain. When the nonprofit launched, some critics pointed out that it hadn’t yet identified a large language model that met those requirements.

    Today, Fairly Trained announced it has certified its first large language model. It’s called KL3M and was developed by Chicago-based legal tech consultancy startup 273 Ventures, using a curated training dataset of legal, financial, and regulatory documents.

    The company’s cofounder Jillian Bommarito says the decision to train KL3M in this way stemmed from the company’s “risk-averse” clients like law firms. “They’re concerned about the provenance, and they need to know that output is not based on tainted data,” she says. “We’re not relying on fair use.” The clients were interested in using generative AI for tasks like summarizing legal documents and drafting contracts, but didn’t want to get dragged into lawsuits about intellectual property as OpenAI, Stability AI, and others have been.

    Bommarito says that 273 Ventures hadn’t worked on a large language model before but decided to train one as an experiment. “Our test to see if it was even possible,” she says. The company has created its own training data set, the Kelvin Legal DataPack, which includes thousands of legal documents reviewed to comply with copyright law.

    Although the dataset is tiny (around 350 billion tokens, or units of data) compared to those compiled by OpenAI and others that have scraped the internet en masse, Bommarito says the KL3M model performed far better than expected, something she attributes to how carefully the data had been vetted beforehand. “Having clean, high-quality data may mean that you don’t have to make the model so big,” she says. Curating a dataset can help make a finished AI model specialized to the task it’s designed for. 273 Ventures is now offering spots on a waitlist to clients who want to purchase access to this data.

    Clean Sheet

    Companies looking to emulate KL3M may have more help in the future in the form of freely available infringement-free datasets. On Wednesday, researchers released what they claim is the largest available AI dataset for language models composed purely of public domain content. Common Corpus, as it is called, is a collection of text roughly the same size as the data used to train OpenAI’s GPT-3 text generation model and has been posted to the open source AI platform Hugging Face.

    The dataset was built from sources like public domain newspapers digitized by the US Library of Congress and the National Library of France. Pierre-Carl Langlais, project coordinator for Common Corpus, calls it a “big enough corpus to train a state-of-the-art LLM.” In the lingo of big AI, the dataset contains 500 million tokens; OpenAI’s most capable model is widely believed to have been trained on several trillion.

  • 8 Google Employees Invented Modern AI. Here’s the Inside Story

    The last two weeks before the deadline were frantic. Though officially some of the team still had desks in Building 1945, they mostly worked in 1965 because it had a better espresso machine in the micro-kitchen. “People weren’t sleeping,” says Gomez, who, as the intern, lived in a constant debugging frenzy and also produced the visualizations and diagrams for the paper. It’s common in such projects to do ablations—taking things out to see whether what remains is enough to get the job done.

    “There was every possible combination of tricks and modules—which one helps, which doesn’t help. Let’s rip it out. Let’s replace it with this,” Gomez says. “Why is the model behaving in this counterintuitive way? Oh, it’s because we didn’t remember to do the masking properly. Does it work yet? OK, move on to the next. All of these components of what we now call the transformer were the output of this extremely high-paced, iterative trial and error.” The ablations, aided by Shazeer’s implementations, produced “something minimalistic,” Jones says. “Noam is a wizard.”

    Vaswani recalls crashing on an office couch one night while the team was writing the paper. As he stared at the curtains that separated the couch from the rest of the room, he was struck by the pattern on the fabric, which looked to him like synapses and neurons. Gomez was there, and Vaswani told him that what they were working on would transcend machine translation. “Ultimately, like with the human brain, you need to unite all these modalities—speech, audio, vision—under a single architecture,” he says. “I had a strong hunch we were onto something more general.”

    In the higher echelons of Google, however, the work was seen as just another interesting AI project. I asked several of the transformers folks whether their bosses ever summoned them for updates on the project. Not so much. But “we understood that this was potentially quite a big deal,” says Uszkoreit. “And it caused us to actually obsess over one of the sentences in the paper toward the end, where we comment on future work.”

    That sentence anticipated what might come next—the application of transformer models to basically all forms of human expression. “We are excited about the future of attention-based models,” they wrote. “We plan to extend the transformer to problems involving input and output modalities other than text” and to investigate “images, audio and video.”

    A couple of nights before the deadline, Uszkoreit realized they needed a title. Jones noted that the team had landed on a radical rejection of the accepted best practices, most notably LSTMs, for one technique: attention. The Beatles, Jones recalled, had named a song “All You Need Is Love.” Why not call the paper “Attention Is All You Need”?

    The Beatles?

    “I’m British,” says Jones. “It literally took five seconds of thought. I didn’t think they would use it.”

    They continued collecting results from their experiments right up until the deadline. “The English-French numbers came, like, five minutes before we submitted the paper,” says Parmar. “I was sitting in the micro-kitchen in 1965, getting that last number in.” With barely two minutes to spare, they sent off the paper.

  • Forget Chatbots. AI Agents Are the Future

    This week a startup called Cognition AI caused a bit of a stir by releasing a demo showing an artificial intelligence program called Devin performing work usually done by well-paid software engineers. Chatbots like ChatGPT and Gemini can generate code, but Devin went further, planning how to solve a problem, writing the code, and then testing and implementing it.

    Devin’s creators brand it as an “AI software developer.” When asked to test how Meta’s open source language model Llama 2 performed when accessed via different companies hosting it, Devin generated a step-by-step plan for the project, generated code needed to access the APIs and run benchmarking tests, and created a website summarizing the results.

    It’s always hard to judge staged demos, but Cognition has shown Devin handling a wide range of impressive tasks. It wowed investors and engineers on X, receiving plenty of endorsements, and even inspired a few memes—including some predicting Devin will soon be responsible for a wave of tech industry layoffs.

    Devin is just the latest, most polished example of a trend I’ve been tracking for a while—the emergence of AI agents that instead of just providing answers or advice about a problem presented by a human can take action to solve it. A few months back I test drove Auto-GPT, an open source program that attempts to do useful chores by taking actions on a person’s computer and on the web. Recently I tested another program called vimGPT to see how the visual skills of new AI models can help these agents browse the web more efficiently.

    I was impressed by my experiments with those agents. Yet for now, just like the language models that power them, they make quite a few errors. And when a piece of software is taking actions, not just generating text, one mistake can mean total failure—and potentially costly or dangerous consequences. Narrowing the range of tasks an agent can do to, say, a specific set of software engineering chores seems like a clever way to reduce the error rate, but there are still many potential ways to fail.

    Not only startups are building AI agents. Earlier this week I wrote about an agent called SIMA, developed by Google DeepMind, which plays video games including the truly bonkers title Goat Simulator 3. SIMA learned from watching human players how to do more than 600 fairly complicated tasks such as chopping down a tree or shooting an asteroid. Most significantly, it can do many of these actions successfully even in an unfamiliar game. Google DeepMind calls it a “generalist.”

    I suspect that Google has hopes that these agents will eventually go to work outside of video games, perhaps helping use the web on a user’s behalf or operate software for them. But video games make a good sandbox for developing and testing agents, by providing complex environments in which they can be tested and improved. “Making them more precise is something that we’re actively working on,” Tim Harley, a research scientist at Google DeepMind, told me. “We’ve got various ideas.”

    You can expect a lot more news about AI agents in the coming months. Demis Hassabis, the CEO of Google DeepMind, recently told me that he plans to combine large language models with the work his company has previously done training AI programs to play video games to develop more capable and reliable agents. “This definitely is a huge area. We’re investing heavily in that direction, and I imagine others are as well,” Hassabis said. “It will be a step change in capabilities of these types of systems—when they start becoming more agent-like.”



  • Google DeepMind’s Latest AI Agent Learned to Play ‘Goat Simulator 3’

    “SIMA takes one step further and shows stronger generalization to new games,” he says. “The number of environments is still very small, but I think SIMA is on the right track.”

    A New Way to Play

    SIMA shows DeepMind putting a new twist on game-playing agents, an AI technology the company has pioneered in the past.

    In 2013, before DeepMind was acquired by Google, the London-based startup showed how a technique called reinforcement learning, which involves training an algorithm with positive and negative feedback on its performance, could help computers play classic Atari video games. In 2016, as part of Google, DeepMind developed AlphaGo, a program that used the same approach to defeat a world champion of Go, an ancient board game that requires subtle and instinctive skill.
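
    The feedback loop behind reinforcement learning can be sketched in a few lines. The example below is a minimal illustration using tabular Q-learning, one simple reinforcement learning algorithm, on a made-up five-cell corridor where the right end pays +1 and the left end pays -1. The environment, constants, and function names are all assumptions chosen for illustration; DeepMind's Atari and Go systems were far more elaborate.

```python
import random

N_STATES = 5                      # hypothetical corridor: cells 0..4
LEFT, RIGHT = 0, 1
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # learning rate, discount, exploration


def step(state: int, action: int):
    """Move one cell; the right end gives +1, the left end gives -1."""
    nxt = state + (1 if action == RIGHT else -1)
    if nxt == N_STATES - 1:
        return nxt, 1.0, True     # positive feedback, episode ends
    if nxt == 0:
        return nxt, -1.0, True    # negative feedback, episode ends
    return nxt, 0.0, False


def train(episodes: int = 500, seed: int = 0):
    """Learn action values by trial and error, starting from the middle cell."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = N_STATES // 2, False
        while not done:
            # Explore at random sometimes (and on ties); otherwise act greedily.
            if rng.random() < EPSILON or q[state][LEFT] == q[state][RIGHT]:
                action = rng.choice((LEFT, RIGHT))
            else:
                action = RIGHT if q[state][RIGHT] > q[state][LEFT] else LEFT
            nxt, reward, done = step(state, action)
            target = reward if done else reward + GAMMA * max(q[nxt])
            # Nudge the estimate toward the feedback just received.
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
    return q


q_table = train()
# Preferred move in each non-terminal cell (1, 2, and 3).
policy = ["right" if right > left else "left" for left, right in q_table[1:-1]]
print(policy)
```

    The agent is never told which way to go; the preference for moving right emerges only from the rewards and penalties it collects along the way.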

    For the SIMA project, the Google DeepMind team collaborated with several game studios to collect keyboard and mouse data from humans playing 10 different games with 3D environments, including No Man’s Sky, Teardown, Hydroneer, and Satisfactory. DeepMind later added descriptive labels to that data to associate the clicks and taps with the actions users took, for example whether they were a goat looking for its jetpack or a human character digging for gold.

    The data trove from the human players was then fed into a language model of the kind that powers modern chatbots, which had picked up an ability to process language by digesting a huge database of text. SIMA could then carry out actions in response to typed commands. And finally, humans evaluated SIMA’s efforts inside different games, generating data that was used to fine-tune its performance.

    After all that training, SIMA is able to carry out actions in response to hundreds of commands given by a human player, like “Turn left” or “Go to the spaceship” or “Go through the gate” or “Chop down a tree.” The program can perform more than 600 actions, ranging from exploration to combat to tool use. The researchers avoided games that feature violent actions, in line with Google’s ethical guidelines on AI.

    “It’s still very much a research project,” says Tim Harley, another member of the Google DeepMind team. “However, one could imagine one day having agents like SIMA playing alongside you in games with you and with your friends.”

    Video games provide a relatively safe environment to task AI agents to do things. For agents to do useful office or everyday admin work, they will need to become more reliable. Harley and Besse at DeepMind say they are working on techniques for making the agents more reliable.

    Updated 3/13/2024, 10:20 am ET: Added comment from Linxi “Jim” Fan.

  • The Dark Side of Open Source AI Image Generators

    Whether through the frowning high-definition face of a chimpanzee or a psychedelic, pink-and-red-hued doppelganger of himself, Reuven Cohen uses AI-generated images to catch people’s attention. “I’ve always been interested in art and design and video and enjoy pushing boundaries,” he says—but the Toronto-based consultant, who helps companies develop AI tools, also hopes to raise awareness of the technology’s darker uses.

    “It can also be specifically trained to be quite gruesome and bad in a whole variety of ways,” Cohen says. He’s a fan of the freewheeling experimentation that has been unleashed by open source image-generation technology. But that same freedom enables the creation of explicit images of women used for harassment.

    After nonconsensual images of Taylor Swift recently spread on X, Microsoft added new controls to its image generator. Open source models can be commandeered by just about anyone and generally come without guardrails. Despite the efforts of some hopeful community members to deter exploitative uses, the open source free-for-all is near-impossible to control, experts say.

    “Open source has powered fake image abuse and nonconsensual pornography. That’s impossible to sugarcoat or qualify,” says Henry Ajder, who has spent years researching harmful use of generative AI.

    Ajder says that at the same time that it’s becoming a favorite of researchers, creatives like Cohen, and academics working on AI, open source image generation software has become the bedrock of deepfake porn. Some tools based on open source algorithms are purpose-built for salacious or harassing uses, such as “nudifying” apps that digitally remove women’s clothes in images.

    But many tools can serve both legitimate and harassing use cases. One popular open source face-swapping program is used by people in the entertainment industry and as the “tool of choice for bad actors” making nonconsensual deepfakes, Ajder says. High-resolution image generator Stable Diffusion, developed by startup Stability AI, is claimed to have more than 10 million users and has guardrails installed to prevent explicit image creation and policies barring malicious use. But the company also open sourced a version of the image generator in 2022 that is customizable, and online guides explain how to bypass its built-in limitations.

    Meanwhile, smaller AI models known as LoRAs make it easy to tune a Stable Diffusion model to output images with a particular style, concept, or pose—such as a celebrity’s likeness or certain sexual acts. They are widely available on AI model marketplaces such as Civitai, a community-based site where users share and download models. There, one creator of a Taylor Swift plug-in has urged others not to use it “for NSFW images.” However, once downloaded, its use is out of its creator’s control. “The way that open source works means it’s going to be pretty hard to stop someone from potentially hijacking that,” says Ajder.

    4chan, the image-based message board site with a reputation for chaotic moderation, is home to pages devoted to nonconsensual deepfake porn, WIRED found, made with openly available programs and AI models dedicated solely to sexual images. Message boards for adult images are littered with AI-generated nonconsensual nudes of real women, from porn performers to actresses like Cate Blanchett. WIRED also observed 4chan users sharing workarounds for NSFW images using OpenAI’s Dall-E 3.

    That kind of activity has inspired some users in communities dedicated to AI image-making, including on Reddit and Discord, to attempt to push back against the sea of pornographic and malicious images. Creators also express worry about the software gaining a reputation for NSFW images, encouraging others to report images depicting minors on Reddit and model-hosting sites.



  • Roundtables – The AI Economy

    The AI Economy

    Speakers: Mat Honan, Editor in chief and David Rotman, Editor at large

    There’s no doubt that generative AI will impact the economy—but how, exactly, remains an open question. Despite fears that these AI tools will upend workers and exacerbate wealth inequality, early evidence suggests the technology could actually help level the playing field for some. But only if we deploy it in the right ways. 

    Meanwhile, the demand for chips that underpin modern AI, including generative tools, is expected to grow significantly. And the US is spending billions to reshore the industry. Global competition for these chips is fierce, with both countries and companies now making unprecedented investments in the sector.
