In February 2024, Reddit struck a $60 million deal with Google to let the search giant use data on the platform to train its artificial intelligence models. Notably absent from the discussions were Reddit users, whose data was being sold.
The deal reflected the reality of the modern internet: Big tech companies own virtually all our online data and get to decide what to do with it. Unsurprisingly, many platforms monetize their data, and the fastest-growing way to do that today is to sell it to AI companies, which are themselves huge tech companies using the data to train ever more powerful models.
The decentralized platform Vana, which started as a class project at MIT, is on a mission to give power back to users. The company has created a fully user-owned network that allows individuals to upload their data and govern how it is used. AI developers can pitch users on ideas for new models, and if users agree to contribute their data for training, they get proportional ownership in the models.
The idea is to give everyone a stake in the AI systems that will increasingly shape our society, while also unlocking new pools of data to advance the technology.
“This data is needed to create better AI systems,” says Anna Kazlauskas ’19, co-founder of Vana. “We’ve created a decentralized system to get better data, which sits inside big tech companies today, while still letting users retain ultimate ownership.”
From economics to blockchain
Many high school students have pictures of pop stars or athletes on their bedroom walls. Kazlauskas had a picture of U.S. Treasury Secretary Janet Yellen.
Kazlauskas was sure she would become an economist, but she ended up being one of five students to join the MIT Bitcoin Club in 2015, and that experience led her into the world of blockchains and cryptocurrency.
From her dorm room in MacGregor House, she began mining the cryptocurrency Ethereum. She sometimes even dug through campus dumpsters in search of computer chips.
“It got me interested in everything around computer science and networking,” Kazlauskas says. “From a blockchain perspective, that involved distributed systems and how they can shift economic power, as well as artificial intelligence and econometrics.”
Kazlauskas met Art Abal, who was then studying at Harvard University, in the former Media Lab class Emergent Ventures, and the pair decided to work on new ways to obtain data to train AI systems.
“Our question was: How could you have a large number of people contributing to these AI systems using more of a distributed network?” Kazlauskas recalls.
Kazlauskas and Abal were trying to address the status quo, where most models are trained by scraping public data on the internet. Big tech companies often also buy large datasets from other companies.
The founders’ approach evolved over the years and was informed by Kazlauskas’ experience working at the financial blockchain company Celo after graduation. But Kazlauskas credits her time at MIT with helping her think about these problems, and the instructor for Emergent Ventures, Ramesh Raskar, still helps Vana think about AI research questions.
“It was great to have an open-ended opportunity to just build, hack, and explore,” says Kazlauskas. “I think that ethos at MIT is really important. It’s just about building things, seeing what works, and continuing to iterate.”
Today, Vana takes advantage of a little-known law that allows users of most big tech platforms to export their data directly. Users can upload that information into encrypted digital wallets in Vana and disburse it to train models as they see fit.
AI engineers can suggest ideas for new open-source models, and people can pool their data to help train the model. In the blockchain world, the data pools are called data DAOs, which stands for decentralized autonomous organizations. Data can also be used to create personalized AI models and agents.
In Vana, data is used in a way that preserves user privacy because the system does not expose identifying information. Once a model is created, users maintain ownership, so that every time it is used they are rewarded proportionally, based on how much their data helped train it.
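The proportional reward described above can be illustrated with a short Python sketch. It is an illustration only, not Vana's actual mechanism: the function names, the sample pool, and the choice to measure each user's contribution by a simple count of records are all assumptions made for the example.

```python
from fractions import Fraction


def ownership_shares(contributions):
    """Return each contributor's fraction of a data pool.

    `contributions` maps a (hypothetical) user name to an integer
    count of records that user supplied. Measuring contribution by
    record count is an assumption made for this sketch.
    """
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("pool has no contributions")
    return {user: Fraction(count, total) for user, count in contributions.items()}


def split_reward(contributions, reward_cents):
    """Divide a reward (in cents) in proportion to ownership shares."""
    shares = ownership_shares(contributions)
    return {user: share * reward_cents for user, share in shares.items()}


# Hypothetical pool: three users contributing different amounts of data.
pool = {"alice": 600, "bob": 300, "carol": 100}
shares = ownership_shares(pool)
payout = split_reward(pool, 5000)

for user in pool:
    print(user, shares[user], payout[user])
```

Exact fractions are used instead of floating-point numbers so that the shares always sum to exactly one and no part of the reward is lost to rounding.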
“From a developer’s perspective, you can now build these hyper-personalized health applications that take into account exactly what you ate, how you slept, and how you exercise,” Kazlauskas says. “Those applications aren’t possible today because of the walled gardens of the big tech companies.”
Crowdsourced, user-owned AI
Last year, a machine-learning engineer proposed using Vana user data to train an AI model that could generate Reddit posts. More than 140,000 Vana users contributed their Reddit data, including posts, comments, messages, and more. Users decided on the terms under which the model could be used, and they maintained ownership of the model after it was created.
Vana has enabled similar initiatives with user-contributed data from the social media platform X, sleep data from sources such as Oura rings, and more. There are also collaborations that combine data pools to create broader AI applications.
“Let’s say users have Spotify data, Reddit data, and fashion data,” Kazlauskas explains. “Usually, Spotify isn’t going to collaborate with those types of companies, and there’s actually regulation against that. But users can do it if they grant access, so these cross-platform datasets can be used to create really powerful models.”
Vana has more than 1 million users and more than 20 live data DAOs. More than 300 additional data pools have been proposed by users on Vana’s system, and Kazlauskas says many will go into production this year.
“I think there is a lot of promise in generalized AI models, personalized medicine, and new consumer applications, because it is difficult to combine all that data or get access to it in the first place,” says Kazlauskas.
The data pools also allow groups of users to accomplish something that even the most powerful tech companies struggle with today.
“Today, big tech companies have built these data moats, so the best datasets aren’t available to anyone,” says Kazlauskas. “It’s a collective action problem, where my data on its own isn’t that valuable, but a data pool with thousands or millions of people is really valuable. Vana allows those pools to be built. It’s a win-win: Users get to benefit from the rise of AI because they own the models.”