MLCommons, a nonprofit AI safety working group, has teamed up with the AI dev platform Hugging Face to release one of the world's largest collections of public domain voice recordings for AI research.
The dataset, called Unsupervised People's Speech, contains more than a million hours of audio spanning at least 89 different languages. MLCommons says it was motivated to create it by a desire to support research and development in "various areas of speech technology."
"We anticipate several avenues for the research community to continue to build and develop, especially in the areas of improving low-resource language speech models, enhanced speech recognition across different accents and dialects, and novel applications in speech synthesis," the organization wrote in a blog post Thursday.
It's an admirable goal, to be sure. But AI datasets like Unsupervised People's Speech can carry risks for the researchers who choose to use them.
Biased data is one of those risks. The recordings in Unsupervised People's Speech came from Archive.org, the nonprofit perhaps best known for the Wayback Machine web archival tool. Because many of Archive.org's contributors are English-speaking, and American, almost all of the recordings in Unsupervised People's Speech are in American-accented English, per the readme on the official project page.
That means that, without careful filtering, AI systems like speech recognition and voice synthesizer models trained on Unsupervised People's Speech could exhibit some of the same biases. They might, for example, struggle to transcribe English spoken by a non-native speaker, or have trouble generating synthetic voices in languages other than English.
Unsupervised People's Speech might also contain recordings from people unaware that their voices are being used for AI research purposes, including commercial applications. While MLCommons says that all recordings in the dataset are in the public domain or available under Creative Commons licenses, there's the possibility that mistakes were made.
According to an MIT analysis, hundreds of AI training datasets lack licensing information and contain errors. Creator advocates, including Ed Newton-Rex, the CEO of the AI ethics-focused nonprofit Fairly Trained, have made the case that creators shouldn't be required to "opt out" of AI datasets because of the onerous burden that opting out imposes on them.
"Many creators (e.g. Squarespace users) have no meaningful way of opting out," Newton-Rex wrote in a post on X in June. "For creators who can opt out, there are multiple overlapping opt-out methods, which are (1) incredibly confusing and (2) woefully incomplete in their coverage. Even if a perfect universal opt-out existed, it would be hugely unfair to put the opt-out burden on creators, given that generative AI uses their work to compete with them; many would simply not realize they could opt out."
MLCommons says that it's committed to updating, maintaining, and improving the quality of Unsupervised People's Speech. But given the potential flaws, developers would be wise to exercise serious caution.