ELYZA Co., Ltd., an AI (artificial intelligence) startup that originated in the University of Tokyo's Matsuo Lab, announced on August 26 that it has developed a generative summarization model for Japanese. “ELYZA DIGEST,” a summarization AI built on this model, has been released to the public as a demo site. The AI can summarize texts averaging about 900 characters, a task that takes humans about 5 minutes, in 10 seconds or less.
The AI condenses the entered text into three lines. It adopts “generative summarization,” an approach with few successful cases in Japan, in which the AI writes the summary from scratch based on the text it has read.
It handles not only clean text with few typographical errors, such as books, novels, and news articles, but also messy text such as meeting minutes and dialogue transcripts. Pasting a URL lets it summarize the full text of that page.
Summarizing in under 10 seconds what takes humans about 5 minutes
ELYZA evaluated summarization quality on two criteria: “accuracy” (whether the summary contains descriptions that are inaccurate with respect to the original text) and “fluency” (whether there are grammatical, spelling, or structural mistakes, whether omitted subjects are correctly complemented, and whether expressions are excessively repeated). On these criteria, it compared summaries created by the AI with summaries created by humans.
In terms of “accuracy,” the AI produced output without problems for 90% of all articles, so it can be said to generate summaries with accuracy comparable to that of humans.
In terms of “fluency,” the share of output containing some mistake was higher than for the human summaries. In addition to ordinary grammatical mistakes, some summaries were a little difficult to read because the AI could not supply an appropriate subject where the original text had omitted it, as is common in Japanese. ELYZA says it will work to improve these points so that the AI can generate more readable summaries.
Regarding the efficiency of summarization, the articles used in this verification averaged about 900 characters. The AI produced a summary in less than 10 seconds per article, while humans took about 5 minutes.
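To put those figures side by side, the short calculation below works out the implied speedup. It is only a rough sketch based on the rounded numbers reported above; the constant and function names are illustrative, and 10 seconds is treated as the AI's upper bound, so the ratio is a lower bound rather than a measured result.

```python
# Rounded figures reported in the article (not raw measurements).
ARTICLE_CHARS = 900      # average article length, in characters
HUMAN_SECONDS = 5 * 60   # "about 5 minutes" per article
AI_SECONDS_MAX = 10      # "less than 10 seconds" per article (upper bound)


def speedup(human_seconds: float, ai_seconds: float) -> float:
    """Return how many times faster the AI is than the human."""
    return human_seconds / ai_seconds


def chars_per_second(chars: int, seconds: float) -> float:
    """Return the amount of source text processed per second."""
    return chars / seconds


# Lower bound on the speedup, since the AI needs at most 10 seconds.
min_speedup = speedup(HUMAN_SECONDS, AI_SECONDS_MAX)

# Source characters handled per second by each summarizer.
human_rate = chars_per_second(ARTICLE_CHARS, HUMAN_SECONDS)
ai_rate_min = chars_per_second(ARTICLE_CHARS, AI_SECONDS_MAX)

print(f"AI is at least {min_speedup:.0f}x faster")
print(f"human: {human_rate:.0f} chars/s, AI: at least {ai_rate_min:.0f} chars/s")
```

On these rounded figures the AI is at least 30 times faster per article than a human summarizer.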
ELYZA has been researching and developing large-scale language models since 2018, when “BERT” appeared. In 2020 it developed “ELYZA Brain,” a Japanese-specialized AI engine that uses post-BERT large-scale language models and the company’s own large-scale dataset. Since then, it has continued not only to improve ELYZA Brain but also to make task-specific improvements, and ELYZA DIGEST was released as one of the results.
By improving this AI, ELYZA hopes to raise the productivity of white-collar workers who work with language. Specifically, it envisions use cases such as entering medical charts, reading contracts and judicial precedents in legal work, creating dialogue memos for call center operators, and drafting articles in the media.
Summarizing while cutting interjections such as “ah” and “uh” that are peculiar to spoken language
“ELYZA DIGEST” was also used in a demonstration experiment that began on July 1 with Sompo Japan Insurance Co., Ltd., a group company of SOMPO Holdings Co., Ltd. The experiment aims to develop an AI that summarizes dialogue text produced by speech recognition, for use in the summary-writing work at the customer center.
When dialogue text was actually summarized with “ELYZA DIGEST,” it reportedly generated valid summaries even when the text contained interjections peculiar to spoken language, such as “ah” and “uh,” as well as speech recognition errors.