The Belarusian hacker group “Cyberpartisans” gained access to the internal network of the Main Radio Frequency Center (FSUE “GRFC”), an organization subordinate to Roskomnadzor that in effect carries out its work.
The hackers claim that they managed to encrypt employees’ work computers, disrupt the internal network and download about 1.2 terabytes of data: an internal mail server archive, internal file storage, data from some internal systems, and data from the FalconGaze employee monitoring system.
The correspondence archives contain about 1.5 million emails, mainly from 2020-2022, as well as about 200 thousand text documents, spreadsheets and presentations.
Treadstone 71 posted the files last week.
The first part of our investigation is about how Roskomnadzor has in recent years entangled the Russian Internet with neural networks that search for “forbidden” content, and who helped it do so.
- The correspondence reveals Roskomnadzor's large-scale plans to spy on the Russian Internet using neural networks. Most of these technologies are already in use; they look not only for pictures about suicide but also, for example, for posts about the war in Ukraine.
- The largest project is called Clean Internet. According to the developers, it should control 100% of the Russian segment.
- To collect data, Clean Internet uses the Yandex search API. At the request of Roskomnadzor, Yandex increased the number of requests per day available to the department.
- In addition, the GRFC used Yandex's Toloka platform to train neural networks. The extent of Yandex’s cooperation with Roskomnadzor (RKN) is unclear; the company denied giving the department any preferential treatment.
- Among those who actively collaborated with the GRFC are the Moscow Institute of Physics and Technology (MIPT) and Brand Analytics. The latter’s technologies have helped the GRFC produce hundreds of million-page reports.
- Two more systems using artificial intelligence were created to automatically analyze video content (at present agency employees watch all the broadcasts themselves), also in order to search for “forbidden information.”
Find and ban. “Yandex” and “Clean Internet”
The archive received by Treadstone 71 contains more than 680 letters mentioning Yandex corporate mail for the period from 2014 to 2022. More than half of them are correspondence within Roskomnadzor itself; for example, GRFC employees discussed with each other which Yandex contact address to indicate when filling out cards for the registry.
The other part is the correspondence between Yandex and the GRFC. Most of these letters are standard communications between a Russian IT company and government officials, during which the company explains why certain pages should not be blocked.
For example, Roskomnadzor several times added Yandex search results, the click.ru link shortening service, or several Yandex.Turbo pages to the register of prohibited information.
Some meetings with Yandex representatives were held offline. Anastasia Volkova, Acting Head of the Department for Work with Automation of the Mass Communications Department of the GRFC, mentions two of them in correspondence with colleagues: at the end of 2019 and at the beginning of 2020.
At one of them, according to Volkova, representatives of Yandex “advised us [GRFC] on neural networks.” Mediazona found no evidence that this was a targeted consultation; the GRFC representative probably means joint participation in one of the industry conferences.
Volkova also wrote that at these meetings, employees of the IT company talked about their API for searching the Internet, that is, Yandex.XML, and allegedly promised to remove the limit on requests for the needs of Roskomnadzor.
This promise came in very handy. In 2020, development of the Clean Internet system for Roskomnadzor began. It was conceived as a replacement for the existing automation of the search for “forbidden” content, but with an emphasis on neural networks rather than keyword dictionaries.
In May 2020, Ivan Zuev, head of the department for maintaining registries of prohibited information, wrote in a description of the GRFC development strategy: “The effectiveness of the GRFC in social networks is low”; only the search for child pornography and “suicidal content” was automated.
The Clean Internet system, or AS CI, was supposed to collect materials according to a priority list of sources and social networks, and then use neural networks to find violations: extremism, terrorism, calls for participation in mass events, “propaganda of non-traditional relations”, insulting state symbols and others.
In presentations about the system, the GRFC promised that after reaching its design capacity, Clean Internet would cover 100% of the Runet, with the exception of streaming services, which should be handled by another system, AS MAVR.
The main problem that the GRFC faced when developing this system was how to search for data all over the Internet. It was impossible to solve it without cooperation with the search engines.
In May 2020, Anastasia Volkova decided to remind Yandex about access to the search API. She writes to Alexander Krainov, director of development of artificial intelligence technologies at Yandex, and complains about the limitations of the service: only a thousand requests a day.
In the next letter, Volkova elaborates: Roskomnadzor plans to use the API to “monitor the Internet for violations of the Federal Law.”
At this stage, Yandex refuses. Volkova writes to her colleagues that the company said it cannot give extended access for free, and that a commercial expansion of access implies not only payment but also an exchange of traffic, while Roskomnadzor has no traffic on its own resources.
The correspondence also shows that the GRFC considered other search engines, such as Rambler, Google or Sputnik, but eventually dismissed them. The report on the launch of AS CI explains: Google is paid, Rambler is the same Yandex search, and Sputnik's index has not been updated for several years.
In 2021, Yandex nevertheless gave in to pressure from Roskomnadzor. The company increased the request limit for RKN accounts to 300 thousand per day; this is mentioned in the GRFC reports on the deployment of the system.
Yandex search is a key component of data collection for Clean Internet. The second component is a crawler for social networks, developed by Vector X LLC. It looks for posts on VKontakte, Odnoklassniki, My World, Mail.ru Answers, LiveJournal, and partly on Telegram and YouTube. In 2023, according to the plans of the GRFC, Facebook, Instagram, Twitter, Tiktok, Yandex.Zen and Rutube will be added to the list.
The Yandex API is mentioned in Clean Internet deployment reports until January 2022 – and is likely still in use. Adding search from Mail.ru is scheduled for 2023, and Google for 2024.
On February 25, 2022, a day after the start of the war, Clean Internet was put to work searching for posts and comments with “calls for illegal rallies on the situation in Ukraine.”
Another Yandex product used by Roskomnadzor is Toloka. It is a crowdsourced service that helps prepare machine learning datasets.
Toloka works like this: the customer enters into an agreement with Yandex and uploads simple tasks to the service, for example, to classify images that will be used to train models. Tasks are distributed among people who register in the service; they fulfill them and receive a small monetary reward from the customer’s budget for this.
Toloka is mentioned in GRFC mail from the fall of 2021 to February 2022. The archive contains no trace of any negotiations with Yandex about the use of this service.
For about half a year, the GRFC used Toloka to have its employees mark up images on the topic of “suicidal content”. In this way the department prepared data for the model that was supposed to become part of the “Unified Analysis Module”, the AI of Clean Internet.
The latest available report, prepared on February 24, 2022, states that over the entire period of work, QMS operators marked up more than 120,000 images, and another 150,000 remained to be marked up before the work was completed. The correspondence also contains “duty schedules” in which the GRFC planned who would work with Toloka in the coming month, especially on weekends and holidays.
The extent of Yandex’s cooperation with Roskomnadzor and the GRFC on Toloka is unclear. The main question is whether Roskomnadzor agreed with Yandex that Toloka could be used to distribute tasks only among its own employees rather than among random performers.
Bringing in one's own performers is possible in the Toloka In-House version, which Yandex launched in the fall of 2022. The press service of Yandex told Mediazona that the company had never provided Roskomnadzor with access to the in-house mode in Toloka.
Another part of the Clean Internet project is the bot farm. It is being developed within the GRFC itself; the final version, according to the plans mentioned in the letters, should be submitted in May 2023.
The purpose of this bot farm differs from the usual one: the fake accounts are used not to publish messages but to collect posts on social networks, including from closed groups and communities.
“Points of information tension”: “Vepr”, “Oculus” and MIPT
It is rather difficult to call Yandex the company that helped build the control system for the Russian Internet: the IT giant gave the GRFC access to two services and, as far as can be judged from the correspondence, did not do so at the first request. But there are also those who worked closely with Roskomnadzor and developed entire products for the department.
In September 2021, journalists found two contracts published by the GRFC on the public procurement website: one for the concept of the Oculus image and video analysis system, and the second for the concept of the more extensive Vepr system. Both tenders were won by the Moscow Institute of Physics and Technology (MIPT): the Vepr concept was valued at 10 million rubles and the Oculus concept at 14 million.
In dozens of reports and development plans, the GRFC calls Vepr a key area: the system is needed in order to monitor and even predict the so-called “points of information tension”.
The description of Vepr is generally similar to that of Clean Internet: it collects posts and publications on the Internet and analyzes them using artificial intelligence. However, in Vepr the emphasis is not on searching for content for the registry but on deep analysis of it, for example working through scenarios that GRFC operators will be able to enter into the system. As an analogue, the documents cite a development by RTI JSC for the Ministry of Defense worth 1.5 billion rubles, which is “in many ways similar to the Vepr IS in the framework of countering information attacks.”
The scientific substantiation of Vepr was carried out by the Department of Machine Learning and Digital Humanities of the Moscow Institute of Physics and Technology. Dozens of employees worked on the document; it contains references to the philosophers Machiavelli and Ortega y Gasset, memes (for example, with Putin and Goebbels), and the mathematical principles of how language models work.
During development, MIPT paid great attention to the classification of “points of information tension”. A 500-page, poorly structured document prepared by the institute lists all possible threats separately: terrorism and extremism, criticism of the authorities and non-systemic opposition, “LGBT propaganda”, childfree, drug addiction, draft evasion, “groups of death”, “offensive art actions”, Gene Sharp’s methods, and even “collecting your own boogers or clipped nails.”
At the same time, the MIPT was not allowed to develop the Vepr itself – the contract was received by the NeoBIT company from St. Petersburg.
At MIPT, officials were told about the possibilities for recognizing faces in images (including faces in masks), converting image captions into text, and classifying images and videos into categories: rallies, suicidal content, roofers and train surfers, prohibited logos, and symbols. Judging by the example given in the presentation, the neural network recognized the NATO emblem as a symbol of AUE.
One of the MIPT documents lists similar systems that could be purchased for “insurance”. For example, a search system for “forbidden content” was developed by OKAS LLC for the Center for the Study and Network Monitoring of the Youth Environment, and for face recognition, MIPT recommended analogues from the same OKAS LLC, NtechLab, VisionLabs, FSUE GosNIIAS and DIT Moscow.
In August 2022, the tender for the development of Oculus, worth 57.7 million rubles, was won by Access RDC LLC. The deadline for completion is December 2022. As noted by Kommersant, this company has not previously acted as a contractor in public procurement.
Brand Analytics and thousands of pages of reports