Meta’s overhaul of its content moderation has drawn scrutiny, with fresh data revealing a rise in harmful posts since the company relaxed its rules. The latest Integrity Report, the first released since the January policy update, shows an increase in violent and harassing content on Facebook even as the number of removed posts and enforcement actions has dropped sharply. It offers the first clear look at the effects of CEO Mark Zuckerberg’s decision to pull back on proactive moderation across Facebook, Instagram, and Threads.
The findings highlight the potential downsides of Meta’s new approach, which prioritises reducing moderation mistakes and allowing more political speech but appears to be accompanied by a measurable increase in harmful content.
Rise in Harmful Posts
Meta reported that the prevalence of violent and graphic content on Facebook increased from 0.06–0.07 per cent in late 2024 to 0.09 per cent in the first quarter of 2025. While these numbers may look small, they translate into a large volume of exposure on a platform used by billions of people.
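To put those figures in perspective, here is a rough back-of-the-envelope calculation. It assumes, as Meta’s enforcement reports generally do, that prevalence is measured as the share of content views rather than the share of posts; the daily view count used is an illustrative round number, not a Meta disclosure.

```python
# Rough scale of a 0.06% -> 0.09% prevalence shift.
# daily_views is an assumed round number for illustration only.

daily_views = 100_000_000_000  # hypothetical 100 billion content views per day

for label, prevalence in [("late 2024 (0.06%)", 0.0006),
                          ("Q1 2025 (0.09%)", 0.0009)]:
    print(f"{label}: ~{prevalence * daily_views:,.0f} violating views/day")

# The 0.03-point increase alone corresponds to roughly
# 30 million additional views of violating content per day at this scale.
```

Even under conservative assumptions, a shift of a few hundredths of a percentage point implies tens of millions of additional daily exposures.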
Similarly, the rate of bullying and harassment rose over the same period, a change Meta attributes to a spike in violations in March. “There was a small increase in the prevalence of bullying and harassment content from 0.06-0.07 per cent to 0.07-0.08 per cent on Facebook due to a spike in sharing of violating content in March,” the report states. These figures reverse previously declining trends and raise questions about the effectiveness of Meta’s current enforcement strategy.
The recent spike in harmful content on Meta platforms comes as the company significantly reduces its content removals. In the first quarter of 2025, only 3.4 million posts were actioned under Meta’s hate speech policy — the lowest number since 2018. Spam removals also saw a steep drop, from 730 million at the end of 2024 to 366 million in early 2025. Fake account takedowns on Facebook fell from 1.4 billion to 1 billion. Meta has yet to release similar figures for Instagram.
These changes follow Meta’s shift away from wide-scale proactive moderation. The company is now focusing on only the most severe violations, such as child exploitation and terrorism. Posts related to sensitive topics like immigration, gender identity, and race — once moderated — are now treated as political speech and face less scrutiny.
Meta has also narrowed its definition of hate speech, which now covers only direct attacks and dehumanising language. Content previously flagged for expressing contempt or exclusion is allowed under the revised rules.
Fact-Checking Revamp
Another major change came in early 2025, when Meta ended its third-party fact-checking partnerships in the U.S. and introduced a crowd-sourced alternative called Community Notes, now active on Facebook, Instagram, and Threads and, more recently, on Reels and Threads replies.
While Meta has not yet released data on how frequently these notes are used or how effective they have been, the company says further updates will be provided in future reports. Some experts have raised concerns about the potential for bias or manipulation in a system that relies heavily on user-generated input without established editorial oversight.
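For readers unfamiliar with how such crowd-sourced systems try to resist manipulation, the sketch below illustrates the “bridging” idea behind X’s open-source Community Notes algorithm, which Meta has said its system initially builds on: a note is published only when raters who usually disagree both find it helpful. The grouping, thresholds, and function names here are illustrative assumptions, not Meta’s implementation.

```python
# Minimal sketch of "bridging"-style note rating: a note surfaces only if
# raters from *different* viewpoint groups independently rate it helpful,
# which limits single-faction manipulation. All parameters are illustrative.

from collections import defaultdict

def note_is_helpful(ratings, min_per_group=3, threshold=0.6):
    """ratings: list of (rater_group, helpful_bool) pairs."""
    by_group = defaultdict(list)
    for rater_group, helpful in ratings:
        by_group[rater_group].append(helpful)
    if len(by_group) < 2:
        return False  # agreement within a single group is not enough
    return all(
        len(votes) >= min_per_group and sum(votes) / len(votes) >= threshold
        for votes in by_group.values()
    )

ratings = [("A", True), ("A", True), ("A", True),
           ("B", True), ("B", True), ("B", False)]
print(note_is_helpful(ratings))  # True: both groups mostly agree
```

The design choice this illustrates is also the source of the concern experts raise: without editorial oversight, the quality of the outcome depends entirely on who rates and how the groups are drawn.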
Despite the rise in certain types of harmful content, Meta is positioning the new moderation approach as a success, particularly in reducing enforcement errors. According to the company, moderation mistakes dropped by approximately 50 per cent in the United States between the last quarter of 2024 and the first quarter of 2025.
Meta has not detailed how it calculates this figure, but says future reports will include metrics specifically tracking error rates to improve transparency. The company noted that it is working to “strike the right balance” between under-enforcement and overreach.
Teen Safety Still a Priority, Says Meta
While Meta is scaling back moderation for adults, the company says it will continue strong protections for teens. Safeguards against bullying and harmful content remain in place for younger users, and new “Teen Accounts” are being introduced to better filter inappropriate material across Facebook, Instagram, and other platforms.
Meta also stressed the growing role of artificial intelligence in content moderation. Large language models now help detect and assess posts, and the company says they outperform human moderators in some policy areas. In certain cases, these AI tools remove posts from human review queues when they are highly confident that no policy violation exists.
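The report does not detail how these systems work, but the pattern it describes, automatically clearing items out of the human review queue when a model is highly confident they are benign, can be sketched as follows. The classifier, labels, and threshold are hypothetical stand-ins, not Meta’s actual models or policies.

```python
# Hypothetical illustration of AI-assisted review triage: a model clears
# posts from the human review queue when it is highly confident no policy
# violation exists. Everything here is an illustrative assumption.

from dataclasses import dataclass

@dataclass
class Post:
    id: str
    text: str

def classify(post: Post) -> tuple[str, float]:
    """Stand-in for an LLM classifier returning (label, confidence)."""
    # A real system would call a trained model here.
    benign_words = {"recipe", "sunset", "birthday"}
    hits = sum(w in post.text.lower() for w in benign_words)
    return ("no_violation", 0.99) if hits else ("uncertain", 0.40)

CLEAR_THRESHOLD = 0.95  # assumed cutoff for skipping human review

def triage(queue: list[Post]) -> list[Post]:
    """Return only the posts that still need human review."""
    needs_review = []
    for post in queue:
        label, confidence = classify(post)
        if label == "no_violation" and confidence >= CLEAR_THRESHOLD:
            continue  # auto-cleared: model is highly confident it is benign
        needs_review.append(post)
    return needs_review

queue = [Post("1", "My birthday cake recipe"), Post("2", "ambiguous post")]
print([p.id for p in triage(queue)])  # ['2']
```

The open question such a design raises mirrors the report’s own tension: every item the model clears with high confidence is one a human never sees, so any systematic blind spot in the model translates directly into under-enforcement.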