Cisco researchers recently evaluated the DeepSeek R1 model using the HarmBench dataset and reported a 100% attack success rate. It looks like DeepSeek R1 has serious security issues, doesn’t it? However, Meta’s Llama 3.1 model also performed poorly, with a 96% attack success rate in the same test, while OpenAI’s closed-source o1 model came in at 25%. I believe the importance of this finding lies not in DeepSeek R1’s vulnerabilities alone but in the broader risks associated with open-source models. These models often lack the financial incentive to invest in robust security measures, leaving them more exposed to threats. It’s a clear signal that caution is necessary when we decide to deploy open-source models in our systems (see the Risks & Security section).
More. Read on.
Risks & Security
Securing the Digital Frontier: America’s AI Challenge
To safeguard national security, the U.S. must secure the digital high ground by advancing AI and quantum computing technologies. While AI accelerates attackers’ existing capabilities, it has not yet given threat actors fundamentally new ones. Maintaining leadership requires private-sector AI infrastructure advancements, public-sector technology reforms, and enhanced public-private collaboration on cyber defense. Through united efforts, America can reinforce its security and sustain its AI supremacy.
Securing Non-Human Identities: A Critical Priority
Non-human identities (NHIs) are crucial for software authentication, yet their mismanagement poses severe security threats. Recent breaches underscore the vital need for securing NHIs, as they are integral to development and cloud operations. The OWASP NHI Top 10 project addresses this by offering insights and raising awareness, aiming to bolster security measures for NHIs.
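To make the risk concrete, here is a minimal Python sketch of one of the most common NHI pitfalls the OWASP project warns about: hardcoded machine credentials. The SERVICE_API_TOKEN name and the secrets-manager suggestion are illustrative assumptions, not details from the OWASP project.

```python
import os

# Anti-pattern: a hardcoded service token checked into source control is a
# classic NHI mismanagement failure (leakable, unrotatable, unaudited).
# API_TOKEN = "sk-live-..."  # never do this

def get_service_token() -> str:
    """Fetch the workload's token from the environment at runtime.

    SERVICE_API_TOKEN is a hypothetical variable name; in production this
    would usually come from a secrets manager with rotation and audit logging.
    """
    token = os.environ.get("SERVICE_API_TOKEN")
    if not token:
        raise RuntimeError("SERVICE_API_TOKEN is not set; refusing to start")
    return token
```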
DeepSeek R1 Fails Safety Evaluation
Cisco researchers tested DeepSeek R1, an open-source AI model by Chinese firm DeepSeek, against 50 attacks designed to provoke harmful behavior. The model succumbed to every attempt, making it the least secure LLM tested to date. Using the HarmBench dataset, the evaluation spanned categories such as cybercrime and misinformation. DeepSeek’s launch has faced criticism over data security, with concerns about its practice of storing data on Chinese servers.
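For readers unfamiliar with this kind of benchmark, the metric being reported is the attack success rate (ASR): the fraction of adversarial prompts that elicit harmful output. A minimal sketch, assuming hypothetical generate and is_harmful stand-ins for the target model and the harm-judging classifier (the real harness lives in the HarmBench repository):

```python
from typing import Callable, Iterable

def attack_success_rate(
    prompts: Iterable[str],
    generate: Callable[[str], str],           # target model under test
    is_harmful: Callable[[str, str], bool],   # judge: (prompt, output) -> harmful?
) -> float:
    """Fraction of adversarial prompts that elicit harmful output."""
    prompts = list(prompts)
    if not prompts:
        raise ValueError("need at least one prompt")
    successes = sum(1 for p in prompts if is_harmful(p, generate(p)))
    return successes / len(prompts)

# A model that complies with all 50 prompts scores 1.0, i.e. the 100% ASR
# Cisco reported for DeepSeek R1; blocking every prompt would score 0.0.
```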
Securing RAG Pipelines with DeepSeek
This guide examines securing Retrieval-Augmented Generation (RAG) pipelines that use DeepSeek, centered on fine-grained authorization for large language models (LLMs). It underscores the critical need for robust permission systems to ensure secure data access, employing tools like SpiceDB for relationship-based access control (ReBAC). It contrasts post-filter and pre-filter authorization approaches, improving data safety and compliance. A follow-up guide with code examples is also available for further exploration.
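As a rough illustration of the two approaches, here is a minimal Python sketch. The check, vector_search, list_viewable_docs, and vector_search_in callables are hypothetical stand-ins; in the guide’s setup, the permission checks would be backed by SpiceDB APIs such as CheckPermission and LookupResources.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Chunk:
    doc_id: str
    text: str

def post_filter(query: str, user: str,
                vector_search: Callable[[str, int], List[Chunk]],
                check: Callable[[str, str], bool],
                k: int = 5) -> List[Chunk]:
    # Post-filter: retrieve first, then drop chunks the user cannot view.
    # Simple to add, but over-fetches and can return fewer than k results.
    candidates = vector_search(query, k * 4)
    allowed = [c for c in candidates if check(user, c.doc_id)]
    return allowed[:k]

def pre_filter(query: str, user: str,
               list_viewable_docs: Callable[[str], List[str]],
               vector_search_in: Callable[[str, List[str], int], List[Chunk]],
               k: int = 5) -> List[Chunk]:
    # Pre-filter: resolve the user's viewable documents up front (e.g., via
    # a SpiceDB LookupResources call) and constrain the vector search to them,
    # so unauthorized chunks never enter the LLM's context.
    doc_ids = list_viewable_docs(user)
    return vector_search_in(query, doc_ids, k)
```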
Technology & Tools
Constitutional Classifiers: Bolstering AI Defenses Against Jailbreaks
Anthropic’s Safeguards Research Team has introduced Constitutional Classifiers, a novel defense against AI jailbreaks. At a moderate compute cost, the system blocks 95% of jailbreak attempts, compared with 14% for the unguarded baseline. A live demo invites users to challenge the system, with rewards of up to $20,000 for a successful universal jailbreak. The effort aims to enhance AI safety and robustness.
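Conceptually, the defense wraps the model with classifiers on both the input and output sides. A minimal sketch of that gating pattern, with hypothetical model, input_clf, and output_clf functions standing in for Anthropic’s trained classifiers:

```python
from typing import Callable

REFUSAL = "I can't help with that request."

def guarded_generate(prompt: str,
                     model: Callable[[str], str],
                     input_clf: Callable[[str], float],   # P(prompt is a jailbreak)
                     output_clf: Callable[[str], float],  # P(output is harmful)
                     threshold: float = 0.5) -> str:
    # Screen the prompt on the way in...
    if input_clf(prompt) >= threshold:
        return REFUSAL
    draft = model(prompt)
    # ...and the draft response on the way out.
    if output_clf(draft) >= threshold:
        return REFUSAL
    return draft
```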
Mitigating AI Hallucinations with Amazon Bedrock
Amazon Bedrock Agents offer a robust solution to mitigate hallucinations in large language models (LLMs), crucial in sensitive fields like healthcare and finance. By integrating Retrieval Augmented Generation (RAG) and human-in-the-loop processes, the system improves factual accuracy. A custom hallucination score triggers human intervention, ensuring reliability. This flexible, scalable approach, detailed in a GitHub repository, demonstrates effective hallucination detection and remediation, adaptable to various use cases.
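The core control flow is straightforward: score the generated answer against the retrieved context, and escalate to a human when the score crosses a threshold. A minimal sketch, with hypothetical hallucination_score and route_to_human_review functions (the AWS sample implements these within Bedrock Agents):

```python
from typing import Callable, List

def answer_with_guardrail(question: str,
                          answer: str,
                          context: List[str],
                          hallucination_score: Callable[[str, List[str]], float],
                          route_to_human_review: Callable[[str, str], str],
                          threshold: float = 0.7) -> str:
    # Score how unsupported the answer is by the retrieved RAG context
    # (0 = fully grounded, 1 = entirely unsupported).
    score = hallucination_score(answer, context)
    if score > threshold:
        # Low confidence the answer is grounded: escalate to a human
        # reviewer instead of returning it directly.
        return route_to_human_review(question, answer)
    return answer
```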
Business & Products
Backline Emerges with AI-Powered Security Solution
Backline, a new startup co-founded by Maor Goldberg, debuts with a seed round led by StageOne Ventures. Utilizing AI agents, Backline aims to ease the burden of security alerts on developers and security teams by automatically remediating vulnerabilities. Its technology, built on large language models, addresses the industry’s overwhelming security backlogs. Backline plans to expand its focus to include software supply chain issues, attracting support from Evolution Equity Partners and Gradient.
Regulation & Policy
Copyright Office Clarifies AI-Generated Content Protection
The U.S. Copyright Office’s Part 2 report affirms that AI-generated outputs can be copyrighted if a human author is involved in creating expressive elements. The Office maintains that AI-assisted creations remain protected, but purely machine-determined content does not. No changes to existing laws are deemed necessary. This report is part of a broader initiative addressing AI’s intersection with copyright, with Part 3 to discuss AI model training on copyrighted works.
Opinions & Analysis
Yann LeCun Envisions a New Era for AI and Robotics
Meta’s chief AI scientist, Yann LeCun, forecasts a transformative leap in AI architectures over the next three to five years, envisioning systems that eclipse today’s capabilities. He predicts a ‘decade of robotics’ marked by AI-driven intelligent applications. LeCun critiques current models for their deficiencies in understanding and reasoning, advocating for ‘world models’ with enhanced real-world comprehension and common sense within five years.
Agentic AI: The Next Frontier for IT Leaders
The latest 2025 Connectivity Benchmark Report by MuleSoft and Deloitte Digital reveals a significant shift towards agentic AI, with 93% of IT leaders planning to introduce autonomous AI agents in the next two years. Despite the enthusiasm, challenges remain: enterprise applications are siloed, with only 29% integrated. Integration hurdles persist, but APIs offer a promising solution, improving IT infrastructure and user experiences for 55% of leaders surveyed.
