January 2025

  • One of the most talked-about topics in AI recently is DeepSeek and its newly launched R-1 model. Its innovative methodology, low operational cost, and high performance have created a substantial impact on the AI community and even affected the U.S. economy. Notably, major AI companies, including Nvidia, experienced significant stock price declines after the announcement.…

  • A study by Anthropic shows that language models, such as Claude 3 Opus, can fake alignment with training objectives to disguise their actual behaviors. Simply put, if you inform the model that it’s being trained and that non-compliance will lead to modification, there’s about a 15% chance it will comply strategically to avoid being changed. This study…

  • Andrew Ng’s article on AI deception is a standout in this issue. He provides an overview of research on when AI models become deceptive, highlighting six major tasks tested. A very interesting read. AI agent technology is anticipated to be a major trend in 2025, prompting the inclusion of two articles on the subject in…

  • I took a two-week vacation in China with my family during the Christmas and New Year holidays in December. It was a wonderful experience for my kids, as they had not been back to China since 2019. Although I worked in China during the COVID-19 pandemic, the strict travel restrictions made it impossible for them…

  • Happy New Year! The AI Security Newsletter was on a two-week pause while I vacationed with family in China. I hope all my readers enjoyed the holiday season. Now, I’m excited to return and share the latest AI security news with you. As we enter another thrilling year in the AI era, MIT Technology Review…