Google DeepMind researchers have developed a new AI system that excels at fact-checking, outperforming human annotators at lower cost, but critics question what 'superhuman' really means in this context.
SydeLabs launched its red teaming solution, SydeBox, on March 1, 2024, and has since seen adoption from 15+ enterprises that have detected over 10,000 vulnerabilities across 50+ applications/models.
MineOS enters the high-stakes race for enterprise AI governance with a new module designed to peer inside the black box of AI systems and provide the visibility needed for oversight and control.
According to AI21 Labs, Jamba can outperform traditional transformer-based models on generative reasoning tasks as measured by benchmarks such as HellaSwag.
This AI Impact Tour stop will bring together leaders in generative AI and enterprise security and feature case studies from industry giants like Honeywell and Ally Financial, showcasing how they are not only leveraging generative AI applications but also using AI to revolutionize security operations.
Beyond working to block prompt injection attacks that threaten safety and security, Microsoft has also introduced tooling focused on the reliability of gen AI apps.
According to Lightning AI, the compiler Thunder achieves up to a 40% speed-up for training LLMs when compared to unoptimized code in real-world scenarios.
While LLMs excel at semantic interpretation, their ability to interpret complex spatial and visual recognition differences is limited. Gaps in these two areas are why jailbreak attacks launched with ASCII art succeed.
ValidMind, a regulatory compliance platform for AI risk management at banks, raises $8.1 million in seed funding to automate model validation and documentation.
According to Zscaler's report, manufacturing generates the most AI traffic, accounting for 20.9% of all AI/ML transactions, followed by finance and insurance (19.9%) and services (16.8%).
There are more than 8,500 performance results in MLCommons' latest benchmark, covering a wide range of combinations of hardware, software and AI inference use cases.