News
Hub researchers use concept-based attack to stress test AI safety
Hub researchers have developed a method for testing AI safety that uses human-like concepts to trick generative models into making mistakes.