What’s really on AI researchers' minds?

Among the activities at last month's generative AI research retreat was a simple but revealing exercise: propose topics, vote on them, and discuss the winners. The result was an unfiltered look at the burning questions that are on our researchers’ minds.


Yuhua Li (centre) leads a discussion during the round table session at the Gen AI Hub Retreat

  • Who should be judging the quality of generative examples?

  • Should the quality of generative samples be the only criterion for outputs?

These were just two of the questions raised during a discussion led by Yukun Lai of Cardiff University, which explored the evaluation of subjective criteria for GenAI results.

Of the 20 subjects proposed for discussion, Yukun's "evaluation of subjective criteria for GenAI results" drew the most votes, with questions around how to better reflect human diversity and creativity in Gen AI outputs among the key considerations.

The group covered the need for standard benchmarks for creativity in generated content, and asked whether quantitative metrics can be meaningful when human perspectives on creativity are so diverse.

The group also wrestled with the question of who should be judging the quality of generative outputs, particularly when stakeholders with varying needs will measure success differently. This led to a broader discussion about what belongs in any benchmark beyond quality alone, from ethical constraints to more subjective, domain-specific criteria.


Arshdeep Singh (left) feeds back following the discussion exploring the evaluation of subjective criteria for GenAI results, led by Yukun Lai (second from left).

Also among the four selected topics was the challenge of representing multimodal data in a single unified space: how to build AI models capable of handling all data types simultaneously. Unlike the human brain, which processes text, images, audio and video together with ease, AI models encode each separately, demanding enormous computing power and lengthy development times.

The discussion, proposed and facilitated by Yuhua Li from Cardiff University, explored the idea of a single shared encoder that learns all data modalities at the same time: mapping raw data into a shared latent space and using learning strategies, such as contrastive learning, to better align the different modalities. The group also observed how new computing paradigms are shifting away from traditional encoding approaches.

After the session, Yuhua reflected on the discussion: “We explored how today’s multimodal systems are typically built by learning separate representations for each modality and aligning them after the fact, through techniques such as contrastive learning.

“While effective, this paradigm comes at a high cost: scaling generative models on top of these representations requires enormous computational resources and long development cycles.

“The discussion, therefore, pointed toward a more ambitious vision: rethinking how multimodal data are represented in the first place, and exploring new paradigms that could encode different modalities natively within a truly unified space, in a way that is both more efficient and more aligned with how humans integrate information.”
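The contrastive alignment Yuhua mentions can be illustrated with a minimal CLIP-style loss: paired embeddings from two modalities (say, an image and its caption) are pulled together in the shared latent space, while mismatched pairs are pushed apart. This is an illustrative sketch in NumPy, not code from the retreat; the function name, the `temperature` value, and the image/text framing are our assumptions.

```python
import numpy as np

def contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE-style loss over a batch of paired embeddings
    (a minimal CLIP-style sketch; names are illustrative)."""
    # L2-normalise so the dot product becomes cosine similarity
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # Pairwise similarity matrix: entry (i, j) compares image i with text j
    logits = image_emb @ text_emb.T / temperature

    # Matching pairs sit on the diagonal; each row is treated as a
    # classification problem whose correct class is its own pair
    labels = np.arange(len(logits))

    def cross_entropy(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # Symmetric: align images to texts and texts to images
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

Well-aligned pairs (nearly identical embeddings on the diagonal) yield a low loss, while shuffled pairings yield a high one, which is what drives the two modalities towards a shared space during training.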

The other two round table discussions covered AI for science, proposed by Joshua Placidi of Imperial College London, and structured, or constrained prediction from Generative AI, which was suggested by Neill Campbell of UCL. 

Here are the ten most popular topics that were proposed for the round table discussions:

  1. Evaluation of subjective criteria for Gen AI results*

  2. Structured (or constrained) prediction from generative AI. How do we generate non-vector data or incorporate domain knowledge (e.g. physics or chemistry) most effectively?*

  3. Unified representation of multimodal data in a single shared space like human brains, so that it can be fed into generative models for learning and inference*

  4. AI for Science, moving beyond the image and text domains*

  5. Is the future of generative AI dominated by a few general foundation models, or by an ecosystem of specialised, domain-aware models? 

  6. Generalisation beyond what has been memorised in training data; pattern-independent reasoning

  7. How to validate multi-agent systems. How do we minimise compounding errors or collusion, and mitigate black swan events?

  8. Representation learning in Gen AI

  9. Reliability and generalisation remain areas which, if improved, could increase adoption of generative AI in pharma. What is the solution? Data collection? New architectures? Both?

  10. 4D Content Generation: Integrating multimodal generative AI and VR technologies to synthesize high-fidelity, dynamic 4D scenes and characters.

If you are interested in exploring any of these areas further, check out our Live Projects Board to discover which researchers are grappling with these challenges.

Yukun Lai recently took part in our Meet the Researchers Q&A. You can read it on our website.

*Topic selected for discussion

Rosie Niven

Rosie joined the hub from the regional university consortium Science and Engineering South, where she was a Communications and Events Manager. Since 2020 she has held a number of communications roles at UCL. Previously a journalist, Rosie has worked in higher education organisations since 2014, including Jisc and Universities UK, where she edited the Efficiency Exchange website.
