Eye On AI

Week Ending 12.10.2023

RESEARCH WATCH: 12.10.2023

SPONSORED BY

Digimarc digital watermarks invisibly guard your digital assets to protect against misuse, prove copyright ownership, and verify authenticity. In an era of artificial intelligence, don’t leave your images and other digital content exposed. Demand superior content protection and maintain trust in your brand with Digimarc.

Check out Digimarc - https://www.digimarc.com/

A Distributed ADMM-based Deep Learning Approach for Thermal Control in Multi-Zone Buildings

The paper proposes a method for optimizing temperature setpoints in multi-zone buildings for demand response, using distributed optimization and deep learning. This has applications in reducing peak power usage and operating costs in smart buildings and grids.

Authors:  Vincent Taboga, Hanane Dagdougui

Link:  https://arxiv.org/abs/2312.05073v1

Date: 2023-12-08

Summary:

The surge in electricity use, coupled with the dependency on intermittent renewable energy sources, poses significant hurdles to effectively managing power grids, particularly during times of peak demand. Demand Response programs and energy conservation measures are essential to operate energy grids while ensuring a responsible use of our resources. This research combines distributed optimization using ADMM with Deep Learning models to plan indoor temperature setpoints effectively. A two-layer hierarchical structure is used, with a central building coordinator at the upper layer and local controllers at the thermal zone layer. The coordinator must limit the building's maximum power by translating the building's total power to local power targets for each zone. Local controllers can modify the temperature setpoints to meet the local power targets. The resulting control algorithm, called Distributed Planning Networks, is designed to be both adaptable and scalable to many types of buildings, tackling two of the main challenges in the development of such systems. The proposed approach is tested on an 18-zone building modeled in EnergyPlus. The algorithm successfully manages Demand Response peak events.
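
As a rough illustration of the coordination layer, here is a minimal consensus-ADMM sketch in Python for splitting a building-wide power cap into per-zone targets. It assumes a quadratic comfort cost per zone and treats the cap as an equality constraint; both are simplifications not taken from the paper, whose controllers plan temperature setpoints with deep learning models rather than solving a closed-form step.

```python
import numpy as np

# Minimal sharing-ADMM sketch: split a building-wide power cap across zones.
# Assumptions (not from the paper): quadratic comfort cost (p_i - d_i)^2 per
# zone and an equality cap sum(p) = P_MAX.
rng = np.random.default_rng(0)
N, P_MAX, RHO = 18, 40.0, 1.0
d = rng.uniform(1.0, 4.0, N)   # each zone's desired power (kW)
p = d.copy()                   # local power decisions
u = 0.0                        # scaled dual variable (shared scalar)
zbar = P_MAX / N               # per-zone share implied by the cap

for k in range(200):
    # Zone step: closed-form argmin of local cost + ADMM penalty.
    pbar = p.mean()
    p = (2 * d + RHO * (p - pbar + zbar - u)) / (2 + RHO)
    # Coordinator step: the cap fixes the average; update the dual.
    u = u + p.mean() - zbar

print("total power:", p.sum(), "(cap:", P_MAX, ")")
print("per-zone deviation from desired:", np.round(p - d, 3))
```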

--------------------------------------------------------------------------------------------------------

KwaiAgents: Generalized Information-seeking Agent System with Large Language Models

The paper introduces an agent system called KwaiAgents that uses large language models for generalized information seeking. This could enable more capable and flexible conversational AI assistants across domains like education, e-commerce, and social media.

Authors:  Haojie Pan, Zepeng Zhai, Hao Yuan, Yaojia Lv, Ruiji Fu, Ming Liu, Zhongyuan Wang, Bing Qin

Link:  https://arxiv.org/abs/2312.04889v1

Date: 2023-12-08

Summary:

Driven by curiosity, humans have continually sought to explore and understand the world around them, leading to the invention of various tools to satiate this inquisitiveness. Despite not having the capacity to process and memorize vast amounts of information in their brains, humans excel in critical thinking, planning, reflection, and harnessing available tools to interact with and interpret the world, enabling them to find answers efficiently. The recent advancements in large language models (LLMs) suggest that machines might also possess the aforementioned human-like capabilities, allowing them to exhibit powerful abilities even with a constrained parameter count. In this paper, we introduce KwaiAgents, a generalized information-seeking agent system based on LLMs. Within KwaiAgents, we propose an agent system that employs LLMs as its cognitive core, which is capable of understanding a user's query, following behavior guidelines, and referencing external documents. The agent can also update and retrieve information from its internal memory, plan and execute actions using a time-aware search-browse toolkit, and ultimately provide a comprehensive response. We further investigate the system's performance when powered by LLMs less advanced than GPT-4, and introduce the Meta-Agent Tuning (MAT) framework, designed to ensure even an open-sourced 7B or 13B model performs well among many agent systems. We employ both benchmark and human evaluations to systematically validate these capabilities. Extensive experiments show the superiority of our agent system compared to other autonomous agents and highlight the enhanced generalized agent-abilities of our fine-tuned LLMs.
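
A minimal, hypothetical sketch of the plan-act-respond loop described above; llm() and web_search() are placeholders for a chat backend and the time-aware search-browse toolkit, not the KwaiAgents API.

```python
import datetime

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in any chat-completion backend here")

def web_search(query: str) -> str:  # stand-in for the search-browse toolkit
    raise NotImplementedError

memory: list[str] = []  # naive internal memory: past observations

def answer(user_query: str, max_steps: int = 3) -> str:
    now = datetime.date.today().isoformat()  # "time-aware": expose the date
    for _ in range(max_steps):
        plan = llm(f"Today is {now}. Memory: {memory}. "
                   f"Query: {user_query}. Next action or FINISH?")
        if plan.startswith("FINISH"):
            break
        memory.append(web_search(plan))  # act, then store the observation
    return llm(f"Memory: {memory}. Write the final answer to: {user_query}")
```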

--------------------------------------------------------------------------------------------------------

HuRef: HUman-REadable Fingerprint for Large Language Models

The paper proposes a fingerprinting method called HuRef to identify the base model of large language models, enabling copyright protection and model identification. This supports responsible AI practices around licensing and auditing for deployed LLMs.

Authors:  Boyi Zeng, Chenghu Zhou, Xinbing Wang, Zhouhan Lin

Link:  https://arxiv.org/abs/2312.04828v1

Date: 2023-12-08

Summary:

Protecting the copyright of large language models (LLMs) has become crucial due to their resource-intensive training and accompanying carefully designed licenses. However, identifying the original base model of an LLM is challenging due to potential parameter alterations through fine-tuning or continued pretraining. In this study, we introduce HuRef, a human-readable fingerprint for LLMs that uniquely identifies the base model without exposing model parameters or interfering with training. We first observe that the vector direction of LLM parameters remains stable after the model has converged during pretraining, showing negligible perturbations through subsequent training steps, including continued pretraining, supervised fine-tuning (SFT), and RLHF, which makes it a sufficient condition to identify the base model. The necessity is validated by continuing to train an LLM with an extra term that drives the parameters' direction away, upon which the model becomes damaged. However, this direction is vulnerable to simple attacks like dimension permutation or matrix rotation, which significantly change it without affecting performance. To address this, leveraging the Transformer structure, we systematically analyze potential attacks and define three invariant terms that identify an LLM's base model. We make these invariant terms human-readable by mapping them to a Gaussian vector using a convolutional encoder and then converting it into a natural image with StyleGAN2. Our method generates a dog image as an identity fingerprint for an LLM, where the dog's appearance strongly indicates the LLM's base model. Experimental results across various LLMs demonstrate the effectiveness of our method: the generated dog image remains invariant to different training steps, including SFT, RLHF, or even continued pretraining with augmented vocabulary in a new language.
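
The stability observation is easy to illustrate: with random arrays standing in for flattened checkpoints, the cosine similarity between a "base" and a lightly perturbed "fine-tuned" parameter vector stays near 1, while an unrelated vector sits near 0. This sketch shows only the naive direction test; the paper's actual fingerprint uses the three attack-invariant terms.

```python
import numpy as np

# Random arrays stand in for real checkpoints; with real models you would
# flatten and concatenate their weight tensors.
rng = np.random.default_rng(0)
base = rng.normal(size=1_000_000)                     # base-model params
finetuned = base + 0.01 * rng.normal(size=base.size)  # small SFT-style drift
unrelated = rng.normal(size=base.size)                # a different base model

def direction_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(direction_similarity(base, finetuned))  # ~1.0: same lineage
print(direction_similarity(base, unrelated))  # ~0.0: different lineage
```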

--------------------------------------------------------------------------------------------------------

AI safety by debate via regret minimization

The paper analyzes whether debate between AI systems, framed as regret minimization, can help ensure safe and beneficial AI. This provides a technique to enhance reliability and oversight for real-world AI deployments.

Authors:  Xinyi Chen, Angelica Chen, Dean Foster, Elad Hazan

Link:  https://arxiv.org/abs/2312.04792v1

Date: 2023-12-08

Summary:

We consider the setting of AI safety by debate as a repeated game. We consider the question of efficient regret minimization in this setting, when the players are either AIs or humans, equipped with access to computationally superior AIs. In such a setting, we characterize when internal and external regret can be minimized efficiently. We conclude with conditions in which a sequence of strategies converges to a correlated equilibrium.
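
For readers unfamiliar with regret minimization, Hedge (multiplicative weights) is the canonical external-regret minimizer that analyses like this one build on. The sketch below uses random losses purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
K, T = 4, 5000                 # actions (e.g., debate strategies), rounds
eta = np.sqrt(np.log(K) / T)   # standard learning rate
w = np.ones(K)
cum_loss, cum_action_loss = 0.0, np.zeros(K)

for t in range(T):
    p = w / w.sum()            # play the mixed strategy p
    loss = rng.uniform(size=K) # adversary reveals per-action losses
    cum_loss += p @ loss
    cum_action_loss += loss
    w *= np.exp(-eta * loss)   # exponential weight update

regret = cum_loss - cum_action_loss.min()  # vs. best fixed action
print(f"external regret: {regret:.1f}  (bound ~ sqrt(T log K) = "
      f"{np.sqrt(T * np.log(K)):.1f})")
```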

--------------------------------------------------------------------------------------------------------

The Graph Lottery Ticket Hypothesis: Finding Sparse, Informative Graph Structure

The paper studies graph lottery tickets, which are extremely sparse subgraphs that can match full graph performance. This shows promise for finding compact informative graph structure to improve graph-based learning.

Authors:  Anton Tsitsulin, Bryan Perozzi

Link:  https://arxiv.org/abs/2312.04762v1

Date: 2023-12-08

Summary:

Graph learning methods help utilize implicit relationships among data items, thereby reducing training label requirements and improving task performance. However, determining the optimal graph structure for a particular learning task remains a challenging research problem. In this work, we introduce the Graph Lottery Ticket (GLT) Hypothesis - that there is an extremely sparse backbone for every graph, and that graph learning algorithms attain comparable performance when trained on that subgraph as on the full graph. We identify and systematically study 8 key metrics of interest that directly influence the performance of graph learning algorithms. Subsequently, we define the notion of a "winning ticket" for graph structure - an extremely sparse subset of edges that can deliver a robust approximation of the entire graph's performance. We propose a straightforward and efficient algorithm for finding these GLTs in arbitrary graphs. Empirically, we observe that the performance of different graph learning algorithms can be matched or even exceeded on graphs with an average degree as low as 5.
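
A toy version of the pruning target (not the paper's algorithm): keep just enough of the highest-weight edges to hit a chosen average degree, using the identity avg_degree = 2|E|/|V|. The random weights are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes = 1000
edges = [(i, j, rng.random()) for i in range(n_nodes)
         for j in rng.choice(n_nodes, 20, replace=False) if i < j]

target_avg_degree = 5
# avg degree = 2|E| / |V|  =>  keep |V| * d / 2 edges
keep = int(n_nodes * target_avg_degree / 2)
ticket = sorted(edges, key=lambda e: e[2], reverse=True)[:keep]

print(f"full graph:    {len(edges)} edges, "
      f"avg degree {2 * len(edges) / n_nodes:.1f}")
print(f"sparse ticket: {len(ticket)} edges, "
      f"avg degree {2 * len(ticket) / n_nodes:.1f}")
```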

--------------------------------------------------------------------------------------------------------

Digital Life Project: Autonomous 3D Characters with Social Intelligence

The Digital Life Project presents a framework to create autonomous virtual characters that can engage in dialogue and express themselves through body language. This has potential to greatly advance interactive characters for entertainment, education, and social connection.

Authors:  Zhongang Cai, Jianping Jiang, Zhongfei Qing, Xinying Guo, Mingyuan Zhang, Zhengyu Lin, Haiyi Mei, Chen Wei, Ruisi Wang, Wanqi Yin, Xiangyu Fan, Han Du, Liang Pan, Peng Gao, Zhitao Yang, Yang Gao, Jiaqi Li, Tianxiang Ren, Yukun Wei, Xiaogang Wang, Chen Change Loy, Lei Yang, Ziwei Liu

Link:  https://arxiv.org/abs/2312.04547v1

Date: 2023-12-07

Summary:

In this work, we present Digital Life Project, a framework utilizing language as the universal medium to build autonomous 3D characters that are capable of engaging in social interactions and expressing themselves with articulated body motions, thereby simulating life in a digital environment. Our framework comprises two primary components: 1) SocioMind: a meticulously crafted digital brain that models personalities with systematic few-shot exemplars, incorporates a reflection process based on psychology principles, and emulates autonomy by initiating dialogue topics; 2) MoMat-MoGen: a text-driven motion synthesis paradigm for controlling the character's digital body. It integrates motion matching, a proven industry technique to ensure motion quality, with cutting-edge advancements in motion generation for diversity. Extensive experiments demonstrate that each module achieves state-of-the-art performance in its respective domain. Collectively, they enable virtual characters to initiate and sustain dialogues autonomously, while evolving their socio-psychological states. Concurrently, these characters can perform contextually relevant bodily movements. Additionally, a motion captioning module further allows the virtual character to recognize and appropriately respond to human players' actions. Homepage: https://digital-life-project.com/

--------------------------------------------------------------------------------------------------------

Using Large Language Models for Hyperparameter Optimization

The paper on using LLMs for hyperparameter optimization shows they can efficiently search complex parameter spaces, improving automated machine learning. This can make model development more accessible and optimized.

Authors:  Michael R. Zhang, Nishkrit Desai, Juhan Bae, Jonathan Lorraine, Jimmy Ba

Link:  https://arxiv.org/abs/2312.04528v1

Date: 2023-12-07

Summary:

This paper studies using foundational large language models (LLMs) to make decisions during hyperparameter optimization (HPO). Empirical evaluations demonstrate that in settings with constrained search budgets, LLMs can perform comparably or better than traditional HPO methods like random search and Bayesian optimization on standard benchmarks. Furthermore, we propose to treat the code specifying our model as a hyperparameter, which the LLM outputs, going beyond the capabilities of existing HPO approaches. Our findings suggest that LLMs are a promising tool for improving efficiency in the traditional decision-making problem of hyperparameter optimization.
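
A hypothetical sketch of the loop: the LLM sees the search history and proposes the next configuration. llm() and train_and_eval() are placeholders, and the JSON protocol is an assumed convention, not the paper's interface.

```python
import json

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in any chat-completion backend")

def train_and_eval(cfg: dict) -> float:
    raise NotImplementedError("train the model, return validation accuracy")

history = []  # (config, score) pairs observed so far
for trial in range(10):
    prompt = (f"History of (hyperparameters, val_accuracy): {history}\n"
              'Propose the next config as JSON, e.g. '
              '{"lr": 3e-4, "batch_size": 64}.')
    cfg = json.loads(llm(prompt))  # assumes the LLM returns valid JSON
    history.append((cfg, train_and_eval(cfg)))

best_cfg, best_score = max(history, key=lambda h: h[1])
print(best_cfg, best_score)
```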

--------------------------------------------------------------------------------------------------------

CLadder: A Benchmark to Assess Causal Reasoning Capabilities of Language Models

The CLadder benchmark evaluates causal reasoning abilities of language models, assessing if they can perform formal causal inference. Understanding these capacities is key for reliable deployment of LLMs.

Authors:  Zhijing Jin, Yuen Chen, Felix Leeb, Luigi Gresele, Ojasv Kamal, Zhiheng Lyu, Kevin Blin, Fernando Gonzalez Adauto, Max Kleiman-Weiner, Mrinmaya Sachan, Bernhard Schölkopf

Link:  https://arxiv.org/abs/2312.04350v1

Date: 2023-12-07

Summary:

The ability to perform causal reasoning is widely considered a core feature of intelligence. In this work, we investigate whether large language models (LLMs) can coherently reason about causality. Much of the existing work in natural language processing (NLP) focuses on evaluating commonsense causal reasoning in LLMs, thus failing to assess whether a model can perform causal inference in accordance with a set of well-defined formal rules. To address this, we propose a new NLP task, causal inference in natural language, inspired by the "causal inference engine" postulated by Judea Pearl et al. We compose a large dataset, CLadder, with 10K samples: based on a collection of causal graphs and queries (associational, interventional, and counterfactual), we obtain symbolic questions and ground-truth answers, through an oracle causal inference engine. These are then translated into natural language. We evaluate multiple LLMs on our dataset, and we introduce and evaluate a bespoke chain-of-thought prompting strategy, CausalCoT. We show that our task is highly challenging for LLMs, and we conduct an in-depth analysis to gain deeper insight into the causal reasoning abilities of LLMs. Our data is open-sourced at https://huggingface.co/datasets/causalNLP/cladder, and our code can be found at https://github.com/causalNLP/cladder.
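
The dataset is open-sourced at the URL above, so a quick look is one call away (split and field names below are discovered at runtime rather than assumed):

```python
from datasets import load_dataset

cladder = load_dataset("causalNLP/cladder")
print(cladder)  # splits and sizes
example = next(iter(cladder[list(cladder.keys())[0]]))
print(example)  # one natural-language causal question
```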

--------------------------------------------------------------------------------------------------------

A Transformer Model for Symbolic Regression towards Scientific Discovery

The symbolic regression transformer is a model tailored for discovering mathematical relationships in data, enabling interpretable ML and scientific discovery.

Authors:  Florian Lalande, Yoshitomo Matsubara, Naoya Chiba, Tatsunori Taniai, Ryo Igarashi, Yoshitaka Ushiku

Link:  https://arxiv.org/abs/2312.04070v1

Date: 2023-12-07

Summary:

Symbolic Regression (SR) searches for mathematical expressions which best describe numerical datasets. This makes it possible to circumvent interpretation issues inherent to artificial neural networks, but SR algorithms are often computationally expensive. This work proposes a new Transformer model aimed at Symbolic Regression, particularly focused on its application to Scientific Discovery. We propose three encoder architectures with increasing flexibility but at the cost of column-permutation equivariance violation. Training results indicate that the most flexible architecture is required to prevent overfitting. Once trained, we apply our best model to the SRSD datasets (Symbolic Regression for Scientific Discovery datasets), which yields state-of-the-art results using the normalized tree-based edit distance, at no extra computational cost.
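
The SR objective itself fits in a few lines: score candidate expressions by how well they reproduce a dataset. The candidates here are hard-coded for illustration; the paper's Transformer generates them from the data.

```python
import numpy as np
import sympy as sp

x = sp.symbols("x")
X = np.linspace(0.1, 5, 200)
y = 2.0 * np.sin(X) + X**2  # hidden ground-truth law

candidates = [2 * sp.sin(x) + x**2, sp.log(x) + x**2]
for expr in candidates:
    f = sp.lambdify(x, expr, "numpy")
    mse = float(np.mean((f(X) - y) ** 2))
    print(f"{sp.sstr(expr):25s}  MSE = {mse:.4f}")
```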

--------------------------------------------------------------------------------------------------------

An inductive bias from quantum mechanics: learning order effects with non-commuting measurements

The paper on learning order effects with quantum measurements leverages unique properties of quantum systems as an in-built bias for sequential data. This explores new quantum ML approaches.

Authors:  Kaitlin Gili, Guillermo Alonso, Maria Schuld

Link:  https://arxiv.org/abs/2312.03862v1

Date: 2023-12-06

Summary:

There are two major approaches to building good machine learning algorithms: feeding lots of data into large models, or picking a model class with an "inductive bias" that suits the structure of the data. When taking the second approach as a starting point to design quantum algorithms for machine learning, it is important to understand how mathematical structures in quantum mechanics can lead to useful inductive biases in quantum models. In this work, we bring a collection of theoretical evidence from the Quantum Cognition literature to the field of Quantum Machine Learning to investigate how non-commutativity of quantum observables can help to learn data with "order effects", such as the changes in human answering patterns when swapping the order of questions in a survey. We design a multi-task learning setting in which a generative quantum model consisting of sequential learnable measurements can be adapted to a given task -- or question order -- by changing the order of observables, and we provide artificial datasets inspired by human psychology to carry out our investigation. Our first experimental simulations show that in some cases the quantum model learns more non-commutativity as the amount of order effect present in the data is increased, and that the quantum model can learn to generate better samples for unseen question orders when trained on others - both signs that the model architecture suits the task.
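
The underlying mechanism can be shown with a few lines of linear algebra: for non-commuting observables, the probability of answering "yes" to both of two questions depends on which one is asked first, mirroring survey order effects.

```python
import numpy as np

# Pauli-X and Pauli-Z do not commute, so sequential measurement outcomes
# depend on the order in which the "questions" are asked.
plus_x = np.array([1, 1]) / np.sqrt(2)  # +1 eigenvector of Pauli-X
plus_z = np.array([1, 0])               # +1 eigenvector of Pauli-Z
psi = np.array([1, 0])                  # initial state |0>

def joint_plus(first, second, state):
    p1 = abs(first @ state) ** 2          # P(+1 on the first question)
    return p1 * abs(second @ first) ** 2  # then +1 on the collapsed state

print("P(+1,+1), X asked first:", joint_plus(plus_x, plus_z, psi))  # 0.25
print("P(+1,+1), Z asked first:", joint_plus(plus_z, plus_x, psi))  # 0.50
```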

--------------------------------------------------------------------------------------------------------

Data is Overrated: Perceptual Metrics Can Lead Learning in the Absence of Training Data

The work showing perceptual losses can train models without natural data demonstrates surprising generalization. It suggests future self-supervised methods based solely on human-inspired metrics.

Authors:  Tashi Namgyal, Alexander Hepburn, Raul Santos-Rodriguez, Valero Laparra, Jesus Malo

Link:  https://arxiv.org/abs/2312.03455v1

Date: 2023-12-06

Summary:

Perceptual metrics are traditionally used to evaluate the quality of natural signals, such as images and audio. They are designed to mimic the perceptual behaviour of human observers and usually reflect structures found in natural signals. This motivates their use as loss functions for training generative models such that models will learn to capture the structure held in the metric. We take this idea to the extreme in the audio domain by training a compressive autoencoder to reconstruct uniform noise, in lieu of natural data. We show that training with perceptual losses improves the reconstruction of spectrograms and re-synthesized audio at test time over models trained with a standard Euclidean loss. This demonstrates better generalisation to unseen natural signals when using perceptual metrics.
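
A sketch of the setup under stated assumptions: a small compressive autoencoder is trained to reconstruct uniform noise, with the loss switchable between plain MSE and a perceptual metric. perceptual_distance() is a stand-in for the established metrics the authors use.

```python
import torch
import torch.nn as nn

def perceptual_distance(a, b):
    raise NotImplementedError("plug in a differentiable perceptual metric")

# Compressive autoencoder: 256 -> 32 -> 256 (sizes are illustrative).
ae = nn.Sequential(nn.Linear(256, 32), nn.ReLU(), nn.Linear(32, 256))
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
use_perceptual = False  # flip to True once a metric is plugged in

for step in range(1000):
    x = torch.rand(64, 256)  # uniform noise, in lieu of natural data
    x_hat = ae(x)
    loss = (perceptual_distance(x_hat, x) if use_perceptual
            else ((x_hat - x) ** 2).mean())
    opt.zero_grad()
    loss.backward()
    opt.step()
```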

--------------------------------------------------------------------------------------------------------

Assertion Enhanced Few-Shot Learning: Instructive Technique for Large Language Models to Generate Educational Explanations

The paper on few-shot learning for explanation generation proposes a new prompting technique to improve the quality of explanations from language models. This could benefit education applications like intelligent tutoring systems.

Authors:  Tasmia Shahriar, Noboru Matsuda, Kelly Ramos

Link:  https://arxiv.org/abs/2312.03122v1

Date: 2023-12-05

Summary:

Human educators possess an intrinsic ability to anticipate and seek educational explanations from students, which drives them to pose thought-provoking questions when students cannot articulate these explanations independently. We aim to imbue Intelligent Tutoring Systems with this ability using the few-shot learning capability of Large Language Models. Our work proposes a novel prompting technique, Assertion Enhanced Few-Shot Learning, to facilitate the generation of accurate, detail-oriented educational explanations. Our central hypothesis is that, in the educational domain, few-shot demonstrations are necessary but not a sufficient condition for quality explanation generation. We conducted a study involving 12 in-service teachers, comparing our approach to Traditional Few-Shot Learning. The results show that Assertion Enhanced Few-Shot Learning improves explanation accuracy by 15% and yields higher-quality explanations, as evaluated by teachers. We also conduct a qualitative ablation study to isolate the impact of assertions and provide educator-friendly prompting guidelines for generating explanations in their domain of interest.
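
A hypothetical reconstruction of the prompt structure: explicit assertions are prepended to the usual few-shot demonstrations. The wording and demonstrations below are illustrative, not taken from the paper.

```python
assertions = [
    "The explanation must state WHY the step is correct, not just the step.",
    "The explanation must reference the specific problem quantities.",
]
demos = [("How do I isolate x in 2x + 3 = 9?",
          "Subtract 3 from both sides because addition is undone by "
          "subtraction, giving 2x = 6; then divide by 2 to get x = 3.")]

def build_prompt(question: str) -> str:
    parts = ["Assertions:"] + [f"- {a}" for a in assertions]
    for q, e in demos:  # few-shot demonstrations after the assertions
        parts += [f"Student: {q}", f"Tutor: {e}"]
    parts.append(f"Student: {question}\nTutor:")
    return "\n".join(parts)

print(build_prompt("Why do we flip the inequality when multiplying by -1?"))
```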

--------------------------------------------------------------------------------------------------------

ScAR: Scaling Adversarial Robustness for LiDAR Object Detection

The work on adversarial attacks against LiDAR models analyzes model sensitivity to 3D scaling. It provides both attacks and defense methods to improve robustness, useful for reliable autonomous driving systems.

Authors:  Xiaohu Lu, Hayder Radha

Link:  https://arxiv.org/abs/2312.03085v1

Date: 2023-12-05

Summary:

The adversarial robustness of a model is its ability to resist adversarial attacks in the form of small perturbations to input data. Universal adversarial attack methods such as the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD) are popular for LiDAR object detection, but they are often deficient compared to task-specific adversarial attacks. Additionally, these universal methods typically require unrestricted access to the model's information, which is difficult to obtain in real-world applications. To address these limitations, we present a black-box Scaling Adversarial Robustness (ScAR) method for LiDAR object detection. By analyzing the statistical characteristics of 3D object detection datasets such as KITTI, Waymo, and nuScenes, we have found that the model's prediction is sensitive to scaling of 3D instances. We propose three black-box scaling adversarial attack methods based on the available information: model-aware attack, distribution-aware attack, and blind attack. We also introduce a strategy for generating scaling adversarial examples to improve the model's robustness against these three scaling adversarial attacks. Comparison with other methods on public datasets under different 3D object detection architectures demonstrates the effectiveness of our proposed method.
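
The attack surface is easy to picture: uniformly scaling an instance's points about its centroid is a small, label-preserving change that detectors are nonetheless sensitive to. The 0.9 factor below is an arbitrary example, not a value from ScAR.

```python
import numpy as np

rng = np.random.default_rng(0)
# Points of a single detected object, offset from the sensor origin.
points = rng.normal(size=(500, 3)) + np.array([10.0, 2.0, 0.0])
center = points.mean(axis=0)

scale = 0.9
attacked = center + scale * (points - center)  # shrink about the centroid

print("extent before:", np.ptp(points, axis=0))
print("extent after: ", np.ptp(attacked, axis=0))
```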

--------------------------------------------------------------------------------------------------------

Let the LLMs Talk: Simulating Human-to-Human Conversational QA via Zero-Shot LLM-to-LLM Interactions

The research on using LLMs for conversational question answering aims to simulate student-teacher dialogues, evaluating the potential to replace costly human annotation. Success here could greatly advance the development of assistive dialogue agents.

Authors:  Zahra Abbasiantaeb, Yifei Yuan, Evangelos Kanoulas, Mohammad Aliannejadi

Link:  https://arxiv.org/abs/2312.02913v1

Date: 2023-12-05

Summary:

Conversational question-answering (CQA) systems aim to create interactive search systems that effectively retrieve information by interacting with users. To replicate human-to-human conversations, existing work uses human annotators to play the roles of the questioner (student) and the answerer (teacher). Despite its effectiveness, challenges exist as human annotation is time-consuming, inconsistent, and not scalable. To address this issue and investigate the applicability of large language models (LLMs) in CQA simulation, we propose a simulation framework that employs zero-shot learner LLMs for simulating teacher-student interactions. Our framework involves two LLMs interacting on a specific topic, with the first LLM acting as a student, generating questions to explore a given search topic. The second LLM plays the role of a teacher by answering questions and is equipped with additional information, including a text on the given topic. We implement both the student and teacher by zero-shot prompting the GPT-4 model. To assess the effectiveness of LLMs in simulating CQA interactions and understand the disparities between LLM- and human-generated conversations, we evaluate the simulated data from various perspectives. We begin by evaluating the teacher's performance through both automatic and human assessment. Next, we evaluate the performance of the student, analyzing and comparing the disparities between questions generated by the LLM and those generated by humans. Furthermore, we conduct extensive analyses to thoroughly examine the LLM performance by benchmarking state-of-the-art reading comprehension models on both datasets. Our results reveal that the teacher LLM generates lengthier answers that tend to be more accurate and complete. The student LLM generates more diverse questions, covering more aspects of a given topic.
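
A minimal sketch of the simulation setup: two zero-shot roles alternate turns on a topic, with the teacher grounded in a source text. chat() is a placeholder for a GPT-4-style backend; the prompts are illustrative.

```python
def chat(system: str, transcript: list[str]) -> str:
    raise NotImplementedError("plug in any chat-completion backend")

def simulate(topic: str, source_text: str, turns: int = 5) -> list[str]:
    student = f"You are a curious student exploring the topic: {topic}."
    teacher = f"You are a teacher. Answer using this text:\n{source_text}"
    transcript: list[str] = []
    for _ in range(turns):
        transcript.append("Q: " + chat(student, transcript))  # student asks
        transcript.append("A: " + chat(teacher, transcript))  # teacher answers
    return transcript
```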

--------------------------------------------------------------------------------------------------------

BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models

The cross-style image benchmark evaluates multimodal language models across various image transformations, revealing performance gaps. The analysis helps identify capabilities required for versatile computer vision systems.

Authors:  Rizhao Cai, Zirui Song, Dayan Guan, Zhenhao Chen, Xing Luo, Chenyu Yi, Alex Kot

Link:  https://arxiv.org/abs/2312.02896v2

Date: 2023-12-06

Summary:

Large Multimodal Models (LMMs) such as GPT-4V and LLaVA have shown remarkable capabilities in visual reasoning with common image styles. However, their robustness against diverse style shifts, crucial for practical applications, remains largely unexplored. In this paper, we propose a new benchmark, BenchLMM, to assess the robustness of LMMs against three different styles: artistic image style, imaging sensor style, and application style, where each style has five sub-styles. Utilizing BenchLMM, we comprehensively evaluate state-of-the-art LMMs and reveal: 1) LMMs generally suffer performance degradation when working with other styles; 2) an LMM outperforming another model on common styles does not guarantee its superior performance on other styles; 3) LMMs' reasoning capability can be enhanced by prompting LMMs to predict the style first, based on which we propose a versatile and training-free method for improving LMMs; 4) an intelligent LMM is expected to interpret the causes of its errors when facing stylistic variations. We hope that our benchmark and analysis can shed new light on developing more intelligent and versatile LMMs.

--------------------------------------------------------------------------------------------------------

Toward autocorrection of chemical process flowsheets using large language models

The paper exploring autocorrection of process diagrams with language models is a novel application of LLMs. If methods translate to real diagrams, it could automate tedious engineering workflows.

Authors:  Lukas Schulze Balhorn, Marc Caballero, Artur M. Schweidtmann

Link:  https://arxiv.org/abs/2312.02873v1

Date: 2023-12-05

Summary:

The process engineering domain widely uses Process Flow Diagrams (PFDs) and Process and Instrumentation Diagrams (P&IDs) to represent process flows and equipment configurations. However, the P&IDs and PFDs, hereafter called flowsheets, can contain errors causing safety hazards, inefficient operation, and unnecessary expenses. Correcting and verifying flowsheets is a tedious, manual process. We propose a novel generative AI methodology for automatically identifying errors in flowsheets and suggesting corrections to the user, i.e., autocorrecting flowsheets. Inspired by the breakthrough of Large Language Models (LLMs) for grammatical autocorrection of human language, we investigate LLMs for the autocorrection of flowsheets. The input to the model is a potentially erroneous flowsheet, and the model outputs suggestions for a corrected flowsheet. We train our autocorrection model on a synthetic dataset in a supervised manner. The model achieves a top-1 accuracy of 80% and a top-5 accuracy of 84% on an independent test dataset of synthetically generated flowsheets. The results suggest that the model can learn to autocorrect the synthetic flowsheets. We envision that flowsheet autocorrection will become a useful tool for chemical engineers.

--------------------------------------------------------------------------------------------------------

MIMONets: Multiple-Input-Multiple-Output Neural Networks Exploiting Computation in Superposition

The work on accelerating neural nets proposes processing multiple inputs simultaneously to reduce computational costs. This superposition approach natively exploits model capacity for efficiency gains.

Authors:  Nicolas Menet, Michael Hersche, Geethan Karunaratne, Luca Benini, Abu Sebastian, Abbas Rahimi

Link:  https://arxiv.org/abs/2312.02829v1

Date: 2023-12-05

Summary:

With the advent of deep learning, progressively larger neural networks have been designed to solve complex tasks. We take advantage of these capacity-rich models to lower the cost of inference by exploiting computation in superposition. To reduce the computational burden per input, we propose Multiple-Input-Multiple-Output Neural Networks (MIMONets) capable of handling many inputs at once. MIMONets augment various deep neural network architectures with variable binding mechanisms to represent an arbitrary number of inputs in a compositional data structure via fixed-width distributed representations. Accordingly, MIMONets adapt nonlinear neural transformations to process the data structure holistically, leading to a speedup nearly proportional to the number of superposed input items in the data structure. After processing in superposition, an unbinding mechanism recovers each transformed input of interest. MIMONets also provide a dynamic trade-off between accuracy and throughput by an instantaneous on-demand switching between a set of accuracy-throughput operating points, yet within a single set of fixed parameters. We apply the concept of MIMONets to both CNN and Transformer architectures resulting in MIMOConv and MIMOFormer, respectively. Empirical evaluations show that MIMOConv achieves about a 2-4x speedup at an accuracy delta within [+0.68, -3.18]% compared to WideResNet CNNs on CIFAR10 and CIFAR100. Similarly, MIMOFormer can handle 2-4 inputs at once while maintaining a high average accuracy within a [-1.07, -3.43]% delta on the Long Range Arena benchmark. Finally, we provide mathematical bounds on the interference between superposition channels in MIMOFormer. Our code is available at https://github.com/IBM/multiple-input-multiple-output-nets.
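
The bind-superpose-unbind mechanics can be demonstrated with holographic reduced representations (circular convolution), one of the classic variable-binding schemes; whether this matches MIMONets' exact binding is an assumption here, and the network itself is omitted to isolate the mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 4096, 3  # representation width, inputs carried at once
keys = rng.normal(size=(K, D)) / np.sqrt(D)
inputs = rng.normal(size=(K, D))

def bind(a, b):  # circular convolution via FFT
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=D)

def unbind(s, key):  # correlate with the key's approximate inverse
    inv = np.concatenate([key[:1], key[:0:-1]])
    return bind(s, inv)

# One fixed-width vector now carries all K inputs in superposition.
s = sum(bind(keys[i], inputs[i]) for i in range(K))
for i in range(K):
    est = unbind(s, keys[i])  # recovery is exact up to crosstalk noise
    cos = est @ inputs[i] / (np.linalg.norm(est) * np.linalg.norm(inputs[i]))
    print(f"input {i}: cosine to recovered estimate = {cos:.2f}")
```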

--------------------------------------------------------------------------------------------------------

LExCI: A Framework for Reinforcement Learning with Embedded Systems

The paper on a reinforcement learning framework for embedded systems proposes LExCI, an open-source tool enabling RL agent training directly on low-level hardware. This could help deploy optimized policies on real-world control systems.

Authors:  Kevin Badalian, Lucas Koch, Tobias Brinkmann, Mario Picerno, Marius Wegener, Sung-Yong Lee, Jakob Andert

Link:  https://arxiv.org/abs/2312.02739v1

Date: 2023-12-05

Summary:

Advances in artificial intelligence (AI) have led to its application in many areas of everyday life. In the context of control engineering, reinforcement learning (RL) represents a particularly promising approach as it is centred around the idea of allowing an agent to freely interact with its environment to find an optimal strategy. One of the challenges professionals face when training and deploying RL agents is that the latter often have to run on dedicated embedded devices. This could be to integrate them into an existing toolchain or to satisfy certain performance criteria like real-time constraints. Conventional RL libraries, however, cannot be easily utilised in conjunction with that kind of hardware. In this paper, we present a framework named LExCI, the Learning and Experiencing Cycle Interface, which bridges this gap and provides end-users with a free and open-source tool for training agents on embedded systems using the open-source library RLlib. Its operability is demonstrated with two state-of-the-art RL algorithms and a rapid control prototyping system.

--------------------------------------------------------------------------------------------------------

Towards the Inference of Structural Similarity of Combinatorial Landscapes

The work on analyzing similarities in combinatorial optimization landscapes aims to relate problem structure to algorithm performance. The ability to transfer strategies based on landscape traits could significantly improve computational problem solving.

Authors:  Mingyu Huang, Ke Li

Link:  https://arxiv.org/abs/2312.02720v1

Date: 2023-12-05

Summary:

One of the most common problem-solving heuristics is reasoning by analogy. For a given problem, a solver can be viewed as a strategic walk on its fitness landscape. Thus, if a solver works for one problem instance, we expect it will also be effective for other instances whose fitness landscapes essentially share structural similarities with each other. However, due to the black-box nature of combinatorial optimization, it is far from trivial to infer such similarity in real-world scenarios. To bridge this gap, using local optima networks as a proxy of fitness landscapes, this paper proposes to leverage graph data mining techniques to conduct qualitative and quantitative analyses to explore the latent topological structural information embedded in those landscapes. By conducting large-scale empirical experiments on three classic combinatorial optimization problems, we obtain concrete evidence to support the existence of structural similarity between landscapes of the same classes within neighboring dimensions. We also interrogate the relationship between landscapes of different problem classes.

--------------------------------------------------------------------------------------------------------

Reconciling AI Performance and Data Reconstruction Resilience for Medical Imaging

The research on balancing model utility and training data privacy in medical imaging contrasts the performance effects of differential privacy with vulnerability to reconstruction attacks. The analysis argues that privacy techniques should always be used when handling sensitive data, even if only with generous privacy budgets. This helps guide responsible and ethical AI development.

Authors:  Alexander Ziller, Tamara T. Mueller, Simon Stieger, Leonhard Feiner, Johannes Brandt, Rickmer Braren, Daniel Rueckert, Georgios Kaissis

Link:  https://arxiv.org/abs/2312.04590v1

Date: 2023-12-05

Summary:

Artificial Intelligence (AI) models are vulnerable to information leakage of their training data, which can be highly sensitive, for example in medical imaging. Privacy Enhancing Technologies (PETs), such as Differential Privacy (DP), aim to circumvent these susceptibilities. DP is the strongest possible protection for training models while bounding the risks of inferring the inclusion of training samples or reconstructing the original data. DP achieves this by setting a quantifiable privacy budget. Although a lower budget decreases the risk of information leakage, it typically also reduces the performance of such models. This imposes a trade-off between robust performance and stringent privacy. Additionally, the interpretation of a privacy budget remains abstract and challenging to contextualize. In this study, we contrast the performance of AI models at various privacy budgets against both theoretical risk bounds and the empirical success of reconstruction attacks. We show that using very large privacy budgets can render reconstruction attacks impossible, while drops in performance are negligible. We thus conclude that not using DP -- at all -- is negligent when applying AI models to sensitive data. We deem these results to lay a foundation for further debates on striking a balance between privacy risks and model performance.
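
To make the budget-noise trade-off concrete, here is the core DP-SGD step (per-example gradient clipping plus calibrated Gaussian noise) in a few lines of NumPy; the constants are illustrative, and a larger privacy budget corresponds to a smaller noise multiplier.

```python
import numpy as np

rng = np.random.default_rng(0)
B, D = 32, 10  # batch size, parameter dimension
per_example_grads = rng.normal(size=(B, D))

C = 1.0      # clipping norm
sigma = 0.5  # noise multiplier (lower => larger, more permissive budget)

# Clip each example's gradient to norm at most C, then average and add
# Gaussian noise scaled to the clipping norm.
norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
clipped = per_example_grads * np.minimum(1.0, C / norms)
dp_grad = clipped.mean(axis=0) + rng.normal(scale=sigma * C / B, size=D)
```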

--------------------------------------------------------------------------------------------------------


EYE ON A.I. GETS READERS UP TO DATE ON THE LATEST FUNDING NEWS AND RELATED ISSUES. SUBSCRIBE FOR THE WEEKLY NEWSLETTER.