Week Ending 1.5.2025

 

RESEARCH WATCH: 1.5.2025

 

AI-Enabled Operations at Fermi Complex: Multivariate Time Series Prediction for Outage Prediction and Diagnosis

A practical application of AI in particle accelerator operations, addressing the critical issue of unplanned beam outages. The research evaluates deep learning architectures for predicting these outages using data from thousands of sensors, aiming to transition from reactive to predictive maintenance. This could significantly reduce operational downtime and energy waste in particle accelerator facilities worldwide.

Authors:  Milan Jain, Burcu O. Mutlu, Caleb Stam, Jan Strube, Brian A. Schupbach, Jason M. St. John, William A. Pellico

Link:  https://arxiv.org/abs/2501.01509v1

Date: 2025-01-02

Summary:

The Main Control Room of the Fermilab accelerator complex continuously gathers extensive time-series data from thousands of sensors monitoring the beam. However, unplanned events such as trips or voltage fluctuations often result in beam outages, causing operational downtime. This downtime not only consumes operator effort in diagnosing and addressing the issue but also leads to unnecessary energy consumption by idle machines awaiting beam restoration. The current threshold-based alarm system is reactive and faces challenges including frequent false alarms and inconsistent outage-cause labeling. To address these limitations, we propose an AI-enabled framework that leverages predictive analytics and automated labeling. Using data from 2,703 Linac devices and 80 operator-labeled outages, we evaluate state-of-the-art deep learning architectures, including recurrent, attention-based, and linear models, for beam outage prediction. Additionally, we assess a Random Forest-based labeling system for providing consistent, confidence-scored outage annotations. Our findings highlight the strengths and weaknesses of these architectures for beam outage prediction and identify critical gaps that must be addressed to fully harness AI for transitioning downtime handling from reactive to predictive, ultimately reducing downtime and improving decision-making in accelerator management.
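
For a concrete picture, the sketch below shows the general shape of a recurrent outage predictor of the kind the paper evaluates: a window of multivariate sensor readings in, an outage probability out. This is a minimal illustration only; the paper's actual architectures, features, and hyperparameters are not reproduced here, and all names and sizes are assumptions.

    import torch
    import torch.nn as nn

    class OutagePredictor(nn.Module):
        """Minimal recurrent outage predictor: many sensors in, outage probability out."""
        def __init__(self, n_sensors=2703, hidden=128):
            super().__init__()
            self.lstm = nn.LSTM(input_size=n_sensors, hidden_size=hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)  # probability of an outage in the next horizon

        def forward(self, x):            # x: (batch, time, n_sensors)
            _, (h, _) = self.lstm(x)     # h: (1, batch, hidden)
            return torch.sigmoid(self.head(h[-1]))

    # Usage: score a window of readings from all 2,703 Linac devices.
    model = OutagePredictor()
    window = torch.randn(4, 60, 2703)    # 4 windows, 60 time steps each
    p_outage = model(window)             # (4, 1) outage probabilities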

--------------------------------------------------------------------------------------------------------

Gradient polaritonic surface with space-variant switchable light-matter interactions in 2D moiré superlattices

Explores novel ways to control light at the nanoscale using twisted bilayer graphene on boron nitride. The research demonstrates switchable light-matter interactions through strain manipulation, enabling directional control of polaritons. This breakthrough could lead to advanced optical computing components and more efficient nanophotonic devices.

Authors:  Zhen-Bing Dai, Hua Fan, Vyacheslav Semenenko, Xinyu Lv, Lu Wen, Zhen Zhang, Shijie Fang, Vasili Perebeinos, Yue Zhao, Zhiqiang Li

Link:  https://arxiv.org/abs/2501.00929v1

Date: 2025-01-01

Summary:

Polaritons in two-dimensional (2D) materials provide unique opportunities for controlling light at nanoscales. Tailoring these polaritons via gradient polaritonic surfaces with space-variant response can enable versatile light-matter interaction platforms with advanced functionalities. However, experimental progress has been hampered by the optical losses and poor light confinement of conventionally used artificial nanostructures. Here, we demonstrate natural gradient polaritonic surfaces based on superlattices of solitons (localized structural deformations) in a prototypical moiré system, twisted bilayer graphene on boron nitride. We demonstrate on-off switching and continuous modulation of local polariton-soliton interactions, which results from marked modifications of topological and conventional soliton states through variation of local strain direction. Furthermore, we reveal the capability of these structures to spatially modify the near-field profile, phase, and propagation direction of polaritons in record-small footprints, enabling generation and electrical switching of directional polaritons. Our findings open up new avenues toward nanoscale manipulation of light-matter interactions and spatial polariton engineering through gradient moiré superlattices.

--------------------------------------------------------------------------------------------------------

Chunk-Distilled Language Modeling

Presents a novel approach to improve large language model efficiency by generating multiple tokens simultaneously through chunk-based generation. The method incorporates a retrieval framework for adapting to new data without retraining. This could lead to faster, more adaptable language models for various applications like content generation and translation.

Authors:  Yanhong Li, Karen Livescu, Jiawei Zhou

Link:  https://arxiv.org/abs/2501.00343v1

Date: 2024-12-31

Summary:

We introduce Chunk-Distilled Language Modeling (CD-LM), an approach to text generation that addresses two challenges in current large language models (LLMs): the inefficiency of token-level generation, and the difficulty of adapting to new data and knowledge. Our method combines deep network-based LLMs with a straightforward retrieval module, which allows the generation of multi-token text chunks at a single decoding step. Our retrieval framework enables flexible construction of model- or domain-specific datastores, either leveraging the internal knowledge of existing models, or incorporating expert insights from human-annotated corpora. This adaptability allows for enhanced control over the language model's distribution without necessitating additional training. We present the CD-LM formulation along with performance metrics demonstrating its ability to improve language model performance and efficiency across a diverse set of downstream tasks. Code and data will be made publicly available.
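
The core decoding idea can be sketched in a few lines: at each step, consult a datastore keyed on recent context and, when it returns a confident multi-token chunk, emit the whole chunk in one step; otherwise fall back to ordinary token-level decoding. Everything below (key length, threshold, datastore format) is an illustrative assumption, not the paper's implementation.

    def cd_lm_decode(lm_step, datastore, prompt_tokens, max_len=100, threshold=0.9):
        """Sketch of chunk-distilled decoding: emit retrieved multi-token chunks
        when the datastore is confident, otherwise decode one token at a time."""
        out = list(prompt_tokens)
        while len(out) < max_len:
            key = tuple(out[-4:])                     # context key (illustrative length)
            hit = datastore.get(key)                  # (chunk, confidence) or None
            if hit is not None and hit[1] >= threshold:
                out.extend(hit[0])                    # one "decoding step", many tokens
            else:
                out.append(lm_step(out))              # ordinary next-token prediction
        return out

    # Toy usage with a trivial datastore and a dummy next-token function.
    store = {("the", "united"): (["states", "of", "america"], 0.95)}
    print(cd_lm_decode(lambda ctx: "x", store, ["the", "united"], max_len=8))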

--------------------------------------------------------------------------------------------------------

Detection-Fusion for Knowledge Graph Extraction from Videos

Proposes a solution for converting video content into structured knowledge graphs instead of natural language descriptions. The system predicts relationships between individuals in videos using deep learning. This approach could enhance video search, automated content tagging, and improve accessibility of video content for computer processing.

Authors:  Taniya Das, Louis Mahon, Thomas Lukasiewicz

Link:  https://arxiv.org/abs/2501.00136v1

Date: 2024-12-30

Summary:

One of the challenging tasks in the field of video understanding is extracting semantic content from video inputs. Most existing systems use language models to describe videos in natural language sentences, but this has several major shortcomings. Such systems can rely too heavily on the language model component and base their output on statistical regularities in natural language text rather than on the visual contents of the video. Additionally, natural language annotations cannot be readily processed by a computer, are difficult to evaluate with performance metrics and cannot be easily translated into a different natural language. In this paper, we propose a method to annotate videos with knowledge graphs, and so avoid these problems. Specifically, we propose a deep-learning-based model for this task that first predicts pairs of individuals and then the relations between them. Additionally, we propose an extension of our model for the inclusion of background knowledge in the construction of knowledge graphs.
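
A minimal sketch of the two-stage design the abstract describes: score candidate pairs of detected individuals first, then classify the relation for each retained pair. Feature dimensions and scoring heads are illustrative assumptions, not the authors' architecture.

    import torch
    import torch.nn as nn

    class PairThenRelation(nn.Module):
        """Two-stage sketch: score candidate pairs of detected individuals,
        then classify the relation for the pairs kept."""
        def __init__(self, feat_dim=256, n_relations=50):
            super().__init__()
            self.pair_scorer = nn.Linear(2 * feat_dim, 1)
            self.rel_classifier = nn.Linear(2 * feat_dim, n_relations)

        def forward(self, feats, keep_threshold=0.5):    # feats: (n_individuals, feat_dim)
            triples = []
            for i in range(feats.size(0)):
                for j in range(feats.size(0)):
                    if i == j:
                        continue
                    pair = torch.cat([feats[i], feats[j]])
                    if torch.sigmoid(self.pair_scorer(pair)) > keep_threshold:
                        rel = self.rel_classifier(pair).argmax().item()
                        triples.append((i, rel, j))      # (subject, relation, object)
            return triples

    # Usage: knowledge-graph triples from three detected individuals' features.
    triples = PairThenRelation()(torch.randn(3, 256))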

--------------------------------------------------------------------------------------------------------

AltGen: AI-Driven Alt Text Generation for Enhancing EPUB Accessibility

Addresses digital accessibility by automating alt text generation for images in EPUB files. The system uses advanced AI models to create contextually relevant descriptions, significantly reducing accessibility errors. This technology could make digital content more accessible to visually impaired users while reducing the resource burden on publishers.

Authors:  Yixian Shen, Hang Zhang, Yanxin Shen, Lun Wang, Chuanqi Shi, Shaoshuai Du, Yiyi Tao

Link:  https://arxiv.org/abs/2501.00113v1

Date: 2024-12-30

Summary:

Digital accessibility is a cornerstone of inclusive content delivery, yet many EPUB files fail to meet fundamental accessibility standards, particularly in providing descriptive alt text for images. Alt text plays a critical role in enabling visually impaired users to understand visual content through assistive technologies. However, generating high-quality alt text at scale is a resource-intensive process, creating significant challenges for organizations aiming to ensure accessibility compliance. This paper introduces AltGen, a novel AI-driven pipeline designed to automate the generation of alt text for images in EPUB files. By integrating state-of-the-art generative models, including advanced transformer-based architectures, AltGen achieves contextually relevant and linguistically coherent alt text descriptions. The pipeline encompasses multiple stages, starting with data preprocessing to extract and prepare relevant content, followed by visual analysis using computer vision models such as CLIP and ViT. The extracted visual features are enriched with contextual information from surrounding text, enabling the fine-tuned language models to generate descriptive and accurate alt text. Validation of the generated output employs both quantitative metrics, such as cosine similarity and BLEU scores, and qualitative feedback from visually impaired users.

Experimental results demonstrate the efficacy of AltGen across diverse datasets, achieving a 97.5% reduction in accessibility errors and high scores in similarity and linguistic fidelity metrics. User studies highlight the practical impact of AltGen, with participants reporting significant improvements in document usability and comprehension. Furthermore, comparative analyses reveal that AltGen outperforms existing approaches in terms of accuracy, relevance, and scalability.
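
As one concrete piece of such a pipeline, the cosine-similarity check in the validation stage could look like the sketch below, which ranks candidate alt-text strings against an image using CLIP embeddings. This is a hedged illustration of the metric, not AltGen's actual code; the checkpoint name is a common public CLIP model, and the image path is hypothetical.

    import torch
    from transformers import CLIPModel, CLIPProcessor
    from PIL import Image

    # Score candidate alt-text strings against an image by CLIP cosine similarity,
    # as one quantitative check in an AltGen-style validation stage.
    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    def rank_alt_text(image_path, candidates):
        image = Image.open(image_path)
        inputs = processor(text=candidates, images=image, return_tensors="pt", padding=True)
        with torch.no_grad():
            img = model.get_image_features(pixel_values=inputs["pixel_values"])
            txt = model.get_text_features(input_ids=inputs["input_ids"],
                                          attention_mask=inputs["attention_mask"])
        sims = torch.nn.functional.cosine_similarity(img, txt)
        return sorted(zip(candidates, sims.tolist()), key=lambda p: -p[1])

    # e.g. rank_alt_text("cover.png", ["A bar chart of yearly sales", "A cat on a sofa"])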

--------------------------------------------------------------------------------------------------------

A Tale of Two Imperatives: Privacy and Explainability

Investigates the challenge of balancing privacy protection and model explainability in deep learning systems. The research focuses on combining differential privacy with post-hoc explainers, crucial for high-stakes applications. This work could help develop AI systems that respect both privacy rights and transparency requirements in regulated industries.

Authors:  Supriya Manna, Niladri Sett

Link:  https://arxiv.org/abs/2412.20798v2

Date: 2024-12-31

Summary:

Deep learning's preponderance across scientific domains has reshaped high-stakes decision-making, making it essential to follow rigorous operational frameworks that include both Right-to-Privacy (RTP) and Right-to-Explanation (RTE). This paper examines the complexities of combining these two requirements. For RTP, we focus on differential privacy (DP), which is considered the current gold standard for privacy-preserving machine learning due to its strong quantitative guarantee of privacy. For RTE, we focus on post-hoc explainers: they are the go-to option for model auditing as they operate independently of model training. We formally investigate DP models and various commonly-used post-hoc explainers: how to evaluate these explainers subject to RTP, and analyze the intrinsic interactions between DP models and these explainers. Furthermore, our work sheds light on how RTP and RTE can be effectively combined in high-stakes applications. Our study concludes by outlining an industrial software pipeline, with the example of a widely used use case, that respects both RTP and RTE requirements.
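
To ground the DP side, here is a minimal sketch of one DP-SGD step, the mechanism behind differentially private training: clip each example's gradient to bound sensitivity, then add Gaussian noise. A post-hoc explainer would then be run on a model trained this way. The function is an illustrative assumption, not the paper's pipeline.

    import torch

    def dp_sgd_step(model, loss_fn, batch, lr=0.1, clip=1.0, noise_mult=1.0):
        """One DP-SGD step: clip each example's gradient, add Gaussian noise."""
        params = [p for p in model.parameters() if p.requires_grad]
        summed = [torch.zeros_like(p) for p in params]
        xs, ys = batch
        for x, y in zip(xs, ys):                        # per-example gradients
            model.zero_grad()
            loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
            norm = torch.sqrt(sum(p.grad.pow(2).sum() for p in params))
            scale = min(1.0, clip / (norm + 1e-12))     # clip to bound sensitivity
            for s, p in zip(summed, params):
                s += p.grad * scale
        with torch.no_grad():
            for s, p in zip(summed, params):
                noise = torch.randn_like(s) * noise_mult * clip
                p -= lr * (s + noise) / len(xs)         # noisy averaged update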

--------------------------------------------------------------------------------------------------------

The Proof is in the Almond Cookies

Presents a novel approach to computational recipe understanding for robotic cooking assistants. The system processes cooking instructions using narrative-based modeling, integrating language processing and mental simulation. This could enable development of AI kitchen assistants to help elderly or disabled individuals maintain independence.

Authors:  Remi van Trijp, Katrien Beuls, Paul Van Eecke

Link:  https://arxiv.org/abs/2501.01827v1

Date: 2025-01-03

Summary:

This paper presents a case study on how to process cooking recipes (and more generally, how-to instructions) in a way that makes it possible for a robot or artificial cooking assistant to support human chefs in the kitchen. Such AI assistants would be of great benefit to society, as they can help to sustain the autonomy of aging adults or people with a physical impairment, or they may reduce the stress in a professional kitchen. We propose a novel approach to computational recipe understanding that mimics the human sense-making process, which is narrative-based. Using an English recipe for almond crescent cookies as illustration, we show how recipes can be modelled as rich narrative structures by integrating various knowledge sources such as language processing, ontologies, and mental simulation. We show how such narrative structures can be used for (a) dealing with the challenges of recipe language, such as zero anaphora, (b) optimizing a robot's planning process, (c) measuring how well an AI system understands its current tasks, and (d) allowing recipe annotations to become language-independent.
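
One of the recipe-language challenges the paper names, zero anaphora, is easy to make concrete: "bake for 20 minutes" never says what to bake. The sketch below shows one simple way a narrative structure could carry the most recent result forward to bind such omitted arguments; the data structures are illustrative assumptions, not the authors' representation.

    from dataclasses import dataclass, field

    @dataclass
    class Step:
        action: str
        args: dict = field(default_factory=dict)    # explicit arguments from the text

    def bind_zero_anaphora(steps):
        """Fill in omitted objects ('bake for 20 minutes' -> bake *the crescents*)
        by carrying forward the most recent result in the narrative."""
        current = None
        for step in steps:
            if "object" not in step.args and current is not None:
                step.args["object"] = current       # resolve the zero anaphor
            current = step.args.get("result", step.args.get("object", current))
        return steps

    recipe = [Step("mix", {"object": "flour, butter, sugar", "result": "dough"}),
              Step("shape", {"result": "crescents"}),    # implicit object: the dough
              Step("bake", {"duration": "20 min"})]      # implicit object: the crescents
    for s in bind_zero_anaphora(recipe):
        print(s.action, s.args)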

--------------------------------------------------------------------------------------------------------

VidFormer: A novel end-to-end framework fused by 3DCNN and Transformer for Video-based Remote Physiological Measurement

Introduces a new framework combining 3DCNN and Transformer models for measuring physiological signals from facial videos. The system achieves state-of-the-art performance in remote photoplethysmography across multiple datasets. This technology could enable non-contact health monitoring in telemedicine and continuous patient monitoring applications.

Authors:  Jiachen Li, Shisheng Guo, Longzhen Tang, Cuolong Cui, Lingjiang Kong, Xiaobo Yang

Link:  https://arxiv.org/abs/2501.01691v1

Date: 2025-01-03

Summary:

Remote physiological signal measurement based on facial videos, also known as remote photoplethysmography (rPPG), involves predicting changes in facial vascular blood flow from facial videos. While most deep learning-based methods have achieved good results, they often struggle to balance performance across small and large-scale datasets due to the inherent limitations of convolutional neural networks (CNNs) and Transformers. In this paper, we introduce VidFormer, a novel end-to-end framework that integrates 3-Dimensional Convolutional Neural Network (3DCNN) and Transformer models for rPPG tasks. Initially, we conduct an analysis of the traditional skin reflection model and subsequently introduce an enhanced model for the reconstruction of rPPG signals. Based on this improved model, VidFormer utilizes 3DCNN and Transformer to extract local and global features from input data, respectively. To enhance the spatiotemporal feature extraction capabilities of VidFormer, we incorporate temporal-spatial attention mechanisms tailored for both 3DCNN and Transformer. Additionally, we design a module to facilitate information exchange and fusion between the 3DCNN and Transformer. Our evaluation on five publicly available datasets demonstrates that VidFormer outperforms current state-of-the-art (SOTA) methods. Finally, we discuss the essential roles of each VidFormer module and examine the effects of ethnicity, makeup, and exercise on its performance.
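
The dual-branch idea can be sketched as follows: a 3DCNN branch extracts local spatiotemporal features, a Transformer encoder models global interactions over time, and the two are fused for per-frame rPPG regression. Dimensions and the fusion-by-concatenation below are illustrative assumptions; the paper's exchange-and-fusion module is more elaborate.

    import torch
    import torch.nn as nn

    class DualBranchRPPG(nn.Module):
        """Sketch of a VidFormer-style dual branch: 3DCNN for local spatiotemporal
        features, a Transformer encoder for global ones, fused before regression."""
        def __init__(self, d=64):
            super().__init__()
            self.cnn = nn.Conv3d(3, d, kernel_size=3, padding=1)           # local branch
            self.enc = nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True), 2)
            self.head = nn.Linear(2 * d, 1)                                # rPPG value per frame

        def forward(self, clip):                      # clip: (B, 3, T, H, W)
            local = self.cnn(clip).mean(dim=(3, 4))   # (B, d, T), pooled over space
            local = local.transpose(1, 2)             # (B, T, d)
            global_ = self.enc(local)                 # (B, T, d), global interactions
            return self.head(torch.cat([local, global_], dim=-1)).squeeze(-1)

    model = DualBranchRPPG()
    signal = model(torch.randn(2, 3, 32, 64, 64))     # predicted rPPG traces: (2, 32)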

--------------------------------------------------------------------------------------------------------

A Proof of Concept Resource Management Scheme for Augmented Reality Applications in 5G Systems

Demonstrates a resource management system for augmented reality applications in 5G networks, optimizing bandwidth and GPU usage to meet delay constraints. The system uses Multi-Armed Bandit algorithms for efficient resource allocation. This could improve AR performance while reducing power consumption in mobile networks.

Authors:  Panagiotis Nikolaidis, Samie Mostafavi, James Gross, John Baras

Link:  https://arxiv.org/abs/2501.01398v1

Date: 2025-01-02

Summary:

Augmented reality applications are bitrate intensive, delay-sensitive, and computationally demanding. To support them, mobile edge computing systems need to carefully manage both their networking and computing resources. To this end, we present a proof of concept resource management scheme that adapts the bandwidth at the base station and the GPU frequency at the edge to efficiently fulfill roundtrip delay constraints. Resource adaptation is performed using a Multi-Armed Bandit algorithm that accounts for the monotonic relationship between allocated resources and performance. We evaluate our scheme by experimentation on an OpenAirInterface 5G testbed where the considered application is OpenRTiST. The results indicate that our resource management scheme can substantially reduce both bandwidth usage and power consumption while delivering high quality of service. Overall, this work demonstrates that intelligent resource control can potentially establish systems that are not only more efficient but also more sustainable.
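
A minimal UCB-style sketch of the allocation loop: discrete (bandwidth, GPU frequency) arms, rewarded for meeting the round-trip delay budget with cheap resources. The paper's bandit additionally exploits the monotonic resource-performance relationship, which this illustration omits; all numbers and the delay model are stand-ins.

    import math, random

    # Discrete resource arms: (bandwidth, GPU frequency) in MHz; values illustrative.
    arms = [(bw, f) for bw in (10, 20, 40) for f in (300, 600, 900)]
    counts = [0] * len(arms)
    values = [0.0] * len(arms)

    def reward(delay_ms, bw, f, budget_ms=50):
        # Reward meeting the round-trip delay budget with cheap resources.
        return 1.0 - 0.5 * (bw / 40 + f / 900) / 2 if delay_ms <= budget_ms else 0.0

    for t in range(1, 1001):
        ucb = [values[i] + math.sqrt(2 * math.log(t) / counts[i]) if counts[i] else float("inf")
               for i in range(len(arms))]
        i = ucb.index(max(ucb))
        bw, f = arms[i]
        delay = random.gauss(1.5e6 / (bw * f), 5.0)    # stand-in for a measured delay (ms)
        counts[i] += 1
        values[i] += (reward(delay, bw, f) - values[i]) / counts[i]   # incremental mean

    print("best arm:", arms[max(range(len(arms)), key=lambda i: values[i])])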

--------------------------------------------------------------------------------------------------------

Generalizing Trust: Weak-to-Strong Trustworthiness in Language Models

Examines whether trustworthiness properties like robustness, fairness, and privacy can transfer from weak to strong language models during training. The research introduces new training strategies for trustworthiness generalization. This could help develop more reliable and trustworthy AI systems while reducing training costs.

Authors:  Martin Pawelczyk, Lillian Sun, Zhenting Qi, Aounon Kumar, Himabindu Lakkaraju

Link:  https://arxiv.org/abs/2501.00418v1

Date: 2024-12-31

Summary:

The rapid proliferation of generative AI, especially large language models, has led to their integration into a variety of applications. A key phenomenon known as weak-to-strong generalization - where a strong model trained on a weak model's outputs surpasses the weak model in task performance - has gained significant attention. Yet, whether critical trustworthiness properties such as robustness, fairness, and privacy can generalize similarly remains an open question. In this work, we study this question by examining if a stronger model can inherit trustworthiness properties when fine-tuned on a weaker model's outputs, a process we term weak-to-strong trustworthiness generalization. To address this, we introduce two foundational training strategies: 1) Weak Trustworthiness Finetuning (Weak TFT), which leverages trustworthiness regularization during the fine-tuning of the weak model, and 2) Weak and Weak-to-Strong Trustworthiness Finetuning (Weak+WTS TFT), which extends regularization to both weak and strong models. Our experimental evaluation on real-world datasets reveals that while some trustworthiness properties, such as fairness, adversarial robustness, and OOD robustness, show significant improvement in transfer when both models are regularized, others, like privacy, do not exhibit signs of weak-to-strong trustworthiness. As the first study to explore trustworthiness generalization via weak-to-strong generalization, our work provides valuable insights into the potential and limitations of weak-to-strong generalization.
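
The flavor of trustworthiness-regularized fine-tuning can be sketched as a task loss on the weak model's labels plus a trustworthiness penalty, here an illustrative demographic-parity-style fairness gap. The paper's actual regularizers are not reproduced; everything below is an assumption for illustration.

    import torch

    def weak_tft_loss(student_logits, weak_labels, groups, lam=1.0):
        """Task loss on the weak model's outputs + a fairness regularizer
        (illustrative demographic-parity gap between two groups)."""
        task = torch.nn.functional.cross_entropy(student_logits, weak_labels)
        p = torch.softmax(student_logits, dim=-1)[:, 1]       # P(positive class)
        gap = (p[groups == 0].mean() - p[groups == 1].mean()).abs()
        return task + lam * gap

    logits = torch.randn(8, 2, requires_grad=True)
    labels = torch.randint(0, 2, (8,))                 # weak model's hard labels
    groups = torch.tensor([0, 0, 0, 0, 1, 1, 1, 1])    # protected attribute
    weak_tft_loss(logits, labels, groups).backward()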

--------------------------------------------------------------------------------------------------------

Human-like Bots for Tactical Shooters Using Compute-Efficient Sensors

Presents a computationally efficient approach to creating human-like AI agents for tactical shooter games using ray-cast sensors instead of pixel-based input. The system demonstrates realistic behavior while requiring minimal CPU resources. This could improve NPC behavior in commercial games while maintaining performance.

Authors:  Niels Justesen, Maria Kaselimi, Sam Snodgrass, Miruna Vozaru, Matthew Schlegel, Jonas Wingren, Gabriella A. B. Barros, Tobias Mahlmann, Shyam Sudhakaran, Wesley Kerr, Albert Wang, Christoffer Holmgård, Georgios N. Yannakakis, Sebastian Risi, Julian Togelius

Link:  https://arxiv.org/abs/2501.00078v1

Date: 2024-12-30

Summary:

Artificial intelligence (AI) has enabled agents to master complex video games, from first-person shooters like Counter-Strike to real-time strategy games such as StarCraft II and racing games like Gran Turismo. While these achievements are notable, applying these AI methods in commercial video game production remains challenging due to computational constraints. In commercial scenarios, the majority of computational resources are allocated to 3D rendering, leaving limited capacity for AI methods, which often demand high computational power, particularly those relying on pixel-based sensors. Moreover, the gaming industry prioritizes creating human-like behavior in AI agents to enhance player experience, unlike academic models that focus on maximizing game performance. This paper introduces a novel methodology for training neural networks via imitation learning to play a complex, commercial-standard, VALORANT-like 2v2 tactical shooter game, requiring only modest CPU hardware during inference. Our approach leverages an innovative, pixel-free perception architecture using a small set of ray-cast sensors, which capture essential spatial information efficiently. These sensors allow AI to perform competently without the computational overhead of traditional methods. Models are trained to mimic human behavior using supervised learning on human trajectory data, resulting in realistic and engaging AI agents. Human evaluation tests confirm that our AI agents provide human-like gameplay experiences while operating efficiently under computational constraints. This offers a significant advancement in AI model development for tactical shooter games and possibly other genres.
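
The pixel-free sensing idea is simple to illustrate in 2D: march a small fan of rays from the agent and record normalized hit distances, yielding a compact observation vector for the policy network. The geometry, ray count, and occupancy test below are assumptions for illustration, not the game's actual sensor suite.

    import math

    def ray_cast_observation(pos, heading, walls, n_rays=16, max_dist=50.0, step=0.25):
        """March each ray until it hits a wall; return normalized hit distances.
        A compact, pixel-free observation for an imitation-learned game bot."""
        obs = []
        for k in range(n_rays):
            angle = heading + (k / n_rays) * 2 * math.pi
            dx, dy = math.cos(angle), math.sin(angle)
            d = 0.0
            while d < max_dist:
                x, y = pos[0] + dx * d, pos[1] + dy * d
                if (round(x), round(y)) in walls:      # crude occupancy test
                    break
                d += step
            obs.append(d / max_dist)                   # 1.0 means "nothing in range"
        return obs

    walls = {(5, y) for y in range(-5, 6)}             # a wall segment at x = 5
    print(ray_cast_observation((0.0, 0.0), 0.0, walls))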

--------------------------------------------------------------------------------------------------------

Mingling with the Good to Backdoor Federated Learning

Explores a new attack method called MIGO for inserting backdoors in federated learning systems while evading detection. The research demonstrates high success rates even with minimal control over the system. This highlights important security considerations for distributed learning systems.

Authors:  Nuno Neves

Link:  https://arxiv.org/abs/2501.01913v1

Date: 2025-01-03

Summary:

Federated learning (FL) is a decentralized machine learning technique that allows multiple entities to jointly train a model while preserving dataset privacy. However, its distributed nature has raised various security concerns, which have been addressed by increasingly sophisticated defenses. These protections utilize a range of data sources and metrics to, for example, filter out malicious model updates, ensuring that the impact of attacks is minimized or eliminated.

This paper explores the feasibility of designing a generic attack method capable of installing backdoors in FL while evading a diverse array of defenses. Specifically, we focus on an attacker strategy called MIGO, which aims to produce model updates that subtly blend with legitimate ones. The resulting effect is a gradual integration of a backdoor into the global model, often ensuring its persistence long after the attack concludes, while generating enough ambiguity to hinder the effectiveness of defenses.

MIGO was employed to implant three types of backdoors across five datasets and different model architectures. The results demonstrate the significant threat posed by these backdoors, as MIGO consistently achieved exceptionally high backdoor accuracy (exceeding 90%) while maintaining the utility of the main task. Moreover, MIGO exhibited strong evasion capabilities against ten defenses, including several state-of-the-art methods. When compared to four other attack strategies, MIGO consistently outperformed them across most configurations. Notably, even in extreme scenarios where the attacker controls just 0.1% of the clients, the results indicate that successful backdoor insertion is possible if the attacker can persist for a sufficient number of rounds.

--------------------------------------------------------------------------------------------------------

Rethinking Relation Extraction: Beyond Shortcuts to Generalization with a Debiased Benchmark

Addresses entity bias in relation extraction tasks by introducing a debiased benchmark DREB and a new method called MixDebias. The approach improves model generalization while maintaining performance on original datasets. This could enhance natural language processing systems' ability to understand relationships in text.

Authors:  Liang He, Yougang Chu, Zhen Wu, Jianbing Zhang, Xinyu Dai, Jiajun Chen

Link:  https://arxiv.org/abs/2501.01349v1

Date: 2025-01-02

Summary:

Benchmarks are crucial for evaluating machine learning algorithm performance, facilitating comparison and identifying superior solutions. However, biases within datasets can lead models to learn shortcut patterns, resulting in inaccurate assessments and hindering real-world applicability. This paper addresses the issue of entity bias in relation extraction tasks, where models tend to rely on entity mentions rather than context. We propose a debiased relation extraction benchmark DREB that breaks the pseudo-correlation between entity mentions and relation types through entity replacement. DREB utilizes Bias Evaluator and PPL Evaluator to ensure low bias and high naturalness, providing a reliable and accurate assessment of model generalization in entity bias scenarios. To establish a new baseline on DREB, we introduce MixDebias, a debiasing method combining data-level and model training-level techniques. MixDebias effectively improves model performance on DREB while maintaining performance on the original dataset. Extensive experiments demonstrate the effectiveness and robustness of MixDebias compared to existing methods, highlighting its potential for improving the generalization ability of relation extraction models. We will release DREB and MixDebias publicly.
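
The entity-replacement idea behind DREB can be sketched as swapping each entity mention for a random type-consistent substitute, so that context, not the names, is the only remaining signal. Substitute pools and the span format below are illustrative assumptions, not the benchmark's actual construction.

    import random

    SUBSTITUTES = {                       # type-consistent replacement pools (illustrative)
        "PERSON": ["Alex Moreau", "Priya Nair", "Tomás Silva"],
        "ORG": ["Helix Labs", "Northgate Group", "Veridian Co."],
    }

    def debias_example(tokens, entities):
        """Replace each entity mention with a random same-type substitute,
        breaking the pseudo-correlation between mentions and relation labels."""
        out = list(tokens)
        # Process spans right-to-left so earlier indices stay valid after replacement.
        for start, end, etype in sorted(entities, reverse=True):
            out[start:end] = random.choice(SUBSTITUTES[etype]).split()
        return out

    sent = "Tim Cook is the CEO of Apple".split()
    print(" ".join(debias_example(sent, [(0, 2, "PERSON"), (6, 7, "ORG")])))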

--------------------------------------------------------------------------------------------------------

The Algonauts Project 2025 Challenge: How the Human Brain Makes Sense of Multimodal Movies

Announces a challenge focused on understanding how the human brain processes multimodal movie content, using the largest available dataset of fMRI responses to movies. This project aims to advance both neuroscience and artificial intelligence through collaborative research on brain encoding models.

Authors:  Alessandro T. Gifford, Domenic Bersch, Marie St-Laurent, Basile Pinsard, Julie Boyle, Lune Bellec, Aude Oliva, Gemma Roig, Radoslaw M. Cichy

Link:  https://arxiv.org/abs/2501.00504v1

Date: 2024-12-31

Summary:

There is growing symbiosis between artificial and biological intelligence sciences: neural principles inspire new intelligent machines, which are in turn used to advance our theoretical understanding of the brain. To promote further collaboration between biological and artificial intelligence researchers, we introduce the 2025 edition of the Algonauts Project challenge: How the Human Brain Makes Sense of Multimodal Movies (https://algonautsproject.com/). In collaboration with the Courtois Project on Neuronal Modelling (CNeuroMod), this edition aims to bring forth a new generation of brain encoding models that are multimodal and that generalize well beyond their training distribution, by training them on the largest dataset of fMRI responses to movie watching available to date. Open to all, the 2025 challenge provides transparent, directly comparable results through a public leaderboard that is updated automatically after each submission to facilitate rapid model assessment and guide development. The challenge will end with a session at the 2025 Cognitive Computational Neuroscience (CCN) conference that will feature winning models. We welcome researchers interested in collaborating with the Algonauts Project by contributing ideas and datasets for future challenges.

--------------------------------------------------------------------------------------------------------

L3D-Pose: Lifting Pose for 3D Avatars from a Single Camera in the Wild

Develops a method for converting 2D poses to 3D using rigged avatars and synthetic data generation. The system enables accurate pose estimation and retargeting for animals in natural settings. This could benefit wildlife research and animation production by enabling better analysis of animal movement.

Authors:  Soumyaratna Debnath, Harish Katti, Shashikant Verma, Shanmuganathan Raman

Link:  https://arxiv.org/abs/2501.01174v1

Date: 2025-01-02

Summary:

While 2D pose estimation has advanced our ability to interpret body movements in animals and primates, it is limited by the lack of depth information, constraining its application range. 3D pose estimation provides a more comprehensive solution by incorporating spatial depth, yet creating extensive 3D pose datasets for animals is challenging due to their dynamic and unpredictable behaviours in natural settings. To address this, we propose a hybrid approach that utilizes rigged avatars and a synthetic data generation pipeline to acquire the necessary 3D annotations for training. Our method introduces a simple attention-based MLP network for converting 2D poses to 3D, designed to be independent of the input image to ensure scalability for poses in natural environments. Additionally, we identify that existing anatomical keypoint detectors are insufficient for accurate pose retargeting onto arbitrary avatars. To overcome this, we present a lookup table based on a deep pose estimation method, built from a synthetic collection of diverse actions performed by rigged avatars. Our experiments demonstrate the effectiveness and efficiency of this lookup-table-based retargeting approach. Overall, we propose a comprehensive framework with systematically synthesized datasets for lifting poses from 2D to 3D and then utilize this to re-target motion from wild settings onto arbitrary avatars.
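
A minimal sketch of an image-independent, attention-based lifting network of the kind described: 2D keypoints attend to one another, then an MLP regresses 3D coordinates per joint. The joint count and layer widths are illustrative assumptions.

    import torch
    import torch.nn as nn

    class Lifter2Dto3D(nn.Module):
        """Image-independent lifting sketch: self-attention over 2D keypoints,
        then an MLP regressing 3D coordinates for each joint."""
        def __init__(self, d=64):
            super().__init__()
            self.embed = nn.Linear(2, d)
            self.attn = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
            self.mlp = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, 3))

        def forward(self, kp2d):                 # kp2d: (batch, n_joints, 2)
            h = self.embed(kp2d)
            h, _ = self.attn(h, h, h)            # joints attend to each other
            return self.mlp(h)                   # (batch, n_joints, 3)

    pose3d = Lifter2Dto3D()(torch.randn(1, 17, 2))    # -> (1, 17, 3)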

--------------------------------------------------------------------------------------------------------

eRevise+RF: A Writing Evaluation System for Assessing Student Essay Revisions and Providing Formative Feedback

Presents an automated writing evaluation system for assessing student essay revisions and providing feedback. The system has been successfully deployed in multiple schools, helping students improve their argumentative writing skills. This could enhance writing education by providing timely, consistent feedback.

Authors:  Zhexiong Liu, Diane Litman, Elaine Wang, Tianwen Li, Mason Gobat, Lindsay Clare Matsumura, Richard Correnti

Link:  https://arxiv.org/abs/2501.00715v1

Date: 2025-01-01

Summary:

The ability to revise essays in response to feedback is important for students' writing success. An automated writing evaluation (AWE) system that supports students in revising their essays is thus essential. We present eRevise+RF, an enhanced AWE system for assessing student essay revisions (e.g., changes made to an essay to improve its quality in response to essay feedback) and providing revision feedback. We deployed the system with 6 teachers and 406 students across 3 schools in Pennsylvania and Louisiana. The results confirmed its effectiveness in (1) assessing student essays in terms of evidence usage, (2) extracting evidence and reasoning revisions across essays, and (3) determining revision success in responding to feedback. The evaluation also suggested eRevise+RF is a helpful system for young students to improve their argumentative writing skills through revision and formative feedback.

--------------------------------------------------------------------------------------------------------

Braiding rule of boundary Majorana-like zero mode

Explores topological states in solid-state and artificial crystal systems, focusing on Majorana-like zero modes (MLZMs) at boundaries rather than in the bulk. The research demonstrates tunability of MLZMs around the boundaries of vortex-texture Kekulé-modulated graphene, protected by the Zak phase. This could advance quantum computing and topological photonics applications.

Authors:  Qiyun Ma, Hailong He, Meng Xiao, Zhengyou Liu

Link:  https://arxiv.org/abs/2501.00749v1

Date: 2025-01-01

Summary:

The study of topological states has become an important topic in both solid-state systems and artificial structures such as photonic crystals and phononic crystals. Among them, Majorana zero modes, which exhibit a nontrivial braiding process, have attracted extensive research interest. The analog of Majorana zero modes in classical waves, the Majorana-like zero modes (MLZMs), has also attracted considerable attention recently. However, the vast majority of previous works concerned MLZMs that were bound to vortices inside the bulk. In this work, we unveil the braiding rule of MLZMs that are tunable around the boundary of vortex-texture Kekulé-modulated graphene. We show that the existence of these zero-dimensional boundary MLZMs is protected by the Zak phase of the 1D boundary states. As such, we are able to construct multiple MLZMs and analyze the corresponding braiding process. In addition, we provide an implementation scheme for the boundary MLZMs in acoustic crystals. The tunability of the boundary MLZMs proposed herein offers new freedom for the braiding of topological states in both solid-state systems and artificial structures.

--------------------------------------------------------------------------------------------------------

3GPP Evolution from 5G to 6G: A 10-Year Retrospective

Reviews the decade-long evolution of mobile communications through six 3GPP releases, from 5G's introduction to 6G's foundation. Traces key developments including New Radio interface, non-terrestrial networks, and AI integration. This comprehensive analysis provides insights for future telecommunications development through 2030.

Authors:  Xingqin Lin

Link:  https://arxiv.org/abs/2412.21077v1

Date: 2024-12-30

Summary:

The 3rd Generation Partnership Project (3GPP) evolution of mobile communication technologies from 5G to 6G has been a transformative journey spanning a decade, shaped by six releases from Release 15 to Release 20. This article provides a retrospective of this evolution, highlighting the technical advancements, challenges, and milestones that have defined the transition from the foundational 5G era to the emergence of 6G. Starting with Release 15, which marked the birth of 5G and its New Radio (NR) air interface, the journey progressed through Release 16, where 5G was qualified as an International Mobile Telecommunications-2020 (IMT-2020) technology, and Release 17, which expanded 5G into new domains such as non-terrestrial networks. Release 18 ushered in the 5G-Advanced era, incorporating novel technologies like artificial intelligence. Releases 19 and 20 continue this momentum, focusing on commercially driven enhancements while laying the groundwork for the 6G era. This article explores how 3GPP technology evolution has shaped the telecommunications landscape over the past decade, bridging two mobile generations. It concludes with insights into learned lessons, future challenges, and opportunities, offering guidelines on 6G evolution for 2030 and beyond.

--------------------------------------------------------------------------------------------------------

Bridging Simplicity and Sophistication using GLinear: A Novel Architecture for Enhanced Time Series Prediction

Introduces a data-efficient linear architecture for time series forecasting that challenges complex Transformer models. GLinear achieves superior performance using less historical data across multiple domains. This could improve forecasting in areas like electricity consumption, traffic prediction, and weather forecasting.

Authors:  Syed Tahir Hussain Rizvi, Neel Kanwal, Muddasar Naeem, Alfredo Cuzzocrea, Antonio Coronato

Link:  https://arxiv.org/abs/2501.01087v2

Date: 2025-01-03

Summary:

Time Series Forecasting (TSF) is an important application across many fields. There is a debate about whether Transformers, despite being good at understanding long sequences, struggle with preserving temporal relationships in time series data. Recent research suggests that simpler linear models might outperform or at least provide competitive performance compared to complex Transformer-based models for TSF tasks. In this paper, we propose a novel data-efficient architecture, GLinear, for multivariate TSF that exploits periodic patterns to provide better accuracy. It also provides better prediction accuracy while using a smaller amount of historical data than other state-of-the-art linear predictors. Four different datasets (ETTh1, Electricity, Traffic, and Weather) are used to evaluate the performance of the proposed predictor. A performance comparison with state-of-the-art linear architectures (such as NLinear, DLinear, and RLinear) and a transformer-based time series predictor (Autoformer) shows that GLinear, despite being parametrically efficient, significantly outperforms the existing architectures in most cases of multivariate TSF. We hope that the proposed GLinear opens new fronts of research and development of simpler and more sophisticated architectures for data- and computationally-efficient time-series analysis.
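
The abstract does not specify GLinear's periodic-pattern mechanism, but the linear family it is compared against is easy to sketch. Below is an NLinear-style baseline for context: subtract the last observation, apply a single linear map from lookback to horizon per channel, and add the last observation back. This is not GLinear itself; all sizes are illustrative.

    import torch
    import torch.nn as nn

    class NLinearStyle(nn.Module):
        """Minimal linear forecaster of the family GLinear is compared against:
        last-value normalization, one linear map from lookback to horizon."""
        def __init__(self, lookback=96, horizon=24):
            super().__init__()
            self.proj = nn.Linear(lookback, horizon)

        def forward(self, x):                 # x: (batch, lookback, channels)
            last = x[:, -1:, :]               # per-channel last observation
            y = self.proj((x - last).transpose(1, 2))    # (batch, channels, horizon)
            return y.transpose(1, 2) + last   # (batch, horizon, channels)

    forecast = NLinearStyle()(torch.randn(8, 96, 7))     # e.g. 7 ETTh1-style channels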

--------------------------------------------------------------------------------------------------------

Graph Neural Networks for Next-Generation-IoT: Recent Advances and Open Challenges

Surveys GNN applications in 6G IoT environments, covering technologies like MIMO, edge computing, and satellite networks. Addresses security challenges and integration with emerging technologies. This research guides development of efficient, secure IoT systems for next-generation networks.

Authors:  Nguyen Xuan Tung, Le Tung Giang, Bui Duc Son, Seon Geun Jeong, Trinh Van Chien, Won Joo Hwang, Lajos Hanzo

Link:  https://arxiv.org/abs/2412.20634v1

Date: 2024-12-30

Summary:

Graph Neural Networks (GNNs) have emerged as a critical tool for optimizing and managing the complexities of the Internet of Things (IoT) in next-generation networks. This survey presents a comprehensive exploration of how GNNs may be harnessed in 6G IoT environments, focusing on key challenges and opportunities through a series of open questions. We commence with an exploration of GNN paradigms and the roles of node, edge, and graph-level tasks in solving wireless networking problems and highlight GNNs' ability to overcome the limitations of traditional optimization methods. This guidance enhances problem-solving efficiency across various next-generation (NG) IoT scenarios. Next, we provide a detailed discussion of the application of GNN in advanced NG enabling technologies, including massive MIMO, reconfigurable intelligent surfaces, satellites, THz, mobile edge computing (MEC), and ultra-reliable low latency communication (URLLC). We then delve into the challenges posed by adversarial attacks, offering insights into defense mechanisms to secure GNN-based NG-IoT networks. Next, we examine how GNNs can be integrated with future technologies like integrated sensing and communication (ISAC), satellite-air-ground-sea integrated networks (SAGSIN), and quantum computing. Our findings highlight the transformative potential of GNNs in improving efficiency, scalability, and security within NG-IoT systems, paving the way for future advances. Finally, we propose a set of design guidelines to facilitate the development of efficient, scalable, and secure GNN models tailored for NG IoT applications.
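
To make the GNN paradigms the survey discusses concrete, here is a minimal mean-aggregation message-passing layer over an interference graph, where nodes might be radio links and edges connect interfering neighbours. The shapes and the wireless framing are illustrative assumptions, not a scheme from the survey.

    import torch
    import torch.nn as nn

    class MessagePassingLayer(nn.Module):
        """One mean-aggregation message-passing layer: each node (e.g. a radio
        link) updates its state from its interference-graph neighbours."""
        def __init__(self, d=32):
            super().__init__()
            self.msg = nn.Linear(d, d)
            self.upd = nn.Linear(2 * d, d)

        def forward(self, h, adj):            # h: (n, d); adj: (n, n) 0/1 matrix
            deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
            m = adj @ self.msg(h) / deg       # mean of neighbour messages
            return torch.relu(self.upd(torch.cat([h, m], dim=-1)))

    h = torch.randn(5, 32)                    # 5 links with 32-d features
    adj = (torch.rand(5, 5) > 0.5).float().fill_diagonal_(0)
    h = MessagePassingLayer()(h, adj)         # updated node states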

--------------------------------------------------------------------------------------------------------


EYE ON A.I. GETS READERS UP TO DATE ON THE LATEST FUNDING NEWS AND RELATED ISSUES. SUBSCRIBE FOR THE WEEKLY NEWSLETTER.