Showing 46 papers for 2026-06-02
We show that some graph anomalies decrease spectral energy variation and therefore stay camouflaged across datasets; existing spectral methods fail to detect them. We then present a node-level spectral energy formulation that integrates with message passing to detect these camouflaged anomalies.
LLMs used for KBQA can hallucinate; even with a graph as the knowledge source, models may rely on parametric knowledge or perform invalid reasoning. KG-Guard introduces graph-based hallucination detection to ground answers in graph evidence and validate reasoning paths.
We propose SMT-GraphFormer, a spatiotemporal multi-task graph transformer for trip-level transit prediction. It captures non-linear spatiotemporal dependencies across stops and lines and frames prediction at the trip level, leveraging network context and within-trip evolution.
RADE is a stochastic graph augmentation that simultaneously drops and adds edges to regularize training and mitigate overfitting and over-squashing. Unlike prior augmentations, RADE improves connectivity while providing regularization benefits, addressing train-test misalignment.
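To make the "drop and add at the same time" idea concrete, here is a minimal sketch of a stochastic edge perturbation on an undirected graph. It is not RADE's actual sampling scheme, which the summary does not specify; the function name `drop_and_add_edges` and the parameters `p_drop` and `p_add` are assumptions made for illustration.

```python
import random

def drop_and_add_edges(edges, num_nodes, p_drop=0.1, p_add=0.1, seed=None):
    """Return a perturbed undirected edge set (hypothetical helper, not RADE itself).

    Each original edge is dropped independently with probability p_drop, and
    round(p_add * |E|) edges absent from the original graph are added uniformly.
    """
    rng = random.Random(seed)
    # Canonicalise undirected edges as (min, max) and ignore self-loops.
    original = {tuple(sorted(e)) for e in edges if e[0] != e[1]}
    kept = {e for e in sorted(original) if rng.random() >= p_drop}

    # Never ask for more new edges than there are non-edges in the graph.
    max_new = num_nodes * (num_nodes - 1) // 2 - len(original)
    n_add = min(round(p_add * len(original)), max_new)

    added = set()
    while len(added) < n_add:
        u, v = rng.sample(range(num_nodes), 2)
        candidate = (min(u, v), max(u, v))
        if candidate not in original:
            added.add(candidate)
    return kept | added

ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
augmented = drop_and_add_edges(ring, num_nodes=4, p_drop=0.25, p_add=0.5, seed=0)
```

In a training loop, a fresh perturbed edge set would typically be drawn each epoch, while evaluation would use the unperturbed graph.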
We introduce latent diffusion pretraining for crystal property prediction to address data scarcity. A diffusion-based pretraining objective on crystal graphs/structures enables effective pretraining, followed by finetuning on property tasks, yielding improved accuracy with limited labels.
Temporal Motif Signatures reveal that short-horizon motif patterns carry predictive information for TGNNs. A small four-feature family of past-window star counts provides most of the lift in MOOC interaction prediction, and motif activity correlates with three scalable regimes across real and synthetic datasets.
AdaKernel proposes learning adaptive kernel parameters for spatiotemporal graphs, addressing the limitation of fixed distance-based kernels. The paper provides theoretical insights showing misspecified kernels hurt performance and presents an adaptive kernel parameterization to improve expressivity, especially in data-sparse regimes.
We learn a directed graph abstraction of combinatorial spaces to enable order-preserving search for mixed-combinatorial nonlinear optimization (MCNLP). This avoids spurious relations and high dimensionality from traditional encodings, leveraging planning/routing-inspired graph abstractions.
GJDNet proposes robust GNNs via joint disentangled learning to defend against adversarial attacks that invert assortativity patterns. By disentangling structure and features and employing robust training, it improves resistance to attacks across graph types.
Pre-propagation GNNs move graph computations to preprocessing, yet more complex hop aggregators often underperform a simple MLP. We analyze this through a graph-filter lens on a precomputed diffusion basis and show how filter coefficient sharing across channels explains the observed behavior.
G2LoRA introduces Gradient-Orthogonal Low-Rank Adaptation for continual learning on text-attributed graphs, addressing catastrophic forgetting and enabling better knowledge transfer across streaming tasks.
TIGER presents an inference-time framework for fact-level repair in multimodal generation using graph-based evidence routing, enabling localized, ranked corrections without letting hallucinated facts bias input interpretation.
KG-FairDiff uses knowledge-graph guided prompt refinement to reduce demographic bias in text-to-image generation, offering cross-domain fairness without retraining. It leverages KG constraints to steer generation away from stereotypes.
IstGPT is the first industrial anomaly detection tool combining LLMs and graph learning for real-time spatial-temporal monitoring in industrial control systems, enabling precise and broad-spectrum protection against ICS attacks.
We formulate VRP as a Graph Edit Distance maximization problem: with an edge-deletion cost model, minimizing route cost is equivalent to maximizing the total weight of deleted edges. This per-edge formulation enables attribution and decomposition analyses for VRP.
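The equivalence follows because the total edge weight of the input graph is a constant: route cost equals total weight minus the weight of the deleted edges. The sketch below checks this on a toy instance. It assumes a single vehicle, a complete graph on a depot (node 0) plus three customers, and a deletion cost equal to the edge weight; the weights are invented for illustration.

```python
from itertools import permutations

# Hypothetical symmetric edge weights on a complete graph; node 0 is the depot.
weights = {
    frozenset({0, 1}): 4.0, frozenset({0, 2}): 2.0, frozenset({0, 3}): 5.0,
    frozenset({1, 2}): 1.0, frozenset({1, 3}): 3.0, frozenset({2, 3}): 6.0,
}
total_weight = sum(weights.values())

def tour_edges(order, depot=0):
    """Edges kept by a closed tour depot -> order... -> depot."""
    path = (depot, *order, depot)
    return {frozenset(pair) for pair in zip(path, path[1:])}

def route_cost(kept):
    return sum(weights[e] for e in kept)

def deleted_weight(kept):
    return sum(w for e, w in weights.items() if e not in kept)

tours = [tour_edges(p) for p in permutations([1, 2, 3])]

# The per-edge identity holds for every tour, so both objectives agree.
for kept in tours:
    assert route_cost(kept) + deleted_weight(kept) == total_weight

best_by_cost = min(tours, key=route_cost)
best_by_deletion = max(tours, key=deleted_weight)
assert best_by_cost == best_by_deletion
```

Because the objective decomposes per edge, each deleted edge's weight can be read as its individual contribution to the saving, which is what makes attribution analyses possible.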
We propose an uncertainty-aware GNN for reconstructing urban temperature fields from sparse sensors under deployment constraints. The model predicts both the temperature field and spatially varying predictive uncertainty, supporting distance-constrained sensor placement and probabilistic exceedance mapping.
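As an illustration of probabilistic exceedance mapping, the sketch below turns a per-location predictive mean and standard deviation into the probability that temperature exceeds a threshold. It assumes a Gaussian predictive distribution at each location, which the summary does not state; `exceedance_probability` and `exceedance_map` are hypothetical names.

```python
import math

def exceedance_probability(mean, std, threshold):
    """P(T > threshold) for T ~ Normal(mean, std**2) (Gaussian assumption)."""
    if std <= 0:
        raise ValueError("std must be positive")
    z = (threshold - mean) / (std * math.sqrt(2.0))
    return 0.5 * math.erfc(z)

def exceedance_map(predictions, threshold):
    """Map each location to its exceedance probability.

    predictions: dict of location id -> (predictive mean, predictive std).
    """
    return {
        loc: exceedance_probability(mu, sigma, threshold)
        for loc, (mu, sigma) in predictions.items()
    }

# Invented predictions for three grid cells, in degrees Celsius.
field = {"cell_a": (31.0, 0.5), "cell_b": (29.0, 2.0), "cell_c": (27.0, 0.8)}
risk = exceedance_map(field, threshold=30.0)
```

Note how a cell with a lower mean but a larger predictive spread (`cell_b`) can still carry a non-trivial exceedance probability, which is why the uncertainty estimate matters for the map.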
We address stability and expressivity when using Laplacian eigenvectors as node features. We propose a symmetry-aware encoder that handles the orthogonal-group invariance among eigenvectors, preserving global expressivity and numerical stability.
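The ambiguity at issue can be seen on the 4-cycle, whose Laplacian has eigenvalue 2 with multiplicity 2: any rotation of an orthonormal basis of that eigenspace is an equally valid set of eigenvectors. The sketch below shows that raw eigenvector features change under such a rotation while the projector V Vᵀ does not. The projector is one well-known invariant construction used here for illustration; it is not claimed to be the encoder proposed in the paper.

```python
import math

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def projector(basis):
    """V V^T for a basis stored column-wise; invariant under V -> V Q, Q orthogonal."""
    return matmul(basis, transpose(basis))

# Orthonormal eigenvectors of the 4-cycle Laplacian for eigenvalue 2 (as columns).
r = 1.0 / math.sqrt(2.0)
V = [[r, 0.0], [0.0, r], [-r, 0.0], [0.0, -r]]

V_rotated = matmul(V, rotation(0.7))  # another valid basis of the same eigenspace
P, P_rotated = projector(V), projector(V_rotated)

max_feature_change = max(
    abs(a - b) for ra, rb in zip(V, V_rotated) for a, b in zip(ra, rb)
)
max_projector_change = max(
    abs(a - b) for ra, rb in zip(P, P_rotated) for a, b in zip(ra, rb)
)
```

Here `max_feature_change` is clearly non-zero while `max_projector_change` stays at floating-point noise, so a model fed the raw columns of V sees different inputs for the same graph.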
This survey introduces graph neural networks for machine learning engineers, explaining GNNs via an encoder-decoder framework, with concrete decoder examples and experiments on homogeneous graphs to illustrate behavior under different settings.
MOGKAN integrates multi-omics data (mRNA, miRNA, DNA methylation) with protein interaction networks to classify 31 cancer types and identify biomarkers. It combines differential expression with graph-based modeling for interpretable cancer diagnostics.
We propose RGPD, a reinforced graph-based physics-informed network with dynamic weighting to improve the accuracy of RUL and SoH estimation across assets. Dynamic weights adapt loss contributions to asset-specific degradation, enabling better transfer and robustness.
Vector Quantization (VQ) is studied as a tool for learning compressed, discrete graph representations. The paper reports that codebook collapse consistently arises when training VQ jointly with Graph Neural Networks on graph reconstruction tasks, and mitigation strategies fail to fully avert it. The work provides an empirical account of this limitation and discusses implications for the expressiveness and generalization of graph tokens.
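Codebook collapse is usually diagnosed by checking how many codes actually receive assignments. The sketch below computes two common diagnostics, the fraction of codes in use and the perplexity of the assignment distribution, for nearest-neighbour quantization. These metrics and the function names are assumptions for illustration; the summary does not say which diagnostics the paper uses.

```python
import math
from collections import Counter

def nearest_code(vec, codebook):
    """Index of the codebook entry closest to vec in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(vec, codebook[k])),
    )

def codebook_usage(embeddings, codebook):
    """Return the used-code fraction and the perplexity of code assignments."""
    counts = Counter(nearest_code(v, codebook) for v in embeddings)
    n = len(embeddings)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return {
        "used_fraction": len(counts) / len(codebook),
        "perplexity": math.exp(entropy),
    }

codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# Collapsed case: every node embedding lands on the same code.
collapsed = codebook_usage([(0.1, 0.0), (0.0, 0.1), (0.05, 0.05)], codebook)

# Healthy case: assignments are spread evenly over all codes.
healthy = codebook_usage([(0.1, 0.1), (0.9, 0.1), (0.1, 0.9), (0.9, 0.9)], codebook)
```

A perplexity close to 1 signals that nearly all inputs map to a single code, while a perplexity close to the codebook size signals balanced use.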
We propose a chaining pipeline of multiple 2-FWL GNNs, where each stage is trained to refine a similarity matrix by decoding the previous stage and ranking nodes by alignment quality. This sequential refinement leads to improved combinatorial graph alignment in purely structural settings, surpassing standard baselines.
Text-attributed graphs enable joint modeling of semantic content and graph structure in GNNs. The study shows that while TAGs work well in-distribution, GNNs struggle to detect out-of-distribution nodes with unseen text or structure, often producing overconfident predictions. It argues that existing topology-driven methods underutilize textual semantics and emphasizes methods that fuse textual meaning with topology for robust OOD detection.
GNNs have been used for SAT tasks, but bipartite or DAG representations may miss higher-order interactions and polarity among clauses and literals. The paper proposes clause-literal hypergraphs to capture these interactions and introduces polarity-aware representations to model literal polarity. Experiments show improved unsat-core prediction using the hypergraph-based approach.
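One possible encoding of such a hypergraph is sketched below, assuming DIMACS-style clauses (signed non-zero integers), variables as nodes, one hyperedge per clause, and a polarity of +1 or -1 stored on each clause-variable incidence. The paper may define nodes and polarity differently; `build_clause_hypergraph` is a hypothetical helper.

```python
def build_clause_hypergraph(clauses):
    """Build hyperedges and signed incidences from a CNF formula.

    clauses: list of clauses, each a list of non-zero ints (negative = negated).
    Returns (hyperedges, polarity) where hyperedges[i] is the set of variables
    in clause i and polarity[(i, var)] is +1 or -1.
    Assumes no clause contains both a variable and its negation.
    """
    hyperedges = []
    polarity = {}
    for clause_idx, clause in enumerate(clauses):
        members = set()
        for literal in clause:
            if literal == 0:
                raise ValueError("0 is not a valid literal")
            var = abs(literal)
            members.add(var)
            polarity[(clause_idx, var)] = 1 if literal > 0 else -1
        hyperedges.append(frozenset(members))
    return hyperedges, polarity

# (x1 OR NOT x2 OR x3) AND (NOT x1 OR x2)
cnf = [[1, -2, 3], [-1, 2]]
hyperedges, polarity = build_clause_hypergraph(cnf)
```

Unlike a bipartite literal-clause graph, each hyperedge here joins all variables of a clause at once, and the signed incidence keeps the information about which occurrences are negated.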
We analyze why explanations produced by Self-Interpretable Graph Neural Networks (SI-GNN) can be self-inconsistent when re-applying the model to its explanatory graph. The paper identifies re-explanation-induced context perturbation as the direct cause of score variation and proposes a latent signal assignment hypothesis to explain why only some edges are sensitive. Implications and potential mitigation strategies are discussed.
OgBench introduces a framework for evaluating Graph Neural Networks on omics data, focusing on the n << p regime, where each graph has many nodes but only a few samples are available. It provides datasets, benchmarks, and guidelines to assess GNN performance in low-sample, high-node settings typical of genomics and proteomics.
Graph Navier Stokes Networks (GNSN) introduce convection-inspired terms into message passing to overcome the oversmoothing problem in deep GNNs. By adopting Navier–Stokes–like dynamics, the architecture preserves node discriminability as depth grows. Experiments demonstrate improved performance over diffusion-based GNNs on various benchmarks.
We propose graph-based credit assignment (Graph-b) to enable step-level attribution in agentic reinforcement learning, moving beyond coarse trajectory-level signals. By constructing a graph over decisions or states and propagating rewards, the method uncovers latent step contributions and improves credit allocation and policy learning.
This work proposes robust contrastive graph clustering with adaptive local-global integration to flexibly capture high-order local structures and global semantics. It addresses data fragmentation and ambiguous cluster boundaries common in real-world graphs, and demonstrates improved clustering performance and stability.
The paper analyzes rank bottlenecks in embedding-based link prediction, explaining how linear output layers limit the expressivity when predicting links among many entities. It provides theoretical insight into why this bottleneck occurs and discusses implications for knowledge graph embeddings, offering directions to overcome it.
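The core of the bottleneck is a linear-algebra fact: if query and entity embeddings both live in d dimensions and scores are dot products, the full score matrix has rank at most d, however many entities there are. The sketch below checks this with exact rational arithmetic. The sizes and random integer embeddings are invented for illustration, and the identity matrix stands in for a target link pattern that needs full rank.

```python
import random
from fractions import Fraction

def matrix_rank(rows):
    """Exact rank via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank

rng = random.Random(0)
d, n_queries, n_entities = 2, 6, 6
queries = [[rng.randint(-3, 3) for _ in range(d)] for _ in range(n_queries)]
entities = [[rng.randint(-3, 3) for _ in range(d)] for _ in range(n_entities)]

# Linear output layer: score[i][j] = <query_i, entity_j>.
scores = [[sum(q * e for q, e in zip(qv, ev)) for ev in entities] for qv in queries]

# A target in which each query links to exactly one distinct entity.
identity_target = [[1 if i == j else 0 for j in range(n_entities)] for i in range(n_queries)]

score_rank = matrix_rank(scores)
target_rank = matrix_rank(identity_target)
```

Since `score_rank` can never exceed `d` while `target_rank` equals the number of entities, no choice of 2-dimensional embeddings reproduces the target scores exactly.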
We propose a GNN-enabled robust framework for hybrid beamforming under imperfect CSI by combining a Hybrid Message Graph Attention Network (HMGAT) with score-based CSI generation and denoising. The approach updates node and edge features through message passing and uses generative CSI logic to mitigate CSI imperfections, yielding improved performance.
We introduce LLM-Wikirace, a benchmark where models must plan and reason across real-world knowledge graphs by navigating Wikipedia hyperlinks step by step to reach a target page. The evaluation covers a broad set of open- and closed-source models, revealing planning capabilities and knowledge reasoning performance, with large models performing strongest on easier tasks.
We argue that knowledge graphs provide a missing data layer for LLM-based industrial asset operations, grounding reasoning beyond flat documents. The paper positions typed knowledge graphs as the data substrate and discusses routing questions via graph queries (e.g., Cypher) to improve reliability of operational reasoning.
Grokers presents an architecture for bottom-up inductive comprehension of typed knowledge graphs, pushing intelligence to write-time. Autonomous Groker agents analyze nodes in a typed stream graph, extract structured attributes via governed language model calls, and inductively compose understanding upward through dependencies, writing enriched attributes.
The paper argues for a deterministic recipe to resolve memory conflicts in LLM systems without asking the model to track freshness, showing improved reliability on memory benchmarks.
We present a multi-domain red-teaming framework for safety, robustness, and fairness evaluation of medical LLMs, covering 690 clinically grounded scenarios across nine domains and eleven models. Scenarios include adversarial transformations and are scored with a seven-dimension rubric, aided by LLM-assisted scoring and human-in-the-loop validation.
We argue that Semantic IDs (SIDs) require encoders because their meaning depends on prefix context, not just raw tokens. The paper advocates contextual SID encoding rather than simply expanding vocabulary, enabling better use of SIDs in multimodal or recommendation settings.
We present Needles at Scale, a low-cost, batch pipeline for LLM-assisted Windows vulnerability research that converts production binaries into a queryable queue of function targets. The pipeline recovers function-level symbols, enriches targets with metadata, and performs sampling and prioritization to guide vulnerability analysis.
We propose a graph-based prompt selection framework that models each benchmark as a similarity graph and uses Maximum Independent Set (MIS) to select a diverse, non-redundant subset of prompts. We compare multiple MIS solvers across different embeddings and distances to analyze efficiency and coverage.
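The selection step can be sketched as follows: connect two prompts when their embeddings are too similar, then pick a set of prompts with no edges between them. The code uses cosine similarity with a fixed threshold and a greedy minimum-degree heuristic, which yields a maximal (not necessarily maximum) independent set; the threshold, the toy embeddings, and the choice of heuristic are assumptions, since the paper compares several MIS solvers, embeddings, and distances.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def similarity_graph(embeddings, threshold):
    """Adjacency sets linking prompts whose similarity reaches the threshold."""
    adj = {i: set() for i in range(len(embeddings))}
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if cosine(embeddings[i], embeddings[j]) >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def greedy_independent_set(adj):
    """Repeatedly take a minimum-degree node and discard its neighbours."""
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    chosen = []
    while remaining:
        v = min(remaining, key=lambda x: (len(remaining[x]), x))
        chosen.append(v)
        removed = remaining[v] | {v}
        for u in removed:
            remaining.pop(u, None)
        for nbrs in remaining.values():
            nbrs -= removed
    return sorted(chosen)

# Toy 2-D prompt embeddings: two near-duplicate pairs and one outlier.
prompt_embeddings = [(1.0, 0.0), (0.99, 0.1), (0.0, 1.0), (0.1, 0.99), (-1.0, 0.0)]
graph = similarity_graph(prompt_embeddings, threshold=0.9)
selected = greedy_independent_set(graph)
```

On this toy input the selection keeps one prompt from each near-duplicate pair plus the outlier, which is the intended "diverse, non-redundant" behaviour.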
TechGraphRAG introduces an agentic, graph-augmented RAG framework for technical literature reasoning over a corpus of about 2,100 papers. It uses a 13-step autonomous pipeline that classifies queries by intent, scores evidence with a multi-dimensional rubric, performs drift-guarded query reformulation, and conducts external searches as needed.
Argument Collapse shows that essays generated by different LLMs tend to converge toward a smaller set of main arguments, sub-arguments, and paragraph structures when illustrating public debate. The study compares 1,039 human responses from NYT debates, 448 human responses from BR forums, and 23,384 LLM-written essays to quantify this collapse. This convergence suggests that LLMs may flatten long-form debates by recycling polished arguments.
We ask how much parameter capacity is needed for implicit reasoning without explicit chain-of-thought supervision. We pretrain language models from scratch in a controlled synthetic environment that mimics real-world knowledge graphs and evaluate their implicit reasoning ability. We propose a data-complexity-driven scaling law linking parameter budget to implicit reasoning performance.
The paper presents two frameworks for autonomous, agentic AI in scientific workflows. Both use a hybrid Local Body, Remote Brain architecture via Google Colab, with Python-based local controllers to call LLM cloud backends. The DeepTS/DeepCollector agent automates curation, extraction, and deduplication of large time-series datasets, while DeepScribe analyzes and presents dense, mathematics-heavy physics lectures.
FundaPod is a multi-persona agent pod platform with a knowledge-graph memory to support AI-assisted fundamental investment research. The platform targets evidence gathering, driver identification, viewpoint comparison, and memo generation beyond traditional signals, aiming to produce transparent, reusable, and verifiable investment plans that contribute to the cumulative development of investment research.
GFlowGR investigates fine-tuning generative recommendation frameworks using Generative Flow Networks. While most work focuses on item tokenizers and decoding strategies, the critical fine-tuning step adapting LLMs to recommendation data remains underexplored. The paper proposes GFlowGR to integrate generative flow networks into the fine-tuning process, improving alignment with data distribution and recommendation quality.
The work shows that enabling a thinking mode through Chain-of-Thought can sometimes hurt recommendations by up to 25%. It diagnoses linguistic inertia and identifies that inserting a textual CoT segment before generating semantic IDs causes models to rely more on natural language context than on historical data. The authors discuss remedies to rectify this mismatch.