Hojabr：迈向人工智能与数据分析的万物理论 (Hojabr: Towards a Theory of Everything for AI and Data Analytics)

Modern data analytics pipelines increasingly combine relational queries, graph processing, and tensor computation within a single application, but existing systems remain fragmented across paradigms, execution models, and research communities. This fragmentation results in repeated optimization efforts, limited interoperability, and strict separation between logical abstractions and physical execution strategies. We propose Hojabr as a unified declarative intermediate language to address this problem. Hojabr integrates relational algebra, tensor algebra, and constraint-based reasoning within a single higher-order algebraic framework, in which joins, aggregations, tensor contractions, and recursive computations are expressed uniformly. Physical choices, such as join algorithms, execution models, and sparse versus dense tensor representations, are handled as constraint-specialization decisions rather than as separate formalisms. Hojabr supports bidirectional translation with existing declarative languages, enabling programs to be both lowered into Hojabr for analysis and optimization and lifted back into their original declarative form. By making semantic, structural, and algebraic properties explicit, and by supporting extensibility across the compilation stack, Hojabr enables systematic reasoning and reuse of optimization techniques across database systems, machine learning frameworks, and compiler infrastructures.

翻译：现代数据分析流水线日益将关系型查询、图处理与张量计算结合于单一应用之中，但现有系统在范式、执行模型和研究社群之间仍处于割裂状态。这种割裂导致优化工作重复、互操作性受限，以及逻辑抽象与物理执行策略间的严格分离。本文提出Hojabr作为一种统一的声明式中间语言以解决此问题。Hojabr将关系代数、张量代数和基于约束的推理整合于单一的高阶代数框架内，其中连接、聚合、张量缩并和递归计算均可被统一表达。物理层面的选择（如连接算法、执行模型、稀疏与稠密张量表示）被处理为约束特化决策而非独立的形制。Hojabr支持与现有声明式语言的双向转换，使得程序既能降级为Hojabr以进行分析优化，也能升级回原始声明形式。通过显式化语义、结构和代数特性，并支持编译栈层面的可扩展性，Hojabr实现了跨数据库系统、机器学习框架和编译器基础设施的系统化推理与优化技术复用。