Abstract
We present a novel framework for authorial classification and clustering of the Qumran Dead Sea Scrolls (DSS). Our approach combines modern Hebrew BERT embeddings with traditional natural language processing features in a graph neural network (GNN) architecture.
Our results outperform baseline methods on both the Dead Sea Scrolls and a validation dataset of the Hebrew Bible. In particular, we use our model to shed new light on long-standing debates, including the classification of sectarian and non-sectarian texts and the division of the Hodayot collection of hymns.
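To make the idea of combining semantic and statistical features in a graph model more concrete, the following is a minimal sketch, not the implementation used in this work. It assumes that each text chunk is a graph node, that the two feature types are standardized and concatenated, that edges connect each chunk to its k nearest neighbors by cosine similarity of the embeddings, and that a single GCN-style propagation step is applied. The random arrays stand in for real BERT embeddings and hand-crafted features, and all function names, dimensions, and the value of k are hypothetical.

```python
import numpy as np

def standardize(x):
    # Zero-mean, unit-variance scaling per feature column.
    return (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-8)

def knn_adjacency(features, k):
    # Symmetric k-nearest-neighbor graph from cosine similarity (assumed graph construction).
    unit = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-8)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-matches from the neighbor search
    neighbors = np.argsort(-sim, axis=1)[:, :k]
    n = features.shape[0]
    adj = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    adj[rows, neighbors.ravel()] = 1.0
    return np.maximum(adj, adj.T)  # make the graph undirected

def gcn_layer(adj, x, weight):
    # One propagation step: ReLU(D^-1/2 (A + I) D^-1/2 X W).
    a_hat = adj + np.eye(adj.shape[0])
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = a_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]
    return np.maximum(norm @ x @ weight, 0.0)

rng = np.random.default_rng(0)
n_chunks = 12
semantic = rng.normal(size=(n_chunks, 16))    # placeholder for BERT embeddings
statistical = rng.normal(size=(n_chunks, 5))  # placeholder for traditional NLP features

nodes = np.hstack([standardize(semantic), standardize(statistical)])
adj = knn_adjacency(semantic, k=3)
weight = rng.normal(size=(nodes.shape[1], 8)) * 0.1  # untrained, random weights
hidden = gcn_layer(adj, nodes, weight)
print(hidden.shape)  # (12, 8)
```

In a trained model the weights would be learned, and the resulting node representations could then be passed to a classifier or a clustering algorithm.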
Integrating Semantic and Statistical Features for Authorial Clustering of Qumran Scrolls
Interactive Clustering Analysis
Interactive analysis of the Hodayot composition, comparing the Teacher Hymns with the Community Hymns.
Clustering by sectarian/non-sectarian
The colors in this visualization show the distribution of text clusters across sectarian and non-sectarian texts.
Clustering by Composition
The colors in this visualization show the distribution of text clusters across different compositions.