InDelsTopo
InDelsTopo is a Python package for studying the topological structure of sets of words, especially when their primary source of variation arises from insertions and deletions.
It implements the Insertion Chain Complex introduced in
Natasha Jonoska, Francisco Martinez-Figueroa, and Masahico Saito,
The Insertion Chain Complex: A Topological Approach to the Structure of Word Sets, 2025.
Overview
The Insertion Chain Complex framework provides a topological structure on the relationships between words that differ by insertions and deletions.
It was originally developed to model DNA sequence variation during double-strand break repair but can be applied to any setting where the structure of sets of words needs to be analyzed.
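The relation underlying this structure can be illustrated in plain Python, without the package. The sketch below is not InDelsTopo's implementation; the helper names `is_single_insertion` and `insertion_edges` are hypothetical and only show, at the level of pairs of words, what "differing by a single insertion" means.

```python
def is_single_insertion(shorter: str, longer: str) -> bool:
    """Return True if `longer` equals `shorter` with exactly one symbol inserted."""
    if len(longer) != len(shorter) + 1:
        return False
    # Deleting some single position of the longer word must give the shorter one.
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))


def insertion_edges(words):
    """List pairs (u, v) of words in `words` where v is u plus one inserted symbol."""
    unique = sorted(set(words), key=lambda w: (len(w), w))
    return [(u, v) for u in unique for v in unique if is_single_insertion(u, v)]


# Hypothetical usage on a small word set (the empty string is the empty word).
edges = insertion_edges(['a', 'b', 'ab', '', 'ba'])
for u, v in edges:
    print(repr(u), '->', repr(v))
```

This captures only pairwise relations between words. The Insertion Chain Complex also includes higher-dimensional cells, which this sketch does not attempt to model.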
InDelsTopo provides:
- Construction of Filtrations and Complexes over word sets
- Tools to compute homology, Euler characteristic curves, and persistent homology
- Optional integration with SageMath
- Methods to analyze and visualize the topological structure of word sets
Installation
Install via pip:
pip install InDelsTopo
For full functionality (e.g., homology with integer coefficients, \(\mathbb{Z}\)), install SageMath and run your notebooks in a SageMath kernel.
Quick Start
from InDelsTopo import Filtration, filtration_plot
# Create a filtration from a small set of words
F = Filtration()
F.compute_d_skeleton(['a', 'b', 'ab', '', 'ba'], [1,2,3,4,5])
# Visualize the sublevel sets at different heights.
F.get_graph(1)
F.get_graph(2)
F.get_graph(5)
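To make the idea of a sublevel set concrete, here is a minimal package-independent sketch. It assumes that the second list passed above assigns one height to each word, in order, and that two words are joined when they differ by a single inserted symbol. The function `sublevel_graph` is hypothetical and is not part of the InDelsTopo API.

```python
def is_single_insertion(shorter, longer):
    """Return True if `longer` equals `shorter` with exactly one symbol inserted."""
    if len(longer) != len(shorter) + 1:
        return False
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))


def sublevel_graph(words, heights, h):
    """Vertices and edges among the words whose height is at most h.

    Assumption: heights[i] is the height of words[i].
    """
    vertices = [w for w, height in zip(words, heights) if height <= h]
    edges = [(u, v) for u in vertices for v in vertices if is_single_insertion(u, v)]
    return vertices, edges


words = ['a', 'b', 'ab', '', 'ba']
heights = [1, 2, 3, 4, 5]

for h in (1, 2, 5):
    vertices, edges = sublevel_graph(words, heights, h)
    print(f"h={h}: {len(vertices)} vertices, {len(edges)} edges")
```

Under these assumptions the sublevel sets are nested: raising `h` only adds words and the edges between them, which is what makes the sequence a filtration.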
Main Concepts
| Concept | Description |
|---|---|
| Block | Represents a combinatorial element generated by insertions. |
| Chain | Linear combination of blocks with integer coefficients. |
| Complex | Collection of blocks and faces forming a topological structure. |
| Filtration | Sequence of nested complexes indexed by a height function. |
For detailed examples, see the Jupyter Notebook Tutorial.
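As a rough illustration of the Chain row in the table, an integer linear combination can be modelled as a mapping from blocks to nonzero coefficients. This is only a sketch: the labels `'B1'`, `'B2'`, `'B3'` are opaque placeholders rather than the package's block notation, and `add_chains` and `scale_chain` are hypothetical helpers, not InDelsTopo functions.

```python
def add_chains(c1, c2):
    """Add two chains given as {block_label: integer_coefficient} dicts."""
    result = dict(c1)
    for block, coeff in c2.items():
        total = result.get(block, 0) + coeff
        if total:
            result[block] = total
        else:
            # Drop blocks whose coefficients cancel to zero.
            result.pop(block, None)
    return result


def scale_chain(chain, k):
    """Multiply every coefficient of a chain by the integer k."""
    if k == 0:
        return {}
    return {block: k * coeff for block, coeff in chain.items()}


# Placeholder chains: B1 - 2*B2 and 2*B2 + 3*B3.
first = {'B1': 1, 'B2': -2}
second = {'B2': 2, 'B3': 3}
total = add_chains(first, second)
print(total)
```

Storing only nonzero coefficients keeps equal chains equal as dictionaries, which simplifies comparisons in a toy setting like this one.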
Documentation Structure
Reference: API documentation.
Citation
If you use InDelsTopo in academic work, please cite:
Jonoska, N., Martinez-Figueroa, F., & Saito, M.
The Insertion Chain Complex: A Topological Approach to the Structure of Word Sets.
arXiv preprint arXiv:2509.12607, 2025.
Contributing
Contributions, pull requests, and feedback are welcome!
Please open an issue on the GitHub repository.
License
This project is licensed under the MIT License.
See the LICENSE file for details.
Acknowledgements
This project was developed at the University of South Florida in collaboration with Prof. Natasha Jonoska and Prof. Masahico Saito (University of South Florida), with insights from experimental data provided by Prof. Francesca Storici’s lab (Georgia Tech).
This package was built under the auspices of the Southeast Center for Mathematics and Biology, an NSF-Simons Research Center for Mathematics of Complex Biological Systems, under National Science Foundation Grant No. DMS-1764406 and Simons Foundation Grant No. 594594, with additional support from NSF grants DMS-2054321, CCF-2107267, and CCF-2505771 and from the W.M. Keck Foundation.