Papers for stringologist (2021)
Contents
- A “Learned” Approach to Quicken and Compress Rank/Select Dictionaries.
- PFP Compressed Suffix Trees.
- Approximate Hashing for Bioinformatics.
- The Parameterized Suffix Tray.
- Succinct Data Structures for Series-Parallel, Block-Cactus and 3-Leaf Power Graphs.
- A Compact Index for Cartesian Tree Matching.
- A Fast and Small Subsampled R-Index.
- A Linear Time Algorithm for Constructing Hierarchical Overlap Graphs.
- AWLCO: All-Window Length Co-Occurrence.
- An Invertible Transform for Efficient String Matching in Labeled Digraphs.
- Compressed Weighted de Bruijn Graphs.
- Computing Covers of 2D-Strings.
- Computing Edit Distance (Invited Talk).
- Constructing Strings Avoiding Forbidden Substrings.
- Constructing the Bijective and the Extended Burrows-Wheeler Transform in Linear Time.
- Data Structures for Categorical Path Counting Queries.
- Disorders and Permutations.
- Efficient Algorithms for Counting Gapped Palindromes.
- Front Matter, Table of Contents, Preface, Conference Organization.
- Gapped Indexing for Consecutive Occurrences.
- Internal Shortest Absent Word Queries.
- On-Line Pattern Matching on D-Texts (Invited Talk).
- Optimal Completion and Comparison of Incomplete Phylogenetic Trees Under Robinson-Foulds Distance.
- Optimal Construction of Hierarchical Overlap Graphs.
- R-enum: Enumeration of Characteristic Substrings in BWT-runs Bounded Space.
- Ranking Bracelets in Polynomial Time.
- Repetitions in Strings: A “Constant” Problem (Invited Talk).
- String Sanitization Under Edit Distance: Improved and Generalized.
- The Longest Run Subsequence Problem: Further Complexity Results.
- The k-Mappability Problem Revisited.
- Weighted Ancestors in Suffix Trees Revisited.
- A Disk-Based Index for Trajectories with an In-Memory Compressed Cache.
- A grammar compressor for collections of reads with applications to the construction of the BWT.
- Accelerating Knuth-Morris-Pratt String Matching over LZ77 Compressed Text.
- Approximate Hashing for Bioinformatics.
- Backward Weighted Coding.
- Compact Representation of Spatial Hierarchies and Topological Relationships.
- DZip: improved general-purpose loss less compression based on novel neural network modeling.
- Efficiently Merging r-indexes.
- End-to-End optimized image compression for machines, a study.
- Improved LZ77 Compression.
- Improving Run Length Encoding by Preprocessing.
- Neural Networks Optimally Compress the Sawbridge.
- On Elias-Fano for Rank Queries in FM-Indexes.
- On Random Editing in LZ-End.
- PHONI: Streamed Matching Statistics with Multi-Genome References.
- Parallel Processing of Grammar Compression.
- Smaller RLZ-Compressed Suffix Arrays.
- Succinct Data Structures for Small Clique-Width Graphs.
- Succinct representations of Intersection Graphs on a Circle.
- ndzip: A High-Throughput Parallel Lossless Compressor for Scientific Data.
- Upper Bounds on Distinct Maximal (Sub-)Repetitions in Compressed Strings.
- Weighted Prefix Normal Words: Mind the Gap.
- Bidirectional String Anchors: A New String Sampling Mechanism.
- Compression by Contracting Straight-Line Programs.
- Dynamic Colored Orthogonal Range Searching.
- Fast and Space-Efficient Construction of AVL Grammars from the LZ77 Parsing.
- Faster Algorithms for Longest Common Substring.
- Hypersuccinct Trees - New Universal Tree Source Codes for Optimal Compressed Tree Data Structures and Range Minima.
- Lyndon Words Accelerate Suffix Sorting.
- Minimum Common String Partition: Exact Algorithms.
- Space Efficient Two-Dimensional Orthogonal Colored Range Counting.
- Breaking the Cubic Barrier for (Unweighted) Tree Edit Distance.
- Small-space and streaming pattern matching with $k$ edits.
- A Linear-Time n0.4-Approximation for Longest Common Subsequence.
- An Almost Optimal Edit Distance Oracle.
- Analysis of Smooth Heaps and Slim Heaps.
- Faster Algorithms for Bounded Tree Edit Distance.
- Fine-Grained Hardness for Edit Distance to a Fixed Sequence.
- Improved Approximation for Longest Common Subsequence over Small Alphabets.
- LF Successor: Compact Space Indexing for Order-Isomorphic Pattern Matching.
- Linear Time Runs Over General Ordered Alphabets.
- New Sublinear Algorithms and Lower Bounds for LIS Estimation.
- Optimal-Time Queries on BWT-Runs Compressed Indexes.
- Sorting Short Integers.
- Streaming and Small Space Approximation Algorithms for Edit Distance and Longest Common Subsequence.
- Algorithms and Complexity on Indexing Elastic Founder Graphs.
- Inverse Suffix Array Queries for 2-Dimensional Pattern Matching in Near-Compact Space.
- Pattern Masking for Dictionary Matching.
- Repetition- and Linearity-Aware Rank/Select Dictionaries.
- Resilient Level Ancestor, Bottleneck, and Lowest Common Ancestor Queries in Dynamic Trees.
- Streaming Pattern Matching (Invited Talk).
- The Tandem Duplication Distance Problem Is Hard over Bounded Alphabets.
- Cadences in Grammar-Compressed Strings.
- Succinct Representations for (Non)Deterministic Finite Automata.
- Matching Patterns with Variables Under Hamming Distance.
- Absent Subsequences in Words.
- Document Retrieval Hacks.
- Engineering Predecessor Data Structures for Dynamic Integer Sets.
- A Lower Bound for Dynamic Fractional Cascading.
- Beating the probabilistic lower bound on perfect hashing.
- Competitive Data-Structure Dynamization.
- New Data Structures for Orthogonal Range Reporting and Range Minima Queries.
- On Indexing and Compressing Finite Automata.
- On Locating Paths in Compressed Tries.
- Optimal Oblivious Priority Queues.
- A Normal Sequence Compressed by PPM* But Not by Lempel-Ziv 78.
- Blocksequences of k-local Words.
- Novel Results on the Number of Runs of the Burrows-Wheeler-Transform.
- Modular Subset Sum, Dynamic Strings, and Zero-Sum Sets.
- Soft Sequence Heaps.
- A Separation of γ and b via Thue-Morse Words.
- All Instantiations of the Greedy Algorithm for the Shortest Common Superstring Problem are Equivalent.
- An LMS-Based Grammar Self-index with Local Consistency Properties.
- Computing the Original eBWT Faster, Simpler, and with Less Memory.
- Exploiting Pseudo-locality of Interchange Distance.
- Extracting the Sparse Longest Common Prefix Array from the Suffix Binary Search Tree.
- Grammar Index by Induced Suffix Sorting.
- Improved Topic Modeling in Twitter Through Community Pooling.
- Longest Common Rollercoasters.
- Lower Bounds for the Number of Repetitions in 2D Strings.
- Minimal Unique Palindromic Substrings After Single-Character Substitution.
- On Stricter Reachable Repetitiveness Measures.
- On the Approximation Ratio of LZ-End to LZ77.
- Permutation-Constrained Common String Partitions with Applications.
- Position Heaps for Cartesian-Tree Matching on Strings and Tries.
- String Covers of a Tree.
- TSXor: A Simple Time Series Compression Algorithm.
- Unicode at Gigabytes per Second.
- findere: Fast and Precise Approximate Membership Query.
- r-Indexing the eBWT.
- Efficiently Testing Simon’s Congruence.
- Finding an Optimal Alphabet Ordering for Lyndon Factorization Is Hard.
- The Edit Distance to k-Subsequence Universality.
- Fully dynamic approximation of LIS in polylogarithmic time.
- Improved dynamic algorithms for longest increasing subsequence.
- Subcubic algorithms for Gomory-Hu tree in unweighted graphs.
- Automata Approach to Inexact Tree Pattern Matching Using 1-degree Edit Distance.
- Computational Substantiation of the d-step Conjecture for Distinct Squares Revisited.
- Counting Lyndon Subsequences.
- Pitfalls of Algorithm Comparison.
- Refined Upper Bounds on the Size of the Condensed Neighbourhood of Sequences.
- Searching with Extended Guard and Pivot Loop.
- The n-ary Initial Literal and Literal Shuffle.
- Towards an Efficient Text Sampling Approach for Exact and Approximate Matching.
- Compressing and Indexing Aligned Readsets.
- Space-Efficient Representation of Genomic k-Mer Count Tables.
ACM Comput. Surv. 2021
- Predecessor Search.
ACM J. Exp. Algorithmics 2021
- Engineering Practical Lempel-Ziv Tries.
ACM Trans. Algorithms 2021
- Optimal Substring Equality Queries with Applications to Sparse Text Indexing.
- Optimal-Time Dictionary-Compressed Indexes.
ACM Trans. Knowl. Discov. Data 2021
- Combinatorial Algorithms for String Sanitization.
Algorithmica 2021
- Internal Dictionary Matching.
- Range Majorities and Minorities in Arrays.
- Top Tree Compression of Tries.
Algorithms 2021
- Compressed Communication Complexity of Hamming Distance.
- Lempel-Ziv Parsing for Sequences of Blocks.
- Non-Overlapping LZ77 Factorization and LZ78 Substring Compression Queries with Suffix Trees.
- Re-Pair in Small Space.
- Reversed Lempel-Ziv Factorization with Suffix Trees.
- Subpath Queries on Compressed Graphs: A Survey.
- Accurate spliced alignment of long RNA sequencing reads.
CoRR 2021
- $r$-indexing Wheeler graphs.
- A Bloom Filter Survey: Variants for Different Domain Applications.
- A Conditional Lower Bound for Episode Matching.
- A Fast and Small Subsampled R-index.
- A Linear Time Algorithm for Constructing Hierarchical Overlap Graphs.
- A New Lossless Data Compression Algorithm Exploiting Positional Redundancy.
- A Separation of γ and b via Thue-Morse Words.
- A Simple Algorithm for Optimal Search Trees with Two-Way Comparisons.
- A Tight Analysis of Slim Heaps and Smooth Heaps.
- A new compressed cover tree guarantees a near linear parameterized complexity for all $k$-nearest neighbors search in metric spaces.
- A new distance based on minimal absent words and applications to biological sequences.
- APMF < APSP? Gomory-Hu Tree for Unweighted Graphs in Almost-Quadratic Time.
- Abelian Repetition Threshold Revisited.
- Absent Subsequences in Words.
- Algorithms and Complexity on Indexing Founder Graphs.
- Algorithms and Hardness for Multidimensional Range Updates and Queries.
- All instantiations of the greedy algorithm for the shortest superstring problem are equivalent.
- An Almost Optimal Edit Distance Oracle.
- An Improved Algorithm for The k-Dyck Edit Distance Problem.
- An O(k log n) algorithm for prefix based ranked autocomplete.
- An efficient way to manage blocks of data with Wise Red-Black Trees.
- Analysis of Smooth Heaps and Slim Heaps.
- Approximate Membership Query Filters with a False Positive Free Set.
- Approximating LCS and Alignment Distance over Multiple Sequences.
- Approximating Length-Restricted Means under Dynamic Time Warping.
- Approximating the Longest Common Subsequence problem within a sub-polynomial factor in linear time.
- Approximation Algorithms for LCS and LIS with Truly Improved Running Times.
- Arbitrary-length analogs to de Bruijn sequences.
- Beyond Single-Deletion Correcting Codes: Substitutions and Transpositions.
- Beyond the Longest Letter-duplicated Subsequence Problem.
- Binary Dynamic Time Warping in Linear Time.
- Boosting the Search Performance of B+-tree for Non-volatile Memory with Sentinels.
- Breaking the Cubic Barrier for (Unweighted) Tree Edit Distance.
- Breaking the O(n)-Barrier in the Construction of Compressed Suffix Arrays.
- Burrows Wheeler Transform on a Large Scale: Algorithms Implemented in Apache Spark.
- Cantor Mapping Technique.
- Checking whether a word is Hamming-isometric in linear time.
- Chunk List: Concurrent Data Structures.
- Closed Ziv-Lempel factorization of the m-bonacci words.
- Combinatorics of minimal absent words for a sliding window.
- Compact Euler Tours of Trees with Small Maximum Degree.
- Complex Event Forecasting with Prediction Suffix Trees: Extended Technical Report.
- Compressed Communication Complexity of Hamming Distance.
- Compression by Contracting Straight-Line Programs.
- Computing Matching Statistics on Repetitive Texts.
- Computing the original eBWT faster, simpler, and with less memory.
- Conditional Lower Bounds for Variants of Dynamic LIS.
- Construction of Sparse Suffix Trees and LCE Indexes in Optimal Time and Space.
- Counting Lyndon Subsequences.
- Counting and Verifying Abelian Border Arrays of Binary Words.
- Critical factorisation in square-free words.
- Defeating duplicates: A re-design of the LearnedSort algorithm.
- Deriving monadic quicksort (Declarative Pearl).
- Does Preprocessing help in Fast Sequence Comparisons?
- Dynamic Longest Increasing Subsequence and the Erdös-Szekeres Partitioning Problem.
- Dynamic Range Mode Enumeration.
- Dynamic Suffix Array with Sub-linear update time and Poly-logarithmic Lookup Time.
- Efficient construction of the extended BWT from grammar-compressed DNA sequencing reads.
- Empirically Improved Tokuda Gap Sequence in Shellsort.
- Engineering Predecessor Data Structures for Dynamic Integer Sets.
- Entropy of Mersenne-Twisters.
- Estimating the Longest Increasing Subsequence in Nearly Optimal Time.
- ExtendedHyperLogLog: Analysis of a new Cardinality Estimator.
- FM-Indexing Grammars Induced by Suffix Sorting for Long Patterns.
- Fast Succinct Retrieval and Approximate Membership using Ribbon.
- Fast and Space-Efficient Construction of AVL Grammars from the LZ77 Parsing.
- Fast direct access to variable length codes.
- Faster Algorithms for Bounded Tree Edit Distance.
- Faster Algorithms for Longest Common Substring.
- Faster Exponential Algorithm for Permutation Pattern Matching.
- Friendly Cut Sparsifiers and Faster Gomory-Hu Trees.
- From Bit-Parallelism to Quantum: Breaking the Quadratic Barrier.
- Gap Edit Distance via Non-Adaptive Queries: Simple and Optimal.
- Gapped Indexing for Consecutive Occurrences.
- Grammar Index By Induced Suffix Sorting.
- Graphs can be succinctly indexed for pattern matching in $ O(│E│^2 + │V│^{5 / 2}) $ time.
- HINT: A Hierarchical Index for Intervals in Main Memory.
- HOLZ: High-Order Entropy Encoding of Lempel-Ziv Factor Distances.
- Hardness of Detecting Abelian and Additive Square Factors in Strings.
- Hierarchical Bitmap Indexing for Range and Membership Queries on Multidimensional Arrays.
- How Compression and Approximation Affect Efficiency in String Distance Measures.
- Hypersuccinct Trees - New universal tree source codes for optimal compressed tree data structures.
- Improved Approximation Algorithms for Dyck Edit Distance and RNA Folding.
- Improved Approximation Guarantees for Shortest Superstrings using Cycle Classification by Overlap to Length Ratios.
- Improved Approximation for Longest Common Subsequence over Small Alphabets.
- Improving Run Length Encoding by Preprocessing.
- Internal Shortest Absent Word Queries in Constant Time and Linear Space.
- Is this the simplest (and most surprising) sorting algorithm ever?
- Learned Sorted Table Search and Static Indexes in Small Space: Methodological and Practical Insights via an Experimental Study.
- Linear Approximate Pattern Matching Algorithm.
- Linear Time Runs over General Ordered Alphabets.
- Linear-time Minimization of Wheeler DFAs.
- Load-Balancing Succinct B Trees.
- Lower Bounds and Improved Algorithms for Asymmetric Streaming Edit Distance and Longest Common Subsequence.
- Lower Bounds for the Number of Repetitions in 2D Strings.
- Matching Patterns with Variables under Hamming Distance.
- Memory-Optimality for Non-Blocking Containers.
- Minimal unique palindromic substrings after single-character substitution.
- N-ary Huffman Encoding Using High-Degree Trees - A Performance Comparison.
- Near-Optimal Quantum Algorithms for String Problems.
- Nearly Tight Lower Bounds for Succinct Range Minimum Query.
- Number Parsing at a Gigabyte per Second.
- On (co-lex) Ordering Automata.
- On Arithmetically Progressed Suffix Arrays and related Burrows-Wheeler Transforms.
- On Solving the Minimum Common String Partition Problem by Decision Diagrams.
- On Specialization of a Program Model of Naive Pattern Matching in Strings (Extended Abstract).
- On Stricter Reachable Repetitiveness Measures.
- On the Cost of Unsuccessful Searches in Search Trees with Two-way Comparisons.
- On the Optimal Time/Space Tradeoff for Hash Tables.
- On the approximation ratio of LZ-End to LZ77.
- Optimal Gap Sequences in Shellsort for n≤6 Elements.
- Optimal Sorting Circuits for Short Keys.
- Optimal Space and Time for Streaming Pattern Matching.
- PTHash: Revisiting FCH Minimal Perfect Hashing.
- Parallel Batch-Dynamic kd-Trees.
- Parallel Batched Interpolation Search Tree.
- Parallel and External-Memory Construction of Minimal Perfect Hash Functions with PTHash.
- Pattern Matching on Grammar-Compressed Strings in Linear Time.
- Pattern-defeating Quicksort.
- Position Heaps for Cartesian-tree Matching on Strings and Tries.
- Practical evaluation of Lyndon factors via alphabet reordering.
- Prefixes of the Fibonacci word that end with a cube.
- RLBWT Tricks.
- Range Minimum Queries in Minimal Space.
- Reconstruction of Sets of Strings from Prefix/Suffix Compositions.
- Resilient Level Ancestor, Bottleneck, and Lowest Common Ancestor Queries in Dynamic Trees.
- Ruler Wrapping.
- SIMD-Optimized Search Over Sorted Data.
- Scalable Hash Table for NUMA Systems.
- Sensitivity of string compressors and repetitiveness measures.
- SetSketch: Filling the Gap between MinHash and HyperLogLog.
- Simple Worst-Case Optimal Adaptive Prefix-Free Coding.
- Small space and streaming pattern matching with k edits.
- Solving one variable word equations in the free group in cubic time.
- Sorted Range Reporting.
- Sorting Short Integers.
- Space Efficient Two-Dimensional Orthogonal Colored Range Counting.
- Space-Efficient Huffman Codes Revisited.
- Spanner Evaluation over SLP-Compressed Documents.
- Stochastic and Worst-Case Generalized Sorting Revisited.
- Strictly In-Place Algorithms for Permuting and Inverting Permutations.
- String Comparison on a Quantum Computer Using Hamming Distance.
- String Sampling with Bidirectional String Anchors.
- Succinct Data Structure for Path Graphs.
- Succinct Data Structures for Series-Parallel, Block-Cactus and 3-Leaf Power Graphs.
- Support Optimality and Adaptive Cuckoo Filters.
- Text Searching Allowing for Non-Overlapping Adjacent Unbalanced Translocations.
- The Dynamic k-Mismatch Problem.
- The k-mappability problem revisited.
- Tiny Pointers.
- Tree Edit Distance with Variables. Measuring the Similarity between Mathematical Formulas.
- Weighted Ancestors in Suffix Trees Revisited.
- Weighted Burrows-Wheeler Compression.
- Which Regular Languages can be Efficiently Indexed?
Comput. J. 2021
- Smaller Compressed Suffix Arrays†.
IEEE Trans. Inf. Theory 2021
- On the Approximation Ratio of Ordered Parsings.
- The Smallest Grammar Problem Revisited.
Inf. Comput. 2021
- Efficient pattern matching in elastic-degenerate strings.
- On the cost of unsuccessful searches in search trees with two-way comparisons.
- Wheeler languages.
Inf. Process. Lett. 2021
- Longest previous overlapping factor array.
Int. J. Geogr. Inf. Sci. 2021
- An index for moving objects with constant-time access to their compressed trajectories.
J. Comput. Syst. Sci. 2021
- Block trees.
- Circular pattern matching with k mismatches.
- Grammar-compressed indexes with logarithmic search time.
J. Inf. Process. 2021
- Towards a Complete Perspective on Labeled Tree Indexing: New Size Bounds, Efficient Constructions, and Beyond.
J. King Saud Univ. Comput. Inf. Sci. 2021
- A survey on data compression techniques: From the perspective of data quality, coding schemes, data type and applications.
Theor. Comput. Sci. 2021
- Adaptive learning of compressible strings.
- Computing longest palindromic substring after single-character or block-wise edits.
- Efficiently computing runs on a trie.
- Maximal unbordered factors of random strings.
- Space-efficient construction of compressed suffix trees.
- Tight upper and lower bounds on suffix tree breadth.
- When a dollar makes a BWT.
Theory Comput. Syst. 2021
- Constructing Antidictionaries of Long Texts in Output-Sensitive Space.