StringologyTimes

Papers for stringologist (2022)

Contents

CPM 2022

  1. A Theoretical and Experimental Analysis of BWT Variants for String Collections.
  2. An FPT-Algorithm for Longest Common Subsequence Parameterized by the Maximum Number of Deletions.
  3. Arbitrary-Length Analogs to de Bruijn Sequences.
  4. Back-To-Front Online Lyndon Forest Construction.
  5. Beyond the Longest Letter-Duplicated Subsequence Problem.
  6. Bi-Directional r-Indexes.
  7. Cartesian Tree Subsequence Matching.
  8. Compact Text Indexing for Advanced Pattern Matching Problems: Parameterized, Order-Isomorphic, 2D, etc. (Invited Talk).
  9. Efficient Construction of the BWT for Repetitive Text Using String Compression.
  10. Front Matter, Table of Contents, Preface, Conference Organization.
  11. Indexable Elastic Founder Graphs of Minimum Height.
  12. Invitation to Combinatorial Reconfiguration (Invited Talk).
  13. Linear-Time Computation of Shortest Covers of All Rotations of a String.
  14. Longest Palindromic Substring in Sublinear Time.
  15. Making de Bruijn Graphs Eulerian.
  16. Mechanical Proving with Walnut for Squares and Cubes in Partial Words.
  17. Minimal Absent Words on Run-Length Encoded Strings.
  18. On Strings Having the Same Length- k Substrings.
  19. Parallel Algorithm for Pattern Matching Problems Under Substring Consistent Equivalence Relations.
  20. Partial Permutations Comparison, Maintenance and Applications.
  21. Permutation Pattern Matching for Doubly Partially Ordered Patterns.
  22. Polynomial-Time Equivalences and Refined Algorithms for Longest Common Subsequence Variants.
  23. Rectangular Tile Covers of 2D-Strings.
  24. Reduction Ratio of the IS-Algorithm: Worst and Random Cases.
  25. Reordering a Tree According to an Order on Its Leaves.
  26. The Dynamic k-Mismatch Problem.
  27. The Fine-Grained Complexity of Episode Matching.
  28. The Normalized Edit Distance with Uniform Operation Costs Is a Metric.
  29. Using Automata and a Decision Procedure to Prove Results in Pattern Matching (Invited Talk).
  30. {RePair} Grammars Are the Smallest Grammars for Fibonacci Words.

DCC 2022

  1. A Benchmark of Entropy Coders for the Compression of Genome Sequencing Data.
  2. Burrows-Wheeler Transform on Purely Morphic Words.
  3. CSTs for Terabyte-Sized Data.
  4. Compressing the Tree of Canonical Huffman Coding.
  5. Computing Lexicographic Parsings.
  6. Computing Matching Statistics on Repetitive Texts.
  7. Converting RLBWT to LZ77 in smaller space.
  8. FM-Indexing Grammars Induced by Suffix Sorting for Long Patterns.
  9. Fast Coding of Haar Wavelet Trees.
  10. Graphs can be succinctly indexed for pattern matching in $O(\vert E\vert ^{2}+\vert V\vert ^{5/2})$ time.
  11. HOLZ: High-Order Entropy Encoding of Lempel-Ziv Factor Distances.
  12. Linear-time Minimization of Wheeler DFAs.
  13. Lower Bounds for Lexicographical DFS Data Structures.
  14. On Dynamic Bitvector Implementations.
  15. On different variants of the Burrows-Wheeler-Transform of string collections.
  16. RLBWT Tricks.
  17. Selective Weighted Adaptive Coding.
  18. Simple Worst-Case Optimal Adaptive Prefix-Free Coding.
  19. Succinct Data Structure for Path Graphs.
  20. x3: Lossless Data Compressor.

DLT 2022

  1. Logarithmic Equal-Letter Runs for BWT of Purely Morphic Words.
  2. Prefix Palindromic Length of the Sierpinski Word.

ESA 2022

  1. An Improved Algorithm for Finding the Shortest Synchronizing Words.
  2. Approximate Circular Pattern Matching.
  3. Approximating Dynamic Time Warping Distance Between Run-Length Encoded Strings.
  4. Computing NP-Hard Repetitiveness Measures via MAX-SAT.
  5. Distinct Elements in Streams: An Algorithm for the (Text) Book.
  6. Insertion Time of Random Walk Cuckoo Hashing below the Peeling Threshold.
  7. Lyndon Arrays Simplified.
  8. Simple Worst-Case Optimal Adaptive Prefix-Free Coding.

FOCS 2022

  1. Estimating the Longest Increasing Subsequence in Nearly Optimal Time.
  2. Faster Pattern Matching under Edit Distance : A Reduction to Dynamic Puzzle Matching and the Seaweed Monoid of Permutation Matrices.
  3. Gap Edit Distance via Non-Adaptive Queries: Simple and Optimal.
  4. Online List Labeling: Breaking the log2n Barrier.
  5. Strong XOR Lemma for Communication with Bounded Rounds : (extended abstract).
  6. Õ(n+poly(k))-time Algorithm for Bounded Tree Edit Distance.

ICALP 2022

  1. An Optimal-Time RLBWT Construction in BWT-Runs Bounded Space.
  2. Fully Functional Parameterized Suffix Trees in Compact Space.
  3. Galloping in Fast-Growth Natural Merge Sorts.
  4. Improved Approximation Algorithms for Dyck Edit Distance and RNA Folding.
  5. Improved Sublinear-Time Edit Distance for Preprocessed Strings.
  6. Listing, Verifying and Counting Lowest Common Ancestors in DAGs: Algorithms and Fine-Grained Lower Bounds.

ISAAC 2022

  1. Computing Palindromes on a Trie in Linear Time.
  2. External-Memory Dictionaries with Worst-Case Update Cost.
  3. On the Complexity of Tree Edit Distance with Variables.
  4. Simple Order-Isomorphic Matching Index with Expected Compact Space.
  5. Succinct List Indexing in Optimal Time.

ISBRA 2022

  1. Optimal Sequence Alignment to ED-Strings.

IWCIA 2022

  1. Lyndon Partial Words and Arrays with Applications.

IWOCA 2022

  1. Computing Longest (Common) Lyndon Subsequences.
  2. Linear Time Construction of Indexable Elastic Founder Graphs.
  3. Practical Space-Efficient Index for Structural Pattern Matching.
  4. Shortest Unique Palindromic Substring Queries in Semi-dynamic Settings.
  5. Space-Efficient B Trees via Load-Balancing.

LATIN 2022

  1. Elastic-Degenerate String Matching with 1 Error.
  2. Near-Optimal Search Time in δ-Optimal Space.
  3. Space-Efficient Data Structure for Next/Previous Larger/Smaller Value Queries.
  4. String Attractors and Infinite Words.

MFCS 2022

  1. On Uniformization in the Full Binary Tree.
  2. Streaming Word Problems.

SEA 2022

  1. Computing Maximal Unique Matches with the r-Index.
  2. RLBWT Tricks.

SODA 2022

  1. A Lower Bound for the n-queens Problem.
  2. An Improved Algorithm for The k-Dyck Edit Distance Problem.
  3. An Upper Bound and Linear-Space Queries on the LZ-End Parsing.
  4. Average Sensitivity of Dynamic Programming.
  5. Enumerating k-SAT functions.
  6. How Compression and Approximation Affect Efficiency in String Distance Measures.
  7. How many Clusters? - An algorithmic answer.
  8. Pattern Matching on Grammar-Compressed Strings in Linear Time.
  9. Selectable Heaps and Optimal Lazy Search Trees.
  10. Simulating a stack using queues.
  11. Splay trees on trees.
  12. Streaming Regular Expression Membership and Pattern Matching.

SOSA 2022

  1. Faster Exponential Algorithm for Permutation Pattern Matching.
  2. Simpler Adjacency Labeling for Planar Graphs with B-Trees.

SPIRE 2022

  1. Accessing the Suffix Array via φ -1-Forest.
  2. Balancing Run-Length Straight-Line Programs.
  3. Compressed String Dictionaries via Data-Aware Subtrie Compaction.
  4. Computing All-vs-All MEMs in Run-Length-Encoded Collections of HiFi Reads.
  5. Computing the Parameterized Burrows-Wheeler Transform Online.
  6. Engineering Compact Data Structures for Rank and Select Queries on Bit Vectors.
  7. Genome Comparison on Succinct Colored de Bruijn Graphs.
  8. How Train-Test Leakage Affects Zero-Shot Retrieval.
  9. Internal Masked Prefix Sums and Its Connection to Fully Internal Measurement Queries.
  10. KATKA: A KRAKEN-Like Tool with k Given at Query Time.
  11. Matching Patterns with Variables Under Edit Distance.
  12. Maximal Closed Substrings.
  13. On Representing the Degree Sequences of Sublogarithmic-Degree Wheeler Graphs.
  14. On the Hardness of Computing the Edit Distance of Shallow Trees.
  15. On the Optimisation of the GSACA Suffix Array Construction Algorithm.
  16. Online Algorithms for Finding Distinct Substrings with Length and Multiple Prefix and Suffix Conditions.
  17. Pattern Matching Under DTW Distance.
  18. Quantum Time Complexity and Algorithms for Pattern Matching on Labeled Graphs.
  19. Reconstructing Parameterized Strings from Parameterized Suffix and LCP Arrays.
  20. Sorting Genomes by Prefix Double-Cut-and-Joins.
  21. Subsequence Covers of Words.
  22. Substring Complexities on Run-Length Compressed Strings.
  23. The Complexity of the Co-occurrence Problem.

STACS 2022

  1. Existential Definability over the Subword Ordering.
  2. Probabilistic vs Deterministic Gamblers.

STOC 2022

  1. Almost-optimal sublinear-time edit distance in the low distance regime.
  2. Dynamic suffix array with polylogarithmic queries and updates.
  3. Explicit binary tree codes with sub-logarithmic size alphabet.
  4. Improved approximation guarantees for shortest superstrings using cycle classification by overlap to length ratios.
  5. On the optimal time/space tradeoff for hash tables.

SWAT 2022

  1. Predecessor on the Ultra-Wide Word RAM.
  2. Unit-Disk Range Searching and Applications.

WABI 2022

  1. Eulertigs: Minimum Plain Text Representation of k-mer Sets Without Repetitions in Linear Time.
  2. Fast Gapped k-mer Counting with Subdivided Multi-Way Bucketed Cuckoo Hash Tables.
  3. Haplotype Threading Using the Positional Burrows-Wheeler Transform.
  4. Locality-Sensitive Bucketing Functions for the Edit Distance.
  5. On Weighted k-mer Dictionaries.
  6. Prefix-Free Parsing for Building Large Tunnelled Wheeler Graphs.
  7. Suffix Sorting via Matching Statistics.
  8. Toward Optimal Fingerprint Indexing for Large Scale Genomics.
  9. phyBWT: Alignment-Free Phylogeny via eBWT Positional Clustering.

ACM Comput. Surv. 2022

  1. Indexing Highly Repetitive String Collections, Part I: Repetitiveness Measures.
  2. Indexing Highly Repetitive String Collections, Part II: Compressed Indexes.

ACM J. Exp. Algorithmics 2022

  1. Grammar Compression by Induced Suffix Sorting.

ACM Trans. Algorithms 2022

  1. A Simple Algorithm for Optimal Search Trees with Two-way Comparisons.

Algorithmica 2022

  1. A Faster Reduction of the Dynamic Time Warping Distance to the Longest Increasing Subsequence Length.
  2. Adaptive Succinctness.
  3. Computing Minimal Unique Substrings for a Sliding Window.
  4. Efficient Computation of Sequence Mappability.
  5. Fast and Longest Rollercoasters.
  6. Space Efficient Merging of de Bruijn Graphs and Wheeler Graphs.
  7. Streaming Dictionary Matching with Mismatches.

Algorithms Mol. Biol. 2022

  1. Space-efficient representation of genomic k-mer count tables.

Appl. Soft Comput. 2022

  1. Graph search and variable neighborhood search for finding constrained longest common subsequences in artificial and real gene sequences.

Bioinform. 2022

  1. Fast and compact matching statistics analytics.

CoRR 2022

  1. A New Class of String Transformations for Compressed Text Indexing.
  2. A theoretical and experimental analysis of BWT variants for string collections.
  3. Almost-Optimal Sublinear-Time Edit Distance in the Low Distance Regime.
  4. An Optimal-Time RLBWT Construction in BWT-runs Bounded Space.
  5. An n Hk-compressed searchable partial-sums data structure for static sequences of sublogarithmic positive integers.
  6. Approximate Circular Pattern Matching.
  7. Cartesian Tree Subsequence Matching.
  8. Computing Longest (Common) Lyndon Subsequences.
  9. Computing NP-hard Repetitiveness Measures via MAX-SAT.
  10. Computing maximal generalized palindromes.
  11. Dynamic Suffix Array with Polylogarithmic Queries and Updates.
  12. Efficient Construction of the BWT for Repetitive Text Using String Compression.
  13. Faster Pattern Matching under Edit Distance.
  14. Improving Matrix-vector Multiplication via Lossless Grammar-Compressed Matrices.
  15. LCP-dropout: Compression-based Multiple Subword Segmentation for Neural Machine Translation.
  16. Linear Time Construction of Indexable Elastic Founder Graphs.
  17. Logarithmic equal-letter runs for BWT of purely morphic words.
  18. Longest (Sub-)Periodic Subsequence.
  19. MONI can find k-MEMs.
  20. Memory Efficient Tries for Sequential Pattern Mining.
  21. Minimal Absent Words on Run-Length Encoded Strings.
  22. Multiple Genome Analytics Framework: The Case of All SARS-CoV-2 Complete Variants.
  23. Near-Optimal Search Time in δ-Optimal Space.
  24. Numeric Lyndon-based feature embedding of sequencing reads for machine learning approaches.
  25. OSM-tree: A Sortedness-Aware Index.
  26. Online List Labeling: Breaking the log2n Barrier.
  27. Online algorithms for finding distinct substrings with length and multiple prefix and suffix conditions.
  28. Õ(n+poly(k))-time Algorithm for Bounded Tree Edit Distance.
  29. Predecessor on the Ultra-Wide Word RAM.
  30. RePair Grammars are the Smallest Grammars for Fibonacci Words.
  31. Safety and Completeness in Flow Decompositions for RNA Assembly.
  32. Shortest Unique Palindromic Substring Queries in Semi-dynamic Settings.
  33. Space-Efficient STR-IC-LCS Computation.
  34. Standard Vs Uniform Binary Search and Their Variants in Learned Static Indexing: The Case of the Searching on Sorted Data Benchmarking Software Platform.
  35. Substring Complexities on Run-length Compressed Strings.
  36. The Efficiency of the ANS Entropy Encoding.
  37. What Does Dynamic Optimality Mean in External Memory?
  38. X3: Lossless Data Compressor.

Commun. ACM 2022

  1. Resolution of the burrows-wheeler transform conjecture.
  2. The compression power of the BWT: technical perspective.

IEEE Access 2022

  1. Compressing and Querying Integer Dictionaries Under Linearities and Repetitions.

Inf. Comput. 2022

  1. A periodicity lemma for partial words.
  2. Efficient representation and counting of antipower factors in words.
  3. LZRR: LZ77 parsing with right reference.
  4. Order-preserving pattern matching indeterminate strings.
  5. c-trie++: A dynamic trie tailored for fast prefix searches.

Inf. Process. Lett. 2022

  1. All-pairs suffix/prefix in optimal time using Aho-Corasick space.
  2. Palindromic trees for a sliding window and its applications.

Inf. Retr. J. 2022

  1. Applying burst-tries for error-tolerant prefix search.

J. Comput. Biol. 2022

  1. Finding Maximal Exact Matches Using the r-Index.
  2. MONI: A Pangenomic Index for Finding Maximal Exact Matches.

J. Comput. Sci. 2022

  1. A sorting algorithm based on ordered block insertions.

SN Comput. Sci. 2022

  1. Combining Forward Compression with PPM.
  2. Correction to: Graph Compression for Adjacency-Matrix Multiplication.
  3. Graph Compression for Adjacency-Matrix Multiplication.

Theor. Comput. Sci. 2022

  1. A data structure for substring-substring LCS length queries.
  2. An efficient algorithm for the longest common palindromic subsequence problem.
  3. Combinatorics of minimal absent words for a sliding window.
  4. Efficient and compact representations of some non-canonical prefix-free codes.
  5. In-place initializable arrays.
  6. Internal shortest absent word queries in constant time and linear space.
  7. Parameterized DAWGs: Efficient constructions and bidirectional pattern searches.
  8. Partial sums on the ultra-wide word RAM.

Theory Comput. Syst. 2022

  1. Factorizing Strings into Repetitions.