StringologyTimes

DCC for Stringologists

DCC 2023

  1. Recursive Prefix-Free Parsing for Building Big BWTs.
  2. LZ4r - A New Fast Compression Algorithm for High-Speed Data Storage Systems.
  3. Practical Implementations of Compressed RAM.
  4. Model Compression for Data Compression: Neural Network Based Lossless Compressor Made Practical.
  5. Measuring the Similarity of Files by Data Compression.
  6. Contextual Pattern Matching in Less Space.
  7. Augmented Thresholds for MONI.
  8. Bit-Parallel (Compressed) Wavelet Tree Construction.
  9. Permutation coding using divide-and-conquer strategy.
  10. Computing the optimal BWT of very large string collections.
  11. SnappyR: A New High-Speed Lossless Data Compression Algorithm.
  12. Computing matching statistics on Wheeler DFAs.
  13. FM-Directories: Extending the Burrows-Wheeler Transform for String Labeled Vertex Graphs of (Almost) Arbitrary Topology.
  14. RNA secondary structures: from ab initio prediction to better compression, and back.
  15. Constructing the CDAWG CFG using LCP-Intervals.
  16. JARVIS2: a data compressor for large genome sequences.

DCC 2022

  1. On different variants of the Burrows-Wheeler-Transform of string collections.
  2. Computing Matching Statistics on Repetitive Texts.
  3. Lower Bounds for Lexicographical DFS Data Structures.
  4. Computing Lexicographic Parsings.
  5. Burrows-Wheeler Transform on Purely Morphic Words.
  6. Simple Worst-Case Optimal Adaptive Prefix-Free Coding.
  7. CSTs for Terabyte-Sized Data.
  8. On Dynamic Bitvector Implementations.
  9. Fast Coding of Haar Wavelet Trees.
  10. x3: Lossless Data Compressor.
  11. A Benchmark of Entropy Coders for the Compression of Genome Sequencing Data.
  12. Converting RLBWT to LZ77 in smaller space.
  13. HOLZ: High-Order Entropy Encoding of Lempel-Ziv Factor Distances.
  14. Succinct Data Structure for Path Graphs.
  15. Compressing the Tree of Canonical Huffman Coding.
  16. Graphs can be succinctly indexed for pattern matching in $O(\vert E\vert ^{2}+\vert V\vert ^{5/2})$ time.
  17. Linear-time Minimization of Wheeler DFAs.
  18. FM-Indexing Grammars Induced by Suffix Sorting for Long Patterns.
  19. Selective Weighted Adaptive Coding.
  20. RLBWT Tricks.

DCC 2021

  1. Succinct representations of Intersection Graphs on a Circle.
  2. On Elias-Fano for Rank Queries in FM-Indexes.
  3. PHONI: Streamed Matching Statistics with Multi-Genome References.
  4. Accelerating Knuth-Morris-Pratt String Matching over LZ77 Compressed Text.
  5. A grammar compressor for collections of reads with applications to the construction of the BWT.
  6. DZip: improved general-purpose lossless compression based on novel neural network modeling.
  7. A Disk-Based Index for Trajectories with an In-Memory Compressed Cache.
  8. Improved LZ77 Compression.
  9. Succinct Data Structures for Small Clique-Width Graphs.
  10. Approximate Hashing for Bioinformatics.
  11. Efficiently Merging r-indexes.
  12. Neural Networks Optimally Compress the Sawbridge.
  13. Smaller RLZ-Compressed Suffix Arrays.
  14. On Random Editing in LZ-End.
  15. Parallel Processing of Grammar Compression.
  16. End-to-End optimized image compression for machines, a study.
  17. Improving Run Length Encoding by Preprocessing.
  18. Compact Representation of Spatial Hierarchies and Topological Relationships.
  19. ndzip: A High-Throughput Parallel Lossless Compressor for Scientific Data.
  20. Backward Weighted Coding.

DCC 2020

  1. Grammar Compression with Probabilistic Context-Free Grammar.
  2. Edge Minimization in de Bruijn Graphs.
  3. Revisiting Compact RDF Stores Based on k2-Trees.
  4. Bitvectors with Runs and the Successor/Predecessor Problem.
  5. Compact Representation of Graphs with Small Bandwidth and Treedepth.
  6. Re-Pair in Small Space.
  7. c-Trie++: A Dynamic Trie Tailored for Fast Prefix Searches.
  8. On Dynamic Succinct Graph Representations.
  9. Practical Repetition-Aware Grammar Compression.
  10. Compressing and Randomly Accessing Sequences (note).
  11. Decompressing Lempel-Ziv Compressed Text.
  12. Pattern Search in Grammar-Compressed Graphs.
  13. Approximating Optimal Bidirectional Macro Schemes.
  14. Towards Better Compressed Representations.
  15. Semantrix: A Compressed Semantic Matrix.

DCC 2019

  1. Generalized Word Equations: A New Approach to Data Compression.
  2. LZRR: LZ77 Parsing with Right Reference.
  3. Constructing Antidictionaries in Output-Sensitive Space.
  4. MR-RePair: Grammar Compression Based on Maximal Repeats.
  5. RePair in Compressed Space and Time.
  6. Dv2v: A Dynamic Variable-to-Variable Compressor.
  7. Better Than Optimal Huffman Coding?
  8. Tunneling on Wheeler Graphs.
  9. Multidimensional Compression with Pattern Matching.
  10. Parameterized Text Indexing with One Wildcard.
  11. Space-Efficient Computation of the Burrows-Wheeler Transform.
  12. Regular Expression Search on Compressed Text.
  13. Numerical Pattern Mining Through Compression.
  14. Vectorizing Fast Compression.
  15. BWT Tunnel Planning is Hard But Manageable.
  16. On Lempel-Ziv Decompression in Small Space.
  17. A Compact Representation of Raster Time Series.
  18. On the Randomness of Compressed Data.
  19. Light Field Image Compression with Random Access.
  20. A New Technique for Lossless Compression of Color Images Based on Hierarchical Prediction, Inversion and Context Adaptive Coding.
  21. Practical Indexing of Repetitive Collections Using Relative Lempel-Ziv.
  22. Selective Dynamic Compression.

DCC 2018

  1. K-Means Algorithm Over Compressed Binary Data.
  2. Engineering Compressed Static Functions.
  3. A Hybrid Approach for Wind Tunnel Data Compression.
  4. Compact Representations of Event Sequences.
  5. Run Compressed Rank/Select for Large Alphabets.
  6. Optimal In-Place Suffix Sorting.
  7. Lapped Transforms Based Image Recovery for Block Compressed Sensing.
  8. Two-Dimensional Block Trees.
  9. LZ77 Like Lossy Transformation of Quality Scores.
  10. A Grammar Compression Algorithm Based on Induced Suffix Sorting.
  11. Delta-Huffman Coding of Unbounded Integers.
  12. Exploiting Computation-Friendly Graph Compression Methods for Adjacency-Matrix Multiplication.
  13. A Dynamic Compressed Self-Index for Highly Repetitive Text Collections.
  14. Practical Succinct Text Indexes in External Memory.
  15. The Bits Between Proteins.
  16. Compressed Hierarchical Clustering.
  17. Compaction of Church Numerals for Higher-Order Compression.
  18. Fast and Efficient Compression of Next Generation Sequencing Data.
  19. Efficient Processing of top-K Vector-Raster Queries Over Compressed Data.
  20. Constant Delay Traversal of Compressed Graphs.
  21. Compact Encoding for Galled-Trees and Its Applications.
  22. Fibonacci Based Compressed Suffix Array.

DCC 2017

  1. Content Adaptive Embedded Compression.
  2. Full Compressed Affix Tree Representations.
  3. Symmetry-Compressible Graphs.
  4. Optimize Genomics Data Compression with Hardware Accelerator.
  5. Stabbing Colors in One Dimension.
  6. Complementary Contextual Models with FM-Index for DNA Compression.
  7. Improved Parallel Construction of Wavelet Trees and Rank/Select Structures.
  8. Improvements on Re-Pair Grammar Compressor.
  9. Marlin: A High Throughput Variable-to-Fixed Codec Using Plurally Parsable Dictionaries.
  10. A Succinct Data Structure for Multidimensional Orthogonal Range Searching.
  11. Making Compression Algorithms for Unicode Text.
  12. A Compact Index for Order-Preserving Pattern Matching.
  13. LZ-End Parsing in Compressed Space.
  14. Compressed Dynamic Range Majority Data Structures.
  15. Space-Efficient Re-Pair Compression.
  16. Streaming K-Mismatch with Error Correcting and Applications.

DCC 2016

  1. Shortest DNA Cyclic Cover in Compressed Space.
  2. Efficient Compression of Genomic Sequences.
  3. Burrows-Wheeler Transform for Terabases.
  4. Self-Indexing RDF Archives.
  5. Parallel Lightweight Wavelet Tree, Suffix Array and FM-Index Construction.
  6. Lossy Compression of Unordered Rooted Trees.
  7. Efficient Environmental Temperature Monitoring Using Compressed Sensing.
  8. Improved Range Minimum Queries.
  9. Quick Access to Compressed Data in Storage Systems.
  10. Compressing Combinatorial Objects.
  11. Faster, Minuter.
  12. CS2A: A Compressed Suffix Array-Based Method for Short Read Alignment.
  13. Linear Time Succinct Indexable Dictionary Construction with Applications.
  14. A Simple and Efficient Approach for Adaptive Entropy Coding over Large Alphabets.
  15. Small Polygon Compression.
  16. Timeliness in Lossless Block Coding.
  17. When Less is More - Using Restricted Repetition Search in Fast Compressors.
  18. A Space Efficient Direct Access Data Structure.
  19. Analysis of a Rewriting Compression System for Flash Memory.
  20. Induced Suffix Sorting for String Collections.
  21. Online Grammar Transformation Based on Re-Pair Algorithm.
  22. Lempel-Ziv Computation in Compressed Space (LZ-CICS).
  23. Approximate String Matching for Self-Indexes.
  24. Positional Inverted Self-index.
  25. Hardware Based Compression in Big Data.
  26. Computing LZ77 in Run-Compressed Space.
  27. Grammatical Ziv-Lempel Compression: Achieving PPM-Class Text Compression Ratios with LZ-Class Decompression Speed.

DCC 2015

  1. Geometric Compression of Orientation Signals for Fast Gesture Analysis.
  2. Range Selection Queries in Data Aware Space and Time.
  3. Compression of Next Generation Sequencing Data.
  4. On Probability Estimation via Relative Frequencies and Discount.
  5. Variable-Order de Bruijn Graphs.
  6. OnlineRePair: A Recompressor for XML Structures.
  7. Document Counting in Compressed Space.
  8. Queries on LZ-Bounded Encodings.
  9. Enhanced Direct Access to Huffman Encoded Files.
  10. Incremental Locality and Clustering-Based Compression.
  11. Parallel Wavelet Tree Construction.
  12. Smaller and Faster: Parallel Processing of Compressed Graphs with Ligra+.
  13. Faster Compressed Quadtrees.
  14. Bi-Directional Context Modeling with Combinatorial Structuring for Genome Sequence Compression.
  15. Compression for Similarity Identification: Computing the Error Exponent.
  16. Serializing RDF in Compressed Space.
  17. Improving PPM with Dynamic Parameter Updates.
  18. Data Compression Cost Optimization.
  19. Universal Compression of Memoryless Sources over Large Alphabets via Independent Component Analysis.
  20. Compressing Yahoo Mail.
  21. Efficient Set Operations over k2-Trees.
  22. Compression-Aware Algorithms for Massive Datasets.

DCC 2014

  1. Hybrid Compression of Bitvectors for the FM-Index.
  2. Boosting the Compression of Rewriting on Flash Memory.
  3. Interleaved K2-Tree: Indexing and Navigating Ternary Relations.
  4. Entropy Reduction Using Context Transformations.
  5. Adaptive Dictionary Sharing Method for Re-Pair Algorithm.
  6. Combining Deduplication and Delta Compression to Achieve Low-Overhead Data Reduction on Backup Datasets.
  7. Towards Markup-Aware Text Compression.
  8. Information Profiles for DNA Pattern Discovery.
  9. Fast Fully-Compressed Suffix Trees.
  10. Alignment Free Sequence Similarity with Bounded Hamming Distance.
  11. Better Compression through Better List Update Algorithms.
  12. Compressing Similar Biological Sequences Using FM-Index.
  13. Universal Text Preprocessing and Postprocessing for PPM Using Alphabet Adjustment.
  14. Compressing Sets and Multisets of Sequences.
  15. Relative Lempel-Ziv with Constant-Time Random Access.
  16. Direct Access to Variable-to-Fixed Length Codes with a Succinct Index.
  17. Fully Online Grammar Compression in Constant Space.
  18. LZ-Compressed String Dictionaries.
  19. Lempel-Ziv Parsing in External Memory.
  20. A Practical Implementation of Compressed Suffix Arrays with Applications to Self-Indexing.
  21. Compression Schemes for Similarity Queries.
  22. Space Efficient Linear Time Lempel-Ziv Factorization for Small Alphabets.

DCC 2013

  1. Quadratic Similarity Queries on Compressed Data.
  2. Faster Compact Top-k Document Retrieval.
  3. Texture Compression.
  4. Faster Compressed Top-k Document Retrieval.
  5. The Rightmost Equal-Cost Position Problem.
  6. Random Extraction from Compressed Data - A Practical Study.
  7. Practical Parallel Lempel-Ziv Factorization.
  8. Variable-to-Fixed-Length Encoding for Large Texts Using Re-Pair Algorithm with Shared Dictionaries.
  9. Compressed Parameterized Pattern Matching.
  10. A Simple Online Competitive Adaptation of Lempel-Ziv Compression with Efficient Random Access Support.
  11. Algorithms for Compressed Inputs.
  12. Effective Variable-Length-to-Fixed-Length Coding via a Re-Pair Algorithm.
  13. An Adaptive Difference Distribution-Based Coding with Hierarchical Tree Structure for DNA Sequence Compression.
  14. Simpler and Faster Lempel Ziv Factorization.
  15. Computing Convolution on Grammar-Compressed Text.
  16. Space-Efficient Construction Algorithm for the Circular Suffix Tree.
  17. Partition Tree Weighting.
  18. Compressing Huffman Models on Large Alphabets.
  19. From Run Length Encoding to LZ78 and Back Again.

DCC 2012

  1. A Cuckoo Hashing Variant with Improved Memory Utilization and Insertion Time.
  2. Adaptive Context Tree Weighting.
  3. Differentially Encoded Search Trees.
  4. Fast Insertion and Deletion in Compressed Texts.
  5. Compressed Dynamic Binary Relations.
  6. Mixing Strategies in Data Compression.
  7. Fast Construction of Nearly-Optimal Prefix Codes without Probability Sorting.
  8. Gipfeli - High Speed Compression Algorithm.
  9. Slashing the Time for BWT Inversion.

DCC 2011

  1. Tree Structure Compression with RePair.
  2. Improving PPM Algorithm Using Dictionaries.
  3. On Performance of Compressed Pattern Matching on VF Codes.
  4. Search and Modification in Compressed Texts.
  5. Compressed Context Modeling for Text Compression.
  6. Sliding Window Update Using Suffix Arrays.
  7. Compressed Property Suffix Trees.
  8. Compressed Index for Property Matching.
  9. Deplump for Streaming Data.
  10. Lossless Data Compression Testbed: ExCom and Prague Corpus.
  11. Error Recovery Method for PPM Compressed Data.
  12. The String-to-Dictionary Matching Problem.
  13. Color Image Compression Using a Learned Dictionary of Pairs of Orthonormal Bases.
  14. Mixing Deduplication and Compression on Active Data Sets.
  15. Coding of Sets of Words.
  16. Sequence Similarity by Gapped LZW.

DCC 2010

  1. Advantages of Shared Data Structures for Sequences of Balanced Parentheses.
  2. gFPC: A Self-Tuning Compression Algorithm.
  3. I/O-Efficient Compressed Text Indexes: From Theory to Practice.
  4. A Pseudo-Random Number Generator Based on LZSS.
  5. Xampling: Analog Data Compression.
  6. Efficient Algorithms for Constructing Optimal Bi-directional Context Sets.
  7. Modelling Parallel Texts for Boosting Compression.
  8. A Similarity Measure Using Smallest Context-Free Grammars.
  9. Lossless Data Compression via Substring Enumeration.
  10. On Computation of Performance Bounds of Optimal Index Assignment.
  11. File-Size Preserving LZ Encoding for Reversible Data Embedding.
  12. Local Modeling for WebGraph Compression.
  13. Bidirectional Delta Files.
  14. LZ77-Like Compression with Fast Random Access.
  15. Neural Markovian Predictive Compression: An Algorithm for Online Lossless Data Compression.
  16. Lossless Compression of Maps, Charts, and Graphs via Color Separation.
  17. A New Searchable Variable-to-Variable Compressor.
  18. Optimum String Match Choices in LZSS.

DCC 2009

  1. Suffix Tree Based VF-Coding for Compressed Pattern Matching.
  2. Low-Memory Adaptive Prefix Coding.

DCC 2008

  1. List Update Algorithms for Data Compression.
  2. Word-Based Statistical Compressors as Natural Language Compression Boosters.
  3. Re-pair Achieves High-Order Entropy.
  4. On Self-Indexing Images - Image Compression with Added Value.
  5. All-Match LZ77 Bit Recycling.

DCC 2007

  1. Compressed Delta Encoding for LZSS Encoded Files.
  2. Simple Linear-Time Off-Line Text Compression by Longest-First Substitution.
  3. Bit Recycling with Prefix Codes.

DCC 2006

  1. Compressed Data Structures: Dictionaries and Data-Aware Measures.
  2. Modeling Delta Encoding of Compressed Files.

DCC 2005

  1. Compressed Pattern Matching in JPEG Images.
  2. Real-Time Traversal in Grammar-Based Compressed Files.
  3. The Performance of Linear Time Suffix Sorting Algorithms.

DCC 2003

  1. In-Place Differential File Compression.

DCC 2002

  1. Searching in Compressed Dictionaries.