StringologyTimes

DCC for Stringologist

DCC 2023

  1. Model Compression for Data Compression: Neural Network Based Lossless Compressor Made Practical.
  2. JARVIS2: a data compressor for large genome sequences.
  3. Constructing the CDAWG CFG using LCP-Intervals.
  4. Augmented Thresholds for MONI.
  5. FM-Directories: Extending the Burrows-Wheeler Transform for String Labeled Vertex Graphs of (Almost) Arbitrary Topology.
  6. Contextual Pattern Matching in Less Space.
  7. Practical Implementations of Compressed RAM.
  8. Permutation coding using divide-and-conquer strategy.
  9. Bit-Parallel (Compressed) Wavelet Tree Construction.
  10. SnappyR: A New High-Speed Lossless Data Compression Algorithm.
  11. RNA secondary structures: from ab initio prediction to better compression, and back.
  12. Recursive Prefix-Free Parsing for Building Big BWTs.
  13. Computing the optimal BWT of very large string collections.
  14. Measuring the Similarity of Files by Data Compression.
  15. LZ4r - A New Fast Compression Algorithm for High-Speed Data Storage Systems.
  16. Computing matching statistics on Wheeler DFAs.

DCC 2022

  1. Burrows-Wheeler Transform on Purely Morphic Words.
  2. Linear-time Minimization of Wheeler DFAs.
  3. FM-Indexing Grammars Induced by Suffix Sorting for Long Patterns.
  4. Fast Coding of Haar Wavelet Trees.
  5. CSTs for Terabyte-Sized Data.
  6. On Dynamic Bitvector Implementations.
  7. x3: Lossless Data Compressor.
  8. Compressing the Tree of Canonical Huffman Coding.
  9. Succinct Data Structure for Path Graphs.
  10. Computing Matching Statistics on Repetitive Texts.
  11. Lower Bounds for Lexicographical DFS Data Structures.
  12. Graphs can be succinctly indexed for pattern matching in $O(\vert E\vert ^{2}+\vert V\vert ^{5/2})$ time.
  13. RLBWT Tricks.
  14. Simple Worst-Case Optimal Adaptive Prefix-Free Coding.
  15. A Benchmark of Entropy Coders for the Compression of Genome Sequencing Data.
  16. Converting RLBWT to LZ77 in smaller space.
  17. HOLZ: High-Order Entropy Encoding of Lempel-Ziv Factor Distances.
  18. On different variants of the Burrows-Wheeler-Transform of string collections.
  19. Computing Lexicographic Parsings.
  20. Selective Weighted Adaptive Coding.

DCC 2021

  1. Improving Run Length Encoding by Preprocessing.
  2. A grammar compressor for collections of reads with applications to the construction of the BWT.
  3. Neural Networks Optimally Compress the Sawbridge.
  4. Improved LZ77 Compression.
  5. PHONI: Streamed Matching Statistics with Multi-Genome References.
  6. Compact Representation of Spatial Hierarchies and Topological Relationships.
  7. Succinct representations of Intersection Graphs on a Circle.
  8. Efficiently Merging r-indexes.
  9. Backward Weighted Coding.
  10. On Random Editing in LZ-End.
  11. Accelerating Knuth-Morris-Pratt String Matching over LZ77 Compressed Text.
  12. A Disk-Based Index for Trajectories with an In-Memory Compressed Cache.
  13. End-to-End optimized image compression for machines, a study.
  14. ndzip: A High-Throughput Parallel Lossless Compressor for Scientific Data.
  15. Approximate Hashing for Bioinformatics.
  16. DZip: improved general-purpose lossless compression based on novel neural network modeling.
  17. Succinct Data Structures for Small Clique-Width Graphs.
  18. Parallel Processing of Grammar Compression.
  19. Smaller RLZ-Compressed Suffix Arrays.
  20. On Elias-Fano for Rank Queries in FM-Indexes.

DCC 2020

  1. Edge Minimization in de Bruijn Graphs.
  2. Revisiting Compact RDF Stores Based on $k^2$-Trees.
  3. Compressing and Randomly Accessing Sequences (note).
  4. Bitvectors with Runs and the Successor/Predecessor Problem.
  5. Pattern Search in Grammar-Compressed Graphs.
  6. Approximating Optimal Bidirectional Macro Schemes.
  7. Practical Repetition-Aware Grammar Compression.
  8. Re-Pair in Small Space.
  9. Grammar Compression with Probabilistic Context-Free Grammar.
  10. Semantrix: A Compressed Semantic Matrix.
  11. Decompressing Lempel-Ziv Compressed Text.
  12. Towards Better Compressed Representations.
  13. Compact Representation of Graphs with Small Bandwidth and Treedepth.
  14. On Dynamic Succinct Graph Representations.
  15. c-Trie++: A Dynamic Trie Tailored for Fast Prefix Searches.

DCC 2019

  1. LZRR: LZ77 Parsing with Right Reference.
  2. On Lempel-Ziv Decompression in Small Space.
  3. Multidimensional Compression with Pattern Matching.
  4. RePair in Compressed Space and Time.
  5. Better Than Optimal Huffman Coding?
  6. Practical Indexing of Repetitive Collections Using Relative Lempel-Ziv.
  7. Selective Dynamic Compression.
  8. Dv2v: A Dynamic Variable-to-Variable Compressor.
  9. Regular Expression Search on Compressed Text.
  10. A Compact Representation of Raster Time Series.
  11. Light Field Image Compression with Random Access.
  12. Numerical Pattern Mining Through Compression.
  13. Tunneling on Wheeler Graphs.
  14. Constructing Antidictionaries in Output-Sensitive Space.
  15. BWT Tunnel Planning is Hard But Manageable.
  16. A New Technique for Lossless Compression of Color Images Based on Hierarchical Prediction, Inversion and Context Adaptive Coding.
  17. Generalized Word Equations: A New Approach to Data Compression.
  18. Space-Efficient Computation of the Burrows-Wheeler Transform.
  19. MR-RePair: Grammar Compression Based on Maximal Repeats.
  20. Vectorizing Fast Compression.
  21. On the Randomness of Compressed Data.
  22. Parameterized Text Indexing with One Wildcard.

DCC 2018

  1. Practical Succinct Text Indexes in External Memory.
  2. Compaction of Church Numerals for Higher-Order Compression.
  3. Compressed Hierarchical Clustering.
  4. Exploiting Computation-Friendly Graph Compression Methods for Adjacency-Matrix Multiplication.
  5. A Grammar Compression Algorithm Based on Induced Suffix Sorting.
  6. Run Compressed Rank/Select for Large Alphabets.
  7. Compact Encoding for Galled-Trees and Its Applications.
  8. Efficient Processing of top-K Vector-Raster Queries Over Compressed Data.
  9. Delta-Huffman Coding of Unbounded Integers.
  10. The Bits Between Proteins.
  11. Lapped Transforms Based Image Recovery for Block Compressed Sensing.
  12. K-Means Algorithm Over Compressed Binary Data.
  13. Compact Representations of Event Sequences.
  14. Engineering Compressed Static Functions.
  15. Optimal In-Place Suffix Sorting.
  16. Constant Delay Traversal of Compressed Graphs.
  17. Fast and Efficient Compression of Next Generation Sequencing Data.
  18. A Hybrid Approach for Wind Tunnel Data Compression.
  19. A Dynamic Compressed Self-Index for Highly Repetitive Text Collections.
  20. Two-Dimensional Block Trees.
  21. LZ77 Like Lossy Transformation of Quality Scores.
  22. Fibonacci Based Compressed Suffix Array.

DCC 2017

  1. A Succinct Data Structure for Multidimensional Orthogonal Range Searching.
  2. Symmetry-Compressible Graphs.
  3. Stabbing Colors in One Dimension.
  4. Improved Parallel Construction of Wavelet Trees and Rank/Select Structures.
  5. Making Compression Algorithms for Unicode Text.
  6. Improvements on Re-Pair Grammar Compressor.
  7. Marlin: A High Throughput Variable-to-Fixed Codec Using Plurally Parsable Dictionaries.
  8. A Compact Index for Order-Preserving Pattern Matching.
  9. Complementary Contextual Models with FM-Index for DNA Compression.
  10. Optimize Genomics Data Compression with Hardware Accelerator.
  11. Full Compressed Affix Tree Representations.
  12. Space-Efficient Re-Pair Compression.
  13. LZ-End Parsing in Compressed Space.
  14. Content Adaptive Embedded Compression.
  15. Streaming K-Mismatch with Error Correcting and Applications.
  16. Compressed Dynamic Range Majority Data Structures.

DCC 2016

  1. When Less is More - Using Restricted Repetition Search in Fast Compressors.
  2. Lempel-Ziv Computation in Compressed Space (LZ-CICS).
  3. Analysis of a Rewriting Compression System for Flash Memory.
  4. Efficient Environmental Temperature Monitoring Using Compressed Sensing.
  5. Faster, Minuter.
  6. Online Grammar Transformation Based on Re-Pair Algorithm.
  7. Grammatical Ziv-Lempel Compression: Achieving PPM-Class Text Compression Ratios with LZ-Class Decompression Speed.
  8. Approximate String Matching for Self-Indexes.
  9. Induced Suffix Sorting for String Collections.
  10. A Simple and Efficient Approach for Adaptive Entropy Coding over Large Alphabets.
  11. A Space Efficient Direct Access Data Structure.
  12. Linear Time Succinct Indexable Dictionary Construction with Applications.
  13. CS2A: A Compressed Suffix Array-Based Method for Short Read Alignment.
  14. Hardware Based Compression in Big Data.
  15. Small Polygon Compression.
  16. Positional Inverted Self-index.
  17. Shortest DNA Cyclic Cover in Compressed Space.
  18. Efficient Compression of Genomic Sequences.
  19. Lossy Compression of Unordered Rooted Trees.
  20. Burrows-Wheeler Transform for Terabases.
  21. Compressing Combinatorial Objects.
  22. Computing LZ77 in Run-Compressed Space.
  23. Quick Access to Compressed Data in Storage Systems.
  24. Improved Range Minimum Queries.
  25. Parallel Lightweight Wavelet Tree, Suffix Array and FM-Index Construction.
  26. Self-Indexing RDF Archives.
  27. Timeliness in Lossless Block Coding.

DCC 2015

  1. Serializing RDF in Compressed Space.
  2. On Probability Estimation via Relative Frequencies and Discount.
  3. Bi-Directional Context Modeling with Combinatorial Structuring for Genome Sequence Compression.
  4. Incremental Locality and Clustering-Based Compression.
  5. Queries on LZ-Bounded Encodings.
  6. Enhanced Direct Access to Huffman Encoded Files.
  7. Compression for Similarity Identification: Computing the Error Exponent.
  8. Compression of Next Generation Sequencing Data.
  9. Universal Compression of Memoryless Sources over Large Alphabets via Independent Component Analysis.
  10. Compressing Yahoo Mail.
  11. Range Selection Queries in Data Aware Space and Time.
  12. Smaller and Faster: Parallel Processing of Compressed Graphs with Ligra+.
  13. Geometric Compression of Orientation Signals for Fast Gesture Analysis.
  14. Efficient Set Operations over $k^2$-Trees.
  15. Document Counting in Compressed Space.
  16. Compression-Aware Algorithms for Massive Datasets.
  17. OnlineRePair: A Recompressor for XML Structures.
  18. Improving PPM with Dynamic Parameter Updates.
  19. Variable-Order de Bruijn Graphs.
  20. Faster Compressed Quadtrees.
  21. Parallel Wavelet Tree Construction.
  22. Data Compression Cost Optimization.

DCC 2014

  1. Boosting the Compression of Rewriting on Flash Memory.
  2. Lempel-Ziv Parsing in External Memory.
  3. Interleaved $K^2$-Tree: Indexing and Navigating Ternary Relations.
  4. A Practical Implementation of Compressed Suffix Arrays with Applications to Self-Indexing.
  5. Adaptive Dictionary Sharing Method for Re-Pair Algorithm.
  6. Entropy Reduction Using Context Transformations.
  7. Fully Online Grammar Compression in Constant Space.
  8. LZ-Compressed String Dictionaries.
  9. Better Compression through Better List Update Algorithms.
  10. Compression Schemes for Similarity Queries.
  11. Compressing Sets and Multisets of Sequences.
  12. Compressing Similar Biological Sequences Using FM-Index.
  13. Information Profiles for DNA Pattern Discovery.
  14. Hybrid Compression of Bitvectors for the FM-Index.
  15. Space Efficient Linear Time Lempel-Ziv Factorization for Small Alphabets.
  16. Fast Fully-Compressed Suffix Trees.
  17. Alignment Free Sequence Similarity with Bounded Hamming Distance.
  18. Towards Markup-Aware Text Compression.
  19. Direct Access to Variable-to-Fixed Length Codes with a Succinct Index.
  20. Universal Text Preprocessing and Postprocessing for PPM Using Alphabet Adjustment.
  21. Combining Deduplication and Delta Compression to Achieve Low-Overhead Data Reduction on Backup Datasets.
  22. Relative Lempel-Ziv with Constant-Time Random Access.

DCC 2013

  1. Partition Tree Weighting.
  2. A Simple Online Competitive Adaptation of Lempel-Ziv Compression with Efficient Random Access Support.
  3. Variable-to-Fixed-Length Encoding for Large Texts Using Re-Pair Algorithm with Shared Dictionaries.
  4. Simpler and Faster Lempel Ziv Factorization.
  5. Random Extraction from Compressed Data - A Practical Study.
  6. An Adaptive Difference Distribution-Based Coding with Hierarchical Tree Structure for DNA Sequence Compression.
  7. Compressing Huffman Models on Large Alphabets.
  8. Computing Convolution on Grammar-Compressed Text.
  9. Faster Compressed Top-k Document Retrieval.
  10. Compressed Parameterized Pattern Matching.
  11. Algorithms for Compressed Inputs.
  12. Quadratic Similarity Queries on Compressed Data.
  13. Texture Compression.
  14. Space-Efficient Construction Algorithm for the Circular Suffix Tree.
  15. From Run Length Encoding to LZ78 and Back Again.
  16. Effective Variable-Length-to-Fixed-Length Coding via a Re-Pair Algorithm.
  17. The Rightmost Equal-Cost Position Problem.
  18. Faster Compact Top-k Document Retrieval.
  19. Practical Parallel Lempel-Ziv Factorization.

DCC 2012

  1. Differentially Encoded Search Trees.
  2. Compressed Dynamic Binary Relations.
  3. Slashing the Time for BWT Inversion.
  4. Gipfeli - High Speed Compression Algorithm.
  5. A Cuckoo Hashing Variant with Improved Memory Utilization and Insertion Time.
  6. Fast Insertion and Deletion in Compressed Texts.
  7. Adaptive Context Tree Weighting.
  8. Mixing Strategies in Data Compression.
  9. Fast Construction of Nearly-Optimal Prefix Codes without Probability Sorting.

DCC 2011

  1. On Performance of Compressed Pattern Matching on VF Codes.
  2. Compressed Context Modeling for Text Compression.
  3. Error Recovery Method for PPM Compressed Data.
  4. Compressed Property Suffix Trees.
  5. The String-to-Dictionary Matching Problem.
  6. Mixing Deduplication and Compression on Active Data Sets.
  7. Search and Modification in Compressed Texts.
  8. Sliding Window Update Using Suffix Arrays.
  9. Compressed Index for Property Matching.
  10. Tree Structure Compression with RePair.
  11. Color Image Compression Using a Learned Dictionary of Pairs of Orthonormal Bases.
  12. Lossless Data Compression Testbed: ExCom and Prague Corpus.
  13. Coding of Sets of Words.
  14. Sequence Similarity by Gapped LZW.
  15. Improving PPM Algorithm Using Dictionaries.
  16. Deplump for Streaming Data.

DCC 2010

  1. Xampling: Analog Data Compression.
  2. A Pseudo-Random Number Generator Based on LZSS.
  3. Neural Markovian Predictive Compression: An Algorithm for Online Lossless Data Compression.
  4. Lossless Data Compression via Substring Enumeration.
  5. Lossless Compression of Maps, Charts, and Graphs via Color Separation.
  6. Modelling Parallel Texts for Boosting Compression.
  7. Optimum String Match Choices in LZSS.
  8. Advantages of Shared Data Structures for Sequences of Balanced Parentheses.
  9. A Similarity Measure Using Smallest Context-Free Grammars.
  10. Local Modeling for WebGraph Compression.
  11. Bidirectional Delta Files.
  12. A New Searchable Variable-to-Variable Compressor.
  13. On Computation of Performance Bounds of Optimal Index Assignment.
  14. LZ77-Like Compression with Fast Random Access.
  15. File-Size Preserving LZ Encoding for Reversible Data Embedding.
  16. Efficient Algorithms for Constructing Optimal Bi-directional Context Sets.
  17. I/O-Efficient Compressed Text Indexes: From Theory to Practice.
  18. gFPC: A Self-Tuning Compression Algorithm.

DCC 2009

  1. Suffix Tree Based VF-Coding for Compressed Pattern Matching.
  2. Low-Memory Adaptive Prefix Coding.

DCC 2008

  1. Word-Based Statistical Compressors as Natural Language Compression Boosters.
  2. List Update Algorithms for Data Compression.
  3. Re-pair Achieves High-Order Entropy.
  4. On Self-Indexing Images - Image Compression with Added Value.
  5. All-Match LZ77 Bit Recycling.

DCC 2007

  1. Simple Linear-Time Off-Line Text Compression by Longest-First Substitution.
  2. Bit Recycling with Prefix Codes.
  3. Compressed Delta Encoding for LZSS Encoded Files.

DCC 2006

  1. Compressed Data Structures: Dictionaries and Data-Aware Measures.
  2. Modeling Delta Encoding of Compressed Files.

DCC 2005

  1. Real-Time Traversal in Grammar-Based Compressed Files.
  2. Compressed Pattern Matching in JPEG Images.
  3. The Performance of Linear Time Suffix Sorting Algorithms.

DCC 2003

  1. In-Place Differential File Compression.

DCC 2002

  1. Searching in Compressed Dictionaries.