StringologyTimes
DCC for Stringologist
DCC 2023
JARVIS2: a data compressor for large genome sequences.
Augmented Thresholds for MONI.
Model Compression for Data Compression: Neural Network Based Lossless Compressor Made Practical.
Permutation coding using divide-and-conquer strategy.
Practical Implementations of Compressed RAM.
Computing matching statistics on Wheeler DFAs.
LZ4r - A New Fast Compression Algorithm for High-Speed Data Storage Systems.
Constructing the CDAWG CFG using LCP-Intervals.
FM-Directories: Extending the Burrows-Wheeler Transform for String Labeled Vertex Graphs of (Almost) Arbitrary Topology.
Measuring the Similarity of Files by Data Compression.
Computing the optimal BWT of very large string collections.
Contextual Pattern Matching in Less Space.
Bit-Parallel (Compressed) Wavelet Tree Construction.
RNA secondary structures: from ab initio prediction to better compression, and back.
SnappyR: A New High-Speed Lossless Data Compression Algorithm.
Recursive Prefix-Free Parsing for Building Big BWTs.
DCC 2022
Burrows-Wheeler Transform on Purely Morphic Words.
Succinct Data Structure for Path Graphs.
x3: Lossless Data Compressor.
On Dynamic Bitvector Implementations.
Linear-time Minimization of Wheeler DFAs.
Simple Worst-Case Optimal Adaptive Prefix-Free Coding.
FM-Indexing Grammars Induced by Suffix Sorting for Long Patterns.
Fast Coding of Haar Wavelet Trees.
CSTs for Terabyte-Sized Data.
Graphs can be succinctly indexed for pattern matching in $O(\vert E\vert ^{2}+\vert V\vert ^{5/2})$ time.
A Benchmark of Entropy Coders for the Compression of Genome Sequencing Data.
RLBWT Tricks.
Computing Lexicographic Parsings.
Lower Bounds for Lexicographical DFS Data Structures.
Converting RLBWT to LZ77 in smaller space.
On different variants of the Burrows-Wheeler-Transform of string collections.
Computing Matching Statistics on Repetitive Texts.
Compressing the Tree of Canonical Huffman Coding.
HOLZ: High-Order Entropy Encoding of Lempel-Ziv Factor Distances.
Selective Weighted Adaptive Coding.
DCC 2021
PHONI: Streamed Matching Statistics with Multi-Genome References.
Improving Run Length Encoding by Preprocessing.
Succinct representations of Intersection Graphs on a Circle.
A Disk-Based Index for Trajectories with an In-Memory Compressed Cache.
Neural Networks Optimally Compress the Sawbridge.
Backward Weighted Coding.
Parallel Processing of Grammar Compression.
Compact Representation of Spatial Hierarchies and Topological Relationships.
Smaller RLZ-Compressed Suffix Arrays.
On Elias-Fano for Rank Queries in FM-Indexes.
DZip: improved general-purpose lossless compression based on novel neural network modeling.
Succinct Data Structures for Small Clique-Width Graphs.
A grammar compressor for collections of reads with applications to the construction of the BWT.
End-to-End optimized image compression for machines, a study.
On Random Editing in LZ-End.
ndzip: A High-Throughput Parallel Lossless Compressor for Scientific Data.
Improved LZ77 Compression.
Approximate Hashing for Bioinformatics.
Efficiently Merging r-indexes.
Accelerating Knuth-Morris-Pratt String Matching over LZ77 Compressed Text.
DCC 2020
Semantrix: A Compressed Semantic Matrix.
Grammar Compression with Probabilistic Context-Free Grammar.
Revisiting Compact RDF Stores Based on $k^{2}$-Trees.
Pattern Search in Grammar-Compressed Graphs.
Decompressing Lempel-Ziv Compressed Text.
Towards Better Compressed Representations.
c-Trie++: A Dynamic Trie Tailored for Fast Prefix Searches.
Bitvectors with Runs and the Successor/Predecessor Problem.
Edge Minimization in de Bruijn Graphs.
Practical Repetition-Aware Grammar Compression.
On Dynamic Succinct Graph Representations.
Approximating Optimal Bidirectional Macro Schemes.
Re-Pair in Small Space.
Compact Representation of Graphs with Small Bandwidth and Treedepth.
Compressing and Randomly Accessing Sequences (note).
DCC 2019
On the Randomness of Compressed Data.
RePair in Compressed Space and Time.
Practical Indexing of Repetitive Collections Using Relative Lempel-Ziv.
On Lempel-Ziv Decompression in Small Space.
Space-Efficient Computation of the Burrows-Wheeler Transform.
Constructing Antidictionaries in Output-Sensitive Space.
MR-RePair: Grammar Compression Based on Maximal Repeats.
Better Than Optimal Huffman Coding?
Numerical Pattern Mining Through Compression.
Multidimensional Compression with Pattern Matching.
BWT Tunnel Planning is Hard But Manageable.
Vectorizing Fast Compression.
Light Field Image Compression with Random Access.
Regular Expression Search on Compressed Text.
Parameterized Text Indexing with One Wildcard.
Selective Dynamic Compression.
A New Technique for Lossless Compression of Color Images Based on Hierarchical Prediction, Inversion and Context Adaptive Coding.
LZRR: LZ77 Parsing with Right Reference.
Generalized Word Equations: A New Approach to Data Compression.
Tunneling on Wheeler Graphs.
A Compact Representation of Raster Time Series.
Dv2v: A Dynamic Variable-to-Variable Compressor.
DCC 2018
Run Compressed Rank/Select for Large Alphabets.
Delta-Huffman Coding of Unbounded Integers.
Optimal In-Place Suffix Sorting.
Compressed Hierarchical Clustering.
K-Means Algorithm Over Compressed Binary Data.
Two-Dimensional Block Trees.
Constant Delay Traversal of Compressed Graphs.
Compact Representations of Event Sequences.
Lapped Transforms Based Image Recovery for Block Compressed Sensing.
Efficient Processing of top-K Vector-Raster Queries Over Compressed Data.
A Grammar Compression Algorithm Based on Induced Suffix Sorting.
Practical Succinct Text Indexes in External Memory.
Compaction of Church Numerals for Higher-Order Compression.
Exploiting Computation-Friendly Graph Compression Methods for Adjacency-Matrix Multiplication.
LZ77 Like Lossy Transformation of Quality Scores.
Compact Encoding for Galled-Trees and Its Applications.
Fibonacci Based Compressed Suffix Array.
A Dynamic Compressed Self-Index for Highly Repetitive Text Collections.
Fast and Efficient Compression of Next Generation Sequencing Data.
Engineering Compressed Static Functions.
The Bits Between Proteins.
A Hybrid Approach for Wind Tunnel Data Compression.
DCC 2017
Marlin: A High Throughput Variable-to-Fixed Codec Using Plurally Parsable Dictionaries.
A Compact Index for Order-Preserving Pattern Matching.
Making Compression Algorithms for Unicode Text.
Content Adaptive Embedded Compression.
LZ-End Parsing in Compressed Space.
Stabbing Colors in One Dimension.
Space-Efficient Re-Pair Compression.
Improved Parallel Construction of Wavelet Trees and Rank/Select Structures.
Improvements on Re-Pair Grammar Compressor.
Optimize Genomics Data Compression with Hardware Accelerator.
Complementary Contextual Models with FM-Index for DNA Compression.
Compressed Dynamic Range Majority Data Structures.
Streaming K-Mismatch with Error Correcting and Applications.
A Succinct Data Structure for Multidimensional Orthogonal Range Searching.
Full Compressed Affix Tree Representations.
Symmetry-Compressible Graphs.
DCC 2016
Grammatical Ziv-Lempel Compression: Achieving PPM-Class Text Compression Ratios with LZ-Class Decompression Speed.
Self-Indexing RDF Archives.
Efficient Environmental Temperature Monitoring Using Compressed Sensing.
Induced Suffix Sorting for String Collections.
Shortest DNA Cyclic Cover in Compressed Space.
CS2A: A Compressed Suffix Array-Based Method for Short Read Alignment.
Positional Inverted Self-index.
Timeliness in Lossless Block Coding.
A Simple and Efficient Approach for Adaptive Entropy Coding over Large Alphabets.
Lempel-Ziv Computation in Compressed Space (LZ-CICS).
Linear Time Succinct Indexable Dictionary Construction with Applications.
When Less is More - Using Restricted Repetition Search in Fast Compressors.
Lossy Compression of Unordered Rooted Trees.
Approximate String Matching for Self-Indexes.
Burrows-Wheeler Transform for Terabases.
Efficient Compression of Genomic Sequences.
Computing LZ77 in Run-Compressed Space.
Quick Access to Compressed Data in Storage Systems.
Analysis of a Rewriting Compression System for Flash Memory.
Parallel Lightweight Wavelet Tree, Suffix Array and FM-Index Construction.
A Space Efficient Direct Access Data Structure.
Online Grammar Transformation Based on Re-Pair Algorithm.
Improved Range Minimum Queries.
Hardware Based Compression in Big Data.
Small Polygon Compression.
Compressing Combinatorial Objects.
Faster, Minuter.
DCC 2015
Serializing RDF in Compressed Space.
Compression for Similarity Identification: Computing the Error Exponent.
Queries on LZ-Bounded Encodings.
OnlineRePair: A Recompressor for XML Structures.
Geometric Compression of Orientation Signals for Fast Gesture Analysis.
Document Counting in Compressed Space.
On Probability Estimation via Relative Frequencies and Discount.
Efficient Set Operations over $k^{2}$-Trees.
Smaller and Faster: Parallel Processing of Compressed Graphs with Ligra+.
Range Selection Queries in Data Aware Space and Time.
Improving PPM with Dynamic Parameter Updates.
Universal Compression of Memoryless Sources over Large Alphabets via Independent Component Analysis.
Compressing Yahoo Mail.
Parallel Wavelet Tree Construction.
Compression-Aware Algorithms for Massive Datasets.
Compression of Next Generation Sequencing Data.
Incremental Locality and Clustering-Based Compression.
Enhanced Direct Access to Huffman Encoded Files.
Data Compression Cost Optimization.
Variable-Order de Bruijn Graphs.
Faster Compressed Quadtrees.
Bi-Directional Context Modeling with Combinatorial Structuring for Genome Sequence Compression.
DCC 2014
Fully Online Grammar Compression in Constant Space.
Combining Deduplication and Delta Compression to Achieve Low-Overhead Data Reduction on Backup Datasets.
Hybrid Compression of Bitvectors for the FM-Index.
Compressing Similar Biological Sequences Using FM-Index.
LZ-Compressed String Dictionaries.
Relative Lempel-Ziv with Constant-Time Random Access.
Space Efficient Linear Time Lempel-Ziv Factorization for Small Alphabets.
Universal Text Preprocessing and Postprocessing for PPM Using Alphabet Adjustment.
Information Profiles for DNA Pattern Discovery.
Compressing Sets and Multisets of Sequences.
Lempel-Ziv Parsing in External Memory.
Direct Access to Variable-to-Fixed Length Codes with a Succinct Index.
Fast Fully-Compressed Suffix Trees.
Towards Markup-Aware Text Compression.
Interleaved K2-Tree: Indexing and Navigating Ternary Relations.
Compression Schemes for Similarity Queries.
Adaptive Dictionary Sharing Method for Re-Pair Algorithm.
A Practical Implementation of Compressed Suffix Arrays with Applications to Self-Indexing.
Alignment Free Sequence Similarity with Bounded Hamming Distance.
Entropy Reduction Using Context Transformations.
Better Compression through Better List Update Algorithms.
Boosting the Compression of Rewriting on Flash Memory.
DCC 2013
Practical Parallel Lempel-Ziv Factorization.
Partition Tree Weighting.
Random Extraction from Compressed Data - A Practical Study.
Compressed Parameterized Pattern Matching.
Quadratic Similarity Queries on Compressed Data.
Effective Variable-Length-to-Fixed-Length Coding via a Re-Pair Algorithm.
Faster Compressed Top-k Document Retrieval.
Space-Efficient Construction Algorithm for the Circular Suffix Tree.
Algorithms for Compressed Inputs.
A Simple Online Competitive Adaptation of Lempel-Ziv Compression with Efficient Random Access Support.
Variable-to-Fixed-Length Encoding for Large Texts Using Re-Pair Algorithm with Shared Dictionaries.
Computing Convolution on Grammar-Compressed Text.
An Adaptive Difference Distribution-Based Coding with Hierarchical Tree Structure for DNA Sequence Compression.
Texture Compression.
The Rightmost Equal-Cost Position Problem.
Compressing Huffman Models on Large Alphabets.
Faster Compact Top-k Document Retrieval.
From Run Length Encoding to LZ78 and Back Again.
Simpler and Faster Lempel Ziv Factorization.
DCC 2012
Fast Construction of Nearly-Optimal Prefix Codes without Probability Sorting.
Differentially Encoded Search Trees.
Compressed Dynamic Binary Relations.
Gipfeli - High Speed Compression Algorithm.
Fast Insertion and Deletion in Compressed Texts.
Mixing Strategies in Data Compression.
A Cuckoo Hashing Variant with Improved Memory Utilization and Insertion Time.
Adaptive Context Tree Weighting.
Slashing the Time for BWT Inversion.
DCC 2011
Color Image Compression Using a Learned Dictionary of Pairs of Orthonormal Bases.
Coding of Sets of Words.
On Performance of Compressed Pattern Matching on VF Codes.
Improving PPM Algorithm Using Dictionaries.
Tree Structure Compression with RePair.
Deplump for Streaming Data.
Compressed Context Modeling for Text Compression.
Compressed Index for Property Matching.
Sequence Similarity by Gapped LZW.
Error Recovery Method for PPM Compressed Data.
Mixing Deduplication and Compression on Active Data Sets.
Compressed Property Suffix Trees.
Search and Modification in Compressed Texts.
Sliding Window Update Using Suffix Arrays.
The String-to-Dictionary Matching Problem.
Lossless Data Compression Testbed: ExCom and Prague Corpus.
DCC 2010
A Pseudo-Random Number Generator Based on LZSS.
Neural Markovian Predictive Compression: An Algorithm for Online Lossless Data Compression.
Lossless Compression of Maps, Charts, and Graphs via Color Separation.
gFPC: A Self-Tuning Compression Algorithm.
Local Modeling for WebGraph Compression.
Xampling: Analog Data Compression.
Modelling Parallel Texts for Boosting Compression.
Lossless Data Compression via Substring Enumeration.
Bidirectional Delta Files.
Optimum String Match Choices in LZSS.
A Similarity Measure Using Smallest Context-Free Grammars.
Advantages of Shared Data Structures for Sequences of Balanced Parentheses.
Efficient Algorithms for Constructing Optimal Bi-directional Context Sets.
A New Searchable Variable-to-Variable Compressor.
On Computation of Performance Bounds of Optimal Index Assignment.
File-Size Preserving LZ Encoding for Reversible Data Embedding.
I/O-Efficient Compressed Text Indexes: From Theory to Practice.
LZ77-Like Compression with Fast Random Access.
DCC 2009
Suffix Tree Based VF-Coding for Compressed Pattern Matching.
Low-Memory Adaptive Prefix Coding.
DCC 2008
Re-pair Achieves High-Order Entropy.
Word-Based Statistical Compressors as Natural Language Compression Boosters.
On Self-Indexing Images - Image Compression with Added Value.
All-Match LZ77 Bit Recycling.
List Update Algorithms for Data Compression.
DCC 2007
Simple Linear-Time Off-Line Text Compression by Longest-First Substitution.
Compressed Delta Encoding for LZSS Encoded Files.
Bit Recycling with Prefix Codes.
DCC 2006
Modeling Delta Encoding of Compressed Files.
Compressed Data Structures: Dictionaries and Data-Aware Measures.
DCC 2005
The Performance of Linear Time Suffix Sorting Algorithms.
Real-Time Traversal in Grammar-Based Compressed Files.
Compressed Pattern Matching in JPEG Images.
DCC 2003
In-Place Differential File Compression.
DCC 2002
Searching in Compressed Dictionaries.