StringologyTimes

Papers for stringologist (2011)

Contents

ALENEX 2011

  1. A Closer Look at the Closest String and Closest Substring Problem.
  2. A Simple Parallel Cartesian Tree Algorithm and its Application to Suffix Tree Construction.
  3. Exact Pattern Matching with Feed-Forward Bloom Filters.
  4. Fast and Lightweight LCP-Array Construction Algorithms.

BCB 2011

  1. Approximate string-matching with a single gap for sequence alignment.
  2. DynMap: mapping short reads to multiple related genomes.

CCP 2011

  1. A Method to Ensure the Confidentiality of the Compressed Data.
  2. An Online Algorithm for Lightweight Grammar-Based Compression.
  3. Asymptotic Optimal Lossless Compression via the CSE Technique.
  4. Backwards Search in Context Bound Text Transformations.
  5. Cache Friendly Burrows-Wheeler Inversion.
  6. Combining Non-stationary Prediction, Optimization and Mixing for Data Compression.
  7. Generalized Witness Sets.
  8. Lempel-Ziv Data Compression on Parallel and Distributed Systems.
  9. Lossless Compression of Hyperspectral Imagery.
  10. Natural Language Compression per Blocks.
  11. Pattern Matching on Sparse Suffix Trees.
  12. Quick Estimation of Data Compression and De-duplication for Large Storage Systems.
  13. Straight-Line Programs: A Practical Test.
  14. Wavelet Trees: From Theory to Practice.

CIAA 2011

  1. Chrobak Normal Form Revisited, with Applications.
  2. Tree Template Matching in Ranked Ordered Trees by Pushdown Automata.

CIKM 2011

  1. Indexes for highly repetitive document collections.

COCOON 2011

  1. On the Right-Seed Array of a String.

CPM 2011

  1. A Coarse-to-Fine Approach to Computing the k-Best Viterbi Paths.
  2. A Combinatorial Model of Phyllotaxis Perturbations in Arabidopsis thaliana.
  3. A d-Step Approach for Distinct Squares in Strings.
  4. Algorithms on Grammar-Compressed Strings.
  5. Approximation Algorithms for Orienting Mixed Graphs.
  6. Automatic Discovery of Patterns in Media Content.
  7. Computational Regulatory Genomics.
  8. Counting Colours in Compressed Strings.
  9. Edit Distance with Duplications and Contractions Revisited.
  10. Efficient Matching of Biological Sequences Allowing for Non-overlapping Inversions.
  11. Efficient Seeds Computation Revisited.
  12. Fast Error-Tolerant Quartet Phylogeny Algorithms.
  13. Faster Subsequence and Don’t-Care Pattern Matching on Compressed Texts.
  14. Filling Scaffolds with Gene Repetitions: Maximizing the Number of Adjacencies.
  15. Finding Approximate and Constrained Motifs in Graphs.
  16. Forest Alignment with Affine Gaps and Anchors.
  17. Frequent Submap Discovery.
  18. Improved MAX SNP-Hard Results for Finding an Edit Distance between Unordered Trees.
  19. LRM-Trees: Compressed Indices, Adaptive Sorting, and Compressed Permutations.
  20. Lempel-Ziv Factorization Revisited.
  21. Lightweight BWT Construction for Very Large String Collections.
  22. On Wavelet Tree Construction.
  23. On the Weak Prefix-Search Problem.
  24. Palindrome Pattern Matching.
  25. Phylogenetic Footprinting and Consistent Sets of Local Aligments.
  26. Polynomial-Time Approximation Algorithms for Weighted LCS Problem.
  27. Quick Greedy Computation for Minimum Common String Partitions.
  28. Real-Time Streaming String-Matching.
  29. Restricted Common Superstring and Restricted Common Supersequence.
  30. Self-indexing Based on LZ77.
  31. Simple Real-Time Constant-Space String Matching.
  32. Space Lower Bounds for Online Pattern Matching.
  33. Sparse and Truncated Suffix Trees on Variable-Length Codes.
  34. String Comparison and Lyndon-Like Factorization Using V-Order in Linear Time.
  35. Substring Range Reporting.
  36. Succincter Text Indexing with Wildcards.
  37. Tractability Results for the Consecutive-Ones Property with Multiplicity.
  38. Tractability and Approximability of Maximal Strip Recovery.
  39. Unique Perfect Phylogeny Is NP-Hard.

DCC 2011

  1. Coding of Sets of Words.
  2. Color Image Compression Using a Learned Dictionary of Pairs of Orthonormal Bases.
  3. Compressed Context Modeling for Text Compression.
  4. Compressed Index for Property Matching.
  5. Compressed Property Suffix Trees.
  6. Deplump for Streaming Data.
  7. Error Recovery Method for PPM Compressed Data.
  8. Improving PPM Algorithm Using Dictionaries.
  9. Lossless Data Compression Testbed: ExCom and Prague Corpus.
  10. Mixing Deduplication and Compression on Active Data Sets.
  11. On Performance of Compressed Pattern Matching on VF Codes.
  12. Search and Modification in Compressed Texts.
  13. Sequence Similarity by Gapped LZW.
  14. Sliding Window Update Using Suffix Arrays.
  15. The String-to-Dictionary Matching Problem.
  16. Tree Structure Compression with RePair.

DSL 2011

  1. Specific “scientific” data structures, and their processing

Developments in Language Theory 2011

  1. Abelian Primitive Words.
  2. Avoiding Abelian Powers in Partial Words.
  3. On Highly Repetitive and Power Free Words.
  4. On Prefix Normal Words.

Discovery Science 2011

  1. Scalable Detection of Frequent Substrings by Grammar-Based Compression.

ESA 2011

  1. Alphabet-Independent Compressed Text Indexing.
  2. Distribution-Aware Compressed Full-Text Indexes.
  3. Pattern Matching in Lempel-Ziv Compressed Strings: Fast, Simple, and Deterministic.

FSTTCS 2011

  1. Optimal Packed String Matching.

ICALP (1) 2011

  1. Range Majority in Constant Time and Linear Space.

ISAAC 2011

  1. A Dynamic Stabbing-Max Data Structure with Sub-Logarithmic Query Time.
  2. A New Algorithm for the Characteristic String Problem under Loose Similarity Criteria.
  3. Compact Representation of Posets.
  4. Dynamic Range Majority Data Structures.
  5. Dynamic Range Selection in Linear Space.
  6. Encoding 2D Range Maximum Queries.
  7. External Memory Orthogonal Range Reporting with Fast Updates.
  8. Faster Approximate Pattern Matching in Compressed Repetitive Texts.
  9. Path Queries in Weighted Trees.
  10. Range LCP.
  11. Space-Efficient Data-Analysis Queries on Grids.
  12. Succinct Indexes for Circular Patterns.

IWOCA 2011

  1. A Unifying Property for Distribution-Sensitive Priority Queues.
  2. Parameterized Longest Previous Factor.
  3. Periods in Partial Words: An Algorithm.
  4. Two Constant-Factor-Optimal Realizations of Adaptive Heapsort.
  5. p-Suffix Sorting as Arithmetic Coding.

LATA 2011

  1. Improved Alignment Based Algorithm for Multilingual Text Compression.
  2. Unary Pattern Avoidance in Partial Words Dense with Holes.

MFCS 2011

  1. Compressed Word Problems for Inverse Monoids.
  2. On Minimising Automata with Errors.
  3. Periodicity Algorithms for Partial Words.
  4. The Bounded Search Tree Algorithm for the Closest String Problem Has Quadratic Smoothed Complexity.

SEA 2011

  1. An Empirical Evaluation of Extendible Arrays.
  2. Compressed String Dictionaries.
  3. Online Dictionary Matching with Variable-Length Gaps.
  4. Practical Compressed Document Retrieval.

SIGIR 2011

  1. Sample selection for dictionary-based corpus compression.

SISAP 2011

  1. Succinct nearest neighbor search.

SODA 2011

  1. Improved Space Bounds for Cache-Oblivious Range Reporting.
  2. Optimal pattern matching in LZW compressed strings.
  3. Ordered and Unordered Top-K Range Reporting in Large Data Sets.
  4. Persistent Predecessor Search and Orthogonal point Location on the Word RAM.
  5. Random Access to grammar-Compressed Strings.
  6. Top-K Color Queries for Document Retrieval.

SOFSEM 2011

  1. An Improved B+ Tree for Flash File Systems.
  2. In-Place Sorting.

SPIRE 2011

  1. A Knowledge-Based Semantic Kernel for Text Classification.
  2. A Learned Approach for Ranking News in Real-Time Using the Blogosphere.
  3. A Multi-faceted Approach to Query Intent Classification.
  4. A New Approach for Verifying URL Uniqueness in Web Crawlers.
  5. A Succinct Index for Hypertext.
  6. Approximate Point Set Pattern Matching with L p -Norm.
  7. Approximate Regular Expression Matching with Multi-strings.
  8. Approximations and Partial Solutions for the Consensus Sequence Problem.
  9. Attribute Retrieval from Relational Web Tables.
  10. COCA Filters: Co-occurrence Aware Bloom Filters.
  11. Candidate Document Retrieval for Web-Scale Text Reuse Detection.
  12. Compressed Indexes for Aligned Pattern Matching.
  13. Compressed Text Indexing with Wildcards.
  14. Computing All Subtree Repeats in Ordered Ranked Trees.
  15. Computing the Longest Common Prefix Array Based on the Burrows-Wheeler Transform.
  16. Constructing Strings at the Nano Scale via Staged Self-assembly.
  17. Cross-Lingual Text Fragment Alignment Using Divergence from Randomness.
  18. Detecting Health Events on the Social Web to Enable Epidemic Intelligence.
  19. Discounted Cumulative Gain and User Decision Models.
  20. ESP-Index: A Compressed Index Based on Edit-Sensitive Parsing.
  21. Enhancing Document Snippets Using Temporal Information.
  22. External Query Reformulation for Text-Based Image Retrieval.
  23. Fast Computation of a String Duplication History under No-Breakpoint-Reuse - (Extended Abstract).
  24. Fast q-gram Mining on SLP Compressed Strings.
  25. Finding Frequent Elements in Compressed 2D Arrays and Strings.
  26. Fixed Block Compression Boosting in FM-Indexes.
  27. Improved Compressed Indexes for Full-Text Document Retrieval.
  28. Indexing with Gaps.
  29. Navigating the User Query Space.
  30. Near Real-Time Suffix Tree Construction via the Fringe Marked Ancestor Problem.
  31. On Suffix Extensions in Suffix Trees.
  32. On-Line Construction of Position Heaps.
  33. Persistency in Suffix Trees with Applications to String Interval Problems.
  34. Query-Sets + + : A Scalable Approach for Modeling Web Sites.
  35. Reference Sequence Construction for Relative Compression of Genomes.
  36. Space Efficient Wavelet Tree Construction.
  37. Spaced Seeds Design Using Perfect Rulers.
  38. Sparse Spatial Selection for Novelty-Based Search Result Diversification.
  39. Succinct Gapped Suffix Arrays.
  40. Weighted Shortest Common Supersequence.
  41. When Was It Written? Automatically Determining Publication Dates.

STACS 2011

  1. On Minimal Sturmian Partial Words.

STOC 2011

  1. The power of simple tabulation hashing.

Stringology 2011

  1. 2001-2010: Ten Years of Exact String Matching Algorithms.
  2. A Parameterized Formulation for the Maximum Number of Runs Problem.
  3. Algorithmics of Posets Generated by Words over Partially Commutative Alphabets.
  4. An Improved Version of the Runs Algorithm Based on Crochemore’s Partitioning Algorithm.
  5. Analyzing Edit Distance on Trees: Tree Swap Distance is Intractable.
  6. Binary Image Compression via Monochromatic Pattern Substitution: A Sequential Speed-Up.
  7. Computing Abelian Periods in Words.
  8. Computing Longest Common Substring/Subsequence of Non-linear Texts.
  9. Computing the Number of Cubic Runs in Standard Sturmian Words.
  10. Efficient Eager XPath Filtering over XML Streams.
  11. Finding Long and Multiple Repeats with Edit Distance.
  12. Improving Deduplication Techniques by Accelerating Remainder Calculations.
  13. Improving Exact Search of Multiple Patterns From a Compressed Suffix Array.
  14. Inexact Graph Matching by “Geodesic Hashing” for the Alignment of Pseudoknoted RNA Secondary Structures.
  15. Inferring Strings from Suffix Trees and Links on a Binary Alphabet.
  16. Minimization of Acyclic DFAs.
  17. Notes on Sequence Binary Decision Diagrams: Relationship to Acyclic Automata and Complexities of Binary Set Operations.
  18. Observations On Compressed Pattern-Matching with Ranked Variables in Zimin Words.
  19. On Compile Time Knuth-Morris-Pratt Precomputation.
  20. Variations of Forward-SBNDM.

WADS 2011

  1. Space Efficient Data Structures for Dynamic Orthogonal Range Counting.

WALCOM 2011

  1. De Bruijn Sequences for the Binary Strings with Maximum Density.
  2. Efficient Top-k Queries for Orthogonal Ranges.

ACM J. Exp. Algorithmics 2011

  1. Theory and practice of monotone minimal perfect hashing.

ACM Trans. Algorithms 2011

  1. Fully compressed suffix trees.
  2. Succinct indexes for strings, binary relations and multilabeled trees.
  3. The tree inclusion problem: In linear space and faster.

Algorithmica 2011

  1. On Optimally Partitioning a Text to Improve Its Compression.

Algorithms 2011

  1. Compressed Matching in Dictionaries.
  2. Edit Distance with Block Deletions.

Bioinform. 2011

  1. Compression of DNA sequence reads in FASTQ format.

CoRR 2011

  1. A Compressed Self-Index for Genomic Databases
  2. A Dynamic Stabbing-Max Data Structure with Sub-Logarithmic Query Time
  3. A Faster LZ77-Based Index
  4. A Regularity Measure for Context Free Grammars
  5. A Searchable Compressed Edit-Sensitive Parsing
  6. A Simple Linear-Space Data Structure for Constant-Time Range Minimum Query
  7. Algorithms for Jumbled Pattern Matching in Strings
  8. An Improved Move-To-Front(IMTF) Off-line Algorithm for the List Accessing Problem
  9. An implementation of range trees with fractional cascading in C++
  10. Anomaly Sequences Detection from Logs Based on Compression
  11. Approximating Edit Distance in Near-Linear Time
  12. Compressed String Dictionaries
  13. Computing a Longest Common Palindromic Subsequence
  14. Computing on Binary Strings
  15. Computing q-gram Frequencies on Collage Systems
  16. Computing q-gram Non-overlapping Frequencies on SLP Compressed Texts
  17. De-amortizing Binary Search Trees
  18. Don’t Rush into a Union: Take Time to Find Your Roots
  19. Dynamic Range Majority Data Structures
  20. Dynamic Range Selection in Linear Space
  21. Edit Distance to Monotonicity in Sliding Windows
  22. Efficient Seeds Computation Revisited
  23. Encoding 2-D Range Maximum Queries
  24. External Memory Orthogonal Range Reporting with Fast Updates
  25. Fast $q$-gram Mining on SLP Compressed Strings
  26. Faster Approximate Pattern Matching in Compressed Repetitive Texts
  27. Faster fully compressed pattern matching by recompression
  28. Fixed Block Compression Boosting in FM-Indexes
  29. Genome Halving by Block Interchange
  30. I/O-Efficient Data Structures for Colored Range and Prefix Reporting
  31. Improved Grammar-Based Compressed Indexes
  32. Improved space-time tradeoffs for approximate full-text indexing with one edit error
  33. Inducing the LCP-Array
  34. K-sort: A new sorting algorithm that beats Heap sort for n <= 70 lakhs!
  35. Linear Time Inference of Strings from Cover Arrays using a Binary Alphabet
  36. Linear pattern matching on sparse suffix trees
  37. Linear-Space Data Structures for Range Mode Query in Arrays
  38. Lossless data compression on GPGPU architectures
  39. Mining Patterns in Networks using Homomorphism
  40. New Lower and Upper Bounds for Representing Sequences
  41. On Approximability of Block Sorting
  42. On Compressing Permutations and Adaptive Sorting
  43. On Dynamic Optimality for Binary Search Trees
  44. On minimising automata with errors
  45. On the Complexity of Approximate Sum of Sorted List
  46. On-line construction of position heaps
  47. Optimal Indexes for Sparse Bit Vectors
  48. Orthogonal Range Searching on the RAM, Revisited
  49. Partial Data Compression and Text Indexing via Optimal Suffix Multi-Selection
  50. Pattern Matching under Polynomial Transformation
  51. Pattern matching in Lempel-Ziv compressed strings: fast, simple, and deterministic
  52. Practical Top-K Document Retrieval in Reduced Space
  53. Privacy-Enhanced Methods for Comparing Compressed DNA Sequences
  54. Quadratic-time Algorithm for the String Constrained LCS Problem
  55. Random input helps searching predecessors
  56. Reference Sequence Construction for Relative Compression of Genomes
  57. Relative Lempel-Ziv Factorization for Efficient Storage and Retrieval of Web Collections
  58. Restructuring Compressed Texts without Explicit Decompression
  59. Self-Index Based on LZ77
  60. Self-Index based on LZ77 (thesis)
  61. Sorting Algorithms with Restrictions
  62. Space Lower Bounds for Online Pattern Matching
  63. Space-Efficient Data-Analysis Queries on Grids
  64. SparseAssembler: de novo Assembly with the Sparse de Bruijn Graph
  65. Stratified B-trees and versioning dictionaries
  66. String Indexing for Patterns with Wildcards
  67. String Matching with Variable Length Gaps
  68. Substring Range Reporting
  69. Succinct Representations of Permutations and Functions
  70. Succincter Text Indexing with Wildcards
  71. The Cell Probe Complexity of Dynamic Range Counting
  72. Tight lower bounds for online labeling problem
  73. Towards Optimal Sorting of 16 Elements
  74. Towards an Optimal Space-and-Query-Time Index for Top-$k$ Document Retrieval
  75. Tying up the loose ends in fully LZW-compressed pattern matching
  76. Uncommon Suffix Tries
  77. Upper Bounds for Maximally Greedy Binary Search Trees

Fundam. Informaticae 2011

  1. Self-Indexed Grammar-Based Compression.

Inf. Comput. 2011

  1. Space-efficient construction of Lempel-Ziv compressed text indexes.

Inf. Process. Lett. 2011

  1. Computing Longest Previous non-overlapping Factors.

Inf. Process. Manag. 2011

  1. Improving semistatic compression via phrase-based modeling.

Inf. Syst. 2011

  1. Fully dynamic metric access methods based on hyperplane partitioning.

Int. J. Found. Comput. Sci. 2011

  1. Stronger Quickheaps.

J. Discrete Algorithms 2011

  1. Fast searching in packed strings.
  2. Missing pattern discovery.
  3. Tight bounds for online stable sorting.

Probl. Inf. Transm. 2011

  1. Computing the longest common substring with one mismatch.

Proc. VLDB Endow. 2011

  1. Relative Lempel-Ziv Factorization for Efficient Storage and Retrieval of Web Collections.

Theor. Comput. Sci. 2011

  1. Approximate string matching with stuck address bits.
  2. On-line approximate string matching with bounded errors.
  3. Succinct data structures for Searchable Partial Sums with optimal worst-case performance.
  4. Succinct representation of dynamic trees.
  5. Verifying and enumerating parameterized border arrays.