StringologyTimes

Papers for stringologist (2012)

Contents

AIAI (2) 2012

  1. GapMis-OMP: Pairwise Short-Read Alignment on Multi-core Architectures.

ALENEX 2012

  1. Computing a Consensus of Multilabeled Trees.
  2. Fast Compressed Tries through Path Decompositions.
  3. Solving the Minimum String Cover Problem.
  4. The Complexity of Partial Orders.

ANALCO 2012

  1. The Complexity of Partial Orders.

BCB 2012

  1. ERNE-BS5: aligning BS-treated sequences by multiple hits on a 5-letters alphabet.

BIBM Workshops 2012

  1. Libgapmis: An ultrafast library for short-read single-gap alignment.

COCOON 2012

  1. Multi-pattern Matching with Bidirectional Indexes.

CPM 2012

  1. A Linear Kernel for the Complementary Maximal Strip Recovery Problem.
  2. An Efficient Linear Pseudo-minimization Algorithm for Aho-Corasick Automata.
  3. Approximation Algorithms and Hardness Results for Shortest Path Based Graph Orientations.
  4. Compressed String Dictionary Look-Up with Edit Distance One.
  5. Computing the Burrows-Wheeler Transform of a String and Its Reverse.
  6. Computing the Rooted Triplet Distance between Galled Trees by Counting Triangles.
  7. Constant-Time Word-Size String Matching.
  8. Cross-Document Pattern Matching.
  9. Document Listing for Queries with Excluded Pattern.
  10. Efficient Algorithm for Circular Burrows-Wheeler Transform.
  11. Efficient Exponential Time Algorithms for Edit Distance between Unordered Trees.
  12. Efficient Two-Dimensional Pattern Matching with Scaling and Rotation and Higher-Order Interpolation.
  13. FEMTO: Fast Search of Large Sequence Collections.
  14. Faster and Simpler Minimal Conflicting Set Identification - (Extended Abstract).
  15. Finding Longest Common Segments in Protein Structures in Nearly Linear Time.
  16. Fixed-Parameter Algorithms for Finding Agreement Supertrees.
  17. Gene Regulation, Protein Networks and Disease: A Computational Perspective.
  18. Hardness of Longest Common Subsequence for Sequences with Bounded Run-Lengths.
  19. Impact of the Energy Model on the Complexity of RNA Folding with Pseudoknots.
  20. Least Random Suffix/Prefix Matches in Output-Sensitive Time.
  21. Local Exact Pattern Matching for Non-fixed RNA Structures.
  22. Minimum Leaf Removal for Reconciliation: Complexity and Algorithms.
  23. Near Linear Time Construction of an Approximate Index for All Maximum Consecutive Sub-sums of a Sequence.
  24. On Approximating String Selection Problems with Outliers.
  25. On the Closest String via Rank Distance.
  26. Partitioning into Colorful Components by Minimum Edge Deletions.
  27. Pattern Matching in Multiple Streams.
  28. Simple and Efficient LZW-Compressed Multiple Pattern Matching.
  29. Speeding Up q-Gram Mining on Grammar-Based Compressed Texts.
  30. The Complexity of String Partitioning.
  31. The Maximum Number of Squares in a Tree.
  32. The Parameterized Complexity of the Shared Center Problem.
  33. Time-Space Trade-Offs for Longest Common Extensions.
  34. Towards an Optimal Space-and-Query-Time Index for Top-k Document Retrieval.
  35. Wavelet Trees for All.

DCC 2012

  1. A Cuckoo Hashing Variant with Improved Memory Utilization and Insertion Time.
  2. Adaptive Context Tree Weighting.
  3. Compressed Dynamic Binary Relations.
  4. Differentially Encoded Search Trees.
  5. Fast Construction of Nearly-Optimal Prefix Codes without Probability Sorting.
  6. Fast Insertion and Deletion in Compressed Texts.
  7. Gipfeli - High Speed Compression Algorithm.
  8. Mixing Strategies in Data Compression.
  9. Slashing the Time for BWT Inversion.

Developments in Language Theory 2012

  1. Fine and Wilf’s Theorem for k-Abelian Periods.
  2. Pseudoperiodic Words.
  3. Squares in Binary Partial Words.

ECML/PKDD (2) 2012

  1. General Algorithms for Mining Closed Flexible Patterns under Various Equivalence Relations.

ESA 2012

  1. Efficient Communication Protocols for Deciding Edit Distance.
  2. New Lower and Upper Bounds for Representing Sequences.
  3. Succinct Data Structures for Path Queries.
  4. Succinct Posets.

FAW-AAIM 2012

  1. Fast Relative Lempel-Ziv Self-index for Similar Sequences.

FOCS 2012

  1. On-Line Indexing for General Alphabets via Predecessor Queries on Subsets of an Ordered List.

ICALP (1) 2012

  1. CRAM: Compressed Random Access Memory.
  2. De-amortizing Binary Search Trees.
  3. Faster Fully Compressed Pattern Matching by Recompression.
  4. Succinct Indices for Range Queries with Applications to Orthogonal Range Maxima.

ISAAC 2012

  1. A Framework for Succinct Labeled Ordinal Trees over Large Alphabets.
  2. A General Method for Improving Insertion-Based Adaptive Sorting.
  3. An Improved Algorithm for Static 3D Dominance Reporting in the Pointer Machine.
  4. Computing the Longest Common Subsequence of Two Run-Length Encoded Strings.
  5. Efficient Counting of Square Substrings in a Tree.
  6. Finger Search in the Implicit Model.

IWOCA 2012

  1. A Sequential Recursive Implementation of Dead-Zone Single Keyword Pattern Matching.
  2. Border Array for Structural Strings.
  3. Computing a Longest Common Palindromic Subsequence.
  4. Indexing Highly Repetitive Collections.
  5. Range Extremum Queries.

LATA 2012

  1. A Faster Grammar-Based Self-index.
  2. Longest Common Extensions via Fingerprinting.

LATIN 2012

  1. Forbidden Patterns.
  2. Indexed Multi-pattern Matching.

MFCS 2012

  1. Abelian Pattern Avoidance in Partial Words.
  2. Computing Lempel-Ziv Factorization Online.
  3. Fine and Wilf’s Theorem and Pseudo-repetitions.
  4. How to Reconstruct a Genome.

SEA 2012

  1. Branch Mispredictions Don’t Affect Mergesort.
  2. Dynamizing Succinct Tree Representations.
  3. Fast, Small, Simple Rank/Select on Bitmaps.
  4. Space Efficient Modifications to Structator - A Fast Index-Based Search Tool for RNA Sequence-Structure Patterns.
  5. Space-Efficient Top-k Document Retrieval.

SODA 2012

  1. A linear time algorithm for seeds computation.
  2. Fully persistent B-trees.
  3. I/O-efficient data structures for colored range and prefix reporting.
  4. Top-k document retrieval in optimal time and linear space.
  5. Using hashing to solve the dictionary problem.

SOFSEM 2012

  1. Computing q-Gram Non-overlapping Frequencies on SLP Compressed Texts.

SPIRE 2012

  1. A Study on Novelty Evaluation in Biomedical Information Retrieval.
  2. A Zipf-Like Distant Supervision Approach for Multi-document Summarization Using Wikinews Articles.
  3. Active Microbloggers: Identifying Influencers, Leaders and Discussers in Microblogging Networks.
  4. Approximate Function Matching under δ- and γ- Distances.
  5. Approximate Period Detection and Correction.
  6. Basic Word Completion and Prediction for Hebrew.
  7. Characterization and Extraction of Irredundant Tandem Motifs.
  8. Clustering Heterogeneous Data with Mutual Semi-supervision.
  9. Collection Ranking and Selection for Federated Entity Search.
  10. Compressed Representation of Web and Social Networks via Dense Subgraphs.
  11. Compressed Suffix Trees for Repetitive Texts.
  12. Computing Discriminating and Generic Words.
  13. Computing Maximum Number of Runs in Strings.
  14. Computing the Maximal-Exponent Repeats of an Overlap-Free String in Linear Time.
  15. Configurations and Minority in the String Consensus Problem.
  16. Dual-Sorted Inverted Lists in Practice.
  17. Eager XPath Evaluation over XML Streams.
  18. Efficient Bubble Enumeration in Directed Graphs.
  19. Efficient Data Structures for the Factor Periodicity Problem.
  20. Efficient LZ78 Factorization of Grammar Compressed Text.
  21. Experiments on Pseudo Relevance Feedback Using Graph Random Walks.
  22. Fast Multiple String Matching Using Streaming SIMD Extensions Technology.
  23. Faster Algorithm for Computing the Edit Distance between SLP-Compressed Strings.
  24. Grammar Precompression Speeds Up Burrows-Wheeler Compression.
  25. Impact of Regionalization on Performance of Web Search Engine Result Caches.
  26. Improved Address-Calculation Coding of Integer Arrays.
  27. Improved Grammar-Based Compressed Indexes.
  28. Method of Mining Subtopics Using Dependency Structure and Anchor Texts.
  29. Parallel Suffix Array Construction for Shared Memory Architectures.
  30. Parikh Matching in the Streaming Model.
  31. Position-Aligned Translation Model for Citation Recommendation.
  32. Ranked Document Retrieval in (Almost) No Space.
  33. Relevance Feedback Method Based on Vector Space Basis Change.
  34. Semantic Document Representation: Do It with Wikification.
  35. Smaller Self-indexes for Natural Language.
  36. Space-Efficient Computation of Maximal and Supermaximal Repeats in Genome Sequences.
  37. Temporal Web Image Retrieval.
  38. The Longest Common Subsequence Problem with Crossing-Free Arc-Annotated Sequences.
  39. The Position Heap of a Trie.
  40. The Wavelet Matrix.
  41. Usage Data in Web Search: Benefits and Limitations.
  42. Variable-Length Codes for Space-Efficient Grammar-Based Compression.

STACS 2012

  1. Linear-Space Data Structures for Range Mode Query in Arrays.
  2. Tying up the loose ends in fully LZW-compressed pattern matching.

STOC 2012

  1. Strict fibonacci heaps.
  2. Tight lower bounds for the online labeling problem.

SWAT 2012

  1. A Fast Algorithm for Permutation Pattern Matching Based on Alternating Runs.
  2. Linear-Space Data Structures for Range Minority Query in Arrays.
  3. Sorted Range Reporting.
  4. String Indexing for Patterns with Wildcards.

Stringology 2012

  1. A Computational Framework for Determining Square-maximal Strings.
  2. A Multiobjective Approach to the Weighted Longest Common Subsequence Problem.
  3. An Efficient Parallel Determinisation Algorithm for Finite-state Automata.
  4. BlastGraph: Intensive Approximate Pattern Matching in Sequence Graphs and de-Bruijn Graphs.
  5. Correctness-by-Construction in Stringology.
  6. Failure Deterministic Finite Automata.
  7. LZW Data Compression on Large Scale and Extreme Distributed Systems.
  8. New and Efficient Approaches to the Quasiperiodic Characterisation of a String.
  9. Quasi-linear Time Computation of the Abelian Periods of a Word.
  10. Similarity Based Deduplication with Small Data Chunks.
  11. The Number of Cubes in Sturmian Words.

WABI 2012

  1. Comparing DNA Sequence Collections by Direct Comparison of Compressed Text Indexes.
  2. Distributed String Mining for High-Throughput Sequencing Data.
  3. Space-Efficient and Exact de Bruijn Graph Representation Based on a Bloom Filter.
  4. Succinct de Bruijn Graphs.

WALCOM 2012

  1. Linear Time Inference of Strings from Cover Arrays Using a Binary Alphabet - (Extended Abstract).

ACM Comput. Surv. 2012

  1. A comparison of index-based lempel-Ziv LZ77 factorization algorithms.

ACM Trans. Algorithms 2012

  1. Succinct ordinal trees based on tree covering.

ACM Trans. Inf. Syst. 2012

  1. Word-based self-indexes for natural language text.

Algorithmica 2012

  1. Fast Arc-Annotated Subsequence Matching in Linear Space.
  2. Lightweight Data Indexing and Compression in External Memory.
  3. Stronger Lempel-Ziv Based Compressed Text Indexing.
  4. Succinct Representation of Labeled Graphs.

Algorithms 2012

  1. A Note on Sequence Prediction over Large Alphabets.
  2. An Online Algorithm for Lightweight Grammar-Based Compression.

CoRR 2012

  1. (Really) Tight bounds for dispatching binary methods
  2. A Bijective String Sorting Transform
  3. A Fast Algorithm for Permutation Pattern Matching Based on Alternating Runs
  4. A New Algorithm for Data Compression Optimization
  5. A Note on Efficient Computation of All Abelian Periods in a String
  6. A memory versus compression ratio trade-off in PPM via compressed context modeling
  7. A refined Quicksort asymptotic
  8. Adaptive Techniques to find Optimal Planar Boxes
  9. Algorithms for Computing Abelian Periods of Words
  10. Algorithms for discovering and proving theorems about permutation patterns.
  11. An Entertaining Example for the Usage of Bitwise Operations in Programming
  12. Approximate pattern matching with k-mismatches in packed text
  13. BIN@ERN: Binary-Ternary Compressing Data Coding
  14. Better Space Bounds for Parameterized Range Majority and Minority
  15. Binary Jumbled String Matching: Faster Indexing in Less Space
  16. Bouma2 - A Quasi-Stateless, Tunable Multiple String-Match Algorithm
  17. Compact Binary Relation Representations with Rich Functionality
  18. Comparison of Bucket Sort and RADIX Sort
  19. Computing Lempel-Ziv Factorization Online
  20. Counting common substrings effectively
  21. Cross-Document Pattern Matching
  22. Data Structure Lower Bounds on Random Access to Grammar-Compressed Strings
  23. Deterministic Polynomial-Time Algorithms for Designing Short DNA Words
  24. Distance Measures for Sequences
  25. Efficient Algorithms for Finding Tucker Patterns
  26. Efficient LZ78 factorization of grammar compressed text
  27. Efficient algorithms for highly compressed data: The Word Problem in Generalized Higman Groups is in P
  28. Fast Packed String Matching for Short Patterns
  29. Fast algorithm finding the shortest reset words
  30. Faster Compact Top-k Document Retrieval
  31. Grammar-Based Construction of Indexes for Binary Jumbled Pattern Matching
  32. Improved in-place associative integer sorting
  33. Improving Compressed Counting
  34. In-place associative integer sorting
  35. In-place associative permutation sort
  36. Large-scale compression of genomic sequence databases with the Burrows-Wheeler transform
  37. Linear Time Lempel-Ziv Factorization: Simple, Fast, Small
  38. Linear-Space Substring Range Counting over Polylogarithmic Alphabets
  39. Lower bounding edit distances between permutations
  40. Lyndon Words and Short Superstrings
  41. Memory Efficient De Bruijn Graph Construction
  42. Necklaces, Convolutions, and X+Y
  43. New Algorithms for Position Heaps
  44. New algorithms for binary jumbled pattern matching
  45. Note on the Greedy Parsing Optimality for Dictionary-Based Text Compression
  46. On Approximating String Selection Problems with Outliers
  47. On Optimal Top-K String Retrieval
  48. On Top-k Search and Range Reporting
  49. On a New Method of Storing a Variable Size Array
  50. On the Complexity of Minimum Labeling Alignment of Two Genomes
  51. On the Value of Multiple Read/Write Streams for Data Compression
  52. On the combinatorics of suffix arrays
  53. On-line Indexing for General Alphabets via Predecessor Queries on Subsets of an Ordered List
  54. Optimal Dynamic Sequence Representations.
  55. Optimal compression of hash-origin prefix trees
  56. Pattern Matching in Multiple Streams
  57. Predecessor search with distance-sensitive query time
  58. Quasi-Succinct Indices
  59. Quasiperiodicities in Fibonacci strings
  60. QuickHeapsort: Modifications and improved analysis
  61. Ranked Document Retrieval in (Almost) No Space
  62. Sequential-Access FM-Indexes
  63. Simpler and Faster Lempel Ziv Factorization
  64. Simplified, stable parallel merging
  65. Some Novel Results From Analysis of Move To Front (MTF) List Accessing Algorithm
  66. Sorted Range Reporting
  67. Sorting and preimages of pattern classes
  68. Sorting distinct integer keys using in-place associative sort
  69. Sorting distinct integers using improved in-place associative sort
  70. Space-Time Trade-offs for Stack-Based Algorithms
  71. Sparse Suffix Tree Construction with Small Space
  72. SparseDTW: A Novel Approach to Speed up Dynamic Time Warping
  73. Speeding-up $q$-gram mining on grammar-based compressed texts
  74. Streaming Complexity of Checking Priority Queues
  75. String Trees
  76. Succinct Indices for Range Queries with applications to Orthogonal Range Maxima
  77. Succinct Posets
  78. TH*:Scalable Distributed Trie Hashing
  79. The Rightmost Equal-Cost Position Problem
  80. The Wavelet Trie: Maintaining an Indexed Sequence of Strings in Compressed Space
  81. Time and Space Efficient Lempel-Ziv Factorization based on Run Length Encoding
  82. Time-Space Trade-Offs for Longest Common Extensions

Comput. J. 2012

  1. Boosting Text Compression with Word-Based Statistical Encoding.
  2. The String-to-Dictionary Matching Problem.

IEEE ACM Trans. Comput. Biol. Bioinform. 2012

  1. An Efficient Alignment Algorithm for Searching Simple Pseudoknots over Long Genomic Sequence.

IEEE Trans. Knowl. Data Eng. 2012

  1. Practical Efficient String Mining.

IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 2012

  1. A Fast On-Line Algorithm for the Longest Common Subsequence Problem with Constant Alphabet.

Inf. Comput. 2012

  1. Bidirectional search in a string with wavelet trees and bidirectional matching statistics.

Inf. Process. Lett. 2012

  1. An efficient algorithm to test square-freeness of strings compressed by straight-line programs.
  2. Computing all subtree repeats in ordered trees.

Inf. Process. Manag. 2012

  1. Bidirectional delta files.

Inf. Retr. 2012

  1. Implicit indexing of natural language text by reorganizing bytecodes.

Int. J. Found. Comput. Sci. 2012

  1. Finding Characteristic Substrings from Compressed Texts.
  2. Parallel Algorithms for Mapping Short degenerate and Weighted DNA Sequences to a Reference genome.

J. Comput. Syst. Sci. 2012

  1. Ultra-succinct representation of ordered trees with applications.

J. Discrete Algorithms 2012

  1. An algorithm for mapping short reads to a dynamically changing genomic sequence.
  2. Dictionary-symbolwise flexible parsing.
  3. On left and right seeds of a string.
  4. String matching with alphabet sampling.
  5. Tree template matching in ranked ordered trees by pushdown automata.
  6. Worst-case efficient single and multiple string matching on packed texts in the word-RAM model.

SIAM J. Comput. 2012

  1. An Optimal Dynamic Data Structure for Stabbing-Semigroup Queries.

Softw. Pract. Exp. 2012

  1. Revisiting bounded context block-sorting transformations.

Theor. Comput. Sci. 2012

  1. LRM-Trees: Compressed indices, adaptive sorting, and compressed permutations.
  2. New algorithms on wavelet trees and applications to information retrieval.
  3. String matching with variable length gaps.
  4. Succinct representations of permutations and functions.

Theory Comput. Syst. 2012

  1. Faster Approximate String Matching for Short Patterns.