SPIRE for Stringologist
- Adaptive Dynamic Bitvectors.
- Generalization of Repetitiveness Measures for Two-Dimensional Strings.
- Simultaneously Building and Reconciling a Synteny Tree.
- Online Computation of String Net Frequency.
- Faster Algorithms for Ranking/Unranking Bordered and Unbordered Words.
- Burst Edit Distance.
- On the Number of Non-equivalent Parameterized Squares in a String.
- Greedy Conjecture for the Shortest Common Superstring Problem and Its Strengthenings.
- Indexing Finite-State Automata Using Forward-Stable Partitions.
- Computing String Covers in Sublinear Time.
- Simple Linear-Time Repetition Factorization.
- Bounded-Ratio Gapped String Indexing.
- Space-Efficient SLP Encoding for O(log N)-Time Random Access.
- Another Virtue of Wavelet Forests.
- Faster and Simpler Online/Sliding Rightmost Lempel-Ziv Factorizations.
- Quantum Algorithms for Longest Common Substring with a Gap.
- All-Pairs Suffix-Prefix on Dynamic Set of Strings.
- Linear Time Reconstruction of Parameterized Strings from Parameterized Suffix and LCP Arrays for Constant-Sized Alphabets.
- Revisiting the Folklore Algorithm for Random Access to Grammar-Compressed Strings.
- LZ78 Substring Compression with CDAWGs.
- Faster Computation of Chinese Frequent Strings and Their Net Frequencies.
- Compressed Graph Representations for Evaluating Regular Path Queries.
- 2d Side-Sharing Tandems with Mismatches.
- On Computing the Smallest Suffixient Set.
- Bijective BWT Based Compression Schemes.
- Logarithmic-Time Internal Pattern Matching Queries in Compressed and Dynamic Texts.
- Engineering a Textbook Approach to Index Massive String Dictionaries.
- On Suffix Tree Detection.
- A Simple Grammar-Based Index for Finding Approximately Longest Common Substrings.
- Constant Time and Space Updates for the Sigma-Tau Problem.
- Linear-Time Computation of Generalized Minimal Absent Words for Multiple Strings.
- CAGE: Cache-Aware Graphlet Enumeration.
- Optimally Computing Compressed Indexing Arrays Based on the Compact Directed Acyclic Word Graph.
- Longest Common Prefix Arrays for Succinct k-Spectra.
- Count-Min Sketch with Variable Number of Hash Functions: An Experimental Study.
- Algorithms and Hardness for the Longest Common Subsequence of Three Strings and Related Problems.
- Optimal Wheeler Language Recognition.
- Binary Mixed-Digit Data Compression Codes.
- Sublinear Time Lempel-Ziv (LZ77) Factorization.
- Largest Repetition Factorization of Fibonacci Words.
- Space-Time Trade-Offs for the LCP Array of Wheeler DFAs.
- Compacting Massive Public Transport Data.
- String Covers of a Tree Revisited.
- Approximation and Fixed Parameter Algorithms for the Approximate Cover Problem.
- Compressibility Measures for Two-Dimensional Data.
- Frequency-Constrained Substring Complexity.
- Computing All-vs-All MEMs in Grammar-Compressed Text.
- Dynamic Compact Planar Embeddings.
- Chaining of Maximal Exact Matches in Graphs.
- Non-overlapping Indexing in BWT-Runs Bounded Space.
- Evaluating Regular Path Queries on Compressed Adjacency Matrices.
- From de Bruijn Graphs to Variation Graphs - Relationships Between Pangenome Models.
- Data Structures for SMEM-Finding in the PBWT.
- New Advances in Rightmost Lempel-Ziv.
- Approximate Cartesian Tree Matching: An Approach Using Swaps.
- Efficient Parameterized Pattern Matching in Sublinear Space.
- On the Number of Factors in the LZ-End Factorization.
- How Train-Test Leakage Affects Zero-Shot Retrieval.
- Accessing the Suffix Array via φ⁻¹-Forest.
- Substring Complexities on Run-Length Compressed Strings.
- Quantum Time Complexity and Algorithms for Pattern Matching on Labeled Graphs.
- Sorting Genomes by Prefix Double-Cut-and-Joins.
- Computing the Parameterized Burrows-Wheeler Transform Online.
- Compressed String Dictionaries via Data-Aware Subtrie Compaction.
- Internal Masked Prefix Sums and Its Connection to Fully Internal Measurement Queries.
- Pattern Matching Under DTW Distance.
- The Complexity of the Co-occurrence Problem.
- Computing All-vs-All MEMs in Run-Length-Encoded Collections of HiFi Reads.
- On Representing the Degree Sequences of Sublogarithmic-Degree Wheeler Graphs.
- Genome Comparison on Succinct Colored de Bruijn Graphs.
- Reconstructing Parameterized Strings from Parameterized Suffix and LCP Arrays.
- Balancing Run-Length Straight-Line Programs.
- Maximal Closed Substrings.
- Subsequence Covers of Words.
- Online Algorithms for Finding Distinct Substrings with Length and Multiple Prefix and Suffix Conditions.
- Engineering Compact Data Structures for Rank and Select Queries on Bit Vectors.
- On the Optimisation of the GSACA Suffix Array Construction Algorithm.
- On the Hardness of Computing the Edit Distance of Shallow Trees.
- Matching Patterns with Variables Under Edit Distance.
- KATKA: A KRAKEN-Like Tool with k Given at Query Time.
- Improved Topic Modeling in Twitter Through Community Pooling.
- Lower Bounds for the Number of Repetitions in 2D Strings.
- All Instantiations of the Greedy Algorithm for the Shortest Common Superstring Problem are Equivalent.
- On the Approximation Ratio of LZ-End to LZ77.
- r-Indexing the eBWT.
- String Covers of a Tree.
- Permutation-Constrained Common String Partitions with Applications.
- Longest Common Rollercoasters.
- Extracting the Sparse Longest Common Prefix Array from the Suffix Binary Search Tree.
- Position Heaps for Cartesian-Tree Matching on Strings and Tries.
- An LMS-Based Grammar Self-index with Local Consistency Properties.
- A Separation of γ and b via Thue-Morse Words.
- Exploiting Pseudo-locality of Interchange Distance.
- Unicode at Gigabytes per Second.
- On Stricter Reachable Repetitiveness Measures.
- findere: Fast and Precise Approximate Membership Query.
- TSXor: A Simple Time Series Compression Algorithm.
- Minimal Unique Palindromic Substrings After Single-Character Substitution.
- Computing the Original eBWT Faster, Simpler, and with Less Memory.
- Grammar Index by Induced Suffix Sorting.
- Efficient Construction of Hierarchical Overlap Graphs.
- Contextual Pattern Matching.
- Lyndon Words, the Three Squares Lemma, and Primitive Squares.
- Adaptive Exact Learning in a Mixed-Up World: Dealing with Periodicity, Errors and Jumbled-Index Queries in String Reconstruction.
- Computing Covers Under Substring Consistent Equivalence Relations.
- A Comparison of Empirical Tree Entropies.
- Practical Random Access to SLP-Compressed Texts.
- Measuring Controversy in Social Networks Through NLP.
- Approximating the Anticover of a String.
- Relative Lempel-Ziv Compression of Suffix Arrays.
- Multidimensional Period Recovery.
- Navigating Forest Straight-Line Programs in Constant Time.
- An Efficient Elastic-Degenerate Text Index? Not Likely.
- Efficient Enumeration of Distinct Factors Using Package Representations.
- Smaller Fully-Functional Bidirectional BWT Indexes.
- Towards Efficient Interactive Computation of Dynamic Time Warping Distance.
- Internal Quasiperiod Queries.
- On Repetitiveness Measures of Thue-Morse Words.
- Pre-indexing Pruning Strategies.
- Longest Square Subsequence Problem Revisited.
- Tailoring r-index for Document Listing Towards Metagenomics Applications.
- Lossless Image Compression Using List Update Algorithms.
- Fast, Small, and Simple Document Listing on Repetitive Text Collections.
- Rpair: Rescaling RePair with Rsync.
- Faster Repetition-Aware Compressed Suffix Trees Based on Block Trees.
- Fast Identification of Heavy Hitters by Cached and Packed Group Testing.
- Compact Data Structures for Shortest Unique Substring Queries.
- Space-Efficient Merging of Succinct de Bruijn Graphs.
- Direct Linear Time Construction of Parameterized Suffix and LCP Arrays for Constant Alphabets.
- BM25 Beyond Query-Document Similarity.
- Implementing the Topological Model Succinctly.
- An Index for Sequencing Reads Based on the Colored de Bruijn Graph.
- A Practical Alphabet-Partitioning Rank/Select Data Structure.
- Inducing the Lyndon Array.
- Weighted Shortest Common Supersequence Problem Revisited.
- Approximation Ratios of RePair, LongestMatch and Greedy on Unary Strings.
- Fast Cartesian Tree Matching.
- Bounds and Estimates on the Average Edit Distance.
- Online Algorithms on Antipowers and Antiperiods.
- Space- and Time-Efficient Storage of LiDAR Point Clouds.
- Searching Runs in Streams.
- Range Shortest Unique Substring Queries.
- A New Linear-Time Algorithm for Centroid Decomposition.
- On the Computation of Longest Previous Non-overlapping Factors.
- Adaptive Succinctness.
- Faster Dynamic Compressed d-ary Relations.
- Parallel External Memory Wavelet Tree and Wavelet Matrix Construction.
- Linear Time Maximum Segmentation Problems in Column Stream Model.
- An Optimal Algorithm to Find Champions of Tournament Graphs.
- Position Bias Estimation for Unbiased Learning-to-Rank in eCommerce Search.
- Polynomial-Delay Enumeration of Maximal Common Subsequences.
- SACABench: Benchmarking Suffix Array Construction.
- On Longest Common Property Preserved Substring Queries.
- Minimal Absent Words in Rooted and Unrooted Trees.
- Run-Length Encoding in a Finite Universe.
- Network-Based Pooling for Topic Modeling on Microblog Content.
- COBS: A Compact Bit-Sliced Signature Index.
- Trickier XBWT Tricks.
- Indexed Dynamic Programming to Boost Edit Distance and LCSS Computation.
- Compressed Range Minimum Queries.
- Faster and Smaller Two-Level Index for Network-Based Trajectories.
- Searching for a Modified Pattern in a Changing Text.
- Better Heuristic Algorithms for the Repetition Free LCS and Other Variants.
- Longest Property-Preserved Common Factor.
- Optimal In-Place Suffix Sorting.
- New Structures to Solve Aggregated Queries for Trips over Public Transportation Networks.
- Recovering, Counting and Enumerating Strings from Forward and Backward Suffix Arrays.
- Adaptive Computation of the Discrete Fréchet Distance.
- Maximal Motif Discovery in a Sliding Window.
- Fast Wavelet Tree Construction in Practice.
- Towards a Compact Representation of Temporal Rasters.
- Truncated DAWGs and Their Application to Minimal Absent Word Problem.
- On Extended Special Factors of a Word.
- Compressed Communication Complexity of Longest Common Prefixes.
- Longest Common Prefixes with k-Errors and Applications.
- Fast and Effective Neural Networks for Translating Natural Language into Denotations.
- Computing Burrows-Wheeler Similarity Distributions for String Collections.
- Efficient Computation of Sequence Mappability.
- Block Palindromes: A New Generalization of Palindromes.
- Linear-Time Online Algorithm Inferring the Shortest Path from a Walk.
- Recoloring the Colored de Bruijn Graph.
- Early Commenting Features for Emotional Reactions Prediction.
- The Colored Longest Common Prefix Array Computed via Sequential Scans.
- Faster Recovery of Approximate Periods over Edit Distance.
- 3DGraCT: A Grammar-Based Compressed Representation of 3D Trajectories.
- Linear-Size CDAWG: New Repetition-Aware Indexing and Grammar Compression.
- On Suffix Tree Breadth.
- Tight Bounds for Top Tree Compression.
- Practical Evaluation of Lempel-Ziv-78 and Lempel-Ziv-Welch Tries.
- Pattern Matching on Elastic-Degenerate Text with Errors.
- Succinct Partial Sums and Fenwick Trees.
- Order Preserving Pattern Matching on Trees and DAGs.
- Counting Palindromes in Substrings.
- Lightweight BWT and LCP Merging via the Gap Algorithm.
- Distinct Squares in Circular Words.
- Faster Practical Block Compression for Rank/Select Dictionaries.
- Fast Construction of Compressed Web Graphs.
- Efficient Compression and Indexing of Trajectories.
- On Two LZ78-style Grammars: Compression Bounds and Compressed-Space Computation.
- Greedy Shortest Common Superstring Approximation in Compact Space.
- Regular Abelian Periods and Longest Common Abelian Factors on Run-Length Encoded Strings.
- Detecting One-Variable Patterns.
- Mining Bit-Parallel LCS-length Algorithms.
- Optimal Skeleton Huffman Trees.
- Constructing a Consensus Phylogeny from a Leaf-Removal Distance (Extended Abstract).
- A Self-index on Block Trees.
- Listing Maximal Independent Sets with Minimal Space and Bounded Delay.
- Longest Common Factor After One Edit Operation.
- Practical Implementation of Space-Efficient Dynamic Keyword Dictionaries.
- LZ78 Compression in Low Main Memory Space.
- Fast Label Extraction in the CDAWG.
- Pattern Matching for Separable Permutations.
- Efficient Representation of Multidimensional Data over Hierarchical Domains.
- Parallel Lookups in String Indexes.
- Lexical Matching of Queries and Ads Bid Terms in Sponsored Search.
- XBWT Tricks.
- Maximal Unbordered Factors of Random Strings.
- The Smallest Grammar Problem Revisited.
- Near-Optimal Computation of Runs over General Alphabet via Non-Crossing LCE Queries.
- Compact Trip Representation over Networks.
- Analyzing Relative Lempel-Ziv Reference Construction.
- Bookmarks in Grammar-Compressed Strings.
- Fast Classification of Protein Structures by an Alignment-Free Kernel.
- A Linear-Space Algorithm for the Substring Constrained Alignment Problem.
- GraCT: A Grammar Based Compressed Representation of Trajectories.
- Low Space External Memory Construction of the Succinct Permuted Longest Common Prefix Array.
- AC-Automaton Update Algorithm for Semi-dynamic Dictionary Matching.
- LCP Array Construction Using O(sort(n)) (or Less) I/Os.
- Longest Common Abelian Factors and Large Alphabets.
- Efficient and Compact Representations of Some Non-canonical Prefix-Free Codes.
- Fully Dynamic de Bruijn Graphs.
- Dynamic and Approximate Pattern Matching in 2D.
- Inverse Range Selection Queries.
- Parallel Computation for the All-Pairs Suffix-Prefix Problem.
- RLZAP: Relative Lempel-Ziv with Adaptive Pointers.
- Fragmented BWT: An Extended BWT for Full-Text Indexing.
- Temporal Query Classification at Different Granularities.
- Efficient Term Set Prediction Using the Bell-Wigner Inequality.
- Adaptive Computation of the Swap-Insert Correction Distance.
- Sampling the Suffix Array with Minimizers.
- Temporal Analysis of CHAVE Collection.
- Beyond the Runs Theorem.
- Feasibility of Word Difficulty Prediction.
- Parallel Construction of Succinct Representations of Suffix Tree Topologies.
- Fast Online Lempel-Ziv Factorization in Compressed Space.
- Transforming XML Streams with References.
- A Compact RDF Store Using Suffix Arrays.
- Computing the Longest Unbordered Substring.
- Range LCP Queries Revisited.
- Chaining Fragments in Sequences: to Sweep or Not (Extended Abstract).
- Efficient Algorithms for Longest Closed Factor Array.
- Tight Bound for the Number of Distinct Palindromes in a Tree.
- Longest Common Prefix with Mismatches.
- Induced Sorting Suffixes in External Memory with Better Design and Less Space.
- Prefix and Suffix Reversals on Strings.
- Relative Select.
- Assessing the Efficiency of Suffix Stripping Approaches for Portuguese Stemming.
- ShRkC: Shard Rank Cutoff Prediction for Selective Search.
- Faster Exact Search Using Document Clustering.
- A Faster Algorithm for Computing Maximal α-gapped Repeats in a String.
- Evaluating Geographical Knowledge Re-Ranking, Linguistic Processing and Query Expansion Techniques for Geographical Information Retrieval.
- Online Self-Indexed Grammar Compression.
- Improved Practical Compact Dynamic Tries.
- DeShaTo: Describing the Shape of Cumulative Topic Distributions to Rank Retrieval Systems Without Relevance Judgments.
- On Prefix/Suffix-Square Free Words.
- Fishing in Read Collections: Memory Efficient Indexing for Sequence Assembly.
- How Big is that Genome? Estimating Genome Size and Coverage from k-mer Abundance Spectra.
- Space-Efficient Detection of Unusual Words.
- Filtration Algorithms for Approximate Order-Preserving Matching.
- Selective Labeling and Incomplete Label Mitigation for Low-Cost Evaluation.
- A 3-Approximation Algorithm for the Multiple Spliced Alignment Problem and Its Application to the Gene Prediction Task.
- On the String Consensus Problem and the Manhattan Sequence Consensus Problem.
- Information-Theoretic Term Selection for New Item Recommendation.
- Improved Filters for the Approximate Suffix-Prefix Overlap Problem.
- A Compressed Suffix-Array Strategy for Temporal-Graph Indexing.
- Shortest Unique Queries on Strings.
- Order Preserving Prefix Tables.
- Performance Improvements for Search Systems Using an Integrated Cache of Lists+Intersections.
- Succinct Indexes for Reporting Discriminating and Generic Words.
- Online Pattern Matching for String Edit Distance with Moves.
- K2-Treaps: Range Top-k Queries in Compact Space.
- Grammar Compressed Sequences with Rank/Select Support.
- Efficient Compressed Indexing for Approximate Top-k String Retrieval.
- Relative FM-Indexes.
- Indexed Matching Statistics and Shortest Unique Substrings.
- Strategic Pattern Search in Factor-Compressed Text.
- Simple and Efficient String Algorithms for Query Suggestion Metrics Computation.
- Alphabet-Independent Algorithms for Finding Context-Sensitive Repeats in Linear Time.
- Relative Lempel-Ziv with Constant-Time Random Access.
- Efficient Indexing and Representation of Web Access Logs.
- I/O-Efficient Dictionary Search with One Edit Error.
- Context-Aware Deal Size Prediction.
- Fast Construction of Wavelet Trees.
- Online Multiple Palindrome Pattern Matching.
- Sequence Decision Diagrams.
- Algorithms for Jumbled Indexing, Jumbled Border and Jumbled Square on Run-Length Encoded Strings.
- Accurate Profiling of Microbial Communities from Massively Parallel Sequencing Using Convex Optimization.
- Nowcasting with Google Trends.
- Compact Querieable Representations of Raster Data.
- Suffix Array of Alignment: A Practical Index for Similar Data.
- Discovering Dense Subgraphs in Parallel for Compressing Web and Social Networks.
- Faster Top-k Document Retrieval in Optimal Space.
- On Two-Dimensional Lyndon Words.
- You Are What You Eat: Learning User Tastes for Rating Prediction.
- Distributed Query Processing on Compressed Graphs Using K2-Trees.
- Fully-Online Grammar Compression.
- Query Processing in Highly-Loaded Search Engines.
- A Lempel-Ziv Compressed Structure for Document Listing.
- Using Mutual Influence to Improve Recommendations.
- Learning to Schedule Webpage Updates Using Genetic Programming.
- Order-Preserving Incomplete Suffix Trees and Order-Preserving Indexes.
- Faster Lyndon Factorization Algorithms for SLP and LZ78 Compressed Text.
- Minimal Discriminating Words Problem Revisited.
- Learning URL Normalization Rules Using Multiple Alignment of Sequences.
- Adaptive Data Structures for Permutations and Binary Relations.
- Simulation Study of Multi-threading in Web Search Engine Processors.
- Solving Graph Isomorphism Using Parameterized Matching.
- Faster Range LCP Queries.
- Pattern Discovery and Listing in Graphs.
- Efficient Approximation of Edit Distance.
- Adding Compression and Blended Search to a Compact Two-Level Suffix Array.
- Document Listing on Versioned Documents.
- Top-k Color Queries on Tree Paths.
- Position-Restricted Substring Searching over Small Alphabets.
- Lossless Compression of Rotated Maskless Lithography Images.
- Consolidating and Exploring Information via Textual Inference.
- Indexes for Jumbled Pattern Matching in Strings, Trees and Graphs.
- Space-Efficient Construction of the Burrows-Wheeler Transform.
- Efficient LZ78 Factorization of Grammar Compressed Text.
- Basic Word Completion and Prediction for Hebrew.
- Active Microbloggers: Identifying Influencers, Leaders and Discussers in Microblogging Networks.
- Space-Efficient Computation of Maximal and Supermaximal Repeats in Genome Sequences.
- Method of Mining Subtopics Using Dependency Structure and Anchor Texts.
- Fast Multiple String Matching Using Streaming SIMD Extensions Technology.
- Configurations and Minority in the String Consensus Problem.
- Position-Aligned Translation Model for Citation Recommendation.
- Approximate Period Detection and Correction.
- Computing Discriminating and Generic Words.
- Clustering Heterogeneous Data with Mutual Semi-supervision.
- Impact of Regionalization on Performance of Web Search Engine Result Caches.
- Improved Grammar-Based Compressed Indexes.
- A Study on Novelty Evaluation in Biomedical Information Retrieval.
- Usage Data in Web Search: Benefits and Limitations.
- Improved Address-Calculation Coding of Integer Arrays.
- Semantic Document Representation: Do It with Wikification.
- Smaller Self-indexes for Natural Language.
- Approximate Function Matching under δ- and γ-Distances.
- Computing the Maximal-Exponent Repeats of an Overlap-Free String in Linear Time.
- Efficient Data Structures for the Factor Periodicity Problem.
- Parallel Suffix Array Construction for Shared Memory Architectures.
- Experiments on Pseudo Relevance Feedback Using Graph Random Walks.
- Variable-Length Codes for Space-Efficient Grammar-Based Compression.
- Eager XPath Evaluation over XML Streams.
- Compressed Representation of Web and Social Networks via Dense Subgraphs.
- Relevance Feedback Method Based on Vector Space Basis Change.
- The Wavelet Matrix.
- A Zipf-Like Distant Supervision Approach for Multi-document Summarization Using Wikinews Articles.
- Computing Maximum Number of Runs in Strings.
- Collection Ranking and Selection for Federated Entity Search.
- Ranked Document Retrieval in (Almost) No Space.
- The Position Heap of a Trie.
- Efficient Bubble Enumeration in Directed Graphs.
- Faster Algorithm for Computing the Edit Distance between SLP-Compressed Strings.
- Characterization and Extraction of Irredundant Tandem Motifs.
- The Longest Common Subsequence Problem with Crossing-Free Arc-Annotated Sequences.
- Dual-Sorted Inverted Lists in Practice.
- Parikh Matching in the Streaming Model.
- Temporal Web Image Retrieval.
- Grammar Precompression Speeds Up Burrows-Wheeler Compression.
- Compressed Suffix Trees for Repetitive Texts.
- Fast q-gram Mining on SLP Compressed Strings.
- Reference Sequence Construction for Relative Compression of Genomes.
- On-Line Construction of Position Heaps.
- Computing the Longest Common Prefix Array Based on the Burrows-Wheeler Transform.
- ESP-Index: A Compressed Index Based on Edit-Sensitive Parsing.
- Approximate Point Set Pattern Matching with Lp-Norm.
- Indexing with Gaps.
- Constructing Strings at the Nano Scale via Staged Self-assembly.
- On Suffix Extensions in Suffix Trees.
- Compressed Text Indexing with Wildcards.
- A New Approach for Verifying URL Uniqueness in Web Crawlers.
- Navigating the User Query Space.
- COCA Filters: Co-occurrence Aware Bloom Filters.
- Near Real-Time Suffix Tree Construction via the Fringe Marked Ancestor Problem.
- Cross-Lingual Text Fragment Alignment Using Divergence from Randomness.
- When Was It Written? Automatically Determining Publication Dates.
- A Learned Approach for Ranking News in Real-Time Using the Blogosphere.
- Attribute Retrieval from Relational Web Tables.
- Succinct Gapped Suffix Arrays.
- Approximations and Partial Solutions for the Consensus Sequence Problem.
- Space Efficient Wavelet Tree Construction.
- Weighted Shortest Common Supersequence.
- Spaced Seeds Design Using Perfect Rulers.
- Improved Compressed Indexes for Full-Text Document Retrieval.
- Fixed Block Compression Boosting in FM-Indexes.
- Sparse Spatial Selection for Novelty-Based Search Result Diversification.
- Finding Frequent Elements in Compressed 2D Arrays and Strings.
- Fast Computation of a String Duplication History under No-Breakpoint-Reuse (Extended Abstract).
- Detecting Health Events on the Social Web to Enable Epidemic Intelligence.
- Persistency in Suffix Trees with Applications to String Interval Problems.
- Enhancing Document Snippets Using Temporal Information.
- Discounted Cumulative Gain and User Decision Models.
- Query-Sets++: A Scalable Approach for Modeling Web Sites.
- External Query Reformulation for Text-Based Image Retrieval.
- A Succinct Index for Hypertext.
- A Multi-faceted Approach to Query Intent Classification.
- A Knowledge-Based Semantic Kernel for Text Classification.
- Computing All Subtree Repeats in Ordered Ranked Trees.
- Candidate Document Retrieval for Web-Scale Text Reuse Detection.
- Approximate Regular Expression Matching with Multi-strings.
- Compressed Indexes for Aligned Pattern Matching.
- A Self-Supervised Approach for Extraction of Attribute-Value Pairs from Wikipedia Articles.
- A PTAS for the Square Tiling Problem.
- Extracting Powers and Periods in a String from Its Runs Structure.
- Training Parse Trees for Efficient VF Coding.
- The Gapped Suffix Array: A New Index Structure for Fast Approximate Matching.
- Colored Range Queries and Document Retrieval.
- Using Related Queries to Improve Web Search Results Ranking.
- Counting and Verifying Maximal Palindromes.
- On the Hardness of Counting and Sampling Center Strings.
- Why Large Closest String Instances Are Easy to Solve in Practice.
- Compressed Self-indices Supporting Conjunctive Queries on Document Collections.
- On Shortest Common Superstring and Swap Permutations.
- Algorithms for Finding a Minimum Repetition Representation of a String.
- Incremental Algorithms for Effective and Efficient Query Recommendation.
- Evaluation of Query Performance Prediction Methods by Range.
- Finite Automata Based Algorithms for the Generalized Constrained Longest Common Subsequence Problems.
- Temporal Analysis of Document Collections: Framework and Applications.
- Parameterized Searching with Mismatches for Run-Length Encoded Strings (Extended Abstract).
- Mining Large Query Induced Graphs towards a Hierarchical Query Folksonomy.
- String Matching with Variable Length Gaps.
- Restricted LCS.
- Dual-Sorted Inverted Lists.
- Approximate String Matching with Stuck Address Bits.
- Faster Compressed Dictionary Matching.
- Querying the Web Graph (Invited Talk).
- Standard Deviation as a Query Hardness Estimator.
- Relative Lempel-Ziv Compression of Genomes for Large-Scale Storage and Retrieval.
- Fast Bit-Parallel Matching for Network and Regular Expressions.
- Text Comparison Using Soft Cardinality.
- Fingerprinting Ratings for Collaborative Filtering - Theoretical and Empirical Analysis.
- Improved Fast Similarity Search in Dictionaries.
- Identifying SNPs without a Reference Genome by Comparing Raw Reads.
- Range Queries over Untangled Chains.
- Dynamic Z-Fast Tries.
- String Retrieval for Multi-pattern Queries.
- On Tag Spell Checking.
- CST++.
- Hypergeometric Language Model and Zipf-Like Scoring Function for Web Document Similarity Retrieval.
- Multiplication Algorithms for Monge Matrices.
- Computing Matching Statistics and Maximal Exact Matches on Compressed Full-Text Indexes.
- Succinct Representations of Dynamic Strings.
- The Frequent Items Problem, under Polynomial Decay, in the Streaming Model.
- On Entropy-Compressed Text Indexing in External Memory.
- Constant Factor Approximation of Edit Distance of Bounded Height Unordered Trees.
- Identifying the Intent of a User Query Using Support Vector Machines.
- Generalised Matching.
- Efficient Index for Retrieving Top-k Most Frequent Documents.
- A Linear-Time Burrows-Wheeler Transform Using Induced Sorting.
- A Last-Resort Semantic Cache for Web Queries.
- Succinct Text Indexing with Wildcards.
- Indexing Variable Length Substrings for Exact and Approximate Matching.
- Practical Algorithms for the Longest Common Extension Problem.
- Range Quantile Queries: Another Virtue of Wavelet Trees.
- Set Intersection and Sequence Matching.
- A Comparison of Data-Driven Automatic Syllabification Methods.
- Novel and Generalized Sort-Based Transform for Lossless Data Compression.
- Expectation of Strings with Mismatches under Markov Chain Distribution.
- k2-Trees for Compact Web Graph Representation.
- A Two-Level Structure for Compressing Aligned Bitexts.
- Consensus Optimizing Both Distance Sum and Radius.
- Use of Co-occurrences for Temporal Expressions Annotation.
- Sketching Algorithms for Approximating Rank Correlations in Collaborative Filtering Systems.
- Two-Dimensional Distributed Inverted Files.
- Improved Approximation Results on the Shortest Common Supersequence Problem.
- Towards a Theory of Patches.
- Fast Single-Pass Construction of a Half-Inverted Index.
- Directly Addressable Variable-Length Codes.
- Compressed Suffix Arrays for Massive Data.
- On-Line Construction of Parameterized Suffix Trees.
- Faster Algorithms for Sampling and Counting Biological Sequences.
- A Compressed Enhanced Suffix Array Supporting Fast String Matching.
- On-Demand Associative Cross-Language Information Retrieval.
- A Task-Based Evaluation of an Aggregated Search Interface.
- Syntactic Query Models for Restatement Retrieval.
- Efficient Language-Independent Retrieval of Printed Documents without OCR.
- New Perspectives on the Prefix Array.
- Indexed Hierarchical Approximate String Matching.
- An Efficient Linear Space Algorithm for Consecutive Suffix Alignment under Edit Distance (Short Preliminary Paper).
- Run-Length Compressed Indexes Are Superior for Highly Repetitive Sequence Collections.
- Exact Distribution of a Spaced Seed Statistic for DNA Homology Detection.
- Approximate Runs - Revisited.
- Interchange Rearrangement: The Element-Cost Model.
- Improved Variable-to-Fixed Length Codes.
- The Effect of Weighted Term Frequencies on Probabilistic Latent Semantic Term Relationships.
- Mismatch Sampling.
- On the Structure of Small Motif Recognition Instances.
- Out of the Box Phrase Indexing.
- Approximated Pattern Matching with the L1, L2 and L∞ Metrics.
- Pattern Matching with Pair Correlation Distance.
- Practical Rank/Select Queries over Arbitrary Sequences.
- Clique Analysis of Query Log Graphs.
- Term Impacts as Normalized Term Frequencies for BM25 Similarity Scoring.
- Sliding CDAWG Perfection.
- Context-Sensitive Grammar Transform: Compression and Pattern Matching.
- Engineering Radix Sort for Strings.
- “Search Is a Solved Problem” and Other Annoying Fallacies.
- Faster Text Fingerprinting.
- Some Approximations for Shortest Common Nonsubsequences and Supersequences.
- Self-indexing Natural Language.
- δγ-Parameterized Matching.
- Speeding Up Pattern Matching by Text Sampling.
- Comparison of s-gram Proximity Measures in Out-of-Vocabulary Word Translation.
- Admission Policies for Caches of Search Engine Results.
- Exploiting Genre in Focused Crawling.
- Prefix-Shuffled Geometric Suffix Tree.
- A Pocket Guide to Web History.
- A Filtering Algorithm for k-Mismatch with Don’t Cares.
- A Fast and Compact Web Graph Representation.
- Algorithms for Weighted Matching.
- A Chaining Algorithm for Mapping cDNA Sequences to Multiple Genomic Sequences.
- Jump-Matching with Errors.
- A Web-Page Usage Prediction Scheme Using Weighted Suffix Trees.
- Approximate Swap and Mismatch Edit Distance.
- Extending Weighting Models with a Term Quality Measure.
- Optimal Self-adjusting Trees for Dynamic String Data in Secondary Storage.
- Estimating Number of Citations Using Author Reputation.
- Enhancing Educational-Material Retrieval Using Authored-Lesson Metadata.
- Efficient Text Proximity Search.
- Approximating Constrained LCS.
- Highly Frequent Terms and Sentence Retrieval.
- Local Transpositions in Alignment of Polyphonic Musical Sequences.
- Indexing a Dictionary for Subset Matching Queries.
- Efficient Computations of ℓ1 and ℓ∞ Rearrangement Distances.
- Approximate String Matching with Lempel-Ziv Compressed Indexes.
- Tuning Approximate Boyer-Moore for Gene Sequences.
- Implicit Compression Boosting with Applications to Self-indexing.
- Generalized LCS.
- Compact Set Representation for Information Retrieval.
- Edge-Guided Natural Language Text Compression.
- Discovering Context-Topic Rules in Search Engine Logs.
- On-Line Repetition Detection.
- MP-Boost: A Multiple-Pivot Boosting Algorithm and Its Application to Text Categorization.
- English to Persian Transliteration.
- Sparse Directed Acyclic Word Graphs.
- A Multiple Criteria Approach for Information Retrieval.
- Adaptive Query-Based Sampling of Distributed Collections.
- Improving Usability Through Password-Corrective Hashing.
- A Compressed Self-index Using a Ziv-Lempel Dictionary.
- Computing the Minimum Approximate lambda-Cover of a String.
- TreeBoost.MH: A Boosting Algorithm for Multi-label Hierarchical Text Categorization.
- A New Algorithm for Fast All-Against-All Substring Matching.
- Cluster Generation and Cluster Labelling for Web Snippets: A Fast and Accurate Hierarchical Solution.
- Word-Based Correction for Retrieval of Arabic OCR Degraded Documents.
- Incremental Aggregation of Latent Semantics Using a Graph-Based Energy Model.
- The Intention Behind Web Queries.
- Efficient Lazy Algorithms for Minimal-Interval Semantics.
- Output-Sensitive Autocompletion Search.
- Phrase-Based Pattern Matching in Compressed Text.
- Structured Index Organizations for High-Throughput Text Querying.
- Efficient Algorithms for Pattern Matching with General Gaps and Character Classes.
- Matrix Tightness: A Linear-Algebraic Framework for Sorting by Transpositions.
- Dotted Suffix Trees: A Structure for Approximate Text Indexing.
- Compact Features for Detection of Near-Duplicates in Distributed Retrieval.
- Analyzing User Behavior to Rank Desktop Items.
- Inverted Files Versus Suffix Arrays for Locating Patterns in Primary Memory.
- Using String Comparison in Context for Improved Relevance Feedback in Different Text Media.
- Principal Components for Automatic Term Hierarchy Building.
- How to Compare Arc-Annotated Sequences: The Alignment Hierarchy.
- A Statistical Model of Query Log Generation.
- Mapping Words into Codewords on PPM.
- Fast Plagiarism Detection System.
- Faster Generation of Super Condensed Neighbourhoods Using Finite Automata.
- Cache-Conscious Collision Resolution in String Hash Tables.
- A Generalization of the Method for Evaluation of Stemming Algorithms Based on Error Counting.
- Normalized Similarity of RNA Sequences.
- XML Multimedia Retrieval.
- Classifying Sentences Using Induced Structure.
- Counting Lumps in Word Space: Density as a Measure of Corpus Homogeneity.
- Towards Real-Time Suffix Tree Construction.
- Evaluating Hierarchical Clustering of Search Results.
- Application of Clustering Technique in Multiple Sequence Alignment.
- N-Gram Similarity and Distance.
- An Edit Distance Between RNA Stem-Loops.
- Deriving TF-IDF as a Fisher Kernel.
- Using the k-Nearest Neighbor Graph for Proximity Searching in Metric Spaces.
- Linear Time Algorithm for the Generalised Longest Common Repeat Problem.
- Practical and Optimal String Matching.
- Retrieval Status Values in Information Retrieval Evaluation.
- Lossless Filter for Finding Long Multiple Approximate Repetitions Using a New Data Structure, the Bi-factor Array.
- Composite Pattern Discovery for PCR Application.
- Necklace Swap Problem for Rhythmic Similarity Measures.
- Lydia: A System for Large-Scale News Analysis.
- Comparison of Representations of Multiple Evidence Using a Functional Framework for IR.
- A Partition-Based Efficient Algorithm for Large Scale Multiple-Strings Matching.
- Enhanced Byte Codes with Restricted Prefix Properties.
- Measuring the Difficulty of Distance-Based Indexing.
- Multi-label Text Categorization Using K-Nearest Neighbor Approach with M-Similarity.
- Utilizing Dynamically Updated Estimates in Solving the Longest Common Subsequence Problem.
- Approximate Matching in the L∞ Metric.
- L1 Pattern Matching Lower Bound.
- Stemming Arabic Conjunctions and Prepositions.
- Counting Suffix Arrays and Strings.
- Compressed Perfect Embedded Skip Lists for Quick Inverted-Index Lookups.
- Restricted Transposition Invariant Approximate String Matching Under Edit Distance.
- Experimental Analysis of a Fast Intersection Algorithm for Sorted Sequences.
- A Multiple Graph Layers Model with Application to RNA Secondary Structures Comparison.
- Rank-Sensitive Data Structures.
- A Model for Information Retrieval Based on Possibilistic Networks.
- Computing Similarity of Run-Length Encoded Strings with Affine Gap Penalty.
- XML Retrieval with a Natural Language Interface.
- A Bit-Parallel Tree Matching Algorithm for Patterns with Horizontal VLDC’s.
- A Bilingual Linking Service for the Web.
- A Fast Algorithmic Technique for Comparing Large Phylogenetic Trees.
- Recommending Better Queries from Click-Through Data.
- Efficient Extraction of Structured Motifs Using Box-Links.
- Searching XML Documents Using Relevance Propagation.
- Linear Time Algorithm for the Longest Common Repeat Problem.
- Linear Nondeterministic Dawg String Matching Algorithm.
- Simple, Fast, and Efficient Natural Language Adaptive Compression.
- Information Extraction by Embedding HMM to the Set of Induced Linguistic Features.
- A Space-Saving Linear-Time Algorithm for Grammar-Based Compression.
- An Efficient Algorithm for the Longest Tandem Scattered Subsequence Problem.
- Efficient Computation of Balancedness in Binary Sequence Generators.
- Indexing Text Documents Based on Topic Identification.
- An Improvement and an Extension on the Hybrid Index for Approximate String Matching.
- Negations and Document Length in Logical Retrieval.
- Automatic Document Categorization Based on k-NN and Object-Based Thesauri.
- A Scalable System for Identifying Co-derivative Documents.
- On the Transformation Distance Problem.
- Simple Implementation of String B-Trees.
- Concurrency Control and I/O-Optimality in Bulk Insertion.
- Searching for a Set of Correlated Patterns.
- Metric Indexing for the Vector Model in Text Retrieval.
- Inferring Query Performance Using Pre-retrieval Predictors.
- Cross-Comparison for Two-Dimensional Text Categorization.
- Processing Conjunctive and Phrase Queries with the Set-Based Model.
- Alphabet Permutation for Differentially Encoding Text.
- Efficient One Dimensional Real Scaled Matching.
- A New Feature Normalization Scheme Based on Eigenspace for Noisy Speech Recognition.
- Longest Motifs with a Functionally Equivalent Central Block.
- Techniques for Efficient Query Expansion.
- An Alphabet-Friendly FM-Index.
- On Classification of Strings.
- Automaton-Based Sublinear Keyword Pattern Matching.
- On Asymptotic Finite-State Error Repair.
- Evaluating Relevance Feedback and Display Strategies for Searching on Small Displays.
- Evaluation of Web Page Representations by Content Through Clustering.
- Finding Cross-Lingual Spelling Variants.
- Dealing with Syntactic Variation Through a Locality-Based Approach.
- Fast Detection of Common Sequence Structure Patterns in RNAs.
- First Huffman, Then Burrows-Wheeler: A Simple Alphabet-Independent FM-Index.
- Metric Indexes for Approximate String Matching in a Dictionary.
- Bit-Parallel Branch and Bound Algorithm for Transposition Invariant LCS.
- Motif Extraction from Weighted Sequences.
- DDOC: Overlapping Clustering of Words for Document Classification.
- New Algorithms for Finding Monad Patterns in DNA Sequences.
- An Efficient Index Data Structure with the Capabilities of Suffix Trees and Suffix Arrays for Alphabets of Non-negligible Size.
- Permuted and Scaled String Matching.
- Flexible and Efficient Bit-Parallel Techniques for Transposition Invariant Approximate Matching in Music Retrieval.
- New Refinement Techniques for Longest Common Subsequence Algorithms.
- Link Information as a Similarity Measure in Web Classification.
- Bit-Parallel Approximate String Matching Algorithms with Transposition.
- FindStem: Analysis and Evaluation of a Turkish Stemming Algorithm.
- Current Challenges in Bioinformatics.
- Memory-Adaptive Dynamic Spatial Approximation Trees.
- French Noun Phrase Indexing and Mining for an Information Retrieval System.
- What’s Changed? Measuring Document Change in Web Crawling for Search Engines.
- The Size of Subsequence Automaton.
- BFT: Bit Filtration Technique for Approximate String Join in Biological Databases.
- The Implementation and Evaluation of a Lexicon-Based Stemmer.
- Non-adjacent Digrams Improve Matching of Cross-Lingual Spelling Variants.
- Patterns on the Web.
- SCM: Structural Contexts Model for Improving Compression in Semistructured Text Databases.
- Alternative Algorithms for Bit-Parallel String Matching.
- A Practical Index for Genome Searching.
- A Three Level Search Engine Index Based in Query Log Distribution.
- Ranking Structured Documents Using Utility Theory in the Bayesian Network Retrieval Model.
- Large Edit Distance with Multiple Block Operations.
- A Bit-Parallel Suffix Automaton Approach for (delta, gamma)-Matching in Music Retrieval.
- Improving Text Retrieval in Medical Collections Through Automatic Categorization.
- An Empirical Comparison of Text Categorization Methods.
- Row-wise Tiling for the Myers’ Bit-Parallel Approximate String Matching Algorithm.
- Processing of Huffman Compressed Texts with a Super-Alphabet.
- Distributed Query Processing Using Suffix Arrays.
- Using WordNet for Word Sense Disambiguation to Support Concept Map Construction.
- Linear-Time Off-Line Text Compression by Longest-First Substitution.
- (S, C)-Dense Coding: An Optimized Compression Code for Natural Language Text Databases.
- String Matching Problems from Bioinformatics Which Still Need Better Solutions (Extended Abstract).
- Processing Text Files as Is: Pattern Matching over Compressed Texts, Multi-byte Character Texts, and Semi-structured Texts.
- Focussed Structured Document Retrieval.
- A Framework for Generating Attribute Extractors for Web Data Sources.
- Enhancing the Set-Based Model Using Proximity Information.
- Faster String Matching with Super-Alphabets.
- Sorting by Prefix Transpositions.
- Firing Policies for an Arabic Rule-Based Stemmer.
- Compact Directed Acyclic Word Graphs for a Sliding Window.
- Probabilistic Proximity Searching Algorithms Based on Compact Partitions.
- String Matching with Metric Trees Using an Approximate Distance.
- The DBLP Computer Science Bibliography: Evolution, Research Issues, Perspectives.
- Tree Pattern Matching for Linear Static Terms.
- Optimal Exact String Matching Based on Suffix Arrays.
- Efficient Computation of Long Similar Subsequences.
- Fully Dynamic Spatial Approximation Trees.
- From Searching Text to Querying XML Streams.
- Java MARIAN: From an OPAC to a Modern Digital Library System.
- A Theoretical Analysis of Google’s PageRank.
- Indexing Text Using the Ziv-Lempel Trie.
- Pattern Matching over Multi-attribute Data Streams.
- Multiple Example Queries in Content-Based Image Retrieval.
- Machine Learning Approach for Homepage Finding Task.
- On the Size of DASG for Multiple Texts.
- Towards a More Comprehensive Comparison of Collaborative Filtering Algorithms.
- Web Structure, Dynamics and Page Quality.
- Stemming Galician Texts.
- t-Spanners as a Data Structure for Metric Space Searching.
- Re-Store: A System for Compressing, Browsing, and Searching Large Documents (Invited Paper).
- Speeding-up Hirschberg and Hunt-Szymanski LCS Algorithms.
- On Using Two-Phase Filtering in Indexed Approximate String Matching with Application to Searching Unique Oligonucleotides.
- Design of a Graphical User Interface for Structured Documents Retrieval.
- A Stemming Algorithm for the Portuguese Language.
- A Comparative Study of Topic Identification on Newspaper and E-mail.
- A Documental Database Query Language.
- Relating Web Characteristics with Link Based Web Page Ranking.
- On Compression of Parse Trees.
- A Model for the Representation and Focussed Retrieval of Structured Documents Based on Fuzzy Aggregation.
- Semantic Labeling - Unveiling the Main Components of Meaning of Free-Text (Invited Paper).
- Of Maps Bigger than the Empire (Invited Paper).
- Genome Rearrangements Distance by Fusion, Fission, and Transposition is Easy.
- Exact Distribution of Deletion Sizes for Unavoidable Strings.
- Speed-up of Aho-Corasick Pattern Matching Machines by Rearranging States.
- Adding Security to Compressed Information Retrieval Systems.
- Musical Sequence Comparison for Melodic and Rhythmic Similarities.
- Fast Categorisation of Large Document Collections.
- Evaluation of N-grams Conflation Approach in Text-Based Information Retrieval.
- Semantic Thesaurus for Automatic Expanded Query in Information Retrieval.
- A Subquadratic Algorithm for Cluster and Outlier Detection in Massive Metric Data.
- Using Edit Distance in Point-Pattern Matching.
- Storing Semistructured Data in Relational Databases.
- Compaction Techniques for Nextword Indexes.
- Using Semantics for Paragraph Selection in Question Answering Systems.
- An Efficient Bottom-Up Distance between Trees.
- Distributed Query Processing Using Partitioned Inverted Files.
- On-Line Construction of Symmetric Compact Directed Acyclic Word Graphs.
- Implementing Document Ranking within a Logical Framework.
- An Effective Clustering Algorithm to Index High Dimensional Metric Spaces.
- A New Approach for Approximating the Transposition Distance.
- A Model and Software Architecture for Search Results Visualization on the WWW.
- Fast Calculation of Optimal Strategies for Searching with Non-Uniform Costs.
- Fully Compressed Pattern Matching Algorithm for Balanced Straight-Line Programs.
- Experiment Analysis in Newspaper Topic Detection.
- Fast Multipattern Search Algorithms for Intrusion Detection.
- Automatic Construction of Rule-Based Trees for Conceptual Retrieval.
- Speeding up Parallel Decoding of LZ Compressed Text on the PRAM EREW.
- DNA Processing in Ciliates - A Computational Point of View (invited abstract).
- A PRAM-on-Chip Vision (invited abstract).
- Parallel Search Using Partitioned Inverted Files.
- DelfosnetX: A Workbench for XML-Based Information Retrieval Systems.
- An Image Similarity Measure Based on Graph Matching.
- Combinatorial Methods for Approximate Pattern Matching under Rotations and Translations in 3D Arrays.
- New Approaches to Information Management: Attribute-Centric Data Systems (invited paper).
- Adding String Processing Capabilities to Data Management Systems.
- Hybrid Protein Model (HPM): A Method to Compact Protein 3D-Structure Information and Physicochemical Properties.
- Finding Repeats with Fixed Gap.
- Bit-Parallel Approach to Approximate String Matching in Compressed Texts.
- NFAs with Tagged Transitions, Their Conversion to Deterministic Automata and Application to Regular Expressions.
- A Survey of Longest Common Subsequence Algorithms.
- Virtual Test Tubes: A New Methodology for Computing.
- Computing with Membranes: P Systems with Worm-Objects.
- Online Construction of Subsequence Automata for Multiple Texts.
- Learning Profile in Routing: Comparison between Relevance and Gradient Back-Propagation.
- Prosodic Stress and Topic Detection in Spoken Sentences.
- Muninn: A Pragmatic Information Extraction System.
- A Word Stemming Algorithm for the Spanish Language.
- Rotation Invariant Histogram Filters for Similarity and Distance Measures between Digital Images.
- An Experiment Stemming Non-Traditional Text.
- A Model and a Visual Query Language for Structured Text.
- In-Place Length-Restricted Prefix Coding.
- Hyperdictionary: A Knowledge Discovery Tool to Help Information Retrieval.
- Evidence Accumulation with Competition in Information Retrieval.
- SST versus EST in Gene Recognition (Invited Paper).
- Reversal and Transposition Distance of Linear Chromosomes.
- Direct Pattern Matching on Compressed Text.
- Searching the Web: Challenges and Partial Solutions (Invited Paper).
- Information Overload - An IR Problem?
- New Approximation Algorithms for Longest Common Subsequences.
- Fast Approximate String Matching in a Dictionary.
- Efficient Search Techniques for the Inference of Minimum Size Finite Automata.
- A Linear Time Lower Bound on Updating Algorithms for Suffix Trees.