Omnibus Sequences, Coupon Collection, and Missing Word Counts

Cited by: 6
Authors
Abraham, Sunil [1]
Brockman, Greg [2]
Sapp, Stephanie [3]
Godbole, Anant P. [4]
Affiliations
[1] Univ Oxford, Oxford, England
[2] MIT, Cambridge, MA 02139 USA
[3] Univ Calif Berkeley, Berkeley, CA 94720 USA
[4] E Tennessee State Univ, Johnson City, TN 37614 USA
Funding
National Science Foundation (USA)
Keywords
Coupon collection; Omnibus sequences; Extreme value distribution
DOI
10.1007/s11009-011-9247-6
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
In this paper, we study the properties of k-omnisequences of length n, defined to be strings of length n that contain all strings of smaller length k embedded as (not necessarily contiguous) subsequences. We start by proving an elementary result that relates our problem to the classical coupon collector problem. After a short survey of relevant results in coupon collection, we focus our attention on the number M of strings (or words) of length k that are not found as subsequences of a string of length n, showing that there is a gap between the probability threshold for the emergence of an omnisequence and the zero-infinity threshold for E(M).
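The definitions in the abstract can be illustrated with a small Python sketch. It counts the missing words M by brute force and also tests the omnisequence property with a greedy coupon-collection scan. The greedy test rests on an assumption: that a string is a k-omnisequence exactly when a left-to-right scan completes at least k full collections of the alphabet. This is one standard way to link the two problems and may not be the exact form of the elementary result proved in the paper. All function names are illustrative and do not come from the paper.

```python
from itertools import product


def is_subsequence(word, text):
    """Return True if word occurs in text as a (not necessarily contiguous) subsequence."""
    remaining = iter(text)
    # Each membership test consumes the iterator up to the first match,
    # so letters of word are matched in left-to-right order.
    return all(letter in remaining for letter in word)


def missing_word_count(text, alphabet, k):
    """Brute-force M: the number of length-k words that are not subsequences of text."""
    return sum(
        1 for word in product(alphabet, repeat=k) if not is_subsequence(word, text)
    )


def completed_collections(text, alphabet):
    """Greedy scan: count how many times the full alphabet is collected in turn."""
    needed = set(alphabet)
    seen = set()
    count = 0
    for letter in text:
        if letter in needed:
            seen.add(letter)
        if len(seen) == len(needed):
            # One full set of "coupons" collected; start the next collection.
            count += 1
            seen.clear()
    return count


def is_omnisequence(text, alphabet, k):
    """Assumed characterization: k-omni iff at least k greedy collections complete."""
    return completed_collections(text, alphabet) >= k


if __name__ == "__main__":
    for sample in ("abba", "aabb"):
        print(sample, missing_word_count(sample, "ab", 2), is_omnisequence(sample, "ab", 2))
```

On the binary alphabet, "abba" contains every word of length 2, while "aabb" misses only "ba". The brute-force count is exponential in k, so it is only practical for small alphabets and short words.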
Pages: 363-378
Page count: 16
Source: Methodology and Computing in Applied Probability, 2013, vol. 15