Multilingual translation for zero-shot biomedical classification using BioTranslator

Cited by: 4
Authors
Xu, Hanwen [1]
Woicik, Addie [1]
Poon, Hoifung [2]
Altman, Russ B. [3, 4, 5]
Wang, Sheng [1]
Affiliations
[1] University of Washington, School of Computer Science & Engineering, Seattle, WA, USA
[2] Microsoft Research, Redmond, WA, USA
[3] Stanford University, Department of Bioengineering, Stanford, CA, USA
[4] Stanford University, Department of Genetics, Stanford, CA, USA
[5] Chan Zuckerberg Biohub, San Francisco, CA, USA
Keywords
MEDICAL LANGUAGE SYSTEM; DRUG-SENSITIVITY; ONTOLOGY; PATHWAY; INTEGRATION; GENECARDS; LANDSCAPE; DISCOVERY; GENOMICS; GRAPH
DOI
10.1038/s41467-023-36476-2
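Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09
Abstract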
Existing annotation paradigms rely on controlled vocabularies, where each data instance is classified into one term from a predefined set of controlled vocabularies. This paradigm restricts the analysis to concepts that are known and well-characterized. Here, we present the novel multilingual translation method BioTranslator to address this problem. BioTranslator takes a user-written textual description of a new concept and then translates this description to a non-text biological data instance. The key idea of BioTranslator is to develop a multilingual translation framework, where multiple modalities of biological data are all translated to text. We demonstrate how BioTranslator enables the identification of novel cell types using only a textual description and how BioTranslator can be further generalized to protein function prediction and drug target identification. Our tool frees scientists from limiting their analyses within predefined controlled vocabularies, enabling them to interact with biological data using free text.
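The zero-shot idea in the abstract (classifying a data instance against free-text descriptions instead of a fixed vocabulary) can be illustrated with a minimal sketch. It assumes that text descriptions and data instances have already been mapped into one shared vector space; the hand-written toy vectors, the cosine-similarity scoring, and the names `cosine` and `zero_shot_classify` are all hypothetical stand-ins and are not taken from BioTranslator's actual model or code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_classify(instance_vec, description_vecs):
    """Return the best-matching description label and all similarity scores.

    instance_vec: embedding of a non-text data instance (e.g. one cell).
    description_vecs: label -> embedding of that label's text description.
    """
    scores = {
        label: cosine(instance_vec, vec)
        for label, vec in description_vecs.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy embeddings (assumption): in practice these would come from learned
# text and data encoders, not from hand-written numbers.
description_vecs = {
    "T cell": [0.9, 0.1, 0.0],
    "neuron": [0.0, 0.2, 0.9],
    "novel cell type (free-text description)": [0.1, 0.9, 0.1],
}
cell_vec = [0.2, 0.8, 0.1]

label, scores = zero_shot_classify(cell_vec, description_vecs)
print(label)
```

Because labels are just keys attached to description embeddings, a new concept can be added at query time by embedding its description, without retraining on labeled examples of that concept.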
Pages: 13
Journal
Nature Communications, 14